All of lore.kernel.org
* [PATCH 0/2] Kernel Live Patching
@ 2014-11-06 14:39 Seth Jennings
  2014-11-06 14:39 ` [PATCH 1/2] kernel: add TAINT_LIVEPATCH Seth Jennings
                   ` (3 more replies)
  0 siblings, 4 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-06 14:39 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt
  Cc: live-patching, kpatch, linux-kernel

This patchset implements an ftrace-based mechanism and kernel interface for
doing live patching of kernel and kernel module functions.  It represents the
greatest common functionality set between kpatch [1] and kGraft [2] and can
accept patches built using either method.  This solution was discussed in the
Live Patching Mini-conference at LPC 2014 [3].

The model consists of a live patching "core" that provides an interface for
other "patch" kernel modules to register patches with the core.

Patch modules contain the new function code and create an lp_patch
structure containing the required data about what functions to patch, where the
new code for each patched function resides, and in which kernel object (vmlinux
or module) the function to be patched resides.  The patch module then invokes the
lp_register_patch() function to register with the core module, then
lp_enable_patch() to have the core module redirect the execution paths using
ftrace.

An example patch module can be found here:
https://github.com/spartacus06/livepatch/blob/master/patch/patch.c
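
For reference, a minimal patch module using the lp_ structures and calls
described above might look like the following sketch (old_function and
new_function are placeholder names; the real example linked above also
handles dynrelas):

```c
#include <linux/module.h>
#include <linux/livepatch.h>

/* replacement code for the function being patched */
static int new_function(void)
{
	return 0;
}

static struct lp_func funcs[] = {
	{
		.old_name = "old_function", /* resolved via kallsyms */
		.new_func = new_function,
	},
	{ /* terminating empty entry */ }
};

static struct lp_object objs[] = {
	{
		.name = "vmlinux", /* object containing old_function */
		.funcs = funcs,
	},
	{ /* terminating empty entry */ }
};

static struct lp_patch patch = {
	.mod = THIS_MODULE,
	.objs = objs,
};

static int __init patch_init(void)
{
	int ret;

	/* register with the core, then ask it to redirect execution */
	ret = lp_register_patch(&patch);
	if (ret)
		return ret;
	return lp_enable_patch(&patch);
}
module_init(patch_init);
MODULE_LICENSE("GPL");
```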

The live patching core creates a sysfs hierarchy for user-level access to live
patching information.  The hierarchy is structured like this:

/sys/kernel/livepatch
/sys/kernel/livepatch/<patch>
/sys/kernel/livepatch/<patch>/enabled
/sys/kernel/livepatch/<patch>/<object>
/sys/kernel/livepatch/<patch>/<object>/<func>
/sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
/sys/kernel/livepatch/<patch>/<object>/<func>/old_addr

The new_addr attribute provides the location of the new version of the function
within the patch module.  The old_addr attribute provides the location of the
old function.  The old function is located using one of two methods: it is
either provided by the patch module (only possible for a function in vmlinux)
or found via kallsyms lookup.  Symbol lookup ambiguity results in a failure.

The core holds a reference on any kernel module that is patched to ensure it
does not unload while we are redirecting calls from it.  Also, the core takes a
reference on the patch module itself to keep it from unloading.  This is
because, without a mechanism to ensure that no thread is currently executing in
the patched function, we can not determine whether it is safe to unload the
patch module.  For this reason, unloading patch modules is currently not
allowed.

The core is able to release its reference on patched modules by disabling all
patches that patch a function in that module.  Disabling patches can be done
like this:

echo 0 > /sys/kernel/livepatch/<patch>/enabled

Patches can also be re-enabled; however, the core will retake any reference on a
kernel module that contains a patched function.

If a patch module contains a patch for a module that is not currently loaded,
there is nothing to patch so the core does nothing for that object.  However,
the core registers a module notifier so that if the module is ever loaded, it
is immediately patched.

kpatch and kGraft each have their own mechanisms for ensuring system
consistency during the patching process. This first version does not implement
any consistency mechanism that ensures that old and new code do not run
together.  In practice, ~90% of CVEs are safe to apply in this way, since they
simply add a conditional check.  However, any function change that can not
execute safely with the old version of the function can _not_ be safely applied
for now.

[1] https://github.com/dynup/kpatch
[2] https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/
[3] https://etherpad.fr/p/LPC2014_LivePatching

Seth Jennings (2):
  kernel: add TAINT_LIVEPATCH
  kernel: add support for live patching

 Documentation/oops-tracing.txt  |    2 +
 Documentation/sysctl/kernel.txt |    1 +
 MAINTAINERS                     |   10 +
 arch/x86/Kconfig                |    2 +
 include/linux/kernel.h          |    1 +
 include/linux/livepatch.h       |   45 ++
 kernel/Makefile                 |    1 +
 kernel/livepatch/Kconfig        |   11 +
 kernel/livepatch/Makefile       |    3 +
 kernel/livepatch/core.c         | 1020 +++++++++++++++++++++++++++++++++++++++
 kernel/panic.c                  |    2 +
 11 files changed, 1098 insertions(+)
 create mode 100644 include/linux/livepatch.h
 create mode 100644 kernel/livepatch/Kconfig
 create mode 100644 kernel/livepatch/Makefile
 create mode 100644 kernel/livepatch/core.c

-- 
1.9.3


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [PATCH 1/2] kernel: add TAINT_LIVEPATCH
  2014-11-06 14:39 [PATCH 0/2] Kernel Live Patching Seth Jennings
@ 2014-11-06 14:39 ` Seth Jennings
  2014-11-09 20:19   ` Greg KH
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 73+ messages in thread
From: Seth Jennings @ 2014-11-06 14:39 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt
  Cc: live-patching, kpatch, linux-kernel

This adds a new taint flag to indicate when the kernel or a kernel
module has been live patched.  This will provide a clean indication in
bug reports that live patching was used.

Additionally, if the crash occurs in a live patched function, the live
patch module will appear beside the patched function in the backtrace.

Signed-off-by: Seth Jennings <sjenning@redhat.com>
---
 Documentation/oops-tracing.txt  | 2 ++
 Documentation/sysctl/kernel.txt | 1 +
 include/linux/kernel.h          | 1 +
 kernel/panic.c                  | 2 ++
 4 files changed, 6 insertions(+)

diff --git a/Documentation/oops-tracing.txt b/Documentation/oops-tracing.txt
index beefb9f..f3ac05c 100644
--- a/Documentation/oops-tracing.txt
+++ b/Documentation/oops-tracing.txt
@@ -270,6 +270,8 @@ characters, each representing a particular tainted value.
 
  15: 'L' if a soft lockup has previously occurred on the system.
 
+ 16: 'K' if the kernel has been live patched.
+
 The primary reason for the 'Tainted: ' string is to tell kernel
 debuggers if this is a clean kernel or if anything unusual has
 occurred.  Tainting is permanent: even if an offending module is
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index d7fc4ab..085f73b 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -831,6 +831,7 @@ can be ORed together:
 8192 - An unsigned module has been loaded in a kernel supporting module
        signature.
 16384 - A soft lockup has previously occurred on the system.
+32768 - The kernel has been live patched.
 
 ==============================================================
 
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 446d76a..a6aa2df 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -473,6 +473,7 @@ extern enum system_states {
 #define TAINT_OOT_MODULE		12
 #define TAINT_UNSIGNED_MODULE		13
 #define TAINT_SOFTLOCKUP		14
+#define TAINT_LIVEPATCH			15
 
 extern const char hex_asc[];
 #define hex_asc_lo(x)	hex_asc[((x) & 0x0f)]
diff --git a/kernel/panic.c b/kernel/panic.c
index d09dc5c..46bca3d 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -225,6 +225,7 @@ static const struct tnt tnts[] = {
 	{ TAINT_OOT_MODULE,		'O', ' ' },
 	{ TAINT_UNSIGNED_MODULE,	'E', ' ' },
 	{ TAINT_SOFTLOCKUP,		'L', ' ' },
+	{ TAINT_LIVEPATCH,		'K', ' ' },
 };
 
 /**
@@ -244,6 +245,7 @@ static const struct tnt tnts[] = {
  *  'I' - Working around severe firmware bug.
  *  'O' - Out-of-tree module has been loaded.
  *  'E' - Unsigned module has been loaded.
+ *  'K' - Kernel has been live patched.
  *
  *	The string is overwritten by the next call to print_tainted().
  */
-- 
1.9.3



* [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 [PATCH 0/2] Kernel Live Patching Seth Jennings
  2014-11-06 14:39 ` [PATCH 1/2] kernel: add TAINT_LIVEPATCH Seth Jennings
@ 2014-11-06 14:39 ` Seth Jennings
  2014-11-06 15:11   ` Jiri Kosina
                     ` (7 more replies)
  2014-11-06 18:44 ` [PATCH 0/2] Kernel Live Patching Christoph Hellwig
  2014-11-09 20:16 ` Greg KH
  3 siblings, 8 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-06 14:39 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt
  Cc: live-patching, kpatch, linux-kernel

This commit introduces code for the live patching core.  It implements
an ftrace-based mechanism and kernel interface for doing live patching
of kernel and kernel module functions.

It represents the greatest common functionality set between kpatch and
kGraft and can accept patches built using either method.

This first version does not implement any consistency mechanism that
ensures that old and new code do not run together.  In practice, ~90% of
CVEs are safe to apply in this way, since they simply add a conditional
check.  However, any function change that can not execute safely with
the old version of the function can _not_ be safely applied in this
version.

Signed-off-by: Seth Jennings <sjenning@redhat.com>
---
 MAINTAINERS               |   10 +
 arch/x86/Kconfig          |    2 +
 include/linux/livepatch.h |   45 ++
 kernel/Makefile           |    1 +
 kernel/livepatch/Kconfig  |   11 +
 kernel/livepatch/Makefile |    3 +
 kernel/livepatch/core.c   | 1020 +++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 1092 insertions(+)
 create mode 100644 include/linux/livepatch.h
 create mode 100644 kernel/livepatch/Kconfig
 create mode 100644 kernel/livepatch/Makefile
 create mode 100644 kernel/livepatch/core.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f98019e..02d1af7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5671,6 +5671,16 @@ F:	Documentation/misc-devices/lis3lv02d
 F:	drivers/misc/lis3lv02d/
 F:	drivers/platform/x86/hp_accel.c
 
+LIVE PATCHING
+M:	Josh Poimboeuf <jpoimboe@redhat.com>
+M:	Seth Jennings <sjenning@redhat.com>
+M:	Jiri Kosina <jkosina@suse.cz>
+M:	Vojtech Pavlik <vojtech@suse.cz>
+S:	Maintained
+F:	kernel/livepatch/
+F:	include/linux/livepatch.h
+L:	live-patching@vger.kernel.org
+
 LLC (802.2)
 M:	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
 S:	Maintained
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9cd2578..fb0bb59 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1982,6 +1982,8 @@ config CMDLINE_OVERRIDE
 	  This is used to work around broken boot loaders.  This should
 	  be set to 'N' under normal conditions.
 
+source "kernel/livepatch/Kconfig"
+
 endmenu
 
 config ARCH_ENABLE_MEMORY_HOTPLUG
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
new file mode 100644
index 0000000..c7a415b
--- /dev/null
+++ b/include/linux/livepatch.h
@@ -0,0 +1,45 @@
+#ifndef _LIVEPATCH_H_
+#define _LIVEPATCH_H_
+
+#include <linux/module.h>
+
+struct lp_func {
+	const char *old_name; /* function to be patched */
+	void *new_func; /* replacement function in patch module */
+	/*
+	 * The old_addr field is optional and can be used to resolve
+	 * duplicate symbol names in the vmlinux object.  If this
+	 * information is not present, the symbol is located by name
+	 * with kallsyms. If the name is not unique and old_addr is
+	 * not provided, the patch application fails as there is no
+	 * way to resolve the ambiguity.
+	 */
+	unsigned long old_addr;
+};
+
+struct lp_dynrela {
+	unsigned long dest;
+	unsigned long src;
+	unsigned long type;
+	const char *name;
+	int addend;
+	int external;
+};
+
+struct lp_object {
+	const char *name; /* "vmlinux" or module name */
+	struct lp_func *funcs;
+	struct lp_dynrela *dynrelas;
+};
+
+struct lp_patch {
+	struct module *mod; /* module containing the patch */
+	struct lp_object *objs;
+};
+
+int lp_register_patch(struct lp_patch *);
+int lp_unregister_patch(struct lp_patch *);
+int lp_enable_patch(struct lp_patch *);
+int lp_disable_patch(struct lp_patch *);
+
+#endif /* _LIVEPATCH_H_ */
diff --git a/kernel/Makefile b/kernel/Makefile
index a59481a..616994f 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -26,6 +26,7 @@ obj-y += power/
 obj-y += printk/
 obj-y += irq/
 obj-y += rcu/
+obj-y += livepatch/
 
 obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
diff --git a/kernel/livepatch/Kconfig b/kernel/livepatch/Kconfig
new file mode 100644
index 0000000..312ed81
--- /dev/null
+++ b/kernel/livepatch/Kconfig
@@ -0,0 +1,11 @@
+config LIVE_PATCHING
+	tristate "Live Kernel Patching"
+	depends on DYNAMIC_FTRACE_WITH_REGS && MODULES && SYSFS && KALLSYMS_ALL
+	default m
+	help
+	  Say Y here if you want to support live kernel patching.
+	  This setting has no runtime impact until a live-patch
+	  kernel module that uses the live-patch interface provided
+	  by this option is loaded, resulting in calls to patched
+	  functions being redirected to the new function code contained
+	  in the live-patch module.
diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
new file mode 100644
index 0000000..7c1f008
--- /dev/null
+++ b/kernel/livepatch/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_LIVE_PATCHING) += livepatch.o
+
+livepatch-objs := core.o
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
new file mode 100644
index 0000000..b32dbb5
--- /dev/null
+++ b/kernel/livepatch/core.c
@@ -0,0 +1,1020 @@
+/*
+ * livepatch.c - Live Kernel Patching Core
+ *
+ * Copyright (C) 2014 Seth Jennings <sjenning@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/semaphore.h>
+#include <linux/slab.h>
+#include <linux/ftrace.h>
+#include <linux/list.h>
+#include <linux/kallsyms.h>
+#include <linux/uaccess.h> /* probe_kernel_write */
+#include <asm/cacheflush.h> /* set_memory_[ro|rw] */
+
+#include <linux/livepatch.h>
+
+/*************************************
+ * Core structures
+ ************************************/
+
+/*
+ * lp_ structs vs lpc_ structs
+ *
+ * For each element (patch, object, func) in the live-patching code,
+ * there are two types with two different prefixes: lp_ and lpc_.
+ *
+ * Structures used by the live-patch modules to register with this core module
+ * are prefixed with lp_ (live patching).  These structures are part of the
+ * registration API and are defined in livepatch.h.  The structures used
+ * internally by this core module are prefixed with lpc_ (live patching core).
+ */
+
+static DEFINE_SEMAPHORE(lpc_mutex);
+static LIST_HEAD(lpc_patches);
+
+enum lpc_state {
+	DISABLED,
+	ENABLED
+};
+
+struct lpc_func {
+	struct list_head list;
+	struct kobject kobj;
+	struct ftrace_ops fops;
+	enum lpc_state state;
+
+	const char *old_name;
+	unsigned long new_addr;
+	unsigned long old_addr;
+};
+
+struct lpc_object {
+	struct list_head list;
+	struct kobject kobj;
+	struct module *mod; /* module associated with object */
+	enum lpc_state state;
+
+	const char *name;
+	struct list_head funcs;
+	struct lp_dynrela *dynrelas;
+};
+
+struct lpc_patch {
+	struct list_head list;
+	struct kobject kobj;
+	struct lp_patch *userpatch; /* for correlation during unregister */
+	enum lpc_state state;
+
+	struct module *mod;
+	struct list_head objs;
+};
+
+/*******************************************
+ * Helpers
+ *******************************************/
+
+/* sets obj->mod if object is not vmlinux and module was found */
+static bool is_object_loaded(struct lpc_object *obj)
+{
+	struct module *mod;
+
+	if (!strcmp(obj->name, "vmlinux"))
+		return 1;
+
+	mutex_lock(&module_mutex);
+	mod = find_module(obj->name);
+	mutex_unlock(&module_mutex);
+	obj->mod = mod;
+
+	return !!mod;
+}
+
+/************************************
+ * kallsyms
+ ***********************************/
+
+struct lpc_find_arg {
+	const char *objname;
+	const char *name;
+	unsigned long addr;
+	/*
+	 * If count == 0, the symbol was not found. If count == 1, a unique
+	 * match was found and addr is set.  If count > 1, there is
+	 * unresolvable ambiguity among "count" number of symbols with the same
+	 * name in the same object.
+	 */
+	unsigned long count;
+};
+
+static int lpc_find_callback(void *data, const char *name,
+			     struct module *mod, unsigned long addr)
+{
+	struct lpc_find_arg *args = data;
+
+	if ((mod && !args->objname) || (!mod && args->objname))
+		return 0;
+
+	if (strcmp(args->name, name))
+		return 0;
+
+	if (args->objname && strcmp(args->objname, mod->name))
+		return 0;
+
+	/*
+	 * args->addr might be overwritten if another match is found
+	 * but lpc_find_symbol() handles this and only returns the
+	 * addr if count == 1.
+	 */
+	args->addr = addr;
+	args->count++;
+
+	return 0;
+}
+
+static int lpc_find_symbol(const char *objname, const char *name,
+			   unsigned long *addr)
+{
+	struct lpc_find_arg args = {
+		.objname = objname,
+		.name = name,
+		.addr = 0,
+		.count = 0
+	};
+
+	if (objname && !strcmp(objname, "vmlinux"))
+		args.objname = NULL;
+
+	kallsyms_on_each_symbol(lpc_find_callback, &args);
+
+	if (args.count == 0)
+		pr_err("symbol '%s' not found in symbol table\n", name);
+	else if (args.count > 1)
+		pr_err("unresolvable ambiguity (%lu matches) on symbol '%s' in object '%s'\n",
+		       args.count, name, objname);
+	else {
+		*addr = args.addr;
+		return 0;
+	}
+
+	*addr = 0;
+	return -EINVAL;
+}
+
+struct lpc_verify_args {
+	const char *name;
+	const unsigned long addr;
+};
+
+static int lpc_verify_callback(void *data, const char *name,
+			       struct module *mod, unsigned long addr)
+{
+	struct lpc_verify_args *args = data;
+
+	if (!mod &&
+	    !strcmp(args->name, name) &&
+	    args->addr == addr)
+		return 1;
+	return 0;
+}
+
+static int lpc_verify_vmlinux_symbol(const char *name, unsigned long addr)
+{
+	struct lpc_verify_args args = {
+		.name = name,
+		.addr = addr,
+	};
+
+	if (kallsyms_on_each_symbol(lpc_verify_callback, &args))
+		return 0;
+	pr_err("symbol '%s' not found at specified address 0x%016lx, kernel mismatch?",
+		name, addr);
+	return -EINVAL;
+}
+
+static int lpc_find_verify_func_addr(struct lpc_func *func, const char *objname)
+{
+	int ret;
+
+	if (func->old_addr && strcmp(objname, "vmlinux")) {
+		pr_err("old address specified for module symbol\n");
+		return -EINVAL;
+	}
+
+	if (func->old_addr)
+		ret = lpc_verify_vmlinux_symbol(func->old_name,
+						func->old_addr);
+	else
+		ret = lpc_find_symbol(objname, func->old_name,
+				      &func->old_addr);
+
+	return ret;
+}
+
+/****************************************
+ * dynamic relocations (load-time linker)
+ ****************************************/
+
+/*
+ * external symbols are located outside the parent object (where the parent
+ * object is either vmlinux or the kmod being patched).
+ */
+static int lpc_find_external_symbol(struct module *pmod, const char *name,
+					unsigned long *addr)
+{
+	const struct kernel_symbol *sym;
+
+	/* first, check if it's an exported symbol */
+	preempt_disable();
+	sym = find_symbol(name, NULL, NULL, true, true);
+	preempt_enable();
+	if (sym) {
+		*addr = sym->value;
+		return 0;
+	}
+
+	/* otherwise check if it's in another .o within the patch module */
+	return lpc_find_symbol(pmod->name, name, addr);
+}
+
+static int lpc_write_object_relocations(struct module *pmod,
+					struct lpc_object *obj)
+{
+	int ret, size, readonly = 0, numpages;
+	struct lp_dynrela *dynrela;
+	u64 loc, val;
+	unsigned long core = (unsigned long)pmod->module_core;
+	unsigned long core_ro_size = pmod->core_ro_size;
+	unsigned long core_size = pmod->core_size;
+
+	for (dynrela = obj->dynrelas; dynrela->name; dynrela++) {
+		if (!strcmp(obj->name, "vmlinux")) {
+			ret = lpc_verify_vmlinux_symbol(dynrela->name,
+							dynrela->src);
+			if (ret)
+				return ret;
+		} else {
+			/* module, dynrela->src needs to be discovered */
+			if (dynrela->external)
+				ret = lpc_find_external_symbol(pmod,
+							       dynrela->name,
+							       &dynrela->src);
+			else
+				ret = lpc_find_symbol(obj->mod->name,
+						      dynrela->name,
+						      &dynrela->src);
+			if (ret)
+				return -EINVAL;
+		}
+
+		switch (dynrela->type) {
+		case R_X86_64_NONE:
+			continue;
+		case R_X86_64_PC32:
+			loc = dynrela->dest;
+			val = (u32)(dynrela->src + dynrela->addend -
+				    dynrela->dest);
+			size = 4;
+			break;
+		case R_X86_64_32S:
+			loc = dynrela->dest;
+			val = (s32)dynrela->src + dynrela->addend;
+			size = 4;
+			break;
+		case R_X86_64_64:
+			loc = dynrela->dest;
+			val = dynrela->src;
+			size = 8;
+			break;
+		default:
+			pr_err("unsupported rela type %ld for source %s (0x%lx <- 0x%lx)\n",
+			       dynrela->type, dynrela->name, dynrela->dest,
+			       dynrela->src);
+			return -EINVAL;
+		}
+
+		if (loc >= core && loc < core + core_ro_size)
+			readonly = 1;
+		else if (loc >= core + core_ro_size && loc < core + core_size)
+			readonly = 0;
+		else {
+			pr_err("bad dynrela location 0x%llx for symbol %s\n",
+			       loc, dynrela->name);
+			return -EINVAL;
+		}
+
+		numpages = (PAGE_SIZE - (loc & ~PAGE_MASK) >= size) ? 1 : 2;
+
+		if (readonly)
+			set_memory_rw(loc & PAGE_MASK, numpages);
+
+		ret = probe_kernel_write((void *)loc, &val, size);
+
+		if (readonly)
+			set_memory_ro(loc & PAGE_MASK, numpages);
+
+		if (ret) {
+			pr_err("write to 0x%llx failed for symbol %s\n",
+			       loc, dynrela->name);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+/***********************************
+ * ftrace registration
+ **********************************/
+
+static void lpc_ftrace_handler(unsigned long ip, unsigned long parent_ip,
+			       struct ftrace_ops *ops, struct pt_regs *regs)
+{
+	struct lpc_func *func = ops->private;
+
+	regs->ip = func->new_addr;
+}
+
+static int lpc_enable_func(struct lpc_func *func)
+{
+	int ret;
+
+	BUG_ON(!func->old_addr);
+	BUG_ON(func->state != DISABLED);
+	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 0, 0);
+	if (ret) {
+		pr_err("failed to set ftrace filter for function '%s' (%d)\n",
+		       func->old_name, ret);
+		return ret;
+	}
+	ret = register_ftrace_function(&func->fops);
+	if (ret) {
+		pr_err("failed to register ftrace handler for function '%s' (%d)\n",
+		       func->old_name, ret);
+		ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
+	} else
+		func->state = ENABLED;
+
+	return ret;
+}
+
+static int lpc_unregister_func(struct lpc_func *func)
+{
+	int ret;
+
+	BUG_ON(func->state != ENABLED);
+	if (!func->old_addr)
+		/* parent object is not loaded */
+		return 0;
+	ret = unregister_ftrace_function(&func->fops);
+	if (ret) {
+		pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
+		       func->old_name, ret);
+		return ret;
+	}
+	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
+	if (ret)
+		pr_warn("function unregister succeeded but failed to clear the filter\n");
+	func->state = DISABLED;
+
+	return 0;
+}
+
+static int lpc_unregister_object(struct lpc_object *obj)
+{
+	struct lpc_func *func;
+	int ret;
+
+	list_for_each_entry(func, &obj->funcs, list) {
+		if (func->state != ENABLED)
+			continue;
+		ret = lpc_unregister_func(func);
+		if (ret)
+			return ret;
+		if (strcmp(obj->name, "vmlinux"))
+			func->old_addr = 0;
+	}
+	if (obj->mod)
+		module_put(obj->mod);
+	obj->state = DISABLED;
+
+	return 0;
+}
+
+/* caller must ensure that obj->mod is set if object is a module */
+static int lpc_enable_object(struct module *pmod, struct lpc_object *obj)
+{
+	struct lpc_func *func;
+	int ret;
+
+	if (obj->mod && !try_module_get(obj->mod))
+		return -ENODEV;
+
+	if (obj->dynrelas) {
+		ret = lpc_write_object_relocations(pmod, obj);
+		if (ret)
+			goto unregister;
+	}
+	list_for_each_entry(func, &obj->funcs, list) {
+		ret = lpc_find_verify_func_addr(func, obj->name);
+		if (ret)
+			goto unregister;
+
+		ret = lpc_enable_func(func);
+		if (ret)
+			goto unregister;
+	}
+	obj->state = ENABLED;
+
+	return 0;
+unregister:
+	WARN_ON(lpc_unregister_object(obj));
+	return ret;
+}
+
+/******************************
+ * enable/disable
+ ******************************/
+
+/* must be called with lpc_mutex held */
+static struct lpc_patch *lpc_find_patch(struct lp_patch *userpatch)
+{
+	struct lpc_patch *patch;
+
+	list_for_each_entry(patch, &lpc_patches, list)
+		if (patch->userpatch == userpatch)
+			return patch;
+
+	return NULL;
+}
+
+/* must be called with lpc_mutex held */
+static int lpc_disable_patch(struct lpc_patch *patch)
+{
+	struct lpc_object *obj;
+	int ret;
+
+	pr_notice("disabling patch '%s'\n", patch->mod->name);
+
+	list_for_each_entry(obj, &patch->objs, list) {
+		if (obj->state != ENABLED)
+			continue;
+		ret = lpc_unregister_object(obj);
+		if (ret)
+			return ret;
+	}
+	patch->state = DISABLED;
+
+	return 0;
+}
+
+int lp_disable_patch(struct lp_patch *userpatch)
+{
+	struct lpc_patch *patch;
+	int ret;
+
+	down(&lpc_mutex);
+	patch = lpc_find_patch(userpatch);
+	if (!patch) {
+		ret = -ENODEV;
+		goto out;
+	}
+	ret = lpc_disable_patch(patch);
+out:
+	up(&lpc_mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(lp_disable_patch);
+
+/* must be called with lpc_mutex held */
+static int lpc_enable_patch(struct lpc_patch *patch)
+{
+	struct lpc_object *obj;
+	int ret;
+
+	BUG_ON(patch->state != DISABLED);
+
+	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
+	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
+
+	pr_notice("enabling patch '%s'\n", patch->mod->name);
+
+	list_for_each_entry(obj, &patch->objs, list) {
+		if (!is_object_loaded(obj))
+			continue;
+		ret = lpc_enable_object(patch->mod, obj);
+		if (ret)
+			goto unregister;
+	}
+	patch->state = ENABLED;
+	return 0;
+
+unregister:
+	WARN_ON(lpc_disable_patch(patch));
+	return ret;
+}
+
+int lp_enable_patch(struct lp_patch *userpatch)
+{
+	struct lpc_patch *patch;
+	int ret;
+
+	down(&lpc_mutex);
+	patch = lpc_find_patch(userpatch);
+	if (!patch) {
+		ret = -ENODEV;
+		goto out;
+	}
+	ret = lpc_enable_patch(patch);
+out:
+	up(&lpc_mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(lp_enable_patch);
+
+/******************************
+ * module notifier
+ *****************************/
+
+static int lp_module_notify(struct notifier_block *nb, unsigned long action,
+			    void *data)
+{
+	struct module *mod = data;
+	struct lpc_patch *patch;
+	struct lpc_object *obj;
+	int ret = 0;
+
+	if (action != MODULE_STATE_COMING)
+		return 0;
+
+	down(&lpc_mutex);
+
+	list_for_each_entry(patch, &lpc_patches, list) {
+		if (patch->state == DISABLED)
+			continue;
+		list_for_each_entry(obj, &patch->objs, list) {
+			if (strcmp(obj->name, mod->name))
+				continue;
+			pr_notice("load of module '%s' detected, applying patch '%s'\n",
+				  mod->name, patch->mod->name);
+			obj->mod = mod;
+			ret = lpc_enable_object(patch->mod, obj);
+			if (ret)
+				goto out;
+			break;
+		}
+	}
+
+	up(&lpc_mutex);
+	return 0;
+out:
+	up(&lpc_mutex);
+	WARN("failed to apply patch '%s' to module '%s'\n",
+		patch->mod->name, mod->name);
+	return 0;
+}
+
+static struct notifier_block lp_module_nb = {
+	.notifier_call = lp_module_notify,
+	.priority = INT_MIN, /* called last */
+};
+
+/********************************************
+ * Sysfs Interface
+ *******************************************/
+/*
+ * /sys/kernel/livepatch
+ * /sys/kernel/livepatch/<patch>
+ * /sys/kernel/livepatch/<patch>/enabled
+ * /sys/kernel/livepatch/<patch>/<object>
+ * /sys/kernel/livepatch/<patch>/<object>/<func>
+ * /sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
+ * /sys/kernel/livepatch/<patch>/<object>/<func>/old_addr
+ */
+
+static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
+			     const char *buf, size_t count)
+{
+	struct lpc_patch *patch;
+	int ret;
+	unsigned long val;
+
+	ret = kstrtoul(buf, 10, &val);
+	if (ret)
+		return -EINVAL;
+
+	if (val != DISABLED && val != ENABLED)
+		return -EINVAL;
+
+	patch = container_of(kobj, struct lpc_patch, kobj);
+
+	down(&lpc_mutex);
+	if (val == patch->state) {
+		/* already in requested state */
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (val == ENABLED) {
+		ret = lpc_enable_patch(patch);
+		if (ret)
+			goto out;
+	} else {
+		ret = lpc_disable_patch(patch);
+		if (ret)
+			goto out;
+	}
+	up(&lpc_mutex);
+	return count;
+out:
+	up(&lpc_mutex);
+	return ret;
+}
+
+static ssize_t enabled_show(struct kobject *kobj,
+			    struct kobj_attribute *attr, char *buf)
+{
+	struct lpc_patch *patch;
+
+	patch = container_of(kobj, struct lpc_patch, kobj);
+	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->state);
+}
+
+static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
+static struct attribute *lpc_patch_attrs[] = {
+	&enabled_kobj_attr.attr,
+	NULL
+};
+
+static ssize_t new_addr_show(struct kobject *kobj,
+			     struct kobj_attribute *attr, char *buf)
+{
+	struct lpc_func *func;
+
+	func = container_of(kobj, struct lpc_func, kobj);
+	return snprintf(buf, PAGE_SIZE-1, "0x%016lx\n", func->new_addr);
+}
+
+static struct kobj_attribute new_addr_kobj_attr = __ATTR_RO(new_addr);
+
+static ssize_t old_addr_show(struct kobject *kobj,
+			     struct kobj_attribute *attr, char *buf)
+{
+	struct lpc_func *func;
+
+	func = container_of(kobj, struct lpc_func, kobj);
+	return snprintf(buf, PAGE_SIZE-1, "0x%016lx\n", func->old_addr);
+}
+
+static struct kobj_attribute old_addr_kobj_attr = __ATTR_RO(old_addr);
+
+static struct attribute *lpc_func_attrs[] = {
+	&new_addr_kobj_attr.attr,
+	&old_addr_kobj_attr.attr,
+	NULL
+};
+
+static struct kobject *lpc_root_kobj;
+
+static int lpc_create_root_kobj(void)
+{
+	lpc_root_kobj =
+		kobject_create_and_add(THIS_MODULE->name, kernel_kobj);
+	if (!lpc_root_kobj)
+		return -ENOMEM;
+	return 0;
+}
+
+static void lpc_remove_root_kobj(void)
+{
+	kobject_put(lpc_root_kobj);
+}
+
+static void lpc_kobj_release_patch(struct kobject *kobj)
+{
+	struct lpc_patch *patch;
+
+	patch = container_of(kobj, struct lpc_patch, kobj);
+	if (!list_empty(&patch->list))
+		list_del(&patch->list);
+	kfree(patch);
+}
+
+static struct kobj_type lpc_ktype_patch = {
+	.release = lpc_kobj_release_patch,
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_attrs = lpc_patch_attrs
+};
+
+static void lpc_kobj_release_object(struct kobject *kobj)
+{
+	struct lpc_object *obj;
+
+	obj = container_of(kobj, struct lpc_object, kobj);
+	if (!list_empty(&obj->list))
+		list_del(&obj->list);
+	kfree(obj);
+}
+
+static struct kobj_type lpc_ktype_object = {
+	.release	= lpc_kobj_release_object,
+	.sysfs_ops	= &kobj_sysfs_ops,
+};
+
+static void lpc_kobj_release_func(struct kobject *kobj)
+{
+	struct lpc_func *func;
+
+	func = container_of(kobj, struct lpc_func, kobj);
+	if (!list_empty(&func->list))
+		list_del(&func->list);
+	kfree(func);
+}
+
+static struct kobj_type lpc_ktype_func = {
+	.release	= lpc_kobj_release_func,
+	.sysfs_ops	= &kobj_sysfs_ops,
+	.default_attrs = lpc_func_attrs
+};
+
+/*********************************
+ * structure allocation
+ ********************************/
+
+static void lpc_free_funcs(struct lpc_object *obj)
+{
+	struct lpc_func *func, *funcsafe;
+
+	list_for_each_entry_safe(func, funcsafe, &obj->funcs, list)
+		kobject_put(&func->kobj);
+}
+
+static void lpc_free_objects(struct lpc_patch *patch)
+{
+	struct lpc_object *obj, *objsafe;
+
+	list_for_each_entry_safe(obj, objsafe, &patch->objs, list) {
+		lpc_free_funcs(obj);
+		kobject_put(&obj->kobj);
+	}
+}
+
+static void lpc_free_patch(struct lpc_patch *patch)
+{
+	lpc_free_objects(patch);
+	kobject_put(&patch->kobj);
+}
+
+static struct lpc_func *lpc_create_func(struct kobject *root,
+					struct lp_func *userfunc)
+{
+	struct lpc_func *func;
+	struct ftrace_ops *ops;
+	int ret;
+
+	/* alloc */
+	func = kzalloc(sizeof(*func), GFP_KERNEL);
+	if (!func)
+		return NULL;
+
+	/* init */
+	INIT_LIST_HEAD(&func->list);
+	func->old_name = userfunc->old_name;
+	func->new_addr = (unsigned long)userfunc->new_func;
+	func->old_addr = userfunc->old_addr;
+	func->state = DISABLED;
+	ops = &func->fops;
+	ops->private = func;
+	ops->func = lpc_ftrace_handler;
+	ops->flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_DYNAMIC;
+
+	/* sysfs */
+	ret = kobject_init_and_add(&func->kobj, &lpc_ktype_func,
+				   root, func->old_name);
+	if (ret) {
+		kfree(func);
+		return NULL;
+	}
+
+	return func;
+}
+
+static int lpc_create_funcs(struct lpc_object *obj,
+			    struct lp_func *userfuncs)
+{
+	struct lp_func *userfunc;
+	struct lpc_func *func;
+
+	if (!userfuncs)
+		return -EINVAL;
+
+	for (userfunc = userfuncs; userfunc->old_name; userfunc++) {
+		func = lpc_create_func(&obj->kobj, userfunc);
+		if (!func)
+			goto free;
+		list_add(&func->list, &obj->funcs);
+	}
+	return 0;
+free:
+	lpc_free_funcs(obj);
+	return -ENOMEM;
+}
+
+static struct lpc_object *lpc_create_object(struct kobject *root,
+					    struct lp_object *userobj)
+{
+	struct lpc_object *obj;
+	int ret;
+
+	/* alloc */
+	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	if (!obj)
+		return NULL;
+
+	/* init */
+	INIT_LIST_HEAD(&obj->list);
+	obj->name = userobj->name;
+	obj->dynrelas = userobj->dynrelas;
+	obj->state = DISABLED;
+	/* obj->mod set by is_object_loaded() */
+	INIT_LIST_HEAD(&obj->funcs);
+
+	/* sysfs */
+	ret = kobject_init_and_add(&obj->kobj, &lpc_ktype_object,
+				   root, obj->name);
+	if (ret) {
+		kfree(obj);
+		return NULL;
+	}
+
+	/* create functions */
+	ret = lpc_create_funcs(obj, userobj->funcs);
+	if (ret) {
+		kobject_put(&obj->kobj);
+		return NULL;
+	}
+
+	return obj;
+}
+
+static int lpc_create_objects(struct lpc_patch *patch,
+			      struct lp_object *userobjs)
+{
+	struct lp_object *userobj;
+	struct lpc_object *obj;
+
+	if (!userobjs)
+		return -EINVAL;
+
+	for (userobj = userobjs; userobj->name; userobj++) {
+		obj = lpc_create_object(&patch->kobj, userobj);
+		if (!obj)
+			goto free;
+		list_add(&obj->list, &patch->objs);
+	}
+	return 0;
+free:
+	lpc_free_objects(patch);
+	return -ENOMEM;
+}
+
+static int lpc_create_patch(struct lp_patch *userpatch)
+{
+	struct lpc_patch *patch;
+	int ret;
+
+	/* alloc */
+	patch = kzalloc(sizeof(*patch), GFP_KERNEL);
+	if (!patch)
+		return -ENOMEM;
+
+	/* init */
+	INIT_LIST_HEAD(&patch->list);
+	patch->userpatch = userpatch;
+	patch->mod = userpatch->mod;
+	patch->state = DISABLED;
+	INIT_LIST_HEAD(&patch->objs);
+
+	/* sysfs */
+	ret = kobject_init_and_add(&patch->kobj, &lpc_ktype_patch,
+				   lpc_root_kobj, patch->mod->name);
+	if (ret) {
+		kfree(patch);
+		return ret;
+	}
+
+	/* create objects */
+	ret = lpc_create_objects(patch, userpatch->objs);
+	if (ret) {
+		kobject_put(&patch->kobj);
+		return ret;
+	}
+
+	/* add to global list of patches */
+	list_add(&patch->list, &lpc_patches);
+
+	return 0;
+}
+
+/************************************
+ * register/unregister
+ ***********************************/
+
+int lp_register_patch(struct lp_patch *userpatch)
+{
+	int ret;
+
+	if (!userpatch || !userpatch->mod || !userpatch->objs)
+		return -EINVAL;
+
+	/*
+	 * A reference is taken on the patch module to prevent it from being
+	 * unloaded.  Right now, we don't allow patch modules to unload since
+	 * there is currently no method to determine if a thread is still
+	 * running in the patched code contained in the patch module once
+	 * the ftrace registration is successful.
+	 */
+	if (!try_module_get(userpatch->mod))
+		return -ENODEV;
+
+	down(&lpc_mutex);
+	ret = lpc_create_patch(userpatch);
+	up(&lpc_mutex);
+	if (ret)
+		module_put(userpatch->mod);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(lp_register_patch);
+
+int lp_unregister_patch(struct lp_patch *userpatch)
+{
+	struct lpc_patch *patch;
+	int ret = 0;
+
+	down(&lpc_mutex);
+	patch = lpc_find_patch(userpatch);
+	if (!patch) {
+		ret = -ENODEV;
+		goto out;
+	}
+	if (patch->state == ENABLED) {
+		ret = -EINVAL;
+		goto out;
+	}
+	lpc_free_patch(patch);
+out:
+	up(&lpc_mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(lp_unregister_patch);
+
+/************************************
+ * entry/exit
+ ************************************/
+
+static int lpc_init(void)
+{
+	int ret;
+
+	ret = register_module_notifier(&lp_module_nb);
+	if (ret)
+		return ret;
+
+	ret = lpc_create_root_kobj();
+	if (ret)
+		goto unregister;
+
+	return 0;
+unregister:
+	unregister_module_notifier(&lp_module_nb);
+	return ret;
+}
+
+static void lpc_exit(void)
+{
+	lpc_remove_root_kobj();
+	unregister_module_notifier(&lp_module_nb);
+}
+
+module_init(lpc_init);
+module_exit(lpc_exit);
+MODULE_DESCRIPTION("Live Kernel Patching Core");
+MODULE_LICENSE("GPL");
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
@ 2014-11-06 15:11   ` Jiri Kosina
  2014-11-06 16:20     ` Seth Jennings
  2014-11-06 15:51   ` Jiri Slaby
                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 73+ messages in thread
From: Jiri Kosina @ 2014-11-06 15:11 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Thu, 6 Nov 2014, Seth Jennings wrote:

> This commit introduces code for the live patching core.  It implements
> an ftrace-based mechanism and kernel interface for doing live patching
> of kernel and kernel module functions.
> 
> It represents the greatest common functionality set between kpatch and
> kgraft and can accept patches built using either method.
> 
> This first version does not implement any consistency mechanism that
> ensures that old and new code do not run together.  In practice, ~90% of
> CVEs are safe to apply in this way, since they simply add a conditional
> check.  However, any function change that can not execute safely with
> the old version of the function can _not_ be safely applied in this
> version.

Thanks a lot for having started the work on this!

We will be reviewing it carefully in the coming days and will be getting 
back to you (I was surprised to see that the diffstat indicates that it's 
actually more code than our whole kgraft implementation including the 
consistency model :) ).

I have one question right away though.

> +/****************************************
> + * dynamic relocations (load-time linker)
> + ****************************************/
> +
> +/*
> + * external symbols are located outside the parent object (where the parent
> + * object is either vmlinux or the kmod being patched).
> + */

I have no idea what dynrela is, and quickly reading the source doesn't 
really help too much.

Could you please provide some explanation / pointer to some documentation, 
explaining what exactly it is, and why should it be part of the common 
infrastructure?

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
  2014-11-06 15:11   ` Jiri Kosina
@ 2014-11-06 15:51   ` Jiri Slaby
  2014-11-06 16:57     ` Seth Jennings
  2014-11-30 12:23     ` Pavel Machek
  2014-11-06 20:02   ` Steven Rostedt
                     ` (5 subsequent siblings)
  7 siblings, 2 replies; 73+ messages in thread
From: Jiri Slaby @ 2014-11-06 15:51 UTC (permalink / raw)
  To: Seth Jennings, Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt
  Cc: live-patching, kpatch, linux-kernel

On 11/06/2014, 03:39 PM, Seth Jennings wrote:
> This commit introduces code for the live patching core.  It implements
> an ftrace-based mechanism and kernel interface for doing live patching
> of kernel and kernel module functions.

Hi,

nice! So we have something to start with. Brilliant!

I have some comments below now. Yet, it obviously needs deeper review
which will take more time.

> --- /dev/null
> +++ b/include/linux/livepatch.h
> @@ -0,0 +1,45 @@
> +#ifndef _LIVEPATCH_H_
> +#define _LIVEPATCH_H_

This should follow the linux kernel naming: LINUX_LIVEPATCH_H


> +#include <linux/module.h>
> +
> +struct lp_func {

I am not very happy with "lp", which effectively means parallel printer
support. What about lip?

> +	const char *old_name; /* function to be patched */
> +	void *new_func; /* replacement function in patch module */
> +	/*
> +	 * The old_addr field is optional and can be used to resolve
> +	 * duplicate symbol names in the vmlinux object.  If this
> +	 * information is not present, the symbol is located by name
> +	 * with kallsyms. If the name is not unique and old_addr is
> +	 * not provided, the patch application fails as there is no
> +	 * way to resolve the ambiguity.
> +	 */
> +	unsigned long old_addr;
> +};
>
> +struct lp_dynrela {
> +	unsigned long dest;
> +	unsigned long src;
> +	unsigned long type;
> +	const char *name;
> +	int addend;
> +	int external;
> +};
> +
> +struct lp_object {
> +	const char *name; /* "vmlinux" or module name */
> +	struct lp_func *funcs;
> +	struct lp_dynrela *dynrelas;
> +};
> +
> +struct lp_patch {
> +	struct module *mod; /* module containing the patch */
> +	struct lp_object *objs;
> +};

Please document all the structures and all its members. And use
kernel-doc format for that. (You can take an inspiration in kgraft.)

> +int lp_register_patch(struct lp_patch *);
> +int lp_unregister_patch(struct lp_patch *);
> +int lp_enable_patch(struct lp_patch *);
> +int lp_disable_patch(struct lp_patch *);
> +
> +#endif /* _LIVEPATCH_H_ */

...

> --- /dev/null
> +++ b/kernel/livepatch/Makefile
> @@ -0,0 +1,3 @@
> +obj-$(CONFIG_LIVE_PATCHING) += livepatch.o
> +
> +livepatch-objs := core.o
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> new file mode 100644
> index 0000000..b32dbb5
> --- /dev/null
> +++ b/kernel/livepatch/core.c
> @@ -0,0 +1,1020 @@

...

> +/*************************************
> + * Core structures
> + ************************************/
> +
> +/*
> + * lp_ structs vs lpc_ structs
> + *
> + * For each element (patch, object, func) in the live-patching code,
> + * there are two types with two different prefixes: lp_ and lpc_.
> + *
> + * Structures used by the live-patch modules to register with this core module
> + * are prefixed with lp_ (live patching).  These structures are part of the
> + * registration API and are defined in livepatch.h.  The structures used
> + * internally by this core module are prefixed with lpc_ (live patching core).
> + */

I am not sure if the separation and the allocations/kobj handling are
worth it. It makes the code really less understandable. Can we have just
struct lip_function (don't unnecessarily abbreviate), lip_objectfile
(object is too generic, like Java object) and lip_patch containing all
the needed information? It would clean up the code a lot. (Yes, we would
have profited from c++ here.)

> +static DEFINE_SEMAPHORE(lpc_mutex);

Ugh, those are deprecated. Use a mutex. (Or am I missing a need for
recursive locking?)

> +static LIST_HEAD(lpc_patches);
> +
> +enum lpc_state {
> +	DISABLED,
> +	ENABLED

These are too generic names. This is prone to conflicts in the tree.

> +};
> +
> +struct lpc_func {
> +	struct list_head list;
> +	struct kobject kobj;
> +	struct ftrace_ops fops;
> +	enum lpc_state state;
> +
> +	const char *old_name;

So you do lpc_func->old_name = lp_func->old_name.

Why? Duplication is always bad and introduces errors. The same for the
other members here and there. Well, lip_function would solve that.

> +	unsigned long new_addr;
> +	unsigned long old_addr;
> +};
> +
> +struct lpc_object {
> +	struct list_head list;
> +	struct kobject kobj;
> +	struct module *mod; /* module associated with object */
> +	enum lpc_state state;
> +
> +	const char *name;
> +	struct list_head funcs;
> +	struct lp_dynrela *dynrelas;
> +};
> +
> +struct lpc_patch {
> +	struct list_head list;
> +	struct kobject kobj;
> +	struct lp_patch *userpatch; /* for correlation during unregister */
> +	enum lpc_state state;
> +
> +	struct module *mod;
> +	struct list_head objs;
> +};
> +
> +/*******************************************
> + * Helpers
> + *******************************************/
> +
> +/* sets obj->mod if object is not vmlinux and module was found */
> +static bool is_object_loaded(struct lpc_object *obj)

Always prefix function names. We try to avoid kallsyms duplicates ;).

> +{
> +	struct module *mod;
> +
> +	if (!strcmp(obj->name, "vmlinux"))
> +		return 1;
> +
> +	mutex_lock(&module_mutex);
> +	mod = find_module(obj->name);
> +	mutex_unlock(&module_mutex);
> +	obj->mod = mod;

This is racy. The module can already be gone by now, right?

> +
> +	return !!mod;
> +}
> +
> +/************************************
> + * kallsyms
> + ***********************************/
> +
> +struct lpc_find_arg {
> +	const char *objname;
> +	const char *name;
> +	unsigned long addr;
> +	/*
> +	 * If count == 0, the symbol was not found. If count == 1, a unique
> +	 * match was found and addr is set.  If count > 1, there is
> +	 * unresolvable ambiguity among "count" number of symbols with the same
> +	 * name in the same object.
> +	 */
> +	unsigned long count;
> +};

...

> +static int lpc_find_symbol(const char *objname, const char *name,
> +			   unsigned long *addr)

The first two params can be const, right?

> +{
> +	struct lpc_find_arg args = {
> +		.objname = objname,
> +		.name = name,
> +		.addr = 0,
> +		.count = 0
> +	};
> +
> +	if (objname && !strcmp(objname, "vmlinux"))
> +		args.objname = NULL;
> +
> +	kallsyms_on_each_symbol(lpc_find_callback, &args);
> +
> +	if (args.count == 0)
> +		pr_err("symbol '%s' not found in symbol table\n", name);
> +	else if (args.count > 1)
> +		pr_err("unresolvable ambiguity (%lu matches) on symbol '%s' in object '%s'\n",
> +		       args.count, name, objname);
> +	else {
> +		*addr = args.addr;
> +		return 0;
> +	}
> +
> +	*addr = 0;
> +	return -EINVAL;
> +}

...

> +/****************************************
> + * dynamic relocations (load-time linker)
> + ****************************************/

I am skipping this now (see Jiri's e-mail).

> +/***********************************
> + * ftrace registration
> + **********************************/
> +
> +static void lpc_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> +			       struct ftrace_ops *ops, struct pt_regs *regs)
> +{
> +	struct lpc_func *func = ops->private;
> +
> +	regs->ip = func->new_addr;
> +}
> +
> +static int lpc_enable_func(struct lpc_func *func)
> +{
> +	int ret;
> +
> +	BUG_ON(!func->old_addr);
> +	BUG_ON(func->state != DISABLED);

No BUGs please, just return appropriately. Possibly with WARN_ON.

> +	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 0, 0);
> +	if (ret) {
> +		pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> +		       func->old_name, ret);
> +		return ret;
> +	}
> +	ret = register_ftrace_function(&func->fops);
> +	if (ret) {
> +		pr_err("failed to register ftrace handler for function '%s' (%d)\n",
> +		       func->old_name, ret);
> +		ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
> +	} else
> +		func->state = ENABLED;
> +
> +	return ret;
> +}
> +
> +static int lpc_unregister_func(struct lpc_func *func)
> +{
> +	int ret;
> +
> +	BUG_ON(func->state != ENABLED);

Detto.

> +	if (!func->old_addr)
> +		/* parent object is not loaded */
> +		return 0;
> +	ret = unregister_ftrace_function(&func->fops);
> +	if (ret) {
> +		pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
> +		       func->old_name, ret);
> +		return ret;
> +	}
> +	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
> +	if (ret)
> +		pr_warn("function unregister succeeded but failed to clear the filter\n");
> +	func->state = DISABLED;
> +
> +	return 0;
> +}

> +/******************************
> + * enable/disable
> + ******************************/

...

> +/* must be called with lpc_mutex held */
> +static int lpc_enable_patch(struct lpc_patch *patch)

The question I want to raise here is whether we need two-state
registration: register+enable. We don't in kGraft. Why do you?

> +{
> +	struct lpc_object *obj;
> +	int ret;
> +
> +	BUG_ON(patch->state != DISABLED);

No bugs...

> +
> +	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> +	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
> +
> +	pr_notice("enabling patch '%s'\n", patch->mod->name);
> +
> +	list_for_each_entry(obj, &patch->objs, list) {
> +		if (!is_object_loaded(obj))
> +			continue;
> +		ret = lpc_enable_object(patch->mod, obj);
> +		if (ret)
> +			goto unregister;
> +	}
> +	patch->state = ENABLED;
> +	return 0;
> +
> +unregister:
> +	WARN_ON(lpc_disable_patch(patch));
> +	return ret;
> +}
> +
> +int lp_enable_patch(struct lp_patch *userpatch)
> +{
> +	struct lpc_patch *patch;
> +	int ret;
> +
> +	down(&lpc_mutex);
> +	patch = lpc_find_patch(userpatch);
> +	if (!patch) {
> +		ret = -ENODEV;
> +		goto out;
> +	}
> +	ret = lpc_enable_patch(patch);
> +out:
> +	up(&lpc_mutex);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(lp_enable_patch);

...

> +/************************************
> + * register/unregister
> + ***********************************/
> +
> +int lp_register_patch(struct lp_patch *userpatch)

This and other guys forming the interface should be documented.

> +{
> +	int ret;
> +
> +	if (!userpatch || !userpatch->mod || !userpatch->objs)
> +		return -EINVAL;
> +
> +	/*
> +	 * A reference is taken on the patch module to prevent it from being
> +	 * unloaded.  Right now, we don't allow patch modules to unload since
> +	 * there is currently no method to determine if a thread is still
> +	 * running in the patched code contained in the patch module once
> +	 * the ftrace registration is successful.
> +	 */
> +	if (!try_module_get(userpatch->mod))
> +		return -ENODEV;
> +
> +	down(&lpc_mutex);
> +	ret = lpc_create_patch(userpatch);
> +	up(&lpc_mutex);
> +	if (ret)
> +		module_put(userpatch->mod);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(lp_register_patch);

...


Thanks for the work!

-- 
js
suse labs

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 15:11   ` Jiri Kosina
@ 2014-11-06 16:20     ` Seth Jennings
  2014-11-06 16:32       ` Josh Poimboeuf
                         ` (2 more replies)
  0 siblings, 3 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-06 16:20 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Josh Poimboeuf, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Thu, Nov 06, 2014 at 04:11:37PM +0100, Jiri Kosina wrote:
> On Thu, 6 Nov 2014, Seth Jennings wrote:
> 
> > This commit introduces code for the live patching core.  It implements
> > an ftrace-based mechanism and kernel interface for doing live patching
> > of kernel and kernel module functions.
> > 
> > It represents the greatest common functionality set between kpatch and
> > kgraft and can accept patches built using either method.
> > 
> > This first version does not implement any consistency mechanism that
> > ensures that old and new code do not run together.  In practice, ~90% of
> > CVEs are safe to apply in this way, since they simply add a conditional
> > check.  However, any function change that can not execute safely with
> > the old version of the function can _not_ be safely applied in this
> > version.
> 
> Thanks a lot for having started the work on this!
> 
> We will be reviewing it carefully in the coming days and will be getting 
> back to you (I was surprised to see that the diffstat indicates that it's 
> actually more code than our whole kgraft implementation including the 
> consistency model :) ).

The structure allocation and sysfs stuff is a lot of (mundane) code.
Lots of boring error path handling too.

Plus, I show that kernel/kgraft.c + kernel/kgraft_files.c is
906+193=1099.  I'd say they are about the same size :)

> 
> I have one question right away though.
> 
> > +/****************************************
> > + * dynamic relocations (load-time linker)
> > + ****************************************/
> > +
> > +/*
> > + * external symbols are located outside the parent object (where the parent
> > + * object is either vmlinux or the kmod being patched).
> > + */
> 
> I have no idea what dynrela is, and quickly reading the source doesn't 
> really help too much.
> 
> Could you please provide some explanation / pointer to some documentation, 
> explaining what exactly it is, and why should it be part of the common 
> infrastructure?

Yes, I should explain it.

This is something that is currently only used in the kpatch approach.
It allows the patching core to do dynamic relocations on the new
function code, similar to what the kernel module linker does, but this
works for non-exported symbols as well.

This is so the patch module doesn't have to do a kallsyms lookup on
every non-exported symbol that the new functions use.

The fields of the dynrela structure are those of a normal ELF rela
entry, except for the "external" field, which conveys information about
where the core module should go looking for the symbol referenced in the
dynrela entry.

Josh was under the impression that Vojtech was ok with putting the
dynrela stuff in the core.  Is that not correct (misunderstanding)?

Thanks,
Seth

> 
> Thanks,
> 
> -- 
> Jiri Kosina
> SUSE Labs

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 16:20     ` Seth Jennings
@ 2014-11-06 16:32       ` Josh Poimboeuf
  2014-11-06 18:00       ` Vojtech Pavlik
  2014-11-06 22:20       ` Jiri Kosina
  2 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-06 16:32 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Jiri Kosina, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Thu, Nov 06, 2014 at 10:20:49AM -0600, Seth Jennings wrote:
> On Thu, Nov 06, 2014 at 04:11:37PM +0100, Jiri Kosina wrote:
> > On Thu, 6 Nov 2014, Seth Jennings wrote:
> > > +/****************************************
> > > + * dynamic relocations (load-time linker)
> > > + ****************************************/
> > > +
> > > +/*
> > > + * external symbols are located outside the parent object (where the parent
> > > + * object is either vmlinux or the kmod being patched).
> > > + */
> > 
> > I have no idea what dynrela is, and quickly reading the source doesn't 
> > really help too much.
> > 
> > Could you please provide some explanation / pointer to some documentation, 
> > explaining what exactly it is, and why should it be part of the common 
> > infrastructure?
> 
> Yes, I should explain it.
> 
> This is something that is currently only used in the kpatch approach.
> It allows the patching core to do dynamic relocations on the new
> function code, similar to what the kernel module linker does, but this
> works for non-exported symbols as well.
> 
> This is so the patch module doesn't have to do a kallsyms lookup on
> every non-exported symbol that the new functions use.
> 
> The fields of the dynrela structure are those of a normal ELF rela
> entry, except for the "external" field, which conveys information about
> where the core module should go looking for the symbol referenced in the
> dynrela entry.

BTW, use of the dynrelas is optional, but highly recommended.  The
kGraft approach of manually doing a kallsyms lookup for each
non-exported symbol is inherently dangerous because of duplicate
symbols.

-- 
Josh

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 15:51   ` Jiri Slaby
@ 2014-11-06 16:57     ` Seth Jennings
  2014-11-06 17:12       ` Josh Poimboeuf
  2014-11-07 18:21       ` Petr Mladek
  2014-11-30 12:23     ` Pavel Machek
  1 sibling, 2 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-06 16:57 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 04:51:02PM +0100, Jiri Slaby wrote:
> On 11/06/2014, 03:39 PM, Seth Jennings wrote:
> > This commit introduces code for the live patching core.  It implements
> > an ftrace-based mechanism and kernel interface for doing live patching
> > of kernel and kernel module functions.
> 
> Hi,
> 
> nice! So we have something to start with. Brilliant!
> 
> I have some comments below now. Yet, it obviously needs deeper review
> which will take more time.
> 
> > --- /dev/null
> > +++ b/include/linux/livepatch.h
> > @@ -0,0 +1,45 @@
> > +#ifndef _LIVEPATCH_H_
> > +#define _LIVEPATCH_H_
> 
> This should follow the linux kernel naming: LINUX_LIVEPATCH_H

Didn't realize that was the convention.  Just to be sure, you meant
_LINUX_LIVEPATCH_H right (with the leading underscore)?

> 
> 
> > +#include <linux/module.h>
> > +
> > +struct lp_func {
> 
> I am not very happy with "lp", which effectively means parallel printer
> support. What about lip?

Not sure how much clearer lip is.  It isn't for me :-/  I'm not opposed
to changing it.  I was just trying to keep the name short since it is
used many times.  Reducing the prefix from something like "livepatch_"
to "lp_" seemed to be the shortest and most straightforward way.

> 
> > +	const char *old_name; /* function to be patched */
> > +	void *new_func; /* replacement function in patch module */
> > +	/*
> > +	 * The old_addr field is optional and can be used to resolve
> > +	 * duplicate symbol names in the vmlinux object.  If this
> > +	 * information is not present, the symbol is located by name
> > +	 * with kallsyms. If the name is not unique and old_addr is
> > +	 * not provided, the patch application fails as there is no
> > +	 * way to resolve the ambiguity.
> > +	 */
> > +	unsigned long old_addr;
> > +};
> >
> > +struct lp_dynrela {
> > +	unsigned long dest;
> > +	unsigned long src;
> > +	unsigned long type;
> > +	const char *name;
> > +	int addend;
> > +	int external;
> > +};
> > +
> > +struct lp_object {
> > +	const char *name; /* "vmlinux" or module name */
> > +	struct lp_func *funcs;
> > +	struct lp_dynrela *dynrelas;
> > +};
> > +
> > +struct lp_patch {
> > +	struct module *mod; /* module containing the patch */
> > +	struct lp_object *objs;
> > +};
> 
> Please document all the structures and all its members. And use
> kernel-doc format for that. (You can take an inspiration in kgraft.)

Sure.

> 
> > +int lp_register_patch(struct lp_patch *);
> > +int lp_unregister_patch(struct lp_patch *);
> > +int lp_enable_patch(struct lp_patch *);
> > +int lp_disable_patch(struct lp_patch *);
> > +
> > +#endif /* _LIVEPATCH_H_ */
> 
> ...
> 
> > --- /dev/null
> > +++ b/kernel/livepatch/Makefile
> > @@ -0,0 +1,3 @@
> > +obj-$(CONFIG_LIVE_PATCHING) += livepatch.o
> > +
> > +livepatch-objs := core.o
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > new file mode 100644
> > index 0000000..b32dbb5
> > --- /dev/null
> > +++ b/kernel/livepatch/core.c
> > @@ -0,0 +1,1020 @@
> 
> ...
> 
> > +/*************************************
> > + * Core structures
> > + ************************************/
> > +
> > +/*
> > + * lp_ structs vs lpc_ structs
> > + *
> > + * For each element (patch, object, func) in the live-patching code,
> > + * there are two types with two different prefixes: lp_ and lpc_.
> > + *
> > + * Structures used by the live-patch modules to register with this core module
> > + * are prefixed with lp_ (live patching).  These structures are part of the
> > + * registration API and are defined in livepatch.h.  The structures used
> > + * internally by this core module are prefixed with lpc_ (live patching core).
> > + */
> 
> I am not sure if the separation and the allocations/kobj handling are
> worth it. It makes the code really less understandable. Can we have just
> struct lip_function (don't unnecessarily abbreviate), lip_objectfile
> (object is too generic, like Java object) and lip_patch containing all
> the needed information? It would clean up the code a lot. (Yes, we would
> have profited from c++ here.)

I looked at doing this and this is actually what we did in kpatch.  We
made one structure that had "private" members that the user wasn't
supposed to access that were only used in the core.  This was messy
though.  Every time you wanted to add a "private" field to the struct so
the core could do something new, you were changing the API to the patch
modules as well.  While copying the data into an internal structure does
add code and opportunity for errors, that functionality is localized
into functions that are specifically tasked with taking care of that.
So the risk is minimized and we gain flexibility within the core and
more self-documenting API structures.

> 
> > +static DEFINE_SEMAPHORE(lpc_mutex);
> 
> Ugh, those are deprecated. Use a mutex. (Or am I missing a need for
> recursive locking?)

Sure.

> 
> > +static LIST_HEAD(lpc_patches);
> > +
> > +enum lpc_state {
> > +	DISABLED,
> > +	ENABLED
> 
> These are too generic names. This is prone to conflicts in the tree.

Add LPC_ prefix good enough?

> 
> > +};
> > +
> > +struct lpc_func {
> > +	struct list_head list;
> > +	struct kobject kobj;
> > +	struct ftrace_ops fops;
> > +	enum lpc_state state;
> > +
> > +	const char *old_name;
> 
> So you do lpc_func->old_name = lp_func->old_name.
> 
> Why? Duplication is always bad and introduces errors. The same for the
> other members here and there. Well, lip_function would solve that.

See earlier comment.

> 
> > +	unsigned long new_addr;
> > +	unsigned long old_addr;
> > +};
> > +
> > +struct lpc_object {
> > +	struct list_head list;
> > +	struct kobject kobj;
> > +	struct module *mod; /* module associated with object */
> > +	enum lpc_state state;
> > +
> > +	const char *name;
> > +	struct list_head funcs;
> > +	struct lp_dynrela *dynrelas;
> > +};
> > +
> > +struct lpc_patch {
> > +	struct list_head list;
> > +	struct kobject kobj;
> > +	struct lp_patch *userpatch; /* for correlation during unregister */
> > +	enum lpc_state state;
> > +
> > +	struct module *mod;
> > +	struct list_head objs;
> > +};
> > +
> > +/*******************************************
> > + * Helpers
> > + *******************************************/
> > +
> > +/* sets obj->mod if object is not vmlinux and module was found */
> > +static bool is_object_loaded(struct lpc_object *obj)
> 
> Always prefix function names. We try to avoid kallsyms duplicates ;).

Sure.

> 
> > +{
> > +	struct module *mod;
> > +
> > +	if (!strcmp(obj->name, "vmlinux"))
> > +		return 1;
> > +
> > +	mutex_lock(&module_mutex);
> > +	mod = find_module(obj->name);
> > +	mutex_unlock(&module_mutex);
> > +	obj->mod = mod;
> 
> This is racy. The module can already be gone by now, right?

Yes, we should take a ref on the module before releasing the
module_mutex.

> 
> > +
> > +	return !!mod;
> > +}
> > +
> > +/************************************
> > + * kallsyms
> > + ***********************************/
> > +
> > +struct lpc_find_arg {
> > +	const char *objname;
> > +	const char *name;
> > +	unsigned long addr;
> > +	/*
> > +	 * If count == 0, the symbol was not found. If count == 1, a unique
> > +	 * match was found and addr is set.  If count > 1, there is
> > +	 * unresolvable ambiguity among "count" number of symbols with the same
> > +	 * name in the same object.
> > +	 */
> > +	unsigned long count;
> > +};
> 
> ...
> 
> > +static int lpc_find_symbol(const char *objname, const char *name,
> > +			   unsigned long *addr)
> 
> The first two params can be const, right?

Not following.  The first two params _are_ const.  If you are saying they
should be... I agree! :)

> 
> > +{
> > +	struct lpc_find_arg args = {
> > +		.objname = objname,
> > +		.name = name,
> > +		.addr = 0,
> > +		.count = 0
> > +	};
> > +
> > +	if (objname && !strcmp(objname, "vmlinux"))
> > +		args.objname = NULL;
> > +
> > +	kallsyms_on_each_symbol(lpc_find_callback, &args);
> > +
> > +	if (args.count == 0)
> > +		pr_err("symbol '%s' not found in symbol table\n", name);
> > +	else if (args.count > 1)
> > +		pr_err("unresolvable ambiguity (%lu matches) on symbol '%s' in object '%s'\n",
> > +		       args.count, name, objname);
> > +	else {
> > +		*addr = args.addr;
> > +		return 0;
> > +	}
> > +
> > +	*addr = 0;
> > +	return -EINVAL;
> > +}
> 
> ...
> 
> > +/****************************************
> > + * dynamic relocations (load-time linker)
> > + ****************************************/
> 
> I am skipping this now (see Jiri's e-mail).
> 
> > +/***********************************
> > + * ftrace registration
> > + **********************************/
> > +
> > +static void lpc_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > +			       struct ftrace_ops *ops, struct pt_regs *regs)
> > +{
> > +	struct lpc_func *func = ops->private;
> > +
> > +	regs->ip = func->new_addr;
> > +}
> > +
> > +static int lpc_enable_func(struct lpc_func *func)
> > +{
> > +	int ret;
> > +
> > +	BUG_ON(!func->old_addr);
> > +	BUG_ON(func->state != DISABLED);
> 
> No BUGs please, just return appropriately. Possibly with WARN_ON.

Sure.

> 
> > +	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 0, 0);
> > +	if (ret) {
> > +		pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> > +		       func->old_name, ret);
> > +		return ret;
> > +	}
> > +	ret = register_ftrace_function(&func->fops);
> > +	if (ret) {
> > +		pr_err("failed to register ftrace handler for function '%s' (%d)\n",
> > +		       func->old_name, ret);
> > +		ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
> > +	} else
> > +		func->state = ENABLED;
> > +
> > +	return ret;
> > +}
> > +
> > +static int lpc_unregister_func(struct lpc_func *func)
> > +{
> > +	int ret;
> > +
> > +	BUG_ON(func->state != ENABLED);
> 
> Ditto.

Ok.

> 
> > +	if (!func->old_addr)
> > +		/* parent object is not loaded */
> > +		return 0;
> > +	ret = unregister_ftrace_function(&func->fops);
> > +	if (ret) {
> > +		pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
> > +		       func->old_name, ret);
> > +		return ret;
> > +	}
> > +	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
> > +	if (ret)
> > +		pr_warn("function unregister succeeded but failed to clear the filter\n");
> > +	func->state = DISABLED;
> > +
> > +	return 0;
> > +}
> 
> > +/******************************
> > + * enable/disable
> > + ******************************/
> 
> ...
> 
> > +/* must be called with lpc_mutex held */
> > +static int lpc_enable_patch(struct lpc_patch *patch)
> 
> The question I want to raise here is whether we need two-state
> registration: register+enable. We don't in kGraft. Why do you?

We actually don't in kpatch either and this was a late change for this
patchset.  The thinking was that, while the patch modules would normally
call lpc_register_patch() and lpc_enable_patch() in the same way all the
time, breaking them up creates more symmetric code and gives more flexibility
to the API.

Josh might like to elaborate here.

> 
> > +{
> > +	struct lpc_object *obj;
> > +	int ret;
> > +
> > +	BUG_ON(patch->state != DISABLED);
> 
> No bugs...

Ok.

> 
> > +
> > +	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> > +	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
> > +
> > +	pr_notice("enabling patch '%s'\n", patch->mod->name);
> > +
> > +	list_for_each_entry(obj, &patch->objs, list) {
> > +		if (!is_object_loaded(obj))
> > +			continue;
> > +		ret = lpc_enable_object(patch->mod, obj);
> > +		if (ret)
> > +			goto unregister;
> > +	}
> > +	patch->state = ENABLED;
> > +	return 0;
> > +
> > +unregister:
> > +	WARN_ON(lpc_disable_patch(patch));
> > +	return ret;
> > +}
> > +
> > +int lp_enable_patch(struct lp_patch *userpatch)
> > +{
> > +	struct lpc_patch *patch;
> > +	int ret;
> > +
> > +	down(&lpc_mutex);
> > +	patch = lpc_find_patch(userpatch);
> > +	if (!patch) {
> > +		ret = -ENODEV;
> > +		goto out;
> > +	}
> > +	ret = lpc_enable_patch(patch);
> > +out:
> > +	up(&lpc_mutex);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(lp_enable_patch);
> 
> ...
> 
> > +/************************************
> > + * register/unregister
> > + ***********************************/
> > +
> > +int lp_register_patch(struct lp_patch *userpatch)
> 
> This and other guys forming the interface should be documented.

Yes.

Thanks,
Seth

> 
> > +{
> > +	int ret;
> > +
> > +	if (!userpatch || !userpatch->mod || !userpatch->objs)
> > +		return -EINVAL;
> > +
> > +	/*
> > +	 * A reference is taken on the patch module to prevent it from being
> > +	 * unloaded.  Right now, we don't allow patch modules to unload since
> > +	 * there is currently no method to determine if a thread is still
> > +	 * running in the patched code contained in the patch module once
> > +	 * the ftrace registration is successful.
> > +	 */
> > +	if (!try_module_get(userpatch->mod))
> > +		return -ENODEV;
> > +
> > +	down(&lpc_mutex);
> > +	ret = lpc_create_patch(userpatch);
> > +	up(&lpc_mutex);
> > +	if (ret)
> > +		module_put(userpatch->mod);
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(lp_register_patch);
> 
> ...
> 
> 
> Thanks for the work!
> 
> -- 
> js
> suse labs

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 16:57     ` Seth Jennings
@ 2014-11-06 17:12       ` Josh Poimboeuf
  2014-11-07 18:21       ` Petr Mladek
  1 sibling, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-06 17:12 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Jiri Slaby, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 10:57:48AM -0600, Seth Jennings wrote:
> On Thu, Nov 06, 2014 at 04:51:02PM +0100, Jiri Slaby wrote:
> > On 11/06/2014, 03:39 PM, Seth Jennings wrote:
> > > +/* must be called with lpc_mutex held */
> > > +static int lpc_enable_patch(struct lpc_patch *patch)
> > 
> > The question I want to raise here is whether we need two-state
> > registration: register+enable. We don't in kGraft. Why do you?
> 
> We actually don't in kpatch either and this was a late change for this
> patchset.  The thinking was that, while the patch modules would normally
> call lpc_register_patch() and lpc_enable_patch() in the same way all the
> time, breaking them up creates more symmetric code and gives more flexibility
> to the API.
> 
> Josh might like to elaborate here.

Yes, this was my brilliant idea :-)  I like it because it makes the
register/unregister interfaces more symmetrical.

We already have to separate disable and unregister so that a patch can
be disabled from sysfs, so it makes sense IMO to likewise separate
register and enable.

The downside is an extra function call.  The upside is it makes the code
cleaner, and the API easier to understand and more flexible.

-- 
Josh


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 16:20     ` Seth Jennings
  2014-11-06 16:32       ` Josh Poimboeuf
@ 2014-11-06 18:00       ` Vojtech Pavlik
  2014-11-06 22:20       ` Jiri Kosina
  2 siblings, 0 replies; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-06 18:00 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Jiri Kosina, Josh Poimboeuf, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Thu, Nov 06, 2014 at 10:20:49AM -0600, Seth Jennings wrote:

> Yes, I should explain it.
> 
> This is something that is currently only used in the kpatch approach.
> It allows the patching core to do dynamic relocations on the new
> function code, similar to what the kernel module linker does, but this
> works for non-exported symbols as well.
> 
> This is so the patch module doesn't have to do a kallsyms lookup on
> every non-exported symbol that the new functions use.
> 
> The fields of the dynrela structure are those of a normal ELF rela
> entry, except for the "external" field, which conveys information about
> where the core module should go looking for the symbol referenced in the
> dynrela entry.
> 
> Josh was under the impression that Vojtech was ok with putting the
> dynrela stuff in the core.  Is that not correct (misunderstanding)?

Yes, that is correct, as obviously the kpatch way of generating patches
by extracting code from a compiled kernel would not be viable without
it.

For our own kGraft usage we're choosing to compile patches from C
source, and there we can simply replace the function calls by calls via
pointer looked up through kallsyms.
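
That is, instead of a direct call, the hand-written patch resolves the symbol
once and calls it through a pointer — roughly like this sketch (where
"some_unexported_helper" is a made-up symbol name used only for illustration):

```c
/* sketch: calling a non-exported function from a patch module */
static int (*helper_fn)(int arg);	/* resolved at module load time */

static int patch_module_init(void)
{
	helper_fn = (void *)kallsyms_lookup_name("some_unexported_helper");
	if (!helper_fn)
		return -ENOENT;
	return 0;
}

/*
 * Replacement functions then call helper_fn(x) wherever the original
 * code called some_unexported_helper(x) directly.
 */
```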

However, kGraft also has tools to create patches in an automated way,
where the individual functions are extracted from the compiled patched
kernel using a modified objcopy, and this hits exactly the same issue
of having to relocate unexported symbols if any are referenced.

So no objection to the idea. We'll have to look more into the code to
comment on the implementation of the dynrela stuff.

-- 
Vojtech Pavlik
Director SUSE Labs


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 14:39 [PATCH 0/2] Kernel Live Patching Seth Jennings
  2014-11-06 14:39 ` [PATCH 1/2] kernel: add TAINT_LIVEPATCH Seth Jennings
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
@ 2014-11-06 18:44 ` Christoph Hellwig
  2014-11-06 18:51   ` Vojtech Pavlik
  2014-11-09 20:16 ` Greg KH
  3 siblings, 1 reply; 73+ messages in thread
From: Christoph Hellwig @ 2014-11-06 18:44 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 08:39:06AM -0600, Seth Jennings wrote:
> An example patch module can be found here:
> https://github.com/spartacus06/livepatch/blob/master/patch/patch.c

Please include the generator for this patch in the kernel tree.
Providing interfaces for out of tree modules (or generators) is not
how the kernel internal APIs work.



* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 18:44 ` [PATCH 0/2] Kernel Live Patching Christoph Hellwig
@ 2014-11-06 18:51   ` Vojtech Pavlik
  2014-11-06 18:58     ` Christoph Hellwig
  0 siblings, 1 reply; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-06 18:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Seth Jennings, Josh Poimboeuf, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 10:44:46AM -0800, Christoph Hellwig wrote:

> On Thu, Nov 06, 2014 at 08:39:06AM -0600, Seth Jennings wrote:
> > An example patch module can be found here:
> > https://github.com/spartacus06/livepatch/blob/master/patch/patch.c
> 
> Please include the generator for this patch in the kernel tree.
> Providing interfaces for out of tree modules (or generators) is not
> how the kernel internal APIs work.

I don't think this specific example was generated. 

I also don't think including the whole kpatch automation into the kernel
tree is a viable development model for it. (Same would apply for kGraft
automation.)

So I believe including one or more human produced examples makes most sense.

-- 
Vojtech Pavlik
Director SUSE Labs


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 18:51   ` Vojtech Pavlik
@ 2014-11-06 18:58     ` Christoph Hellwig
  2014-11-06 19:34       ` Josh Poimboeuf
  2014-11-06 20:24       ` Vojtech Pavlik
  0 siblings, 2 replies; 73+ messages in thread
From: Christoph Hellwig @ 2014-11-06 18:58 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Christoph Hellwig, Seth Jennings, Josh Poimboeuf, Jiri Kosina,
	Steven Rostedt, live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 07:51:57PM +0100, Vojtech Pavlik wrote:
> I don't think this specific example was generated. 
> 
> I also don't think including the whole kpatch automation into the kernel
> tree is a viable development model for it. (Same would apply for kGraft
> automation.)

Why?  We (IMHO incorrectly) used the argument of tight coupling to put
perf into the kernel tree.  Generating kernel live patches is so much
more tightly integrated that it absolutely has to go into the tree to
allow proper development on it in an integrated fashion.



* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 18:58     ` Christoph Hellwig
@ 2014-11-06 19:34       ` Josh Poimboeuf
  2014-11-06 19:49         ` Steven Rostedt
  2014-11-07  7:45         ` Christoph Hellwig
  2014-11-06 20:24       ` Vojtech Pavlik
  1 sibling, 2 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-06 19:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Vojtech Pavlik, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 10:58:57AM -0800, Christoph Hellwig wrote:
> On Thu, Nov 06, 2014 at 07:51:57PM +0100, Vojtech Pavlik wrote:
> > I don't think this specific example was generated. 

So there are two ways to use this live patching API: using a generated
module (e.g., using the kpatch-build tool) or manually compiling a
module via kbuild.

Vojtech's right, the provided example was not generated.  Maybe it
belongs in samples/livepatch?

> > 
> > I also don't think including the whole kpatch automation into the kernel
> > tree is a viable development model for it. (Same would apply for kGraft
> > automation.)
> 
> Why?  We (IMHO incorrectly) used the argument of tight coupling to put
> perf into the kernel tree.  Generating kernel live patches is so much
> more tightly integrated that it absolutely has to go into the tree to
> allow proper development on it in an integrated fashion.

I agree that we should also put kpatch-build (or some converged
kpatch/kGraft-build tool) into the kernel tree, because of the tight
interdependencies between it and the kernel.  I think it would make
development much easier.  Otherwise, for example, it may end up having a
lot of #ifdef hacks based on what kernel version it's targeting.

-- 
Josh


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 19:34       ` Josh Poimboeuf
@ 2014-11-06 19:49         ` Steven Rostedt
  2014-11-06 20:02           ` Josh Poimboeuf
  2014-11-07  7:46           ` Christoph Hellwig
  2014-11-07  7:45         ` Christoph Hellwig
  1 sibling, 2 replies; 73+ messages in thread
From: Steven Rostedt @ 2014-11-06 19:49 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Christoph Hellwig, Vojtech Pavlik, Seth Jennings, Jiri Kosina,
	live-patching, kpatch, linux-kernel

On Thu, 6 Nov 2014 13:34:33 -0600
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Thu, Nov 06, 2014 at 10:58:57AM -0800, Christoph Hellwig wrote:
> > On Thu, Nov 06, 2014 at 07:51:57PM +0100, Vojtech Pavlik wrote:
> > > I don't think this specific example was generated. 
> 
> So there are two ways to use this live patching API: using a generated
> module (e.g., using the kpatch-build tool) or manually compiling a
> module via kbuild.
> 
> Vojtech's right, the provided example was not generated.  Maybe it
> belongs in samples/livepatch?
> 

I understand that there are two methods of doing this. Is it possible to
create a "simple generator" that only handles the simple case? Perhaps it
can detect non-simple cases, reject the change, and tell the user they
need to reboot.

Something that isn't really related to either kpatch or kGraft, but can
be used for testing purposes?

-- Steve


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 19:49         ` Steven Rostedt
@ 2014-11-06 20:02           ` Josh Poimboeuf
  2014-11-07  7:46           ` Christoph Hellwig
  1 sibling, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-06 20:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Christoph Hellwig, Vojtech Pavlik, Seth Jennings, Jiri Kosina,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 02:49:26PM -0500, Steven Rostedt wrote:
> On Thu, 6 Nov 2014 13:34:33 -0600
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > On Thu, Nov 06, 2014 at 10:58:57AM -0800, Christoph Hellwig wrote:
> > > On Thu, Nov 06, 2014 at 07:51:57PM +0100, Vojtech Pavlik wrote:
> > > > I don't think this specific example was generated. 
> > 
> > So there are two ways to use this live patching API: using a generated
> > module (e.g., using the kpatch-build tool) or manually compiling a
> > module via kbuild.
> > 
> > Vojtech's right, the provided example was not generated.  Maybe it
> > belongs in samples/livepatch?
> > 
> 
> I understand that there are two methods of doing this. Is it possible to
> create a "simple generator" that only handles the simple case? Perhaps it
> can detect non-simple cases, reject the change, and tell the user they
> need to reboot.
> 
> Something that isn't really related to either kpatch or kGraft, but can
> be used for testing purposes?

For basic testing, a generator isn't needed.  You can just use kbuild to
compile a kmod from a human-created source file, a la kGraft.  For
example:

  https://github.com/spartacus06/livepatch/blob/master/patch/patch.c
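
For illustration, such a hand-written module might look roughly like the
sketch below.  Note the lp_* field names here are guesses inferred from the
lpc_* mirror structures quoted earlier in this thread, not the actual API:

```c
/* sketch of a hand-written live patch module (field names assumed) */
static int lp_new_function(void)
{
	return 0;	/* replacement implementation goes here */
}

static struct lp_func funcs[] = {
	{
		.old_name = "old_function",	/* symbol being replaced */
		.new_addr = (unsigned long)lp_new_function,
	},
	{ }	/* terminator */
};

static struct lp_object objs[] = {
	{ .name = "vmlinux", .funcs = funcs },
	{ }
};

static struct lp_patch patch = {
	.mod  = THIS_MODULE,
	.objs = objs,
};

static int __init patch_init(void)
{
	int ret = lp_register_patch(&patch);

	if (ret)
		return ret;
	ret = lp_enable_patch(&patch);
	if (ret)
		lp_unregister_patch(&patch);	/* assumed counterpart */
	return ret;
}
module_init(patch_init);
```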

-- 
Josh


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
  2014-11-06 15:11   ` Jiri Kosina
  2014-11-06 15:51   ` Jiri Slaby
@ 2014-11-06 20:02   ` Steven Rostedt
  2014-11-06 20:19     ` Seth Jennings
  2014-11-07 17:13   ` module notifier: was " Petr Mladek
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 73+ messages in thread
From: Steven Rostedt @ 2014-11-06 20:02 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, live-patching,
	kpatch, linux-kernel

On Thu,  6 Nov 2014 08:39:08 -0600
Seth Jennings <sjenning@redhat.com> wrote:

> --- /dev/null
> +++ b/kernel/livepatch/Kconfig
> @@ -0,0 +1,11 @@
> +config LIVE_PATCHING
> +	tristate "Live Kernel Patching"
> +	depends on DYNAMIC_FTRACE_WITH_REGS && MODULES && SYSFS && KALLSYMS_ALL
> +	default m

Nuke this default. This should be default 'n', which is what kconfig
defaults to when none is mentioned.

-- Steve

> +	help
> +	  Say Y here if you want to support live kernel patching.
> +	  This setting has no runtime impact until a live-patch
> +	  kernel module that uses the live-patch interface provided
> +	  by this option is loaded, resulting in calls to patched
> +	  functions being redirected to the new function code contained
> +	  in the live-patch module.


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 20:02   ` Steven Rostedt
@ 2014-11-06 20:19     ` Seth Jennings
  0 siblings, 0 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-06 20:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, live-patching,
	kpatch, linux-kernel

On Thu, Nov 06, 2014 at 03:02:04PM -0500, Steven Rostedt wrote:
> On Thu,  6 Nov 2014 08:39:08 -0600
> Seth Jennings <sjenning@redhat.com> wrote:
> 
> > --- /dev/null
> > +++ b/kernel/livepatch/Kconfig
> > @@ -0,0 +1,11 @@
> > +config LIVE_PATCHING
> > +	tristate "Live Kernel Patching"
> > +	depends on DYNAMIC_FTRACE_WITH_REGS && MODULES && SYSFS && KALLSYMS_ALL
> > +	default m
> 
> Nuke this default. This should be default 'n', which is what kconfig
> defaults to when none is mentioned.

Ok.

Thanks,
Seth

> 
> -- Steve
> 
> > +	help
> > +	  Say Y here if you want to support live kernel patching.
> > +	  This setting has no runtime impact until a live-patch
> > +	  kernel module that uses the live-patch interface provided
> > +	  by this option is loaded, resulting in calls to patched
> > +	  functions being redirected to the new function code contained
> > +	  in the live-patch module.


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 18:58     ` Christoph Hellwig
  2014-11-06 19:34       ` Josh Poimboeuf
@ 2014-11-06 20:24       ` Vojtech Pavlik
  2014-11-07  7:47         ` Christoph Hellwig
  2014-11-07 12:31         ` Josh Poimboeuf
  1 sibling, 2 replies; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-06 20:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Seth Jennings, Josh Poimboeuf, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 10:58:57AM -0800, Christoph Hellwig wrote:

> On Thu, Nov 06, 2014 at 07:51:57PM +0100, Vojtech Pavlik wrote:
> > I don't think this specific example was generated. 
> > 
> > I also don't think including the whole kpatch automation into the kernel
> > tree is a viable development model for it. (Same would apply for kGraft
> > automation.)
> 
> Why?  We (IMHO incorrectly) used the argument of tight coupling to put
> perf into the kernel tree.  Generating kernel live patches is so much
> more tightly integrated that it absolutely has to go into the tree to
> allow proper development on it in an integrated fashion.

One reason is that there are currently at least two generators using
very different methods of generation (in addition to the option of doing
the patch module by hand), and neither of them are currently in a state
where they would be ready for inclusion into the kernel (although the
kpatch one is clearly closer to that).

A generator is not required for using the infrastructure and is merely a
means of preparing the live patch with less effort.

I'm not opposed at all to adding the generator(s) eventually.

However, given that their use is optional, I would prefer not to have to
wait for the generator(s) to be finished and cleaned up before including
the in-kernel live patching infrastructure.

-- 
Vojtech Pavlik
Director SUSE Labs


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 16:20     ` Seth Jennings
  2014-11-06 16:32       ` Josh Poimboeuf
  2014-11-06 18:00       ` Vojtech Pavlik
@ 2014-11-06 22:20       ` Jiri Kosina
  2014-11-07 12:50         ` Josh Poimboeuf
  2 siblings, 1 reply; 73+ messages in thread
From: Jiri Kosina @ 2014-11-06 22:20 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Thu, 6 Nov 2014, Seth Jennings wrote:

> > Thanks a lot for having started the work on this!
> > 
> > We will be reviewing it carefully in the coming days and will getting back 
> > to you (I was surprised to see that that diffstat indicates that it's 
> > actually more code than our whole kgraft implementation including the 
> > consistency model :) ).
> 
> The structure allocation and sysfs stuff is a lot of (mundane) code.
> Lots of boring error path handling too.

Also, lpc_create_object(), lpc_create_func(), lpc_create_patch(), 
lpc_create_objects(), lpc_create_funcs(), ... they all are pretty much 
alike, and are asking for some kind of unification ... perhaps an iterator 
for generic structure initialization?

I am also not really fully convinced that we need the patch->object->funcs 
abstraction hierarchy (which also contributes to the structure allocation 
being rather spaghetti copy/paste code) ... wouldn't patch->funcs be 
sufficient, with the "object" being made just a property of the function, 
for example?

> Plus, I show that kernel/kgraft.c + kernel/kgraft_files.c is
> 906+193=1099.  I'd say they are about the same size :)

Which still seems to me to be a ratio worth trying to improve :)

Thanks,

-- 
Jiri Kosina
SUSE Labs


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 19:34       ` Josh Poimboeuf
  2014-11-06 19:49         ` Steven Rostedt
@ 2014-11-07  7:45         ` Christoph Hellwig
  1 sibling, 0 replies; 73+ messages in thread
From: Christoph Hellwig @ 2014-11-07  7:45 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Vojtech Pavlik, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 01:34:33PM -0600, Josh Poimboeuf wrote:
> I agree that we should also put kpatch-build (or some converged
> kpatch/kGraft-build tool) into the kernel tree, because of the tight
> interdependencies between it and the kernel.  I think it would make
> development much easier.  Otherwise, for example, it may end up having a
> lot of #ifdef hacks based on what kernel version it's targeting.

Exactly.


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 19:49         ` Steven Rostedt
  2014-11-06 20:02           ` Josh Poimboeuf
@ 2014-11-07  7:46           ` Christoph Hellwig
  1 sibling, 0 replies; 73+ messages in thread
From: Christoph Hellwig @ 2014-11-07  7:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Jiri Kosina,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 02:49:26PM -0500, Steven Rostedt wrote:
> I understand that there are two methods of doing this. Is it possible to
> create a "simple generator" that only handles the simple case? Perhaps it
> can detect non-simple cases, reject the change, and tell the user they
> need to reboot.

Especially the complicated case needs to be in tree, otherwise we're
almost guaranteed to break it constantly.



* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 20:24       ` Vojtech Pavlik
@ 2014-11-07  7:47         ` Christoph Hellwig
  2014-11-07 13:11           ` Josh Poimboeuf
  2014-11-07 12:31         ` Josh Poimboeuf
  1 sibling, 1 reply; 73+ messages in thread
From: Christoph Hellwig @ 2014-11-07  7:47 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Seth Jennings, Josh Poimboeuf, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 09:24:23PM +0100, Vojtech Pavlik wrote:
> One reason is that there are currently at least two generators using
> very different methods of generation (in addition to the option of doing
> the patch module by hand), and neither of them are currently in a state
> where they would be ready for inclusion into the kernel (although the
> kpatch one is clearly closer to that).

So agree on one method and get it into shape, just like we do for other
kernel subsystems.



* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 20:24       ` Vojtech Pavlik
  2014-11-07  7:47         ` Christoph Hellwig
@ 2014-11-07 12:31         ` Josh Poimboeuf
  2014-11-07 12:48           ` Vojtech Pavlik
  1 sibling, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-07 12:31 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 09:24:23PM +0100, Vojtech Pavlik wrote:
> On Thu, Nov 06, 2014 at 10:58:57AM -0800, Christoph Hellwig wrote:
> 
> > On Thu, Nov 06, 2014 at 07:51:57PM +0100, Vojtech Pavlik wrote:
> > > I don't think this specific example was generated. 
> > > 
> > > I also don't think including the whole kpatch automation into the kernel
> > > tree is a viable development model for it. (Same would apply for kGraft
> > > automation.)
> > 
> > Why?  We (IMHO incorrectly) used the argument of tight coupling to put
> > perf into the kernel tree.  Generating kernel live patches is so much
> > more tightly integrated that it absolutely has to go into the tree to
> > allow proper development on it in an integrated fashion.
> 
> One reason is that there are currently at least two generators using
> very different methods of generation (in addition to the option of doing
> the patch module by hand), and neither of them are currently in a state
> where they would be ready for inclusion into the kernel (although the
> kpatch one is clearly closer to that).

What generator does kGraft have?  Is that the one that generates the
source patch, or is there one that generates a binary patch module?

-- 
Josh


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-07 12:31         ` Josh Poimboeuf
@ 2014-11-07 12:48           ` Vojtech Pavlik
  2014-11-07 13:06             ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-07 12:48 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 06:31:54AM -0600, Josh Poimboeuf wrote:
> On Thu, Nov 06, 2014 at 09:24:23PM +0100, Vojtech Pavlik wrote:
> > On Thu, Nov 06, 2014 at 10:58:57AM -0800, Christoph Hellwig wrote:
> > 
> > > On Thu, Nov 06, 2014 at 07:51:57PM +0100, Vojtech Pavlik wrote:
> > > > I don't think this specific example was generated. 
> > > > 
> > > > I also don't think including the whole kpatch automation into the kernel
> > > > tree is a viable development model for it. (Same would apply for kGraft
> > > > automation.)
> > > 
> > > Why?  We (IMHO incorrectly) used the argument of tight coupling to put
> > > perf into the kernel tree.  Generating kernel live patches is way more
> > > integrated that it absolutely has to go into the tree to be able to do
> > > proper development on it in an integrated fashion.
> > 
> > One reason is that there are currently at least two generators using
> > very different methods of generation (in addition to the option of doing
> > the patch module by hand), and neither of them are currently in a state
> > where they would be ready for inclusion into the kernel (although the
> > kpatch one is clearly closer to that).
> 
> What generator does kGraft have?  Is that the one that generates the
> source patch, or is there one that generates a binary patch module?

The generator for kGraft:

	* extracts a list of changed functions from a patch (rather naïvely so far)
	* uses DWARF debuginfo of the old kernel to handle things like inlining
	  and create a complete list of functions that need to be replaced
	* compiles the kernel with -fdata-sections -ffunction-sections
	* uses a modified objcopy to extract functions from the kernel
	  into a single .o file
	* creates a stub .c file that references those functions
	* compiles the .c and links with the .o to build a .ko
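
The -ffunction-sections/objcopy extraction steps above can be sketched with
stock binutils (the real tooling uses a modified objcopy; the file and
function names below are made up for illustration):

```shell
# toy "patched kernel" source: two functions, one of which changed
cat > patched.c <<'EOF'
int patched_func(int x) { return x + 1; }
int unchanged_func(int x) { return x * 2; }
EOF

# compile so every function gets its own ELF section
gcc -c -ffunction-sections -fdata-sections patched.c -o patched.o

# extract only the section holding the replaced function
objcopy --only-section=.text.patched_func patched.o extracted.o

# the extracted object now carries just that one function's code
objdump -h extracted.o
```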

The main difference is in that the kGraft generator doesn't try to
compare the old and new binary objects, but rather works with function
lists and the DWARF info of the old code and extracts new functions from
the new binary.

However, as I said before, we have found enough trouble around e.g.
IPA-SRA and other optimizations that make any automated approach fragile
and, in our view, more effort than benefit. Hence, we intend to create
live patches manually until proven wrong in this assessment. :)

-- 
Vojtech Pavlik
Director SUSE Labs


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 22:20       ` Jiri Kosina
@ 2014-11-07 12:50         ` Josh Poimboeuf
  2014-11-07 13:13           ` Jiri Kosina
  0 siblings, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-07 12:50 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Thu, Nov 06, 2014 at 11:20:48PM +0100, Jiri Kosina wrote:
> On Thu, 6 Nov 2014, Seth Jennings wrote:
> 
> > > Thanks a lot for having started the work on this!
> > > 
> > > We will be reviewing it carefully in the coming days and will be getting 
> > > back to you (I was surprised to see that the diffstat indicates that it's 
> > > actually more code than our whole kgraft implementation including the 
> > > consistency model :) ).
> > 
> > The structure allocation and sysfs stuff is a lot of (mundane) code.
> > Lots of boring error path handling too.
> 
> Also, lpc_create_object(), lpc_create_func(), lpc_create_patch(), 
> lpc_create_objects(), lpc_create_funcs(), ... they all are pretty much 
> alike, and are asking for some kind of unification ... perhaps iterator 
> for generic structure initialization?

The allocation and initialization code is very simple and
straightforward.  I really don't see a problem there.

Can you give an example of what you mean by "iterator for generic
structure initialization"?

> I am also not really fully convinced that we need the patch->object->funcs 
> abstraction hierarchy (which also contributes to the structure allocation 
> being rather spaghetti copy/paste code) ... wouldn't patch->funcs be 
> sufficient, with the "object" being made just a property of the function, 
> for example?
> 
> > Plus, I show that kernel/kgraft.c + kernel/kgraft_files.c is
> > 906+193=1099.  I'd say they are about the same size :)
> 
Which still seems to me to be a ratio worth thinking about improving :)

Yes, this code doesn't have a consistency model, but it does have some
other non-kGraft things like dynamic relocations, deferred module
patching, and a unified API.  There's really no point in comparing lines
of code.

-- 
Josh


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-07 12:48           ` Vojtech Pavlik
@ 2014-11-07 13:06             ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-07 13:06 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 01:48:45PM +0100, Vojtech Pavlik wrote:
> On Fri, Nov 07, 2014 at 06:31:54AM -0600, Josh Poimboeuf wrote:
> > On Thu, Nov 06, 2014 at 09:24:23PM +0100, Vojtech Pavlik wrote:
> > > On Thu, Nov 06, 2014 at 10:58:57AM -0800, Christoph Hellwig wrote:
> > > 
> > > > On Thu, Nov 06, 2014 at 07:51:57PM +0100, Vojtech Pavlik wrote:
> > > > > I don't think this specific example was generated. 
> > > > > 
> > > > > I also don't think including the whole kpatch automation into the kernel
> > > > > tree is a viable development model for it. (Same would apply for kGraft
> > > > > automation.)
> > > > 
> > > > Why?  We (IMHO incorrectly) used the argument of tight coupling to put
> > > > perf into the kernel tree.  Generating kernel live patches is so much
> > > > more integrated that it absolutely has to go into the tree to be able
> > > > to do proper development on it in an integrated fashion.
> > > 
> > > One reason is that there are currently at least two generators using
> > > very different methods of generation (in addition to the option of doing
> > > the patch module by hand), and neither of them is currently in a state
> > > where it would be ready for inclusion into the kernel (although the
> > > kpatch one is clearly closer to that).
> > 
> > What generator does kGraft have?  Is that the one that generates the
> > source patch, or is there one that generates a binary patch module?
> 
> The generator for kGraft:
> 
> 	* extracts a list of changed functions from a patch (rather naïvely so far)
> 	* uses DWARF debuginfo of the old kernel to handle things like inlining
> 	  and create a complete list of functions that need to be replaced
> 	* compiles the kernel with -fdata-sections -ffunction-sections
> 	* uses a modified objcopy to extract functions from the kernel
> 	  into a single .o file
> 	* creates a stub .c file that references those functions
> 	* compiles the .c and links with the .o to build a .ko
> 
> The main difference is that the kGraft generator doesn't try to
> compare the old and new binary objects, but rather works with function
> lists and the DWARF info of the old code and extracts new functions from
> the new binary.

Thanks, interesting.  Sounds like we're mostly on the same page here.

> 
> However, as I said before, we have found enough trouble around e.g.
> IPA-SRA and other optimizations that make any automated approach fragile
> and, in our view, more effort than benefit.  Hence, we intend to use the
> manual way of creating live patches until proven that we were wrong in
> this assessment. :)

Yeah.  We've already put in a lot of effort to support the gcc optimizations
like IPA-SRA, partial inlining, static variable renaming, etc.  And also
added support for many kernel special sections.

For now, at least, it works very well, and we find that generation is
_much_ easier and less error-prone than the manual approach.  So in our
experience, the benefits far outweigh the effort.

But I do agree that it's fragile, and at the mercy of any future gcc
optimization features.  Which is why I like our current approach of
supporting the manual approach as well.  The manual approach isn't
optimal, but it is a nice backup solution for us in case something
causes the generator to break.

-- 
Josh


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-07  7:47         ` Christoph Hellwig
@ 2014-11-07 13:11           ` Josh Poimboeuf
  2014-11-07 14:04             ` Vojtech Pavlik
  0 siblings, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-07 13:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Vojtech Pavlik, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 11:47:45PM -0800, Christoph Hellwig wrote:
> On Thu, Nov 06, 2014 at 09:24:23PM +0100, Vojtech Pavlik wrote:
> > One reason is that there are currently at least two generators using
> > very different methods of generation (in addition to the option of doing
> > the patch module by hand), and neither of them is currently in a state
> > where it would be ready for inclusion into the kernel (although the
> > kpatch one is clearly closer to that).
> 
> So agree on one method and get it into shape, just like we do for other
> kernel subsystems.

That's our goal (and it sounds like everybody's on board with that).

But we have two different implementations, all the way up the stack.  I
think we have to work on combining and stabilizing one thing at a time.
We want to do this in stages:

1. Define a common core module with an API that can be used by all three
   approaches (manual, kpatch generator, kGraft generator)

2. Add consistency model(s) (e.g. kpatch stop_machine, kGraft per-task
   consistency, Masami's per task ref counting)

3. Add combined patch module generation tool

-- 
Josh


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 12:50         ` Josh Poimboeuf
@ 2014-11-07 13:13           ` Jiri Kosina
  2014-11-07 13:22             ` Josh Poimboeuf
  2014-11-07 14:57             ` Seth Jennings
  0 siblings, 2 replies; 73+ messages in thread
From: Jiri Kosina @ 2014-11-07 13:13 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Fri, 7 Nov 2014, Josh Poimboeuf wrote:

> > Also, lpc_create_object(), lpc_create_func(), lpc_create_patch(), 
> > lpc_create_objects(), lpc_create_funcs(), ... they all are pretty much 
> > alike, and are asking for some kind of unification ... perhaps iterator 
> > for generic structure initialization?
> 
> The allocation and initialization code is very simple and
> straightforward.  I really don't see a problem there.

This really boils down to the question I had in my previous mail: whether 
the three-level hierarchy (patch->object->funcs), which is the reason for 
a lot of very similar initialization code, is not a bit over-designed.

> > I am also not really fully convinced that we need the 
> > patch->object->funcs abstraction hierarchy (which also contributes to 
> > the structure allocation being rather spaghetti copy/paste code) ... 
> > wouldn't patch->funcs be sufficient, with the "object" being made just 
> > a property of the function, for example?
> > 
> > > Plus, I show that kernel/kgraft.c + kernel/kgraft_files.c is
> > > 906+193=1099.  I'd say they are about the same size :)
> > 
> > Which still seems to me to be a ratio worth thinking about improving 
> > :)
> 
> Yes, this code doesn't have a consistency model, but it does have some
> other non-kGraft things like dynamic relocations, 

BTW we need to put those into arch/x86/ as they are unfortunately not 
generic. But more on this later independently.

> deferred module patching,

FWIW kgraft supports that as well.

> and a unified API.  There's really no point in comparing lines of code.

Oh, sure, I didn't mean that this is any kind of metric that should be 
taken too seriously at all. I was just expressing my surprise that 
unification of the API would bring so much code that it makes the result 
comparably sized to "the whole thing" :)

Thanks,

-- 
Jiri Kosina
SUSE Labs


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 13:13           ` Jiri Kosina
@ 2014-11-07 13:22             ` Josh Poimboeuf
  2014-11-07 14:57             ` Seth Jennings
  1 sibling, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-07 13:22 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Fri, Nov 07, 2014 at 02:13:37PM +0100, Jiri Kosina wrote:
> On Fri, 7 Nov 2014, Josh Poimboeuf wrote:
> 
> > > Also, lpc_create_object(), lpc_create_func(), lpc_create_patch(), 
> > > lpc_create_objects(), lpc_create_funcs(), ... they all are pretty much 
> > > alike, and are asking for some kind of unification ... perhaps iterator 
> > > for generic structure initialization?
> > 
> > The allocation and initialization code is very simple and
> > straightforward.  I really don't see a problem there.
> 
> This really boils down to the question I had in my previous mail: whether 
> the three-level hierarchy (patch->object->funcs), which is the reason for 
> a lot of very similar initialization code, is not a bit over-designed.

Oh sorry, I missed that point :-)  See below.
> 
> > > I am also not really fully convinced that we need the 
> > > patch->object->funcs abstraction hierarchy (which also contributes to 
> > > the structure allocation being rather spaghetti copy/paste code) ... 
> > > wouldn't patch->funcs be sufficient, with the "object" being made just 
> > > a property of the function, for example?

The patched object represents the module being patched (or "vmlinux").
It is much more than a property of the function.  Multiple functions can
be patched in the same object.  There are several things we do on a
per-object basis, including try_module_get(), deferred module patching
(patching from the module notifier), and dynamic relocations.
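The patch->object->func hierarchy being discussed can be sketched as plain C structures. This is only an illustrative userspace sketch; the struct and field names here are invented and do not claim to match the posted patch exactly:

```c
#include <string.h>

/* One patched function: old symbol name plus old/new addresses. */
struct func {
	const char *old_name;
	unsigned long old_addr;	/* resolved inside the patched object */
	unsigned long new_addr;	/* provided by the patch module */
	struct func *next;
};

/* One patched object: "vmlinux" or a module name.  Per-object work
 * (reference counting, deferred patching, dynamic relocations) hangs
 * off this level, which is why it is more than a per-func property. */
struct object {
	const char *name;
	struct func *funcs;
	struct object *next;
};

/* One patch module: the list of objects it touches. */
struct patch {
	const char *name;
	struct object *objs;
};

/* Count the functions a patch changes inside one object; returns -1
 * if the patch does not touch that object at all. */
static int count_patched_funcs(const struct patch *p, const char *objname)
{
	const struct object *obj;
	const struct func *f;
	int n = 0;

	for (obj = p->objs; obj; obj = obj->next) {
		if (strcmp(obj->name, objname))
			continue;
		for (f = obj->funcs; f; f = f->next)
			n++;
		return n;
	}
	return -1;
}
```

With a flat patch->funcs list plus an "object" string per function, per-object operations would instead need repeated scans and duplicate-detection over the function list.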

> > > 
> > > > Plus, I show that kernel/kgraft.c + kernel/kgraft_files.c is
> > > > 906+193=1099.  I'd say they are about the same size :)
> > > 
> > > Which still seems to me to be a ratio worth thinking about improving 
> > > :)
> > 
> > Yes, this code doesn't have a consistency model, but it does have some
> > other non-kGraft things like dynamic relocations, 
> 
> BTW we need to put those into arch/x86/ as they are unfortunately not 
> generic. But more on this later independently.
> 
> > deferred module patching,
> 
> FWIW kgraft supports that as well.
> 
> > and a unified API.  There's really no point in comparing lines of code.
> 
> Oh, sure, I didn't mean that this is any kind of metric that should be 
> taken too seriously at all. I was just expressing my surprise that 
> unification of the API would bring so much code that it makes the result 
> comparably sized to "the whole thing" :)
> 
> Thanks,
> 
> -- 
> Jiri Kosina
> SUSE Labs

-- 
Josh


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-07 13:11           ` Josh Poimboeuf
@ 2014-11-07 14:04             ` Vojtech Pavlik
  2014-11-07 15:45               ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-07 14:04 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 07:11:53AM -0600, Josh Poimboeuf wrote:

> 2. Add consistency model(s) (e.g. kpatch stop_machine, kGraft per-task
>    consistency, Masami's per task ref counting)

I have given some thought to the consistency models and how they differ
and how they potentially could be unified.

I have to thank Masami, because his rewrite of the kpatch model based on
refcounting is what brought it closer to the kGraft model and thus
allowed me to find the parallels.

Let me start by defining the properties of the patching consistency
model. First, what entity the execution must be outside of to be able to
make the switch, ordered from weakest to strongest:

	LEAVE_FUNCTION
		- execution has to leave a patched function to switch
		  to the new implementation

	LEAVE_PATCHED_SET
		- execution has to leave the set of patched functions
		  to switch to the new implementation

	LEAVE_KERNEL
		- execution has to leave the entire kernel to switch
		  to the new implementation

Then, what entity the switch happens for. Again, from weakest to strongest:

	SWITCH_FUNCTION
		- the switch to the new implementation happens on a per-function
		   basis

	SWITCH_THREAD
		- the switch to the new implementation is per-thread.

	SWITCH_KERNEL
		- the switch to the new implementation happens at once for
		  the whole kernel

Now with those definitions:

	livepatch (null model), as is, is LEAVE_FUNCTION and SWITCH_FUNCTION

	kpatch, masami-refcounting and Ksplice are LEAVE_PATCHED_SET and SWITCH_KERNEL

	kGraft is LEAVE_KERNEL and SWITCH_THREAD

	CRIU/kexec is LEAVE_KERNEL and SWITCH_KERNEL
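The two axes above can be written down directly. Here is a minimal C sketch (enum and struct names are invented for illustration), which also makes visible that the kpatch-style and kGraft-style models are incomparable in strength, differing on opposite axes:

```c
/* What execution must leave, ordered weakest (0) to strongest (2). */
enum leave { LEAVE_FUNCTION, LEAVE_PATCHED_SET, LEAVE_KERNEL };

/* What entity switches at once, ordered weakest (0) to strongest (2). */
enum switch_unit { SWITCH_FUNCTION, SWITCH_THREAD, SWITCH_KERNEL };

struct model {
	const char *name;
	enum leave leave;
	enum switch_unit sw;
};

static const struct model models[] = {
	{ "livepatch (null model)",  LEAVE_FUNCTION,    SWITCH_FUNCTION },
	{ "kpatch/masami/Ksplice",   LEAVE_PATCHED_SET, SWITCH_KERNEL   },
	{ "kGraft",                  LEAVE_KERNEL,      SWITCH_THREAD   },
	{ "CRIU/kexec",              LEAVE_KERNEL,      SWITCH_KERNEL   },
	/* the blend proposed below as most interesting: */
	{ "proposed blend",          LEAVE_PATCHED_SET, SWITCH_THREAD   },
};

/* Model A is at least as strong as B iff it is >= on both axes. */
static int at_least_as_strong(const struct model *a, const struct model *b)
{
	return a->leave >= b->leave && a->sw >= b->sw;
}
```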

By blending kGraft and masami-refcounting, we could create a consistency
engine capable of almost any combination of these properties and thus
all the consistency models.

However, I'm currently thinking that the most interesting model is
LEAVE_PATCHED_SET and SWITCH_THREAD, as it is reliable and fast
converging, doesn't require annotating kernel threads, and doesn't fail
with frequent sleepers like futexes.

It provides the least consistency that is required to be able to change
the calling convention of functions and still allows for semantic
dependencies.
	
What do you think?

----------------------------------------------------------------------------

PS.: Livepatch's null model isn't in fact the weakest possible, as it still
guarantees executing complete, intact functions, thanks to ftrace.
That is much more than what direct overwriting of the function in
memory would achieve.
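The point about complete, intact functions can be illustrated with a userspace analogy. This only models the idea: real livepatching has the ftrace handler rewrite the saved instruction pointer at function entry, not a lookup table, but the key property is the same, since a redirected function is never entered at all, so no thread can be caught mid-rewrite:

```c
#include <stddef.h>

/* Map of old implementation -> new implementation, standing in for
 * the ftrace-managed redirection at function entry. */
#define MAX_REDIRECTS 8
static struct {
	int (*old_fn)(int);
	int (*new_fn)(int);
} redirects[MAX_REDIRECTS];
static int nr_redirects;

static void register_redirect(int (*old_fn)(int), int (*new_fn)(int))
{
	redirects[nr_redirects].old_fn = old_fn;
	redirects[nr_redirects].new_fn = new_fn;
	nr_redirects++;
}

/* Every call goes through the "entry" point; either the complete old
 * body or the complete new body runs -- unlike overwriting the
 * function in place, which could hit a thread executing inside it. */
static int call(int (*fn)(int), int arg)
{
	for (int i = 0; i < nr_redirects; i++)
		if (redirects[i].old_fn == fn)
			return redirects[i].new_fn(arg);
	return fn(arg);
}

static int old_impl(int x) { return x + 1; }	/* "buggy" version */
static int new_impl(int x) { return x + 2; }	/* "patched" version */
```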

This is also the reason why Ksplice is locked to a very specific
consistency model. Ksplice can patch only when the kernel is stopped and
the model is built from that.

masami-refcounting, kpatch, kGraft and livepatch have a lot more freedom,
thanks to ftrace, in what the consistency model can look like.

PPS.: I haven't included any handling of changed data structures in
this, that's another set of properties.

-- 
Vojtech Pavlik
Director SUSE Labs


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 13:13           ` Jiri Kosina
  2014-11-07 13:22             ` Josh Poimboeuf
@ 2014-11-07 14:57             ` Seth Jennings
  1 sibling, 0 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-07 14:57 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Josh Poimboeuf, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Fri, Nov 07, 2014 at 02:13:37PM +0100, Jiri Kosina wrote:
> On Fri, 7 Nov 2014, Josh Poimboeuf wrote:
> 
> > > Also, lpc_create_object(), lpc_create_func(), lpc_create_patch(), 
> > > lpc_create_objects(), lpc_create_funcs(), ... they all are pretty much 
> > > alike, and are asking for some kind of unification ... perhaps iterator 
> > > for generic structure initialization?
> > 
> > The allocation and initialization code is very simple and
> > straightforward.  I really don't see a problem there.
> 
> This really boils down to the question I had in my previous mail: whether 
> the three-level hierarchy (patch->object->funcs), which is the reason for 
> a lot of very similar initialization code, is not a bit over-designed.

It might be right now, but we coded ourselves into a corner a couple of
times in kpatch by using optimal but inflexible data structures and
sharing those data structures with the API.  This structure layout will
give us the flexibility to make changes without having to gut everything.  I
see flexibility and modularity as being important going forward, as we are
both looking to extend the capabilities.

Additionally it allows the sysfs directories to correlate to data
structures and we can use the kobject ref count to cleanly do object
cleanup (i.e.  kobject_put() with release handlers for each ktype).

As Josh said, we do have operations that apply to each level.  I think
your point is that we could do away with the object level, but we have
operations that happen on a per-object basis. lpc_enable_object() isn't
just a for loop for registering the functions with ftrace.  It also does
the dynamic relocations.  I'm sure we will find other things in the
future.  It is also nice to have a function that can be called from both
lpc_enable_patch() and lp_module_notify() to enable the object in a
common way.

Thanks,
Seth

> 
> > > I am also not really fully convinced that we need the 
> > > patch->object->funcs abstraction hierarchy (which also contributes to 
> > > the structure allocation being rather spaghetti copy/paste code) ... 
> > > wouldn't patch->funcs be sufficient, with the "object" being made just 
> > > a property of the function, for example?
> > > 
> > > > Plus, I show that kernel/kgraft.c + kernel/kgraft_files.c is
> > > > 906+193=1099.  I'd say they are about the same size :)
> > > 
> > > Which still seems to me to be a ratio worth thinking about improving 
> > > :)
> > 
> > Yes, this code doesn't have a consistency model, but it does have some
> > other non-kGraft things like dynamic relocations, 
> 
> BTW we need to put those into arch/x86/ as they are unfortunately not 
> generic. But more on this later independently.
> 
> > deferred module patching,
> 
> FWIW kgraft supports that as well.
> 
> > and a unified API.  There's really no point in comparing lines of code.
> 
> Oh, sure, I didn't mean that this is any kind of metric that should be 
> taken too seriously at all. I was just expressing my surprise that 
> unification of the API would bring so much code that it makes the result 
> comparably sized to "the whole thing" :)
> 
> Thanks,
> 
> -- 
> Jiri Kosina
> SUSE Labs


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-07 14:04             ` Vojtech Pavlik
@ 2014-11-07 15:45               ` Josh Poimboeuf
  2014-11-07 21:27                 ` Vojtech Pavlik
  0 siblings, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-07 15:45 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 03:04:58PM +0100, Vojtech Pavlik wrote:
> On Fri, Nov 07, 2014 at 07:11:53AM -0600, Josh Poimboeuf wrote:
> 
> > 2. Add consistency model(s) (e.g. kpatch stop_machine, kGraft per-task
> >    consistency, Masami's per task ref counting)
> 
> I have given some thought to the consistency models and how they differ
> and how they potentially could be unified.
> 
> I have to thank Masami, because his rewrite of the kpatch model based on
> refcounting is what brought it closer to the kGraft model and thus
> allowed me to find the parallels.
> 
> Let me start by defining the properties of the patching consistency
> model. First, what entity the execution must be outside of to be able to
> make the switch, ordered from weakest to strongest:
> 
> 	LEAVE_FUNCTION
> 		- execution has to leave a patched function to switch
> 		  to the new implementation
> 
> 	LEAVE_PATCHED_SET
> 		- execution has to leave the set of patched functions
> 		  to switch to the new implementation
> 
> 	LEAVE_KERNEL
> 		- execution has to leave the entire kernel to switch
> 		  to the new implementation
> 
> Then, what entity the switch happens for. Again, from weakest to strongest:
> 
> 	SWITCH_FUNCTION
> 		- the switch to the new implementation happens on a per-function
> 		   basis
> 
> 	SWITCH_THREAD
> 		- the switch to the new implementation is per-thread.
> 
> 	SWITCH_KERNEL
> 		- the switch to the new implementation happens at once for
> 		  the whole kernel
> 
> Now with those definitions:
> 
> 	livepatch (null model), as is, is LEAVE_FUNCTION and SWITCH_FUNCTION
> 
> 	kpatch, masami-refcounting and Ksplice are LEAVE_PATCHED_SET and SWITCH_KERNEL
> 
> 	kGraft is LEAVE_KERNEL and SWITCH_THREAD
> 
> 	CRIU/kexec is LEAVE_KERNEL and SWITCH_KERNEL

Thanks, nice analysis!

> By blending kGraft and masami-refcounting, we could create a consistency
> engine capable of almost any combination of these properties and thus
> all the consistency models.

Can you elaborate on what this would look like?

> However, I'm currently thinking that the most interesting model is
> LEAVE_PATCHED_SET and SWITCH_THREAD, as it is reliable and fast
> converging, doesn't require annotating kernel threads, and doesn't fail
> with frequent sleepers like futexes.
> 
> It provides the least consistency that is required to be able to change
> the calling convention of functions and still allows for semantic
> dependencies.
> 	
> What do you think?
> 

The big problem with SWITCH_THREAD is that it adds the possibility that
old functions can run simultaneously with new ones.  When you change
data or data semantics, which is roughly 10% of security patches, it
creates some serious headaches:

- It makes patch safety analysis much harder by doubling the number of
  permutations of scenarios you have to consider.  In addition to
  considering newfunc/olddata and newfunc/newdata, you also have to
  consider oldfunc/olddata and oldfunc/newdata.

- It requires two patches instead of one.  The first patch is needed to
  modify the old functions to be able to deal with new data.  After the
  first patch has been fully applied, then you apply the second patch
  which can start creating new versions of data.

On the other hand, SWITCH_KERNEL doesn't have those problems.  It does
have the problem you mentioned, roughly 2% of the time, where it can't
patch functions which are always in use.  But in that case we can skip
the backtrace check ~90% of the time.  So it's really maybe something
like 0.2% of patches which can't be patched with SWITCH_KERNEL.  But
even then I think we could overcome that by getting creative, e.g. using
the multiple patch approach.

So my perspective is that SWITCH_THREAD causes big headaches 10% of the
time, whereas SWITCH_KERNEL causes small headaches 1.8% of the time, and
big headaches 0.2% of the time :-)
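The 1.8%/0.2% split follows directly from combining the two estimates above. A sketch of the arithmetic (the percentages are the rough ballpark figures from this discussion, not measured data):

```c
/* Rough estimates from the discussion above, not measured data. */
static const double p_always_in_use = 0.02;	/* ~2%: patch hits a function that is always in use */
static const double p_need_backtrace = 0.10;	/* backtrace check can be skipped ~90% of the time */

/* "Big headache": stuck function AND the backtrace check cannot be skipped. */
static double p_big_headache(void)
{
	return p_always_in_use * p_need_backtrace;		/* 0.2% */
}

/* "Small headache": stuck function, but the check can be skipped. */
static double p_small_headache(void)
{
	return p_always_in_use * (1.0 - p_need_backtrace);	/* 1.8% */
}
```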

> ----------------------------------------------------------------------------
> 
> PS.: Livepatch's null model isn't in fact the weakest possible, as it still
> guarantees executing complete, intact functions, thanks to ftrace.
> That is much more than what direct overwriting of the function in
> memory would achieve.
> 
> This is also the reason why Ksplice is locked to a very specific
> consistency model. Ksplice can patch only when the kernel is stopped and
> the model is built from that.
> 
> masami-refcounting, kpatch, kGraft and livepatch have a lot more freedom,
> thanks to ftrace, in what the consistency model can look like.
> 
> PPS.: I haven't included any handling of changed data structures in
> this, that's another set of properties.
> 
> -- 
> Vojtech Pavlik
> Director SUSE Labs

-- 
Josh


* module notifier: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
                     ` (2 preceding siblings ...)
  2014-11-06 20:02   ` Steven Rostedt
@ 2014-11-07 17:13   ` Petr Mladek
  2014-11-07 18:07     ` Seth Jennings
  2014-11-07 17:39   ` more patches for the same func: " Petr Mladek
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 73+ messages in thread
From: Petr Mladek @ 2014-11-07 17:13 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu 2014-11-06 08:39:08, Seth Jennings wrote:
> This commit introduces code for the live patching core.  It implements
> an ftrace-based mechanism and kernel interface for doing live patching
> of kernel and kernel module functions.
> 
> It represents the greatest common functionality set between kpatch and
> kgraft and can accept patches built using either method.
> 
> This first version does not implement any consistency mechanism that
> ensures that old and new code do not run together.  In practice, ~90% of
> CVEs are safe to apply in this way, since they simply add a conditional
> check.  However, any function change that can not execute safely with
> the old version of the function can _not_ be safely applied in this
> version.

[...]
 
> +/******************************
> + * module notifier
> + *****************************/
> +
> +static int lp_module_notify(struct notifier_block *nb, unsigned long action,
> +			    void *data)
> +{
> +	struct module *mod = data;
> +	struct lpc_patch *patch;
> +	struct lpc_object *obj;
> +	int ret = 0;
> +
> +	if (action != MODULE_STATE_COMING)
> +		return 0;

IMHO, we should also handle MODULE_STATE_GOING. We should unregister
the ftrace handlers and update the state of the affected objects
(ENABLED -> DISABLED).

> +	down(&lpc_mutex);
> +
> +	list_for_each_entry(patch, &lpc_patches, list) {
> +		if (patch->state == DISABLED)
> +			continue;
> +		list_for_each_entry(obj, &patch->objs, list) {
> +			if (strcmp(obj->name, mod->name))
> +				continue;
> +			pr_notice("load of module '%s' detected, applying patch '%s'\n",
> +				  mod->name, patch->mod->name);
> +			obj->mod = mod;
> +			ret = lpc_enable_object(patch->mod, obj);
> +			if (ret)
> +				goto out;
> +			break;
> +		}
> +	}
> +
> +	up(&lpc_mutex);
> +	return 0;
> +out:

I would name this err_out or similar to make it clear that it is used when
something fails.

> +	up(&lpc_mutex);
> +	WARN("failed to apply patch '%s' to module '%s'\n",
> +		patch->mod->name, mod->name);
> +	return 0;
> +}
> +
> +static struct notifier_block lp_module_nb = {
> +	.notifier_call = lp_module_notify,
> +	.priority = INT_MIN, /* called last */

The handler for MODULE_STATE_COMING would need to have a higher priority
if we want to cleanly unregister the ftrace handlers.
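The suggested MODULE_STATE_GOING handling can be modeled in a small userspace sketch. The enum and field names loosely mirror the quoted patch, but the details are assumptions, not the actual kernel code:

```c
#include <string.h>

enum obj_state { DISABLED, ENABLED };
enum mod_action { MODULE_STATE_COMING, MODULE_STATE_GOING };

struct obj {
	const char *name;	/* module this object patches */
	enum obj_state state;
};

/* Notifier body: enable matching objects when their module arrives,
 * and (the missing half) disable them again when it goes away. */
static void module_notify(struct obj *objs, int n,
			  enum mod_action action, const char *mod)
{
	for (int i = 0; i < n; i++) {
		if (strcmp(objs[i].name, mod))
			continue;
		if (action == MODULE_STATE_COMING)
			objs[i].state = ENABLED;	/* register ftrace handlers */
		else if (action == MODULE_STATE_GOING)
			objs[i].state = DISABLED;	/* unregister ftrace handlers */
	}
}
```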

Best Regards,
Petr


* more patches for the same func: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
                     ` (3 preceding siblings ...)
  2014-11-07 17:13   ` module notifier: was " Petr Mladek
@ 2014-11-07 17:39   ` Petr Mladek
  2014-11-07 21:54     ` Josh Poimboeuf
  2014-11-07 19:40   ` Andy Lutomirski
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 73+ messages in thread
From: Petr Mladek @ 2014-11-07 17:39 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu 2014-11-06 08:39:08, Seth Jennings wrote:
> This commit introduces code for the live patching core.  It implements
> an ftrace-based mechanism and kernel interface for doing live patching
> of kernel and kernel module functions.
> 
> It represents the greatest common functionality set between kpatch and
> kgraft and can accept patches built using either method.
> 
> This first version does not implement any consistency mechanism that
> ensures that old and new code do not run together.  In practice, ~90% of
> CVEs are safe to apply in this way, since they simply add a conditional
> check.  However, any function change that can not execute safely with
> the old version of the function can _not_ be safely applied in this
> version.

[...] 

> +static int lpc_enable_func(struct lpc_func *func)
> +{
> +	int ret;
> +
> +	BUG_ON(!func->old_addr);
> +	BUG_ON(func->state != DISABLED);
> +	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 0, 0);
> +	if (ret) {
> +		pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> +		       func->old_name, ret);
> +		return ret;
> +	}
> +	ret = register_ftrace_function(&func->fops);
> +	if (ret) {
> +		pr_err("failed to register ftrace handler for function '%s' (%d)\n",
> +		       func->old_name, ret);
> +		ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
> +	} else
> +		func->state = ENABLED;
> +
> +	return ret;
> +}
> +

[...]

> +/* caller must ensure that obj->mod is set if object is a module */
> +static int lpc_enable_object(struct module *pmod, struct lpc_object *obj)
> +{
> +	struct lpc_func *func;
> +	int ret;
> +
> +	if (obj->mod && !try_module_get(obj->mod))
> +		return -ENODEV;
> +
> +	if (obj->dynrelas) {
> +		ret = lpc_write_object_relocations(pmod, obj);
> +		if (ret)
> +			goto unregister;
> +	}
> +	list_for_each_entry(func, &obj->funcs, list) {
> +		ret = lpc_find_verify_func_addr(func, obj->name);
> +		if (ret)
> +			goto unregister;
> +
> +		ret = lpc_enable_func(func);
> +		if (ret)
> +			goto unregister;
> +	}
> +	obj->state = ENABLED;
> +
> +	return 0;
> +unregister:
> +	WARN_ON(lpc_unregister_object(obj));
> +	return ret;
> +}
> +
> +/******************************
> + * enable/disable
> + ******************************/
> +
> +/* must be called with lpc_mutex held */
> +static struct lpc_patch *lpc_find_patch(struct lp_patch *userpatch)
> +{
> +	struct lpc_patch *patch;
> +
> +	list_for_each_entry(patch, &lpc_patches, list)
> +		if (patch->userpatch == userpatch)
> +			return patch;
> +
> +	return NULL;
> +}

[...]

> +
> +/* must be called with lpc_mutex held */
> +static int lpc_enable_patch(struct lpc_patch *patch)
> +{
> +	struct lpc_object *obj;
> +	int ret;
> +
> +	BUG_ON(patch->state != DISABLED);
> +
> +	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> +	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
> +
> +	pr_notice("enabling patch '%s'\n", patch->mod->name);
> +
> +	list_for_each_entry(obj, &patch->objs, list) {
> +		if (!is_object_loaded(obj))
> +			continue;
> +		ret = lpc_enable_object(patch->mod, obj);
> +		if (ret)
> +			goto unregister;
> +	}
> +	patch->state = ENABLED;
> +	return 0;
> +
> +unregister:
> +	WARN_ON(lpc_disable_patch(patch));
> +	return ret;
> +}
> +
> +int lp_enable_patch(struct lp_patch *userpatch)
> +{
> +	struct lpc_patch *patch;
> +	int ret;
> +
> +	down(&lpc_mutex);
> +	patch = lpc_find_patch(userpatch);
> +	if (!patch) {
> +		ret = -ENODEV;
> +		goto out;
> +	}
> +	ret = lpc_enable_patch(patch);
> +out:
> +	up(&lpc_mutex);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(lp_enable_patch);

AFAIK, this does not correctly handle the situation where there
are multiple patches for the same symbol. IMHO, the first registered
ftrace function wins, which means that later patches are ignored.

In kGraft, we detect this situation and do the following:

   add_new_ftrace_function()
   /* old one still might be used at this stage */
   if (old_function)
      remove_old_ftrace_function();
   /* the new one is used from now on */

A similar problem arises when a patch is disabled. We need to know
whether it was actually in use. If not, we are done. If it is active,
we need to check whether there is an older patch for the same
symbol and enable that ftrace function instead.
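To make the fallback concrete, here is a hypothetical user-space model of
per-symbol patch stacking (invented names, not the proposed kernel API):
each symbol keeps a stack of enabled patches, the most recently enabled
one wins, and disabling the active patch falls back to the previous one,
which is essentially the lookup a shared ftrace handler would have to do:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical user-space model of per-symbol patch stacking; all names
 * are invented for illustration.  Each symbol keeps a stack of enabled
 * replacement functions: the newest wins, and disabling the active one
 * falls back to the previously enabled one (or the original function).
 */
#define MAX_PATCHES 8

struct symbol_patches {
	void *new_func[MAX_PATCHES];	/* stack of replacement functions */
	int top;			/* number of enabled patches */
};

/* Enable a patch for this symbol: push it; it becomes the active one. */
static int patch_enable(struct symbol_patches *s, void *new_func)
{
	if (s->top == MAX_PATCHES)
		return -1;
	s->new_func[s->top++] = new_func;
	return 0;
}

/*
 * Disable a patch: only the topmost entry can simply be popped.
 * Disabling an older entry would require re-routing, which is exactly
 * the problem noted above.
 */
static int patch_disable(struct symbol_patches *s, void *new_func)
{
	if (s->top == 0 || s->new_func[s->top - 1] != new_func)
		return -1;
	s->top--;
	return 0;
}

/* What a shared handler would redirect to; NULL means "call original". */
static void *patch_resolve(struct symbol_patches *s)
{
	return s->top ? s->new_func[s->top - 1] : NULL;
}
```

Note how disabling the newest patch re-activates the older one, matching
the kGraft behaviour described above.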

Best Regards,
Petr


PS: We should probably decide on the structures to use before we start
coding fixes for these particular problems. I share my colleagues'
concern about the complexity, but I need to think more about
it. Let's discuss it in the other thread.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: module notifier: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 17:13   ` module notifier: was " Petr Mladek
@ 2014-11-07 18:07     ` Seth Jennings
  2014-11-07 18:40       ` Petr Mladek
  0 siblings, 1 reply; 73+ messages in thread
From: Seth Jennings @ 2014-11-07 18:07 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 06:13:07PM +0100, Petr Mladek wrote:
> On Thu 2014-11-06 08:39:08, Seth Jennings wrote:
> > This commit introduces code for the live patching core.  It implements
> > an ftrace-based mechanism and kernel interface for doing live patching
> > of kernel and kernel module functions.
> > 
> > It represents the greatest common functionality set between kpatch and
> > kgraft and can accept patches built using either method.
> > 
> > This first version does not implement any consistency mechanism that
> > ensures that old and new code do not run together.  In practice, ~90% of
> > CVEs are safe to apply in this way, since they simply add a conditional
> > check.  However, any function change that can not execute safely with
> > the old version of the function can _not_ be safely applied in this
> > version.
> 
> [...]
>  
> > +/******************************
> > + * module notifier
> > + *****************************/
> > +
> > +static int lp_module_notify(struct notifier_block *nb, unsigned long action,
> > +			    void *data)
> > +{
> > +	struct module *mod = data;
> > +	struct lpc_patch *patch;
> > +	struct lpc_object *obj;
> > +	int ret = 0;
> > +
> > +	if (action != MODULE_STATE_COMING)
> > +		return 0;
> 
> IMHO, we should handle also MODULE_STATE_GOING. We should unregister
> the ftrace handlers and update the state of the affected objects
> (ENABLED -> DISABLED)

The mechanism we use to avoid this right now is taking a reference on the
patched module.  We only release that reference after the patch is
disabled, which unregisters all the patched functions from ftrace.

However, your comment reminded me of an idea I had to use
MODULE_STATE_GOING and let the lpc_mutex protect against races.  I think
it could be cleaner, but I haven't fleshed the idea out fully.

> 
> > +	down(&lpc_mutex);
> > +
> > +	list_for_each_entry(patch, &lpc_patches, list) {
> > +		if (patch->state == DISABLED)
> > +			continue;
> > +		list_for_each_entry(obj, &patch->objs, list) {
> > +			if (strcmp(obj->name, mod->name))
> > +				continue;
> > +			pr_notice("load of module '%s' detected, applying patch '%s'\n",
> > +				  mod->name, patch->mod->name);
> > +			obj->mod = mod;
> > +			ret = lpc_enable_object(patch->mod, obj);
> > +			if (ret)
> > +				goto out;
> > +			break;
> > +		}
> > +	}
> > +
> > +	up(&lpc_mutex);
> > +	return 0;
> > +out:
> 
> I would name this err_out or so to make it clear that it is used when
> something fails.

Just "err" good?

> 
> > +	up(&lpc_mutex);
> > +	WARN("failed to apply patch '%s' to module '%s'\n",
> > +		patch->mod->name, mod->name);
> > +	return 0;
> > +}
> > +
> > +static struct notifier_block lp_module_nb = {
> > +	.notifier_call = lp_module_notify,
> > +	.priority = INT_MIN, /* called last */
> 
> The handler for MODULE_STATE_COMING would need to have a higher priority,
> if we want to cleanly unregister the ftrace handlers.

Yes, we might need two handlers at different priorities if we decide to
go that direction: one for MODULE_STATE_GOING at high/max and one for
MODULE_STATE_COMING at low/min.
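For illustration only (handler names invented; this is a sketch of the
priority split, not tested code):

```c
/* Hypothetical sketch: patch a COMING module after everyone else has
 * seen it, and unpatch a GOING module before anyone else tears it down. */
static struct notifier_block lp_module_coming_nb = {
	.notifier_call = lp_module_coming_notify,
	.priority = INT_MIN,	/* called last */
};

static struct notifier_block lp_module_going_nb = {
	.notifier_call = lp_module_going_notify,
	.priority = INT_MAX,	/* called first */
};
```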

Thanks,
Seth

> 
> Best Regards,
> Petr


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 16:57     ` Seth Jennings
  2014-11-06 17:12       ` Josh Poimboeuf
@ 2014-11-07 18:21       ` Petr Mladek
  2014-11-07 20:31         ` Josh Poimboeuf
  1 sibling, 1 reply; 73+ messages in thread
From: Petr Mladek @ 2014-11-07 18:21 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Jiri Slaby, Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt, live-patching, kpatch, linux-kernel

On Thu 2014-11-06 10:57:48, Seth Jennings wrote:
> On Thu, Nov 06, 2014 at 04:51:02PM +0100, Jiri Slaby wrote:
> > On 11/06/2014, 03:39 PM, Seth Jennings wrote:
> > > +/*************************************
> > > + * Core structures
> > > + ************************************/
> > > +
> > > +/*
> > > + * lp_ structs vs lpc_ structs
> > > + *
> > > + * For each element (patch, object, func) in the live-patching code,
> > > + * there are two types with two different prefixes: lp_ and lpc_.
> > > + *
> > > + * Structures used by the live-patch modules to register with this core module
> > > + * are prefixed with lp_ (live patching).  These structures are part of the
> > > + * registration API and are defined in livepatch.h.  The structures used
> > > + * internally by this core module are prefixed with lpc_ (live patching core).
> > > + */
> > 
> > I am not sure if the separation and the allocations/kobj handling are
> > worth it. It makes the code really less understandable. Can we have just
> > struct lip_function (don't unnecessarily abbreviate), lip_objectfile
> > (object is too generic, like Java object) and lip_patch containing all
> > the needed information? It would clean up the code a lot. (Yes, we would
> > have profited from c++ here.)
> 
> I looked at doing this and this is actually what we did in kpatch.  We
> made one structure that had "private" members that the user wasn't
> supposed to access and that were only used in the core.  This was messy
> though.  Every time you wanted to add a "private" field to the struct so
> the core could do something new, you were changing the API to the patch
> modules as well.  While copying the data into an internal structure does
> add code and opportunity for errors, that functionality is localized
> into functions that are specifically tasked with taking care of that.
> So the risk is minimized and we gain flexibility within the core and
> more self-documenting API structures.

I am not sure if the modified API is really such a big limitation. The
modules initialize the needed members using ".member = value".
Also, we do not need to take care of API/ABI backward compatibility because
there is a very strict dependency between patches and the patched
kernel.

Well, I have to think more about it.

Best Regards,
Petr


* Re: module notifier: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 18:07     ` Seth Jennings
@ 2014-11-07 18:40       ` Petr Mladek
  2014-11-07 18:55         ` Seth Jennings
  2014-11-11 19:40         ` Seth Jennings
  0 siblings, 2 replies; 73+ messages in thread
From: Petr Mladek @ 2014-11-07 18:40 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri 2014-11-07 12:07:11, Seth Jennings wrote:
> On Fri, Nov 07, 2014 at 06:13:07PM +0100, Petr Mladek wrote:
> > On Thu 2014-11-06 08:39:08, Seth Jennings wrote:
> > > This commit introduces code for the live patching core.  It implements
> > > an ftrace-based mechanism and kernel interface for doing live patching
> > > of kernel and kernel module functions.
> > > 
> > > It represents the greatest common functionality set between kpatch and
> > > kgraft and can accept patches built using either method.
> > > 
> > > This first version does not implement any consistency mechanism that
> > > ensures that old and new code do not run together.  In practice, ~90% of
> > > CVEs are safe to apply in this way, since they simply add a conditional
> > > check.  However, any function change that can not execute safely with
> > > the old version of the function can _not_ be safely applied in this
> > > version.
> > 
> > [...]
> >  
> > > +/******************************
> > > + * module notifier
> > > + *****************************/
> > > +
> > > +static int lp_module_notify(struct notifier_block *nb, unsigned long action,
> > > +			    void *data)
> > > +{
> > > +	struct module *mod = data;
> > > +	struct lpc_patch *patch;
> > > +	struct lpc_object *obj;
> > > +	int ret = 0;
> > > +
> > > +	if (action != MODULE_STATE_COMING)
> > > +		return 0;
> > 
> > IMHO, we should handle also MODULE_STATE_GOING. We should unregister
> > the ftrace handlers and update the state of the affected objects
> > (ENABLED -> DISABLED)
> 
> The mechanism we use to avoid this right now is taking a reference on the
> patched module.  We only release that reference after the patch is
> disabled, which unregisters all the patched functions from ftrace.

I see. This was actually another thing that I noticed and wanted to
investigate :-) I think that we should not force users to disable
the entire patch if they want to remove some module.


> However, your comment reminded me of an idea I had to use
> MODULE_STATE_GOING and let the lpc_mutex protect against races.  I think
> it could be cleaner, but I haven't fleshed the idea out fully.

AFAIK, the going module is no longer used when the notifier is
called. Therefore we could remove the patch the fast way even when
patching would otherwise require the slow path.


> > 
> > > +	down(&lpc_mutex);
> > > +
> > > +	list_for_each_entry(patch, &lpc_patches, list) {
> > > +		if (patch->state == DISABLED)
> > > +			continue;
> > > +		list_for_each_entry(obj, &patch->objs, list) {
> > > +			if (strcmp(obj->name, mod->name))
> > > +				continue;
> > > +			pr_notice("load of module '%s' detected, applying patch '%s'\n",
> > > +				  mod->name, patch->mod->name);
> > > +			obj->mod = mod;
> > > +			ret = lpc_enable_object(patch->mod, obj);
> > > +			if (ret)
> > > +				goto out;
> > > +			break;
> > > +		}
> > > +	}
> > > +
> > > +	up(&lpc_mutex);
> > > +	return 0;
> > > +out:
> > 
> > I would name this err_out or so to make it clear that it is used when
> > something fails.
> 
> Just "err" good?

Fine with me.
 
> > > +	up(&lpc_mutex);
> > > +	WARN("failed to apply patch '%s' to module '%s'\n",
> > > +		patch->mod->name, mod->name);
> > > +	return 0;
> > > +}
> > > +
> > > +static struct notifier_block lp_module_nb = {
> > > +	.notifier_call = lp_module_notify,
> > > +	.priority = INT_MIN, /* called last */
> > 
> > The handler for MODULE_STATE_COMING would need to have a higher priority,
> > if we want to cleanly unregister the ftrace handlers.
> 
> Yes, we might need two handlers at different priorities if we decide to
> go that direction: one for MODULE_STATE_GOING at high/max and one for
> MODULE_STATE_COMING at low/min.

kGraft has a notifier only for the going state. The initialization is
called directly from load_module() after ftrace_module_init()
and complete_formation() before it is executed by parse_args().

I need to investigate if the notifier is more elegant and safe or not.

Best Regards,
Petr


* Re: module notifier: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 18:40       ` Petr Mladek
@ 2014-11-07 18:55         ` Seth Jennings
  2014-11-11 19:40         ` Seth Jennings
  1 sibling, 0 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-07 18:55 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 07:40:11PM +0100, Petr Mladek wrote:
> On Fri 2014-11-07 12:07:11, Seth Jennings wrote:
> > On Fri, Nov 07, 2014 at 06:13:07PM +0100, Petr Mladek wrote:
> > > On Thu 2014-11-06 08:39:08, Seth Jennings wrote:
> > > > This commit introduces code for the live patching core.  It implements
> > > > an ftrace-based mechanism and kernel interface for doing live patching
> > > > of kernel and kernel module functions.
> > > > 
> > > > It represents the greatest common functionality set between kpatch and
> > > > kgraft and can accept patches built using either method.
> > > > 
> > > > This first version does not implement any consistency mechanism that
> > > > ensures that old and new code do not run together.  In practice, ~90% of
> > > > CVEs are safe to apply in this way, since they simply add a conditional
> > > > check.  However, any function change that can not execute safely with
> > > > the old version of the function can _not_ be safely applied in this
> > > > version.
> > > 
> > > [...]
> > >  
> > > > +/******************************
> > > > + * module notifier
> > > > + *****************************/
> > > > +
> > > > +static int lp_module_notify(struct notifier_block *nb, unsigned long action,
> > > > +			    void *data)
> > > > +{
> > > > +	struct module *mod = data;
> > > > +	struct lpc_patch *patch;
> > > > +	struct lpc_object *obj;
> > > > +	int ret = 0;
> > > > +
> > > > +	if (action != MODULE_STATE_COMING)
> > > > +		return 0;
> > > 
> > > IMHO, we should handle also MODULE_STATE_GOING. We should unregister
> > > the ftrace handlers and update the state of the affected objects
> > > (ENABLED -> DISABLED)
> > 
> > The mechanism we use to avoid this right now is taking a reference on the
> > patched module.  We only release that reference after the patch is
> > disabled, which unregisters all the patched functions from ftrace.
> 
> I see. This was actually another thing that I noticed and wanted to
> investigate :-) I think that we should not force users to disable
> the entire patch if they want to remove some module.

I agree that would be better.

> 
> 
> > However, your comment reminded me of an idea I had to use
> > MODULE_STATE_GOING and let the lpc_mutex protect against races.  I think
> > it could be cleaner, but I haven't fleshed the idea out fully.
> 
> AFAIK, the going module is no longer used when the notifier is
> called. Therefore we could remove the patch the fast way even when
> patching would otherwise require the slow path.

Yes, and Josh just brought to my attention that the notifiers are
called with GOING _after_ the module's exit function is called.

Thanks,
Seth

> 
> 
> > > 
> > > > +	down(&lpc_mutex);
> > > > +
> > > > +	list_for_each_entry(patch, &lpc_patches, list) {
> > > > +		if (patch->state == DISABLED)
> > > > +			continue;
> > > > +		list_for_each_entry(obj, &patch->objs, list) {
> > > > +			if (strcmp(obj->name, mod->name))
> > > > +				continue;
> > > > +			pr_notice("load of module '%s' detected, applying patch '%s'\n",
> > > > +				  mod->name, patch->mod->name);
> > > > +			obj->mod = mod;
> > > > +			ret = lpc_enable_object(patch->mod, obj);
> > > > +			if (ret)
> > > > +				goto out;
> > > > +			break;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	up(&lpc_mutex);
> > > > +	return 0;
> > > > +out:
> > > 
> > > I would name this err_out or so to make it clear that it is used when
> > > something fails.
> > 
> > Just "err" good?
> 
> Fine with me.
>  
> > > > +	up(&lpc_mutex);
> > > > +	WARN("failed to apply patch '%s' to module '%s'\n",
> > > > +		patch->mod->name, mod->name);
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static struct notifier_block lp_module_nb = {
> > > > +	.notifier_call = lp_module_notify,
> > > > +	.priority = INT_MIN, /* called last */
> > > 
> > > The handler for MODULE_STATE_COMING would need to have a higher priority,
> > > if we want to cleanly unregister the ftrace handlers.
> > 
> > Yes, we might need two handlers at different priorities if we decide to
> > go that direction: one for MODULE_STATE_GOING at high/max and one for
> > MODULE_STATE_COMING at low/min.
> 
> kGraft has a notifier only for the going state. The initialization is
> called directly from load_module() after ftrace_module_init()
> and complete_formation() before it is executed by parse_args().
> 
> I need to investigate if the notifier is more elegant and safe or not.
> 
> Best Regards,
> Petr


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
                     ` (4 preceding siblings ...)
  2014-11-07 17:39   ` more patches for the same func: " Petr Mladek
@ 2014-11-07 19:40   ` Andy Lutomirski
  2014-11-07 19:42     ` Seth Jennings
  2014-11-07 19:52     ` Seth Jennings
  2014-11-10 10:08   ` Jiri Kosina
  2014-11-13 10:16   ` Miroslav Benes
  7 siblings, 2 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-07 19:40 UTC (permalink / raw)
  To: Seth Jennings, Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt
  Cc: live-patching, kpatch, linux-kernel

On 11/06/2014 06:39 AM, Seth Jennings wrote:
> This commit introduces code for the live patching core.  It implements
> an ftrace-based mechanism and kernel interface for doing live patching
> of kernel and kernel module functions.
> 
> It represents the greatest common functionality set between kpatch and
> kgraft and can accept patches built using either method.
> 
> This first version does not implement any consistency mechanism that
> ensures that old and new code do not run together.  In practice, ~90% of
> CVEs are safe to apply in this way, since they simply add a conditional
> check.  However, any function change that can not execute safely with
> the old version of the function can _not_ be safely applied in this
> version.
> 

[...]

> +/********************************************
> + * Sysfs Interface
> + *******************************************/
> +/*
> + * /sys/kernel/livepatch
> + * /sys/kernel/livepatch/<patch>
> + * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/<object>
> + * /sys/kernel/livepatch/<patch>/<object>/<func>
> + * /sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
> + * /sys/kernel/livepatch/<patch>/<object>/<func>/old_addr
> + */

Letting anyone read new_addr and old_addr is a kASLR leak, and I would
argue that showing this information to non-root at all is probably a bad
idea.

Can you make new_addr and old_addr have mode 0600 and
/sys/kernel/livepatch itself have mode 0500?  For the latter, an admin
who wants unprivileged users to be able to see it can easily chmod it.
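As a sketch of what the suggestion might look like in the patch
(hypothetical attribute and function names; since old_addr is show-only,
0400 gives the effect of the suggested 0600 for root-only reads):

```c
/* Hypothetical sketch: the mode is part of the attribute definition,
 * so a root-only, read-only attribute could be declared as: */
static struct kobj_attribute old_addr_kobj_attr =
	__ATTR(old_addr, 0400, old_addr_show, NULL);
```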

--Andy


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 19:40   ` Andy Lutomirski
@ 2014-11-07 19:42     ` Seth Jennings
  2014-11-07 19:52     ` Seth Jennings
  1 sibling, 0 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-07 19:42 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 11:40:38AM -0800, Andy Lutomirski wrote:
> On 11/06/2014 06:39 AM, Seth Jennings wrote:
> > This commit introduces code for the live patching core.  It implements
> > an ftrace-based mechanism and kernel interface for doing live patching
> > of kernel and kernel module functions.
> > 
> > It represents the greatest common functionality set between kpatch and
> > kgraft and can accept patches built using either method.
> > 
> > This first version does not implement any consistency mechanism that
> > ensures that old and new code do not run together.  In practice, ~90% of
> > CVEs are safe to apply in this way, since they simply add a conditional
> > check.  However, any function change that can not execute safely with
> > the old version of the function can _not_ be safely applied in this
> > version.
> > 
> 
> [...]
> 
> > +/********************************************
> > + * Sysfs Interface
> > + *******************************************/
> > +/*
> > + * /sys/kernel/livepatch
> > + * /sys/kernel/livepatch/<patch>
> > + * /sys/kernel/livepatch/<patch>/enabled
> > + * /sys/kernel/livepatch/<patch>/<object>
> > + * /sys/kernel/livepatch/<patch>/<object>/<func>
> > + * /sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
> > + * /sys/kernel/livepatch/<patch>/<object>/<func>/old_addr
> > + */
> 
> Letting anyone read new_addr and old_addr is a kASLR leak, and I would
> argue that showing this information to non-root at all is probably a bad
> idea.
> 
> Can you make new_addr and old_addr have mode 0600 and
> /sys/kernel/livepatch itself have mode 0500?  For the latter, an admin
> who wants unprivileged users to be able to see it can easily chmod it.

Good call.  Will do.

Thanks,
Seth

> 
> --Andy
> --
> To unsubscribe from this list: send the line "unsubscribe live-patching" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 19:40   ` Andy Lutomirski
  2014-11-07 19:42     ` Seth Jennings
@ 2014-11-07 19:52     ` Seth Jennings
  1 sibling, 0 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-07 19:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 11:40:38AM -0800, Andy Lutomirski wrote:
> On 11/06/2014 06:39 AM, Seth Jennings wrote:
> > This commit introduces code for the live patching core.  It implements
> > an ftrace-based mechanism and kernel interface for doing live patching
> > of kernel and kernel module functions.
> > 
> > It represents the greatest common functionality set between kpatch and
> > kgraft and can accept patches built using either method.
> > 
> > This first version does not implement any consistency mechanism that
> > ensures that old and new code do not run together.  In practice, ~90% of
> > CVEs are safe to apply in this way, since they simply add a conditional
> > check.  However, any function change that can not execute safely with
> > the old version of the function can _not_ be safely applied in this
> > version.
> > 
> 
> [...]
> 
> > +/********************************************
> > + * Sysfs Interface
> > + *******************************************/
> > +/*
> > + * /sys/kernel/livepatch
> > + * /sys/kernel/livepatch/<patch>
> > + * /sys/kernel/livepatch/<patch>/enabled
> > + * /sys/kernel/livepatch/<patch>/<object>
> > + * /sys/kernel/livepatch/<patch>/<object>/<func>
> > + * /sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
> > + * /sys/kernel/livepatch/<patch>/<object>/<func>/old_addr
> > + */
> 
> Letting anyone read new_addr and old_addr is a kASLR leak, and I would
> argue that showing this information to non-root at all is probably a bad
> idea.

Also worth noting that this live patching implementation currently
doesn't support kASLR: the patch module can supply the old_addr for a
particular function, determined at generation time by pulling it from
vmlinux/System.map/etc, to resolve symbol ambiguity in a kallsyms
lookup.  Obviously, such an old_addr would be wrong for a kernel using
kASLR.

Thanks,
Seth

> 
> Can you make new_addr and old_addr have mode 0600 and
> /sys/kernel/livepatch itself have mode 0500?  For the latter, an admin
> who wants unprivileged users to be able to see it can easily chmod it.
> 
> --Andy


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 18:21       ` Petr Mladek
@ 2014-11-07 20:31         ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-07 20:31 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Seth Jennings, Jiri Slaby, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt, live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 07:21:03PM +0100, Petr Mladek wrote:
> On Thu 2014-11-06 10:57:48, Seth Jennings wrote:
> > On Thu, Nov 06, 2014 at 04:51:02PM +0100, Jiri Slaby wrote:
> > > On 11/06/2014, 03:39 PM, Seth Jennings wrote:
> > > > +/*************************************
> > > > + * Core structures
> > > > + ************************************/
> > > > +
> > > > +/*
> > > > + * lp_ structs vs lpc_ structs
> > > > + *
> > > > + * For each element (patch, object, func) in the live-patching code,
> > > > + * there are two types with two different prefixes: lp_ and lpc_.
> > > > + *
> > > > + * Structures used by the live-patch modules to register with this core module
> > > > + * are prefixed with lp_ (live patching).  These structures are part of the
> > > > + * registration API and are defined in livepatch.h.  The structures used
> > > > + * internally by this core module are prefixed with lpc_ (live patching core).
> > > > + */
> > > 
> > > I am not sure if the separation and the allocations/kobj handling are
> > > worth it. It makes the code really less understandable. Can we have just
> > > struct lip_function (don't unnecessarily abbreviate), lip_objectfile
> > > (object is too generic, like Java object) and lip_patch containing all
> > > the needed information? It would clean up the code a lot. (Yes, we would
> > > have profited from c++ here.)
> > 
> > I looked at doing this and this is actually what we did in kpatch.  We
> > made one structure that had "private" members that the user wasn't
> > supposed to access and that were only used in the core.  This was messy
> > though.  Every time you wanted to add a "private" field to the struct so
> > the core could do something new, you were changing the API to the patch
> > modules as well.  While copying the data into an internal structure does
> > add code and opportunity for errors, that functionality is localized
> > into functions that are specifically tasked with taking care of that.
> > So the risk is minimized and we gain flexibility within the core and
> > more self-documenting API structures.
> 
> I am not sure if the modified API is really such a big limitation. The
> modules initialize the needed members using ".member = value".
> Also, we do not need to take care of API/ABI backward compatibility because
> there is a very strict dependency between patches and the patched
> kernel.

Our patch module generation tool (kpatch-build) relies on the API as
well, so we should try to keep the API as stable as possible.  At least
until we can put kpatch-build (or something like it) into the kernel
tree.

-- 
Josh


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-07 15:45               ` Josh Poimboeuf
@ 2014-11-07 21:27                 ` Vojtech Pavlik
  2014-11-08  3:45                   ` Josh Poimboeuf
  2014-11-11  1:24                   ` Masami Hiramatsu
  0 siblings, 2 replies; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-07 21:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 09:45:00AM -0600, Josh Poimboeuf wrote:

> > 	LEAVE_FUNCTION
> > 	LEAVE_PATCHED_SET
> > 	LEAVE_KERNEL
> > 
> > 	SWITCH_FUNCTION
> > 	SWITCH_THREAD
> > 	SWITCH_KERNEL
> > 
> > Now with those definitions:
> > 
> > 	livepatch (null model), as is, is LEAVE_FUNCTION and SWITCH_FUNCTION
> > 
> > 	kpatch, masami-refcounting and Ksplice are LEAVE_PATCHED_SET and SWITCH_KERNEL
> > 
> > 	kGraft is LEAVE_KERNEL and SWITCH_THREAD
> > 
> > 	CRIU/kexec is LEAVE_KERNEL and SWITCH_KERNEL
> 
> Thanks, nice analysis!
> 
> > By blending kGraft and masami-refcounting, we could create a consistency
> > engine capable of almost any combination of these properties and thus
> > all the consistency models.
> 
> Can you elaborate on what this would look like?

There would be the refcounting engine, counting entries/exits of the
area of interest (nothing for LEAVE_FUNCTION, patched functions for
LEAVE_PATCHED_SET - same as Masami's work now, or syscall entry/exit for
LEAVE_KERNEL), and it'd do the counting either per thread, flagging a
thread as 'new universe' when the count goes to zero, or flipping a
'new universe' switch for the whole kernel when the count goes down to zero.

A patch would have flags which specify a combination of the above
properties that are needed for successful patching of that specific
patch.
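As a rough illustration of the per-thread counting (all names invented;
kGraft's real implementation differs, and locking/ordering is ignored
here), the engine's bookkeeping might be modelled like this:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the refcounting engine's per-thread
 * mode: count entries/exits of the area of interest and flag a thread
 * as being in the new universe once its count drops to zero while a
 * patch is in progress.  All names are invented for illustration.
 */
struct thread_state {
	int depth;		/* nesting count inside the watched area */
	bool new_universe;	/* set once this thread has switched */
};

static bool patch_in_progress;

static void area_enter(struct thread_state *t)
{
	t->depth++;
}

static void area_exit(struct thread_state *t)
{
	if (--t->depth == 0 && patch_in_progress)
		t->new_universe = true;	/* this thread switched */
}
```

The kernel-wide SWITCH_KERNEL variant would instead keep one counter of
threads inside the area and flip a single global flag when it reaches
zero.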

> The big problem with SWITCH_THREAD is that it adds the possibility that
> old functions can run simultaneously with new ones.  When you change
> data or data semantics, which is roughly 10% of security patches, it
> creates some serious headaches:
> 
> - It makes patch safety analysis much harder by doubling the number of
>   permutations of scenarios you have to consider.  In addition to
>   considering newfunc/olddata and newfunc/newdata, you also have to
>   consider oldfunc/olddata and oldfunc/newdata.
> 
> - It requires two patches instead of one.  The first patch is needed to
>   modify the old functions to be able to deal with new data.  After the
>   first patch has been fully applied, then you apply the second patch
>   which can start creating new versions of data.

For data layout and semantic changes, there are two approaches:

	1) TRANSFORM_WORLD

	Stop the world, transform everything, resume. This is what Ksplice does
	and what could work for kpatch; it would be rather interesting (but
	possible) for masami-refcounting and doesn't work at all for the
	per-thread kGraft.

	It allows deallocating structures and allocating new ones, basically
	rebuilding the data structures of the kernel. No shadowing or use
	of padding is needed.

	The nice part is that the patch can stay pretty much the original patch
	that fixes the bug when applied to normal kernel sources.

	The most tricky part with this approach is writing the
	additional transformation code. Finding all instances of a
	changed data structure. It fails if only semantics are changed,
	but that is easily fixed by making sure there is always a layout
	change for any semantic change. All instances of a specific data
	structure can be found, worst case with some compiler help: No
	function can have pointers or instances of the structure on the
	stack, or registers, as that would include it in the patched
	set. So all have to be either global, or referenced by a
	globally-rooted tree, linked list or any other structure.

	This one is also possible to revert, if a reverse-transforming function
	is provided.

	masami-refcounting can be made to work with this by spinning in every
	function entry ftrace/kprobe callback after a universe flip and calling
	stop_kernel from the function exit callback that flipped the switch.

	2) TRANSFORM_ON_ACCESS

	This requires structure versioning and/or shadowing. All 'new'
	functions are written with this in mind and can handle both the old
	and new data formats, transforming the data to the new format. When
	the universe transition is completed for the whole system, a single
	flag is flipped for the functions to start transforming.

	The advantage is not having to look up every single instance of the
	structure, and not having to make sure you found them all.

	The disadvantages are that the patch now looks very different to what
	goes into the kernel sources, and that you never know whether the
	conversion is complete. Reverting the patch is tough, although this
	can be helped by keeping track of transformed functions, at the cost
	of maintaining another data structure for that.

	It works with any of the approaches (except null model) and while it
	needs two steps (patch, then enable conversion), it doesn't require two
	rounds of patching. Also, you don't have to consider oldfunc/newdata as
	that will never happen. oldfunc/olddata obviously works, so you only
	have to look at newfunc/olddata and newfunc/newdata as the
	transformation goes on.
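A sketch of TRANSFORM_ON_ACCESS with structure versioning (hypothetical names; a real patch might instead use shadow data or struct padding):

```c
#include <assert.h>

/* Hypothetical versioned structure: "v2" adds a hit counter. */
struct conn {
	int version;	/* 1 = old layout/semantics, 2 = new */
	int port;
	int hits;	/* meaningful only when version == 2 */
};

/* Flipped once the universe transition has completed system-wide. */
static int transform_enabled;

/* A 'new' function: written to cope with both formats, and to upgrade
 * old instances lazily once conversion is allowed. */
static int conn_use(struct conn *c)
{
	if (transform_enabled && c->version == 1) {
		c->hits = 0;		/* initialize the field v1 never had */
		c->version = 2;
	}
	if (c->version == 2)
		c->hits++;
	return c->port;
}
```

Before the flag flips, new code runs but leaves old instances untouched; afterwards, each instance is converted the first time it is accessed.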

I don't see either of these as really that much simpler. But I do see value
in offering both.

> On the other hand, SWITCH_KERNEL doesn't have those problems.  It does
> have the problem you mentioned, roughly 2% of the time, where it can't
> patch functions which are always in use.  But in that case we can skip
> the backtrace check ~90% of the time.  

An interesting bit is that when you skip the backtrace check you're
actually reverting to LEAVE_FUNCTION SWITCH_FUNCTION, forfeiting all
consistency and not LEAVE_FUNCTION SWITCH_KERNEL as one would expect.

Hence for those 2% of cases (going with your number, because it's a
guess anyway) LEAVE_PATCHED_SET SWITCH_THREAD would in fact be a safer
option.

> So it's really maybe something
> like 0.2% of patches which can't be patched with SWITCH_KERNEL.  But
> even then I think we could overcome that by getting creative, e.g. using
> the multiple patch approach.
> 
> So my perspective is that SWITCH_THREAD causes big headaches 10% of the
> time, whereas SWITCH_KERNEL causes small headaches 1.8% of the time, and
> big headaches 0.2% of the time :-)

My preferred way would be to go with SWITCH_THREAD for the simpler stuff
and do a SWITCH_KERNEL for the 10% of complex patches. This because
(LEAVE_PATCHED_SET) SWITCH_THREAD finishes much quicker. But I'm biased
there. ;)

It seems more and more to me that we will actually want the more
powerful engine coping with the various options.

-- 
Vojtech Pavlik
Director SUSE Labs

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: more patches for the same func: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 17:39   ` more patches for the same func: " Petr Mladek
@ 2014-11-07 21:54     ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-07 21:54 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 06:39:03PM +0100, Petr Mladek wrote:
> On Thu 2014-11-06 08:39:08, Seth Jennings wrote:
> > This commit introduces code for the live patching core.  It implements
> > an ftrace-based mechanism and kernel interface for doing live patching
> > of kernel and kernel module functions.
> > 
> > It represents the greatest common functionality set between kpatch and
> > kgraft and can accept patches built using either method.
> > 
> > This first version does not implement any consistency mechanism that
> > ensures that old and new code do not run together.  In practice, ~90% of
> > CVEs are safe to apply in this way, since they simply add a conditional
> > check.  However, any function change that can not execute safely with
> > the old version of the function can _not_ be safely applied in this
> > version.
> 
> [...] 
> 
> > +static int lpc_enable_func(struct lpc_func *func)
> > +{
> > +	int ret;
> > +
> > +	BUG_ON(!func->old_addr);
> > +	BUG_ON(func->state != DISABLED);
> > +	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 0, 0);
> > +	if (ret) {
> > +		pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> > +		       func->old_name, ret);
> > +		return ret;
> > +	}
> > +	ret = register_ftrace_function(&func->fops);
> > +	if (ret) {
> > +		pr_err("failed to register ftrace handler for function '%s' (%d)\n",
> > +		       func->old_name, ret);
> > +		ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
> > +	} else
> > +		func->state = ENABLED;
> > +
> > +	return ret;
> > +}
> > +
> 
> [...]
> 
> > +/* caller must ensure that obj->mod is set if object is a module */
> > +static int lpc_enable_object(struct module *pmod, struct lpc_object *obj)
> > +{
> > +	struct lpc_func *func;
> > +	int ret;
> > +
> > +	if (obj->mod && !try_module_get(obj->mod))
> > +		return -ENODEV;
> > +
> > +	if (obj->dynrelas) {
> > +		ret = lpc_write_object_relocations(pmod, obj);
> > +		if (ret)
> > +			goto unregister;
> > +	}
> > +	list_for_each_entry(func, &obj->funcs, list) {
> > +		ret = lpc_find_verify_func_addr(func, obj->name);
> > +		if (ret)
> > +			goto unregister;
> > +
> > +		ret = lpc_enable_func(func);
> > +		if (ret)
> > +			goto unregister;
> > +	}
> > +	obj->state = ENABLED;
> > +
> > +	return 0;
> > +unregister:
> > +	WARN_ON(lpc_unregister_object(obj));
> > +	return ret;
> > +}
> > +
> > +/******************************
> > + * enable/disable
> > + ******************************/
> > +
> > +/* must be called with lpc_mutex held */
> > +static struct lpc_patch *lpc_find_patch(struct lp_patch *userpatch)
> > +{
> > +	struct lpc_patch *patch;
> > +
> > +	list_for_each_entry(patch, &lpc_patches, list)
> > +		if (patch->userpatch == userpatch)
> > +			return patch;
> > +
> > +	return NULL;
> > +}
> 
> [...]
> 
> > +
> > +/* must be called with lpc_mutex held */
> > +static int lpc_enable_patch(struct lpc_patch *patch)
> > +{
> > +	struct lpc_object *obj;
> > +	int ret;
> > +
> > +	BUG_ON(patch->state != DISABLED);
> > +
> > +	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> > +	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
> > +
> > +	pr_notice("enabling patch '%s'\n", patch->mod->name);
> > +
> > +	list_for_each_entry(obj, &patch->objs, list) {
> > +		if (!is_object_loaded(obj))
> > +			continue;
> > +		ret = lpc_enable_object(patch->mod, obj);
> > +		if (ret)
> > +			goto unregister;
> > +	}
> > +	patch->state = ENABLED;
> > +	return 0;
> > +
> > +unregister:
> > +	WARN_ON(lpc_disable_patch(patch));
> > +	return ret;
> > +}
> > +
> > +int lp_enable_patch(struct lp_patch *userpatch)
> > +{
> > +	struct lpc_patch *patch;
> > +	int ret;
> > +
> > +	down(&lpc_mutex);
> > +	patch = lpc_find_patch(userpatch);
> > +	if (!patch) {
> > +		ret = -ENODEV;
> > +		goto out;
> > +	}
> > +	ret = lpc_enable_patch(patch);
> > +out:
> > +	up(&lpc_mutex);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(lp_enable_patch);
> 
> AFAIK, this does not handle correctly the situation when there
> are more patches for the same symbol. IMHO, the first registered
> ftrace function wins. It means that later patches are ignored.

Yeah, good point.  With kpatch we had a single ftrace_ops which was
shared for all patched functions, so we didn't have this problem.  From
the ftrace handler, we would just get the most recent version of the
function from our hash table.

With livepatch we decided to change to the model of one ftrace ops per
patched function.  Which I think is probably a good idea, but it's hard
to say for sure without knowing our consistency model yet.

> 
> In kGraft, we detect this situation and do the following:
> 
>    add_new_ftrace_function()
>    /* old one still might be used at this stage */
>    if (old_function)
>       remove_old_ftrace_function();
>    /* the new one is used from now on */
> 
> Similar problem is when a patch is disabled. We need to know
> if it was actually used. If not, we are done. If it is active,
> we need to look if there is an older patch for the same
> symbol and enable the other ftrace function instead.

Yeah, maybe we should do something similar for livepatch.
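A toy model of that stacking/fallback behaviour (made-up names, not the kGraft or livepatch API): each symbol keeps a stack of applied patches, so enabling installs the newest version with no unpatched window, and disabling falls back to the next older patch.

```c
#include <assert.h>

#define MAX_PATCHES 4

typedef int (*fn_t)(void);

/* Stand-ins for the original function and two stacked patches. */
static int v_orig(void) { return 0; }
static int v_p1(void)   { return 1; }
static int v_p2(void)   { return 2; }

static fn_t stack[MAX_PATCHES];
static int depth;

/* What the ftrace handler would redirect this symbol to right now. */
static fn_t resolve(void)
{
	return depth ? stack[depth - 1] : v_orig;
}

static void enable_patch(fn_t f)
{
	stack[depth++] = f;	/* newest version wins immediately */
}

/* Disabling an active patch re-activates the next older patch for the
 * same symbol, mirroring the behaviour described above. */
static void disable_patch(fn_t f)
{
	for (int i = 0; i < depth; i++) {
		if (stack[i] == f) {
			for (int j = i; j < depth - 1; j++)
				stack[j] = stack[j + 1];
			depth--;
			break;
		}
	}
}
```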

> 
> Best Regards,
> Petr
> 
> 
> PS: We should probably decide on the used structures before we start
> coding fixes for this particular problems. I have similar concern about
> the complexity as my colleagues have. But I need to think more about
> it. Let's discuss it in the other thread.

Sounds good :-)

-- 
Josh

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-07 21:27                 ` Vojtech Pavlik
@ 2014-11-08  3:45                   ` Josh Poimboeuf
  2014-11-08  8:07                     ` Vojtech Pavlik
  2014-11-11  1:24                   ` Masami Hiramatsu
  1 sibling, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-08  3:45 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 10:27:35PM +0100, Vojtech Pavlik wrote:
> On Fri, Nov 07, 2014 at 09:45:00AM -0600, Josh Poimboeuf wrote:
> 
> > > 	LEAVE_FUNCTION
> > > 	LEAVE_PATCHED_SET
> > > 	LEAVE_KERNEL
> > > 
> > > 	SWITCH_FUNCTION
> > > 	SWITCH_THREAD
> > > 	SWITCH_KERNEL
> > > 
> > > Now with those definitions:
> > > 
> > > 	livepatch (null model), as is, is LEAVE_FUNCTION and SWITCH_FUNCTION
> > > 
> > > 	kpatch, masami-refcounting and Ksplice are LEAVE_PATCHED_SET and SWITCH_KERNEL
> > > 
> > > 	kGraft is LEAVE_KERNEL and SWITCH_THREAD
> > > 
> > > 	CRIU/kexec is LEAVE_KERNEL and SWITCH_KERNEL
> > 
> > Thanks, nice analysis!
> > 
> > > By blending kGraft and masami-refcounting, we could create a consistency
> > > engine capable of almost any combination of these properties and thus
> > > all the consistency models.
> > 
> > Can you elaborate on what this would look like?
> 
> There would be the refcounting engine, counting entries/exits of the
> area of interest (nothing for LEAVE_FUNCTION, patched functions for
> LEAVE_PATCHED_SET - same as Masami's work now, or syscall entry/exit for
> LEAVE_KERNEL), and it'd do the counting either per thread, flagging a
> thread as 'new universe' when the count goes to zero, or flipping a
> 'new universe' switch for the whole kernel when the count goes down to zero.
> 
> A patch would have flags which specify a combination of the above
> properties that are needed for successful patching of that specific
> patch.

Would it really be necessary to support all possible permutations?  I
think that would add a lot of complexity to the code.  Especially if we
have to support LEAVE_KERNEL, which adds a lot of interdependencies with
the rest of the kernel (kthreads, syscall, irq, etc).

It seems to me that the two most interesting options are:
- LEAVE_PATCHED_SET + SWITCH_THREAD (Masami-kGraft)
- LEAVE_PATCHED_SET + SWITCH_KERNEL (kpatch and/or Masami-kpatch)

I do see some appeal for the choose-your-own-consistency-model idea.
But it wouldn't be practical for cumulative patch upgrades, which I
think we both agreed at LPC seems to be safer than incremental patching.
If you only ever have one patch module loaded at any given time, you
only get one consistency model anyway.

In order for multiple consistency models to be practical, I think we'd
need to figure out how to make incremental patching safe.

> > The big problem with SWITCH_THREAD is that it adds the possibility that
> > old functions can run simultaneously with new ones.  When you change
> > data or data semantics, which is roughly 10% of security patches, it
> > creates some serious headaches:
> > 
> > - It makes patch safety analysis much harder by doubling the number of
> >   permutations of scenarios you have to consider.  In addition to
> >   considering newfunc/olddata and newfunc/newdata, you also have to
> >   consider oldfunc/olddata and oldfunc/newdata.
> > 
> > - It requires two patches instead of one.  The first patch is needed to
> >   modify the old functions to be able to deal with new data.  After the
> >   first patch has been fully applied, then you apply the second patch
> >   which can start creating new versions of data.
> 
> For data layout and semantic changes, there are two approaches:
> 
> 	1) TRANSFORM_WORLD
> 
> 	Stop the world, transform everything, resume. This is what Ksplice does
> 	and what could work for kpatch, would be rather interesting (but
> 	possible) for masami-refcounting and doesn't work at all for the
> 	per-thread kGraft.
> 
> 	It allows to deallocate structures, allocate new ones, basically
> 	rebuild the data structures of the kernel. No shadowing or using
> 	of padding is needed.
> 
> 	The nice part is that the patch can stay pretty much the original patch
> 	that fixes the bug when applied to normal kernel sources.
> 
> 	The most tricky part with this approach is writing the
> 	additional transformation code. Finding all instances of a
> 	changed data structure. It fails if only semantics are changed,
> 	but that is easily fixed by making sure there is always a layout
> 	change for any semantic change. All instances of a specific data
> 	structure can be found, worst case with some compiler help: No
> 	function can have pointers or instances of the structure on the
> 	stack, or registers, as that would include it in the patched
> 	set. So all have to be either global, or referenced by a
> 	globally-rooted tree, linked list or any other structure.
> 
> 	This one is also possible to revert, if a reverse-transforming function
> 	is provided.
> 
> 	masami-refcounting can be made to work with this by spinning in every
> 	function entry ftrace/kprobe callback after a universe flip and calling
> 	stop_kernel from the function exit callback that flipped the switch.

I'm kind of surprised to hear that Ksplice does this.  I had considered
this approach, but it sounds really tricky, if not impossible in many
cases.

Ahem, this would be an opportune time for a Ksplice person to chime in
with their experiences...

> 	2) TRANSFORM_ON_ACCESS
> 
> 	This requires structure versioning and/or shadowing. All 'new' functions
> 	are written with this in mind and can both handle the old and new data formats
> 	and transform the data to the new format. When universe transition is
> 	completed for the whole system, a single flag is flipped for the
> 	functions to start transforming.

This is a variation on what we've been doing with kpatch, using shadow
data fields to add data and/or version metadata to structures, with a
few differences:

First, we haven't been transforming existing data.  All existing data
structures stay at v1.  Only new data structures are created as v2.  I
suppose it depends on the nature of the patch as to whether it's safe to
convert existing data.

Second, we don't need to flip a flag, because with SWITCH_KERNEL the
system universe transition happens instantly.
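For illustration, a sketch of the shadow-data idea (hypothetical names; kpatch's actual shadow implementation differs and this linear table merely keeps the sketch short):

```c
#include <assert.h>
#include <stddef.h>

/* The original layout cannot change at runtime. */
struct sock_v1 { int port; };

/* Shadow storage: an external table keyed by object address attaches a
 * new field without touching the original structure layout. */
#define MAX_SHADOW 8

static struct { void *obj; int extra; } shadow[MAX_SHADOW];
static int nr_shadow;

/* Called by new code when it creates a "v2" object. */
static void shadow_attach(void *obj, int extra)
{
	shadow[nr_shadow].obj = obj;
	shadow[nr_shadow].extra = extra;
	nr_shadow++;
}

/* Returns NULL for pre-patch objects that never got a shadow; that is
 * how new code tells v1 and v2 instances apart. */
static int *shadow_get(void *obj)
{
	for (int i = 0; i < nr_shadow; i++)
		if (shadow[i].obj == obj)
			return &shadow[i].extra;
	return NULL;
}
```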

> 	The advantage is to not have to look up every single instance of the
> 	structure and not having to make sure you found them all.
>
> 	The disadvantages are that the patch now looks very different to what
> 	goes into the kernel sources,

In my experience, s/very different/slightly different/.

>	that you never know whether the conversion
> 	is complete and reverting the patch is tough, although can be helped by
> 	keeping track of transformed functions at a cost of maintaining another
> 	data structure for that.

True, and we might even want to prevent or discourage such patches from
being disabled somehow.

> 	It works with any of the approaches (except null model) and while it
> 	needs two steps (patch, then enable conversion), it doesn't require two
> 	rounds of patching. Also, you don't have to consider oldfunc/newdata as
> 	that will never happen. oldfunc/olddata obviously works, so you only
> 	have to look at newfunc/olddata and newfunc/newdata as the
> 	transformation goes on.

Makes sense, I hadn't thought of the flag flipping approach for
SWITCH_THREAD.  It's definitely a lot better than requiring an
intermediate patch.

> I don't see either of these as really that much simpler. But I do see value
> in offering both.
> 
> > On the other hand, SWITCH_KERNEL doesn't have those problems.  It does
> > have the problem you mentioned, roughly 2% of the time, where it can't
> > patch functions which are always in use.  But in that case we can skip
> > the backtrace check ~90% of the time.  
> 
> An interesting bit is that when you skip the backtrace check you're
> actually reverting to LEAVE_FUNCTION SWITCH_FUNCTION, forfeiting all
> consistency and not LEAVE_FUNCTION SWITCH_KERNEL as one would expect.

Hm, if we used stop machine (or ref counting), but without the backtrace
check, wouldn't it be LEAVE_FUNCTION SWITCH_KERNEL?

> Hence for those 2% of cases (going with your number, because it's a
> guess anyway) LEAVE_PATCHED_SET SWITCH_THREAD would in fact be a safer
> option.
> 
> > So it's really maybe something
> > like 0.2% of patches which can't be patched with SWITCH_KERNEL.  But
> > even then I think we could overcome that by getting creative, e.g. using
> > the multiple patch approach.
> > 
> > So my perspective is that SWITCH_THREAD causes big headaches 10% of the
> > time, whereas SWITCH_KERNEL causes small headaches 1.8% of the time, and
> > big headaches 0.2% of the time :-)
> 
> My preferred way would be to go with SWITCH_THREAD for the simpler stuff
> and do a SWITCH_KERNEL for the 10% of complex patches.

You said above that neither SWITCH_KERNEL nor SWITCH_THREAD is much
simpler than the other for the 10% case.  So why would you use
SWITCH_KERNEL here?

> This because (LEAVE_PATCHED_SET) SWITCH_THREAD finishes much quicker.
> But I'm biased there. ;)

Why would LEAVE_PATCHED_SET SWITCH_THREAD finish much quicker than
LEAVE_PATCHED_SET SWITCH_KERNEL?  Wouldn't they be about the same?

-- 
Josh

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-08  3:45                   ` Josh Poimboeuf
@ 2014-11-08  8:07                     ` Vojtech Pavlik
  2014-11-10 17:09                       ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-08  8:07 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 09:45:53PM -0600, Josh Poimboeuf wrote:

> On Fri, Nov 07, 2014 at 10:27:35PM +0100, Vojtech Pavlik wrote:
> > On Fri, Nov 07, 2014 at 09:45:00AM -0600, Josh Poimboeuf wrote:
> > 
> > > > 	LEAVE_FUNCTION
> > > > 	LEAVE_PATCHED_SET
> > > > 	LEAVE_KERNEL
> > > > 
> > > > 	SWITCH_FUNCTION
> > > > 	SWITCH_THREAD
> > > > 	SWITCH_KERNEL
> > > > 
> > 
> > There would be the refcounting engine, counting entries/exits of the
> > area of interest (nothing for LEAVE_FUNCTION, patched functions for
> > LEAVE_PATCHED_SET - same as Masami's work now, or syscall entry/exit
> > for LEAVE_KERNEL), and it'd do the counting either per thread,
> > flagging a thread as 'new universe' when the count goes to zero, or
> > flipping a 'new universe' switch for the whole kernel when the count
> > goes down to zero.
> > 
> > A patch would have flags which specify a combination of the above
> > properties that are needed for successful patching of that specific
> > patch.
> 
> Would it really be necessary to support all possible permutations?  I
> think that would add a lot of complexity to the code.  Especially if we
> have to support LEAVE_KERNEL, which adds a lot of interdependencies with
> the rest of the kernel (kthreads, syscall, irq, etc).
> 
> It seems to me that the two most interesting options are:
> - LEAVE_PATCHED_SET + SWITCH_THREAD (Masami-kGraft)
> - LEAVE_PATCHED_SET + SWITCH_KERNEL (kpatch and/or Masami-kpatch)

I agree here.

In fact LEAVE_KERNEL can be approximated by extending the patched
set as required to include functions which are not changed per se, but
are "contaminated" by propagation of semantic changes in the calling
convention, and/or data format.

This applies to cases like this (needs LEAVE_KERNEL or extending patched
set beyond changed functions):

-----------------------------------------------------------

	int bar() {
		[...]
-		return x;
+		return x + 1;
	}

	foo() {
		int ret = bar();
		do {
			wait_interruptible();
		} while (ret == bar());
	}

-----------------------------------------------------------

Or like this (needs SWITCH_KERNEL, so it won't work with kGraft, and also
extending the patched set, so it will not work with kpatch as it stands today):

-----------------------------------------------------------

	void lock_a()
	{
-		spin_lock(&x);
+		spin_lock(&y);
	}
	void lock_b()
	{
-		spin_lock(&y);
+		spin_lock(&x);
	}
	void unlock_a()
	{
-		spin_unlock(&x);
+		spin_unlock(&y);
	}
	void unlock_b()
	{
-		spin_unlock(&y);
+		spin_unlock(&x);
	}

	void foo()
	{
		lock_a();
		lock_b();
		[...]
		unlock_b();
		unlock_a();
	}
-----------------------------------------------------------
	

So once we can extend the patched set as needed (manually, after
review), we can handle all the cases that LEAVE_KERNEL offers, making
LEAVE_KERNEL unnecessary.

It'd be nice if we wouldn't require actually patching those functions,
only include them in the set we have to leave to proceed.

The remaining question is the performance impact of using a large set
with refcounting. LEAVE_KERNEL's impact as implemented in kGraft is too
small to be measurable: about 16 added instructions executed for each
thread.

> I do see some appeal for the choose-your-own-consistency-model idea.
> But it wouldn't be practical for cumulative patch upgrades, which I
> think we both agreed at LPC seems to be safer than incremental
> patching.  If you only ever have one patch module loaded at any given
> time, you only get one consistency model anyway.

> In order for multiple consistency models to be practical, I think we'd
> need to figure out how to make incremental patching safe.

I believe we have to get incremental patching working anyway as it is a
valid use case for many users, just not for major distributions.

And we may want to take a look at how to mark parts of a cumulative
patch with different consistency models, when we combine e.g. the recent
futex CVE patch (not working with SWITCH_KERNEL) and anything requiring
SWITCH_KERNEL.

> > For data layout and semantic changes, there are two approaches:
> > 
> > 	1) TRANSFORM_WORLD
 
> I'm kind of surprised to hear that Ksplice does this.  I had
> considered this approach, but it sounds really tricky, if not
> impossible in many cases.

By the way, the ability to do this is basically the only advantage of
actually stopping the kernel.
 
> Ahem, this would be an opportune time for a Ksplice person to chime in
> with their experiences...

Indeed. :)

> > 	2) TRANSFORM_ON_ACCESS
 
> This is a variation on what we've been doing with kpatch, using shadow
> data fields to add data and/or version metadata to structures, with a
> few differences:
> 
> First, we haven't been transforming existing data.  All existing data
> structures stay at v1.  Only new data structures are created as v2.  I
> suppose it depends on the nature of the patch as to whether it's safe
> to convert existing data.

Also it depends on whether existing data do contain enough information
to actually avoid the security issue the patch is fixing. You may want
to transform the data structures as soon as possible.

> Second, we don't need to flip a flag, because with SWITCH_KERNEL the
> system universe transition happens instantly.

Indeed. I wanted to point out that it works even with the SWITCH_THREAD
and using the patching_complete flag that already is used in kGraft for
other purposes.

> > 	The advantage is to not have to look up every single instance of
> > 	the structure and not having to make sure you found them all.
> >
> > 	The disadvantages are that the patch now looks very different to
> > 	what goes into the kernel sources,
 
> In my experience, s/very different/slightly different/.

Not the same, anyway. :)

> >	that you never know whether the conversion is complete and
> >	reverting the patch is tough, although can be helped by keeping
> >	track of transformed functions at a cost of maintaining another
> >	data structure for that.
> 
> True, and we might even want to prevent or discourage such patches
> from being disabled somehow.

Yes.

> > An interesting bit is that when you skip the backtrace check you're
> > actually reverting to LEAVE_FUNCTION SWITCH_FUNCTION, forfeiting all
> > consistency and not LEAVE_FUNCTION SWITCH_KERNEL as one would expect.
> 
> Hm, if we used stop machine (or ref counting), but without the backtrace
> check, wouldn't it be LEAVE_FUNCTION SWITCH_KERNEL?

No; if you don't check the backtrace, any of the patched functions can
be on the stack and old code can still be executed after you resume the
kernel again. Look at this:

--------------------------------------------------------------

	baz() {
	}
	bar() {
-		[...]
+		[...]
		baz();
-		[...]
+		[...]
	}
	foo() {
		bar();
	}

--------------------------------------------------------------

Now we stop_kernel on baz(). We don't check the stack. We patch bar().
We resume, and baz() happily returns into bar(), executing old code.
At the same time, another CPU can call bar(), getting new code.

Stack checking at stop_kernel() time is required to keep the
SWITCH_KERNEL part of the model.
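A toy version of that backtrace check (function ids stand in for return addresses; all names are made up): patching proceeds only if no thread's stack contains a frame in the patched set.

```c
#include <assert.h>

#define STACK_DEPTH 3
#define NPATCHED 1

/* Suppose only bar(), id 42, is in the patched set. */
static const int patched_set[NPATCHED] = { 42 };

/* Returns 1 if it is safe to patch with this stack: no frame would
 * return into old code after the kernel is resumed. */
static int stack_is_clean(const int stack[STACK_DEPTH])
{
	for (int i = 0; i < STACK_DEPTH; i++)
		for (int j = 0; j < NPATCHED; j++)
			if (stack[i] == patched_set[j])
				return 0;	/* e.g. baz() returning into old bar() */
	return 1;
}
```

In the baz()/bar() scenario above, the stack containing bar() fails this check, so stop_kernel would have to be retried later.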

> > > So my perspective is that SWITCH_THREAD causes big headaches 10%
> > > of the time, whereas SWITCH_KERNEL causes small headaches 1.8% of
> > > the time, and big headaches 0.2% of the time :-)
> > 
> > My preferred way would be to go with SWITCH_THREAD for the simpler
> > stuff and do a SWITCH_KERNEL for the 10% of complex patches.
> 
> You said above that neither SWITCH_KERNEL nor SWITCH_THREAD is much
> simpler than the other for the 10% case.  So why would you use
> SWITCH_KERNEL here?

I think I was referring to the two data transformation methods and one
not being simpler than the other.

And since we're both looking at the TRANSFORM_ON_ACCESS model, there
isn't much of a difference using SWITCH_THREAD or SWITCH_KERNEL for the
data format modification cases indeed. It works the same.

But there are a few (probably much less than 10%) cases like the locking
one I used above, where SWITCH_THREAD just isn't going to cut it and for
those I would need SWITCH_KERNEL or get very creative with refactoring
the patch to do things differently.

> > This because (LEAVE_PATCHED_SET) SWITCH_THREAD finishes much quicker.
> > But I'm biased there. ;)
> 
> Why would LEAVE_PATCHED_SET SWITCH_THREAD finish much quicker than
> LEAVE_PATCHED_SET SWITCH_KERNEL?  Wouldn't they be about the same?

Because with LEAVE_PATCHED_SET SWITCH_THREAD you're waiting for each
thread to leave the patched set and when they each have done that at
least once, you're done. Even if some are already back within the set.

With LEAVE_PATCHED_SET SWITCH_KERNEL, you have to find the perfect
moment when all of the threads are outside of the patched set at the
same time. Depending on how often the functions are used and how large
the set is, reaching that moment will get harder.
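A tiny model of that difference (purely illustrative, ignoring all synchronization): SWITCH_THREAD only needs each thread to be observed outside the set once, while SWITCH_KERNEL needs a single observation where every thread is outside simultaneously.

```c
#include <assert.h>

#define NTHREADS 2

/* Per-thread "already left the set once" flags for SWITCH_THREAD. */
static int thread_done[NTHREADS];

/* One observation: in_set[t] says whether thread t is currently inside
 * the patched set.  Returns 1 once the transition is complete. */
static int observe_switch_thread(const int in_set[NTHREADS])
{
	int done = 1;

	for (int t = 0; t < NTHREADS; t++) {
		if (!in_set[t])
			thread_done[t] = 1;	/* sticks even if t re-enters */
		done &= thread_done[t];
	}
	return done;
}

/* SWITCH_KERNEL: every thread must be outside at the same moment. */
static int observe_switch_kernel(const int in_set[NTHREADS])
{
	for (int t = 0; t < NTHREADS; t++)
		if (in_set[t])
			return 0;
	return 1;
}
```

With two threads that alternate in and out of the set, SWITCH_THREAD completes after each has been out once, while SWITCH_KERNEL never finds a moment when both are out together.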

-- 
Vojtech Pavlik
Director SUSE Labs

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-06 14:39 [PATCH 0/2] Kernel Live Patching Seth Jennings
                   ` (2 preceding siblings ...)
  2014-11-06 18:44 ` [PATCH 0/2] Kernel Live Patching Christoph Hellwig
@ 2014-11-09 20:16 ` Greg KH
  3 siblings, 0 replies; 73+ messages in thread
From: Greg KH @ 2014-11-09 20:16 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 08:39:06AM -0600, Seth Jennings wrote:
> This patchset implements an ftrace-based mechanism and kernel interface for
> doing live patching of kernel and kernel module functions.  It represents the
> greatest common functionality set between kpatch [1] and kGraft [2] and can
> accept patches built using either method.  This solution was discussed in the
> Live Patching Mini-conference at LPC 2014 [3].
> 
> The model consists of a live patching "core" that provides an interface for
> other "patch" kernel modules to register patches with the core.
> 
> Patch modules contain the new function code and create an lp_patch
> structure containing the required data about what functions to patch, where the
> new code for each patched function resides, and in which kernel object (vmlinux
> or module) the function to be patch resides.  The patch module then invokes the
> lp_register_patch() function to register with the core module, then
> lp_enable_patch() to have to core module redirect the execution paths using
> ftrace.
> 
> An example patch module can be found here:
> https://github.com/spartacus06/livepatch/blob/master/patch/patch.c
> 
> The live patching core creates a sysfs hierarchy for user-level access to live
> patching information.  The hierarchy is structured like this:
> 
> /sys/kernel/livepatch
> /sys/kernel/livepatch/<patch>
> /sys/kernel/livepatch/<patch>/enabled
> /sys/kernel/livepatch/<patch>/<object>
> /sys/kernel/livepatch/<patch>/<object>/<func>
> /sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
> /sys/kernel/livepatch/<patch>/<object>/<func>/old_addr

You are creating sysfs attributes with no Documentation/ABI/ entries,
please fix that in future patches.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 1/2] kernel: add TAINT_LIVEPATCH
  2014-11-06 14:39 ` [PATCH 1/2] kernel: add TAINT_LIVEPATCH Seth Jennings
@ 2014-11-09 20:19   ` Greg KH
  2014-11-11 14:54     ` Seth Jennings
  0 siblings, 1 reply; 73+ messages in thread
From: Greg KH @ 2014-11-09 20:19 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Thu, Nov 06, 2014 at 08:39:07AM -0600, Seth Jennings wrote:
> This adds a new taint flag to indicate when the kernel or a kernel
> module has been live patched.  This will provide a clean indication in
> bug reports that live patching was used.
> 
> Additionally, if the crash occurs in a live patched function, the live
> patch module will appear beside the patched function in the backtrace.
> 
> Signed-off-by: Seth Jennings <sjenning@redhat.com>
> ---
>  Documentation/oops-tracing.txt  | 2 ++
>  Documentation/sysctl/kernel.txt | 1 +
>  include/linux/kernel.h          | 1 +
>  kernel/panic.c                  | 2 ++
>  4 files changed, 6 insertions(+)
> 
> diff --git a/Documentation/oops-tracing.txt b/Documentation/oops-tracing.txt
> index beefb9f..f3ac05c 100644
> --- a/Documentation/oops-tracing.txt
> +++ b/Documentation/oops-tracing.txt
> @@ -270,6 +270,8 @@ characters, each representing a particular tainted value.
>  
>   15: 'L' if a soft lockup has previously occurred on the system.
>  
> + 16: 'K' if the kernel has been live patched.
> +
>  The primary reason for the 'Tainted: ' string is to tell kernel
>  debuggers if this is a clean kernel or if anything unusual has
>  occurred.  Tainting is permanent: even if an offending module is
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index d7fc4ab..085f73b 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -831,6 +831,7 @@ can be ORed together:
>  8192 - An unsigned module has been loaded in a kernel supporting module
>         signature.
>  16384 - A soft lockup has previously occurred on the system.
> +32768 - The kernel has been live patched.
>  
>  ==============================================================
>  
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 446d76a..a6aa2df 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -473,6 +473,7 @@ extern enum system_states {
>  #define TAINT_OOT_MODULE		12
>  #define TAINT_UNSIGNED_MODULE		13
>  #define TAINT_SOFTLOCKUP		14
> +#define TAINT_LIVEPATCH			15
>  
>  extern const char hex_asc[];
>  #define hex_asc_lo(x)	hex_asc[((x) & 0x0f)]

Note, this conflicts with a taint value that others are proposing for
something else, so be aware you might run into problems when you hit
linux-next.

thanks,

greg k-h
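For context, the taint value is exposed to user space via /proc/sys/kernel/tainted as an ORed bitmask, so the proposed TAINT_LIVEPATCH bit 15 corresponds to the documented value 32768 (1 << 15). A small user-space sketch of decoding it (hypothetical helper, not kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/* Bit position proposed in the patch above: TAINT_LIVEPATCH == 15,
 * so Documentation/sysctl/kernel.txt lists the ORed value 32768. */
#define TAINT_LIVEPATCH 15

/* Decode a value as read from /proc/sys/kernel/tainted. */
bool kernel_is_live_patched(unsigned long tainted)
{
	return (tainted >> TAINT_LIVEPATCH) & 1UL;
}
```

Feeding the number read from the file to a helper like this tells a bug-report triager at a glance whether live patching was involved.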

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
                     ` (5 preceding siblings ...)
  2014-11-07 19:40   ` Andy Lutomirski
@ 2014-11-10 10:08   ` Jiri Kosina
  2014-11-10 17:31     ` Josh Poimboeuf
  2014-11-13 10:16   ` Miroslav Benes
  7 siblings, 1 reply; 73+ messages in thread
From: Jiri Kosina @ 2014-11-10 10:08 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Thu, 6 Nov 2014, Seth Jennings wrote:

> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> new file mode 100644
> index 0000000..b32dbb5
> --- /dev/null
> +++ b/kernel/livepatch/core.c

[ ... snip ... ]

> +/****************************************
> + * dynamic relocations (load-time linker)
> + ****************************************/
> +
> +/*
> + * external symbols are located outside the parent object (where the parent
> + * object is either vmlinux or the kmod being patched).
> + */
> +static int lpc_find_external_symbol(struct module *pmod, const char *name,
> +					unsigned long *addr)
> +{
> +	const struct kernel_symbol *sym;
> +
> +	/* first, check if it's an exported symbol */
> +	preempt_disable();
> +	sym = find_symbol(name, NULL, NULL, true, true);
> +	preempt_enable();
> +	if (sym) {
> +		*addr = sym->value;
> +		return 0;
> +	}
> +
> +	/* otherwise check if it's in another .o within the patch module */
> +	return lpc_find_symbol(pmod->name, name, addr);
> +}
> +
> +static int lpc_write_object_relocations(struct module *pmod,
> +					struct lpc_object *obj)
> +{
> +	int ret, size, readonly = 0, numpages;
> +	struct lp_dynrela *dynrela;
> +	u64 loc, val;
> +	unsigned long core = (unsigned long)pmod->module_core;
> +	unsigned long core_ro_size = pmod->core_ro_size;
> +	unsigned long core_size = pmod->core_size;
> +
> +	for (dynrela = obj->dynrelas; dynrela->name; dynrela++) {
> +		if (!strcmp(obj->name, "vmlinux")) {
> +			ret = lpc_verify_vmlinux_symbol(dynrela->name,
> +							dynrela->src);
> +			if (ret)
> +				return ret;
> +		} else {
> +			/* module, dynrela->src needs to be discovered */
> +			if (dynrela->external)
> +				ret = lpc_find_external_symbol(pmod,
> +							       dynrela->name,
> +							       &dynrela->src);
> +			else
> +				ret = lpc_find_symbol(obj->mod->name,
> +						      dynrela->name,
> +						      &dynrela->src);
> +			if (ret)
> +				return -EINVAL;
> +		}
> +
> +		switch (dynrela->type) {
> +		case R_X86_64_NONE:
> +			continue;
> +		case R_X86_64_PC32:
> +			loc = dynrela->dest;
> +			val = (u32)(dynrela->src + dynrela->addend -
> +				    dynrela->dest);
> +			size = 4;
> +			break;
> +		case R_X86_64_32S:
> +			loc = dynrela->dest;
> +			val = (s32)dynrela->src + dynrela->addend;
> +			size = 4;
> +			break;
> +		case R_X86_64_64:
> +			loc = dynrela->dest;
> +			val = dynrela->src;
> +			size = 8;
> +			break;

This is x86-specific, so it definitely needs to go to arch/x86. The only
hard precondition for an arch to support live patching is ftrace with regs
saving (we are currently working in parallel on extending the set of
architectures that support this), so we shouldn't introduce any x86-isms
into the generic code.

It seems to me that what basically needs to be done is to teach 
apply_relocate_add() about this kind of relocations and apply them as 
needed.
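The arithmetic in the quoted switch statement can be exercised in user space. The sketch below (hypothetical names, not the kernel implementation) applies the three x86-64 relocation types from the patch to a byte buffer, mirroring what an arch-specific apply_relocate_add() would compute:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* x86-64 relocation type values (as in the psABI / <elf.h>). */
#define R_X86_64_64    1
#define R_X86_64_PC32  2
#define R_X86_64_32S  11

/*
 * Apply one dynamic relocation to an in-memory image, mirroring the
 * switch statement in the patch above.  'dest' is the absolute address
 * of the location being patched, 'src' the resolved symbol address.
 * Returns 0 on success, -1 on an unhandled type.
 */
int apply_dynrela(unsigned char *image, uint64_t image_base,
		  int type, uint64_t dest, uint64_t src, int64_t addend)
{
	uint64_t val;
	int size;

	switch (type) {
	case R_X86_64_PC32:
		/* 32-bit PC-relative: S + A - P */
		val = (uint32_t)(src + addend - dest);
		size = 4;
		break;
	case R_X86_64_32S:
		/* 32-bit sign-extended absolute: S + A */
		val = (int32_t)src + addend;
		size = 4;
		break;
	case R_X86_64_64:
		/* 64-bit absolute (the patch above uses S alone) */
		val = src;
		size = 8;
		break;
	default:
		return -1;
	}
	memcpy(image + (dest - image_base), &val, size);
	return 0;
}
```

Moving this into arch/x86, as suggested, would keep the generic core free of these relocation formulas.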

Thanks,

-- 
Jiri Kosina
SUSE Labs

* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-08  8:07                     ` Vojtech Pavlik
@ 2014-11-10 17:09                       ` Josh Poimboeuf
  2014-11-11  9:05                         ` Vojtech Pavlik
  0 siblings, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-10 17:09 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Sat, Nov 08, 2014 at 09:07:54AM +0100, Vojtech Pavlik wrote:
> On Fri, Nov 07, 2014 at 09:45:53PM -0600, Josh Poimboeuf wrote:
> 
> > On Fri, Nov 07, 2014 at 10:27:35PM +0100, Vojtech Pavlik wrote:
> > > On Fri, Nov 07, 2014 at 09:45:00AM -0600, Josh Poimboeuf wrote:
> > > 
> > > > > 	LEAVE_FUNCTION
> > > > > 	LEAVE_PATCHED_SET
> > > > > 	LEAVE_KERNEL
> > > > > 
> > > > > 	SWITCH_FUNCTION
> > > > > 	SWITCH_THREAD
> > > > > 	SWITCH_KERNEL
> > > > > 
> > > 
> > > There would be the refcounting engine, counting entries/exits of the
> > > area of interest (nothing for LEAVE_FUNCTION, patched functions for
> > > LEAVE_PATCHED_SET - same as Masami's work now, or syscall entry/exit
> > > for LEAVE_KERNEL), and it'd do the counting either per thread,
> > > flagging a thread as 'new universe' when the count goes to zero, or
> > > flipping a 'new universe' switch for the whole kernel when the count
> > > goes down to zero.
> > > 
> > > A patch would have flags which specify a combination of the above
> > > properties that are needed for successful patching of that specific
> > > patch.
> > 
> > Would it really be necessary to support all possible permutations?  I
> > think that would add a lot of complexity to the code.  Especially if we
> > have to support LEAVE_KERNEL, which adds a lot of interdependencies with
> > the rest of the kernel (kthreads, syscall, irq, etc).
> > 
> > It seems to me that the two most interesting options are:
> > - LEAVE_PATCHED_SET + SWITCH_THREAD (Masami-kGraft)
> > - LEAVE_PATCHED_SET + SWITCH_KERNEL (kpatch and/or Masami-kpatch)
> 
> I agree here.
> 
> In fact LEAVE_KERNEL can be approximated by extending the patched
> set as required to include functions which are not changed per se, but
> are "contaminated" by propagation of semantic changes in the calling
> convention, and/or data format.
> 
> This applies to cases like this (needs LEAVE_KERNEL or extending patched
> set beyond changed functions):
> 
> -----------------------------------------------------------
> 
> 	int bar() {
> 		[...]
> -		return x;
> +		return x + 1;
> 	}
> 
> 	foo() {
> 		int ret = bar();
> 		do {
> 			wait_interruptible();
> 		} while (ret == bar());
> 	}
> 
> -----------------------------------------------------------

Agreed.  Though I think this is quite rare anyway.  Do you know of any
real world examples of this pattern in the kernel?

> Or like this (needs SWITCH_KERNEL so won't work with kGraft, but also
> extending patched set, will not work with kpatch as it stands today):
> 
> -----------------------------------------------------------
> 
> 	void lock_a()
> 	{
> -		spin_lock(&x);
> +		spin_lock(&y);
> 	}
> 	void lock_b()
> 	{
> -		spin_lock(&y);
> +		spin_lock(&x);
> 	}
> 	void unlock_a()
> 	{
> -		spin_unlock(&x);
> +		spin_unlock(&y);
> 	}
> 	void unlock_b()
> 	{
> -		spin_unlock(&y);
> +		spin_unlock(&x);
> 	}
> 
> 	void foo()
> 	{
> 		lock_a();
> 		lock_b();
> 		[...]
> 		unlock_b();
> 		unlock_a();
> 	}
> -----------------------------------------------------------

Another way to handle this type of locking semantic change for either
kpatch or kGraft is to use shadow data to add a "v2" shadow field to the
lock's containing struct, which is set whenever the struct is allocated
in the new universe.  Then you can use that field to determine which
version of locking semantics to use.
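As a rough user-space illustration of that shadow-version idea (hypothetical names, not the kpatch shadow API): objects allocated after the patch is enabled carry version 2, and the lock helpers dispatch on it, so each pre-existing object keeps the old lock ordering for its lifetime:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical patched struct: objects allocated before the patch keep
 * version 1; objects allocated in the "new universe" get version 2. */
struct foo {
	int version;
	int order[2], n;	/* records lock acquisition order (demo only) */
};

int patch_applied;		/* flipped when the live patch is enabled */

struct foo *foo_alloc(void)
{
	struct foo *f = calloc(1, sizeof(*f));
	f->version = patch_applied ? 2 : 1;
	return f;
}

/* Old semantics take lock x first, then y; the patch swaps the order.
 * Dispatching on the shadow version keeps each object self-consistent. */
void lock_a(struct foo *f)
{
	f->order[f->n++] = (f->version >= 2) ? 'y' : 'x';
}

void lock_b(struct foo *f)
{
	f->order[f->n++] = (f->version >= 2) ? 'x' : 'y';
}
```

This only works when the locks are per-object (members of the versioned struct), since mixing both orderings on one shared pair of locks would deadlock.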

> So once we can extend the patched set as needed (manually, after
> review), we can handle all the cases that LEAVE_KERNEL offers, making
> LEAVE_KERNEL unnecessary.

Agreed.

> It'd be nice if we wouldn't require actually patching those functions,
> only include them in the set we have to leave to proceed.

Maybe, let's wait and see how common this problem turns out to be :-)

> The remaining question is the performance impact of using a large set
> with refcounting. LEAVE_KERNEL's impact as implemented in kGraft is too
> small to be measurable; it's about 16 added instructions executed for
> each thread.

Sure, but I don't see this being much of an issue.

> > I do see some appeal for the choose-your-own-consistency-model idea.
> > But it wouldn't be practical for cumulative patch upgrades, which I
> > think we both agreed at LPC seems to be safer than incremental
> > patching.  If you only ever have one patch module loaded at any given
> > time, you only get one consistency model anyway.
> 
> > In order for multiple consistency models to be practical, I think we'd
> > need to figure out how to make incremental patching safe.
> 
> I believe we have to get incremental patching working anyway as it is a
> valid usecase for many users, just not for major distributions.

Well, we do already have it "working", but it's not safe enough for
serious use because we aren't properly dealing with build and runtime
dependencies between patch modules.  It may be tricky to get right.

But yeah, if you're _very_ careful to analyze any dependencies between
the patches, an occasional incremental patch might be do-able.

> And we may want to take a look at how to mark parts of a cumulative
> patch with different consistency models, when we combine eg. the recent
> futex CVE patch (not working with SWITCH_KERNEL) and anything requiring
> SWITCH kernel.

Yeah, interesting idea.  But again I think we'd still need to be very
careful with the dependencies.

> > > For data layout and semantic changes, there are two approaches:
> > > 
> > > 	1) TRANSFORM_WORLD
>  
> > I'm kind of surprised to hear that Ksplice does this.  I had
> > considered this approach, but it sounds really tricky, if not
> > impossible in many cases.
> 
> By the way, the ability to do this is basically the only advantage of
> actually stopping the kernel.

I pretty much agree.

> > Ahem, this would be an opportune time for a Ksplice person to chime in
> > with their experiences...
> 
> Indeed. :)
> 
> > > 	2) TRANSFORM_ON_ACCESS
>  
> > This is a variation on what we've been doing with kpatch, using shadow
> > data fields to add data and/or version metadata to structures, with a
> > few differences:
> > 
> > First, we haven't been transforming existing data.  All existing data
> > structures stay at v1.  Only new data structures are created as v2.  I
> > suppose it depends on the nature of the patch as to whether it's safe
> > to convert existing data.
> 
> Also it depends on whether existing data do contain enough information
> to actually avoid the security issue the patch is fixing. You may want
> to transform the data structures as soon as possible.

Agreed, it's something that needs to be considered on a per patch basis.

> > Second, we don't need to flip a flag, because with SWITCH_KERNEL the
> > system universe transition happens instantly.
> 
> Indeed. I wanted to point out that it works even with the SWITCH_THREAD
> and using the patching_complete flag that already is used in kGraft for
> other purposes.
> 
> > > 	The advantage is to not have to look up every single instance of
> > > 	the structure and not having to make sure you found them all.
> > >
> > > 	The disadvantages are that the patch now looks very different to
> > > 	what goes into the kernel sources,
>  
> > In my experience, s/very different/slightly different/.
> 
> Not the same, anyway. :)
> 
> > >	that you never know whether the conversion is complete and
> > >	reverting the patch is tough, although can be helped by keeping
> > >	track of transformed functions at a cost of maintaining another
> > >	data structure for that.
> > 
> > True, and we might even want to prevent or discourage such patches
> > from being disabled somehow.
> 
> Yes.
> 
> > > An interesting bit is that when you skip the backtrace check you're
> > > actually reverting to LEAVE_FUNCTION SWITCH_FUNCTION, forfeiting all
> > > consistency and not LEAVE_FUNCTION SWITCH_KERNEL as one would expect.
> > 
> > Hm, if we used stop machine (or ref counting), but without the backtrace
> > check, wouldn't it be LEAVE_FUNCTION SWITCH_KERNEL?
> 
> No; if you don't check the backtrace, any of the patched functions can
> be on the stack and old code can still be executed after you resume the
> kernel again. Look at this:
> 
> --------------------------------------------------------------
> 
> 	baz() {
> 	}
> 	bar() {
> -		[...]
> +		[...]
> 		baz();
> -		[...]
> +		[...]
> 	}
> 	foo() {
> 		bar();
> 	}
> 
> --------------------------------------------------------------
> 
> Now we stop_kernel on baz(). We don't check the stack. We patch bar().
> We resume, and baz() happily returns into bar(), executing old code.
> At the same time, another CPU can call bar(), getting new code.
> 
> Stack checking at stop_kernel() time is required to keep the
> SWITCH_KERNEL part of the model.

Ok, I think I get what you're saying.  In my mind it's kind of a hybrid
of SWITCH_KERNEL and SWITCH_FUNCTION, since SWITCH_KERNEL would still be
used for other functions in the patch.  In that case we'd be forfeiting
consistency just for those skipped functions in the list.

> > > > So my perspective is that SWITCH_THREAD causes big headaches 10%
> > > > of the time, whereas SWITCH_KERNEL causes small headaches 1.8% of
> > > > the time, and big headaches 0.2% of the time :-)
> > > 
> > > My preferred way would be to go with SWITCH_THREAD for the simpler
> > > stuff and do a SWITCH_KERNEL for the 10% of complex patches.
> > 
> > You said above that neither SWITCH_KERNEL nor SWITCH_THREAD is much
> > simpler than the other for the 10% case.  So why would you use
> > SWITCH_KERNEL here?
> 
> I think I was referring to the two data transformation methods and one
> not being simpler than the other.
> 
> And since we're both looking at the TRANSFORM_ON_ACCESS model, there
> isn't much of a difference using SWITCH_THREAD or SWITCH_KERNEL for the
> data format modification cases indeed. It works the same.

Well, not exactly the same, since SWITCH_THREAD needs the flag for
creation/transformation, but yeah, close enough :-)

> But there are a few (probably much less than 10%) cases like the locking
> one I used above, where SWITCH_THREAD just isn't going to cut it and for
> those I would need SWITCH_KERNEL or get very creative with refactoring
> the patch to do things differently.

I'm not opposed to having both if necessary.  But I think the code would
be _much_ simpler if we could agree on a single consistency model that
can be used in all cases.  Plus there wouldn't be such a strong
requirement to get incremental patching to work safely (which will add
more complexity).

I actually agree with you that LEAVE_PATCHED_SET + SWITCH_THREAD is
pretty nice.

So I'd like to hear more about cases where you think we _need_
SWITCH_KERNEL.  As I mentioned above, I think many of those cases can be
solved by using data structure versioning with shadow data fields.

> > > This is because (LEAVE_PATCHED_SET) SWITCH_THREAD finishes much quicker.
> > > But I'm biased there. ;)
> > 
> > Why would LEAVE_PATCHED_SET SWITCH_THREAD finish much quicker than
> > LEAVE_PATCHED_SET SWITCH_KERNEL?  Wouldn't they be about the same?
> 
> Because with LEAVE_PATCHED_SET SWITCH_THREAD you're waiting for each
> thread to leave the patched set and when they each have done that at
> least once, you're done. Even if some are already back within the set.

Ok, so you're talking about the case when you're trying to patch a
function which is always active.  Agreed :-)

> With LEAVE_PATCHED_SET SWITCH_KERNEL, you have to find the perfect
> moment when all of the threads are outside of the patched set at the
> same time. Depending on how often the functions are used and how large
> the set is, reaching that moment will get harder.

Yeah, I think this is the biggest drawback of SWITCH_KERNEL.  More
likely to fail (or never succeed).

-- 
Josh

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-10 10:08   ` Jiri Kosina
@ 2014-11-10 17:31     ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-10 17:31 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Vojtech Pavlik, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Mon, Nov 10, 2014 at 11:08:00AM +0100, Jiri Kosina wrote:
> On Thu, 6 Nov 2014, Seth Jennings wrote:
> 
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > new file mode 100644
> > index 0000000..b32dbb5
> > --- /dev/null
> > +++ b/kernel/livepatch/core.c
> 
> [ ... snip ... ]
> 
> > +/****************************************
> > + * dynamic relocations (load-time linker)
> > + ****************************************/
> > +
> > +/*
> > + * external symbols are located outside the parent object (where the parent
> > + * object is either vmlinux or the kmod being patched).
> > + */
> > +static int lpc_find_external_symbol(struct module *pmod, const char *name,
> > +					unsigned long *addr)
> > +{
> > +	const struct kernel_symbol *sym;
> > +
> > +	/* first, check if it's an exported symbol */
> > +	preempt_disable();
> > +	sym = find_symbol(name, NULL, NULL, true, true);
> > +	preempt_enable();
> > +	if (sym) {
> > +		*addr = sym->value;
> > +		return 0;
> > +	}
> > +
> > +	/* otherwise check if it's in another .o within the patch module */
> > +	return lpc_find_symbol(pmod->name, name, addr);
> > +}
> > +
> > +static int lpc_write_object_relocations(struct module *pmod,
> > +					struct lpc_object *obj)
> > +{
> > +	int ret, size, readonly = 0, numpages;
> > +	struct lp_dynrela *dynrela;
> > +	u64 loc, val;
> > +	unsigned long core = (unsigned long)pmod->module_core;
> > +	unsigned long core_ro_size = pmod->core_ro_size;
> > +	unsigned long core_size = pmod->core_size;
> > +
> > +	for (dynrela = obj->dynrelas; dynrela->name; dynrela++) {
> > +		if (!strcmp(obj->name, "vmlinux")) {
> > +			ret = lpc_verify_vmlinux_symbol(dynrela->name,
> > +							dynrela->src);
> > +			if (ret)
> > +				return ret;
> > +		} else {
> > +			/* module, dynrela->src needs to be discovered */
> > +			if (dynrela->external)
> > +				ret = lpc_find_external_symbol(pmod,
> > +							       dynrela->name,
> > +							       &dynrela->src);
> > +			else
> > +				ret = lpc_find_symbol(obj->mod->name,
> > +						      dynrela->name,
> > +						      &dynrela->src);
> > +			if (ret)
> > +				return -EINVAL;
> > +		}
> > +
> > +		switch (dynrela->type) {
> > +		case R_X86_64_NONE:
> > +			continue;
> > +		case R_X86_64_PC32:
> > +			loc = dynrela->dest;
> > +			val = (u32)(dynrela->src + dynrela->addend -
> > +				    dynrela->dest);
> > +			size = 4;
> > +			break;
> > +		case R_X86_64_32S:
> > +			loc = dynrela->dest;
> > +			val = (s32)dynrela->src + dynrela->addend;
> > +			size = 4;
> > +			break;
> > +		case R_X86_64_64:
> > +			loc = dynrela->dest;
> > +			val = dynrela->src;
> > +			size = 8;
> > +			break;
> 
> This is x86-specific, so it definitely needs to go to arch/x86. The only
> hard precondition for an arch to support live patching is ftrace with regs
> saving (we are currently working in parallel on extending the set of
> architectures that support this), so we shouldn't introduce any x86-isms
> into the generic code.
> 
> It seems to me that what basically needs to be done is to teach 
> apply_relocate_add() about this kind of relocations and apply them as 
> needed.

Agreed.

-- 
Josh

* Re: Re: [PATCH 0/2] Kernel Live Patching
  2014-11-07 21:27                 ` Vojtech Pavlik
  2014-11-08  3:45                   ` Josh Poimboeuf
@ 2014-11-11  1:24                   ` Masami Hiramatsu
  2014-11-11 10:26                     ` Vojtech Pavlik
  1 sibling, 1 reply; 73+ messages in thread
From: Masami Hiramatsu @ 2014-11-11  1:24 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Josh Poimboeuf, Christoph Hellwig, Seth Jennings, Jiri Kosina,
	Steven Rostedt, live-patching, kpatch, linux-kernel

Hi,

(2014/11/08 6:27), Vojtech Pavlik wrote:
> On Fri, Nov 07, 2014 at 09:45:00AM -0600, Josh Poimboeuf wrote:
> 
>>> 	LEAVE_FUNCTION
>>> 	LEAVE_PATCHED_SET
>>> 	LEAVE_KERNEL
>>>
>>> 	SWITCH_FUNCTION
>>> 	SWITCH_THREAD
>>> 	SWITCH_KERNEL
>>>
>>> Now with those definitions:
>>>
>>> 	livepatch (null model), as is, is LEAVE_FUNCTION and SWITCH_FUNCTION
>>>
>>> 	kpatch, masami-refcounting and Ksplice are LEAVE_PATCHED_SET and SWITCH_KERNEL
>>>
>>> 	kGraft is LEAVE_KERNEL and SWITCH_THREAD
>>>
>>> 	CRIU/kexec is LEAVE_KERNEL and SWITCH_KERNEL
>>
>> Thanks, nice analysis!

Hmm, I doubt this can cover all cases. What I'm thinking of is a combination
of LEAVE_KERNEL and SWITCH_KERNEL, using my refcounting and kGraft's
per-thread "new universe" flagging(*). It switches all threads but does not
replace the entire kernel as kexec does.

So, I think patches may be classified into the following four types:

PATCH_FUNCTION - Patching per function. This ignores context and just
               changes the function.
               User must ensure that the new function can co-exist
               with old functions in the same context (e.g. a recursive
               call can cause inconsistency).

PATCH_THREAD - Patching per thread. When a thread leaves the kernel,
               changes are applied for that thread.
               User must ensure that the new functions can co-exist
               with old functions per-thread. Inter-thread shared
               data acquisition (locks) should not be involved.

PATCH_KERNEL - Patching all threads. This waits for all threads to leave
               all target functions.
               User must ensure that the new functions can co-exist
               with old functions within a thread (note that if there is
               a loop, the old one can be called the first n times, and
               the new one afterwards).(**)

RENEW_KERNEL - Renews the entire kernel and resets internal state. No patch
               limitations, but involves resetting the kernel. This may
               take some time.

(*) Instead of checking stacks: first wait for all threads to leave the
kernel once; after that, wait for the refcount to become zero and switch
all the patched functions.

(**) For loops: if it is a simple loop or some simple lock calls, we can
wait for all threads to leave the caller function, avoiding inconsistency
by using refcounting.
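The refcounting engine both models build on can be sketched in user space (hypothetical names; the real engine would hook ftrace entry/exit handlers). The per-thread flag implements the SWITCH_THREAD completion rule, while the all-threads-outside check identifies the SWITCH_KERNEL moment:

```c
#include <assert.h>
#include <stdbool.h>

#define NTHREADS 4

/* Per-thread state: nesting depth inside the patched set, plus a flag
 * set the first time the thread fully leaves the set. */
struct thread_state {
	int depth;
	bool new_universe;
};

struct thread_state threads[NTHREADS];

/* Hooked on entry to any function in the patched set. */
void patched_set_enter(int tid)
{
	threads[tid].depth++;
}

/* Hooked on exit; a thread that fully leaves the set is migrated. */
void patched_set_exit(int tid)
{
	if (--threads[tid].depth == 0)
		threads[tid].new_universe = true;
}

/* SWITCH_THREAD: done once every thread has migrated at least once,
 * even if some have since re-entered the patched set. */
bool switch_thread_complete(void)
{
	for (int i = 0; i < NTHREADS; i++)
		if (!threads[i].new_universe)
			return false;
	return true;
}

/* SWITCH_KERNEL: needs the (possibly rare) instant when every thread is
 * outside the patched set at the same time. */
bool switch_kernel_possible(void)
{
	for (int i = 0; i < NTHREADS; i++)
		if (threads[i].depth > 0)
			return false;
	return true;
}
```

The two predicates make the later timing argument concrete: a thread re-entering the set never revokes SWITCH_THREAD completion, but it does close the SWITCH_KERNEL window.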


>>> By blending kGraft and masami-refcounting, we could create a consistency
>>> engine capable of almost any combination of these properties and thus
>>> all the consistency models.
>>
>> Can you elaborate on what this would look like?
> 
> There would be the refcounting engine, counting entries/exits of the
> area of interest (nothing for LEAVE_FUNCTION, patched functions for
> LEAVE_PATCHED_SET - same as Masami's work now, or syscall entry/exit for
> LEAVE_KERNEL), and it'd do the counting either per thread, flagging a
> thread as 'new universe' when the count goes to zero, or flipping a
> 'new universe' switch for the whole kernel when the count goes down to zero.

Ah, that's similar to what I'd like to try next :)

Sorry, here is a bit of an off-topic question.
I think a problem with kGraft's LEAVE_KERNEL approach is sleeping
processes. To ensure all threads switch to the new universe, we need to
wake up all the threads, or we need stack dumping to find out whether
someone is sleeping in the target functions. What would kGraft do about
this issue?

> A patch would have flags which specify a combination of the above
> properties that are needed for successful patching of that specific
> patch.

Agreed.

Thank you,
-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-10 17:09                       ` Josh Poimboeuf
@ 2014-11-11  9:05                         ` Vojtech Pavlik
  2014-11-11 17:45                           ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-11  9:05 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Mon, Nov 10, 2014 at 11:09:03AM -0600, Josh Poimboeuf wrote:

> > In fact LEAVE_KERNEL can be approximated by extending the patched
> > set as required to include functions which are not changed per se, but
> > are "contaminated" by propagation of semantic changes in the calling
> > convention, and/or data format.
> > 
> > This applies to cases like this (needs LEAVE_KERNEL or extending patched
> > set beyond changed functions):
> > 
> > -----------------------------------------------------------
> > 
> > 	int bar() {
> > 		[...]
> > -		return x;
> > +		return x + 1;
> > 	}
> > 
> > 	foo() {
> > 		int ret = bar();
> > 		do {
> > 			wait_interruptible();
> > 		} while (ret == bar());
> > 	}
> > 
> > -----------------------------------------------------------
> 
> Agreed.  Though I think this is quite rare anyway.  Do you know of any
> real world examples of this pattern in the kernel?

No, I do not. All of the examples I presented are entirely synthetic
corner cases designed to show the weaknesses of the various models.

> > Or like this (needs SWITCH_KERNEL so won't work with kGraft, but also
> > extending patched set, will not work with kpatch as it stands today):
> > 
> > -----------------------------------------------------------
> > 
> > 	void lock_a()
> > 	{
> > -		spin_lock(&x);
> > +		spin_lock(&y);
> > 	}
> > 	void lock_b()
> > 	{
> > -		spin_lock(&y);
> > +		spin_lock(&x);
> > 	}
> > 	void unlock_a()
> > 	{
> > -		spin_unlock(&x);
> > +		spin_unlock(&y);
> > 	}
> > 	void unlock_b()
> > 	{
> > -		spin_unlock(&y);
> > +		spin_unlock(&x);
> > 	}
> > 
> > 	void foo()
> > 	{
> > 		lock_a();
> > 		lock_b();
> > 		[...]
> > 		unlock_b();
> > 		unlock_a();
> > 	}
> > -----------------------------------------------------------
> 
> Another way to handle this type of locking semantic change for either
> kpatch or kGraft is to use shadow data to add a "v2" shadow field to the
> lock's containing struct, which is set whenever the struct is allocated
> in the new universe.  Then you can use that field to determine which
> version of locking semantics to use.

Agreed, this way of handling the locking changes makes the locking
changes viable even for the SWITCH_THREAD case then.

And if you'd want to convert existing structs, then you also need to
make the shadow operations atomic, but that's possible.

Also, if the lock is not a member of any struct, you'd have to shadow
the lock itself, again, possible.
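Shadowing an object that has no room for a new member can be done with an address-keyed side table. Below is a minimal, non-atomic user-space sketch (hypothetical helpers; a real version would need the atomic operations mentioned above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical shadow-variable store: associates out-of-band data with
 * an existing object by address, without changing the object's layout. */
struct shadow {
	const void *obj;	/* object being shadowed */
	long data;		/* attached shadow value (e.g. a version) */
	struct shadow *next;
};

static struct shadow *shadow_list;

/* Attach shadow data to an object. */
void shadow_attach(const void *obj, long data)
{
	struct shadow *s = malloc(sizeof(*s));
	s->obj = obj;
	s->data = data;
	s->next = shadow_list;
	shadow_list = s;
}

/* Look up shadow data; returns false if the object has none (i.e. it
 * was allocated before the patch and keeps v1 semantics). */
bool shadow_get(const void *obj, long *data)
{
	for (struct shadow *s = shadow_list; s; s = s->next)
		if (s->obj == obj) {
			*data = s->data;
			return true;
		}
	return false;
}
```

A lookup miss doubles as the version check: no shadow entry means the object predates the patch.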

> > I believe we have to get incremental patching working anyway as it is a
> > valid usecase for many users, just not for major distributions.
> 
> Well, we do already have it "working", but it's not safe enough for
> serious use because we aren't properly dealing with build and runtime
> dependencies between patch modules.  It may be tricky to get right.

It's similar for kGraft: incremental patching also works, and kGraft
doesn't have an issue with build-time dependencies for obvious reasons. At
runtime it gets tricky, particularly if you start removing some of the
patches.

> But yeah, if you're _very_ careful to analyze any dependencies between
> the patches, an occasional incremental patch might be do-able.

I don't have an issue with not offering incremental patching initially,
as we currently do not have a practical use for it.

> > And we may want to take a look at how to mark parts of a cumulative
> > patch with different consistency models, when we combine eg. the recent
> > futex CVE patch (not working with SWITCH_KERNEL) and anything requiring
> > SWITCH kernel.
> 
> Yeah, interesting idea.  But again I think we'd still need to be very
> careful with the dependencies.

Yes.

> > Now we stop_kernel on baz(). We don't check the stack. We patch bar().
> > We resume, and baz() happily returns into bar(), executing old code.
> > At the same time, another CPU can call bar(), getting new code.
> > 
> > Stack checking at stop_kernel() time is required to keep the
> > SWITCH_KERNEL part of the model.
> 
> Ok, I think I get what you're saying.  In my mind it's kind of a hybrid
> of SWITCH_KERNEL and SWITCH_FUNCTION, since SWITCH_KERNEL would still be
> used for other functions in the patch.  In that case we'd be forfeiting
> consistency just for those skipped functions in the list.

So you would not be skipping the stack checking entirely, just allowing
certain functions from the patched set to be on the stack while you
switch to the new universe.

That indeed would make it a mixed SWITCH_FUNCTION/SWITCH_KERNEL
situation. 

The big caveat in such a situation is that you must not change the
calling convention or semantics of any function called directly from the
function you skipped the stack check for, as doing so would crash the
kernel when an old caller invokes a new function.

> > But there are a few (probably much less than 10%) cases like the locking
> > one I used above, where SWITCH_THREAD just isn't going to cut it and for
> > those I would need SWITCH_KERNEL or get very creative with refactoring
> > the patch to do things differently.
> 
> I'm not opposed to having both if necessary.  But I think the code would
> be _much_ simpler if we could agree on a single consistency model that
> can be used in all cases.  Plus there wouldn't be such a strong
> requirement to get incremental patching to work safely (which will add
> more complexity).
> 
> I actually agree with you that LEAVE_PATCHED_SET + SWITCH_THREAD is
> pretty nice.

Cool! Do you see it as the next-step consistency model we would focus on
implementing in livepatch after the null model is complete and upstream?

(That wouldn't preclude extending it or implementing more models later.)

> So I'd like to hear more about cases where you think we _need_
> SWITCH_KERNEL.  As I mentioned above, I think many of those cases can be
> solved by using data structure versioning with shadow data fields.

I have tried, but so far I can't find a situation where we would
absolutely need SWITCH_KERNEL, assuming we have LEAVE_PATCHED_SET +
SWITCH_THREAD + TRANSFORM_ON_ACCESS.

> > > Why would LEAVE_PATCHED_SET SWITCH_THREAD finish much quicker than
> > > LEAVE_PATCHED_SET SWITCH_KERNEL?  Wouldn't they be about the same?
> > 
> > Because with LEAVE_PATCHED_SET SWITCH_THREAD you're waiting for each
> > thread to leave the patched set and when they each have done that at
> > least once, you're done. Even if some are already back within the set.
> 
> Ok, so you're talking about the case when you're trying to patch a
> function which is always active.  Agreed :-)

Yes.

> > With LEAVE_PATCHED_SET SWITCH_KERNEL, you have to find the perfect
> > moment when all of the threads are outside of the patched set at the
> > same time. Depending on how often the functions are used and how large
> > the set is, reaching that moment will get harder.
> 
> Yeah, I think this is the biggest drawback of SWITCH_KERNEL.  More
> likely to fail (or never succeed).

If some threads are sleeping in a loop inside the patched set:

With SWITCH_THREAD you can wake (eg. by a signal) the threads from
userspace as a last resort and that will complete your patching.

With SWITCH_KERNEL you'd have somehow to wake them at the same time
hoping they also leave the patched set together. That's unlikely to
happen when many threads are involved.

In addition, the "in progress" behavior is nicer for SWITCH_THREAD, as
any new thread will already be running patched code. With SWITCH_KERNEL,
you're holding off on applying the fix until the perfect moment.

-- 
Vojtech Pavlik
Director SUSE Labs

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: Re: [PATCH 0/2] Kernel Live Patching
  2014-11-11  1:24                   ` Masami Hiramatsu
@ 2014-11-11 10:26                     ` Vojtech Pavlik
  2014-11-12 17:33                       ` Masami Hiramatsu
  0 siblings, 1 reply; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-11 10:26 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Christoph Hellwig, Seth Jennings, Jiri Kosina,
	Steven Rostedt, live-patching, kpatch, linux-kernel

On Tue, Nov 11, 2014 at 10:24:03AM +0900, Masami Hiramatsu wrote:

> Hmm, I doubt this can cover all. What I'm thinking of is a combination of
> LEAVE_KERNEL and SWITCH_KERNEL by using my refcounting and kGraft's
> per-thread "new universe" flagging(*). It switches all threads but does
> not change the entire kernel as kexec does.

While your approach described below indeed forces all threads to leave
the kernel once to initialize the reference counts, this can be considered
a preparatory phase before the actual patching begins.

The actual consistency model remains unchanged from what kpatch offers
today, which guarantees that at the time of switch, no execution thread
is inside the set of patched functions and that the switch happens at
once for all threads. Hence I'd still classify the consistency model
offered as LEAVE_PATCHED_SET SWITCH_KERNEL.

> So, I think the patch may be classified by following four types
> 
> PATCH_FUNCTION - Patching per function. This ignores context, just
>                change the function.
>                User must ensure that the new function can co-exist
>                with old functions on the same context (e.g. recursive
>                call can cause inconsistency).
> 
> PATCH_THREAD - Patching per thread. If a thread leaves the kernel,
>                changes are applied for that thread.
>                User must ensure that the new functions can co-exist
>                with old functions per-thread. Inter-thread shared
>                data acquisition(locks) should not be involved.
> 
> PATCH_KERNEL - Patching all threads. This waits for all threads to leave
>                all target functions.
>                User must ensure that the new functions can co-exist
>                with old functions on a thread (note that if there is a
>                loop, old one can be called first n times, and new one
>                can be called afterwards).(**)

Yes, but only when the function calling it is not included in the
patched set, which is only a problem for semantic changes accompanied by
no change in the function prototype. This can be avoided by changing
the prototype deliberately.

> RENEW_KERNEL - Renew entire kernel and reset internally. No patch limitation,
>                but involves kernel resetting. This may take time.

And involves recording the userspace-kernel interface state exactly. The
interface is fairly large, so this can become hairy.

> (*) Instead of checking stacks, at first, wait for all threads leaving
> the kernel once, after that, wait for refcount becomes zero and switch
> all the patched functions.

This is a very beautiful idea.

It does away with both the stack parsing and the kernel stopping,
achieving kGraft's goals, while preserving kpatch's consistency model.

Sadly, it combines the disadvantages of both kpatch and kGraft: From
kpatch it takes the inability to patch functions where threads are
sleeping often and as such never leave them at once. From kGraft it
takes the need to annotate kernel threads and wake sleepers from
userspace.

So while it is beautiful, it's less practical than either kpatch or
kGraft alone. 

> (**) For the loops, if it is a simple loop or some simple lock calls,
> we can wait for all threads leave the caller function to avoid inconsistency
> by using refcounting.

Yes, this is what I call 'extending the patched set'. You can do that
either by deliberately changing the prototype of the patched function
being called, which causes the calling function to be considered
different, or just add it to the set of functions considered manually.

> > There would be the refcounting engine, counting entries/exits of the
> > area of interest (nothing for LEAVE_FUNCTION, patched functions for
> > LEAVE_PATCHED_SET - same as Masami's work now, or syscall entry/exit for
> > LEAVE_KERNEL), and it'd do the counting either per thread, flagging a
> > thread as 'new universe' when the count goes to zero, or flipping a
> > 'new universe' switch for the whole kernel when the count goes down to zero.
> 
> Ah, that's similar thing what I'd like to try next :)

Cool.

> Sorry, here is an off-topic question.  I think a problem with kGraft's
> LEAVE_KERNEL work is the sleeping processes. To ensure all the
> threads change to the new universe, we need to wake up all the
> threads, or we need stack dumping to find whether someone is sleeping on
> the target functions. What would kGraft do about this issue?

Yes, kGraft uses a userspace helper to find such sleepers and wake them
by sending SIGUSR1 or SIGSTOP/SIGCONT. It's one of the disadvantages
of kGraft that sleeper threads possibly have to be handled case by case.

Also, kernel threads are problematic for kGraft (as you may have seen in
earlier kGraft upstream submissions) as they never leave the kernel.

-- 
Vojtech Pavlik
Director SUSE Labs


* Re: [PATCH 1/2] kernel: add TAINT_LIVEPATCH
  2014-11-09 20:19   ` Greg KH
@ 2014-11-11 14:54     ` Seth Jennings
  0 siblings, 0 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-11 14:54 UTC (permalink / raw)
  To: Greg KH
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Sun, Nov 09, 2014 at 12:19:22PM -0800, Greg KH wrote:
> On Thu, Nov 06, 2014 at 08:39:07AM -0600, Seth Jennings wrote:
> > This adds a new taint flag to indicate when the kernel or a kernel
> > module has been live patched.  This will provide a clean indication in
> > bug reports that live patching was used.
> > 
> > Additionally, if the crash occurs in a live patched function, the live
> > patch module will appear beside the patched function in the backtrace.
> > 
> > Signed-off-by: Seth Jennings <sjenning@redhat.com>
> > ---
> >  Documentation/oops-tracing.txt  | 2 ++
> >  Documentation/sysctl/kernel.txt | 1 +
> >  include/linux/kernel.h          | 1 +
> >  kernel/panic.c                  | 2 ++
> >  4 files changed, 6 insertions(+)
> > 
> > diff --git a/Documentation/oops-tracing.txt b/Documentation/oops-tracing.txt
> > index beefb9f..f3ac05c 100644
> > --- a/Documentation/oops-tracing.txt
> > +++ b/Documentation/oops-tracing.txt
> > @@ -270,6 +270,8 @@ characters, each representing a particular tainted value.
> >  
> >   15: 'L' if a soft lockup has previously occurred on the system.
> >  
> > + 16: 'K' if the kernel has been live patched.
> > +
> >  The primary reason for the 'Tainted: ' string is to tell kernel
> >  debuggers if this is a clean kernel or if anything unusual has
> >  occurred.  Tainting is permanent: even if an offending module is
> > diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> > index d7fc4ab..085f73b 100644
> > --- a/Documentation/sysctl/kernel.txt
> > +++ b/Documentation/sysctl/kernel.txt
> > @@ -831,6 +831,7 @@ can be ORed together:
> >  8192 - An unsigned module has been loaded in a kernel supporting module
> >         signature.
> >  16384 - A soft lockup has previously occurred on the system.
> > +32768 - The kernel has been live patched.
> >  
> >  ==============================================================
> >  
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index 446d76a..a6aa2df 100644
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -473,6 +473,7 @@ extern enum system_states {
> >  #define TAINT_OOT_MODULE		12
> >  #define TAINT_UNSIGNED_MODULE		13
> >  #define TAINT_SOFTLOCKUP		14
> > +#define TAINT_LIVEPATCH			15
> >  
> >  extern const char hex_asc[];
> >  #define hex_asc_lo(x)	hex_asc[((x) & 0x0f)]
> 
> Note, this conflicts with a taint value that others are proposing for
> something else, so be aware you might run into problems when you hit
> linux-next.

Thanks for the notice.  I'll continue rebasing the patchset against the
latest -next and if the other proposal(s) gets in first, I'll change.

Seth

> 
> thanks,
> 
> greg k-h


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-11  9:05                         ` Vojtech Pavlik
@ 2014-11-11 17:45                           ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-11 17:45 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Christoph Hellwig, Seth Jennings, Jiri Kosina, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Tue, Nov 11, 2014 at 10:05:05AM +0100, Vojtech Pavlik wrote:
> On Mon, Nov 10, 2014 at 11:09:03AM -0600, Josh Poimboeuf wrote:
> > > But there are a few (probably much less than 10%) cases like the locking
> > > one I used above, where SWITCH_THREAD just isn't going to cut it and for
> > > those I would need SWITCH_KERNEL or get very creative with refactoring
> > > the patch to do things differently.
> > 
> > I'm not opposed to having both if necessary.  But I think the code would
> > be _much_ simpler if we could agree on a single consistency model that
> > can be used in all cases.  Plus there wouldn't be such a strong
> > requirement to get incremental patching to work safely (which will add
> > more complexity).
> > 
> > I actually agree with you that LEAVE_PATCHED_SET + SWITCH_THREAD is
> > pretty nice.
> 
> Cool! Do you see it as the next step consistency model we would focus on
> implementing in livepatch after the null model is complete and upstream?

Yeah, I'm thinking so.  None of the consistency models are perfect, but
I think this is a nice hybrid of the kGraft and kpatch models.  It
allows us to apply the greatest percentage of patches with the highest
success rate, while keeping the code complexity at a reasonable level.

-- 
Josh


* Re: module notifier: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-07 18:40       ` Petr Mladek
  2014-11-07 18:55         ` Seth Jennings
@ 2014-11-11 19:40         ` Seth Jennings
  2014-11-11 22:17           ` Jiri Kosina
  1 sibling, 1 reply; 73+ messages in thread
From: Seth Jennings @ 2014-11-11 19:40 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Fri, Nov 07, 2014 at 07:40:11PM +0100, Petr Mladek wrote:
> On Fri 2014-11-07 12:07:11, Seth Jennings wrote:
> > On Fri, Nov 07, 2014 at 06:13:07PM +0100, Petr Mladek wrote:
> > > On Thu 2014-11-06 08:39:08, Seth Jennings wrote:
[...]
> > > > +	up(&lpc_mutex);
> > > > +	WARN("failed to apply patch '%s' to module '%s'\n",
> > > > +		patch->mod->name, mod->name);
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static struct notifier_block lp_module_nb = {
> > > > +	.notifier_call = lp_module_notify,
> > > > +	.priority = INT_MIN, /* called last */
> > > 
> > > The handler for MODULE_STATE_COMMING would need have higger priority,
> > > if we want to cleanly unregister the ftrace handlers.
> > 
> > Yes, we might need two handlers at different priorities if we decide to
> > go that direction: one for MODULE_STATE_GOING at high/max and one for
> > MODULE_STATE_COMING at low/min.
> 
> kGraft has notifier only for the going state. The initialization is
> called directly from load_module() after ftrace_module_init()
> and complete_formation() before it is executed by parse_args().
> 
> I need to investigate if the notifier is more elegant and safe or not.

I looked it up and having a COMING notifier with priority INT_MIN is
effectively the same as having a call between complete_formation() and
parse_args() since the notifiers are called as the last thing in
complete_formation().

I think I've found a clean way to avoid the ref taking on the patched
modules using only the notifier and lpc_mutex. It will be in v2
(hopefully out in the next couple of days).

Thanks,
Seth

> 
> Best Regards,
> Petr


* Re: module notifier: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-11 19:40         ` Seth Jennings
@ 2014-11-11 22:17           ` Jiri Kosina
  2014-11-11 22:48             ` Seth Jennings
  0 siblings, 1 reply; 73+ messages in thread
From: Jiri Kosina @ 2014-11-11 22:17 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Petr Mladek, Josh Poimboeuf, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Tue, 11 Nov 2014, Seth Jennings wrote:

> It will be in v2 (hopefully out in the next couple of days).

FWIW we are also working on a few patches on top of v1 to back some of the 
proposals we've made during the first round of review, so maybe it might 
make sense to wait with v2 a little bit more, so that it incorporates as 
much v1 feedback as possible ... ?

Thanks,

-- 
Jiri Kosina
SUSE Labs


* Re: module notifier: was Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-11 22:17           ` Jiri Kosina
@ 2014-11-11 22:48             ` Seth Jennings
  0 siblings, 0 replies; 73+ messages in thread
From: Seth Jennings @ 2014-11-11 22:48 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Petr Mladek, Josh Poimboeuf, Vojtech Pavlik, Steven Rostedt,
	live-patching, kpatch, linux-kernel

On Tue, Nov 11, 2014 at 11:17:39PM +0100, Jiri Kosina wrote:
> On Tue, 11 Nov 2014, Seth Jennings wrote:
> 
> > It will be in v2 (hopefully out in the next couple of days).
> 
> FWIW we are also working on a few patches on top of v1 to back some of the 
> proposals we've made during the first round of review, so maybe it might 
> make sense to wait with v2 a little bit more, so that it incorporates as 
> much v1 feedback as possible ... ?

What proposals in particular?  I've already made many of the changes
that we agreed upon.

Thanks,
Seth

> 
> Thanks,
> 
> -- 
> Jiri Kosina
> SUSE Labs


* Re: Re: Re: [PATCH 0/2] Kernel Live Patching
  2014-11-11 10:26                     ` Vojtech Pavlik
@ 2014-11-12 17:33                       ` Masami Hiramatsu
  2014-11-12 21:47                         ` Vojtech Pavlik
  0 siblings, 1 reply; 73+ messages in thread
From: Masami Hiramatsu @ 2014-11-12 17:33 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Josh Poimboeuf, Christoph Hellwig, Seth Jennings, Jiri Kosina,
	Steven Rostedt, live-patching, kpatch, linux-kernel

(2014/11/11 19:26), Vojtech Pavlik wrote:
> On Tue, Nov 11, 2014 at 10:24:03AM +0900, Masami Hiramatsu wrote:
> 
>> Hmm, I doubt this can cover all. What I'm thinking of is a combination of
>> LEAVE_KERNEL and SWITCH_KERNEL by using my refcounting and kGraft's
>> per-thread "new universe" flagging(*). It switches all threads but does
>> not change the entire kernel as kexec does.
> 
> While your approach described below indeed forces all threads to leave
> the kernel once to initialize the reference counts, this can be considered
> a preparatory phase before the actual patching begins.
> 
> The actual consistency model remains unchanged from what kpatch offers
> today, which guarantees that at the time of switch, no execution thread
> is inside the set of patched functions and that the switch happens at
> once for all threads. Hence I'd still classify the consistency model
> offered as LEAVE_PATCHED_SET SWITCH_KERNEL.

Right. The consistency model is still the same as kpatch's. Btw, I think
we can just use the difference in consistency for classifying
the patches; given these classes, only a limited set of combinations
is meaningful.

>> 	LEAVE_FUNCTION
>> 	LEAVE_PATCHED_SET
>> 	LEAVE_KERNEL
>>
>> 	SWITCH_FUNCTION
>> 	SWITCH_THREAD
>> 	SWITCH_KERNEL

How about the below combination of consistent flags?

<flags>
CONSISTENT_IN_THREAD - patching is consistent in a thread.
CONSISTENT_IN_TIME - patching is atomically done.

<combination>
(none) - the 'null' mode? same as LEAVE_FUNCTION & SWITCH_FUNCTION

CONSISTENT_IN_THREAD - kGraft mode. same as LEAVE_KERNEL & SWITCH_THREAD

CONSISTENT_IN_TIME - kpatch mode. same as LEAVE_PATCHED_SET & SWITCH_KERNEL

CONSISTENT_IN_THREAD|CONSISTENT_IN_TIME - CRIU mode. same as LEAVE_KERNEL & SWITCH_KERNEL

So, each patch requires a consistency-constraint flag and the livepatch
tool chooses the mode based on that flag.

>> So, I think the patch may be classified by following four types
>>
>> PATCH_FUNCTION - Patching per function. This ignores context, just
>>                change the function.
>>                User must ensure that the new function can co-exist
>>                with old functions on the same context (e.g. recursive
>>                call can cause inconsistency).
>>
>> PATCH_THREAD - Patching per thread. If a thread leaves the kernel,
>>                changes are applied for that thread.
>>                User must ensure that the new functions can co-exist
>>                with old functions per-thread. Inter-thread shared
>>                data acquisition(locks) should not be involved.
>>
>> PATCH_KERNEL - Patching all threads. This waits for all threads to leave
>>                all target functions.
>>                User must ensure that the new functions can co-exist
>>                with old functions on a thread (note that if there is a
>>                loop, old one can be called first n times, and new one
>>                can be called afterwards).(**)
> 
> Yes, but only when the function calling it is not included in the
> patched set, which is only a problem for semantic changes accompanied by
> no change in the function prototype. This can be avoided by changing
> the prototype deliberately.

Hmm, but what would you think about the following simple case?

----
int func(int a) {
  return a + 1;
}

...
  b = 0;
  for (i = 0; i < 10; i++)
    b = func(b);
...
----
----
int func(int a) {
  return a + 2; /* Changed */
}

...
  b = 0;
  for (i = 0; i < 10; i++)
    b = func(b);
...
----

So, after the patch, "b" can end up anywhere in the range 10 to 20, not just 10 or 20.
Of course CONSISTENT_IN_THREAD can ensure it should be 10 or 20 :)

> 
>> RENEW_KERNEL - Renew entire kernel and reset internally. No patch limitation,
>>                but involves kernel resetting. This may take time.
> 
> And involves recording the userspace-kernel interface state exactly. The
> interface is fairly large, so this can become hairy.
> 
>> (*) Instead of checking stacks, at first, wait for all threads leaving
>> the kernel once, after that, wait for refcount becomes zero and switch
>> all the patched functions.
> 
> This is a very beautiful idea.
> 
> It does away with both the stack parsing and the kernel stopping,
> achieving kGraft's goals, while preserving kpatch's consistency model.
> 
> Sadly, it combines the disadvantages of both kpatch and kGraft: From
> kpatch it takes the inability to patch functions where threads are
> sleeping often and as such never leave them at once. From kGraft it
> takes the need to annotate kernel threads and wake sleepers from
> userspace.

But how frequently does the former case happen? It seems very rare.
And if we aim to enable both kpatch mode and kGraft mode in the kernel,
we'll have something for the latter cases anyway.

> 
> So while it is beautiful, it's less practical than either kpatch or
> kGraft alone. 

Ah, sorry for the confusion, I don't intend to integrate kpatch and kGraft.
Actually, this is just about modifying kpatch, since it may shorten
the stack-checking time.
It does not change the consistency model.
We certainly need both kGraft mode and kpatch mode.

>> (**) For the loops, if it is a simple loop or some simple lock calls,
>> we can wait for all threads leave the caller function to avoid inconsistency
>> by using refcounting.
> 
> Yes, this is what I call 'extending the patched set'. You can do that
> either by deliberately changing the prototype of the patched function
> being called, which causes the calling function to be considered
> different, or just add it to the set of functions considered manually.

I'd prefer the latter one :) or just giving hints about which targets to watch.

>>> There would be the refcounting engine, counting entries/exits of the
>>> area of interest (nothing for LEAVE_FUNCTION, patched functions for
>>> LEAVE_PATCHED_SET - same as Masami's work now, or syscall entry/exit for
>>> LEAVE_KERNEL), and it'd do the counting either per thread, flagging a
>>> thread as 'new universe' when the count goes to zero, or flipping a
>>> 'new universe' switch for the whole kernel when the count goes down to zero.
>>
>> Ah, that's similar thing what I'd like to try next :)
> 
> Cool.
> 
>> Sorry, here is an off-topic question.  I think a problem with kGraft's
>> LEAVE_KERNEL work is the sleeping processes. To ensure all the
>> threads change to the new universe, we need to wake up all the
>> threads, or we need stack dumping to find whether someone is sleeping on
>> the target functions. What would kGraft do about this issue?
> 
> Yes, kGraft uses a userspace helper to find such sleepers and wake them
> by sending SIGUSR1 or SIGSTOP/SIGCONT. It's one of the disadvantages
> of kGraft that sleeper threads possibly have to be handled case by case.
> Also, kernel threads are problematic for kGraft (as you may have seen in
> earlier kGraft upstream submissions) as they never leave the kernel.

Ah, I see. Perhaps you can use stack checking for sleeping threads and
context-switch hooking for kernel threads, as kpatch does :)
Of course this has the downside that patching can fail, but it
can avoid such problems.

Thank you!


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: Re: Re: [PATCH 0/2] Kernel Live Patching
  2014-11-12 17:33                       ` Masami Hiramatsu
@ 2014-11-12 21:47                         ` Vojtech Pavlik
  2014-11-13 15:56                           ` Masami Hiramatsu
  0 siblings, 1 reply; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-12 21:47 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Christoph Hellwig, Seth Jennings, Jiri Kosina,
	Steven Rostedt, live-patching, kpatch, linux-kernel

On Thu, Nov 13, 2014 at 02:33:24AM +0900, Masami Hiramatsu wrote:
> Right. The consistency model is still the same as kpatch's. Btw, I think
> we can just use the difference in consistency for classifying
> the patches; given these classes, only a limited set of combinations
> is meaningful.
> 
> >> 	LEAVE_FUNCTION
> >> 	LEAVE_PATCHED_SET
> >> 	LEAVE_KERNEL
> >>
> >> 	SWITCH_FUNCTION
> >> 	SWITCH_THREAD
> >> 	SWITCH_KERNEL
> 
> How about the below combination of consistent flags?
> 
> <flags>
> CONSISTENT_IN_THREAD - patching is consistent in a thread.
> CONSISTENT_IN_TIME - patching is atomically done.
> 
> <combination>
> (none) - the 'null' mode? same as LEAVE_FUNCTION & SWITCH_FUNCTION
> 
> CONSISTENT_IN_THREAD - kGraft mode. same as LEAVE_KERNEL & SWITCH_THREAD
> 
> CONSISTENT_IN_TIME - kpatch mode. same as LEAVE_PATCHED_SET & SWITCH_KERNEL
> 
> CONSISTENT_IN_THREAD|CONSISTENT_IN_TIME - CRIU mode. same as LEAVE_KERNEL & SWITCH_KERNEL

The reason I tried to parametrize the consistency model in a more
flexible and fine-grained manner than just describing the existing
solutions was for the purpose of exploring whether any of the remaining
combinations make sense.

It allowed me to look at what value we're getting from the consistency
models: Most importantly the ability to change function prototypes and
still make calls work.

For this, the minimum requirements are LEAVE_PATCHED_SET (what
kpatch does) and SWITCH_THREAD (which is what kGraft does). 

Both kpatch and kGraft do more, but:

I was able to show that LEAVE_KERNEL is unnecessary and that any cases where
it is beneficial can be handled by just increasing the patched set.

I believe at this point that SWITCH_KERNEL is unnecessary and that data or
locking changes (the major benefit of switching at once) can be handled by
shadowing/versioning of data structures, which is what both kpatch and
kGraft had planned to do anyway.

I haven't shown yet whether the strongest consistency (LEAVE_KERNEL +
SWITCH_KERNEL) is possible at all. CRIU is close, but not necessarily
doing quite that. It might be possible to just force processes to sleep
at syscall entry one by one until all are asleep. Also the benefits of
doing that are still unclear.

The goal is to find a consistency model that is best suited for the
goals of both kpatch and kGraft: Reliably apply simple to
mid-complexity kernel patches.

> So, each patch requires a consistency-constraint flag and the livepatch
> tool chooses the mode based on that flag.
> 
> >> So, I think the patch may be classified by following four types
> >>
> >> PATCH_FUNCTION - Patching per function. This ignores context, just
> >>                change the function.
> >>                User must ensure that the new function can co-exist
> >>                with old functions on the same context (e.g. recursive
> >>                call can cause inconsistency).
> >>
> >> PATCH_THREAD - Patching per thread. If a thread leaves the kernel,
> >>                changes are applied for that thread.
> >>                User must ensure that the new functions can co-exist
> >>                with old functions per-thread. Inter-thread shared
> >>                data acquisition(locks) should not be involved.
> >>
> >> PATCH_KERNEL - Patching all threads. This waits for all threads to leave
> >>                all target functions.
> >>                User must ensure that the new functions can co-exist
> >>                with old functions on a thread (note that if there is a
> >>                loop, old one can be called first n times, and new one
> >>                can be called afterwards).(**)
> > 
> > Yes, but only when the function calling it is not included in the
> > patched set, which is only a problem for semantic changes accompanied by
> > no change in the function prototype. This can be avoided by changing
> > the prototype deliberately.
> 
> Hmm, but what would you think about the following simple case?
> 
> ----
> int func(int a) {
>   return a + 1;
> }
> 
> ...
>   b = 0;
>   for (i = 0; i < 10; i++)
>     b = func(b);
> ...
> ----
> ----
> int func(int a) {
>   return a + 2; /* Changed */
> }
> 
> ...
>   b = 0;
>   for (i = 0; i < 10; i++)
>     b = func(b);
> ...
> ----
> 
> So, after the patch, "b" can end up anywhere in the range 10 to 20, not just 10 or 20.
> Of course CONSISTENT_IN_THREAD can ensure it should be 10 or 20 :)

If you force a prototype change, e.g. by changing func() to return an
unsigned int, or by simply adding a parameter, the place it is called
from will also be changed and will be included in the patched set. (Or
you can just include it manually in the set.)

Then, you can be sure that the place which calls func() is not on the
stack when patching. This way, in your classification, PATCH_KERNEL can
be as good as PATCH_THREAD. In my classification, I'm saying that
LEAVE_PATCHED_SET is as good as LEAVE_KERNEL.

> >> (*) Instead of checking stacks, at first, wait for all threads leaving
> >> the kernel once, after that, wait for refcount becomes zero and switch
> >> all the patched functions.
> > 
> > This is a very beautiful idea.
> > 
> > It does away with both the stack parsing and the kernel stopping,
> > achieving kGraft's goals, while preserving kpatch's consistency model.
> > 
> > Sadly, it combines the disadvantages of both kpatch and kGraft: From
> > kpatch it takes the inability to patch functions where threads are
> > sleeping often and as such never leave them at once. From kGraft it
> > takes the need to annotate kernel threads and wake sleepers from
> > userspace.
> 
> But how frequently does the former case happen? It seems very rare.
> And if we aim to enable both kpatch mode and kGraft mode in the kernel,
> we'll have something for the latter cases anyway.

The kpatch problem case isn't that rare. It just happened with a CVE in
futexes recently. As another example, it will happen if you try to patch
anything that is on the stack while a TTY or TCP read is waiting for data.

The kGraft problem case will happen when you load a 3rd party module
with a non-annotated kernel thread. Or a different problem will happen
when you have an application sleeping that will exit when receiving any
signal.

Both the cases can be handled with tricks and workarounds. But it'd be
much nicer to have a patching engine that is reliable.

> > So while it is beautiful, it's less practical than either kpatch or
> > kGraft alone. 
> 
> Ah, sorry for the confusion, I don't intend to integrate kpatch and kGraft.
> Actually, this is just about modifying kpatch, since it may shorten
> the stack-checking time.
> It does not change the consistency model.
> We certainly need both kGraft mode and kpatch mode.

What I'm proposing is a LEAVE_PATCHED_SET + SWITCH_THREAD mode. It offers
less consistency, but enough. And it is more reliable (likely to
succeed in finite time) than either kpatch or kGraft.

It'd be mostly based on your refcounting code, including stack
checking (when a process sleeps, its counter gets set based on the number
of patched functions on the stack), possibly including setting the counter
to 0 on syscall entry/exit, but it'd make the switch per-thread, like
kGraft does, not for the whole system, when the respective counters
reach zero.

This handles the frequent-sleeper case, doesn't need annotated kernel
thread main loops, and doesn't require the user to wake up every process
in the system unless it sleeps in a patched function.

And it can handle all the patches that kpatch and kGraft can (it needs
shadowing for some).

> > Yes, this is what I call 'extending the patched set'. You can do that
> > either by deliberately changing the prototype of the patched function
> > being called, which causes the calling function to be considered
> > different, or just add it to the set of functions considered manually.
> 
> I'd prefer the latter one :) or just giving hints about which targets to watch.

Me too.


-- 
Vojtech Pavlik
Director SUSE Labs


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
                     ` (6 preceding siblings ...)
  2014-11-10 10:08   ` Jiri Kosina
@ 2014-11-13 10:16   ` Miroslav Benes
  2014-11-13 14:38     ` Josh Poimboeuf
  2014-11-13 17:12     ` Seth Jennings
  7 siblings, 2 replies; 73+ messages in thread
From: Miroslav Benes @ 2014-11-13 10:16 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	jslaby, pmladek, live-patching, kpatch, linux-kernel


Hi,

thank you for the first version of the united live patching core.

The patch below addresses some of our review objections. Changes are 
described in the commit log. It simplifies the hierarchy of data 
structures, removes data duplication (lp_ and lpc_ structures) and 
simplifies sysfs directory.

I did not try to repair other stuff (races, function names, function 
prefix, api symmetry etc.). It should serve as a demonstration of our 
point of view.

There are some problems with this: try_module_get and module_put may be
called several times for each kernel module in which some function is
patched. This should be fixed with a module-going notifier, as suggested
by Petr.

The modified core was tested with a modified version of the test live patch
originally from Seth's GitHub. It worked as expected.

Please take a look at these changes, so we can discuss them in more 
detail.

Best regards,
--
Miroslav Benes
SUSE Labs


----
From f659a18a630de27b47d375119d793e28ee50da04 Mon Sep 17 00:00:00 2001
From: Miroslav Benes <mbenes@suse.cz>
Date: Thu, 13 Nov 2014 10:25:48 +0100
Subject: [PATCH] lpc: simplification of structure and sysfs hierarchy

The original code has several issues that this patch tries to remove.

First, there is now only the lpc_func structure for a patched function and
lpc_patch for the patch as a whole; the lpc_object structure, the middle step
of the hierarchy, is removed. A patched function is still associated with an
object (vmlinux or a module) through obj_name. Dynrelas are now in the
lpc_patch structure, and the object identifier (obj_name) is in lpc_dynrela
to preserve the connection.

Second, the sysfs structure is simplified. We do not need to propagate
old_addr and new_addr, so there is a subdirectory for each patch (patching
module) which contains the original enabled attribute and a new funcs
attribute listing the patched functions.

Third, the data duplication (lp_ and lpc_ structures) is removed. The lpc_
structures are now in the header file and made available to the user. This
allows us to remove almost all of the structure-allocation functions from
the original code.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
---
 include/linux/livepatch.h |  46 ++--
 kernel/livepatch/core.c   | 575 +++++++++++++---------------------------------
 2 files changed, 191 insertions(+), 430 deletions(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index c7a415b..db5ba00 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -2,10 +2,23 @@
 #define _LIVEPATCH_H_
 
 #include <linux/module.h>
+#include <linux/ftrace.h>
 
-struct lp_func {
+enum lpc_state {
+	LPC_DISABLED,
+	LPC_ENABLED
+};
+
+struct lpc_func {
+	/* internal */
+	struct ftrace_ops fops;
+	enum lpc_state state;
+	struct module *mod; /* module associated with patched function */
+	unsigned long new_addr; /* replacement function in patch module */
+
+	/* external */
 	const char *old_name; /* function to be patched */
-	void *new_func; /* replacement function in patch module */
+	void *new_func;
 	/*
 	 * The old_addr field is optional and can be used to resolve
 	 * duplicate symbol names in the vmlinux object.  If this
@@ -15,31 +28,36 @@ struct lp_func {
 	 * way to resolve the ambiguity.
 	 */
 	unsigned long old_addr;
+
+	const char *obj_name; /* "vmlinux" or module name */
 };
 
-struct lp_dynrela {
+struct lpc_dynrela {
 	unsigned long dest;
 	unsigned long src;
 	unsigned long type;
 	const char *name;
+	const char *obj_name;
 	int addend;
 	int external;
 };
 
-struct lp_object {
-	const char *name; /* "vmlinux" or module name */
-	struct lp_func *funcs;
-	struct lp_dynrela *dynrelas;
-};
+struct lpc_patch {
+	/* internal */
+	struct list_head list;
+	struct kobject kobj;
+	enum lpc_state state;
 
-struct lp_patch {
+	/* external */
 	struct module *mod; /* module containing the patch */
-	struct lp_object *objs;
+	struct lpc_dynrela *dynrelas;
+	struct lpc_func funcs[];
 };
 
-int lp_register_patch(struct lp_patch *);
-int lp_unregister_patch(struct lp_patch *);
-int lp_enable_patch(struct lp_patch *);
-int lp_disable_patch(struct lp_patch *);
+
+extern int lpc_register_patch(struct lpc_patch *);
+extern int lpc_unregister_patch(struct lpc_patch *);
+extern int lpc_enable_patch(struct lpc_patch *);
+extern int lpc_disable_patch(struct lpc_patch *);
 
 #endif /* _LIVEPATCH_H_ */
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index b32dbb5..feecc22 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -31,78 +31,32 @@
 
 #include <linux/livepatch.h>
 
+#define lpc_for_each_patch_func(p, pf)   \
+        for (pf = p->funcs; pf->old_name; pf++)
+
 /*************************************
  * Core structures
  ************************************/
 
-/*
- * lp_ structs vs lpc_ structs
- *
- * For each element (patch, object, func) in the live-patching code,
- * there are two types with two different prefixes: lp_ and lpc_.
- *
- * Structures used by the live-patch modules to register with this core module
- * are prefixed with lp_ (live patching).  These structures are part of the
- * registration API and are defined in livepatch.h.  The structures used
- * internally by this core module are prefixed with lpc_ (live patching core).
- */
-
 static DEFINE_SEMAPHORE(lpc_mutex);
 static LIST_HEAD(lpc_patches);
 
-enum lpc_state {
-	DISABLED,
-	ENABLED
-};
-
-struct lpc_func {
-	struct list_head list;
-	struct kobject kobj;
-	struct ftrace_ops fops;
-	enum lpc_state state;
-
-	const char *old_name;
-	unsigned long new_addr;
-	unsigned long old_addr;
-};
-
-struct lpc_object {
-	struct list_head list;
-	struct kobject kobj;
-	struct module *mod; /* module associated with object */
-	enum lpc_state state;
-
-	const char *name;
-	struct list_head funcs;
-	struct lp_dynrela *dynrelas;
-};
-
-struct lpc_patch {
-	struct list_head list;
-	struct kobject kobj;
-	struct lp_patch *userpatch; /* for correlation during unregister */
-	enum lpc_state state;
-
-	struct module *mod;
-	struct list_head objs;
-};
-
 /*******************************************
  * Helpers
  *******************************************/
 
-/* sets obj->mod if object is not vmlinux and module was found */
-static bool is_object_loaded(struct lpc_object *obj)
+/* sets patch_func->mod if object is not vmlinux and module was found */
+static bool is_object_loaded(struct lpc_func *patch_func)
 {
 	struct module *mod;
 
-	if (!strcmp(obj->name, "vmlinux"))
+	if (!strcmp(patch_func->obj_name, "vmlinux"))
 		return 1;
 
 	mutex_lock(&module_mutex);
-	mod = find_module(obj->name);
+	mod = find_module(patch_func->obj_name);
 	mutex_unlock(&module_mutex);
-	obj->mod = mod;
+	patch_func->mod = mod;
 
 	return !!mod;
 }
@@ -254,18 +208,18 @@ static int lpc_find_external_symbol(struct module *pmod, const char *name,
 	return lpc_find_symbol(pmod->name, name, addr);
 }
 
-static int lpc_write_object_relocations(struct module *pmod,
-					struct lpc_object *obj)
+static int lpc_write_relocations(struct module *pmod,
+		struct lpc_dynrela *patch_dynrelas)
 {
 	int ret, size, readonly = 0, numpages;
-	struct lp_dynrela *dynrela;
+	struct lpc_dynrela *dynrela;
 	u64 loc, val;
 	unsigned long core = (unsigned long)pmod->module_core;
 	unsigned long core_ro_size = pmod->core_ro_size;
 	unsigned long core_size = pmod->core_size;
 
-	for (dynrela = obj->dynrelas; dynrela->name; dynrela++) {
-		if (!strcmp(obj->name, "vmlinux")) {
+	for (dynrela = patch_dynrelas; dynrela->name; dynrela++) {
+		if (!strcmp(dynrela->obj_name, "vmlinux")) {
 			ret = lpc_verify_vmlinux_symbol(dynrela->name,
 							dynrela->src);
 			if (ret)
@@ -277,7 +231,7 @@ static int lpc_write_object_relocations(struct module *pmod,
 							       dynrela->name,
 							       &dynrela->src);
 			else
-				ret = lpc_find_symbol(obj->mod->name,
+				ret = lpc_find_symbol(dynrela->obj_name,
 						      dynrela->name,
 						      &dynrela->src);
 			if (ret)
@@ -357,7 +311,7 @@ static int lpc_enable_func(struct lpc_func *func)
 	int ret;
 
 	BUG_ON(!func->old_addr);
-	BUG_ON(func->state != DISABLED);
+	BUG_ON(func->state != LPC_DISABLED);
 	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 0, 0);
 	if (ret) {
 		pr_err("failed to set ftrace filter for function '%s' (%d)\n",
@@ -370,16 +324,16 @@ static int lpc_enable_func(struct lpc_func *func)
 		       func->old_name, ret);
 		ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
 	} else
-		func->state = ENABLED;
+		func->state = LPC_ENABLED;
 
 	return ret;
 }
 
-static int lpc_unregister_func(struct lpc_func *func)
+static int lpc_disable_func(struct lpc_func *func)
 {
 	int ret;
 
-	BUG_ON(func->state != ENABLED);
+	BUG_ON(func->state != LPC_ENABLED);
 	if (!func->old_addr)
 		/* parent object is not loaded */
 		return 0;
@@ -392,173 +346,131 @@ static int lpc_unregister_func(struct lpc_func *func)
 	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
 	if (ret)
 		pr_warn("function unregister succeeded but failed to clear the filter\n");
-	func->state = DISABLED;
+	func->state = LPC_DISABLED;
 
 	return 0;
 }
 
-static int lpc_unregister_object(struct lpc_object *obj)
-{
-	struct lpc_func *func;
-	int ret;
-
-	list_for_each_entry(func, &obj->funcs, list) {
-		if (func->state != ENABLED)
-			continue;
-		ret = lpc_unregister_func(func);
-		if (ret)
-			return ret;
-		if (strcmp(obj->name, "vmlinux"))
-			func->old_addr = 0;
-	}
-	if (obj->mod)
-		module_put(obj->mod);
-	obj->state = DISABLED;
-
-	return 0;
-}
-
-/* caller must ensure that obj->mod is set if object is a module */
-static int lpc_enable_object(struct module *pmod, struct lpc_object *obj)
-{
-	struct lpc_func *func;
-	int ret;
-
-	if (obj->mod && !try_module_get(obj->mod))
-		return -ENODEV;
-
-	if (obj->dynrelas) {
-		ret = lpc_write_object_relocations(pmod, obj);
-		if (ret)
-			goto unregister;
-	}
-	list_for_each_entry(func, &obj->funcs, list) {
-		ret = lpc_find_verify_func_addr(func, obj->name);
-		if (ret)
-			goto unregister;
-
-		ret = lpc_enable_func(func);
-		if (ret)
-			goto unregister;
-	}
-	obj->state = ENABLED;
-
-	return 0;
-unregister:
-	WARN_ON(lpc_unregister_object(obj));
-	return ret;
-}
-
 /******************************
  * enable/disable
  ******************************/
 
 /* must be called with lpc_mutex held */
-static struct lpc_patch *lpc_find_patch(struct lp_patch *userpatch)
-{
-	struct lpc_patch *patch;
-
-	list_for_each_entry(patch, &lpc_patches, list)
-		if (patch->userpatch == userpatch)
-			return patch;
-
-	return NULL;
-}
-
-/* must be called with lpc_mutex held */
-static int lpc_disable_patch(struct lpc_patch *patch)
+static int __lpc_disable_patch(struct lpc_patch *patch)
 {
-	struct lpc_object *obj;
+	struct lpc_func *patch_func;
 	int ret;
 
 	pr_notice("disabling patch '%s'\n", patch->mod->name);
 
-	list_for_each_entry(obj, &patch->objs, list) {
-		if (obj->state != ENABLED)
+	lpc_for_each_patch_func(patch, patch_func) {
+		if (patch_func->state != LPC_ENABLED)
 			continue;
-		ret = lpc_unregister_object(obj);
-		if (ret)
+		ret = lpc_disable_func(patch_func);
+		if (ret) {
+			pr_err("lpc: cannot disable function %s\n",
+				patch_func->old_name);
 			return ret;
+		}
+
+		if (strcmp(patch_func->obj_name, "vmlinux"))
+			patch_func->old_addr = 0;
+		if (patch_func->mod)
+			module_put(patch_func->mod);
 	}
-	patch->state = DISABLED;
+	patch->state = LPC_DISABLED;
 
 	return 0;
 }
 
-int lp_disable_patch(struct lp_patch *userpatch)
+int lpc_disable_patch(struct lpc_patch *patch)
 {
-	struct lpc_patch *patch;
 	int ret;
 
 	down(&lpc_mutex);
-	patch = lpc_find_patch(userpatch);
-	if (!patch) {
-		ret = -ENODEV;
-		goto out;
-	}
-	ret = lpc_disable_patch(patch);
-out:
+	ret = __lpc_disable_patch(patch);
 	up(&lpc_mutex);
+
 	return ret;
 }
-EXPORT_SYMBOL_GPL(lp_disable_patch);
+EXPORT_SYMBOL_GPL(lpc_disable_patch);
+
+static int lpc_verify_enable_func(struct lpc_func *patch_func)
+{
+	int ret;
+
+	if (patch_func->mod && !try_module_get(patch_func->mod))
+		return -ENODEV;
+
+	ret = lpc_find_verify_func_addr(patch_func, patch_func->obj_name);
+	if (ret)
+		return ret;
+
+	ret = lpc_enable_func(patch_func);
+	if (ret)
+		return ret;
+
+	return 0;
+}
 
 /* must be called with lpc_mutex held */
-static int lpc_enable_patch(struct lpc_patch *patch)
+static int __lpc_enable_patch(struct lpc_patch *patch)
 {
-	struct lpc_object *obj;
+	struct lpc_func *patch_func;
 	int ret;
 
-	BUG_ON(patch->state != DISABLED);
+	BUG_ON(patch->state != LPC_DISABLED);
 
 	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
 	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
 
 	pr_notice("enabling patch '%s'\n", patch->mod->name);
 
-	list_for_each_entry(obj, &patch->objs, list) {
-		if (!is_object_loaded(obj))
+	if (patch->dynrelas) {
+		ret = lpc_write_relocations(patch->mod, patch->dynrelas);
+		if (ret)
+			goto err;
+	}
+
+	lpc_for_each_patch_func(patch, patch_func) {
+		if (!is_object_loaded(patch_func))
 			continue;
-		ret = lpc_enable_object(patch->mod, obj);
+
+		ret = lpc_verify_enable_func(patch_func);
 		if (ret)
-			goto unregister;
+			goto err;
 	}
-	patch->state = ENABLED;
+	patch->state = LPC_ENABLED;
+
 	return 0;
 
-unregister:
-	WARN_ON(lpc_disable_patch(patch));
+err:
+	WARN_ON(__lpc_disable_patch(patch));
 	return ret;
 }
 
-int lp_enable_patch(struct lp_patch *userpatch)
+int lpc_enable_patch(struct lpc_patch *patch)
 {
-	struct lpc_patch *patch;
 	int ret;
 
 	down(&lpc_mutex);
-	patch = lpc_find_patch(userpatch);
-	if (!patch) {
-		ret = -ENODEV;
-		goto out;
-	}
-	ret = lpc_enable_patch(patch);
-out:
+	ret = __lpc_enable_patch(patch);
 	up(&lpc_mutex);
+
 	return ret;
 }
-EXPORT_SYMBOL_GPL(lp_enable_patch);
+EXPORT_SYMBOL_GPL(lpc_enable_patch);
 
 /******************************
  * module notifier
  *****************************/
 
-static int lp_module_notify(struct notifier_block *nb, unsigned long action,
+static int lpc_module_notify(struct notifier_block *nb, unsigned long action,
 			    void *data)
 {
 	struct module *mod = data;
 	struct lpc_patch *patch;
-	struct lpc_object *obj;
+	struct lpc_func *patch_func;
 	int ret = 0;
 
 	if (action != MODULE_STATE_COMING)
@@ -567,32 +479,42 @@ static int lp_module_notify(struct notifier_block *nb, unsigned long action,
 	down(&lpc_mutex);
 
 	list_for_each_entry(patch, &lpc_patches, list) {
-		if (patch->state == DISABLED)
+		if (patch->state == LPC_DISABLED)
 			continue;
-		list_for_each_entry(obj, &patch->objs, list) {
-			if (strcmp(obj->name, mod->name))
+
+		if (patch->dynrelas) {
+			ret = lpc_write_relocations(patch->mod,
+				patch->dynrelas);
+			if (ret)
+				goto err;
+		}
+
+		lpc_for_each_patch_func(patch, patch_func) {
+			if (strcmp(patch_func->obj_name, mod->name))
 				continue;
+
 			pr_notice("load of module '%s' detected, applying patch '%s'\n",
 				  mod->name, patch->mod->name);
-			obj->mod = mod;
-			ret = lpc_enable_object(patch->mod, obj);
+			patch_func->mod = mod;
+
+			ret = lpc_verify_enable_func(patch_func);
 			if (ret)
-				goto out;
-			break;
+				goto err;
 		}
 	}
 
 	up(&lpc_mutex);
 	return 0;
-out:
+
+err:
 	up(&lpc_mutex);
 	WARN("failed to apply patch '%s' to module '%s'\n",
 		patch->mod->name, mod->name);
 	return 0;
 }
 
-static struct notifier_block lp_module_nb = {
-	.notifier_call = lp_module_notify,
+static struct notifier_block lpc_module_nb = {
+	.notifier_call = lpc_module_notify,
 	.priority = INT_MIN, /* called last */
 };
 
@@ -603,10 +525,7 @@ static struct notifier_block lp_module_nb = {
  * /sys/kernel/livepatch
  * /sys/kernel/livepatch/<patch>
  * /sys/kernel/livepatch/<patch>/enabled
- * /sys/kernel/livepatch/<patch>/<object>
- * /sys/kernel/livepatch/<patch>/<object>/<func>
- * /sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
- * /sys/kernel/livepatch/<patch>/<object>/<func>/old_addr
+ * /sys/kernel/livepatch/<patch>/funcs
  */
 
 static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
@@ -620,7 +539,7 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (ret)
 		return -EINVAL;
 
-	if (val != DISABLED && val != ENABLED)
+	if (val != LPC_DISABLED && val != LPC_ENABLED)
 		return -EINVAL;
 
 	patch = container_of(kobj, struct lpc_patch, kobj);
@@ -632,12 +551,12 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 		goto out;
 	}
 
-	if (val == ENABLED) {
-		ret = lpc_enable_patch(patch);
+	if (val == LPC_ENABLED) {
+		ret = __lpc_enable_patch(patch);
 		if (ret)
 			goto out;
 	} else {
-		ret = lpc_disable_patch(patch);
+		ret = __lpc_disable_patch(patch);
 		if (ret)
 			goto out;
 	}
@@ -657,40 +576,35 @@ static ssize_t enabled_show(struct kobject *kobj,
 	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->state);
 }
 
-static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
-static struct attribute *lpc_patch_attrs[] = {
-	&enabled_kobj_attr.attr,
-	NULL
-};
-
-static ssize_t new_addr_show(struct kobject *kobj,
+static ssize_t funcs_show(struct kobject *kobj,
 			     struct kobj_attribute *attr, char *buf)
 {
-	struct lpc_func *func;
-
-	func = container_of(kobj, struct lpc_func, kobj);
-	return snprintf(buf, PAGE_SIZE-1, "0x%016lx\n", func->new_addr);
-}
+	struct lpc_patch *patch;
+	const struct lpc_func *patch_func;
+	ssize_t size;
 
-static struct kobj_attribute new_addr_kobj_attr = __ATTR_RO(new_addr);
+	size = snprintf(buf, PAGE_SIZE, "Functions:\n");
 
-static ssize_t old_addr_show(struct kobject *kobj,
-			     struct kobj_attribute *attr, char *buf)
-{
-	struct lpc_func *func;
+	patch = container_of(kobj, struct lpc_patch, kobj);
+	lpc_for_each_patch_func(patch, patch_func)
+		size += snprintf(buf + size, PAGE_SIZE - size, "%s\n",
+				patch_func->old_name);
 
-	func = container_of(kobj, struct lpc_func, kobj);
-	return snprintf(buf, PAGE_SIZE-1, "0x%016lx\n", func->old_addr);
+        return size;
 }
 
-static struct kobj_attribute old_addr_kobj_attr = __ATTR_RO(old_addr);
-
-static struct attribute *lpc_func_attrs[] = {
-	&new_addr_kobj_attr.attr,
-	&old_addr_kobj_attr.attr,
+static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
+static struct kobj_attribute funcs_kobj_attr = __ATTR_RO(funcs);
+static struct attribute *lpc_patch_attrs[] = {
+	&enabled_kobj_attr.attr,
+	&funcs_kobj_attr.attr,
 	NULL
 };
 
+static struct attribute_group lpc_patch_sysfs_group = {
+	.attrs = lpc_patch_attrs,
+};
+
 static struct kobject *lpc_root_kobj;
 
 static int lpc_create_root_kobj(void)
@@ -720,228 +634,67 @@ static void lpc_kobj_release_patch(struct kobject *kobj)
 static struct kobj_type lpc_ktype_patch = {
 	.release = lpc_kobj_release_patch,
 	.sysfs_ops = &kobj_sysfs_ops,
-	.default_attrs = lpc_patch_attrs
-};
-
-static void lpc_kobj_release_object(struct kobject *kobj)
-{
-	struct lpc_object *obj;
-
-	obj = container_of(kobj, struct lpc_object, kobj);
-	if (!list_empty(&obj->list))
-		list_del(&obj->list);
-	kfree(obj);
-}
-
-static struct kobj_type lpc_ktype_object = {
-	.release	= lpc_kobj_release_object,
-	.sysfs_ops	= &kobj_sysfs_ops,
-};
-
-static void lpc_kobj_release_func(struct kobject *kobj)
-{
-	struct lpc_func *func;
-
-	func = container_of(kobj, struct lpc_func, kobj);
-	if (!list_empty(&func->list))
-		list_del(&func->list);
-	kfree(func);
-}
-
-static struct kobj_type lpc_ktype_func = {
-	.release	= lpc_kobj_release_func,
-	.sysfs_ops	= &kobj_sysfs_ops,
-	.default_attrs = lpc_func_attrs
 };
 
 /*********************************
- * structure allocation
+ * structure init and free
  ********************************/
 
-static void lpc_free_funcs(struct lpc_object *obj)
-{
-	struct lpc_func *func, *funcsafe;
-
-	list_for_each_entry_safe(func, funcsafe, &obj->funcs, list)
-		kobject_put(&func->kobj);
-}
-
-static void lpc_free_objects(struct lpc_patch *patch)
-{
-	struct lpc_object *obj, *objsafe;
-
-	list_for_each_entry_safe(obj, objsafe, &patch->objs, list) {
-		lpc_free_funcs(obj);
-		kobject_put(&obj->kobj);
-	}
-}
-
 static void lpc_free_patch(struct lpc_patch *patch)
 {
-	lpc_free_objects(patch);
+	sysfs_remove_group(&patch->kobj, &lpc_patch_sysfs_group);
 	kobject_put(&patch->kobj);
 }
 
-static struct lpc_func *lpc_create_func(struct kobject *root,
-					struct lp_func *userfunc)
+static int lpc_init_patch(struct lpc_patch *patch)
 {
-	struct lpc_func *func;
 	struct ftrace_ops *ops;
+	struct lpc_func *patch_func;
 	int ret;
 
-	/* alloc */
-	func = kzalloc(sizeof(*func), GFP_KERNEL);
-	if (!func)
-		return NULL;
-
 	/* init */
-	INIT_LIST_HEAD(&func->list);
-	func->old_name = userfunc->old_name;
-	func->new_addr = (unsigned long)userfunc->new_func;
-	func->old_addr = userfunc->old_addr;
-	func->state = DISABLED;
-	ops = &func->fops;
-	ops->private = func;
-	ops->func = lpc_ftrace_handler;
-	ops->flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_DYNAMIC;
-
-	/* sysfs */
-	ret = kobject_init_and_add(&func->kobj, &lpc_ktype_func,
-				   root, func->old_name);
-	if (ret) {
-		kfree(func);
-		return NULL;
-	}
-
-	return func;
-}
-
-static int lpc_create_funcs(struct lpc_object *obj,
-			    struct lp_func *userfuncs)
-{
-	struct lp_func *userfunc;
-	struct lpc_func *func;
-
-	if (!userfuncs)
-		return -EINVAL;
-
-	for (userfunc = userfuncs; userfunc->old_name; userfunc++) {
-		func = lpc_create_func(&obj->kobj, userfunc);
-		if (!func)
-			goto free;
-		list_add(&func->list, &obj->funcs);
-	}
-	return 0;
-free:
-	lpc_free_funcs(obj);
-	return -ENOMEM;
-}
-
-static struct lpc_object *lpc_create_object(struct kobject *root,
-					    struct lp_object *userobj)
-{
-	struct lpc_object *obj;
-	int ret;
-
-	/* alloc */
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
-	if (!obj)
-		return NULL;
-
-	/* init */
-	INIT_LIST_HEAD(&obj->list);
-	obj->name = userobj->name;
-	obj->dynrelas = userobj->dynrelas;
-	obj->state = DISABLED;
-	/* obj->mod set by is_object_loaded() */
-	INIT_LIST_HEAD(&obj->funcs);
-
-	/* sysfs */
-	ret = kobject_init_and_add(&obj->kobj, &lpc_ktype_object,
-				   root, obj->name);
-	if (ret) {
-		kfree(obj);
-		return NULL;
-	}
-
-	/* create functions */
-	ret = lpc_create_funcs(obj, userobj->funcs);
-	if (ret) {
-		kobject_put(&obj->kobj);
-		return NULL;
-	}
-
-	return obj;
-}
-
-static int lpc_create_objects(struct lpc_patch *patch,
-			      struct lp_object *userobjs)
-{
-	struct lp_object *userobj;
-	struct lpc_object *obj;
-
-	if (!userobjs)
-		return -EINVAL;
-
-	for (userobj = userobjs; userobj->name; userobj++) {
-		obj = lpc_create_object(&patch->kobj, userobj);
-		if (!obj)
-			goto free;
-		list_add(&obj->list, &patch->objs);
-	}
-	return 0;
-free:
-	lpc_free_objects(patch);
-	return -ENOMEM;
-}
-
-static int lpc_create_patch(struct lp_patch *userpatch)
-{
-	struct lpc_patch *patch;
-	int ret;
-
-	/* alloc */
-	patch = kzalloc(sizeof(*patch), GFP_KERNEL);
-	if (!patch)
-		return -ENOMEM;
-
-	/* init */
-	INIT_LIST_HEAD(&patch->list);
-	patch->userpatch = userpatch;
-	patch->mod = userpatch->mod;
-	patch->state = DISABLED;
-	INIT_LIST_HEAD(&patch->objs);
+	patch->state = LPC_DISABLED;
 
 	/* sysfs */
 	ret = kobject_init_and_add(&patch->kobj, &lpc_ktype_patch,
 				   lpc_root_kobj, patch->mod->name);
-	if (ret) {
-		kfree(patch);
-		return ret;
-	}
+	if (ret)
+		goto err_root;
 
-	/* create objects */
-	ret = lpc_create_objects(patch, userpatch->objs);
-	if (ret) {
-		kobject_put(&patch->kobj);
-		return ret;
+	/* create functions */
+	lpc_for_each_patch_func(patch, patch_func) {
+		patch_func->new_addr = (unsigned long)patch_func->new_func;
+		patch_func->state = LPC_DISABLED;
+		ops = &patch_func->fops;
+		ops->private = patch_func;
+		ops->func = lpc_ftrace_handler;
+		ops->flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_DYNAMIC;
 	}
 
+	ret = sysfs_create_group(&patch->kobj, &lpc_patch_sysfs_group);
+	if (ret)
+		goto err_patch;
+
 	/* add to global list of patches */
 	list_add(&patch->list, &lpc_patches);
 
 	return 0;
+
+err_patch:
+	kobject_put(&patch->kobj);
+err_root:
+	return ret;
 }
 
 /************************************
  * register/unregister
  ***********************************/
 
-int lp_register_patch(struct lp_patch *userpatch)
+int lpc_register_patch(struct lpc_patch *userpatch)
 {
 	int ret;
 
-	if (!userpatch || !userpatch->mod || !userpatch->objs)
+	if (!userpatch || !userpatch->mod || !userpatch->funcs)
 		return -EINVAL;
 
 	/*
@@ -955,36 +708,26 @@ int lp_register_patch(struct lp_patch *userpatch)
 		return -ENODEV;
 
 	down(&lpc_mutex);
-	ret = lpc_create_patch(userpatch);
+	ret = lpc_init_patch(userpatch);
 	up(&lpc_mutex);
 	if (ret)
 		module_put(userpatch->mod);
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(lp_register_patch);
+EXPORT_SYMBOL_GPL(lpc_register_patch);
 
-int lp_unregister_patch(struct lp_patch *userpatch)
+int lpc_unregister_patch(struct lpc_patch *userpatch)
 {
-	struct lpc_patch *patch;
 	int ret = 0;
 
 	down(&lpc_mutex);
-	patch = lpc_find_patch(userpatch);
-	if (!patch) {
-		ret = -ENODEV;
-		goto out;
-	}
-	if (patch->state == ENABLED) {
-		ret = -EINVAL;
-		goto out;
-	}
-	lpc_free_patch(patch);
-out:
+	lpc_free_patch(userpatch);
 	up(&lpc_mutex);
+
 	return ret;
 }
-EXPORT_SYMBOL_GPL(lp_unregister_patch);
+EXPORT_SYMBOL_GPL(lpc_unregister_patch);
 
 /************************************
  * entry/exit
@@ -994,7 +737,7 @@ static int lpc_init(void)
 {
 	int ret;
 
-	ret = register_module_notifier(&lp_module_nb);
+	ret = register_module_notifier(&lpc_module_nb);
 	if (ret)
 		return ret;
 
@@ -1004,14 +747,14 @@ static int lpc_init(void)
 
 	return 0;
 unregister:
-	unregister_module_notifier(&lp_module_nb);
+	unregister_module_notifier(&lpc_module_nb);
 	return ret;
 }
 
 static void lpc_exit(void)
 {
 	lpc_remove_root_kobj();
-	unregister_module_notifier(&lp_module_nb);
+	unregister_module_notifier(&lpc_module_nb);
 }
 
 module_init(lpc_init);
-- 
2.1.2



* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-13 10:16   ` Miroslav Benes
@ 2014-11-13 14:38     ` Josh Poimboeuf
  2014-11-13 17:12     ` Seth Jennings
  1 sibling, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-13 14:38 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	jslaby, pmladek, live-patching, kpatch, linux-kernel

On Thu, Nov 13, 2014 at 11:16:00AM +0100, Miroslav Benes wrote:
> 
> Hi,
> 
> thank you for the first version of the united live patching core.
> 
> The patch below implements some of our review objections. Changes are 
> described in the commit log. It simplifies the hierarchy of data 
> structures, removes data duplication (lp_ and lpc_ structures) and 
> simplifies sysfs directory.
> 
> I did not try to repair other stuff (races, function names, function 
> prefix, api symmetry etc.). It should serve as a demonstration of our 
> point of view.
> 
> There are some problems with this. try_module_get and module_put may be 
> called several times for each kernel module where some function is 
> patched in. This should be fixed with module going notifier as suggested 
> by Petr. 
> 
> The modified core was tested with modified testing live patch originally 
> from Seth's github. It worked as expected.
> 
> Please take a look at these changes, so we can discuss them in more 
> detail.

Hi Miroslav,

Thanks for the code suggestions.

This is a single patch with three major changes, which makes it hard to
discuss each individual change on its own merits.

Also, Seth has already made a lot of changes based on previous
comments, and is very close to having a v2 patch.  Because this patch is
so big, there are a lot of conflicts.

Can you wait for the v2 patch set and then post your own patch set
against it, with the patches split out so we can discuss them
individually?

-- 
Josh


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-12 21:47                         ` Vojtech Pavlik
@ 2014-11-13 15:56                           ` Masami Hiramatsu
  2014-11-13 16:38                             ` Vojtech Pavlik
  0 siblings, 1 reply; 73+ messages in thread
From: Masami Hiramatsu @ 2014-11-13 15:56 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Josh Poimboeuf, Christoph Hellwig, Seth Jennings, Jiri Kosina,
	Steven Rostedt, live-patching, kpatch, linux-kernel

(2014/11/13 6:47), Vojtech Pavlik wrote:
> On Thu, Nov 13, 2014 at 02:33:24AM +0900, Masami Hiramatsu wrote:
>> Right. Consistency model is still same as kpatch. Btw, I think
>> we can just use the difference of consistency for classifying
>> the patches, since we have these classes, only limited combination
>> is meaningful.
>>
>>>> 	LEAVE_FUNCTION
>>>> 	LEAVE_PATCHED_SET
>>>> 	LEAVE_KERNEL
>>>>
>>>> 	SWITCH_FUNCTION
>>>> 	SWITCH_THREAD
>>>> 	SWITCH_KERNEL
>>
>> How about the below combination of consistent flags?
>>
>> <flags>
>> CONSISTENT_IN_THREAD - patching is consistent in a thread.
>> CONSISTENT_IN_TIME - patching is atomically done.
>>
>> <combination>
>> (none) - the 'null' mode? same as LEAVE_FUNCTION & SWITCH_FUNCTION
>>
>> CONSISTENT_IN_THREAD - kGraft mode. same as LEAVE_KERNEL & SWITCH_THREAD
>>
>> CONSISTENT_IN_TIME - kpatch mode. same as LEAVE_PATCHED_SET & SWITCH_KERNEL
>>
>> CONSISTENT_IN_THREAD|CONSISTENT_IN_TIME - CRIU mode. same as LEAVE_KERNEL & SWITCH_KERNEL
> 
> The reason I tried to parametrize the consistency model in a more
> flexible and fine-grained manner than just describing the existing
> solutions was for the purpose of exploring whether any of the remaining
> combinations make sense.

I see. I don't mind the implementation of how to check the execution path.
I just think that we need to classify consistency requirements when
checking the "patch" itself (maybe manually at first).

And since your classification seemed to mix consistency with switching
timings, I thought we'd better split them into consistency-requirement
flags and the implementation of the safety checking :)

Even if you can use refcounting with per-thread patching, it still switches
on a per-thread basis, and is inconsistent among threads.

> It allowed me to look at what value we're getting from the consistency
> models: Most importantly the ability to change function prototypes and
> still make calls work.
> 
> For this, the minimum requirements are LEAVE_PATCHED_SET (what
> kpatch does) and SWITCH_THREAD (which is what kGraft does). 
> 
> Both kpatch and kGraft do more, but:
> 
> I was able to show that LEAVE_KERNEL is unnecessary and any cases where
> it is beneficial can be augmented by just increasing the patched set.
> 
> I believe at this point that SWITCH_KERNEL is unnecessary and that data or
> locking changes - the major benefit of switching at once can be done by
> shadowing/versioning of data structures, which is what both kpatch and
> kGraft had planned to do anyway.
> 
> I haven't shown yet whether the strongest consistency (LEAVE_KERNEL +
> SWITCH_KERNEL) is possible at all. CRIU is close, but not necessarily
> doing quite that. It might be possible to just force processes to sleep
> at syscall entry one by one until all are asleep. Also the benefits of
> doing that are still unclear.

Of course, that is what kernel/freezer.c does :)
So, if you need to patch with the strongest consistency, you can freeze
them all.

> 
> The goal is to find a consistency model that is best suited for the
> goals of both kpatch and kGraft: Reliably apply simple to
> mid-complexity kernel patches.

Same here. I just sorted out the possible consistency requirements,
and I think the key is "consistent in the context of each thread",
"consistent at one moment among all threads but not within a context", or
"consistent in the contexts of all threads". What do you think? Is there
any other consistency model?

>> So, each patch requires consistency constrains flag and livepatch tool
>> chooses the mode based on the flag.
>>
>>>> So, I think the patch may be classified by following four types
>>>>
>>>> PATCH_FUNCTION - Patching per function. This ignores context, just
>>>>                change the function.
>>>>                User must ensure that the new function can co-exist
>>>>                with old functions on the same context (e.g. recursive
>>>>                call can cause inconsistency).
>>>>
>>>> PATCH_THREAD - Patching per thread. If a thread leaves the kernel,
>>>>                changes are applied for that thread.
>>>>                User must ensure that the new functions can co-exist
>>>>                with old functions per-thread. Inter-thread shared
>>>>                data acquisition (locks) should not be involved.
>>>>
>>>> PATCH_KERNEL - Patching all threads. This waits for all threads to
>>>>                leave all target functions.
>>>>                User must ensure that the new functions can co-exist
>>>>                with old functions on a thread (note that if there is a
>>>>                loop, the old one can be called the first n times, and
>>>>                the new one can be called afterwards).(**)
>>>
>>> Yes, but only when the function calling it is not included in the
>>> patched set, which is only a problem for semantic changes accompanied by
>>> no change in the function prototype. This can be avoided by changing
>>> the prototype deliberately.
>>
>> Hmm, but what would you think about the following simple case?
>>
>> ----
>> int func(int a) {
>>   return a + 1;
>> }
>>
>> ...
>>   b = 0;
>>   for (i = 0; i < 10; i++)
>>     b = func(b);
>> ...
>> ----
>> ----
>> int func(int a) {
>>   return a + 2; /* Changed */
>> }
>>
>> ...
>>   b = 0;
>>   for (i = 0; i < 10; i++)
>>     b = func(b);
>> ...
>> ----
>>
>> So, after the patch, "b" can be in the range 10 to 20, not just 10 or 20.
>> Of course CONSISTENT_IN_THREAD can ensure it should be 10 or 20 :)
> 
> If you force a prototype change, eg by changing func() to an unsigned
> int, or simply add a parameter, the place where it is called from will
> also be changed and will be included in the patched set. (Or you can
> just include it manually in the set.)

Yes.

> Then, you can be sure that the place which calls func() is not on the
> stack when patching. This way, in your classification, PATCH_KERNEL can
> be as good as PATCH_THREAD. In my classification, I'm saying that
> LEAVE_PATCHED_SET is as good as LEAVE_KERNEL.

OK, but again, to be sure of that, we need to dump the stack of each
task as I did.

>>>> (*) Instead of checking stacks, at first, wait for all threads leaving
>>>> the kernel once, after that, wait for refcount becomes zero and switch
>>>> all the patched functions.
>>>
>>> This is a very beautiful idea.
>>>
>>> It does away with both the stack parsing and the kernel stopping,
>>> achieving kGraft's goals, while preserving kpatch's consistency model.
>>>
>>> Sadly, it combines the disadvantages of both kpatch and kGraft: From
>>> kpatch it takes the inability to patch functions where threads are
>>> sleeping often and as such never leave them at once. From kGraft it
>>> takes the need to annotate kernel threads and wake sleepers from
>>> userspace.
>>
>> But how frequently does the former case happen? It seems very, very rare.
>> And if we aim to enable both kpatch mode and kGraft mode in the kernel,
>> we'll have something for the latter cases anyway.
> 
> The kpatch problem case isn't that rare. It just happened with a CVE in
> futexes recently. It will happen if you try to patch anything that is on
> the stack when a TTY or TCP read is waiting for data as another example. 

Oh, I see. This should be solved then... perhaps we can freeze those
tasks and thaw them again.

> The kGraft problem case will happen when you load a 3rd party module
> with a non-annotated kernel thread. Or a different problem will happen
> when you have an application sleeping that will exit when receiving any
> signal.

Ah, yes. Especially the latter case is serious. Maybe the freezer can
handle this too...

> Both the cases can be handled with tricks and workarounds. But it'd be
> much nicer to have a patching engine that is reliable.
> 
>>> So while it is beautiful, it's less practical than either kpatch or
>>> kGraft alone. 
>>
>> Ah, sorry for the confusion, I don't intend to integrate kpatch and kGraft.
>> Actually, it is just about modifying kpatch, since it may shorten the
>> stack-checking time.
>> This does not change the consistency model.
>> We certainly need both kGraft mode and kpatch mode.
> 
> What I'm proposing is a LEAVE_PATCHED_SET + SWITCH_THREAD mode. It's
> less consistency, but it is enough. And it is more reliable (likely to
> succeed in finite time) than either kpatch or kGraft.

Yeah, that is an actual merge of kpatch and kGraft, and it can also
avoid stop_machine (yes, that is important for me :)).

> It'd be mostly based on your refcounting code, including stack
> checking (when a process sleeps, counter gets set based on number of
> patched functions on the stack), possibly including setting the counter
> to 0 on syscall entry/exit, but it'd make the switch per-thread like
> kGraft does, not for the whole system, when the respective counters
> reach zero.

I'm not sure what happens if a process sleeps on the patched set.
If we switch the other threads, this sleeping thread will still see
the old functions (and old data) when it wakes up. So I think we need
both SWITCH_THREAD and SWITCH_KERNEL options in that case.
What I'm thinking of is merging the code (techniques) of both and
allowing the "switch timing" to be chosen based on the patch's
consistency requirement.

> 
> This handles the frequent sleeper case, it doesn't need annotated kernel
> thread main loops, it will not need the user to wake up every process in
> the system unless it sleeps in a patched function.
> 
> And it can handle all the patches that kpatch and kGraft can (it needs
> shadowing for some).
> 
>>> Yes, this is what I call 'extending the patched set'. You can do that
>>> either by deliberately changing the prototype of the patched function
>>> being called, which causes the calling function to be considered
>>> different, or just add it to the set of functions considered manually.
>>
>> I'd prefer latter one :) or just gives hints of watching targets.
> 
> Me too.
> 

Anyway, I'd like to support this effort from the kernel side.
At least I have to solve the ftrace regs conflict via the IPMODIFY flag
and a headache-inducing kretprobe failure case by sharing the per-thread
return stack with the ftrace function-graph tracer.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-13 15:56                           ` Masami Hiramatsu
@ 2014-11-13 16:38                             ` Vojtech Pavlik
  2014-11-18 12:47                               ` Petr Mladek
  0 siblings, 1 reply; 73+ messages in thread
From: Vojtech Pavlik @ 2014-11-13 16:38 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Christoph Hellwig, Seth Jennings, Jiri Kosina,
	Steven Rostedt, live-patching, kpatch, linux-kernel

On Fri, Nov 14, 2014 at 12:56:38AM +0900, Masami Hiramatsu wrote:

> I see. I don't mind the implementation of how to check the execution path.
> I just consider that we need to classify consistency requirements when
> checking the "patch" itself (maybe manually at first).
> 
> And since your classification seemed to mix the consistency and switching
> timings, I thought we'd better split them into consistency requirement
> flags and an implementation of safety checking :)

That makes sense. Although which classes of patches actually need which
kind of consistency is still an open debate.

> Even if you can use refcounting with per-thread patching, it still
> switches on a per-thread basis, which is inconsistent among threads.

Yes. 

> > I haven't shown yet whether the strongest consistency (LEAVE_KERNEL +
> > SWITCH_KERNEL) is possible at all. CRIU is close, but not necessarily
> > doing quite that. It might be possible to just force processes to sleep
> > at syscall entry one by one until all are asleep. Also the benefits of
> > doing that are still unclear.
> 
> Of course, that is what kernel/freezer.c does :)
> So, if you need to patch with the strongest consistency, you can freeze
> them all.

That'd be really cool. (Pun intended.)

I'd have to look deeper into freezer, but does it really stop the
threads at syscall entry? And not anywhere where they sleep? How would
it handle sleepers?

For LEAVE_KERNEL + SWITCH_KERNEL we'd have to freeze them when no kernel
function is on the stack at all.

> > The goal is to find a consistency model that is best suited for the
> > goals of both kpatch and kGraft: Reliably apply simple to
> > mid-complexity kernel patches.
> 
> Same as me. I have just sorted out the possible consistency requirements.
> And I've thought that the key is whether we need "consistency in the
> context of each thread", "consistency at a given moment among all threads
> but not within a context", or "consistency in the contexts of all
> threads". What do you think? Is there any other consistency model?

I'm looking for the simplest solution that allows modification to
function prototypes. Because that's the most important value that both
kGraft and kpatch consistency models offer.

> > If you force a prototype change, eg by changing func() to an unsigned
> > int, or simply add a parameter, the place where it is called from will
> > also be changed and will be included in the patched set. (Or you can
> > just include it manually in the set.)
> 
> Yes.
> 
> > Then, you can be sure that the place which calls func() is not on the
> > stack when patching. This way, in your classification, PATCH_KERNEL can
> > be as good as PATCH_THREAD. In my classification, I'm saying that
> > LEAVE_PATCHED_SET is as good as LEAVE_KERNEL.
> 
> OK, but again, to be sure of that, we need to dump the stack of each
> task as I did.

Or a pass through userspace to initialize the counters, as you
proposed.

But I'd really like to avoid stack analysis or having to force every
thread through the kernel/userspace boundary.

I have one more, rather trivial, idea of how things could be done; I'll
be sending an email with it shortly.

> >> But how frequently does the former case happen? It seems very, very rare.
> >> And if we aim to enable both kpatch mode and kGraft mode in the kernel,
> >> we'll have something for the latter cases anyway.
> > 
> > The kpatch problem case isn't that rare. It just happened with a CVE in
> > futexes recently. It will happen if you try to patch anything that is on
> > the stack when a TTY or TCP read is waiting for data as another example. 
> 
> Oh, I see. This should be solved then... perhaps we can freeze those
> tasks and thaw them again.
> 
> > The kGraft problem case will happen when you load a 3rd party module
> > with a non-annotated kernel thread. Or a different problem will happen
> > when you have an application sleeping that will exit when receiving any
> > signal.
> 
> Ah, yes. Especially the latter case is serious. Maybe the freezer can
> handle this too...

I should look into it then.

> > What I'm proposing is a LEAVE_PATCHED_SET + SWITCH_THREAD mode. It's
> > less consistency, but it is enough. And it is more reliable (likely to
> > succeed in finite time) than either kpatch or kGraft.
> 
> Yeah, that is an actual merge of kpatch and kGraft, and it can also
> avoid stop_machine (yes, that is important for me :)).

Good. Same here.

> > It'd be mostly based on your refcounting code, including stack
> > checking (when a process sleeps, counter gets set based on number of
> > patched functions on the stack), possibly including setting the counter
> > to 0 on syscall entry/exit, but it'd make the switch per-thread like
> > kGraft does, not for the whole system, when the respective counters
> > reach zero.
> 
> I'm not sure what happens if a process sleeps on the patched set.

Then the patching process will be stuck until it is woken up somehow.
But it's still much better to only have to care about processes sleeping
in the patched set than about processes sleeping anywhere (kGraft).

> If we switch the other threads, this sleeping thread will still see
> the old functions (and old data) when it wakes up.

Yes, until the patching process is complete, data must be kept in the
old format, even by new functions.

> So I think we need both SWITCH_THREAD and SWITCH_KERNEL options in
> that case.

With data shadowing that's not required. It still may be worth having
it.

> What I'm thinking of is merging the code (techniques) of both and
> allowing the "switch timing" to be chosen based on the patch's
> consistency requirement.

That's what I'm thinking about, too. But I'm also thinking, "this will
be complex, is it really needed"?

> Anyway, I'd like to support this effort from the kernel side.
> At least I have to solve the ftrace regs conflict via the IPMODIFY flag
> and a headache-inducing kretprobe failure case by sharing the per-thread
> return stack with the ftrace function-graph tracer.

While I don't know enough about the IPMODIFY flag, I wholeheartedly
support sharing the return stacks between kprobes and the ftrace graph
caller.

-- 
Vojtech Pavlik
Director SUSE Labs

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-13 10:16   ` Miroslav Benes
  2014-11-13 14:38     ` Josh Poimboeuf
@ 2014-11-13 17:12     ` Seth Jennings
  2014-11-14 13:30       ` Miroslav Benes
  1 sibling, 1 reply; 73+ messages in thread
From: Seth Jennings @ 2014-11-13 17:12 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	jslaby, pmladek, live-patching, kpatch, linux-kernel

On Thu, Nov 13, 2014 at 11:16:00AM +0100, Miroslav Benes wrote:
> 
> Hi,
> 
> thank you for the first version of the united live patching core.
> 
> The patch below implements some of our review objections. Changes are 
> described in the commit log. It simplifies the hierarchy of data 
> structures, removes data duplication (lp_ and lpc_ structures) and 
> simplifies sysfs directory.
> 
> I did not try to repair other stuff (races, function names, function 
> prefix, api symmetry etc.). It should serve as a demonstration of our 
> point of view.
> 
> There are some problems with this. try_module_get and module_put may be 
> called several times for each kernel module where some function is 
> patched in. This should be fixed with module going notifier as suggested 
> by Petr. 
> 
> The modified core was tested with modified testing live patch originally 
> from Seth's github. It worked as expected.
> 
> Please take a look at these changes, so we can discuss them in more 
> detail.

Thanks Miroslav.

The functional changes are a little hard to break out from the
formatting changes like s/disable/unregister and s/lp_/lpc_/ or adding
LPC_ prefix to the enum, most (all?) of which I have included for v2.

A problem with getting rid of the object layer is that there are
operations we do that are object-level operations.  For example,
module lookup and deferred module patching.  Also, the dynamic
relocations need to be associated with an object, not a patch, as not
all relocations will be able to be applied at patch load time for
patches that apply to modules that aren't loaded.  I understand that you
can walk the patch-level dynrela table and skip dynrela entries that
don't match the target object, but why do that when you can cleanly
express the relationship with a data structure hierarchy?

One example is the call to is_object_loaded() (renamed and reworked in
v2 btw) per function rather than per object.  That is duplicate work and
information that could be more cleanly expressed through an object
layer.

I also understand that the sysfs/kobject stuff adds code length.  However,
the new "funcs" attribute is procfs style, not sysfs style.  A sysfs
attribute should convey _one_ value.

From Documentation/filesystems/sysfs.txt:
==========
Attributes should be ASCII text files, preferably with only one value
per file. It is noted that it may not be efficient to contain only one
value per file, so it is socially acceptable to express an array of
values of the same type. 

Mixing types, expressing multiple lines of data, and doing fancy
formatting of data is heavily frowned upon. Doing these things may get
you publicly humiliated and your code rewritten without notice. 
==========

Also the function list would have object ambiguity.  If there was a
patched function my_func() in both vmlinux and a module, it would just
appear on the list twice. You can fix this by using the mod:func syntax
like kallsyms, but it isn't as clean as expressing it in a hierarchy.

As far as the unification of the API structures with the internal
structures goes, I have two points.  First, IMHO, we should assume that
the structures coming from the user are const.  In kpatch, for example,
we pass through some structures that are not created in the code, but by
the patch generation tool, and stored in an ELF section (read-only).
Additionally, I am really against exposing the internal fields.
Commenting them as "internal" is just messy, and we have to change the .h
file every time we want to add a field for internal use.

It seems that the primary purpose of this patch is to reduce the lines
of code.  However, I think that the object layer of the data structure
cleanly expresses the object<->function relationship and makes code like
the deferred patching much more straightforward since you already have
the functions/dynrelas organized by object.  You don't have to do the
nasty "if (strcmp(func->obj_name, objname)) continue;" business over the
entire patch every time.

Be advised, I have also done away with the new_addr/old_addr attributes
for v2 and replaced the patched module ref'ing with a combination of a
GOING notifier with lpc_mutex for protection.

Thanks,
Seth

> 
> Best regards,
> --
> Miroslav Benes
> SUSE Labs
> 
> 
> ----
> From f659a18a630de27b47d375119d793e28ee50da04 Mon Sep 17 00:00:00 2001
> From: Miroslav Benes <mbenes@suse.cz>
> Date: Thu, 13 Nov 2014 10:25:48 +0100
> Subject: [PATCH] lpc: simplification of structure and sysfs hierarchy
> 
> Original code has several issues this patch tries to remove.
> 
> First, there is only lpc_func structure for patched function and lpc_patch for
> the patch as a whole. Therefore lpc_object structure as middle step of hierarchy
> is removed. Patched function is still associated with some object (vmlinux or
> module) through obj_name. Dynrelas are now in lpc_patch structure and object
> identifier (obj_name) is in the lpc_dynrela to preserve the connection.
> 
> Second, sysfs structure is simplified. We do not need to propagate old_addr and
> new_addr. So, there is subdirectory for each patch (patching module) which
> includes original enabled attribute and new one funcs attribute which lists the
> patched functions.
> 
> Third, data duplication (lp_ and lpc_ structures) is removed. lpc_ structures
> are now in the header file and made available for the user. This allows us to
> remove almost all the functions for structure allocation in the original code.
> 
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> ---
>  include/linux/livepatch.h |  46 ++--
>  kernel/livepatch/core.c   | 575 +++++++++++++---------------------------------
>  2 files changed, 191 insertions(+), 430 deletions(-)
> 
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index c7a415b..db5ba00 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -2,10 +2,23 @@
>  #define _LIVEPATCH_H_
>  
>  #include <linux/module.h>
> +#include <linux/ftrace.h>
>  
> -struct lp_func {
> +enum lpc_state {
> +	LPC_DISABLED,
> +	LPC_ENABLED
> +};
> +
> +struct lpc_func {
> +	/* internal */
> +	struct ftrace_ops fops;
> +	enum lpc_state state;
> +	struct module *mod; /* module associated with patched function */
> +	unsigned long new_addr; /* replacement function in patch module */
> +
> +	/* external */
>  	const char *old_name; /* function to be patched */
> -	void *new_func; /* replacement function in patch module */
> +	void *new_func;
>  	/*
>  	 * The old_addr field is optional and can be used to resolve
>  	 * duplicate symbol names in the vmlinux object.  If this
> @@ -15,31 +28,36 @@ struct lp_func {
>  	 * way to resolve the ambiguity.
>  	 */
>  	unsigned long old_addr;
> +
> +	const char *obj_name; /* "vmlinux" or module name */
>  };
>  
> -struct lp_dynrela {
> +struct lpc_dynrela {
>  	unsigned long dest;
>  	unsigned long src;
>  	unsigned long type;
>  	const char *name;
> +	const char *obj_name;
>  	int addend;
>  	int external;
>  };
>  
> -struct lp_object {
> -	const char *name; /* "vmlinux" or module name */
> -	struct lp_func *funcs;
> -	struct lp_dynrela *dynrelas;
> -};
> +struct lpc_patch {
> +	/* internal */
> +	struct list_head list;
> +	struct kobject kobj;
> +	enum lpc_state state;
>  
> -struct lp_patch {
> +	/* external */
>  	struct module *mod; /* module containing the patch */
> -	struct lp_object *objs;
> +	struct lpc_dynrela *dynrelas;
> +	struct lpc_func funcs[];
>  };
>  
> -int lp_register_patch(struct lp_patch *);
> -int lp_unregister_patch(struct lp_patch *);
> -int lp_enable_patch(struct lp_patch *);
> -int lp_disable_patch(struct lp_patch *);
> +
> +extern int lpc_register_patch(struct lpc_patch *);
> +extern int lpc_unregister_patch(struct lpc_patch *);
> +extern int lpc_enable_patch(struct lpc_patch *);
> +extern int lpc_disable_patch(struct lpc_patch *);
>  
>  #endif /* _LIVEPATCH_H_ */
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index b32dbb5..feecc22 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -31,78 +31,32 @@
>  
>  #include <linux/livepatch.h>
>  
> +#define lpc_for_each_patch_func(p, pf)   \
> +        for (pf = p->funcs; pf->old_name; pf++)
> +
>  /*************************************
>   * Core structures
>   ************************************/
>  
> -/*
> - * lp_ structs vs lpc_ structs
> - *
> - * For each element (patch, object, func) in the live-patching code,
> - * there are two types with two different prefixes: lp_ and lpc_.
> - *
> - * Structures used by the live-patch modules to register with this core module
> - * are prefixed with lp_ (live patching).  These structures are part of the
> - * registration API and are defined in livepatch.h.  The structures used
> - * internally by this core module are prefixed with lpc_ (live patching core).
> - */
> -
>  static DEFINE_SEMAPHORE(lpc_mutex);
>  static LIST_HEAD(lpc_patches);
>  
> -enum lpc_state {
> -	DISABLED,
> -	ENABLED
> -};
> -
> -struct lpc_func {
> -	struct list_head list;
> -	struct kobject kobj;
> -	struct ftrace_ops fops;
> -	enum lpc_state state;
> -
> -	const char *old_name;
> -	unsigned long new_addr;
> -	unsigned long old_addr;
> -};
> -
> -struct lpc_object {
> -	struct list_head list;
> -	struct kobject kobj;
> -	struct module *mod; /* module associated with object */
> -	enum lpc_state state;
> -
> -	const char *name;
> -	struct list_head funcs;
> -	struct lp_dynrela *dynrelas;
> -};
> -
> -struct lpc_patch {
> -	struct list_head list;
> -	struct kobject kobj;
> -	struct lp_patch *userpatch; /* for correlation during unregister */
> -	enum lpc_state state;
> -
> -	struct module *mod;
> -	struct list_head objs;
> -};
> -
>  /*******************************************
>   * Helpers
>   *******************************************/
>  
> -/* sets obj->mod if object is not vmlinux and module was found */
> -static bool is_object_loaded(struct lpc_object *obj)
> +/* sets patch_func->mod if object is not vmlinux and module was found */
> +static bool is_object_loaded(struct lpc_func *patch_func)
>  {
>  	struct module *mod;
>  
> -	if (!strcmp(obj->name, "vmlinux"))
> +	if (!strcmp(patch_func->obj_name, "vmlinux"))
>  		return 1;
>  
>  	mutex_lock(&module_mutex);
> -	mod = find_module(obj->name);
> +	mod = find_module(patch_func->obj_name);
>  	mutex_unlock(&module_mutex);
> -	obj->mod = mod;
> +	patch_func->mod = mod;
>  
>  	return !!mod;
>  }
> @@ -254,18 +208,18 @@ static int lpc_find_external_symbol(struct module *pmod, const char *name,
>  	return lpc_find_symbol(pmod->name, name, addr);
>  }
>  
> -static int lpc_write_object_relocations(struct module *pmod,
> -					struct lpc_object *obj)
> +static int lpc_write_relocations(struct module *pmod,
> +		struct lpc_dynrela *patch_dynrelas)
>  {
>  	int ret, size, readonly = 0, numpages;
> -	struct lp_dynrela *dynrela;
> +	struct lpc_dynrela *dynrela;
>  	u64 loc, val;
>  	unsigned long core = (unsigned long)pmod->module_core;
>  	unsigned long core_ro_size = pmod->core_ro_size;
>  	unsigned long core_size = pmod->core_size;
>  
> -	for (dynrela = obj->dynrelas; dynrela->name; dynrela++) {
> -		if (!strcmp(obj->name, "vmlinux")) {
> +	for (dynrela = patch_dynrelas; dynrela->name; dynrela++) {
> +		if (!strcmp(dynrela->obj_name, "vmlinux")) {
>  			ret = lpc_verify_vmlinux_symbol(dynrela->name,
>  							dynrela->src);
>  			if (ret)
> @@ -277,7 +231,7 @@ static int lpc_write_object_relocations(struct module *pmod,
>  							       dynrela->name,
>  							       &dynrela->src);
>  			else
> -				ret = lpc_find_symbol(obj->mod->name,
> +				ret = lpc_find_symbol(dynrela->obj_name,
>  						      dynrela->name,
>  						      &dynrela->src);
>  			if (ret)
> @@ -357,7 +311,7 @@ static int lpc_enable_func(struct lpc_func *func)
>  	int ret;
>  
>  	BUG_ON(!func->old_addr);
> -	BUG_ON(func->state != DISABLED);
> +	BUG_ON(func->state != LPC_DISABLED);
>  	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 0, 0);
>  	if (ret) {
>  		pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> @@ -370,16 +324,16 @@ static int lpc_enable_func(struct lpc_func *func)
>  		       func->old_name, ret);
>  		ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
>  	} else
> -		func->state = ENABLED;
> +		func->state = LPC_ENABLED;
>  
>  	return ret;
>  }
>  
> -static int lpc_unregister_func(struct lpc_func *func)
> +static int lpc_disable_func(struct lpc_func *func)
>  {
>  	int ret;
>  
> -	BUG_ON(func->state != ENABLED);
> +	BUG_ON(func->state != LPC_ENABLED);
>  	if (!func->old_addr)
>  		/* parent object is not loaded */
>  		return 0;
> @@ -392,173 +346,131 @@ static int lpc_unregister_func(struct lpc_func *func)
>  	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
>  	if (ret)
>  		pr_warn("function unregister succeeded but failed to clear the filter\n");
> -	func->state = DISABLED;
> +	func->state = LPC_DISABLED;
>  
>  	return 0;
>  }
>  
> -static int lpc_unregister_object(struct lpc_object *obj)
> -{
> -	struct lpc_func *func;
> -	int ret;
> -
> -	list_for_each_entry(func, &obj->funcs, list) {
> -		if (func->state != ENABLED)
> -			continue;
> -		ret = lpc_unregister_func(func);
> -		if (ret)
> -			return ret;
> -		if (strcmp(obj->name, "vmlinux"))
> -			func->old_addr = 0;
> -	}
> -	if (obj->mod)
> -		module_put(obj->mod);
> -	obj->state = DISABLED;
> -
> -	return 0;
> -}
> -
> -/* caller must ensure that obj->mod is set if object is a module */
> -static int lpc_enable_object(struct module *pmod, struct lpc_object *obj)
> -{
> -	struct lpc_func *func;
> -	int ret;
> -
> -	if (obj->mod && !try_module_get(obj->mod))
> -		return -ENODEV;
> -
> -	if (obj->dynrelas) {
> -		ret = lpc_write_object_relocations(pmod, obj);
> -		if (ret)
> -			goto unregister;
> -	}
> -	list_for_each_entry(func, &obj->funcs, list) {
> -		ret = lpc_find_verify_func_addr(func, obj->name);
> -		if (ret)
> -			goto unregister;
> -
> -		ret = lpc_enable_func(func);
> -		if (ret)
> -			goto unregister;
> -	}
> -	obj->state = ENABLED;
> -
> -	return 0;
> -unregister:
> -	WARN_ON(lpc_unregister_object(obj));
> -	return ret;
> -}
> -
>  /******************************
>   * enable/disable
>   ******************************/
>  
>  /* must be called with lpc_mutex held */
> -static struct lpc_patch *lpc_find_patch(struct lp_patch *userpatch)
> -{
> -	struct lpc_patch *patch;
> -
> -	list_for_each_entry(patch, &lpc_patches, list)
> -		if (patch->userpatch == userpatch)
> -			return patch;
> -
> -	return NULL;
> -}
> -
> -/* must be called with lpc_mutex held */
> -static int lpc_disable_patch(struct lpc_patch *patch)
> +static int __lpc_disable_patch(struct lpc_patch *patch)
>  {
> -	struct lpc_object *obj;
> +	struct lpc_func *patch_func;
>  	int ret;
>  
>  	pr_notice("disabling patch '%s'\n", patch->mod->name);
>  
> -	list_for_each_entry(obj, &patch->objs, list) {
> -		if (obj->state != ENABLED)
> +	lpc_for_each_patch_func(patch, patch_func) {
> +		if (patch_func->state != LPC_ENABLED)
>  			continue;
> -		ret = lpc_unregister_object(obj);
> -		if (ret)
> +		ret = lpc_disable_func(patch_func);
> +		if (ret) {
> +			pr_err("lpc: cannot disable function %s\n",
> +				patch_func->old_name);
>  			return ret;
> +		}
> +
> +		if (strcmp(patch_func->obj_name, "vmlinux"))
> +			patch_func->old_addr = 0;
> +		if (patch_func->mod)
> +			module_put(patch_func->mod);
>  	}
> -	patch->state = DISABLED;
> +	patch->state = LPC_DISABLED;
>  
>  	return 0;
>  }
>  
> -int lp_disable_patch(struct lp_patch *userpatch)
> +int lpc_disable_patch(struct lpc_patch *patch)
>  {
> -	struct lpc_patch *patch;
>  	int ret;
>  
>  	down(&lpc_mutex);
> -	patch = lpc_find_patch(userpatch);
> -	if (!patch) {
> -		ret = -ENODEV;
> -		goto out;
> -	}
> -	ret = lpc_disable_patch(patch);
> -out:
> +	ret = __lpc_disable_patch(patch);
>  	up(&lpc_mutex);
> +
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(lp_disable_patch);
> +EXPORT_SYMBOL_GPL(lpc_disable_patch);
> +
> +static int lpc_verify_enable_func(struct lpc_func *patch_func)
> +{
> +	int ret;
> +
> +	if (patch_func->mod && !try_module_get(patch_func->mod))
> +		return -ENODEV;
> +
> +	ret = lpc_find_verify_func_addr(patch_func, patch_func->obj_name);
> +	if (ret)
> +		return ret;
> +
> +	ret = lpc_enable_func(patch_func);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
>  
>  /* must be called with lpc_mutex held */
> -static int lpc_enable_patch(struct lpc_patch *patch)
> +static int __lpc_enable_patch(struct lpc_patch *patch)
>  {
> -	struct lpc_object *obj;
> +	struct lpc_func *patch_func;
>  	int ret;
>  
> -	BUG_ON(patch->state != DISABLED);
> +	BUG_ON(patch->state != LPC_DISABLED);
>  
>  	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
>  	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
>  
>  	pr_notice("enabling patch '%s'\n", patch->mod->name);
>  
> -	list_for_each_entry(obj, &patch->objs, list) {
> -		if (!is_object_loaded(obj))
> +	if (patch->dynrelas) {
> +		ret = lpc_write_relocations(patch->mod, patch->dynrelas);
> +		if (ret)
> +			goto err;
> +	}
> +
> +	lpc_for_each_patch_func(patch, patch_func) {
> +		if (!is_object_loaded(patch_func))
>  			continue;
> -		ret = lpc_enable_object(patch->mod, obj);
> +
> +		ret = lpc_verify_enable_func(patch_func);
>  		if (ret)
> -			goto unregister;
> +			goto err;
>  	}
> -	patch->state = ENABLED;
> +	patch->state = LPC_ENABLED;
> +
>  	return 0;
>  
> -unregister:
> -	WARN_ON(lpc_disable_patch(patch));
> +err:
> +	WARN_ON(__lpc_disable_patch(patch));
>  	return ret;
>  }
>  
> -int lp_enable_patch(struct lp_patch *userpatch)
> +int lpc_enable_patch(struct lpc_patch *patch)
>  {
> -	struct lpc_patch *patch;
>  	int ret;
>  
>  	down(&lpc_mutex);
> -	patch = lpc_find_patch(userpatch);
> -	if (!patch) {
> -		ret = -ENODEV;
> -		goto out;
> -	}
> -	ret = lpc_enable_patch(patch);
> -out:
> +	ret = __lpc_enable_patch(patch);
>  	up(&lpc_mutex);
> +
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(lp_enable_patch);
> +EXPORT_SYMBOL_GPL(lpc_enable_patch);
>  
>  /******************************
>   * module notifier
>   *****************************/
>  
> -static int lp_module_notify(struct notifier_block *nb, unsigned long action,
> +static int lpc_module_notify(struct notifier_block *nb, unsigned long action,
>  			    void *data)
>  {
>  	struct module *mod = data;
>  	struct lpc_patch *patch;
> -	struct lpc_object *obj;
> +	struct lpc_func *patch_func;
>  	int ret = 0;
>  
>  	if (action != MODULE_STATE_COMING)
> @@ -567,32 +479,42 @@ static int lp_module_notify(struct notifier_block *nb, unsigned long action,
>  	down(&lpc_mutex);
>  
>  	list_for_each_entry(patch, &lpc_patches, list) {
> -		if (patch->state == DISABLED)
> +		if (patch->state == LPC_DISABLED)
>  			continue;
> -		list_for_each_entry(obj, &patch->objs, list) {
> -			if (strcmp(obj->name, mod->name))
> +
> +		if (patch->dynrelas) {
> +			ret = lpc_write_relocations(patch->mod,
> +				patch->dynrelas);
> +			if (ret)
> +				goto err;
> +		}
> +
> +		lpc_for_each_patch_func(patch, patch_func) {
> +			if (strcmp(patch_func->obj_name, mod->name))
>  				continue;
> +
>  			pr_notice("load of module '%s' detected, applying patch '%s'\n",
>  				  mod->name, patch->mod->name);
> -			obj->mod = mod;
> -			ret = lpc_enable_object(patch->mod, obj);
> +			patch_func->mod = mod;
> +
> +			ret = lpc_verify_enable_func(patch_func);
>  			if (ret)
> -				goto out;
> -			break;
> +				goto err;
>  		}
>  	}
>  
>  	up(&lpc_mutex);
>  	return 0;
> -out:
> +
> +err:
>  	up(&lpc_mutex);
>  	WARN("failed to apply patch '%s' to module '%s'\n",
>  		patch->mod->name, mod->name);
>  	return 0;
>  }
>  
> -static struct notifier_block lp_module_nb = {
> -	.notifier_call = lp_module_notify,
> +static struct notifier_block lpc_module_nb = {
> +	.notifier_call = lpc_module_notify,
>  	.priority = INT_MIN, /* called last */
>  };
>  
> @@ -603,10 +525,7 @@ static struct notifier_block lp_module_nb = {
>   * /sys/kernel/livepatch
>   * /sys/kernel/livepatch/<patch>
>   * /sys/kernel/livepatch/<patch>/enabled
> - * /sys/kernel/livepatch/<patch>/<object>
> - * /sys/kernel/livepatch/<patch>/<object>/<func>
> - * /sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
> - * /sys/kernel/livepatch/<patch>/<object>/<func>/old_addr
> + * /sys/kernel/livepatch/<patch>/funcs
>   */
>  
>  static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> @@ -620,7 +539,7 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  	if (ret)
>  		return -EINVAL;
>  
> -	if (val != DISABLED && val != ENABLED)
> +	if (val != LPC_DISABLED && val != LPC_ENABLED)
>  		return -EINVAL;
>  
>  	patch = container_of(kobj, struct lpc_patch, kobj);
> @@ -632,12 +551,12 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  		goto out;
>  	}
>  
> -	if (val == ENABLED) {
> -		ret = lpc_enable_patch(patch);
> +	if (val == LPC_ENABLED) {
> +		ret = __lpc_enable_patch(patch);
>  		if (ret)
>  			goto out;
>  	} else {
> -		ret = lpc_disable_patch(patch);
> +		ret = __lpc_disable_patch(patch);
>  		if (ret)
>  			goto out;
>  	}
> @@ -657,40 +576,35 @@ static ssize_t enabled_show(struct kobject *kobj,
>  	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->state);
>  }
>  
> -static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> -static struct attribute *lpc_patch_attrs[] = {
> -	&enabled_kobj_attr.attr,
> -	NULL
> -};
> -
> -static ssize_t new_addr_show(struct kobject *kobj,
> +static ssize_t funcs_show(struct kobject *kobj,
>  			     struct kobj_attribute *attr, char *buf)
>  {
> -	struct lpc_func *func;
> -
> -	func = container_of(kobj, struct lpc_func, kobj);
> -	return snprintf(buf, PAGE_SIZE-1, "0x%016lx\n", func->new_addr);
> -}
> +	struct lpc_patch *patch;
> +	const struct lpc_func *patch_func;
> +	ssize_t size;
>  
> -static struct kobj_attribute new_addr_kobj_attr = __ATTR_RO(new_addr);
> +	size = snprintf(buf, PAGE_SIZE, "Functions:\n");
>  
> -static ssize_t old_addr_show(struct kobject *kobj,
> -			     struct kobj_attribute *attr, char *buf)
> -{
> -	struct lpc_func *func;
> +	patch = container_of(kobj, struct lpc_patch, kobj);
> +	lpc_for_each_patch_func(patch, patch_func)
> +		size += snprintf(buf + size, PAGE_SIZE - size, "%s\n",
> +				patch_func->old_name);
>  
> -	func = container_of(kobj, struct lpc_func, kobj);
> -	return snprintf(buf, PAGE_SIZE-1, "0x%016lx\n", func->old_addr);
> +	return size;
>  }
>  
> -static struct kobj_attribute old_addr_kobj_attr = __ATTR_RO(old_addr);
> -
> -static struct attribute *lpc_func_attrs[] = {
> -	&new_addr_kobj_attr.attr,
> -	&old_addr_kobj_attr.attr,
> +static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute funcs_kobj_attr = __ATTR_RO(funcs);
> +static struct attribute *lpc_patch_attrs[] = {
> +	&enabled_kobj_attr.attr,
> +	&funcs_kobj_attr.attr,
>  	NULL
>  };
>  
> +static struct attribute_group lpc_patch_sysfs_group = {
> +	.attrs = lpc_patch_attrs,
> +};
> +
>  static struct kobject *lpc_root_kobj;
>  
>  static int lpc_create_root_kobj(void)
> @@ -720,228 +634,67 @@ static void lpc_kobj_release_patch(struct kobject *kobj)
>  static struct kobj_type lpc_ktype_patch = {
>  	.release = lpc_kobj_release_patch,
>  	.sysfs_ops = &kobj_sysfs_ops,
> -	.default_attrs = lpc_patch_attrs
> -};
> -
> -static void lpc_kobj_release_object(struct kobject *kobj)
> -{
> -	struct lpc_object *obj;
> -
> -	obj = container_of(kobj, struct lpc_object, kobj);
> -	if (!list_empty(&obj->list))
> -		list_del(&obj->list);
> -	kfree(obj);
> -}
> -
> -static struct kobj_type lpc_ktype_object = {
> -	.release	= lpc_kobj_release_object,
> -	.sysfs_ops	= &kobj_sysfs_ops,
> -};
> -
> -static void lpc_kobj_release_func(struct kobject *kobj)
> -{
> -	struct lpc_func *func;
> -
> -	func = container_of(kobj, struct lpc_func, kobj);
> -	if (!list_empty(&func->list))
> -		list_del(&func->list);
> -	kfree(func);
> -}
> -
> -static struct kobj_type lpc_ktype_func = {
> -	.release	= lpc_kobj_release_func,
> -	.sysfs_ops	= &kobj_sysfs_ops,
> -	.default_attrs = lpc_func_attrs
>  };
>  
>  /*********************************
> - * structure allocation
> + * structure init and free
>   ********************************/
>  
> -static void lpc_free_funcs(struct lpc_object *obj)
> -{
> -	struct lpc_func *func, *funcsafe;
> -
> -	list_for_each_entry_safe(func, funcsafe, &obj->funcs, list)
> -		kobject_put(&func->kobj);
> -}
> -
> -static void lpc_free_objects(struct lpc_patch *patch)
> -{
> -	struct lpc_object *obj, *objsafe;
> -
> -	list_for_each_entry_safe(obj, objsafe, &patch->objs, list) {
> -		lpc_free_funcs(obj);
> -		kobject_put(&obj->kobj);
> -	}
> -}
> -
>  static void lpc_free_patch(struct lpc_patch *patch)
>  {
> -	lpc_free_objects(patch);
> +	sysfs_remove_group(&patch->kobj, &lpc_patch_sysfs_group);
>  	kobject_put(&patch->kobj);
>  }
>  
> -static struct lpc_func *lpc_create_func(struct kobject *root,
> -					struct lp_func *userfunc)
> +static int lpc_init_patch(struct lpc_patch *patch)
>  {
> -	struct lpc_func *func;
>  	struct ftrace_ops *ops;
> +	struct lpc_func *patch_func;
>  	int ret;
>  
> -	/* alloc */
> -	func = kzalloc(sizeof(*func), GFP_KERNEL);
> -	if (!func)
> -		return NULL;
> -
>  	/* init */
> -	INIT_LIST_HEAD(&func->list);
> -	func->old_name = userfunc->old_name;
> -	func->new_addr = (unsigned long)userfunc->new_func;
> -	func->old_addr = userfunc->old_addr;
> -	func->state = DISABLED;
> -	ops = &func->fops;
> -	ops->private = func;
> -	ops->func = lpc_ftrace_handler;
> -	ops->flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_DYNAMIC;
> -
> -	/* sysfs */
> -	ret = kobject_init_and_add(&func->kobj, &lpc_ktype_func,
> -				   root, func->old_name);
> -	if (ret) {
> -		kfree(func);
> -		return NULL;
> -	}
> -
> -	return func;
> -}
> -
> -static int lpc_create_funcs(struct lpc_object *obj,
> -			    struct lp_func *userfuncs)
> -{
> -	struct lp_func *userfunc;
> -	struct lpc_func *func;
> -
> -	if (!userfuncs)
> -		return -EINVAL;
> -
> -	for (userfunc = userfuncs; userfunc->old_name; userfunc++) {
> -		func = lpc_create_func(&obj->kobj, userfunc);
> -		if (!func)
> -			goto free;
> -		list_add(&func->list, &obj->funcs);
> -	}
> -	return 0;
> -free:
> -	lpc_free_funcs(obj);
> -	return -ENOMEM;
> -}
> -
> -static struct lpc_object *lpc_create_object(struct kobject *root,
> -					    struct lp_object *userobj)
> -{
> -	struct lpc_object *obj;
> -	int ret;
> -
> -	/* alloc */
> -	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> -	if (!obj)
> -		return NULL;
> -
> -	/* init */
> -	INIT_LIST_HEAD(&obj->list);
> -	obj->name = userobj->name;
> -	obj->dynrelas = userobj->dynrelas;
> -	obj->state = DISABLED;
> -	/* obj->mod set by is_object_loaded() */
> -	INIT_LIST_HEAD(&obj->funcs);
> -
> -	/* sysfs */
> -	ret = kobject_init_and_add(&obj->kobj, &lpc_ktype_object,
> -				   root, obj->name);
> -	if (ret) {
> -		kfree(obj);
> -		return NULL;
> -	}
> -
> -	/* create functions */
> -	ret = lpc_create_funcs(obj, userobj->funcs);
> -	if (ret) {
> -		kobject_put(&obj->kobj);
> -		return NULL;
> -	}
> -
> -	return obj;
> -}
> -
> -static int lpc_create_objects(struct lpc_patch *patch,
> -			      struct lp_object *userobjs)
> -{
> -	struct lp_object *userobj;
> -	struct lpc_object *obj;
> -
> -	if (!userobjs)
> -		return -EINVAL;
> -
> -	for (userobj = userobjs; userobj->name; userobj++) {
> -		obj = lpc_create_object(&patch->kobj, userobj);
> -		if (!obj)
> -			goto free;
> -		list_add(&obj->list, &patch->objs);
> -	}
> -	return 0;
> -free:
> -	lpc_free_objects(patch);
> -	return -ENOMEM;
> -}
> -
> -static int lpc_create_patch(struct lp_patch *userpatch)
> -{
> -	struct lpc_patch *patch;
> -	int ret;
> -
> -	/* alloc */
> -	patch = kzalloc(sizeof(*patch), GFP_KERNEL);
> -	if (!patch)
> -		return -ENOMEM;
> -
> -	/* init */
> -	INIT_LIST_HEAD(&patch->list);
> -	patch->userpatch = userpatch;
> -	patch->mod = userpatch->mod;
> -	patch->state = DISABLED;
> -	INIT_LIST_HEAD(&patch->objs);
> +	patch->state = LPC_DISABLED;
>  
>  	/* sysfs */
>  	ret = kobject_init_and_add(&patch->kobj, &lpc_ktype_patch,
>  				   lpc_root_kobj, patch->mod->name);
> -	if (ret) {
> -		kfree(patch);
> -		return ret;
> -	}
> +	if (ret)
> +		goto err_root;
>  
> -	/* create objects */
> -	ret = lpc_create_objects(patch, userpatch->objs);
> -	if (ret) {
> -		kobject_put(&patch->kobj);
> -		return ret;
> +	/* create functions */
> +	lpc_for_each_patch_func(patch, patch_func) {
> +		patch_func->new_addr = (unsigned long)patch_func->new_func;
> +		patch_func->state = LPC_DISABLED;
> +		ops = &patch_func->fops;
> +		ops->private = patch_func;
> +		ops->func = lpc_ftrace_handler;
> +		ops->flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_DYNAMIC;
>  	}
>  
> +	ret = sysfs_create_group(&patch->kobj, &lpc_patch_sysfs_group);
> +	if (ret)
> +		goto err_patch;
> +
>  	/* add to global list of patches */
>  	list_add(&patch->list, &lpc_patches);
>  
>  	return 0;
> +
> +err_patch:
> +	kobject_put(&patch->kobj);
> +err_root:
> +	return ret;
>  }
>  
>  /************************************
>   * register/unregister
>   ***********************************/
>  
> -int lp_register_patch(struct lp_patch *userpatch)
> +int lpc_register_patch(struct lpc_patch *userpatch)
>  {
>  	int ret;
>  
> -	if (!userpatch || !userpatch->mod || !userpatch->objs)
> +	if (!userpatch || !userpatch->mod || !userpatch->funcs)
>  		return -EINVAL;
>  
>  	/*
> @@ -955,36 +708,26 @@ int lp_register_patch(struct lp_patch *userpatch)
>  		return -ENODEV;
>  
>  	down(&lpc_mutex);
> -	ret = lpc_create_patch(userpatch);
> +	ret = lpc_init_patch(userpatch);
>  	up(&lpc_mutex);
>  	if (ret)
>  		module_put(userpatch->mod);
>  
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(lp_register_patch);
> +EXPORT_SYMBOL_GPL(lpc_register_patch);
>  
> -int lp_unregister_patch(struct lp_patch *userpatch)
> +int lpc_unregister_patch(struct lpc_patch *userpatch)
>  {
> -	struct lpc_patch *patch;
>  	int ret = 0;
>  
>  	down(&lpc_mutex);
> -	patch = lpc_find_patch(userpatch);
> -	if (!patch) {
> -		ret = -ENODEV;
> -		goto out;
> -	}
> -	if (patch->state == ENABLED) {
> -		ret = -EINVAL;
> -		goto out;
> -	}
> -	lpc_free_patch(patch);
> -out:
> +	lpc_free_patch(userpatch);
>  	up(&lpc_mutex);
> +
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(lp_unregister_patch);
> +EXPORT_SYMBOL_GPL(lpc_unregister_patch);
>  
>  /************************************
>   * entry/exit
> @@ -994,7 +737,7 @@ static int lpc_init(void)
>  {
>  	int ret;
>  
> -	ret = register_module_notifier(&lp_module_nb);
> +	ret = register_module_notifier(&lpc_module_nb);
>  	if (ret)
>  		return ret;
>  
> @@ -1004,14 +747,14 @@ static int lpc_init(void)
>  
>  	return 0;
>  unregister:
> -	unregister_module_notifier(&lp_module_nb);
> +	unregister_module_notifier(&lpc_module_nb);
>  	return ret;
>  }
>  
>  static void lpc_exit(void)
>  {
>  	lpc_remove_root_kobj();
> -	unregister_module_notifier(&lp_module_nb);
> +	unregister_module_notifier(&lpc_module_nb);
>  }
>  
>  module_init(lpc_init);
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-13 17:12     ` Seth Jennings
@ 2014-11-14 13:30       ` Miroslav Benes
  2014-11-14 14:52         ` Petr Mladek
  0 siblings, 1 reply; 73+ messages in thread
From: Miroslav Benes @ 2014-11-14 13:30 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik, Steven Rostedt,
	jslaby, pmladek, live-patching, kpatch, linux-kernel

On Thu, 13 Nov 2014, Seth Jennings wrote:

> On Thu, Nov 13, 2014 at 11:16:00AM +0100, Miroslav Benes wrote:
> > 
> > Hi,
> > 
> > thank you for the first version of the united live patching core.
> > 
> > The patch below implements some of our review objections. Changes are 
> > described in the commit log. It simplifies the hierarchy of data 
> > structures, removes data duplication (lp_ and lpc_ structures) and 
> > simplifies sysfs directory.
> > 
> > I did not try to repair other stuff (races, function names, function 
> > prefix, api symmetry etc.). It should serve as a demonstration of our 
> > point of view.
> > 
> > There are some problems with this. try_module_get and module_put may be 
> > called several times for each kernel module where some function is 
> > patched in. This should be fixed with module going notifier as suggested 
> > by Petr. 
> > 
> > The modified core was tested with modified testing live patch originally 
> > from Seth's github. It worked as expected.
> > 
> > Please take a look at these changes, so we can discuss them in more 
> > detail.
> 
> Thanks Miroslav.
> 
> The functional changes are a little hard to break out from the
> formatting changes like s/disable/unregister and s/lp_/lpc_/ or adding
> LPC_ prefix to the enum, most (all?) of which I have included for v2.
> 
> A problem with getting rid of the object layer is that there are
> operations we do that are object-level operations.  For example,
> module lookup and deferred module patching.  Also, the dynamic
> relocations need to be associated with an object, not a patch, as not
> all relocations will be able to be applied at patch load time for
> patches that apply to modules that aren't loaded.  I understand that you
> can walk the patch-level dynrela table and skip dynrela entries that
> don't match the target object, but why do that when you can cleanly
> express the relationship with a data structure hierarchy?
> 
> One example is the call to is_object_loaded() (renamed and reworked in
> v2 btw) per function rather than per object.  That is duplicate work and
> information that could be more cleanly expressed through an object
> layer.

I understand your arguments, as I had thought about this before. It is true 
that some operations connected with the object level could be duplicated in 
our approach. However, the list of patched functions, and especially of 
objects (vmlinux or modules), would always be relatively short. A two-level 
hierarchy (functions and patches) is, in my opinion, more compact and easier 
to maintain. I do not think the object level outweighs this. 
Let us think about it some more...

> I also understand that sysfs/kobject stuff adds code length.  However,
> the new "funcs" attribute is procfs style, not sysfs style.  A sysfs
> attribute should convey _one_ value.
> 
> From Documentation/filesystems/sysfs.txt:
> ==========
> Attributes should be ASCII text files, preferably with only one value
> per file. It is noted that it may not be efficient to contain only one
> value per file, so it is socially acceptable to express an array of
> values of the same type. 
> 
> Mixing types, expressing multiple lines of data, and doing fancy
> formatting of data is heavily frowned upon. Doing these things may get
> you publicly humiliated and your code rewritten without notice. 
> ==========

Ah, you are right. My mistake. Thank you.

> Also the function list would have object ambiguity.  If there was a
> patched function my_func() in both vmlinux and a module, it would just
> appear on the list twice. You can fix this by using the mod:func syntax
> like kallsyms, but it isn't as clean as expressing it in a hierarchy.

Yes, using mod:func or checking against the module name (and not only the 
function name) would be necessary. Ambiguity would also be a problem in a 
sysfs directory tree without the object level. But it is doable.

> As far as the unification of the API structures with the internal
> structures I have two points.  First is that, IMHO, we should assume that
> the structures coming from the user are const.  In kpatch, for example,
> we pass through some structures that are not created in the code, but by
> the patch generation tool and stored in an ELF section (read-only).
> Additionally, I am really against exposing the internal fields.
> Commenting them as "internal" is just messy and we have to change the .h
> file every time when want to add a field for internal use.

Changing the header file, and thus the API, between different kernel 
releases is not a problem in my opinion. First, a live patching module would 
be created against a specific kernel version (so the correct API is known). 
Second, we would like to add a userspace tool for automatic patch generation 
to upstream sometime in the future. The API would be of "no importance" 
there, as is the situation with perf now (if I understand it correctly).
 
> It seems that the primary purpose of this patch is to reduce the lines
> of code.  However, I think that the object layer of the data structure
> cleanly expresses the object<->function relationship and makes code like
> the deferred patching much more straightforward since you already have
> the functions/dynrelas organized by object.  You don't have to do the
> nasty "if (strcmp(func->obj_name, objname)) continue;" business over the
> entire patch every time.

The primary purpose was to show our point of view. I do not pretend that 
there are no problems, but there are also some benefits.

> Be advised, I have also done away with the new_addr/old_addr attributes
> for v2 and replaced the patched module ref'ing with a combination of a
> GOING notifier with lpc_mutex for protection.

Great. I'll wait for v2 and resend our patch based on it, rebased and split 
into several patches. We can continue the discussion afterwards. Is that 
ok?

Thank you

---
Miroslav Benes
SUSE Labs


> > ----
> > From f659a18a630de27b47d375119d793e28ee50da04 Mon Sep 17 00:00:00 2001
> > From: Miroslav Benes <mbenes@suse.cz>
> > Date: Thu, 13 Nov 2014 10:25:48 +0100
> > Subject: [PATCH] lpc: simplification of structure and sysfs hierarchy
> > 
> > Original code has several issues this patch tries to remove.
> > 
> > First, there is only lpc_func structure for patched function and lpc_patch for
> > the patch as a whole. Therefore lpc_object structure as middle step of hierarchy
> > is removed. Patched function is still associated with some object (vmlinux or
> > module) through obj_name. Dynrelas are now in lpc_patch structure and object
> > identifier (obj_name) is in the lpc_dynrela to preserve the connection.
> > 
> > Second, sysfs structure is simplified. We do not need to propagate old_addr and
> > new_addr. So, there is subdirectory for each patch (patching module) which
> > includes original enabled attribute and new one funcs attribute which lists the
> > patched functions.
> > 
> > Third, data duplication (lp_ and lpc_ structures) is removed. lpc_ structures
> > are now in the header file and made available for the user. This allows us to
> > remove almost all the functions for structure allocation in the original code.
> > 
> > Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> > ---
> >  include/linux/livepatch.h |  46 ++--
> >  kernel/livepatch/core.c   | 575 +++++++++++++---------------------------------
> >  2 files changed, 191 insertions(+), 430 deletions(-)
> > 
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index c7a415b..db5ba00 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -2,10 +2,23 @@
> >  #define _LIVEPATCH_H_
> >  
> >  #include <linux/module.h>
> > +#include <linux/ftrace.h>
> >  
> > -struct lp_func {
> > +enum lpc_state {
> > +	LPC_DISABLED,
> > +	LPC_ENABLED
> > +};
> > +
> > +struct lpc_func {
> > +	/* internal */
> > +	struct ftrace_ops fops;
> > +	enum lpc_state state;
> > +	struct module *mod; /* module associated with patched function */
> > +	unsigned long new_addr; /* replacement function in patch module */
> > +
> > +	/* external */
> >  	const char *old_name; /* function to be patched */
> > -	void *new_func; /* replacement function in patch module */
> > +	void *new_func;
> >  	/*
> >  	 * The old_addr field is optional and can be used to resolve
> >  	 * duplicate symbol names in the vmlinux object.  If this
> > @@ -15,31 +28,36 @@ struct lp_func {
> >  	 * way to resolve the ambiguity.
> >  	 */
> >  	unsigned long old_addr;
> > +
> > +	const char *obj_name; /* "vmlinux" or module name */
> >  };
> >  
> > -struct lp_dynrela {
> > +struct lpc_dynrela {
> >  	unsigned long dest;
> >  	unsigned long src;
> >  	unsigned long type;
> >  	const char *name;
> > +	const char *obj_name;
> >  	int addend;
> >  	int external;
> >  };
> >  
> > -struct lp_object {
> > -	const char *name; /* "vmlinux" or module name */
> > -	struct lp_func *funcs;
> > -	struct lp_dynrela *dynrelas;
> > -};
> > +struct lpc_patch {
> > +	/* internal */
> > +	struct list_head list;
> > +	struct kobject kobj;
> > +	enum lpc_state state;
> >  
> > -struct lp_patch {
> > +	/* external */
> >  	struct module *mod; /* module containing the patch */
> > -	struct lp_object *objs;
> > +	struct lpc_dynrela *dynrelas;
> > +	struct lpc_func funcs[];
> >  };
> >  
> > -int lp_register_patch(struct lp_patch *);
> > -int lp_unregister_patch(struct lp_patch *);
> > -int lp_enable_patch(struct lp_patch *);
> > -int lp_disable_patch(struct lp_patch *);
> > +
> > +extern int lpc_register_patch(struct lpc_patch *);
> > +extern int lpc_unregister_patch(struct lpc_patch *);
> > +extern int lpc_enable_patch(struct lpc_patch *);
> > +extern int lpc_disable_patch(struct lpc_patch *);
> >  
> >  #endif /* _LIVEPATCH_H_ */
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index b32dbb5..feecc22 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -31,78 +31,32 @@
> >  
> >  #include <linux/livepatch.h>
> >  
> > +#define lpc_for_each_patch_func(p, pf)   \
> > +	for (pf = p->funcs; pf->old_name; pf++)
> > +
> >  /*************************************
> >   * Core structures
> >   ************************************/
> >  
> > -/*
> > - * lp_ structs vs lpc_ structs
> > - *
> > - * For each element (patch, object, func) in the live-patching code,
> > - * there are two types with two different prefixes: lp_ and lpc_.
> > - *
> > - * Structures used by the live-patch modules to register with this core module
> > - * are prefixed with lp_ (live patching).  These structures are part of the
> > - * registration API and are defined in livepatch.h.  The structures used
> > - * internally by this core module are prefixed with lpc_ (live patching core).
> > - */
> > -
> >  static DEFINE_SEMAPHORE(lpc_mutex);
> >  static LIST_HEAD(lpc_patches);
> >  
> > -enum lpc_state {
> > -	DISABLED,
> > -	ENABLED
> > -};
> > -
> > -struct lpc_func {
> > -	struct list_head list;
> > -	struct kobject kobj;
> > -	struct ftrace_ops fops;
> > -	enum lpc_state state;
> > -
> > -	const char *old_name;
> > -	unsigned long new_addr;
> > -	unsigned long old_addr;
> > -};
> > -
> > -struct lpc_object {
> > -	struct list_head list;
> > -	struct kobject kobj;
> > -	struct module *mod; /* module associated with object */
> > -	enum lpc_state state;
> > -
> > -	const char *name;
> > -	struct list_head funcs;
> > -	struct lp_dynrela *dynrelas;
> > -};
> > -
> > -struct lpc_patch {
> > -	struct list_head list;
> > -	struct kobject kobj;
> > -	struct lp_patch *userpatch; /* for correlation during unregister */
> > -	enum lpc_state state;
> > -
> > -	struct module *mod;
> > -	struct list_head objs;
> > -};
> > -
> >  /*******************************************
> >   * Helpers
> >   *******************************************/
> >  
> > -/* sets obj->mod if object is not vmlinux and module was found */
> > -static bool is_object_loaded(struct lpc_object *obj)
> > +/* sets patch_func->mod if object is not vmlinux and module was found */
> > +static bool is_object_loaded(struct lpc_func *patch_func)
> >  {
> >  	struct module *mod;
> >  
> > -	if (!strcmp(obj->name, "vmlinux"))
> > +	if (!strcmp(patch_func->obj_name, "vmlinux"))
> >  		return 1;
> >  
> >  	mutex_lock(&module_mutex);
> > -	mod = find_module(obj->name);
> > +	mod = find_module(patch_func->obj_name);
> >  	mutex_unlock(&module_mutex);
> > -	obj->mod = mod;
> > +	patch_func->mod = mod;
> >  
> >  	return !!mod;
> >  }
> > @@ -254,18 +208,18 @@ static int lpc_find_external_symbol(struct module *pmod, const char *name,
> >  	return lpc_find_symbol(pmod->name, name, addr);
> >  }
> >  
> > -static int lpc_write_object_relocations(struct module *pmod,
> > -					struct lpc_object *obj)
> > +static int lpc_write_relocations(struct module *pmod,
> > +		struct lpc_dynrela *patch_dynrelas)
> >  {
> >  	int ret, size, readonly = 0, numpages;
> > -	struct lp_dynrela *dynrela;
> > +	struct lpc_dynrela *dynrela;
> >  	u64 loc, val;
> >  	unsigned long core = (unsigned long)pmod->module_core;
> >  	unsigned long core_ro_size = pmod->core_ro_size;
> >  	unsigned long core_size = pmod->core_size;
> >  
> > -	for (dynrela = obj->dynrelas; dynrela->name; dynrela++) {
> > -		if (!strcmp(obj->name, "vmlinux")) {
> > +	for (dynrela = patch_dynrelas; dynrela->name; dynrela++) {
> > +		if (!strcmp(dynrela->obj_name, "vmlinux")) {
> >  			ret = lpc_verify_vmlinux_symbol(dynrela->name,
> >  							dynrela->src);
> >  			if (ret)
> > @@ -277,7 +231,7 @@ static int lpc_write_object_relocations(struct module *pmod,
> >  							       dynrela->name,
> >  							       &dynrela->src);
> >  			else
> > -				ret = lpc_find_symbol(obj->mod->name,
> > +				ret = lpc_find_symbol(dynrela->obj_name,
> >  						      dynrela->name,
> >  						      &dynrela->src);
> >  			if (ret)
> > @@ -357,7 +311,7 @@ static int lpc_enable_func(struct lpc_func *func)
> >  	int ret;
> >  
> >  	BUG_ON(!func->old_addr);
> > -	BUG_ON(func->state != DISABLED);
> > +	BUG_ON(func->state != LPC_DISABLED);
> >  	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 0, 0);
> >  	if (ret) {
> >  		pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> > @@ -370,16 +324,16 @@ static int lpc_enable_func(struct lpc_func *func)
> >  		       func->old_name, ret);
> >  		ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
> >  	} else
> > -		func->state = ENABLED;
> > +		func->state = LPC_ENABLED;
> >  
> >  	return ret;
> >  }
> >  
> > -static int lpc_unregister_func(struct lpc_func *func)
> > +static int lpc_disable_func(struct lpc_func *func)
> >  {
> >  	int ret;
> >  
> > -	BUG_ON(func->state != ENABLED);
> > +	BUG_ON(func->state != LPC_ENABLED);
> >  	if (!func->old_addr)
> >  		/* parent object is not loaded */
> >  		return 0;
> > @@ -392,173 +346,131 @@ static int lpc_unregister_func(struct lpc_func *func)
> >  	ret = ftrace_set_filter_ip(&func->fops, func->old_addr, 1, 0);
> >  	if (ret)
> >  		pr_warn("function unregister succeeded but failed to clear the filter\n");
> > -	func->state = DISABLED;
> > +	func->state = LPC_DISABLED;
> >  
> >  	return 0;
> >  }
> >  
> > -static int lpc_unregister_object(struct lpc_object *obj)
> > -{
> > -	struct lpc_func *func;
> > -	int ret;
> > -
> > -	list_for_each_entry(func, &obj->funcs, list) {
> > -		if (func->state != ENABLED)
> > -			continue;
> > -		ret = lpc_unregister_func(func);
> > -		if (ret)
> > -			return ret;
> > -		if (strcmp(obj->name, "vmlinux"))
> > -			func->old_addr = 0;
> > -	}
> > -	if (obj->mod)
> > -		module_put(obj->mod);
> > -	obj->state = DISABLED;
> > -
> > -	return 0;
> > -}
> > -
> > -/* caller must ensure that obj->mod is set if object is a module */
> > -static int lpc_enable_object(struct module *pmod, struct lpc_object *obj)
> > -{
> > -	struct lpc_func *func;
> > -	int ret;
> > -
> > -	if (obj->mod && !try_module_get(obj->mod))
> > -		return -ENODEV;
> > -
> > -	if (obj->dynrelas) {
> > -		ret = lpc_write_object_relocations(pmod, obj);
> > -		if (ret)
> > -			goto unregister;
> > -	}
> > -	list_for_each_entry(func, &obj->funcs, list) {
> > -		ret = lpc_find_verify_func_addr(func, obj->name);
> > -		if (ret)
> > -			goto unregister;
> > -
> > -		ret = lpc_enable_func(func);
> > -		if (ret)
> > -			goto unregister;
> > -	}
> > -	obj->state = ENABLED;
> > -
> > -	return 0;
> > -unregister:
> > -	WARN_ON(lpc_unregister_object(obj));
> > -	return ret;
> > -}
> > -
> >  /******************************
> >   * enable/disable
> >   ******************************/
> >  
> >  /* must be called with lpc_mutex held */
> > -static struct lpc_patch *lpc_find_patch(struct lp_patch *userpatch)
> > -{
> > -	struct lpc_patch *patch;
> > -
> > -	list_for_each_entry(patch, &lpc_patches, list)
> > -		if (patch->userpatch == userpatch)
> > -			return patch;
> > -
> > -	return NULL;
> > -}
> > -
> > -/* must be called with lpc_mutex held */
> > -static int lpc_disable_patch(struct lpc_patch *patch)
> > +static int __lpc_disable_patch(struct lpc_patch *patch)
> >  {
> > -	struct lpc_object *obj;
> > +	struct lpc_func *patch_func;
> >  	int ret;
> >  
> >  	pr_notice("disabling patch '%s'\n", patch->mod->name);
> >  
> > -	list_for_each_entry(obj, &patch->objs, list) {
> > -		if (obj->state != ENABLED)
> > +	lpc_for_each_patch_func(patch, patch_func) {
> > +		if (patch_func->state != LPC_ENABLED)
> >  			continue;
> > -		ret = lpc_unregister_object(obj);
> > -		if (ret)
> > +		ret = lpc_disable_func(patch_func);
> > +		if (ret) {
> > +			pr_err("lpc: cannot disable function %s\n",
> > +				patch_func->old_name);
> >  			return ret;
> > +		}
> > +
> > +		if (strcmp(patch_func->obj_name, "vmlinux"))
> > +			patch_func->old_addr = 0;
> > +		if (patch_func->mod)
> > +			module_put(patch_func->mod);
> >  	}
> > -	patch->state = DISABLED;
> > +	patch->state = LPC_DISABLED;
> >  
> >  	return 0;
> >  }
> >  
> > -int lp_disable_patch(struct lp_patch *userpatch)
> > +int lpc_disable_patch(struct lpc_patch *patch)
> >  {
> > -	struct lpc_patch *patch;
> >  	int ret;
> >  
> >  	down(&lpc_mutex);
> > -	patch = lpc_find_patch(userpatch);
> > -	if (!patch) {
> > -		ret = -ENODEV;
> > -		goto out;
> > -	}
> > -	ret = lpc_disable_patch(patch);
> > -out:
> > +	ret = __lpc_disable_patch(patch);
> >  	up(&lpc_mutex);
> > +
> >  	return ret;
> >  }
> > -EXPORT_SYMBOL_GPL(lp_disable_patch);
> > +EXPORT_SYMBOL_GPL(lpc_disable_patch);
> > +
> > +static int lpc_verify_enable_func(struct lpc_func *patch_func)
> > +{
> > +	int ret;
> > +
> > +	if (patch_func->mod && !try_module_get(patch_func->mod))
> > +		return -ENODEV;
> > +
> > +	ret = lpc_find_verify_func_addr(patch_func, patch_func->obj_name);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = lpc_enable_func(patch_func);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return 0;
> > +}
> >  
> >  /* must be called with lpc_mutex held */
> > -static int lpc_enable_patch(struct lpc_patch *patch)
> > +static int __lpc_enable_patch(struct lpc_patch *patch)
> >  {
> > -	struct lpc_object *obj;
> > +	struct lpc_func *patch_func;
> >  	int ret;
> >  
> > -	BUG_ON(patch->state != DISABLED);
> > +	BUG_ON(patch->state != LPC_DISABLED);
> >  
> >  	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> >  	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
> >  
> >  	pr_notice("enabling patch '%s'\n", patch->mod->name);
> >  
> > -	list_for_each_entry(obj, &patch->objs, list) {
> > -		if (!is_object_loaded(obj))
> > +	if (patch->dynrelas) {
> > +		ret = lpc_write_relocations(patch->mod, patch->dynrelas);
> > +		if (ret)
> > +			goto err;
> > +	}
> > +
> > +	lpc_for_each_patch_func(patch, patch_func) {
> > +		if (!is_object_loaded(patch_func))
> >  			continue;
> > -		ret = lpc_enable_object(patch->mod, obj);
> > +
> > +		ret = lpc_verify_enable_func(patch_func);
> >  		if (ret)
> > -			goto unregister;
> > +			goto err;
> >  	}
> > -	patch->state = ENABLED;
> > +	patch->state = LPC_ENABLED;
> > +
> >  	return 0;
> >  
> > -unregister:
> > -	WARN_ON(lpc_disable_patch(patch));
> > +err:
> > +	WARN_ON(__lpc_disable_patch(patch));
> >  	return ret;
> >  }
> >  
> > -int lp_enable_patch(struct lp_patch *userpatch)
> > +int lpc_enable_patch(struct lpc_patch *patch)
> >  {
> > -	struct lpc_patch *patch;
> >  	int ret;
> >  
> >  	down(&lpc_mutex);
> > -	patch = lpc_find_patch(userpatch);
> > -	if (!patch) {
> > -		ret = -ENODEV;
> > -		goto out;
> > -	}
> > -	ret = lpc_enable_patch(patch);
> > -out:
> > +	ret = __lpc_enable_patch(patch);
> >  	up(&lpc_mutex);
> > +
> >  	return ret;
> >  }
> > -EXPORT_SYMBOL_GPL(lp_enable_patch);
> > +EXPORT_SYMBOL_GPL(lpc_enable_patch);
> >  
> >  /******************************
> >   * module notifier
> >   *****************************/
> >  
> > -static int lp_module_notify(struct notifier_block *nb, unsigned long action,
> > +static int lpc_module_notify(struct notifier_block *nb, unsigned long action,
> >  			    void *data)
> >  {
> >  	struct module *mod = data;
> >  	struct lpc_patch *patch;
> > -	struct lpc_object *obj;
> > +	struct lpc_func *patch_func;
> >  	int ret = 0;
> >  
> >  	if (action != MODULE_STATE_COMING)
> > @@ -567,32 +479,42 @@ static int lp_module_notify(struct notifier_block *nb, unsigned long action,
> >  	down(&lpc_mutex);
> >  
> >  	list_for_each_entry(patch, &lpc_patches, list) {
> > -		if (patch->state == DISABLED)
> > +		if (patch->state == LPC_DISABLED)
> >  			continue;
> > -		list_for_each_entry(obj, &patch->objs, list) {
> > -			if (strcmp(obj->name, mod->name))
> > +
> > +		if (patch->dynrelas) {
> > +			ret = lpc_write_relocations(patch->mod,
> > +				patch->dynrelas);
> > +			if (ret)
> > +				goto err;
> > +		}
> > +
> > +		lpc_for_each_patch_func(patch, patch_func) {
> > +			if (strcmp(patch_func->obj_name, mod->name))
> >  				continue;
> > +
> >  			pr_notice("load of module '%s' detected, applying patch '%s'\n",
> >  				  mod->name, patch->mod->name);
> > -			obj->mod = mod;
> > -			ret = lpc_enable_object(patch->mod, obj);
> > +			patch_func->mod = mod;
> > +
> > +			ret = lpc_verify_enable_func(patch_func);
> >  			if (ret)
> > -				goto out;
> > -			break;
> > +				goto err;
> >  		}
> >  	}
> >  
> >  	up(&lpc_mutex);
> >  	return 0;
> > -out:
> > +
> > +err:
> >  	up(&lpc_mutex);
> >  	WARN("failed to apply patch '%s' to module '%s'\n",
> >  		patch->mod->name, mod->name);
> >  	return 0;
> >  }
> >  
> > -static struct notifier_block lp_module_nb = {
> > -	.notifier_call = lp_module_notify,
> > +static struct notifier_block lpc_module_nb = {
> > +	.notifier_call = lpc_module_notify,
> >  	.priority = INT_MIN, /* called last */
> >  };
> >  
> > @@ -603,10 +525,7 @@ static struct notifier_block lp_module_nb = {
> >   * /sys/kernel/livepatch
> >   * /sys/kernel/livepatch/<patch>
> >   * /sys/kernel/livepatch/<patch>/enabled
> > - * /sys/kernel/livepatch/<patch>/<object>
> > - * /sys/kernel/livepatch/<patch>/<object>/<func>
> > - * /sys/kernel/livepatch/<patch>/<object>/<func>/new_addr
> > - * /sys/kernel/livepatch/<patch>/<object>/<func>/old_addr
> > + * /sys/kernel/livepatch/<patch>/funcs
> >   */
> >  
> >  static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> > @@ -620,7 +539,7 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> >  	if (ret)
> >  		return -EINVAL;
> >  
> > -	if (val != DISABLED && val != ENABLED)
> > +	if (val != LPC_DISABLED && val != LPC_ENABLED)
> >  		return -EINVAL;
> >  
> >  	patch = container_of(kobj, struct lpc_patch, kobj);
> > @@ -632,12 +551,12 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> >  		goto out;
> >  	}
> >  
> > -	if (val == ENABLED) {
> > -		ret = lpc_enable_patch(patch);
> > +	if (val == LPC_ENABLED) {
> > +		ret = __lpc_enable_patch(patch);
> >  		if (ret)
> >  			goto out;
> >  	} else {
> > -		ret = lpc_disable_patch(patch);
> > +		ret = __lpc_disable_patch(patch);
> >  		if (ret)
> >  			goto out;
> >  	}
> > @@ -657,40 +576,35 @@ static ssize_t enabled_show(struct kobject *kobj,
> >  	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->state);
> >  }
> >  
> > -static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> > -static struct attribute *lpc_patch_attrs[] = {
> > -	&enabled_kobj_attr.attr,
> > -	NULL
> > -};
> > -
> > -static ssize_t new_addr_show(struct kobject *kobj,
> > +static ssize_t funcs_show(struct kobject *kobj,
> >  			     struct kobj_attribute *attr, char *buf)
> >  {
> > -	struct lpc_func *func;
> > -
> > -	func = container_of(kobj, struct lpc_func, kobj);
> > -	return snprintf(buf, PAGE_SIZE-1, "0x%016lx\n", func->new_addr);
> > -}
> > +	struct lpc_patch *patch;
> > +	const struct lpc_func *patch_func;
> > +	ssize_t size;
> >  
> > -static struct kobj_attribute new_addr_kobj_attr = __ATTR_RO(new_addr);
> > +	size = snprintf(buf, PAGE_SIZE, "Functions:\n");
> >  
> > -static ssize_t old_addr_show(struct kobject *kobj,
> > -			     struct kobj_attribute *attr, char *buf)
> > -{
> > -	struct lpc_func *func;
> > +	patch = container_of(kobj, struct lpc_patch, kobj);
> > +	lpc_for_each_patch_func(patch, patch_func)
> > +		size += snprintf(buf + size, PAGE_SIZE - size, "%s\n",
> > +				patch_func->old_name);
> >  
> > -	func = container_of(kobj, struct lpc_func, kobj);
> > -	return snprintf(buf, PAGE_SIZE-1, "0x%016lx\n", func->old_addr);
> > +        return size;
> >  }
> >  
> > -static struct kobj_attribute old_addr_kobj_attr = __ATTR_RO(old_addr);
> > -
> > -static struct attribute *lpc_func_attrs[] = {
> > -	&new_addr_kobj_attr.attr,
> > -	&old_addr_kobj_attr.attr,
> > +static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> > +static struct kobj_attribute funcs_kobj_attr = __ATTR_RO(funcs);
> > +static struct attribute *lpc_patch_attrs[] = {
> > +	&enabled_kobj_attr.attr,
> > +	&funcs_kobj_attr.attr,
> >  	NULL
> >  };
> >  
> > +static struct attribute_group lpc_patch_sysfs_group = {
> > +	.attrs = lpc_patch_attrs,
> > +};
> > +
> >  static struct kobject *lpc_root_kobj;
> >  
> >  static int lpc_create_root_kobj(void)
> > @@ -720,228 +634,67 @@ static void lpc_kobj_release_patch(struct kobject *kobj)
> >  static struct kobj_type lpc_ktype_patch = {
> >  	.release = lpc_kobj_release_patch,
> >  	.sysfs_ops = &kobj_sysfs_ops,
> > -	.default_attrs = lpc_patch_attrs
> > -};
> > -
> > -static void lpc_kobj_release_object(struct kobject *kobj)
> > -{
> > -	struct lpc_object *obj;
> > -
> > -	obj = container_of(kobj, struct lpc_object, kobj);
> > -	if (!list_empty(&obj->list))
> > -		list_del(&obj->list);
> > -	kfree(obj);
> > -}
> > -
> > -static struct kobj_type lpc_ktype_object = {
> > -	.release	= lpc_kobj_release_object,
> > -	.sysfs_ops	= &kobj_sysfs_ops,
> > -};
> > -
> > -static void lpc_kobj_release_func(struct kobject *kobj)
> > -{
> > -	struct lpc_func *func;
> > -
> > -	func = container_of(kobj, struct lpc_func, kobj);
> > -	if (!list_empty(&func->list))
> > -		list_del(&func->list);
> > -	kfree(func);
> > -}
> > -
> > -static struct kobj_type lpc_ktype_func = {
> > -	.release	= lpc_kobj_release_func,
> > -	.sysfs_ops	= &kobj_sysfs_ops,
> > -	.default_attrs = lpc_func_attrs
> >  };
> >  
> >  /*********************************
> > - * structure allocation
> > + * structure init and free
> >   ********************************/
> >  
> > -static void lpc_free_funcs(struct lpc_object *obj)
> > -{
> > -	struct lpc_func *func, *funcsafe;
> > -
> > -	list_for_each_entry_safe(func, funcsafe, &obj->funcs, list)
> > -		kobject_put(&func->kobj);
> > -}
> > -
> > -static void lpc_free_objects(struct lpc_patch *patch)
> > -{
> > -	struct lpc_object *obj, *objsafe;
> > -
> > -	list_for_each_entry_safe(obj, objsafe, &patch->objs, list) {
> > -		lpc_free_funcs(obj);
> > -		kobject_put(&obj->kobj);
> > -	}
> > -}
> > -
> >  static void lpc_free_patch(struct lpc_patch *patch)
> >  {
> > -	lpc_free_objects(patch);
> > +	sysfs_remove_group(&patch->kobj, &lpc_patch_sysfs_group);
> >  	kobject_put(&patch->kobj);
> >  }
> >  
> > -static struct lpc_func *lpc_create_func(struct kobject *root,
> > -					struct lp_func *userfunc)
> > +static int lpc_init_patch(struct lpc_patch *patch)
> >  {
> > -	struct lpc_func *func;
> >  	struct ftrace_ops *ops;
> > +	struct lpc_func *patch_func;
> >  	int ret;
> >  
> > -	/* alloc */
> > -	func = kzalloc(sizeof(*func), GFP_KERNEL);
> > -	if (!func)
> > -		return NULL;
> > -
> >  	/* init */
> > -	INIT_LIST_HEAD(&func->list);
> > -	func->old_name = userfunc->old_name;
> > -	func->new_addr = (unsigned long)userfunc->new_func;
> > -	func->old_addr = userfunc->old_addr;
> > -	func->state = DISABLED;
> > -	ops = &func->fops;
> > -	ops->private = func;
> > -	ops->func = lpc_ftrace_handler;
> > -	ops->flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_DYNAMIC;
> > -
> > -	/* sysfs */
> > -	ret = kobject_init_and_add(&func->kobj, &lpc_ktype_func,
> > -				   root, func->old_name);
> > -	if (ret) {
> > -		kfree(func);
> > -		return NULL;
> > -	}
> > -
> > -	return func;
> > -}
> > -
> > -static int lpc_create_funcs(struct lpc_object *obj,
> > -			    struct lp_func *userfuncs)
> > -{
> > -	struct lp_func *userfunc;
> > -	struct lpc_func *func;
> > -
> > -	if (!userfuncs)
> > -		return -EINVAL;
> > -
> > -	for (userfunc = userfuncs; userfunc->old_name; userfunc++) {
> > -		func = lpc_create_func(&obj->kobj, userfunc);
> > -		if (!func)
> > -			goto free;
> > -		list_add(&func->list, &obj->funcs);
> > -	}
> > -	return 0;
> > -free:
> > -	lpc_free_funcs(obj);
> > -	return -ENOMEM;
> > -}
> > -
> > -static struct lpc_object *lpc_create_object(struct kobject *root,
> > -					    struct lp_object *userobj)
> > -{
> > -	struct lpc_object *obj;
> > -	int ret;
> > -
> > -	/* alloc */
> > -	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> > -	if (!obj)
> > -		return NULL;
> > -
> > -	/* init */
> > -	INIT_LIST_HEAD(&obj->list);
> > -	obj->name = userobj->name;
> > -	obj->dynrelas = userobj->dynrelas;
> > -	obj->state = DISABLED;
> > -	/* obj->mod set by is_object_loaded() */
> > -	INIT_LIST_HEAD(&obj->funcs);
> > -
> > -	/* sysfs */
> > -	ret = kobject_init_and_add(&obj->kobj, &lpc_ktype_object,
> > -				   root, obj->name);
> > -	if (ret) {
> > -		kfree(obj);
> > -		return NULL;
> > -	}
> > -
> > -	/* create functions */
> > -	ret = lpc_create_funcs(obj, userobj->funcs);
> > -	if (ret) {
> > -		kobject_put(&obj->kobj);
> > -		return NULL;
> > -	}
> > -
> > -	return obj;
> > -}
> > -
> > -static int lpc_create_objects(struct lpc_patch *patch,
> > -			      struct lp_object *userobjs)
> > -{
> > -	struct lp_object *userobj;
> > -	struct lpc_object *obj;
> > -
> > -	if (!userobjs)
> > -		return -EINVAL;
> > -
> > -	for (userobj = userobjs; userobj->name; userobj++) {
> > -		obj = lpc_create_object(&patch->kobj, userobj);
> > -		if (!obj)
> > -			goto free;
> > -		list_add(&obj->list, &patch->objs);
> > -	}
> > -	return 0;
> > -free:
> > -	lpc_free_objects(patch);
> > -	return -ENOMEM;
> > -}
> > -
> > -static int lpc_create_patch(struct lp_patch *userpatch)
> > -{
> > -	struct lpc_patch *patch;
> > -	int ret;
> > -
> > -	/* alloc */
> > -	patch = kzalloc(sizeof(*patch), GFP_KERNEL);
> > -	if (!patch)
> > -		return -ENOMEM;
> > -
> > -	/* init */
> > -	INIT_LIST_HEAD(&patch->list);
> > -	patch->userpatch = userpatch;
> > -	patch->mod = userpatch->mod;
> > -	patch->state = DISABLED;
> > -	INIT_LIST_HEAD(&patch->objs);
> > +	patch->state = LPC_DISABLED;
> >  
> >  	/* sysfs */
> >  	ret = kobject_init_and_add(&patch->kobj, &lpc_ktype_patch,
> >  				   lpc_root_kobj, patch->mod->name);
> > -	if (ret) {
> > -		kfree(patch);
> > -		return ret;
> > -	}
> > +	if (ret)
> > +		goto err_root;
> >  
> > -	/* create objects */
> > -	ret = lpc_create_objects(patch, userpatch->objs);
> > -	if (ret) {
> > -		kobject_put(&patch->kobj);
> > -		return ret;
> > +	/* create functions */
> > +	lpc_for_each_patch_func(patch, patch_func) {
> > +		patch_func->new_addr = (unsigned long)patch_func->new_func;
> > +		patch_func->state = LPC_DISABLED;
> > +		ops = &patch_func->fops;
> > +		ops->private = patch_func;
> > +		ops->func = lpc_ftrace_handler;
> > +		ops->flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_DYNAMIC;
> >  	}
> >  
> > +	ret = sysfs_create_group(&patch->kobj, &lpc_patch_sysfs_group);
> > +	if (ret)
> > +		goto err_patch;
> > +
> >  	/* add to global list of patches */
> >  	list_add(&patch->list, &lpc_patches);
> >  
> >  	return 0;
> > +
> > +err_patch:
> > +	kobject_put(&patch->kobj);
> > +err_root:
> > +	return ret;
> >  }
> >  
> >  /************************************
> >   * register/unregister
> >   ***********************************/
> >  
> > -int lp_register_patch(struct lp_patch *userpatch)
> > +int lpc_register_patch(struct lpc_patch *userpatch)
> >  {
> >  	int ret;
> >  
> > -	if (!userpatch || !userpatch->mod || !userpatch->objs)
> > +	if (!userpatch || !userpatch->mod || !userpatch->funcs)
> >  		return -EINVAL;
> >  
> >  	/*
> > @@ -955,36 +708,26 @@ int lp_register_patch(struct lp_patch *userpatch)
> >  		return -ENODEV;
> >  
> >  	down(&lpc_mutex);
> > -	ret = lpc_create_patch(userpatch);
> > +	ret = lpc_init_patch(userpatch);
> >  	up(&lpc_mutex);
> >  	if (ret)
> >  		module_put(userpatch->mod);
> >  
> >  	return ret;
> >  }
> > -EXPORT_SYMBOL_GPL(lp_register_patch);
> > +EXPORT_SYMBOL_GPL(lpc_register_patch);
> >  
> > -int lp_unregister_patch(struct lp_patch *userpatch)
> > +int lpc_unregister_patch(struct lpc_patch *userpatch)
> >  {
> > -	struct lpc_patch *patch;
> >  	int ret = 0;
> >  
> >  	down(&lpc_mutex);
> > -	patch = lpc_find_patch(userpatch);
> > -	if (!patch) {
> > -		ret = -ENODEV;
> > -		goto out;
> > -	}
> > -	if (patch->state == ENABLED) {
> > -		ret = -EINVAL;
> > -		goto out;
> > -	}
> > -	lpc_free_patch(patch);
> > -out:
> > +	lpc_free_patch(userpatch);
> >  	up(&lpc_mutex);
> > +
> >  	return ret;
> >  }
> > -EXPORT_SYMBOL_GPL(lp_unregister_patch);
> > +EXPORT_SYMBOL_GPL(lpc_unregister_patch);
> >  
> >  /************************************
> >   * entry/exit
> > @@ -994,7 +737,7 @@ static int lpc_init(void)
> >  {
> >  	int ret;
> >  
> > -	ret = register_module_notifier(&lp_module_nb);
> > +	ret = register_module_notifier(&lpc_module_nb);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -1004,14 +747,14 @@ static int lpc_init(void)
> >  
> >  	return 0;
> >  unregister:
> > -	unregister_module_notifier(&lp_module_nb);
> > +	unregister_module_notifier(&lpc_module_nb);
> >  	return ret;
> >  }
> >  
> >  static void lpc_exit(void)
> >  {
> >  	lpc_remove_root_kobj();
> > -	unregister_module_notifier(&lp_module_nb);
> > +	unregister_module_notifier(&lpc_module_nb);
> >  }
> >  
> >  module_init(lpc_init);
> > -- 
> > 2.1.2
> > 
> 

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-14 13:30       ` Miroslav Benes
@ 2014-11-14 14:52         ` Petr Mladek
  0 siblings, 0 replies; 73+ messages in thread
From: Petr Mladek @ 2014-11-14 14:52 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt, jslaby, live-patching, kpatch, linux-kernel

On Fri 2014-11-14 14:30:30, Miroslav Benes wrote:
> On Thu, 13 Nov 2014, Seth Jennings wrote:
> 
> > On Thu, Nov 13, 2014 at 11:16:00AM +0100, Miroslav Benes wrote:
> > > 
> > > Hi,
> > > 
> > > thank you for the first version of the united live patching core.
> > > 
> > > The patch below implements some of our review objections. Changes are 
> > > described in the commit log. It simplifies the hierarchy of data 
> > > structures, removes data duplication (lp_ and lpc_ structures) and 
> > > simplifies sysfs directory.
> > > 
> > > I did not try to repair other stuff (races, function names, function 
> > > prefix, api symmetry etc.). It should serve as a demonstration of our 
> > > point of view.
> > > 
> > > There are some problems with this. try_module_get and module_put may be 
> > > called several times for each kernel module where some function is 
> > > patched in. This should be fixed with module going notifier as suggested 
> > > by Petr. 
> > > 
> > > The modified core was tested with modified testing live patch originally 
> > > from Seth's github. It worked as expected.
> > > 
> > > Please take a look at these changes, so we can discuss them in more 
> > > detail.
> > 
> > Thanks Miroslav.
> > 
> > The functional changes are a little hard to break out from the
> > formatting changes like s/disable/unregister and s/lp_/lpc_/ or adding
> > LPC_ prefix to the enum, most (all?) of which I have included for v2.
> > 
> > A problem with getting rid of the object layer is that there are
> > operations we do that are object-level operations.  For example,
> > module lookup and deferred module patching.  Also, the dynamic
> > relocations need to be associated with an object, not a patch, as not
> > all relocations will be able to be applied at patch load time for
> > patches that apply to modules that aren't loaded.  I understand that you
> > can walk the patch-level dynrela table and skip dynrela entries that
> > don't match the target object, but why do that when you can cleanly
> > express the relationship with a data structure hierarchy?
> > 
> > One example is the call to is_object_loaded() (renamed and reworked in
> > v2 btw) per function rather than per object.  That is duplicate work and
> > information that could be more cleanly expressed through an object
> > layer.
> 
> I understand your arguments, as I had thought about this before. It is
> true that some operations connected with the object level could be
> duplicated in our approach. However, the list of patched functions and
> especially of objects (vmlinux or modules) would always be relatively
> short. A two-level hierarchy (functions and patches) is in my opinion
> more compact and easier to maintain. I do not think that the object
> level outweighs this. Let us think about it some more...
> 
> > I also understand that sysfs/kobject stuff adds code length.  However,
> > the new "funcs" attribute is procfs style, not sysfs style.  sysfs
> > attribute should convey _one_ value.
> > 
> > From Documentation/filesystems/sysfs.txt:
> > ==========
> > Attributes should be ASCII text files, preferably with only one value
> > per file. It is noted that it may not be efficient to contain only one
> > value per file, so it is socially acceptable to express an array of
> > values of the same type. 
> > 
> > Mixing types, expressing multiple lines of data, and doing fancy
> > formatting of data is heavily frowned upon. Doing these things may get
> > you publicly humiliated and your code rewritten without notice. 
> > ==========
> 
> Ah, you are right. My mistake. Thank you.
> 
> > Also the function list would have object ambiguity.  If there was a
> > patched function my_func() in both vmlinux and a module, it would just
> > appear on the list twice. You can fix this by using the mod:func syntax
> > like kallsyms, but it isn't as clean as expressing it in a hierarchy.
> 
> Yes, using mod:func or a check against the module name (and not only
> the function name) would be necessary. Ambiguity would also be a
> problem in the sysfs directory tree without the object level. But it is
> doable.
> 
> > As far as the unification of the API structures with the internal
> > structures I have two points.  First is that, IMHO, we should assume that
> > the structures coming from the user are const.  In kpatch, for example,
> > we pass through some structures that are not created in the code, but by
> > the patch generation tool and stored in an ELF section (read-only).
> > Additionally, I am really against exposing the internal fields.
> > Commenting them as "internal" is just messy and we have to change the .h
> > file every time when want to add a field for internal use.
> 
> Changing the header file, and thus the API, between different kernel
> releases is not a problem in my opinion. First, a live patching module
> would be created against a specific kernel version (so the correct API
> is known). Second, we would like to add a userspace tool for automatic
> patch generation to upstream sometime in the future. The API would be
> of "no importance" there, as is the situation with perf now (if I
> understand it correctly).
>  
> > It seems that the primary purpose of this patch is to reduce the lines
> > of code.  However, I think that the object layer of the data structure
> > cleanly expresses the object<->function relationship and makes code like
> > the deferred patching much more straightforward since you already have
> > the functions/dynrelas organized by object.  You don't have to do the
> > nasty "if (strcmp(func->obj_name, objname)) continue;" business over the
> > entire patch every time.

I think that it was not primarily about the number of lines but about
the complexity.

> The primary purpose was to show our point of view. I do not pretend that 
> there are no problems, but there are also some benefits.

As discussed above, there were three main points where we had
different opinion.

1. The sysfs structure. It has to stay more complex, as proposed in the
   original patch. We somehow missed the rule about one value per
   file.

   Well, we might want to show the status instead of the addresses.
   Note that particular functions might be obsoleted by another patch,
   and it would be nice to show this in the sysfs tree.


2. The three level structure (patch -> object -> func) vs. the two
   level one (patch -> func). I have personally neutral feeling about
   it.

   The three-level structure looks cleaner, for example for the problem
   of functions of the same name in different modules. It allows
   optimizing the structures and some operations: for example, it
   reduces the number of checks for a loaded module, and the handling
   of coming and going modules is in one place.

   But it creates more spaghetti code, for example the call chain
   create_patch() -> create_objects() -> create_object() ->
   create_funcs() -> create_func(). Also, the extra structures
   create a more complex abstraction that might be harder to keep in
   one's head, understand, and operate with.

   I think that we should compare the number of operations done over
   all functions versus just over the object. Then we could see if it
   is really worth it.

   Also, we might take inspiration from similar existing code, for
   example ftrace, kprobes, or kallsyms.


3. The separation of public and internal structures. I would really
   like to avoid the duplication.

   If I get it correctly, the main motivation is API stability, but I
   think that it is not a real problem here.

   The separation creates extra code (copying). Also, it creates
   confusion, because there are many variables and functions of the
   same name that differ only in one character (lp vs lpc).

   From my point of view, this is not worth having.


> > Be advised, I have also done away with the new_addr/old_addr attributes
> > for v2 and replaced the patched module ref'ing with a combination of a
> > GOING notifier with lpc_mutex for protection.

Great.

> Great. I'll wait for v2 and resend our patch based on that, rebased and 
> split to several patches. We can continue the discussion afterwards. Is it 
> ok?

Sounds like a good plan.

Best Regards,
Petr


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-13 16:38                             ` Vojtech Pavlik
@ 2014-11-18 12:47                               ` Petr Mladek
  2014-11-18 18:58                                 ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Petr Mladek @ 2014-11-18 12:47 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Masami Hiramatsu, Josh Poimboeuf, Christoph Hellwig,
	Seth Jennings, Jiri Kosina, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Thu 2014-11-13 17:38:04, Vojtech Pavlik wrote:
> On Fri, Nov 14, 2014 at 12:56:38AM +0900, Masami Hiramatsu wrote:
> > > It'd be mostly based on your refcounting code, including stack
> > > checking (when a process sleeps, counter gets set based on number of
> > > patched functions on the stack), possibly including setting the counter
> > > to 0 on syscall entry/exit, but it'd make the switch per-thread like
> > > kGraft does, not for the whole system, when the respective counters
> > > reach zero.
> > 
> > I'm not sure what happens if a process sleeps on the patched-set?
> 
> Then the patching process will be stuck until it is woken up somehow.
> But it's still much better to only have to care about processes sleeping
> on the patched-set than about processes sleeping anywhere (kGraft).
> 
> > If we switch the other threads, when this sleeping thread wakes up
> > it will see the old functions (and old data).
> 
> Yes, until the patching process is complete, data must be kept in the
> old format, even by new functions.

I am not sure if I am able to follow all the ideas. Anyway,
the above sentence triggered some warning bells in my head ;-)

Would it mean waiting for two safe switch points, please? One to switch
functions and the second to switch the data structures?

The latter condition looks pretty complicated to me. It would mean
making sure that the old structures are not stored anywhere and that
nobody would want to use them later. It won't be enough to check the
stack, because some function might be called later that would access a
saved pointer pointing to the old structure.


> > So I think we need both SWITCH_THREAD and SWITCH_KERNEL options in
> > that case.
> 
> With data shadowing that's not required. It still may be worth having
> it.

I am not 100% sure what shadowing means here. If it means that the new
function will be able to read and write both versions of the
structure, it looks more easily doable to me.


Best Regards,
Petr


* Re: [PATCH 0/2] Kernel Live Patching
  2014-11-18 12:47                               ` Petr Mladek
@ 2014-11-18 18:58                                 ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2014-11-18 18:58 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Vojtech Pavlik, Masami Hiramatsu, Christoph Hellwig,
	Seth Jennings, Jiri Kosina, Steven Rostedt, live-patching,
	kpatch, linux-kernel

On Tue, Nov 18, 2014 at 01:47:46PM +0100, Petr Mladek wrote:
> On Thu 2014-11-13 17:38:04, Vojtech Pavlik wrote:
> > On Fri, Nov 14, 2014 at 12:56:38AM +0900, Masami Hiramatsu wrote:
> > > > It'd be mostly based on your refcounting code, including stack
> > > > checking (when a process sleeps, counter gets set based on number of
> > > > patched functions on the stack), possibly including setting the counter
> > > > to 0 on syscall entry/exit, but it'd make the switch per-thread like
> > > > kGraft does, not for the whole system, when the respective counters
> > > > reach zero.
> > > 
> > > I'm not sure what happens if a process sleeps on the patched-set?
> > 
> > Then the patching process will be stuck until it is woken up somehow.
> > But it's still much better to only have to care about processes sleeping
> > on the patched-set than about processes sleeping anywhere (kGraft).
> > 
> > > If we switch the other threads, when this sleeping thread wakes up
> > > it will see the old functions (and old data).
> > 
> > Yes, until the patching process is complete, data must be kept in the
> > old format, even by new functions.
> 
> I am not sure if I am able to follow all the ideas. Anyway,
> the above sentence triggered some warning bells in my head ;-)
> 
> Would it mean waiting for two safe switch points, please? One to switch
> functions and the second to switch the data structures?

Yes, though the first switch "point" doesn't have to be instantaneous.
When changing data, new functions must be able to understand both old
and new data.  The transition to the new functions can happen gradually,
as long as you don't start creating new data (or migrating existing
data) until after the patching is complete.  That way you'll never have
an old function trying to access new data.

> The latter condition looks pretty complicated to me. It would mean
> making sure that the old structures are not stored anywhere and that
> nobody would want to use them later. It won't be enough to check the
> stack, because some function might be called later that would access a
> saved pointer pointing to the old structure.

We don't necessarily need to get rid of all old data, because new
functions can handle both old and new data.

> 
> 
> > > So I think we need both SWITCH_THREAD and SWITCH_KERNEL options in
> > > that case.
> > 
> > With data shadowing that's not required. It still may be worth having
> > it.
> 
> I am not 100% sure what the shadowing means. If it means that the new
> function will be able to read and write both versions of the
> structure, it looks more easily doable to me.

Shadowing is a way of adding virtual fields to existing data structures
at runtime.  That enables the new functions to easily determine whether
they're dealing with v1 or v2 of a given data structure, as well as
access any new fields in v2.


-- 
Josh


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-06 15:51   ` Jiri Slaby
  2014-11-06 16:57     ` Seth Jennings
@ 2014-11-30 12:23     ` Pavel Machek
  2014-12-01 16:49       ` Seth Jennings
  1 sibling, 1 reply; 73+ messages in thread
From: Pavel Machek @ 2014-11-30 12:23 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt, live-patching, kpatch, linux-kernel

On Thu 2014-11-06 16:51:02, Jiri Slaby wrote:
> On 11/06/2014, 03:39 PM, Seth Jennings wrote:
> > This commit introduces code for the live patching core.  It implements
> > an ftrace-based mechanism and kernel interface for doing live patching
> > of kernel and kernel module functions.
> 
> Hi,
> 
> nice! So we have something to start with. Brilliant!
> 
> I have some comments below now. Yet, it obviously needs deeper review
> which will take more time.
> 
> > --- /dev/null
> > +++ b/include/linux/livepatch.h
> > @@ -0,0 +1,45 @@
> > +#ifndef _LIVEPATCH_H_
> > +#define _LIVEPATCH_H_
> 
> This should follow the linux kernel naming: LINUX_LIVEPATCH_H
> 
> 
> > +#include <linux/module.h>
> > +
> > +struct lp_func {
> 
> I am not much happy with "lp" which effectively means parallel printer
> support. What about lip?

What about "patch_"?

It is not so big subsystem that additional typing would matter much...

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH 2/2] kernel: add support for live patching
  2014-11-30 12:23     ` Pavel Machek
@ 2014-12-01 16:49       ` Seth Jennings
  0 siblings, 0 replies; 73+ messages in thread
From: Seth Jennings @ 2014-12-01 16:49 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Jiri Slaby, Josh Poimboeuf, Jiri Kosina, Vojtech Pavlik,
	Steven Rostedt, live-patching, kpatch, linux-kernel

On Sun, Nov 30, 2014 at 01:23:48PM +0100, Pavel Machek wrote:
> On Thu 2014-11-06 16:51:02, Jiri Slaby wrote:
> > On 11/06/2014, 03:39 PM, Seth Jennings wrote:
> > > This commit introduces code for the live patching core.  It implements
> > > an ftrace-based mechanism and kernel interface for doing live patching
> > > of kernel and kernel module functions.
> > 
> > Hi,
> > 
> > nice! So we have something to start with. Brilliant!
> > 
> > I have some comments below now. Yet, it obviously needs deeper review
> > which will take more time.
> > 
> > > --- /dev/null
> > > +++ b/include/linux/livepatch.h
> > > @@ -0,0 +1,45 @@
> > > +#ifndef _LIVEPATCH_H_
> > > +#define _LIVEPATCH_H_
> > 
> > This should follow the linux kernel naming: LINUX_LIVEPATCH_H
> > 
> > 
> > > +#include <linux/module.h>
> > > +
> > > +struct lp_func {
> > 
> > I am not very happy with "lp", which commonly refers to parallel
> > printer support. What about "lip"?
> 
> What about "patch_"?

Hey Pavel,

We are on v4 of this patchset:
https://lkml.org/lkml/2014/11/25/868

We ended up going with klp_ (kernel live patching).

Thanks,
Seth

> 
> It is not such a big subsystem that the additional typing would matter much...
> 
> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


end of thread, other threads:[~2014-12-01 16:49 UTC | newest]

Thread overview: 73+ messages
2014-11-06 14:39 [PATCH 0/2] Kernel Live Patching Seth Jennings
2014-11-06 14:39 ` [PATCH 1/2] kernel: add TAINT_LIVEPATCH Seth Jennings
2014-11-09 20:19   ` Greg KH
2014-11-11 14:54     ` Seth Jennings
2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
2014-11-06 15:11   ` Jiri Kosina
2014-11-06 16:20     ` Seth Jennings
2014-11-06 16:32       ` Josh Poimboeuf
2014-11-06 18:00       ` Vojtech Pavlik
2014-11-06 22:20       ` Jiri Kosina
2014-11-07 12:50         ` Josh Poimboeuf
2014-11-07 13:13           ` Jiri Kosina
2014-11-07 13:22             ` Josh Poimboeuf
2014-11-07 14:57             ` Seth Jennings
2014-11-06 15:51   ` Jiri Slaby
2014-11-06 16:57     ` Seth Jennings
2014-11-06 17:12       ` Josh Poimboeuf
2014-11-07 18:21       ` Petr Mladek
2014-11-07 20:31         ` Josh Poimboeuf
2014-11-30 12:23     ` Pavel Machek
2014-12-01 16:49       ` Seth Jennings
2014-11-06 20:02   ` Steven Rostedt
2014-11-06 20:19     ` Seth Jennings
2014-11-07 17:13   ` module notifier: was " Petr Mladek
2014-11-07 18:07     ` Seth Jennings
2014-11-07 18:40       ` Petr Mladek
2014-11-07 18:55         ` Seth Jennings
2014-11-11 19:40         ` Seth Jennings
2014-11-11 22:17           ` Jiri Kosina
2014-11-11 22:48             ` Seth Jennings
2014-11-07 17:39   ` more patches for the same func: " Petr Mladek
2014-11-07 21:54     ` Josh Poimboeuf
2014-11-07 19:40   ` Andy Lutomirski
2014-11-07 19:42     ` Seth Jennings
2014-11-07 19:52     ` Seth Jennings
2014-11-10 10:08   ` Jiri Kosina
2014-11-10 17:31     ` Josh Poimboeuf
2014-11-13 10:16   ` Miroslav Benes
2014-11-13 14:38     ` Josh Poimboeuf
2014-11-13 17:12     ` Seth Jennings
2014-11-14 13:30       ` Miroslav Benes
2014-11-14 14:52         ` Petr Mladek
2014-11-06 18:44 ` [PATCH 0/2] Kernel Live Patching Christoph Hellwig
2014-11-06 18:51   ` Vojtech Pavlik
2014-11-06 18:58     ` Christoph Hellwig
2014-11-06 19:34       ` Josh Poimboeuf
2014-11-06 19:49         ` Steven Rostedt
2014-11-06 20:02           ` Josh Poimboeuf
2014-11-07  7:46           ` Christoph Hellwig
2014-11-07  7:45         ` Christoph Hellwig
2014-11-06 20:24       ` Vojtech Pavlik
2014-11-07  7:47         ` Christoph Hellwig
2014-11-07 13:11           ` Josh Poimboeuf
2014-11-07 14:04             ` Vojtech Pavlik
2014-11-07 15:45               ` Josh Poimboeuf
2014-11-07 21:27                 ` Vojtech Pavlik
2014-11-08  3:45                   ` Josh Poimboeuf
2014-11-08  8:07                     ` Vojtech Pavlik
2014-11-10 17:09                       ` Josh Poimboeuf
2014-11-11  9:05                         ` Vojtech Pavlik
2014-11-11 17:45                           ` Josh Poimboeuf
2014-11-11  1:24                   ` Masami Hiramatsu
2014-11-11 10:26                     ` Vojtech Pavlik
2014-11-12 17:33                       ` Masami Hiramatsu
2014-11-12 21:47                         ` Vojtech Pavlik
2014-11-13 15:56                           ` Masami Hiramatsu
2014-11-13 16:38                             ` Vojtech Pavlik
2014-11-18 12:47                               ` Petr Mladek
2014-11-18 18:58                                 ` Josh Poimboeuf
2014-11-07 12:31         ` Josh Poimboeuf
2014-11-07 12:48           ` Vojtech Pavlik
2014-11-07 13:06             ` Josh Poimboeuf
2014-11-09 20:16 ` Greg KH
