* [RFC PATCH 0/9] livepatch: consistency model
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

This patch set implements a livepatch consistency model, targeted for 3.21.
Now that we have a solid livepatch code base, this is the biggest remaining
missing piece.

This code stems from the design proposal made by Vojtech [1] in November.  It
makes live patching safer in general.  Specifically, it allows you to apply
patches which change function prototypes.  It also lays the groundwork for
future code changes which will enable data and data semantic changes.

It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
checking with kGraft's per-task consistency.  When patching, tasks are
carefully transitioned from the old universe to the new universe.  A task can
only be switched to the new universe if it's not using a function that is to be
patched or unpatched.  After all tasks have moved to the new universe, the
patching process is complete.
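
What "universe" means at runtime: the ftrace handler picks which version of a
function the current task sees, based on the task's universe.  Condensed from
patch 6/9 (minus a NULL check and the memory barriers), the relevant logic is:

	static void notrace klp_ftrace_handler(unsigned long ip,
					       unsigned long parent_ip,
					       struct ftrace_ops *fops,
					       struct pt_regs *regs)
	{
		struct klp_ops *ops = container_of(fops, struct klp_ops, fops);
		struct klp_func *func;

		rcu_read_lock();
		func = list_first_or_null_rcu(&ops->func_stack,
					      struct klp_func, stack_node);

		if (unlikely(func->transition) &&
		    current->klp_universe == KLP_UNIVERSE_OLD) {
			/* old universe: fall back to the previous version */
			func = list_entry_rcu(func->stack_node.next,
					      struct klp_func, stack_node);
			if (&func->stack_node == &ops->func_stack)
				goto unlock;	/* run the original function */
		}

		klp_arch_set_pc(regs, (unsigned long)func->new_func);
	unlock:
		rcu_read_unlock();
	}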

How it transitions various tasks to the new universe:

- The stacks of all sleeping tasks are checked.  Each task that is not sleeping
  on a to-be-patched function is switched.

- Other user tasks are handled by do_notify_resume() (see patch 9/9).  If a
  task is I/O bound, it switches universes when returning from a system call.
  If it's CPU bound, it switches when returning from an interrupt.  If it's
  sleeping on a patched function, the user can send SIGSTOP and SIGCONT to
  force it to switch upon return from the signal handler.

- Idle "swapper" tasks which are sleeping on a to-be-patched function can be
  switched from within the outer idle loop.

- An interrupt handler will inherit the universe of the task it interrupts.

- kthreads which are sleeping on to-be-patched functions are not yet handled
  (more on this below).
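
All of the switching paths above funnel into the same trivial helper from
patch 6/9; the hard part is proving that it's safe to call it for a given
task at a given moment:

	static inline void klp_update_task_universe(struct task_struct *t)
	{
		/* corresponding smp_wmb() is in klp_set_universe_goal() */
		smp_rmb();

		t->klp_universe = klp_universe_goal;
	}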


I think this approach provides the best benefits of both kpatch and kGraft:

advantages vs kpatch:
- no stop machine latency
- higher patch success rate (can patch in-use functions)
- patching failures are more predictable (primary failure mode is attempting to
  patch a kthread which is sleeping forever on a patched function, more on this
  below)

advantages vs kGraft:
- less code complexity (don't have to hack up the code of all the different
  kthreads)
- less impact to processes (don't have to signal all sleeping tasks)

disadvantages vs kpatch:
- no system-wide switch point (not really a functional limitation; it just
  forces the patch author to be more careful, but that's probably a good thing
  anyway)


My biggest concerns and questions related to this patch set are:

1) To safely examine the task stacks, the transition code locks each task's rq
   struct, which requires using the scheduler's internal rq locking functions.
   It seems to work well, but I'm not sure if there's a cleaner way to safely
   do stack checking without stop_machine().  (A condensed sketch of the
   per-task check follows below.)

2) As mentioned above, kthreads which are always sleeping on a patched function
   will never transition to the new universe.  This is really a minor issue
   (less than 1% of patches).  It's not necessarily something that needs to be
   resolved with this patch set, but it would be good to have some discussion
   about it regardless.
   
   To overcome this issue, I have 1/2 an idea: we could add some stack checking
   code to the ftrace handler itself to transition the kthread to the new
   universe after it re-enters the function it was originally sleeping on, if
   the stack doesn't already have any other to-be-patched functions (a rough
   sketch of this follows below).  Combined with the klp_transition_work_fn()'s
   periodic stack checking of sleeping tasks, that would handle most of the
   cases (except when trying to patch the high-level thread_fn itself).

   But then how do you make the kthread wake up?  As far as I can tell,
   wake_up_process() doesn't seem to work on a kthread (unless I messed up my
   testing somehow).  What does kGraft do in this case?
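
Regarding (1), the core of the per-task check, condensed from
klp_transition_task() in patch 6/9:

	struct rq *rq;
	unsigned long flags;
	int ret = 0;

	rq = task_rq_lock(t, &flags);

	if (!task_running(rq, t) || t == current) {
		/* t can't start running here, so its stack is stable */
		dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
		if (!ret)
			klp_update_task_universe(t);
	}

	task_rq_unlock(rq, t, &flags);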
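And a rough, purely hypothetical sketch of the half-idea in (2).  Nothing
like this exists in the patch set; klp_stack_clean_except() is a made-up
helper that would check the stack while ignoring the frame of the function
being entered:

	/* in klp_ftrace_handler(), after looking up func: */
	if (unlikely(func->transition) &&
	    (current->flags & PF_KTHREAD) &&
	    current->klp_universe != klp_universe_goal &&
	    klp_stack_clean_except(current, func))
		klp_update_task_universe(current);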


[1] https://lkml.org/lkml/2014/11/7/354


Josh Poimboeuf (9):
  livepatch: simplify disable error path
  livepatch: separate enabled and patched states
  livepatch: move patching functions into patch.c
  livepatch: get function sizes
  sched: move task rq locking functions to sched.h
  livepatch: create per-task consistency model
  proc: add /proc/<pid>/universe to show livepatch status
  livepatch: allow patch modules to be removed
  livepatch: update task universe when exiting kernel

 arch/x86/include/asm/thread_info.h |   4 +-
 arch/x86/kernel/signal.c           |   4 +
 fs/proc/base.c                     |  11 ++
 include/linux/livepatch.h          |  38 ++--
 include/linux/sched.h              |   3 +
 kernel/fork.c                      |   2 +
 kernel/livepatch/Makefile          |   2 +-
 kernel/livepatch/core.c            | 360 ++++++++++---------------------------
 kernel/livepatch/patch.c           | 206 +++++++++++++++++++++
 kernel/livepatch/patch.h           |  26 +++
 kernel/livepatch/transition.c      | 318 ++++++++++++++++++++++++++++++++
 kernel/livepatch/transition.h      |  16 ++
 kernel/sched/core.c                |  34 +---
 kernel/sched/idle.c                |   4 +
 kernel/sched/sched.h               |  33 ++++
 15 files changed, 747 insertions(+), 314 deletions(-)
 create mode 100644 kernel/livepatch/patch.c
 create mode 100644 kernel/livepatch/patch.h
 create mode 100644 kernel/livepatch/transition.c
 create mode 100644 kernel/livepatch/transition.h

-- 
2.1.0



* [RFC PATCH 1/9] livepatch: simplify disable error path
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

If registering the function with ftrace has previously succeeded,
unregistering will almost never fail.  Even if it does, it's not a fatal
error.  We can still carry on and prevent the klp_func from being used
by removing it from the klp_ops func stack.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 kernel/livepatch/core.c | 67 +++++++++++++------------------------------------
 1 file changed, 17 insertions(+), 50 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 9adf86b..081df77 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -322,32 +322,20 @@ static void notrace klp_ftrace_handler(unsigned long ip,
 	klp_arch_set_pc(regs, (unsigned long)func->new_func);
 }
 
-static int klp_disable_func(struct klp_func *func)
+static void klp_disable_func(struct klp_func *func)
 {
 	struct klp_ops *ops;
-	int ret;
-
-	if (WARN_ON(func->state != KLP_ENABLED))
-		return -EINVAL;
 
-	if (WARN_ON(!func->old_addr))
-		return -EINVAL;
+	WARN_ON(func->state != KLP_ENABLED);
+	WARN_ON(!func->old_addr);
 
 	ops = klp_find_ops(func->old_addr);
 	if (WARN_ON(!ops))
-		return -EINVAL;
+		return;
 
 	if (list_is_singular(&ops->func_stack)) {
-		ret = unregister_ftrace_function(&ops->fops);
-		if (ret) {
-			pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
-			       func->old_name, ret);
-			return ret;
-		}
-
-		ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
-		if (ret)
-			pr_warn("function unregister succeeded but failed to clear the filter\n");
+		WARN_ON(unregister_ftrace_function(&ops->fops));
+		WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
 
 		list_del_rcu(&func->stack_node);
 		list_del(&ops->node);
@@ -357,8 +345,6 @@ static int klp_disable_func(struct klp_func *func)
 	}
 
 	func->state = KLP_DISABLED;
-
-	return 0;
 }
 
 static int klp_enable_func(struct klp_func *func)
@@ -419,23 +405,15 @@ err:
 	return ret;
 }
 
-static int klp_disable_object(struct klp_object *obj)
+static void klp_disable_object(struct klp_object *obj)
 {
 	struct klp_func *func;
-	int ret;
 
-	for (func = obj->funcs; func->old_name; func++) {
-		if (func->state != KLP_ENABLED)
-			continue;
-
-		ret = klp_disable_func(func);
-		if (ret)
-			return ret;
-	}
+	for (func = obj->funcs; func->old_name; func++)
+		if (func->state == KLP_ENABLED)
+			klp_disable_func(func);
 
 	obj->state = KLP_DISABLED;
-
-	return 0;
 }
 
 static int klp_enable_object(struct klp_object *obj)
@@ -451,22 +429,19 @@ static int klp_enable_object(struct klp_object *obj)
 
 	for (func = obj->funcs; func->old_name; func++) {
 		ret = klp_enable_func(func);
-		if (ret)
-			goto unregister;
+		if (ret) {
+			klp_disable_object(obj);
+			return ret;
+		}
 	}
 	obj->state = KLP_ENABLED;
 
 	return 0;
-
-unregister:
-	WARN_ON(klp_disable_object(obj));
-	return ret;
 }
 
 static int __klp_disable_patch(struct klp_patch *patch)
 {
 	struct klp_object *obj;
-	int ret;
 
 	/* enforce stacking: only the last enabled patch can be disabled */
 	if (!list_is_last(&patch->list, &klp_patches) &&
@@ -476,12 +451,8 @@ static int __klp_disable_patch(struct klp_patch *patch)
 	pr_notice("disabling patch '%s'\n", patch->mod->name);
 
 	for (obj = patch->objs; obj->funcs; obj++) {
-		if (obj->state != KLP_ENABLED)
-			continue;
-
-		ret = klp_disable_object(obj);
-		if (ret)
-			return ret;
+		if (obj->state == KLP_ENABLED)
+			klp_disable_object(obj);
 	}
 
 	patch->state = KLP_DISABLED;
@@ -931,7 +902,6 @@ static void klp_module_notify_going(struct klp_patch *patch,
 {
 	struct module *pmod = patch->mod;
 	struct module *mod = obj->mod;
-	int ret;
 
 	if (patch->state == KLP_DISABLED)
 		goto disabled;
@@ -939,10 +909,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
 	pr_notice("reverting patch '%s' on unloading module '%s'\n",
 		  pmod->name, mod->name);
 
-	ret = klp_disable_object(obj);
-	if (ret)
-		pr_warn("failed to revert patch '%s' on module '%s' (%d)\n",
-			pmod->name, mod->name, ret);
+	klp_disable_object(obj);
 
 disabled:
 	klp_free_object_loaded(obj);
-- 
2.1.0



* [RFC PATCH 2/9] livepatch: separate enabled and patched states
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

Once we have a consistency model, patches and their objects will be
enabled and disabled at different times.  For example, when a patch is
disabled, its loaded objects' funcs can remain registered with ftrace
indefinitely until the unpatching operation is complete and they're no
longer in use.

It's less confusing if we give them different names: patches can be
enabled or disabled; objects (and their funcs) can be patched or
unpatched:

- Enabled means that a patch is logically enabled (but not necessarily
  fully applied).

- Patched means that an object's funcs are registered with ftrace and
  added to the klp_ops func stack.

Also, since these states are binary, represent them with boolean-type
variables instead of enums.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 include/linux/livepatch.h | 15 ++++-----
 kernel/livepatch/core.c   | 79 +++++++++++++++++++++++------------------------
 2 files changed, 45 insertions(+), 49 deletions(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 95023fd..22a67d1 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -28,11 +28,6 @@
 
 #include <asm/livepatch.h>
 
-enum klp_state {
-	KLP_DISABLED,
-	KLP_ENABLED
-};
-
 /**
  * struct klp_func - function structure for live patching
  * @old_name:	name of the function to be patched
@@ -42,6 +37,7 @@ enum klp_state {
  * @kobj:	kobject for sysfs resources
  * @state:	tracks function-level patch application state
  * @stack_node:	list node for klp_ops func_stack list
+ * @patched:	the func has been added to the klp_ops list
  */
 struct klp_func {
 	/* external */
@@ -59,8 +55,8 @@ struct klp_func {
 
 	/* internal */
 	struct kobject kobj;
-	enum klp_state state;
 	struct list_head stack_node;
+	int patched;
 };
 
 /**
@@ -90,7 +86,7 @@ struct klp_reloc {
  * @kobj:	kobject for sysfs resources
  * @mod:	kernel module associated with the patched object
  * 		(NULL for vmlinux)
- * @state:	tracks object-level patch application state
+ * @patched:	the object's funcs have been added to the klp_ops list
  */
 struct klp_object {
 	/* external */
@@ -101,7 +97,7 @@ struct klp_object {
 	/* internal */
 	struct kobject *kobj;
 	struct module *mod;
-	enum klp_state state;
+	int patched;
 };
 
 /**
@@ -111,6 +107,7 @@ struct klp_object {
  * @list:	list node for global list of registered patches
  * @kobj:	kobject for sysfs resources
  * @state:	tracks patch-level application state
+ * @enabled:	the patch is enabled (but operation may be incomplete)
  */
 struct klp_patch {
 	/* external */
@@ -120,7 +117,7 @@ struct klp_patch {
 	/* internal */
 	struct list_head list;
 	struct kobject kobj;
-	enum klp_state state;
+	int enabled;
 };
 
 extern int klp_register_patch(struct klp_patch *);
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 081df77..73f9ba4 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -322,11 +322,11 @@ static void notrace klp_ftrace_handler(unsigned long ip,
 	klp_arch_set_pc(regs, (unsigned long)func->new_func);
 }
 
-static void klp_disable_func(struct klp_func *func)
+static void klp_unpatch_func(struct klp_func *func)
 {
 	struct klp_ops *ops;
 
-	WARN_ON(func->state != KLP_ENABLED);
+	WARN_ON(!func->patched);
 	WARN_ON(!func->old_addr);
 
 	ops = klp_find_ops(func->old_addr);
@@ -344,10 +344,10 @@ static void klp_disable_func(struct klp_func *func)
 		list_del_rcu(&func->stack_node);
 	}
 
-	func->state = KLP_DISABLED;
+	func->patched = 0;
 }
 
-static int klp_enable_func(struct klp_func *func)
+static int klp_patch_func(struct klp_func *func)
 {
 	struct klp_ops *ops;
 	int ret;
@@ -355,7 +355,7 @@ static int klp_enable_func(struct klp_func *func)
 	if (WARN_ON(!func->old_addr))
 		return -EINVAL;
 
-	if (WARN_ON(func->state != KLP_DISABLED))
+	if (WARN_ON(func->patched))
 		return -EINVAL;
 
 	ops = klp_find_ops(func->old_addr);
@@ -394,7 +394,7 @@ static int klp_enable_func(struct klp_func *func)
 		list_add_rcu(&func->stack_node, &ops->func_stack);
 	}
 
-	func->state = KLP_ENABLED;
+	func->patched = 1;
 
 	return 0;
 
@@ -405,36 +405,36 @@ err:
 	return ret;
 }
 
-static void klp_disable_object(struct klp_object *obj)
+static void klp_unpatch_object(struct klp_object *obj)
 {
 	struct klp_func *func;
 
 	for (func = obj->funcs; func->old_name; func++)
-		if (func->state == KLP_ENABLED)
-			klp_disable_func(func);
+		if (func->patched)
+			klp_unpatch_func(func);
 
-	obj->state = KLP_DISABLED;
+	obj->patched = 0;
 }
 
-static int klp_enable_object(struct klp_object *obj)
+static int klp_patch_object(struct klp_object *obj)
 {
 	struct klp_func *func;
 	int ret;
 
-	if (WARN_ON(obj->state != KLP_DISABLED))
+	if (WARN_ON(obj->patched))
 		return -EINVAL;
 
 	if (WARN_ON(!klp_is_object_loaded(obj)))
 		return -EINVAL;
 
 	for (func = obj->funcs; func->old_name; func++) {
-		ret = klp_enable_func(func);
+		ret = klp_patch_func(func);
 		if (ret) {
-			klp_disable_object(obj);
+			klp_unpatch_object(obj);
 			return ret;
 		}
 	}
-	obj->state = KLP_ENABLED;
+	obj->patched = 1;
 
 	return 0;
 }
@@ -445,17 +445,16 @@ static int __klp_disable_patch(struct klp_patch *patch)
 
 	/* enforce stacking: only the last enabled patch can be disabled */
 	if (!list_is_last(&patch->list, &klp_patches) &&
-	    list_next_entry(patch, list)->state == KLP_ENABLED)
+	    list_next_entry(patch, list)->enabled)
 		return -EBUSY;
 
 	pr_notice("disabling patch '%s'\n", patch->mod->name);
 
-	for (obj = patch->objs; obj->funcs; obj++) {
-		if (obj->state == KLP_ENABLED)
-			klp_disable_object(obj);
-	}
+	for (obj = patch->objs; obj->funcs; obj++)
+		if (obj->patched)
+			klp_unpatch_object(obj);
 
-	patch->state = KLP_DISABLED;
+	patch->enabled = 0;
 
 	return 0;
 }
@@ -479,7 +478,7 @@ int klp_disable_patch(struct klp_patch *patch)
 		goto err;
 	}
 
-	if (patch->state == KLP_DISABLED) {
+	if (!patch->enabled) {
 		ret = -EINVAL;
 		goto err;
 	}
@@ -497,12 +496,12 @@ static int __klp_enable_patch(struct klp_patch *patch)
 	struct klp_object *obj;
 	int ret;
 
-	if (WARN_ON(patch->state != KLP_DISABLED))
+	if (WARN_ON(patch->enabled))
 		return -EINVAL;
 
 	/* enforce stacking: only the first disabled patch can be enabled */
 	if (patch->list.prev != &klp_patches &&
-	    list_prev_entry(patch, list)->state == KLP_DISABLED)
+	    !list_prev_entry(patch, list)->enabled)
 		return -EBUSY;
 
 	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
@@ -516,12 +515,12 @@ static int __klp_enable_patch(struct klp_patch *patch)
 		if (!klp_is_object_loaded(obj))
 			continue;
 
-		ret = klp_enable_object(obj);
+		ret = klp_patch_object(obj);
 		if (ret)
 			goto unregister;
 	}
 
-	patch->state = KLP_ENABLED;
+	patch->enabled = 1;
 
 	return 0;
 
@@ -579,20 +578,20 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (ret)
 		return -EINVAL;
 
-	if (val != KLP_DISABLED && val != KLP_ENABLED)
+	if (val > 1)
 		return -EINVAL;
 
 	patch = container_of(kobj, struct klp_patch, kobj);
 
 	mutex_lock(&klp_mutex);
 
-	if (val == patch->state) {
+	if (patch->enabled == val) {
 		/* already in requested state */
 		ret = -EINVAL;
 		goto err;
 	}
 
-	if (val == KLP_ENABLED) {
+	if (val) {
 		ret = __klp_enable_patch(patch);
 		if (ret)
 			goto err;
@@ -617,7 +616,7 @@ static ssize_t enabled_show(struct kobject *kobj,
 	struct klp_patch *patch;
 
 	patch = container_of(kobj, struct klp_patch, kobj);
-	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->state);
+	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
 }
 
 static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
@@ -699,7 +698,7 @@ static void klp_free_patch(struct klp_patch *patch)
 static int klp_init_func(struct klp_object *obj, struct klp_func *func)
 {
 	INIT_LIST_HEAD(&func->stack_node);
-	func->state = KLP_DISABLED;
+	func->patched = 0;
 
 	return kobject_init_and_add(&func->kobj, &klp_ktype_func,
 				    obj->kobj, func->old_name);
@@ -736,7 +735,7 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
 	if (!obj->funcs)
 		return -EINVAL;
 
-	obj->state = KLP_DISABLED;
+	obj->patched = 0;
 
 	klp_find_object_module(obj);
 
@@ -775,7 +774,7 @@ static int klp_init_patch(struct klp_patch *patch)
 
 	mutex_lock(&klp_mutex);
 
-	patch->state = KLP_DISABLED;
+	patch->enabled = 0;
 
 	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
 				   klp_root_kobj, patch->mod->name);
@@ -821,7 +820,7 @@ int klp_unregister_patch(struct klp_patch *patch)
 		goto out;
 	}
 
-	if (patch->state == KLP_ENABLED) {
+	if (patch->enabled) {
 		ret = -EBUSY;
 		goto out;
 	}
@@ -882,13 +881,13 @@ static void klp_module_notify_coming(struct klp_patch *patch,
 	if (ret)
 		goto err;
 
-	if (patch->state == KLP_DISABLED)
+	if (!patch->enabled)
 		return;
 
 	pr_notice("applying patch '%s' to loading module '%s'\n",
 		  pmod->name, mod->name);
 
-	ret = klp_enable_object(obj);
+	ret = klp_patch_object(obj);
 	if (!ret)
 		return;
 
@@ -903,15 +902,15 @@ static void klp_module_notify_going(struct klp_patch *patch,
 	struct module *pmod = patch->mod;
 	struct module *mod = obj->mod;
 
-	if (patch->state == KLP_DISABLED)
-		goto disabled;
+	if (!patch->enabled)
+		goto free;
 
 	pr_notice("reverting patch '%s' on unloading module '%s'\n",
 		  pmod->name, mod->name);
 
-	klp_disable_object(obj);
+	klp_unpatch_object(obj);
 
-disabled:
+free:
 	klp_free_object_loaded(obj);
 }
 
-- 
2.1.0



* [RFC PATCH 3/9] livepatch: move patching functions into patch.c
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

Move functions related to the actual patching of functions and objects
into a new patch.c file.

The only functional change is to remove the unnecessary
WARN_ON(!klp_is_object_loaded()) check from klp_patch_object().

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 kernel/livepatch/Makefile |   2 +-
 kernel/livepatch/core.c   | 175 +--------------------------------------------
 kernel/livepatch/patch.c  | 176 ++++++++++++++++++++++++++++++++++++++++++++++
 kernel/livepatch/patch.h  |  25 +++++++
 4 files changed, 203 insertions(+), 175 deletions(-)
 create mode 100644 kernel/livepatch/patch.c
 create mode 100644 kernel/livepatch/patch.h

diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
index e8780c0..e136dad 100644
--- a/kernel/livepatch/Makefile
+++ b/kernel/livepatch/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
 
-livepatch-objs := core.o
+livepatch-objs := core.o patch.o
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 73f9ba4..0c09eba 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -24,29 +24,10 @@
 #include <linux/kernel.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
-#include <linux/ftrace.h>
 #include <linux/list.h>
 #include <linux/kallsyms.h>
-#include <linux/livepatch.h>
 
-/**
- * struct klp_ops - structure for tracking registered ftrace ops structs
- *
- * A single ftrace_ops is shared between all enabled replacement functions
- * (klp_func structs) which have the same old_addr.  This allows the switch
- * between function versions to happen instantaneously by updating the klp_ops
- * struct's func_stack list.  The winner is the klp_func at the top of the
- * func_stack (front of the list).
- *
- * @node:	node for the global klp_ops list
- * @func_stack:	list head for the stack of klp_func's (active func is on top)
- * @fops:	registered ftrace ops struct
- */
-struct klp_ops {
-	struct list_head node;
-	struct list_head func_stack;
-	struct ftrace_ops fops;
-};
+#include "patch.h"
 
 /*
  * The klp_mutex protects the global lists and state transitions of any
@@ -57,25 +38,9 @@ struct klp_ops {
 static DEFINE_MUTEX(klp_mutex);
 
 static LIST_HEAD(klp_patches);
-static LIST_HEAD(klp_ops);
 
 static struct kobject *klp_root_kobj;
 
-static struct klp_ops *klp_find_ops(unsigned long old_addr)
-{
-	struct klp_ops *ops;
-	struct klp_func *func;
-
-	list_for_each_entry(ops, &klp_ops, node) {
-		func = list_first_entry(&ops->func_stack, struct klp_func,
-					stack_node);
-		if (func->old_addr == old_addr)
-			return ops;
-	}
-
-	return NULL;
-}
-
 static bool klp_is_module(struct klp_object *obj)
 {
 	return obj->name;
@@ -301,144 +266,6 @@ static int klp_write_object_relocations(struct module *pmod,
 	return 0;
 }
 
-static void notrace klp_ftrace_handler(unsigned long ip,
-				       unsigned long parent_ip,
-				       struct ftrace_ops *fops,
-				       struct pt_regs *regs)
-{
-	struct klp_ops *ops;
-	struct klp_func *func;
-
-	ops = container_of(fops, struct klp_ops, fops);
-
-	rcu_read_lock();
-	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
-				      stack_node);
-	rcu_read_unlock();
-
-	if (WARN_ON_ONCE(!func))
-		return;
-
-	klp_arch_set_pc(regs, (unsigned long)func->new_func);
-}
-
-static void klp_unpatch_func(struct klp_func *func)
-{
-	struct klp_ops *ops;
-
-	WARN_ON(!func->patched);
-	WARN_ON(!func->old_addr);
-
-	ops = klp_find_ops(func->old_addr);
-	if (WARN_ON(!ops))
-		return;
-
-	if (list_is_singular(&ops->func_stack)) {
-		WARN_ON(unregister_ftrace_function(&ops->fops));
-		WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
-
-		list_del_rcu(&func->stack_node);
-		list_del(&ops->node);
-		kfree(ops);
-	} else {
-		list_del_rcu(&func->stack_node);
-	}
-
-	func->patched = 0;
-}
-
-static int klp_patch_func(struct klp_func *func)
-{
-	struct klp_ops *ops;
-	int ret;
-
-	if (WARN_ON(!func->old_addr))
-		return -EINVAL;
-
-	if (WARN_ON(func->patched))
-		return -EINVAL;
-
-	ops = klp_find_ops(func->old_addr);
-	if (!ops) {
-		ops = kzalloc(sizeof(*ops), GFP_KERNEL);
-		if (!ops)
-			return -ENOMEM;
-
-		ops->fops.func = klp_ftrace_handler;
-		ops->fops.flags = FTRACE_OPS_FL_SAVE_REGS |
-				  FTRACE_OPS_FL_DYNAMIC |
-				  FTRACE_OPS_FL_IPMODIFY;
-
-		list_add(&ops->node, &klp_ops);
-
-		INIT_LIST_HEAD(&ops->func_stack);
-		list_add_rcu(&func->stack_node, &ops->func_stack);
-
-		ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 0, 0);
-		if (ret) {
-			pr_err("failed to set ftrace filter for function '%s' (%d)\n",
-			       func->old_name, ret);
-			goto err;
-		}
-
-		ret = register_ftrace_function(&ops->fops);
-		if (ret) {
-			pr_err("failed to register ftrace handler for function '%s' (%d)\n",
-			       func->old_name, ret);
-			ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
-			goto err;
-		}
-
-
-	} else {
-		list_add_rcu(&func->stack_node, &ops->func_stack);
-	}
-
-	func->patched = 1;
-
-	return 0;
-
-err:
-	list_del_rcu(&func->stack_node);
-	list_del(&ops->node);
-	kfree(ops);
-	return ret;
-}
-
-static void klp_unpatch_object(struct klp_object *obj)
-{
-	struct klp_func *func;
-
-	for (func = obj->funcs; func->old_name; func++)
-		if (func->patched)
-			klp_unpatch_func(func);
-
-	obj->patched = 0;
-}
-
-static int klp_patch_object(struct klp_object *obj)
-{
-	struct klp_func *func;
-	int ret;
-
-	if (WARN_ON(obj->patched))
-		return -EINVAL;
-
-	if (WARN_ON(!klp_is_object_loaded(obj)))
-		return -EINVAL;
-
-	for (func = obj->funcs; func->old_name; func++) {
-		ret = klp_patch_func(func);
-		if (ret) {
-			klp_unpatch_object(obj);
-			return ret;
-		}
-	}
-	obj->patched = 1;
-
-	return 0;
-}
-
 static int __klp_disable_patch(struct klp_patch *patch)
 {
 	struct klp_object *obj;
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
new file mode 100644
index 0000000..281fbca
--- /dev/null
+++ b/kernel/livepatch/patch.c
@@ -0,0 +1,176 @@
+/*
+ * patch.c - Kernel Live Patching patching functions
+ *
+ * Copyright (C) 2014 Seth Jennings <sjenning@redhat.com>
+ * Copyright (C) 2014 SUSE
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/slab.h>
+
+#include "patch.h"
+
+static LIST_HEAD(klp_ops);
+
+static void notrace klp_ftrace_handler(unsigned long ip,
+				       unsigned long parent_ip,
+				       struct ftrace_ops *fops,
+				       struct pt_regs *regs)
+{
+	struct klp_ops *ops;
+	struct klp_func *func;
+
+	ops = container_of(fops, struct klp_ops, fops);
+
+	rcu_read_lock();
+	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
+				      stack_node);
+	rcu_read_unlock();
+
+	if (WARN_ON_ONCE(!func))
+		return;
+
+	klp_arch_set_pc(regs, (unsigned long)func->new_func);
+}
+
+struct klp_ops *klp_find_ops(unsigned long old_addr)
+{
+	struct klp_ops *ops;
+	struct klp_func *func;
+
+	list_for_each_entry(ops, &klp_ops, node) {
+		func = list_first_entry(&ops->func_stack, struct klp_func,
+					stack_node);
+		if (func->old_addr == old_addr)
+			return ops;
+	}
+
+	return NULL;
+}
+
+static void klp_unpatch_func(struct klp_func *func)
+{
+	struct klp_ops *ops;
+
+	WARN_ON(!func->patched);
+	WARN_ON(!func->old_addr);
+
+	ops = klp_find_ops(func->old_addr);
+	if (WARN_ON(!ops))
+		return;
+
+	if (list_is_singular(&ops->func_stack)) {
+		WARN_ON(unregister_ftrace_function(&ops->fops));
+		WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
+
+		list_del_rcu(&func->stack_node);
+		list_del(&ops->node);
+		kfree(ops);
+	} else {
+		list_del_rcu(&func->stack_node);
+	}
+
+	func->patched = 0;
+}
+
+static int klp_patch_func(struct klp_func *func)
+{
+	struct klp_ops *ops;
+	int ret;
+
+	if (WARN_ON(!func->old_addr))
+		return -EINVAL;
+
+	if (WARN_ON(func->patched))
+		return -EINVAL;
+
+	ops = klp_find_ops(func->old_addr);
+	if (!ops) {
+		ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+		if (!ops)
+			return -ENOMEM;
+
+		ops->fops.func = klp_ftrace_handler;
+		ops->fops.flags = FTRACE_OPS_FL_SAVE_REGS |
+				  FTRACE_OPS_FL_DYNAMIC |
+				  FTRACE_OPS_FL_IPMODIFY;
+
+		list_add(&ops->node, &klp_ops);
+
+		INIT_LIST_HEAD(&ops->func_stack);
+		list_add_rcu(&func->stack_node, &ops->func_stack);
+
+		ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 0, 0);
+		if (ret) {
+			pr_err("failed to set ftrace filter for function '%s' (%d)\n",
+			       func->old_name, ret);
+			goto err;
+		}
+
+		ret = register_ftrace_function(&ops->fops);
+		if (ret) {
+			pr_err("failed to register ftrace handler for function '%s' (%d)\n",
+			       func->old_name, ret);
+			ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
+			goto err;
+		}
+	} else {
+		list_add_rcu(&func->stack_node, &ops->func_stack);
+	}
+
+	func->patched = 1;
+
+	return 0;
+
+err:
+	list_del_rcu(&func->stack_node);
+	list_del(&ops->node);
+	kfree(ops);
+	return ret;
+}
+
+void klp_unpatch_object(struct klp_object *obj)
+{
+	struct klp_func *func;
+
+	for (func = obj->funcs; func->old_name; func++)
+		if (func->patched)
+			klp_unpatch_func(func);
+
+	obj->patched = 0;
+}
+
+int klp_patch_object(struct klp_object *obj)
+{
+	struct klp_func *func;
+	int ret;
+
+	if (WARN_ON(obj->patched))
+		return -EINVAL;
+
+	for (func = obj->funcs; func->old_name; func++) {
+		ret = klp_patch_func(func);
+		if (ret) {
+			klp_unpatch_object(obj);
+			return ret;
+		}
+	}
+	obj->patched = 1;
+
+	return 0;
+}
diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
new file mode 100644
index 0000000..bb34bd3
--- /dev/null
+++ b/kernel/livepatch/patch.h
@@ -0,0 +1,25 @@
+#include <linux/livepatch.h>
+
+/**
+ * struct klp_ops - structure for tracking registered ftrace ops structs
+ *
+ * A single ftrace_ops is shared between all enabled replacement functions
+ * (klp_func structs) which have the same old_addr.  This allows the switch
+ * between function versions to happen instantaneously by updating the klp_ops
+ * struct's func_stack list.  The winner is the klp_func at the top of the
+ * func_stack (front of the list).
+ *
+ * @node:	node for the global klp_ops list
+ * @func_stack:	list head for the stack of klp_func's (active func is on top)
+ * @fops:	registered ftrace ops struct
+ */
+struct klp_ops {
+	struct list_head node;
+	struct list_head func_stack;
+	struct ftrace_ops fops;
+};
+
+struct klp_ops *klp_find_ops(unsigned long old_addr);
+
+extern int klp_patch_object(struct klp_object *obj);
+extern void klp_unpatch_object(struct klp_object *obj);
-- 
2.1.0



* [RFC PATCH 4/9] livepatch: get function sizes
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

For the consistency model we'll need to know the sizes of the old and
new functions to determine if they're on any task stacks.
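
With the sizes in hand, "is this return address inside that function?"
becomes a simple half-open range test.  Patch 6/9 open-codes it in
klp_stacktrace_address_verify_func(); the helper name below is illustrative
only:

	static bool klp_addr_in_func(unsigned long addr,
				     unsigned long func_addr,
				     unsigned long func_size)
	{
		return addr >= func_addr && addr < func_addr + func_size;
	}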

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 include/linux/livepatch.h |  3 +++
 kernel/livepatch/core.c   | 19 ++++++++++++++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 22a67d1..0e65b4d 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -37,6 +37,8 @@
  * @kobj:	kobject for sysfs resources
  * @state:	tracks function-level patch application state
  * @stack_node:	list node for klp_ops func_stack list
+ * @old_size:	size of the old function
+ * @new_size:	size of the new function
  * @patched:	the func has been added to the klp_ops list
  */
 struct klp_func {
@@ -56,6 +58,7 @@ struct klp_func {
 	/* internal */
 	struct kobject kobj;
 	struct list_head stack_node;
+	unsigned long old_size, new_size;
 	int patched;
 };
 
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 0c09eba..85d4ef7 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -197,8 +197,25 @@ static int klp_find_verify_func_addr(struct klp_object *obj,
 	else
 		ret = klp_verify_vmlinux_symbol(func->old_name,
 						func->old_addr);
+	if (ret)
+		return ret;
 
-	return ret;
+	ret = kallsyms_lookup_size_offset(func->old_addr, &func->old_size,
+					  NULL);
+	if (!ret) {
+		pr_err("kallsyms lookup failed for '%s'\n", func->old_name);
+		return -EINVAL;
+	}
+
+	ret = kallsyms_lookup_size_offset((unsigned long)func->new_func,
+					  &func->new_size, NULL);
+	if (!ret) {
+		pr_err("kallsyms lookup failed for '%s' replacement\n",
+		       func->old_name);
+		return -EINVAL;
+	}
+
+	return 0;
 }
 
 /*
-- 
2.1.0



* [RFC PATCH 5/9] sched: move task rq locking functions to sched.h
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

Move task_rq_lock/unlock() to sched.h so they can be used elsewhere.
The livepatch code needs to lock each task's rq in order to safely
examine its stack and switch it to a new patch universe.
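
For reference, the expected usage pattern (the livepatch caller appears in
patch 6/9):

	unsigned long flags;
	struct rq *rq;

	rq = task_rq_lock(p, &flags);	/* takes p->pi_lock and rq->lock */

	/*
	 * While both locks are held, p can't be migrated or scheduled in,
	 * so if p isn't currently running, its stack can't change.
	 */

	task_rq_unlock(rq, p, &flags);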

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 kernel/sched/core.c  | 32 --------------------------------
 kernel/sched/sched.h | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 32 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b5797b7..78d91e6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -326,44 +326,12 @@ static inline struct rq *__task_rq_lock(struct task_struct *p)
 	}
 }
 
-/*
- * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
- */
-static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags)
-	__acquires(p->pi_lock)
-	__acquires(rq->lock)
-{
-	struct rq *rq;
-
-	for (;;) {
-		raw_spin_lock_irqsave(&p->pi_lock, *flags);
-		rq = task_rq(p);
-		raw_spin_lock(&rq->lock);
-		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
-			return rq;
-		raw_spin_unlock(&rq->lock);
-		raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
-
-		while (unlikely(task_on_rq_migrating(p)))
-			cpu_relax();
-	}
-}
-
 static void __task_rq_unlock(struct rq *rq)
 	__releases(rq->lock)
 {
 	raw_spin_unlock(&rq->lock);
 }
 
-static inline void
-task_rq_unlock(struct rq *rq, struct task_struct *p, unsigned long *flags)
-	__releases(rq->lock)
-	__releases(p->pi_lock)
-{
-	raw_spin_unlock(&rq->lock);
-	raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
-}
-
 /*
  * this_rq_lock - lock this runqueue and disable interrupts.
  */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9a2a45c..ae514c9 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1542,6 +1542,39 @@ static inline void double_rq_unlock(struct rq *rq1, struct rq *rq2)
 
 #endif
 
+/*
+ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
+ */
+static inline struct rq *task_rq_lock(struct task_struct *p,
+				      unsigned long *flags)
+	__acquires(p->pi_lock)
+	__acquires(rq->lock)
+{
+	struct rq *rq;
+
+	for (;;) {
+		raw_spin_lock_irqsave(&p->pi_lock, *flags);
+		rq = task_rq(p);
+		raw_spin_lock(&rq->lock);
+		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
+			return rq;
+		raw_spin_unlock(&rq->lock);
+		raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
+
+		while (unlikely(task_on_rq_migrating(p)))
+			cpu_relax();
+	}
+}
+
+static inline void task_rq_unlock(struct rq *rq, struct task_struct *p,
+				  unsigned long *flags)
+	__releases(rq->lock)
+	__releases(p->pi_lock)
+{
+	raw_spin_unlock(&rq->lock);
+	raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
+}
+
 extern struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq);
 extern struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq);
 extern void print_cfs_stats(struct seq_file *m, int cpu);
-- 
2.1.0



* [RFC PATCH 6/9] livepatch: create per-task consistency model
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

Add a basic per-task consistency model.  This is the foundation which
will eventually enable us to patch those ~10% of security patches which
change function prototypes and/or data semantics.

When a patch is enabled, livepatch enters into a transition state where
tasks are converging from the old universe to the new universe.  If a
given task isn't using any of the patched functions, it's switched to
the new universe.  Once all the tasks have been converged to the new
universe, patching is complete.

The same sequence occurs when a patch is disabled, except the tasks
converge from the new universe to the old universe.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition.  Only a single patch (the topmost patch on the stack)
can be in transition at a given time.  A patch can remain in the
transition state indefinitely, if any of the tasks are stuck in the
previous universe.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress.  Then all the tasks will attempt to
converge back to the original universe.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 include/linux/livepatch.h     |  18 ++-
 include/linux/sched.h         |   3 +
 kernel/fork.c                 |   2 +
 kernel/livepatch/Makefile     |   2 +-
 kernel/livepatch/core.c       |  71 ++++++----
 kernel/livepatch/patch.c      |  34 ++++-
 kernel/livepatch/patch.h      |   1 +
 kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
 kernel/livepatch/transition.h |  16 +++
 kernel/sched/core.c           |   2 +
 10 files changed, 423 insertions(+), 26 deletions(-)
 create mode 100644 kernel/livepatch/transition.c
 create mode 100644 kernel/livepatch/transition.h

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 0e65b4d..b8c2f15 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -40,6 +40,7 @@
  * @old_size:	size of the old function
  * @new_size:	size of the new function
  * @patched:	the func has been added to the klp_ops list
+ * @transition:	the func is currently being applied or reverted
  */
 struct klp_func {
 	/* external */
@@ -60,6 +61,7 @@ struct klp_func {
 	struct list_head stack_node;
 	unsigned long old_size, new_size;
 	int patched;
+	int transition;
 };
 
 /**
@@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
 extern int klp_enable_patch(struct klp_patch *);
 extern int klp_disable_patch(struct klp_patch *);
 
-#endif /* CONFIG_LIVEPATCH */
+extern int klp_universe_goal;
+
+static inline void klp_update_task_universe(struct task_struct *t)
+{
+	/* corresponding smp_wmb() is in klp_set_universe_goal() */
+	smp_rmb();
+
+	t->klp_universe = klp_universe_goal;
+}
+
+#else /* !CONFIG_LIVEPATCH */
+
+static inline void klp_update_task_universe(struct task_struct *t) {}
+
+#endif /* !CONFIG_LIVEPATCH */
 
 #endif /* _LINUX_LIVEPATCH_H_ */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8db31ef..a95e59a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1701,6 +1701,9 @@ struct task_struct {
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 	unsigned long	task_state_change;
 #endif
+#ifdef CONFIG_LIVEPATCH
+	int klp_universe;
+#endif
 };
 
 /* Future-safe accessor for struct task_struct's cpus_allowed. */
diff --git a/kernel/fork.c b/kernel/fork.c
index 4dc2dda..1dcbebe 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -74,6 +74,7 @@
 #include <linux/uprobes.h>
 #include <linux/aio.h>
 #include <linux/compiler.h>
+#include <linux/livepatch.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	total_forks++;
 	spin_unlock(&current->sighand->siglock);
 	syscall_tracepoint_update(p);
+	klp_update_task_universe(p);
 	write_unlock_irq(&tasklist_lock);
 
 	proc_fork_connector(p);
diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
index e136dad..2b8bdb1 100644
--- a/kernel/livepatch/Makefile
+++ b/kernel/livepatch/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
 
-livepatch-objs := core.o patch.o
+livepatch-objs := core.o patch.o transition.o
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 85d4ef7..790dc10 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -28,14 +28,17 @@
 #include <linux/kallsyms.h>
 
 #include "patch.h"
+#include "transition.h"
 
 /*
- * The klp_mutex protects the global lists and state transitions of any
- * structure reachable from them.  References to any structure must be obtained
- * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
- * ensure it gets consistent data).
+ * The klp_mutex is a coarse lock which serializes access to klp data.  All
+ * accesses to klp-related variables and structures must have mutex protection,
+ * except within the following functions which carefully avoid the need for it:
+ *
+ * - klp_ftrace_handler()
+ * - klp_update_task_universe()
  */
-static DEFINE_MUTEX(klp_mutex);
+DEFINE_MUTEX(klp_mutex);
 
 static LIST_HEAD(klp_patches);
 
@@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
 	mutex_unlock(&module_mutex);
 }
 
-/* klp_mutex must be held by caller */
 static bool klp_is_patch_registered(struct klp_patch *patch)
 {
 	struct klp_patch *mypatch;
@@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,
 
 static int __klp_disable_patch(struct klp_patch *patch)
 {
-	struct klp_object *obj;
+	if (klp_transition_patch)
+		return -EBUSY;
 
 	/* enforce stacking: only the last enabled patch can be disabled */
 	if (!list_is_last(&patch->list, &klp_patches) &&
 	    list_next_entry(patch, list)->enabled)
 		return -EBUSY;
 
-	pr_notice("disabling patch '%s'\n", patch->mod->name);
-
-	for (obj = patch->objs; obj->funcs; obj++)
-		if (obj->patched)
-			klp_unpatch_object(obj);
+	klp_init_transition(patch, KLP_UNIVERSE_NEW);
+	klp_start_transition(KLP_UNIVERSE_OLD);
+	klp_try_complete_transition();
 
 	patch->enabled = 0;
 
@@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
 	struct klp_object *obj;
 	int ret;
 
+	if (klp_transition_patch)
+		return -EBUSY;
+
 	if (WARN_ON(patch->enabled))
 		return -EINVAL;
 
@@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
 	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
 	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
 
-	pr_notice("enabling patch '%s'\n", patch->mod->name);
+	klp_init_transition(patch, KLP_UNIVERSE_OLD);
 
 	for (obj = patch->objs; obj->funcs; obj++) {
 		klp_find_object_module(obj);
@@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
 			continue;
 
 		ret = klp_patch_object(obj);
-		if (ret)
-			goto unregister;
+		if (ret) {
+			pr_warn("failed to enable patch '%s'\n",
+				patch->mod->name);
+
+			klp_unpatch_objects(patch);
+			klp_complete_transition();
+
+			return ret;
+		}
 	}
 
+	klp_start_transition(KLP_UNIVERSE_NEW);
+
+	klp_try_complete_transition();
+
 	patch->enabled = 1;
 
 	return 0;
-
-unregister:
-	WARN_ON(__klp_disable_patch(patch));
-	return ret;
 }
 
 /**
@@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
  * /sys/kernel/livepatch
  * /sys/kernel/livepatch/<patch>
  * /sys/kernel/livepatch/<patch>/enabled
+ * /sys/kernel/livepatch/<patch>/transition
  * /sys/kernel/livepatch/<patch>/<object>
  * /sys/kernel/livepatch/<patch>/<object>/<func>
  */
@@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 		goto err;
 	}
 
-	if (val) {
+	if (klp_transition_patch == patch) {
+		klp_reverse_transition();
+	} else if (val) {
 		ret = __klp_enable_patch(patch);
 		if (ret)
 			goto err;
@@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
 	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
 }
 
+static ssize_t transition_show(struct kobject *kobj,
+			       struct kobj_attribute *attr, char *buf)
+{
+	struct klp_patch *patch;
+
+	patch = container_of(kobj, struct klp_patch, kobj);
+	return snprintf(buf, PAGE_SIZE-1, "%d\n",
+			klp_transition_patch == patch);
+}
+
 static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
+static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
 static struct attribute *klp_patch_attrs[] = {
 	&enabled_kobj_attr.attr,
+	&transition_kobj_attr.attr,
 	NULL
 };
 
@@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
 {
 	INIT_LIST_HEAD(&func->stack_node);
 	func->patched = 0;
+	func->transition = 0;
 
 	return kobject_init_and_add(&func->kobj, &klp_ktype_func,
 				    obj->kobj, func->old_name);
@@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
 	if (ret)
 		goto err;
 
-	if (!patch->enabled)
+	if (!patch->enabled && klp_transition_patch != patch)
 		return;
 
 	pr_notice("applying patch '%s' to loading module '%s'\n",
@@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
 	struct module *pmod = patch->mod;
 	struct module *mod = obj->mod;
 
-	if (!patch->enabled)
+	if (!patch->enabled && klp_transition_patch != patch)
 		goto free;
 
 	pr_notice("reverting patch '%s' on unloading module '%s'\n",
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
index 281fbca..f12256b 100644
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -24,6 +24,7 @@
 #include <linux/slab.h>
 
 #include "patch.h"
+#include "transition.h"
 
 static LIST_HEAD(klp_ops);
 
@@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
 	ops = container_of(fops, struct klp_ops, fops);
 
 	rcu_read_lock();
+
 	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
 				      stack_node);
-	rcu_read_unlock();
 
 	if (WARN_ON_ONCE(!func))
-		return;
+		goto unlock;
+
+	if (unlikely(func->transition)) {
+		/* corresponding smp_wmb() is in klp_init_transition() */
+		smp_rmb();
+
+		if (current->klp_universe == KLP_UNIVERSE_OLD) {
+			/*
+			 * Use the previously patched version of the function.
+			 * If no previous patches exist, use the original
+			 * function.
+			 */
+			func = list_entry_rcu(func->stack_node.next,
+					      struct klp_func, stack_node);
+
+			if (&func->stack_node == &ops->func_stack)
+				goto unlock;
+		}
+	}
 
 	klp_arch_set_pc(regs, (unsigned long)func->new_func);
+unlock:
+	rcu_read_unlock();
 }
 
 struct klp_ops *klp_find_ops(unsigned long old_addr)
@@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)
 
 	return 0;
 }
+
+void klp_unpatch_objects(struct klp_patch *patch)
+{
+	struct klp_object *obj;
+
+	for (obj = patch->objs; obj->funcs; obj++)
+		if (obj->patched)
+			klp_unpatch_object(obj);
+}
diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
index bb34bd3..1648259 100644
--- a/kernel/livepatch/patch.h
+++ b/kernel/livepatch/patch.h
@@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
 
 extern int klp_patch_object(struct klp_object *obj);
 extern void klp_unpatch_object(struct klp_object *obj);
+extern void klp_unpatch_objects(struct klp_patch *patch);
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
new file mode 100644
index 0000000..2630296
--- /dev/null
+++ b/kernel/livepatch/transition.c
@@ -0,0 +1,300 @@
+/*
+ * transition.c - Kernel Live Patching transition functions
+ *
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/cpu.h>
+#include <asm/stacktrace.h>
+#include "../sched/sched.h"
+
+#include "patch.h"
+#include "transition.h"
+
+static void klp_transition_work_fn(struct work_struct *);
+static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
+
+struct klp_patch *klp_transition_patch;
+
+int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
+
+static void klp_set_universe_goal(int universe)
+{
+	klp_universe_goal = universe;
+
+	/* corresponding smp_rmb() is in klp_update_task_universe() */
+	smp_wmb();
+}
+
+/*
+ * The transition to the universe goal is complete.  Clean up the data
+ * structures.
+ */
+void klp_complete_transition(void)
+{
+	struct klp_object *obj;
+	struct klp_func *func;
+
+	for (obj = klp_transition_patch->objs; obj->funcs; obj++)
+		for (func = obj->funcs; func->old_name; func++)
+			func->transition = 0;
+
+	klp_transition_patch = NULL;
+}
+
+static int klp_stacktrace_address_verify_func(struct klp_func *func,
+					      unsigned long address)
+{
+	unsigned long func_addr, func_size;
+
+	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
+		 /* check the to-be-unpatched function (the func itself) */
+		func_addr = (unsigned long)func->new_func;
+		func_size = func->new_size;
+	} else {
+		/* check the to-be-patched function (previous func) */
+		struct klp_ops *ops;
+
+		ops = klp_find_ops(func->old_addr);
+
+		if (list_is_singular(&ops->func_stack)) {
+			/* original function */
+			func_addr = func->old_addr;
+			func_size = func->old_size;
+		} else {
+			/* previously patched function */
+			struct klp_func *prev;
+
+			prev = list_next_entry(func, stack_node);
+			func_addr = (unsigned long)prev->new_func;
+			func_size = prev->new_size;
+		}
+	}
+
+	if (address >= func_addr && address < func_addr + func_size)
+		return -1;
+
+	return 0;
+}
+
+/*
+ * Determine whether the given return address on the stack is within a
+ * to-be-patched or to-be-unpatched function.
+ */
+static void klp_stacktrace_address_verify(void *data, unsigned long address,
+					  int reliable)
+{
+	struct klp_object *obj;
+	struct klp_func *func;
+	int *ret = data;
+
+	if (*ret)
+		return;
+
+	for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
+		if (!obj->patched)
+			continue;
+		for (func = obj->funcs; func->old_name; func++) {
+			if (klp_stacktrace_address_verify_func(func, address)) {
+				*ret = -1;
+				return;
+			}
+		}
+	}
+}
+
+static int klp_stacktrace_stack(void *data, char *name)
+{
+	return 0;
+}
+
+static const struct stacktrace_ops klp_stacktrace_ops = {
+	.address = klp_stacktrace_address_verify,
+	.stack = klp_stacktrace_stack,
+	.walk_stack = print_context_stack_bp,
+};
+
+/*
+ * Try to safely transition a task to the universe goal.  If the task is
+ * currently running or is sleeping on a to-be-patched or to-be-unpatched
+ * function, return false.
+ */
+static bool klp_transition_task(struct task_struct *t)
+{
+	struct rq *rq;
+	unsigned long flags;
+	int ret;
+	bool success = false;
+
+	if (t->klp_universe == klp_universe_goal)
+		return true;
+
+	rq = task_rq_lock(t, &flags);
+
+	if (task_running(rq, t) && t != current) {
+		pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
+			 t->comm);
+		goto done;
+	}
+
+	ret = 0;
+	dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
+	if (ret) {
+		pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
+			 __func__, t->pid, t->comm);
+		goto done;
+	}
+
+	klp_update_task_universe(t);
+
+	success = true;
+done:
+	task_rq_unlock(rq, t, &flags);
+	return success;
+}
+
+/*
+ * Try to transition all tasks to the universe goal.  If any tasks are still
+ * stuck in the original universe, schedule a retry.
+ */
+void klp_try_complete_transition(void)
+{
+	unsigned int cpu;
+	struct task_struct *g, *t;
+	bool complete = true;
+
+	/* try to transition all normal tasks */
+	read_lock(&tasklist_lock);
+	for_each_process_thread(g, t)
+		if (!klp_transition_task(t))
+			complete = false;
+	read_unlock(&tasklist_lock);
+
+	/* try to transition the idle "swapper" tasks */
+	get_online_cpus();
+	for_each_online_cpu(cpu)
+		if (!klp_transition_task(idle_task(cpu)))
+			complete = false;
+	put_online_cpus();
+
+	/* if not complete, try again later */
+	if (!complete) {
+		schedule_delayed_work(&klp_transition_work,
+				      round_jiffies_relative(HZ));
+		return;
+	}
+
+	/* success! unpatch obsolete functions and do some cleanup */
+
+	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
+		klp_unpatch_objects(klp_transition_patch);
+
+		/* prevent ftrace handler from reading old func->transition */
+		synchronize_rcu();
+	}
+
+	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
+		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
+							  "unpatching");
+
+	klp_complete_transition();
+}
+
+static void klp_transition_work_fn(struct work_struct *work)
+{
+	mutex_lock(&klp_mutex);
+
+	if (klp_transition_patch)
+		klp_try_complete_transition();
+
+	mutex_unlock(&klp_mutex);
+}
+
+/*
+ * Start the transition to the specified universe so tasks can begin switching
+ * to it.
+ */
+void klp_start_transition(int universe)
+{
+	if (WARN_ON(klp_universe_goal == universe))
+		return;
+
+	pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
+		  universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
+
+	klp_set_universe_goal(universe);
+}
+
+/*
+ * Can be called in the middle of an existing transition to reverse the
+ * direction of the universe goal.  This can be done to effectively cancel an
+ * existing enable or disable operation if there are any tasks which are stuck
+ * in the original universe.
+ */
+void klp_reverse_transition(void)
+{
+	struct klp_patch *patch = klp_transition_patch;
+
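+	/* KLP_UNIVERSE_OLD == 0, KLP_UNIVERSE_NEW == 1, so '!' flips the goal */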
+	klp_start_transition(!klp_universe_goal);
+	klp_try_complete_transition();
+
+	patch->enabled = !patch->enabled;
+}
+
+/*
+ * Reset the universe goal and all tasks to the starting universe, and set
+ * all func->transition flags to 1 to prepare for patching.
+ */
+void klp_init_transition(struct klp_patch *patch, int universe)
+{
+	struct task_struct *g, *t;
+	unsigned int cpu;
+	struct klp_object *obj;
+	struct klp_func *func;
+
+	klp_transition_patch = patch;
+
+	/*
+	 * If the previous transition was in the opposite direction, we may
+	 * already be in the requested initial universe.
+	 */
+	if (klp_universe_goal == universe)
+		goto init_funcs;
+
+	klp_set_universe_goal(universe);
+
+	/* init all normal task universes */
+	read_lock(&tasklist_lock);
+	for_each_process_thread(g, t)
+		klp_update_task_universe(t);
+	read_unlock(&tasklist_lock);
+
+	/* init all idle "swapper" task universes */
+	get_online_cpus();
+	for_each_online_cpu(cpu)
+		klp_update_task_universe(idle_task(cpu));
+	put_online_cpus();
+
+init_funcs:
+	/* corresponding smp_rmb() is in klp_ftrace_handler() */
+	smp_wmb();
+
+	for (obj = patch->objs; obj->funcs; obj++)
+		for (func = obj->funcs; func->old_name; func++)
+			func->transition = 1;
+}
diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
new file mode 100644
index 0000000..ba9a55c
--- /dev/null
+++ b/kernel/livepatch/transition.h
@@ -0,0 +1,16 @@
+#include <linux/livepatch.h>
+
+enum {
+	KLP_UNIVERSE_UNDEFINED = -1,
+	KLP_UNIVERSE_OLD,
+	KLP_UNIVERSE_NEW,
+};
+
+extern struct mutex klp_mutex;
+extern struct klp_patch *klp_transition_patch;
+
+extern void klp_init_transition(struct klp_patch *patch, int universe);
+extern void klp_start_transition(int universe);
+extern void klp_reverse_transition(void);
+extern void klp_try_complete_transition(void);
+extern void klp_complete_transition(void);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d91e6..7b877f4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -74,6 +74,7 @@
 #include <linux/binfmts.h>
 #include <linux/context_tracking.h>
 #include <linux/compiler.h>
+#include <linux/livepatch.h>
 
 #include <asm/switch_to.h>
 #include <asm/tlb.h>
@@ -4601,6 +4602,7 @@ void init_idle(struct task_struct *idle, int cpu)
 #if defined(CONFIG_SMP)
 	sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
 #endif
+	klp_update_task_universe(idle);
 }
 
 int cpuset_cpumask_can_shrink(const struct cpumask *cur,
-- 
2.1.0



* [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status
  2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
                   ` (5 preceding siblings ...)
  2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
@ 2015-02-09 17:31 ` Josh Poimboeuf
  2015-02-10 18:47   ` Jiri Slaby
  2015-02-09 17:31 ` [RFC PATCH 8/9] livepatch: allow patch modules to be removed Josh Poimboeuf
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC (permalink / raw)
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

Expose the per-task klp_universe value so users can determine which
tasks are holding up completion of a patching operation.
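
For example (an illustrative session; "mydaemon" stands in for any
process, and per transition.h the values are 0 for KLP_UNIVERSE_OLD,
1 for KLP_UNIVERSE_NEW, and -1 for a task that has not been through a
transition yet):

  $ cat /proc/$(pidof mydaemon)/universe
  0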

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 fs/proc/base.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 3f3d7ae..b9fe6b5 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2528,6 +2528,14 @@ static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns,
 	return err;
 }
 
+#ifdef CONFIG_LIVEPATCH
+static int proc_pid_klp_universe(struct seq_file *m, struct pid_namespace *ns,
+				 struct pid *pid, struct task_struct *task)
+{
+	return seq_printf(m, "%d\n", task->klp_universe);
+}
+#endif /* CONFIG_LIVEPATCH */
+
 /*
  * Thread groups
  */
@@ -2628,6 +2636,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	REG("timers",	  S_IRUGO, proc_timers_operations),
 #endif
+#ifdef CONFIG_LIVEPATCH
+	ONE("universe", S_IRUGO, proc_pid_klp_universe),
+#endif
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
-- 
2.1.0



* [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
                   ` (6 preceding siblings ...)
  2015-02-09 17:31 ` [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status Josh Poimboeuf
@ 2015-02-09 17:31 ` Josh Poimboeuf
  2015-02-10 19:02   ` Jiri Slaby
  2015-02-09 17:31 ` [RFC PATCH 9/9] livepatch: update task universe when exiting kernel Josh Poimboeuf
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC (permalink / raw)
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

Now that we have a consistency model we can detect when unpatching is
complete and the patch module can be safely removed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 kernel/livepatch/core.c       | 25 ++++---------------------
 kernel/livepatch/transition.c |  3 +++
 2 files changed, 7 insertions(+), 21 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 790dc10..e572523 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -352,6 +352,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
 	    !list_prev_entry(patch, list)->enabled)
 		return -EBUSY;
 
+	if (!try_module_get(patch->mod))
+		return -ENODEV;
+
 	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
 	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
 
@@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
 
 static void klp_kobj_release_patch(struct kobject *kobj)
 {
-	/*
-	 * Once we have a consistency model we'll need to module_put() the
-	 * patch module here.  See klp_register_patch() for more details.
-	 */
 }
 
 static struct kobj_type klp_ktype_patch = {
@@ -715,29 +714,13 @@ EXPORT_SYMBOL_GPL(klp_unregister_patch);
  */
 int klp_register_patch(struct klp_patch *patch)
 {
-	int ret;
-
 	if (!klp_initialized())
 		return -ENODEV;
 
 	if (!patch || !patch->mod)
 		return -EINVAL;
 
-	/*
-	 * A reference is taken on the patch module to prevent it from being
-	 * unloaded.  Right now, we don't allow patch modules to unload since
-	 * there is currently no method to determine if a thread is still
-	 * running in the patched code contained in the patch module once
-	 * the ftrace registration is successful.
-	 */
-	if (!try_module_get(patch->mod))
-		return -ENODEV;
-
-	ret = klp_init_patch(patch);
-	if (ret)
-		module_put(patch->mod);
-
-	return ret;
+	return klp_init_patch(patch);
 }
 EXPORT_SYMBOL_GPL(klp_register_patch);
 
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 2630296..20fafd2 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -54,6 +54,9 @@ void klp_complete_transition(void)
 		for (func = obj->funcs; func->old_name; func++)
 			func->transition = 0;
 
+	if (klp_universe_goal == KLP_UNIVERSE_OLD)
+		module_put(klp_transition_patch->mod);
+
 	klp_transition_patch = NULL;
 }
 
-- 
2.1.0



* [RFC PATCH 9/9] livepatch: update task universe when exiting kernel
  2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
                   ` (7 preceding siblings ...)
  2015-02-09 17:31 ` [RFC PATCH 8/9] livepatch: allow patch modules to be removed Josh Poimboeuf
@ 2015-02-09 17:31 ` Josh Poimboeuf
  2015-02-16 10:16   ` Jiri Slaby
  2015-02-09 23:15 ` [RFC PATCH 0/9] livepatch: consistency model Jiri Kosina
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-09 17:31 UTC (permalink / raw)
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

Update a task's universe when returning from a system call or user
space interrupt, or after handling a signal.

This greatly increases the chances of a patch operation succeeding.  If
a task is I/O bound, it can switch universes when returning from a
system call.  If a task is CPU bound, it can switch universes when
returning from an interrupt.  If a task is sleeping on a to-be-patched
function, the user can send SIGSTOP and SIGCONT to force it to switch.

Since the idle "swapper" tasks don't ever exit the kernel, they're
updated from within the idle loop.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/include/asm/thread_info.h |  4 +++-
 arch/x86/kernel/signal.c           |  4 ++++
 include/linux/livepatch.h          |  2 ++
 kernel/livepatch/transition.c      | 15 +++++++++++++++
 kernel/sched/idle.c                |  4 ++++
 5 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 547e344..4e46d36 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -78,6 +78,7 @@ struct thread_info {
 #define TIF_MCE_NOTIFY		10	/* notify userspace of an MCE */
 #define TIF_USER_RETURN_NOTIFY	11	/* notify kernel of userspace return */
 #define TIF_UPROBE		12	/* breakpointed or singlestepping */
+#define TIF_KLP_NEED_UPDATE	13	/* pending live patching update */
 #define TIF_NOTSC		16	/* TSC is not accessible in userland */
 #define TIF_IA32		17	/* IA32 compatibility process */
 #define TIF_FORK		18	/* ret_from_fork */
@@ -102,6 +103,7 @@ struct thread_info {
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_MCE_NOTIFY		(1 << TIF_MCE_NOTIFY)
 #define _TIF_USER_RETURN_NOTIFY	(1 << TIF_USER_RETURN_NOTIFY)
+#define _TIF_KLP_NEED_UPDATE	(1 << TIF_KLP_NEED_UPDATE)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_NOTSC		(1 << TIF_NOTSC)
 #define _TIF_IA32		(1 << TIF_IA32)
@@ -141,7 +143,7 @@ struct thread_info {
 /* Only used for 64 bit */
 #define _TIF_DO_NOTIFY_MASK						\
 	(_TIF_SIGPENDING | _TIF_MCE_NOTIFY | _TIF_NOTIFY_RESUME |	\
-	 _TIF_USER_RETURN_NOTIFY | _TIF_UPROBE)
+	 _TIF_USER_RETURN_NOTIFY | _TIF_UPROBE | _TIF_KLP_NEED_UPDATE)
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW							\
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index ed37a76..1d4b8e6 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -23,6 +23,7 @@
 #include <linux/user-return-notifier.h>
 #include <linux/uprobes.h>
 #include <linux/context_tracking.h>
+#include <linux/livepatch.h>
 
 #include <asm/processor.h>
 #include <asm/ucontext.h>
@@ -760,6 +761,9 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
 	if (thread_info_flags & _TIF_USER_RETURN_NOTIFY)
 		fire_user_return_notifiers();
 
+	if (unlikely(thread_info_flags & _TIF_KLP_NEED_UPDATE))
+		klp_update_task_universe(current);
+
 	user_enter();
 }
 
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index b8c2f15..14f6a96 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -134,6 +134,8 @@ extern int klp_universe_goal;
 
 static inline void klp_update_task_universe(struct task_struct *t)
 {
+	clear_tsk_thread_flag(t, TIF_KLP_NEED_UPDATE);
+
 	/* corresponding smp_wmb() is in klp_set_universe_goal() */
 	smp_rmb();
 
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 20fafd2..dac8ea5 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -234,6 +234,9 @@ static void klp_transition_work_fn(struct work_struct *work)
  */
 void klp_start_transition(int universe)
 {
+	struct task_struct *g, *t;
+	unsigned int cpu;
+
 	if (WARN_ON(klp_universe_goal == universe))
 		return;
 
@@ -241,6 +244,18 @@ void klp_start_transition(int universe)
 		  universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
 
 	klp_set_universe_goal(universe);
+
+	/* mark all normal tasks as needing a universe update */
+	read_lock(&tasklist_lock);
+	for_each_process_thread(g, t)
+		set_tsk_thread_flag(t, TIF_KLP_NEED_UPDATE);
+	read_unlock(&tasklist_lock);
+
+	/* mark all idle "swapper" tasks as needing a universe update */
+	get_online_cpus();
+	for_each_online_cpu(cpu)
+		set_tsk_thread_flag(idle_task(cpu), TIF_KLP_NEED_UPDATE);
+	put_online_cpus();
 }
 
 /*
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index c47fce7..c1390b6 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -7,6 +7,7 @@
 #include <linux/tick.h>
 #include <linux/mm.h>
 #include <linux/stackprotector.h>
+#include <linux/livepatch.h>
 
 #include <asm/tlb.h>
 
@@ -250,6 +251,9 @@ static void cpu_idle_loop(void)
 
 		sched_ttwu_pending();
 		schedule_preempt_disabled();
+
+		if (unlikely(test_thread_flag(TIF_KLP_NEED_UPDATE)))
+			klp_update_task_universe(current);
 	}
 }
 
-- 
2.1.0



* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
                   ` (8 preceding siblings ...)
  2015-02-09 17:31 ` [RFC PATCH 9/9] livepatch: update task universe when exiting kernel Josh Poimboeuf
@ 2015-02-09 23:15 ` Jiri Kosina
  2015-02-10  3:05   ` Josh Poimboeuf
  2015-02-10  8:57 ` Jiri Kosina
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 106+ messages in thread
From: Jiri Kosina @ 2015-02-09 23:15 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> This patch set implements a livepatch consistency model, targeted for 3.21.
> Now that we have a solid livepatch code base, this is the biggest remaining
> missing piece.

Hi Josh,

first, thanks a lot for putting this together. From a cursory look it 
certainly seems to be a very solid base for future steps.

I am afraid I won't get to a proper review before the merge window 
concludes though. But after that it gets moved to the top of my TODO list.

> This code stems from the design proposal made by Vojtech [1] in November.  It
> makes live patching safer in general.  Specifically, it allows you to apply
> patches which change function prototypes.  It also lays the groundwork for
> future code changes which will enable data and data semantic changes.
> 
> It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
> checking with kGraft's per-task consistency.  When patching, tasks are
> carefully transitioned from the old universe to the new universe.  A task can
> only be switched to the new universe if it's not using a function that is to be
> patched or unpatched.  After all tasks have moved to the new universe, the
> patching process is complete.
> 
> How it transitions various tasks to the new universe:
> 
> - The stacks of all sleeping tasks are checked.  Each task that is not sleeping
>   on a to-be-patched function is switched.
> 
> - Other user tasks are handled by do_notify_resume() (see patch 9/9).  If a
>   task is I/O bound, it switches universes when returning from a system call.
>   If it's CPU bound, it switches when returning from an interrupt.  

Just one rather minor comment to this -- we can actually switch CPU-bound 
processes "immediately" when we notice they are running in userspace 
(assuming that we are also migrating them when they enter the 
kernel ... which doesn't seem to be implemented by this patchset, 
but that could be easily added at low cost).

Relying on IRQs is problematic, because you can have a CPU completely 
isolated from both the scheduler and IRQs (that's what realtime folks are 
doing routinely), so you don't see an IRQ on that particular CPU for ages.

How to detect whether a given CPU is running in userspace (without 
interfering with it too much by, say, sending a costly IPI) is rather 
tricky though. On kernels with CONFIG_CONTEXT_TRACKING we could make use 
of that feature, but my gut feeling is that most people keep that 
disabled.
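
For illustration, with CONFIG_CONTEXT_TRACKING=y it could look roughly
like the sketch below; klp_cpu_in_userspace() is a made-up name and the
exact accessors would need double-checking, so take it as an idea only:

#include <linux/context_tracking_state.h>

/* sketch: is @cpu currently executing in userspace? */
static bool klp_cpu_in_userspace(int cpu)
{
	return per_cpu(context_tracking.state, cpu) == IN_USER;
}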

Another alternative is what we are doing in kgraft with 
kgr_needs_lazy_migration(), but admittedly that's very far from being 
pretty.

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-09 23:15 ` [RFC PATCH 0/9] livepatch: consistency model Jiri Kosina
@ 2015-02-10  3:05   ` Josh Poimboeuf
  2015-02-10  7:21     ` Jiri Kosina
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10  3:05 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Tue, Feb 10, 2015 at 12:15:21AM +0100, Jiri Kosina wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > This patch set implements a livepatch consistency model, targeted for 3.21.
> > Now that we have a solid livepatch code base, this is the biggest remaining
> > missing piece.
> 
> Hi Josh,
> 
> first, thanks a lot for putting this together. From a cursory look it 
> certainly seems to be a very solid base for future steps.
> 
> I am afraid I won't get to a proper review before the merge window 
> concludes though. But after that it gets moved to the top of my TODO list.

No problem.  Sorry for the inconvenient timing...

> > This code stems from the design proposal made by Vojtech [1] in November.  It
> > makes live patching safer in general.  Specifically, it allows you to apply
> > patches which change function prototypes.  It also lays the groundwork for
> > future code changes which will enable data and data semantic changes.
> > 
> > It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
> > checking with kGraft's per-task consistency.  When patching, tasks are
> > carefully transitioned from the old universe to the new universe.  A task can
> > only be switched to the new universe if it's not using a function that is to be
> > patched or unpatched.  After all tasks have moved to the new universe, the
> > patching process is complete.
> > 
> > How it transitions various tasks to the new universe:
> > 
> > - The stacks of all sleeping tasks are checked.  Each task that is not sleeping
> >   on a to-be-patched function is switched.
> > 
> > - Other user tasks are handled by do_notify_resume() (see patch 9/9).  If a
> >   task is I/O bound, it switches universes when returning from a system call.
> >   If it's CPU bound, it switches when returning from an interrupt.  
> 
> Just one rather minor comment to this -- we can actually switch CPU-bound 
> processes "immediately" when we notice they are running in userspace 
> (assuming that we are also migrating them when they enter the 
> kernel ... which doesn't seem to be implemented by this patchset, 
> but that could be easily added at low cost).

We could, but I guess the trick is figuring out how to tell if the task
is in user space.  But anyway, I don't really see why it would be
necessary.

> Relying on IRQs is problematic, because you can have a CPU completely 
> isolated from both the scheduler and IRQs (that's what realtime folks are 
> doing routinely), so you don't see an IRQ on that particular CPU for ages.

It doesn't _rely_ on IRQs; it's just another tool in the kit to help
tasks converge quickly.  The front line of attack is backtrace checking
of sleeping tasks.  Then it uses system call switching and IRQs as the
next wave of attack, with signals as the last resort.  So you can still
fall back on sending signals if needed.

> How to detect whether a given CPU is running in userspace (without 
> interfering with it too much by, say, sending a costly IPI) is rather 
> tricky though. On kernels with CONFIG_CONTEXT_TRACKING we could make use 
> of that feature, but my gut feeling is that most people keep that 
> disabled.

Yeah, that seems to be related to nohz.  I think we'd have to have it
enabled 100% of the time on all CPUs, even when not patching.  Sounds
like a lot of unnecessary overhead (unless the user already has it
enabled on all CPUs).

> Another alternative is what we are doing in kgraft with
> kgr_needs_lazy_migration(), but admittedly that's very far from being
> pretty.

Hm, is it really safe to read a stack while the task could be writing to
it?

-- 
Josh


* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-10  3:05   ` Josh Poimboeuf
@ 2015-02-10  7:21     ` Jiri Kosina
  0 siblings, 0 replies; 106+ messages in thread
From: Jiri Kosina @ 2015-02-10  7:21 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> > How to detect whether a given CPU is running in userspace 
> > (without interfering with it too much by, say, sending a costly IPI) 
> > is rather tricky though. On kernels with CONFIG_CONTEXT_TRACKING we 
> > could make use of that feature, but my gut feeling is that most people 
> > keep that disabled.

> Yeah, that seems to be related to nohz.  I think we'd have to have it
> enabled 100% of the time on all CPUs, even when not patching.  Sounds
> like a lot of unnecessary overhead (unless the user already has it
> enabled on all CPUs).

Agreed, we could make use of it when it's enabled in the kernel config 
anyway, but it would be impractical for us to hard-require it.

> > Another alternative is what we are doing in kgraft with 
> > kgr_needs_lazy_migration(), but admittedly that's very far from being 
> > pretty.
> 
> Hm, is it really safe to read a stack while the task could be writing to
> it?

It might indeed look like that at first sight :) but let's look at the 
possible race scenarios:

(1) task is running in userspace when you start looking at its kernel 
    stack, and while you are examining it, it enters the kernel. That's 
    not a problem, because no matter what verdict kgr_needs_lazy_migration() 
    yields, the migration to the new universe happens during kernel entry 
    anyway

(2) task is actively running in kernelspace. There is no way for 
    print_context_stack() to produce that small a number of nr_entries. 
    The stack contents might be bogus due to the race, but the walk always 
    starts at a valid bp, which can't be that low.

(3) task is running in kernelspace, but is about to exit to userspace, and 
    looking at the kernel stack races with this. That's again not a 
    problem, because no matter what verdict kgr_needs_lazy_migration() 
    yields, the migration to the new universe happens during kernel exit 
    anyway

So I agree that this is ugly as hell, and depends on the architecture-specific 
implementation of print_context_stack(); but architectures are free to 
give up this optimization if it can't be used.

But yes, we should be able to come up with something better if we want to 
use this optimization upstream.
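
For reference, the core of it is roughly the following (paraphrased from
memory, not a verbatim copy of the kGraft code):

#include <linux/stacktrace.h>

/* a task sitting in userspace has (nearly) no kernel stack to walk */
static bool kgr_needs_lazy_migration(struct task_struct *p)
{
	unsigned long entries[3];
	struct stack_trace trace = {
		.max_entries	= 3,
		.entries	= entries,
	};

	save_stack_trace_tsk(p, &trace);

	/* more than a couple of entries means it really is in the kernel */
	return trace.nr_entries > 2;
}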

Thanks,

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
                   ` (9 preceding siblings ...)
  2015-02-09 23:15 ` [RFC PATCH 0/9] livepatch: consistency model Jiri Kosina
@ 2015-02-10  8:57 ` Jiri Kosina
  2015-02-10 14:43   ` Josh Poimboeuf
  2015-02-10 11:16 ` Masami Hiramatsu
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 106+ messages in thread
From: Jiri Kosina @ 2015-02-10  8:57 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> 2) As mentioned above, kthreads which are always sleeping on a patched function
>    will never transition to the new universe.  This is really a minor issue
>    (less than 1% of patches).  It's not necessarily something that needs to be
>    resolved with this patch set, but it would be good to have some discussion
>    about it regardless.
>    
>    To overcome this issue, I have 1/2 an idea: we could add some stack checking
>    code to the ftrace handler itself to transition the kthread to the new
>    universe after it re-enters the function it was originally sleeping on, if
>    the stack doesn't already have any other to-be-patched functions.
>    Combined with the klp_transition_work_fn()'s periodic stack checking of
>    sleeping tasks, that would handle most of the cases (except when trying to
>    patch the high-level thread_fn itself).
> 
>    But then how do you make the kthread wake up?  As far as I can tell,
>    wake_up_process() doesn't seem to work on a kthread (unless I messed up my
>    testing somehow).  What does kGraft do in this case?

wake_up_process() really should work for a (p->flags & PF_KTHREAD) 
task_struct. What was your testing scenario?
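
For reference, the canonical kthread main loop is something along these
lines ("should_do_work" is a made-up condition):

	while (!kthread_should_stop()) {
		set_current_state(TASK_INTERRUPTIBLE);
		if (!should_do_work())
			schedule();
		__set_current_state(TASK_RUNNING);
		/* ... do the work ... */
	}

and wake_up_process() on such a task does bring it back out of
schedule().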

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 5/9] sched: move task rq locking functions to sched.h
  2015-02-09 17:31 ` [RFC PATCH 5/9] sched: move task rq locking functions to sched.h Josh Poimboeuf
@ 2015-02-10 10:48   ` Masami Hiramatsu
  2015-02-10 14:54     ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Masami Hiramatsu @ 2015-02-10 10:48 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching, linux-kernel

(2015/02/10 2:31), Josh Poimboeuf wrote:
> Move task_rq_lock/unlock() to sched.h so they can be used elsewhere.
> The livepatch code needs to lock each task's rq in order to safely
> examine its stack and switch it to a new patch universe.

Hmm, why don't you just expose those (as externs in sched.h)?
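
I mean something like this (just a sketch) -- keep the definitions in
kernel/sched/core.c, drop the static, and add to sched.h:

extern struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags);
extern void task_rq_unlock(struct rq *rq, struct task_struct *p,
			   unsigned long *flags);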

Thank you,

> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  kernel/sched/core.c  | 32 --------------------------------
>  kernel/sched/sched.h | 33 +++++++++++++++++++++++++++++++++
>  2 files changed, 33 insertions(+), 32 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index b5797b7..78d91e6 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -326,44 +326,12 @@ static inline struct rq *__task_rq_lock(struct task_struct *p)
>  	}
>  }
>  
> -/*
> - * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
> - */
> -static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags)
> -	__acquires(p->pi_lock)
> -	__acquires(rq->lock)
> -{
> -	struct rq *rq;
> -
> -	for (;;) {
> -		raw_spin_lock_irqsave(&p->pi_lock, *flags);
> -		rq = task_rq(p);
> -		raw_spin_lock(&rq->lock);
> -		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
> -			return rq;
> -		raw_spin_unlock(&rq->lock);
> -		raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
> -
> -		while (unlikely(task_on_rq_migrating(p)))
> -			cpu_relax();
> -	}
> -}
> -
>  static void __task_rq_unlock(struct rq *rq)
>  	__releases(rq->lock)
>  {
>  	raw_spin_unlock(&rq->lock);
>  }
>  
> -static inline void
> -task_rq_unlock(struct rq *rq, struct task_struct *p, unsigned long *flags)
> -	__releases(rq->lock)
> -	__releases(p->pi_lock)
> -{
> -	raw_spin_unlock(&rq->lock);
> -	raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
> -}
> -
>  /*
>   * this_rq_lock - lock this runqueue and disable interrupts.
>   */
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 9a2a45c..ae514c9 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1542,6 +1542,39 @@ static inline void double_rq_unlock(struct rq *rq1, struct rq *rq2)
>  
>  #endif
>  
> +/*
> + * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
> + */
> +static inline struct rq *task_rq_lock(struct task_struct *p,
> +				      unsigned long *flags)
> +	__acquires(p->pi_lock)
> +	__acquires(rq->lock)
> +{
> +	struct rq *rq;
> +
> +	for (;;) {
> +		raw_spin_lock_irqsave(&p->pi_lock, *flags);
> +		rq = task_rq(p);
> +		raw_spin_lock(&rq->lock);
> +		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
> +			return rq;
> +		raw_spin_unlock(&rq->lock);
> +		raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
> +
> +		while (unlikely(task_on_rq_migrating(p)))
> +			cpu_relax();
> +	}
> +}
> +
> +static inline void task_rq_unlock(struct rq *rq, struct task_struct *p,
> +				  unsigned long *flags)
> +	__releases(rq->lock)
> +	__releases(p->pi_lock)
> +{
> +	raw_spin_unlock(&rq->lock);
> +	raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
> +}
> +
>  extern struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq);
>  extern struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq);
>  extern void print_cfs_stats(struct seq_file *m, int cpu);
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
@ 2015-02-10 10:58   ` Masami Hiramatsu
  2015-02-10 14:59     ` Josh Poimboeuf
  2015-02-10 15:59   ` Miroslav Benes
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 106+ messages in thread
From: Masami Hiramatsu @ 2015-02-10 10:58 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching, linux-kernel

(2015/02/10 2:31), Josh Poimboeuf wrote:
> Add a basic per-task consistency model.  This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
> 
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe.  If a
> given task isn't using any of the patched functions, it's switched to
> the new universe.  Once all the tasks have been converged to the new
> universe, patching is complete.
> 
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
> 
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition.  Only a single patch (the topmost patch on the stack)
> can be in transition at a given time.  A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
> 
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress.  Then all the tasks will attempt to
> converge back to the original universe.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  include/linux/livepatch.h     |  18 ++-
>  include/linux/sched.h         |   3 +
>  kernel/fork.c                 |   2 +
>  kernel/livepatch/Makefile     |   2 +-
>  kernel/livepatch/core.c       |  71 ++++++----
>  kernel/livepatch/patch.c      |  34 ++++-
>  kernel/livepatch/patch.h      |   1 +
>  kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
>  kernel/livepatch/transition.h |  16 +++
>  kernel/sched/core.c           |   2 +
>  10 files changed, 423 insertions(+), 26 deletions(-)
>  create mode 100644 kernel/livepatch/transition.c
>  create mode 100644 kernel/livepatch/transition.h
> 
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 0e65b4d..b8c2f15 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -40,6 +40,7 @@
>   * @old_size:	size of the old function
>   * @new_size:	size of the new function
>   * @patched:	the func has been added to the klp_ops list
> + * @transition:	the func is currently being applied or reverted
>   */
>  struct klp_func {
>  	/* external */
> @@ -60,6 +61,7 @@ struct klp_func {
>  	struct list_head stack_node;
>  	unsigned long old_size, new_size;
>  	int patched;
> +	int transition;
>  };
>  
>  /**
> @@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
>  extern int klp_enable_patch(struct klp_patch *);
>  extern int klp_disable_patch(struct klp_patch *);
>  
> -#endif /* CONFIG_LIVEPATCH */
> +extern int klp_universe_goal;
> +
> +static inline void klp_update_task_universe(struct task_struct *t)
> +{
> +	/* corresponding smp_wmb() is in klp_set_universe_goal() */
> +	smp_rmb();
> +
> +	t->klp_universe = klp_universe_goal;
> +}
> +
> +#else /* !CONFIG_LIVEPATCH */
> +
> +static inline void klp_update_task_universe(struct task_struct *t) {}
> +
> +#endif /* !CONFIG_LIVEPATCH */
>  
>  #endif /* _LINUX_LIVEPATCH_H_ */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8db31ef..a95e59a 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1701,6 +1701,9 @@ struct task_struct {
>  #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
>  	unsigned long	task_state_change;
>  #endif
> +#ifdef CONFIG_LIVEPATCH
> +	int klp_universe;
> +#endif
>  };
>  
>  /* Future-safe accessor for struct task_struct's cpus_allowed. */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 4dc2dda..1dcbebe 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -74,6 +74,7 @@
>  #include <linux/uprobes.h>
>  #include <linux/aio.h>
>  #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/pgalloc.h>
> @@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>  	total_forks++;
>  	spin_unlock(&current->sighand->siglock);
>  	syscall_tracepoint_update(p);
> +	klp_update_task_universe(p);
>  	write_unlock_irq(&tasklist_lock);
>  
>  	proc_fork_connector(p);
> diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
> index e136dad..2b8bdb1 100644
> --- a/kernel/livepatch/Makefile
> +++ b/kernel/livepatch/Makefile
> @@ -1,3 +1,3 @@
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o
>  
> -livepatch-objs := core.o patch.o
> +livepatch-objs := core.o patch.o transition.o
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 85d4ef7..790dc10 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -28,14 +28,17 @@
>  #include <linux/kallsyms.h>
>  
>  #include "patch.h"
> +#include "transition.h"
>  
>  /*
> - * The klp_mutex protects the global lists and state transitions of any
> - * structure reachable from them.  References to any structure must be obtained
> - * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
> - * ensure it gets consistent data).
> + * The klp_mutex is a coarse lock which serializes access to klp data.  All
> + * accesses to klp-related variables and structures must have mutex protection,
> + * except within the following functions which carefully avoid the need for it:
> + *
> + * - klp_ftrace_handler()
> + * - klp_update_task_universe()
>   */
> -static DEFINE_MUTEX(klp_mutex);
> +DEFINE_MUTEX(klp_mutex);
>  
>  static LIST_HEAD(klp_patches);
>  
> @@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
>  	mutex_unlock(&module_mutex);
>  }
>  
> -/* klp_mutex must be held by caller */
>  static bool klp_is_patch_registered(struct klp_patch *patch)
>  {
>  	struct klp_patch *mypatch;
> @@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,
>  
>  static int __klp_disable_patch(struct klp_patch *patch)
>  {
> -	struct klp_object *obj;
> +	if (klp_transition_patch)
> +		return -EBUSY;
>  
>  	/* enforce stacking: only the last enabled patch can be disabled */
>  	if (!list_is_last(&patch->list, &klp_patches) &&
>  	    list_next_entry(patch, list)->enabled)
>  		return -EBUSY;
>  
> -	pr_notice("disabling patch '%s'\n", patch->mod->name);
> -
> -	for (obj = patch->objs; obj->funcs; obj++)
> -		if (obj->patched)
> -			klp_unpatch_object(obj);
> +	klp_init_transition(patch, KLP_UNIVERSE_NEW);
> +	klp_start_transition(KLP_UNIVERSE_OLD);
> +	klp_try_complete_transition();
>  
>  	patch->enabled = 0;
>  
> @@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	struct klp_object *obj;
>  	int ret;
>  
> +	if (klp_transition_patch)
> +		return -EBUSY;
> +
>  	if (WARN_ON(patch->enabled))
>  		return -EINVAL;
>  
> @@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
>  	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
>  
> -	pr_notice("enabling patch '%s'\n", patch->mod->name);
> +	klp_init_transition(patch, KLP_UNIVERSE_OLD);
>  
>  	for (obj = patch->objs; obj->funcs; obj++) {
>  		klp_find_object_module(obj);
> @@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  			continue;
>  
>  		ret = klp_patch_object(obj);
> -		if (ret)
> -			goto unregister;
> +		if (ret) {
> +			pr_warn("failed to enable patch '%s'\n",
> +				patch->mod->name);
> +
> +			klp_unpatch_objects(patch);
> +			klp_complete_transition();
> +
> +			return ret;
> +		}
>  	}
>  
> +	klp_start_transition(KLP_UNIVERSE_NEW);
> +
> +	klp_try_complete_transition();
> +
>  	patch->enabled = 1;
>  
>  	return 0;
> -
> -unregister:
> -	WARN_ON(__klp_disable_patch(patch));
> -	return ret;
>  }
>  
>  /**
> @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
>   * /sys/kernel/livepatch
>   * /sys/kernel/livepatch/<patch>
>   * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/transition
>   * /sys/kernel/livepatch/<patch>/<object>
>   * /sys/kernel/livepatch/<patch>/<object>/<func>
>   */
> @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  		goto err;
>  	}
>  
> -	if (val) {
> +	if (klp_transition_patch == patch) {
> +		klp_reverse_transition();
> +	} else if (val) {
>  		ret = __klp_enable_patch(patch);
>  		if (ret)
>  			goto err;
> @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
>  	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
>  }
>  
> +static ssize_t transition_show(struct kobject *kobj,
> +			       struct kobj_attribute *attr, char *buf)
> +{
> +	struct klp_patch *patch;
> +
> +	patch = container_of(kobj, struct klp_patch, kobj);
> +	return snprintf(buf, PAGE_SIZE-1, "%d\n",
> +			klp_transition_patch == patch);
> +}
> +
>  static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
>  static struct attribute *klp_patch_attrs[] = {
>  	&enabled_kobj_attr.attr,
> +	&transition_kobj_attr.attr,
>  	NULL
>  };
>  
> @@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
>  {
>  	INIT_LIST_HEAD(&func->stack_node);
>  	func->patched = 0;
> +	func->transition = 0;
>  
>  	return kobject_init_and_add(&func->kobj, &klp_ktype_func,
>  				    obj->kobj, func->old_name);
> @@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
>  	if (ret)
>  		goto err;
>  
> -	if (!patch->enabled)
> +	if (!patch->enabled && klp_transition_patch != patch)
>  		return;
>  
>  	pr_notice("applying patch '%s' to loading module '%s'\n",
> @@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
>  	struct module *pmod = patch->mod;
>  	struct module *mod = obj->mod;
>  
> -	if (!patch->enabled)
> +	if (!patch->enabled && klp_transition_patch != patch)
>  		goto free;
>  
>  	pr_notice("reverting patch '%s' on unloading module '%s'\n",
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 281fbca..f12256b 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -24,6 +24,7 @@
>  #include <linux/slab.h>
>  
>  #include "patch.h"
> +#include "transition.h"
>  
>  static LIST_HEAD(klp_ops);
>  
> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  	ops = container_of(fops, struct klp_ops, fops);
>  
>  	rcu_read_lock();
> +
>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>  				      stack_node);
> -	rcu_read_unlock();
>  
>  	if (WARN_ON_ONCE(!func))
> -		return;
> +		goto unlock;
> +
> +	if (unlikely(func->transition)) {
> +		/* corresponding smp_wmb() is in klp_init_transition() */
> +		smp_rmb();
> +
> +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> +			/*
> +			 * Use the previously patched version of the function.
> +			 * If no previous patches exist, use the original
> +			 * function.
> +			 */
> +			func = list_entry_rcu(func->stack_node.next,
> +					      struct klp_func, stack_node);
> +
> +			if (&func->stack_node == &ops->func_stack)
> +				goto unlock;
> +		}
> +	}
>  
>  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> +	rcu_read_unlock();
>  }
>  
>  struct klp_ops *klp_find_ops(unsigned long old_addr)
> @@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)
>  
>  	return 0;
>  }
> +
> +void klp_unpatch_objects(struct klp_patch *patch)
> +{
> +	struct klp_object *obj;
> +
> +	for (obj = patch->objs; obj->funcs; obj++)
> +		if (obj->patched)
> +			klp_unpatch_object(obj);
> +}
> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> index bb34bd3..1648259 100644
> --- a/kernel/livepatch/patch.h
> +++ b/kernel/livepatch/patch.h
> @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
>  
>  extern int klp_patch_object(struct klp_object *obj);
>  extern void klp_unpatch_object(struct klp_object *obj);
> +extern void klp_unpatch_objects(struct klp_patch *patch);
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 0000000..2630296
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> @@ -0,0 +1,300 @@
> +/*
> + * transition.c - Kernel Live Patching transition functions
> + *
> + * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/cpu.h>
> +#include <asm/stacktrace.h>
> +#include "../sched/sched.h"
> +
> +#include "patch.h"
> +#include "transition.h"
> +
> +static void klp_transition_work_fn(struct work_struct *);
> +static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
> +
> +struct klp_patch *klp_transition_patch;
> +
> +int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
> +
> +static void klp_set_universe_goal(int universe)
> +{
> +	klp_universe_goal = universe;
> +
> +	/* corresponding smp_rmb() is in klp_update_task_universe() */
> +	smp_wmb();
> +}
> +
> +/*
> + * The transition to the universe goal is complete.  Clean up the data
> + * structures.
> + */
> +void klp_complete_transition(void)
> +{
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +
> +	for (obj = klp_transition_patch->objs; obj->funcs; obj++)
> +		for (func = obj->funcs; func->old_name; func++)
> +			func->transition = 0;
> +
> +	klp_transition_patch = NULL;
> +}
> +
> +static int klp_stacktrace_address_verify_func(struct klp_func *func,
> +					      unsigned long address)
> +{
> +	unsigned long func_addr, func_size;
> +
> +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> +		 /* check the to-be-unpatched function (the func itself) */
> +		func_addr = (unsigned long)func->new_func;
> +		func_size = func->new_size;
> +	} else {
> +		/* check the to-be-patched function (previous func) */
> +		struct klp_ops *ops;
> +
> +		ops = klp_find_ops(func->old_addr);
> +
> +		if (list_is_singular(&ops->func_stack)) {
> +			/* original function */
> +			func_addr = func->old_addr;
> +			func_size = func->old_size;
> +		} else {
> +			/* previously patched function */
> +			struct klp_func *prev;
> +
> +			prev = list_next_entry(func, stack_node);
> +			func_addr = (unsigned long)prev->new_func;
> +			func_size = prev->new_size;
> +		}
> +	}
> +
> +	if (address >= func_addr && address < func_addr + func_size)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +/*
> + * Determine whether the given return address on the stack is within a
> + * to-be-patched or to-be-unpatched function.
> + */
> +static void klp_stacktrace_address_verify(void *data, unsigned long address,
> +					  int reliable)
> +{
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +	int *ret = data;
> +
> +	if (*ret)
> +		return;
> +
> +	for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
> +		if (!obj->patched)
> +			continue;
> +		for (func = obj->funcs; func->old_name; func++) {
> +			if (klp_stacktrace_address_verify_func(func, address)) {
> +				*ret = -1;
> +				return;
> +			}
> +		}
> +	}
> +}
> +
> +static int klp_stacktrace_stack(void *data, char *name)
> +{
> +	return 0;
> +}
> +
> +static const struct stacktrace_ops klp_stacktrace_ops = {
> +	.address = klp_stacktrace_address_verify,
> +	.stack = klp_stacktrace_stack,
> +	.walk_stack = print_context_stack_bp,
> +};
> +
> +/*
> + * Try to safely transition a task to the universe goal.  If the task is
> + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> + * function, return false.
> + */
> +static bool klp_transition_task(struct task_struct *t)
> +{
> +	struct rq *rq;
> +	unsigned long flags;
> +	int ret;
> +	bool success = false;
> +
> +	if (t->klp_universe == klp_universe_goal)
> +		return true;
> +
> +	rq = task_rq_lock(t, &flags);
> +
> +	if (task_running(rq, t) && t != current) {
> +		pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> +			 t->comm);
> +		goto done;
> +	}

Let me confirm: this always skips running tasks, and klp retries the
check by using the delayed worker, correct?

Indeed, this can work if we retry for long enough...

Thank you,

> +
> +	ret = 0;
> +	dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
> +	if (ret) {
> +		pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
> +			 __func__, t->pid, t->comm);
> +		goto done;
> +	}
> +
> +	klp_update_task_universe(t);
> +
> +	success = true;
> +done:
> +	task_rq_unlock(rq, t, &flags);
> +	return success;
> +}
> +
> +/*
> + * Try to transition all tasks to the universe goal.  If any tasks are still
> + * stuck in the original universe, schedule a retry.
> + */
> +void klp_try_complete_transition(void)
> +{
> +	unsigned int cpu;
> +	struct task_struct *g, *t;
> +	bool complete = true;
> +
> +	/* try to transition all normal tasks */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(g, t)
> +		if (!klp_transition_task(t))
> +			complete = false;
> +	read_unlock(&tasklist_lock);
> +
> +	/* try to transition the idle "swapper" tasks */
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		if (!klp_transition_task(idle_task(cpu)))
> +			complete = false;
> +	put_online_cpus();
> +
> +	/* if not complete, try again later */
> +	if (!complete) {
> +		schedule_delayed_work(&klp_transition_work,
> +				      round_jiffies_relative(HZ));
> +		return;
> +	}
> +
> +	/* success! unpatch obsolete functions and do some cleanup */
> +
> +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> +		klp_unpatch_objects(klp_transition_patch);
> +
> +		/* prevent ftrace handler from reading old func->transition */
> +		synchronize_rcu();
> +	}
> +
> +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> +							  "unpatching");
> +
> +	klp_complete_transition();
> +}
> +
> +static void klp_transition_work_fn(struct work_struct *work)
> +{
> +	mutex_lock(&klp_mutex);
> +
> +	if (klp_transition_patch)
> +		klp_try_complete_transition();
> +
> +	mutex_unlock(&klp_mutex);
> +}
> +
> +/*
> + * Start the transition to the specified universe so tasks can begin switching
> + * to it.
> + */
> +void klp_start_transition(int universe)
> +{
> +	if (WARN_ON(klp_universe_goal == universe))
> +		return;
> +
> +	pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> +		  universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
> +
> +	klp_set_universe_goal(universe);
> +}
> +
> +/*
> + * Can be called in the middle of an existing transition to reverse the
> + * direction of the universe goal.  This can be done to effectively cancel an
> + * existing enable or disable operation if there are any tasks which are stuck
> + * in the original universe.
> + */
> +void klp_reverse_transition(void)
> +{
> +	struct klp_patch *patch = klp_transition_patch;
> +
> +	klp_start_transition(!klp_universe_goal);
> +	klp_try_complete_transition();
> +
> +	patch->enabled = !patch->enabled;
> +}
> +
> +/*
> + * Reset the universe goal and all tasks to the starting universe, and set all
> + * func->transition's to 1 to prepare for patching.
> + */
> +void klp_init_transition(struct klp_patch *patch, int universe)
> +{
> +	struct task_struct *g, *t;
> +	unsigned int cpu;
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +
> +	klp_transition_patch = patch;
> +
> +	/*
> +	 * If the previous transition was in the opposite direction, we may
> +	 * already be in the requested initial universe.
> +	 */
> +	if (klp_universe_goal == universe)
> +		goto init_funcs;
> +
> +	klp_set_universe_goal(universe);
> +
> +	/* init all normal task universes */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(g, t)
> +		klp_update_task_universe(t);
> +	read_unlock(&tasklist_lock);
> +
> +	/* init all idle "swapper" task universes */
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		klp_update_task_universe(idle_task(cpu));
> +	put_online_cpus();
> +
> +init_funcs:
> +	/* corresponding smp_rmb() is in klp_ftrace_handler() */
> +	smp_wmb();
> +
> +	for (obj = patch->objs; obj->funcs; obj++)
> +		for (func = obj->funcs; func->old_name; func++)
> +			func->transition = 1;
> +}
> diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> new file mode 100644
> index 0000000..ba9a55c
> --- /dev/null
> +++ b/kernel/livepatch/transition.h
> @@ -0,0 +1,16 @@
> +#include <linux/livepatch.h>
> +
> +enum {
> +	KLP_UNIVERSE_UNDEFINED = -1,
> +	KLP_UNIVERSE_OLD,
> +	KLP_UNIVERSE_NEW,
> +};
> +
> +extern struct mutex klp_mutex;
> +extern struct klp_patch *klp_transition_patch;
> +
> +extern void klp_init_transition(struct klp_patch *patch, int universe);
> +extern void klp_start_transition(int universe);
> +extern void klp_reverse_transition(void);
> +extern void klp_try_complete_transition(void);
> +extern void klp_complete_transition(void);
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d91e6..7b877f4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -74,6 +74,7 @@
>  #include <linux/binfmts.h>
>  #include <linux/context_tracking.h>
>  #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>  
>  #include <asm/switch_to.h>
>  #include <asm/tlb.h>
> @@ -4601,6 +4602,7 @@ void init_idle(struct task_struct *idle, int cpu)
>  #if defined(CONFIG_SMP)
>  	sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
>  #endif
> +	klp_update_task_universe(idle);
>  }
>  
>  int cpuset_cpumask_can_shrink(const struct cpumask *cur,
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
                   ` (10 preceding siblings ...)
  2015-02-10  8:57 ` Jiri Kosina
@ 2015-02-10 11:16 ` Masami Hiramatsu
  2015-02-10 15:59   ` Josh Poimboeuf
  2015-02-13 10:14 ` Jiri Kosina
  2015-03-10 16:23 ` Josh Poimboeuf
  13 siblings, 1 reply; 106+ messages in thread
From: Masami Hiramatsu @ 2015-02-10 11:16 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching, linux-kernel

(2015/02/10 2:31), Josh Poimboeuf wrote:
> This patch set implements a livepatch consistency model, targeted for 3.21.
> Now that we have a solid livepatch code base, this is the biggest remaining
> missing piece.
> 
> This code stems from the design proposal made by Vojtech [1] in November.  It
> makes live patching safer in general.  Specifically, it allows you to apply
> patches which change function prototypes.  It also lays the groundwork for
> future code changes which will enable data and data semantic changes.

Interesting. How would you do that?

> It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
> checking with kGraft's per-task consistency.  When patching, tasks are
> carefully transitioned from the old universe to the new universe.  A task can
> only be switched to the new universe if it's not using a function that is to be
> patched or unpatched.  After all tasks have moved to the new universe, the
> patching process is complete.
> 
> How it transitions various tasks to the new universe:
> 
> - The stacks of all sleeping tasks are checked.  Each task that is not sleeping
>   on a to-be-patched function is switched.
> 
> - Other user tasks are handled by do_notify_resume() (see patch 9/9).  If a
>   task is I/O bound, it switches universes when returning from a system call.
>   If it's CPU bound, it switches when returning from an interrupt.  If it's
>   sleeping on a patched function, the user can send SIGSTOP and SIGCONT to
>   force it to switch upon return from the signal handler.

Ah, OK. So you can handle those without hooking switch_to :)

> 
> - Idle "swapper" tasks which are sleeping on a to-be-patched function can be
>   switched from within the outer idle loop.
> 
> - An interrupt handler will inherit the universe of the task it interrupts.
> 
> - kthreads which are sleeping on to-be-patched functions are not yet handled
>   (more on this below).
> 
> 
> I think this approach provides the best benefits of both kpatch and kGraft:
> 
> advantages vs kpatch:
> - no stop machine latency

Good! :)

> - higher patch success rate (can patch in-use functions)
> - patching failures are more predictable (primary failure mode is attempting to
>   patch a kthread which is sleeping forever on a patched function, more on this
>   below)
> 
> advantages vs kGraft:
> - less code complexity (don't have to hack up the code of all the different
>   kthreads)
> - less impact to processes (don't have to signal all sleeping tasks)
> 
> disadvantages vs kpatch:
> - no system-wide switch point (not really a functional limitation, just forces
>   the patch author to be more careful. but that's probably a good thing anyway)

OK, we must check carefully that the old function and the new function can coexist.

> My biggest concerns and questions related to this patch set are:
> 
> 1) To safely examine the task stacks, the transition code locks each task's rq
>    struct, which requires using the scheduler's internal rq locking functions.
>    It seems to work well, but I'm not sure if there's a cleaner way to safely
>    do stack checking without stop_machine().

We'd better ask scheduler people.

> 
> 2) As mentioned above, kthreads which are always sleeping on a patched function
>    will never transition to the new universe.  This is really a minor issue
>    (less than 1% of patches).  It's not necessarily something that needs to be
>    resolved with this patch set, but it would be good to have some discussion
>    about it regardless.
>    
>    To overcome this issue, I have 1/2 an idea: we could add some stack checking
>    code to the ftrace handler itself to transition the kthread to the new
>    universe after it re-enters the function it was originally sleeping on, if
>   the stack doesn't already have any other to-be-patched functions.
>    Combined with the klp_transition_work_fn()'s periodic stack checking of
>    sleeping tasks, that would handle most of the cases (except when trying to
>    patch the high-level thread_fn itself).

It makes sense to me. (I just did a similar thing.)

> 
>    But then how do you make the kthread wake up?  As far as I can tell,
>    wake_up_process() doesn't seem to work on a kthread (unless I messed up my
>    testing somehow).  What does kGraft do in this case?

Hmm, at a glance, the code itself can work on a kthread too...
Maybe you can also send your testing patch.

Thank you,

> 
> 
> [1] https://lkml.org/lkml/2014/11/7/354
> 
> 
> Josh Poimboeuf (9):
>   livepatch: simplify disable error path
>   livepatch: separate enabled and patched states
>   livepatch: move patching functions into patch.c
>   livepatch: get function sizes
>   sched: move task rq locking functions to sched.h
>   livepatch: create per-task consistency model
>   proc: add /proc/<pid>/universe to show livepatch status
>   livepatch: allow patch modules to be removed
>   livepatch: update task universe when exiting kernel
> 
>  arch/x86/include/asm/thread_info.h |   4 +-
>  arch/x86/kernel/signal.c           |   4 +
>  fs/proc/base.c                     |  11 ++
>  include/linux/livepatch.h          |  38 ++--
>  include/linux/sched.h              |   3 +
>  kernel/fork.c                      |   2 +
>  kernel/livepatch/Makefile          |   2 +-
>  kernel/livepatch/core.c            | 360 ++++++++++---------------------------
>  kernel/livepatch/patch.c           | 206 +++++++++++++++++++++
>  kernel/livepatch/patch.h           |  26 +++
>  kernel/livepatch/transition.c      | 318 ++++++++++++++++++++++++++++++++
>  kernel/livepatch/transition.h      |  16 ++
>  kernel/sched/core.c                |  34 +---
>  kernel/sched/idle.c                |   4 +
>  kernel/sched/sched.h               |  33 ++++
>  15 files changed, 747 insertions(+), 314 deletions(-)
>  create mode 100644 kernel/livepatch/patch.c
>  create mode 100644 kernel/livepatch/patch.h
>  create mode 100644 kernel/livepatch/transition.c
>  create mode 100644 kernel/livepatch/transition.h
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-10  8:57 ` Jiri Kosina
@ 2015-02-10 14:43   ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 14:43 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Tue, Feb 10, 2015 at 09:57:44AM +0100, Jiri Kosina wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> 
> > 2) As mentioned above, kthreads which are always sleeping on a patched function
> >    will never transition to the new universe.  This is really a minor issue
> >    (less than 1% of patches).  It's not necessarily something that needs to be
> >    resolved with this patch set, but it would be good to have some discussion
> >    about it regardless.
> >    
> >    To overcome this issue, I have 1/2 an idea: we could add some stack checking
> >    code to the ftrace handler itself to transition the kthread to the new
> >    universe after it re-enters the function it was originally sleeping on, if
> >    the stack doesn't already have any other to-be-patched functions.
> >    Combined with the klp_transition_work_fn()'s periodic stack checking of
> >    sleeping tasks, that would handle most of the cases (except when trying to
> >    patch the high-level thread_fn itself).
> > 
> >    But then how do you make the kthread wake up?  As far as I can tell,
> >    wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> >    testing somehow).  What does kGraft do in this case?
> 
> wake_up_process() really should work for (p->flags & PF_KTHREAD) 
> task_struct. What was your testing scenario?

Hm, I probably did something stupid.  I'll try it again :-)

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 5/9] sched: move task rq locking functions to sched.h
  2015-02-10 10:48   ` Masami Hiramatsu
@ 2015-02-10 14:54     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 14:54 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching, linux-kernel

On Tue, Feb 10, 2015 at 07:48:17PM +0900, Masami Hiramatsu wrote:
> (2015/02/10 2:31), Josh Poimboeuf wrote:
> > Move task_rq_lock/unlock() to sched.h so they can be used elsewhere.
> > The livepatch code needs to lock each task's rq in order to safely
> > examine its stack and switch it to a new patch universe.
> 
> Hmm, why don't you just expose (extern in sched.h) those?

One reason was that task_rq_unlock was already static inline, and I
didn't want to un-inline it.  But that's probably a dumb reason, since I
inlined task_rq_lock and it wasn't inlined before.

But also, there are some other inlined locking functions in sched.h:
double_lock_balance, double_rq_lock, double_lock_irq, etc.  So it just
seemed to "fit" better there.

Either way works for me.  I'll ask some scheduler people.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-10 10:58   ` Masami Hiramatsu
@ 2015-02-10 14:59     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 14:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching, linux-kernel

On Tue, Feb 10, 2015 at 07:58:30PM +0900, Masami Hiramatsu wrote:
> (2015/02/10 2:31), Josh Poimboeuf wrote:
> > +/*
> > + * Try to safely transition a task to the universe goal.  If the task is
> > + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> > + * function, return false.
> > + */
> > +static bool klp_transition_task(struct task_struct *t)
> > +{
> > +	struct rq *rq;
> > +	unsigned long flags;
> > +	int ret;
> > +	bool success = false;
> > +
> > +	if (t->klp_universe == klp_universe_goal)
> > +		return true;
> > +
> > +	rq = task_rq_lock(t, &flags);
> > +
> > +	if (task_running(rq, t) && t != current) {
> > +		pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> > +			 t->comm);
> > +		goto done;
> > +	}
> 
> Let me confirm that this always skips running tasks, and klp retries
> checking by using delayed worker, correct?

Correct.  Also, patch 9 of the series adds other ways to convert tasks,
using syscalls, irqs and signals.


-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
  2015-02-10 10:58   ` Masami Hiramatsu
@ 2015-02-10 15:59   ` Miroslav Benes
  2015-02-10 16:56     ` Josh Poimboeuf
  2015-02-10 19:27   ` Seth Jennings
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-10 15:59 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel


On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> Add a basic per-task consistency model.  This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
> 
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe.  If a
> given task isn't using any of the patched functions, it's switched to
> the new universe.  Once all the tasks have been converged to the new
> universe, patching is complete.
> 
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
> 
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition.  Only a single patch (the topmost patch on the stack)
> can be in transition at a given time.  A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
> 
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress.  Then all the tasks will attempt to
> converge back to the original universe.

Hi Josh,

first, thanks a lot for the great work. I'm starting to go through it and it's 
gonna take me some time to finish and send a complete review. Anyway, I 
suspect there is a possible race in the code. I'm still not sure, though. 
See below...

[...]

> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  	ops = container_of(fops, struct klp_ops, fops);
>  
>  	rcu_read_lock();
> +
>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>  				      stack_node);
> -	rcu_read_unlock();
>  
>  	if (WARN_ON_ONCE(!func))
> -		return;
> +		goto unlock;
> +
> +	if (unlikely(func->transition)) {
> +		/* corresponding smp_wmb() is in klp_init_transition() */
> +		smp_rmb();
> +
> +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> +			/*
> +			 * Use the previously patched version of the function.
> +			 * If no previous patches exist, use the original
> +			 * function.
> +			 */
> +			func = list_entry_rcu(func->stack_node.next,
> +					      struct klp_func, stack_node);
> +
> +			if (&func->stack_node == &ops->func_stack)
> +				goto unlock;
> +		}
> +	}
>  
>  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> +	rcu_read_unlock();
>  }

The problem is that there is no guarantee that the ftrace handler is called 
in an atomic context. Hence it could be preempted (if CONFIG_PREEMPT=y), 
and it could be preempted anywhere before rcu_read_lock (which disables 
preemption for CONFIG_PREEMPT). Ftrace often uses ftrace_ops_list_func as 
a callback, which calls the handlers with preemption disabled. But not 
always: with dynamic trampolines, ftrace calls the handlers directly, and 
preemption is not disabled.

So...

> +/*
> + * Try to transition all tasks to the universe goal.  If any tasks are still
> + * stuck in the original universe, schedule a retry.
> + */
> +void klp_try_complete_transition(void)
> +{
> +	unsigned int cpu;
> +	struct task_struct *g, *t;
> +	bool complete = true;
> +
> +	/* try to transition all normal tasks */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(g, t)
> +		if (!klp_transition_task(t))
> +			complete = false;
> +	read_unlock(&tasklist_lock);
> +
> +	/* try to transition the idle "swapper" tasks */
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		if (!klp_transition_task(idle_task(cpu)))
> +			complete = false;
> +	put_online_cpus();
> +
> +	/* if not complete, try again later */
> +	if (!complete) {
> +		schedule_delayed_work(&klp_transition_work,
> +				      round_jiffies_relative(HZ));
> +		return;
> +	}
> +
> +	/* success! unpatch obsolete functions and do some cleanup */
> +
> +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> +		klp_unpatch_objects(klp_transition_patch);
> +
> +		/* prevent ftrace handler from reading old func->transition */
> +		synchronize_rcu();
> +	}
> +
> +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> +							  "unpatching");
> +
> +	klp_complete_transition();
> +}

...synchronize_rcu() could be insufficient. There can still be some
process in our ftrace handler after the call.

Consider the following scenario:

When synchronize_rcu() is called, some process could have been preempted on 
some other cpu at the start of the ftrace handler, before rcu_read_lock(). 
synchronize_rcu() waits for the grace period to pass, but that does not 
mean anything for our process in the handler, because it is not in an rcu 
critical section. There is no guarantee that after synchronize_rcu() the 
process is out of the handler.

"Meanwhile" klp_try_complete_transition continues and calls 
klp_complete_transition. This clears func->transition flags. Now the 
process in the handler could be scheduled again. It reads the wrong value 
of func->transition and redirection to the wrong function is done.

What do you think? I hope I made myself clear.

There is a similar problem for dynamic trampolines in ftrace. You cannot 
remove them unless there is no process in the handler. I think rcu-tasks 
were merged a while ago for this purpose. However, ftrace does not use them 
yet, and I don't know whether we could exploit them to solve this issue. I 
need to think more about it.

Anyway thanks a lot!

Miroslav

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-10 11:16 ` Masami Hiramatsu
@ 2015-02-10 15:59   ` Josh Poimboeuf
  2015-02-10 17:29     ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 15:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching, linux-kernel

On Tue, Feb 10, 2015 at 08:16:59PM +0900, Masami Hiramatsu wrote:
> (2015/02/10 2:31), Josh Poimboeuf wrote:
> > This patch set implements a livepatch consistency model, targeted for 3.21.
> > Now that we have a solid livepatch code base, this is the biggest remaining
> > missing piece.
> > 
> > This code stems from the design proposal made by Vojtech [1] in November.  It
> > makes live patching safer in general.  Specifically, it allows you to apply
> > patches which change function prototypes.  It also lays the groundwork for
> > future code changes which will enable data and data semantic changes.
> 
> Interesting. How would you do that?

As Vojtech described in the earlier thread from November, there are
different approaches for changing data:

1. TRANSFORM_WORLD: stop the world, transform everything, resume

2. TRANSFORM_ON_ACCESS: transform data structures when you access them

I would add a third category (which is what we've been doing with
kpatch):

3. TRANSFORM_ON_CREATE: create new data structures created after a certain point
are the "v2" versions

I think approach 1 seems very tricky, if not impossible in many cases,
even if you're using stop_machine().  Right now we're focusing on
enabling approaches 2 and 3, since they seem more practical, don't
require stop_machine(), and are generally easier to get right.

With kpatch we've been using approach 3, with a lot of success.  Here's
how I would do it with livepatch:

As a prerequisite, we need shadow variables, which are a way to add
virtual fields to existing structs at runtime.  For an example, see:

   https://github.com/dynup/kpatch/blob/master/test/integration/shadow-newpid.patch

In that example, I added "newpid" to task_struct.  If it's only
something like locking semantics that are changing, you can just add a
"v2" field to the struct to specify that it's the 2nd version of the
struct.

When converting a patch to be used for livepatch, the patch author must
carefully look for data struct versioning changes.  It doesn't matter if
there's a new field, or if the semantics of using that data have changed.
Either way, the patch author must define a new version of the struct.

If a struct has changed, all patched functions need to be able to deal
with struct v1 or struct v2.  This is true for those functions which
access the structs as well as the functions which create them.

For example, a function which accesses the struct might change to:

  if (klp_shadow_has_field(obj, "v2"))
      /* access the struct the new way */
  else
      /* access the struct the old way */

A function which creates the struct might change to:

  struct foo *struct_create()
  {
     /* kmalloc and init struct here */

     if (klp_patching_complete())
         /* add v2 shadow fields */
  }


The klp_patching_complete() call is needed to prevent v1 functions from
accessing v2 data.  The creation/transformation of v2 structs shouldn't
occur until after the patching process is complete, and all tasks have
converged to the new universe.
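
For the record, a rough sketch of what such a shadow variable store could
look like (hypothetical names and layout, not the actual kpatch
implementation; the hash sizing, locking and lookup key are illustrative
only):

  #include <linux/hashtable.h>
  #include <linux/slab.h>
  #include <linux/spinlock.h>
  #include <linux/string.h>

  struct klp_shadow {
  	struct hlist_node node;
  	void *obj;		/* the object being shadowed */
  	const char *field;	/* virtual field name, e.g. "v2" */
  	void *data;		/* the shadow data itself */
  };

  static DEFINE_HASHTABLE(klp_shadow_hash, 12);
  static DEFINE_SPINLOCK(klp_shadow_lock);

  /* attach a virtual field to an existing object */
  int klp_shadow_attach(void *obj, const char *field, void *data)
  {
  	struct klp_shadow *shadow;

  	shadow = kzalloc(sizeof(*shadow), GFP_KERNEL);
  	if (!shadow)
  		return -ENOMEM;

  	shadow->obj = obj;
  	shadow->field = field;
  	shadow->data = data;

  	spin_lock(&klp_shadow_lock);
  	hash_add(klp_shadow_hash, &shadow->node, (unsigned long)obj);
  	spin_unlock(&klp_shadow_lock);

  	return 0;
  }

  /* does @obj have the virtual field @field? */
  bool klp_shadow_has_field(void *obj, const char *field)
  {
  	struct klp_shadow *shadow;
  	bool found = false;

  	spin_lock(&klp_shadow_lock);
  	hash_for_each_possible(klp_shadow_hash, shadow, node,
  			       (unsigned long)obj) {
  		if (shadow->obj == obj && !strcmp(shadow->field, field)) {
  			found = true;
  			break;
  		}
  	}
  	spin_unlock(&klp_shadow_lock);

  	return found;
  }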

> > disadvantages vs kpatch:
> > - no system-wide switch point (not really a functional limitation, just forces
> >   the patch author to be more careful. but that's probably a good thing anyway)
> 
> OK, we must check carefully that the old function and the new function can co-exist.

Agreed, and this requires the patch author to look carefully for data
version changes, as described above.  Which they should be doing
regardless.

> > My biggest concerns and questions related to this patch set are:
> > 
> > 1) To safely examine the task stacks, the transition code locks each task's rq
> >    struct, which requires using the scheduler's internal rq locking functions.
> >    It seems to work well, but I'm not sure if there's a cleaner way to safely
> >    do stack checking without stop_machine().
> 
> We'd better ask scheduler people.

Agreed, I will.

> > 2) As mentioned above, kthreads which are always sleeping on a patched function
> >    will never transition to the new universe.  This is really a minor issue
> >    (less than 1% of patches).  It's not necessarily something that needs to be
> >    resolved with this patch set, but it would be good to have some discussion
> >    about it regardless.
> >    
> >    To overcome this issue, I have 1/2 an idea: we could add some stack checking
> >    code to the ftrace handler itself to transition the kthread to the new
> >    universe after it re-enters the function it was originally sleeping on, if
> >    the stack doesn't already have any other to-be-patched functions.
> >    Combined with the klp_transition_work_fn()'s periodic stack checking of
> >    sleeping tasks, that would handle most of the cases (except when trying to
> >    patch the high-level thread_fn itself).
> 
> It makes sense to me. (I just did a similar thing.)
> 
> > 
> >    But then how do you make the kthread wake up?  As far as I can tell,
> >    wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> >    testing somehow).  What does kGraft do in this case?
> 
> Hmm, at a glance, the code itself can work on a kthread too...
> Maybe you can also send your testing patch.

Yeah, I probably messed it up.  I'll try it again :-)

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states
  2015-02-09 17:31 ` [RFC PATCH 2/9] livepatch: separate enabled and patched states Josh Poimboeuf
@ 2015-02-10 16:44   ` Jiri Slaby
  2015-02-10 17:21     ` Josh Poimboeuf
  2015-02-13 12:57   ` Miroslav Benes
  1 sibling, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-10 16:44 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Once we have a consistency model, patches and their objects will be
> enabled and disabled at different times.  For example, when a patch is
> disabled, its loaded objects' funcs can remain registered with ftrace
> indefinitely until the unpatching operation is complete and they're no
> longer in use.
> 
> It's less confusing if we give them different names: patches can be
> enabled or disabled; objects (and their funcs) can be patched or
> unpatched:
> 
> - Enabled means that a patch is logically enabled (but not necessarily
>   fully applied).
> 
> - Patched means that an object's funcs are registered with ftrace and
>   added to the klp_ops func stack.
> 
> Also, since these states are binary, represent them with boolean-type
> variables instead of enums.

So please do so: we have bool/true/false.
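
E.g., something like (a sketch, with the corresponding struct members
declared as bool):

	-	func->patched = 1;
	+	func->patched = true;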

-- 
js
suse labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-10 15:59   ` Miroslav Benes
@ 2015-02-10 16:56     ` Josh Poimboeuf
  2015-02-11 16:28       ` Miroslav Benes
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 16:56 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, Feb 10, 2015 at 04:59:17PM +0100, Miroslav Benes wrote:
> 
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> 
> > Add a basic per-task consistency model.  This is the foundation which
> > will eventually enable us to patch those ~10% of security patches which
> > change function prototypes and/or data semantics.
> > 
> > When a patch is enabled, livepatch enters into a transition state where
> > tasks are converging from the old universe to the new universe.  If a
> > given task isn't using any of the patched functions, it's switched to
> > the new universe.  Once all the tasks have been converged to the new
> > universe, patching is complete.
> > 
> > The same sequence occurs when a patch is disabled, except the tasks
> > converge from the new universe to the old universe.
> > 
> > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > is in transition.  Only a single patch (the topmost patch on the stack)
> > can be in transition at a given time.  A patch can remain in the
> > transition state indefinitely, if any of the tasks are stuck in the
> > previous universe.
> > 
> > A transition can be reversed and effectively canceled by writing the
> > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > the transition is in progress.  Then all the tasks will attempt to
> > converge back to the original universe.
> 
> Hi Josh,
> 
> first, thanks a lot for the great work. I'm starting to go through it and it's 
> gonna take me some time to finish and send a complete review.

I know there are a lot of details to look at, please take your time.  I
really appreciate your review.  (And everybody else's, for that matter
:-)

> > +	/* success! unpatch obsolete functions and do some cleanup */
> > +
> > +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > +		klp_unpatch_objects(klp_transition_patch);
> > +
> > +		/* prevent ftrace handler from reading old func->transition */
> > +		synchronize_rcu();
> > +	}
> > +
> > +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > +							  "unpatching");
> > +
> > +	klp_complete_transition();
> > +}
> 
> ...synchronize_rcu() could be insufficient. There can still be some
> process in our ftrace handler after the call.
> 
> Consider the following scenario:
> 
> When synchronize_rcu() is called, some process could have been preempted on 
> some other cpu at the start of the ftrace handler, before rcu_read_lock(). 
> synchronize_rcu() waits for the grace period to pass, but that does not 
> mean anything for our process in the handler, because it is not in an rcu 
> critical section. There is no guarantee that after synchronize_rcu() the 
> process is out of the handler.
> 
> "Meanwhile" klp_try_complete_transition continues and calls 
> klp_complete_transition. This clears func->transition flags. Now the 
> process in the handler could be scheduled again. It reads the wrong value 
> of func->transition and redirection to the wrong function is done.
> 
> What do you think? I hope I made myself clear.

You really made me think.  But I don't think there's a race here.

Consider the two separate cases, patching and unpatching:

1. patching has completed: klp_universe_goal and all tasks'
   klp_universes are at KLP_UNIVERSE_NEW.  In this case, the value of
   func->transition doesn't matter, because we want to use the func at
   the top of the stack, and if klp_universe is NEW, the ftrace handler
   will do that, regardless of the value of func->transition.  This is
   why I didn't do the rcu_synchronize() in this case.  But maybe you're
   not worried about this case anyway, I just described it for the sake
   of completeness :-)

2. unpatching has completed: klp_universe_goal and all tasks'
   klp_universes are at KLP_UNIVERSE_OLD.  In this case, the value of
   func->transition _does_ matter.  However, notice that
   klp_unpatch_objects() is called before synchronize_rcu().  That
   removes the "new" func from the klp_ops stack.  Since the ftrace
   handler accesses the list _after_ calling rcu_read_lock(), it will
   never see the "new" func, and thus func->transition will never be
   set.

   That said, I think there is a race where the WARN_ON_ONCE(!func)
   could trigger here, and it wouldn't be an error.  So I think I'll
   remove the warning.
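
   Roughly, a sketch of what I mean (the handler hunk quoted above, with
   the warning dropped):

	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
				      stack_node);
	if (!func)
		goto unlock;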

Does that make sense?

> There is a similar problem for dynamic trampolines in ftrace. You
> cannot remove them unless there is no process in the handler. I think
> rcu-tasks were merged a while ago for this purpose. However, ftrace
> does not use them yet, and I don't know whether we could exploit them
> to solve this issue. I need to think more about it.

Ok, sounds like that's an ftrace bug that could affect us.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states
  2015-02-10 16:44   ` Jiri Slaby
@ 2015-02-10 17:21     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 17:21 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, Feb 10, 2015 at 05:44:30PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > Once we have a consistency model, patches and their objects will be
> > enabled and disabled at different times.  For example, when a patch is
> > disabled, its loaded objects' funcs can remain registered with ftrace
> > indefinitely until the unpatching operation is complete and they're no
> > longer in use.
> > 
> > It's less confusing if we give them different names: patches can be
> > enabled or disabled; objects (and their funcs) can be patched or
> > unpatched:
> > 
> > - Enabled means that a patch is logically enabled (but not necessarily
> >   fully applied).
> > 
> > - Patched means that an object's funcs are registered with ftrace and
> >   added to the klp_ops func stack.
> > 
> > Also, since these states are binary, represent them with boolean-type
> > variables instead of enums.
> 
> So please do so: we have bool/true/false.

Will do, thanks.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-10 15:59   ` Josh Poimboeuf
@ 2015-02-10 17:29     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 17:29 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching, linux-kernel

On Tue, Feb 10, 2015 at 09:59:58AM -0600, Josh Poimboeuf wrote:
> On Tue, Feb 10, 2015 at 08:16:59PM +0900, Masami Hiramatsu wrote:
> > (2015/02/10 2:31), Josh Poimboeuf wrote:
> > > This patch set implements a livepatch consistency model, targeted for 3.21.
> > > Now that we have a solid livepatch code base, this is the biggest remaining
> > > missing piece.
> > > 
> > > This code stems from the design proposal made by Vojtech [1] in November.  It
> > > makes live patching safer in general.  Specifically, it allows you to apply
> > > patches which change function prototypes.  It also lays the groundwork for
> > > future code changes which will enable data and data semantic changes.
> > 
> > Interesting. How would you do that?
> 
> As Vojtech described in the earlier thread from November, there are
> different approaches for changing data:
> 
> 1. TRANSFORM_WORLD: stop the world, transform everything, resume
> 
> 2. TRANSFORM_ON_ACCESS: transform data structures when you access them
> 
> I would add a third category (which is what we've been doing with
> kpatch):
> 
> 3. TRANSFORM_ON_CREATE: create new data structures created after a certain point
> are the "v2" versions

Sorry, bad wording, I meant to say:

3. TRANSFORM_ON_CREATE: create new versions of the data structures when
   you create them

If that still doesn't make sense, hopefully the below explanation
clarifies what I mean :-)

> 
> I think approach 1 seems very tricky, if not impossible in many cases,
> even if you're using stop_machine().  Right now we're focusing on
> enabling approaches 2 and 3, since they seem more practical, don't
> require stop_machine(), and are generally easier to get right.
> 
> With kpatch we've been using approach 3, with a lot of success.  Here's
> how I would do it with livepatch:
> 
> As a prerequisite, we need shadow variables, which are a way to add
> virtual fields to existing structs at runtime.  For an example, see:
> 
>    https://github.com/dynup/kpatch/blob/master/test/integration/shadow-newpid.patch
> 
> In that example, I added "newpid" to task_struct.  If it's only
> something like locking semantics that are changing, you can just add a
> "v2" field to the struct to specify that it's the 2nd version of the
> struct.
> 
> When converting a patch to be used for livepatch, the patch author must
> carefully look for data struct versioning changes.  It doesn't matter if
> there's a new field, or if the semantics of using that data have changed.
> Either way, the patch author must define a new version of the struct.
> 
> If a struct has changed, all patched functions need to be able to deal
> with struct v1 or struct v2.  This is true for those functions which
> access the structs as well as the functions which create them.
> 
> For example, a function which accesses the struct might change to:
> 
>   if (klp_shadow_has_field(obj, "v2"))
>       /* access the struct the new way */
>   else
>       /* access the struct the old way */
> 
> A function which creates the struct might change to:
> 
>   struct foo *struct_create()
>   {
>      /* kmalloc and init struct here */
> 
>      if (klp_patching_complete())
>          /* add v2 shadow fields */
>   }
> 
> 
> The klp_patching_complete() call is needed to prevent v1 functions from
> accessing v2 data.  The creation/transformation of v2 structs shouldn't
> occur until after the patching process is complete, and all tasks have
> converged to the new universe.
[...]

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 3/9] livepatch: move patching functions into patch.c
  2015-02-09 17:31 ` [RFC PATCH 3/9] livepatch: move patching functions into patch.c Josh Poimboeuf
@ 2015-02-10 18:27   ` Jiri Slaby
  2015-02-10 18:50     ` Josh Poimboeuf
  2015-02-13 14:28   ` Miroslav Benes
  1 sibling, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-10 18:27 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Move functions related to the actual patching of functions and objects
> into a new patch.c file.
> 
> The only functional change is to remove the unnecessary
> WARN_ON(!klp_is_object_loaded()) check from klp_patch_object().
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -24,29 +24,10 @@
>  #include <linux/kernel.h>
>  #include <linux/mutex.h>
>  #include <linux/slab.h>
> -#include <linux/ftrace.h>
>  #include <linux/list.h>
>  #include <linux/kallsyms.h>
> -#include <linux/livepatch.h>

I don't understand: you define some functions declared there, yet you
remove the include? patch.h below is not enough. When somebody shuffles
the files again, we would have to fix this.

>  
> -/**
> - * struct klp_ops - structure for tracking registered ftrace ops structs
> - *
> - * A single ftrace_ops is shared between all enabled replacement functions
> - * (klp_func structs) which have the same old_addr.  This allows the switch
> - * between function versions to happen instantaneously by updating the klp_ops
> - * struct's func_stack list.  The winner is the klp_func at the top of the
> - * func_stack (front of the list).
> - *
> - * @node:	node for the global klp_ops list
> - * @func_stack:	list head for the stack of klp_func's (active func is on top)
> - * @fops:	registered ftrace ops struct
> - */
> -struct klp_ops {
> -	struct list_head node;
> -	struct list_head func_stack;
> -	struct ftrace_ops fops;
> -};
> +#include "patch.h"

...

> --- /dev/null
> +++ b/kernel/livepatch/patch.c
> @@ -0,0 +1,176 @@
> +/*
> + * patch.c - Kernel Live Patching patching functions

...

> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/slab.h>
> +
> +#include "patch.h"
> +
> +static LIST_HEAD(klp_ops);

list.h should be included.

> +static void notrace klp_ftrace_handler(unsigned long ip,
> +				       unsigned long parent_ip,
> +				       struct ftrace_ops *fops,

ftrace.h should be included.

> +				       struct pt_regs *regs)
> +{
> +	struct klp_ops *ops;
> +	struct klp_func *func;
> +
> +	ops = container_of(fops, struct klp_ops, fops);
> +
> +	rcu_read_lock();
> +	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> +				      stack_node);

rculist.h & perhaps rcupdate.h?

> +	rcu_read_unlock();
> +
> +	if (WARN_ON_ONCE(!func))
> +		return;
> +
> +	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +}

...

> +static void klp_unpatch_func(struct klp_func *func)
> +{
> +	struct klp_ops *ops;
> +
> +	WARN_ON(!func->patched);
> +	WARN_ON(!func->old_addr);

bug.h

> +
> +	ops = klp_find_ops(func->old_addr);
> +	if (WARN_ON(!ops))
> +		return;
> +
> +	if (list_is_singular(&ops->func_stack)) {
> +		WARN_ON(unregister_ftrace_function(&ops->fops));
> +		WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
> +
> +		list_del_rcu(&func->stack_node);
> +		list_del(&ops->node);
> +		kfree(ops);
> +	} else {
> +		list_del_rcu(&func->stack_node);
> +	}
> +
> +	func->patched = 0;
> +}
> +
> +static int klp_patch_func(struct klp_func *func)
> +{
> +	struct klp_ops *ops;
> +	int ret;
> +
> +	if (WARN_ON(!func->old_addr))
> +		return -EINVAL;
> +
> +	if (WARN_ON(func->patched))
> +		return -EINVAL;
> +
> +	ops = klp_find_ops(func->old_addr);
> +	if (!ops) {
> +		ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> +		if (!ops)
> +			return -ENOMEM;
> +
> +		ops->fops.func = klp_ftrace_handler;
> +		ops->fops.flags = FTRACE_OPS_FL_SAVE_REGS |
> +				  FTRACE_OPS_FL_DYNAMIC |
> +				  FTRACE_OPS_FL_IPMODIFY;
> +
> +		list_add(&ops->node, &klp_ops);
> +
> +		INIT_LIST_HEAD(&ops->func_stack);
> +		list_add_rcu(&func->stack_node, &ops->func_stack);
> +
> +		ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 0, 0);
> +		if (ret) {
> +			pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> +			       func->old_name, ret);

printk.h

> +			goto err;
> +		}
> +
> +		ret = register_ftrace_function(&ops->fops);
> +		if (ret) {
> +			pr_err("failed to register ftrace handler for function '%s' (%d)\n",
> +			       func->old_name, ret);
> +			ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
> +			goto err;
> +		}
> +	} else {
> +		list_add_rcu(&func->stack_node, &ops->func_stack);
> +	}
> +
> +	func->patched = 1;
> +
> +	return 0;
> +
> +err:
> +	list_del_rcu(&func->stack_node);
> +	list_del(&ops->node);
> +	kfree(ops);
> +	return ret;
> +}

...

> --- /dev/null
> +++ b/kernel/livepatch/patch.h
> @@ -0,0 +1,25 @@

This is not a correct header. Double-inclusion protection is missing.
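
Something along these lines would do (the guard name is just an example):

  #ifndef _LIVEPATCH_PATCH_H
  #define _LIVEPATCH_PATCH_H

  /* ... declarations ... */

  #endif /* _LIVEPATCH_PATCH_H */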

> +#include <linux/livepatch.h>
> +
> +/**
> + * struct klp_ops - structure for tracking registered ftrace ops structs
> + *
> + * A single ftrace_ops is shared between all enabled replacement functions
> + * (klp_func structs) which have the same old_addr.  This allows the switch
> + * between function versions to happen instantaneously by updating the klp_ops
> + * struct's func_stack list.  The winner is the klp_func at the top of the
> + * func_stack (front of the list).
> + *
> + * @node:	node for the global klp_ops list
> + * @func_stack:	list head for the stack of klp_func's (active func is on top)
> + * @fops:	registered ftrace ops struct
> + */
> +struct klp_ops {
> +	struct list_head node;
> +	struct list_head func_stack;
> +	struct ftrace_ops fops;

This header obviously needs list.h and ftrace.h.

> +};
> +
> +struct klp_ops *klp_find_ops(unsigned long old_addr);
> +
> +extern int klp_patch_object(struct klp_object *obj);
> +extern void klp_unpatch_object(struct klp_object *obj);
> 

regards,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 4/9] livepatch: get function sizes
  2015-02-09 17:31 ` [RFC PATCH 4/9] livepatch: get function sizes Josh Poimboeuf
@ 2015-02-10 18:30   ` Jiri Slaby
  2015-02-10 18:53     ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-10 18:30 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -197,8 +197,25 @@ static int klp_find_verify_func_addr(struct klp_object *obj,
>  	else
>  		ret = klp_verify_vmlinux_symbol(func->old_name,
>  						func->old_addr);
> +	if (ret)
> +		return ret;
>  
> -	return ret;
> +	ret = kallsyms_lookup_size_offset(func->old_addr, &func->old_size,
> +					  NULL);
> +	if (!ret) {
> +		pr_err("kallsyms lookup failed for '%s'\n", func->old_name);
> +		return -EINVAL;
> +	}
> +
> +	ret = kallsyms_lookup_size_offset((unsigned long)func->new_func,
> +					  &func->new_size, NULL);
> +	if (!ret) {
> +		pr_err("kallsyms lookup failed for '%s' replacement\n",
> +		       func->old_name);
> +		return -EINVAL;

EINVAL does not seem to be an appropriate return value for "not found".
Maybe ENOENT?
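
I.e., sketching against the quoted hunk:

	if (!ret) {
		pr_err("kallsyms lookup failed for '%s'\n", func->old_name);
		return -ENOENT;
	}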

regards,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status
  2015-02-09 17:31 ` [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status Josh Poimboeuf
@ 2015-02-10 18:47   ` Jiri Slaby
  2015-02-10 18:57     ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-10 18:47 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Expose the per-task klp_universe value so users can determine which
> tasks are holding up completion of a patching operation.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  fs/proc/base.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 3f3d7ae..b9fe6b5 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2528,6 +2528,14 @@ static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns,
>  	return err;
>  }
>  
> +#ifdef CONFIG_LIVEPATCH
> +static int proc_pid_klp_universe(struct seq_file *m, struct pid_namespace *ns,
> +				 struct pid *pid, struct task_struct *task)
> +{
> +	return seq_printf(m, "%d\n", task->klp_universe);
> +}
> +#endif /* CONFIG_LIVEPATCH */
> +
>  /*
>   * Thread groups
>   */
> @@ -2628,6 +2636,9 @@ static const struct pid_entry tgid_base_stuff[] = {
>  #ifdef CONFIG_CHECKPOINT_RESTORE
>  	REG("timers",	  S_IRUGO, proc_timers_operations),
>  #endif
> +#ifdef CONFIG_LIVEPATCH
> +	ONE("universe", S_IRUGO, proc_pid_klp_universe),

I am not sure if this can be UGO or if it should be USR only instead.
Leaving it for discussion, but I am inclined to use USR to avoid *any*
info leakage.
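
I.e., the stricter variant would be:

	ONE("universe", S_IRUSR, proc_pid_klp_universe),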

> +#endif

regards,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 3/9] livepatch: move patching functions into patch.c
  2015-02-10 18:27   ` Jiri Slaby
@ 2015-02-10 18:50     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 18:50 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, Feb 10, 2015 at 07:27:51PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
[...]

Agreed to all, thanks.


-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 4/9] livepatch: get function sizes
  2015-02-10 18:30   ` Jiri Slaby
@ 2015-02-10 18:53     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 18:53 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, Feb 10, 2015 at 07:30:50PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -197,8 +197,25 @@ static int klp_find_verify_func_addr(struct klp_object *obj,
> >  	else
> >  		ret = klp_verify_vmlinux_symbol(func->old_name,
> >  						func->old_addr);
> > +	if (ret)
> > +		return ret;
> >  
> > -	return ret;
> > +	ret = kallsyms_lookup_size_offset(func->old_addr, &func->old_size,
> > +					  NULL);
> > +	if (!ret) {
> > +		pr_err("kallsyms lookup failed for '%s'\n", func->old_name);
> > +		return -EINVAL;
> > +	}
> > +
> > +	ret = kallsyms_lookup_size_offset((unsigned long)func->new_func,
> > +					  &func->new_size, NULL);
> > +	if (!ret) {
> > +		pr_err("kallsyms lookup failed for '%s' replacement\n",
> > +		       func->old_name);
> > +		return -EINVAL;
> 
> EINVAL does not seem to be an appropriate return value for "not found".
> Maybe ENOENT?

Ok.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status
  2015-02-10 18:47   ` Jiri Slaby
@ 2015-02-10 18:57     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 18:57 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, Feb 10, 2015 at 07:47:12PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > Expose the per-task klp_universe value so users can determine which
> > tasks are holding up completion of a patching operation.
> > 
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  fs/proc/base.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 3f3d7ae..b9fe6b5 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -2528,6 +2528,14 @@ static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns,
> >  	return err;
> >  }
> >  
> > +#ifdef CONFIG_LIVEPATCH
> > +static int proc_pid_klp_universe(struct seq_file *m, struct pid_namespace *ns,
> > +				 struct pid *pid, struct task_struct *task)
> > +{
> > +	return seq_printf(m, "%d\n", task->klp_universe);
> > +}
> > +#endif /* CONFIG_LIVEPATCH */
> > +
> >  /*
> >   * Thread groups
> >   */
> > @@ -2628,6 +2636,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> >  #ifdef CONFIG_CHECKPOINT_RESTORE
> >  	REG("timers",	  S_IRUGO, proc_timers_operations),
> >  #endif
> > +#ifdef CONFIG_LIVEPATCH
> > +	ONE("universe", S_IRUGO, proc_pid_klp_universe),
> 
> I am not sure if this can be UGO or if it should be USR only instead.
> Leaving it for discussion, but I am inclined to use USR to avoid *any*
> info leakage.

That's fine.  I can't think of any reason why a non-root user would need
to know the task's universe.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-09 17:31 ` [RFC PATCH 8/9] livepatch: allow patch modules to be removed Josh Poimboeuf
@ 2015-02-10 19:02   ` Jiri Slaby
  2015-02-10 19:57     ` Josh Poimboeuf
  2015-02-12 15:22     ` Miroslav Benes
  0 siblings, 2 replies; 106+ messages in thread
From: Jiri Slaby @ 2015-02-10 19:02 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
...
> @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
>  
>  static void klp_kobj_release_patch(struct kobject *kobj)
>  {
> -	/*
> -	 * Once we have a consistency model we'll need to module_put() the
> -	 * patch module here.  See klp_register_patch() for more details.
> -	 */

I deliberately let you write the note in there :). What happens when I
leave some attribute in /sys open and you remove the module in the meantime?

> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -54,6 +54,9 @@ void klp_complete_transition(void)
>  		for (func = obj->funcs; func->old_name; func++)
>  			func->transition = 0;
>  
> +	if (klp_universe_goal == KLP_UNIVERSE_OLD)
> +		module_put(klp_transition_patch->mod);
> +
>  	klp_transition_patch = NULL;
>  }

regards,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
  2015-02-10 10:58   ` Masami Hiramatsu
  2015-02-10 15:59   ` Miroslav Benes
@ 2015-02-10 19:27   ` Seth Jennings
  2015-02-10 19:32     ` Josh Poimboeuf
  2015-02-11 10:21   ` Miroslav Benes
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 106+ messages in thread
From: Seth Jennings @ 2015-02-10 19:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Mon, Feb 09, 2015 at 11:31:18AM -0600, Josh Poimboeuf wrote:
> Add a basic per-task consistency model.  This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
> 
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe.  If a
> given task isn't using any of the patched functions, it's switched to
> the new universe.  Once all the tasks have been converged to the new
> universe, patching is complete.
> 
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
> 
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition.  Only a single patch (the topmost patch on the stack)
> can be in transition at a given time.  A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
> 
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress.  Then all the tasks will attempt to
> converge back to the original universe.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  include/linux/livepatch.h     |  18 ++-
>  include/linux/sched.h         |   3 +
>  kernel/fork.c                 |   2 +
>  kernel/livepatch/Makefile     |   2 +-
>  kernel/livepatch/core.c       |  71 ++++++----
>  kernel/livepatch/patch.c      |  34 ++++-
>  kernel/livepatch/patch.h      |   1 +
>  kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
>  kernel/livepatch/transition.h |  16 +++
>  kernel/sched/core.c           |   2 +
>  10 files changed, 423 insertions(+), 26 deletions(-)
>  create mode 100644 kernel/livepatch/transition.c
>  create mode 100644 kernel/livepatch/transition.h
> 
<snip>
> diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> new file mode 100644
> index 0000000..ba9a55c
> --- /dev/null
> +++ b/kernel/livepatch/transition.h
> @@ -0,0 +1,16 @@
> +#include <linux/livepatch.h>
> +
> +enum {
> +	KLP_UNIVERSE_UNDEFINED = -1,
> +	KLP_UNIVERSE_OLD,
> +	KLP_UNIVERSE_NEW,
> +};
> +
> +extern struct mutex klp_mutex;

klp_mutex isn't defined in transition.c.  Maybe this extern should be in
the transition.c file or in a core.h file, since core.c provides the
definition?
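
For illustration, the core.h variant could be as small as this (a sketch; the file and its include guard are hypothetical):

/* kernel/livepatch/core.h */
#ifndef _LIVEPATCH_CORE_H
#define _LIVEPATCH_CORE_H

#include <linux/mutex.h>

/* defined in core.c, shared with transition.c */
extern struct mutex klp_mutex;

#endif /* _LIVEPATCH_CORE_H */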

Thanks,
Seth

> +extern struct klp_patch *klp_transition_patch;
> +
> +extern void klp_init_transition(struct klp_patch *patch, int universe);
> +extern void klp_start_transition(int universe);
> +extern void klp_reverse_transition(void);
> +extern void klp_try_complete_transition(void);
> +extern void klp_complete_transition(void);
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d91e6..7b877f4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -74,6 +74,7 @@
>  #include <linux/binfmts.h>
>  #include <linux/context_tracking.h>
>  #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>  
>  #include <asm/switch_to.h>
>  #include <asm/tlb.h>
> @@ -4601,6 +4602,7 @@ void init_idle(struct task_struct *idle, int cpu)
>  #if defined(CONFIG_SMP)
>  	sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
>  #endif
> +	klp_update_task_universe(idle);
>  }
>  
>  int cpuset_cpumask_can_shrink(const struct cpumask *cur,
> -- 
> 2.1.0
> 

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-10 19:27   ` Seth Jennings
@ 2015-02-10 19:32     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 19:32 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Tue, Feb 10, 2015 at 01:27:59PM -0600, Seth Jennings wrote:
> On Mon, Feb 09, 2015 at 11:31:18AM -0600, Josh Poimboeuf wrote:
> > diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> > new file mode 100644
> > index 0000000..ba9a55c
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.h
> > @@ -0,0 +1,16 @@
> > +#include <linux/livepatch.h>
> > +
> > +enum {
> > +	KLP_UNIVERSE_UNDEFINED = -1,
> > +	KLP_UNIVERSE_OLD,
> > +	KLP_UNIVERSE_NEW,
> > +};
> > +
> > +extern struct mutex klp_mutex;
> 
> klp_mutex isn't defined in transition.c.  Maybe this extern should be in
> the transition.c file or in a core.h file, since core.c provides the
> definition?

I originally had the extern in transition.c, but then checkpatch
complained so I moved it to transition.h.  But yeah, it doesn't really
belong there either.

It's kind of ugly for transition.c to be using that mutex anyway.  I
think it'll be cleaner if I just move the work_fn into core.c.

-- 
Josh

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-10 19:02   ` Jiri Slaby
@ 2015-02-10 19:57     ` Josh Poimboeuf
  2015-02-11 10:55       ` Jiri Slaby
  2015-02-12 15:22     ` Miroslav Benes
  1 sibling, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-10 19:57 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, Feb 10, 2015 at 08:02:34PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> ...
> > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> >  
> >  static void klp_kobj_release_patch(struct kobject *kobj)
> >  {
> > -	/*
> > -	 * Once we have a consistency model we'll need to module_put() the
> > -	 * patch module here.  See klp_register_patch() for more details.
> > -	 */
> 
> I deliberately let you write the note in there :). What happens when I
> leave some attribute in /sys open and you remove the module in the meantime?

You're right, as was I the first time :-)

The only problem is that it would be nice if we could call
klp_unregister_patch() from the patch module's exit function, so that
doing an rmmod on the patch module unregisters it.  But if we put
module_put() in the patch release function, then we have a circular
dependency and we could never rmmod it.

How about instead we do a klp_is_patch_registered() at the beginning of
all the attribute accessor functions?  It's kind of ugly, but I can't
think of a better idea at the moment.
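
For concreteness, a sketch of that guard in one accessor (illustrative only; locking elided, and see the next reply for why this may not be enough):

static ssize_t enabled_show(struct kobject *kobj,
			    struct kobj_attribute *attr, char *buf)
{
	struct klp_patch *patch;

	patch = container_of(kobj, struct klp_patch, kobj);

	/* bail out if the patch was unregistered behind our back */
	if (!klp_is_patch_registered(patch))
		return -ENODEV;

	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
}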

-- 
Josh

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
                     ` (2 preceding siblings ...)
  2015-02-10 19:27   ` Seth Jennings
@ 2015-02-11 10:21   ` Miroslav Benes
  2015-02-11 20:19     ` Josh Poimboeuf
  2015-02-12  3:21   ` Josh Poimboeuf
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-11 10:21 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel


On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

[...]

> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  	ops = container_of(fops, struct klp_ops, fops);
>  
>  	rcu_read_lock();
> +
>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>  				      stack_node);
> -	rcu_read_unlock();
>  
>  	if (WARN_ON_ONCE(!func))
> -		return;
> +		goto unlock;
> +
> +	if (unlikely(func->transition)) {
> +		/* corresponding smp_wmb() is in klp_init_transition() */
> +		smp_rmb();
> +
> +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> +			/*
> +			 * Use the previously patched version of the function.
> +			 * If no previous patches exist, use the original
> +			 * function.
> +			 */
> +			func = list_entry_rcu(func->stack_node.next,
> +					      struct klp_func, stack_node);
> +
> +			if (&func->stack_node == &ops->func_stack)
> +				goto unlock;
> +		}
> +	}
>  
>  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> +	rcu_read_unlock();
>  }

I decided to understand the code more before answering the email about the 
race and found another problem. I think.

Imagine we patched some function foo() with foo_1() from patch_1 and now 
we'd like to patch it again with foo_2() in patch_2. __klp_enable_patch 
calls klp_init_transition which sets klp_universe for all processes to 
KLP_UNIVERSE_OLD and marks the foo_2() for transition (it is gonna be 1). 
Then __klp_enable_patch adds foo_2() to the RCU-protected list for foo(). 
BUT what if somebody calls foo() right between klp_init_transition and 
the loop in __klp_enable_patch? The ftrace handler first returns the 
first entry in the list which is foo_1() (foo_2() is still not present), 
then it checks for func->transition. It is 1. It checks for 
current->klp_universe which is KLP_UNIVERSE_OLD and so the next entry is 
retrieved. There is no such and therefore foo() is called. This is 
obviously wrong because foo_1() was expected.

Everything would work fine if one would call foo() before 
klp_start_transition and after the loop in __klp_enable_patch. The 
solution might be to move the setting of func->transition to 
klp_start_transition, but this could break something different. I don't 
know yet.

Am I wrong?

Miroslav

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-10 19:57     ` Josh Poimboeuf
@ 2015-02-11 10:55       ` Jiri Slaby
  2015-02-11 18:39         ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-11 10:55 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On 02/10/2015, 08:57 PM, Josh Poimboeuf wrote:
> On Tue, Feb 10, 2015 at 08:02:34PM +0100, Jiri Slaby wrote:
>> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
>>> --- a/kernel/livepatch/core.c
>>> +++ b/kernel/livepatch/core.c
>> ...
>>> @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
>>>  
>>>  static void klp_kobj_release_patch(struct kobject *kobj)
>>>  {
>>> -	/*
>>> -	 * Once we have a consistency model we'll need to module_put() the
>>> -	 * patch module here.  See klp_register_patch() for more details.
>>> -	 */
>>
>> I deliberately let you write the note in there :). What happens when I
>> leave some attribute in /sys open and you remove the module in the meantime?
> 
> You're right, as was I the first time :-)
> 
> The only problem is that it would be nice if we could call
> klp_unregister_patch() from the patch module's exit function, so that
> doing an rmmod on the patch module unregisters it.  But if we put
> module_put() in the patch release function, then we have a circular
> dependency and we could never rmmod it.
>
> How about instead we do a klp_is_patch_registered() at the beginning of
> all the attribute accessor functions?  It's kind of ugly, but I can't
> think of a better idea at the moment.

Ugh, no :). You even have the kobject proper in the module which would
be gone.

However we can take inspiration in kgraft. I introduced a completion
there and wait for it in rmmod. This completion is made complete in
kobject's release. See:
https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/tree/kernel/kgraft_files.c?h=kgraft#n30
https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/tree/kernel/kgraft_files.c?h=kgraft#n138

This should IMO work here too.
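
A minimal sketch of that pattern applied here, assuming hypothetical names and ignoring error paths (modeled on the kGraft code linked above):

static DECLARE_COMPLETION(klp_kobj_release_completion);

/* kobject release callback: runs only once the last sysfs
 * reference to the patch kobject has been dropped
 */
static void klp_kobj_release_patch(struct kobject *kobj)
{
	complete(&klp_kobj_release_completion);
}

int klp_unregister_patch(struct klp_patch *patch)
{
	kobject_put(&patch->kobj);

	/* block the module's exit path until no sysfs file can call
	 * back into module code anymore
	 */
	wait_for_completion(&klp_kobj_release_completion);

	return 0;
}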

regards,
-- 
js
suse labs

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-10 16:56     ` Josh Poimboeuf
@ 2015-02-11 16:28       ` Miroslav Benes
  2015-02-11 20:23         ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-11 16:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, 10 Feb 2015, Josh Poimboeuf wrote:

> On Tue, Feb 10, 2015 at 04:59:17PM +0100, Miroslav Benes wrote:
> > 
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > Add a basic per-task consistency model.  This is the foundation which
> > > will eventually enable us to patch those ~10% of security patches which
> > > change function prototypes and/or data semantics.
> > > 
> > > When a patch is enabled, livepatch enters into a transition state where
> > > tasks are converging from the old universe to the new universe.  If a
> > > given task isn't using any of the patched functions, it's switched to
> > > the new universe.  Once all the tasks have been converged to the new
> > > universe, patching is complete.
> > > 
> > > The same sequence occurs when a patch is disabled, except the tasks
> > > converge from the new universe to the old universe.
> > > 
> > > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > > is in transition.  Only a single patch (the topmost patch on the stack)
> > > can be in transition at a given time.  A patch can remain in the
> > > transition state indefinitely, if any of the tasks are stuck in the
> > > previous universe.
> > > 
> > > A transition can be reversed and effectively canceled by writing the
> > > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > > the transition is in progress.  Then all the tasks will attempt to
> > > converge back to the original universe.
> > 
> > Hi Josh,
> > 
> > first, thanks a lot for great work. I'm starting to go through it and it's 
> > gonna take me some time to do and send a complete review.
> 
> I know there are a lot of details to look at, please take your time.  I
> really appreciate your review.  (And everybody else's, for that matter
> :-)
> 
> > > +	/* success! unpatch obsolete functions and do some cleanup */
> > > +
> > > +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > > +		klp_unpatch_objects(klp_transition_patch);
> > > +
> > > +		/* prevent ftrace handler from reading old func->transition */
> > > +		synchronize_rcu();
> > > +	}
> > > +
> > > +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > > +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > > +							  "unpatching");
> > > +
> > > +	klp_complete_transition();
> > > +}
> > 
> > ...synchronize_rcu() could be insufficient. There still can be some  
> > process in our ftrace handler after the call.
> > 
> > Consider the following scenario:
> > 
> > When synchronize_rcu is called some process could have been preempted on 
> > some other cpu somewhere at the start of the ftrace handler before  
> > rcu_read_lock. synchronize_rcu waits for the grace period to pass, but that 
> > does not mean anything for our process in the handler, because it is not 
> > in rcu critical section. There is no guarantee that after synchronize_rcu 
> > the process would be away from the handler. 
> > 
> > "Meanwhile" klp_try_complete_transition continues and calls 
> > klp_complete_transition. This clears func->transition flags. Now the 
> > process in the handler could be scheduled again. It reads the wrong value 
> > of func->transition and redirection to the wrong function is done.
> > 
> > What do you think? I hope I made myself clear.
> 
> You really made me think.  But I don't think there's a race here.
> 
> Consider the two separate cases, patching and unpatching:
> 
> 1. patching has completed: klp_universe_goal and all tasks'
>    klp_universes are at KLP_UNIVERSE_NEW.  In this case, the value of
>    func->transition doesn't matter, because we want to use the func at
>    the top of the stack, and if klp_universe is NEW, the ftrace handler
>    will do that, regardless of the value of func->transition.  This is
>    why I didn't do the rcu_synchronize() in this case.  But maybe you're
>    not worried about this case anyway, I just described it for the sake
>    of completeness :-)

Yes, this case shouldn't be a problem :)

> 2. unpatching has completed: klp_universe_goal and all tasks'
>    klp_universes are at KLP_UNIVERSE_OLD.  In this case, the value of
>    func->transition _does_ matter.  However, notice that
>    klp_unpatch_objects() is called before rcu_synchronize().  That
>    removes the "new" func from the klp_ops stack.  Since the ftrace
>    handler accesses the list _after_ calling rcu_read_lock(), it will
>    never see the "new" func, and thus func->transition will never be
>    set.

Hm, so indeed I messed it up. Let me rework the scenario a bit. We have a 
function foo(), which has been already patched with foo_1() from patch_1 
and foo_2() from patch_2. Now we would like to unpatch patch_2. It is 
successfully completed and klp_try_complete_transition calls 
klp_unpatch_objects and synchronize_rcu. Thus foo_2() is removed from the 
RCU list in ops. 

Now to the funny part. After synchronize_rcu() and before 
klp_complete_transition some process might get to the ftrace handler (it 
is still there because of the patch_1 still being present). It gets foo_1 
from the list_first_or_null_rcu, sees that func->transition is 1 (it 
hasn't been cleared yet), current->klp_universe is KLP_UNIVERSE_OLD... so 
it tries to get previous function. There is none and foo() is called. This 
is incorrect.

It is very similar scenario to the one in my other email earlier this day. 
I think we need to clear func->transition before calling 
klp_unpatch_objects. More or less.

>    That said, I think there is a race where the WARN_ON_ONCE(!func)
>    could trigger here, and it wouldn't be an error.  So I think I'll
>    remove the warning.
> 
> Does that make sense?
> 
> > There is the similar problem for dynamic trampolines in ftrace. You
> > cannot remove them unless there is no process in the handler. I think
> > rcu-tasks were merged a while ago for this purpose. However ftrace
> > does not use them yet and I don't know if we could exploit them to
> > solve this issue. I need to think more about it.
> 
> Ok, sounds like that's an ftrace bug that could affect us.

Fortunately it is not. Steven knows about it and he does not allow dynamic 
trampolines for CONFIG_PREEMPT and FTRACE_OPS_FL_DYNAMIC. Not yet. See the 
comment in kernel/trace/ftrace.c for ftrace_update_trampoline.

Anyway, the conclusion is that we need to be really careful with the
ftrace handler, especially in the future with dynamic trampolines and
especially with CONFIG_PREEMPT. Right now the handler always runs in
atomic context (at least in the cases relevant for our use), if I am not
mistaken.

Miroslav

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-11 10:55       ` Jiri Slaby
@ 2015-02-11 18:39         ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-11 18:39 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Wed, Feb 11, 2015 at 11:55:05AM +0100, Jiri Slaby wrote:
> On 02/10/2015, 08:57 PM, Josh Poimboeuf wrote:
> > On Tue, Feb 10, 2015 at 08:02:34PM +0100, Jiri Slaby wrote:
> >> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> >>> --- a/kernel/livepatch/core.c
> >>> +++ b/kernel/livepatch/core.c
> >> ...
> >>> @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> >>>  
> >>>  static void klp_kobj_release_patch(struct kobject *kobj)
> >>>  {
> >>> -	/*
> >>> -	 * Once we have a consistency model we'll need to module_put() the
> >>> -	 * patch module here.  See klp_register_patch() for more details.
> >>> -	 */
> >>
> >> I deliberately let you write the note in there :). What happens when I
> >> leave some attribute in /sys open and you remove the module in the meantime?
> > 
> > You're right, as was I the first time :-)
> > 
> > The only problem is that it would be nice if we could call
> > klp_unregister_patch() from the patch module's exit function, so that
> > doing an rmmod on the patch module unregisters it.  But if we put
> > module_put() in the patch release function, then we have a circular
> > dependency and we could never rmmod it.
> >
> > How about instead we do a klp_is_patch_registered() at the beginning of
> > all the attribute accessor functions?  It's kind of ugly, but I can't
> > think of a better idea at the moment.
> 
> Ugh, no :). You even have the kobject proper in the module which would
> be gone.
> 
> However we can take inspiration in kgraft. I introduced a completion
> there and wait for it in rmmod. This completion is made complete in
> kobject's release. See:
> https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/tree/kernel/kgraft_files.c?h=kgraft#n30
> https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/tree/kernel/kgraft_files.c?h=kgraft#n138
> 
> This should IMO work here too.

Thanks, that sounds a lot better.  I'll try to do something like that.

-- 
Josh

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-11 10:21   ` Miroslav Benes
@ 2015-02-11 20:19     ` Josh Poimboeuf
  2015-02-12 10:45       ` Miroslav Benes
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-11 20:19 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Wed, Feb 11, 2015 at 11:21:51AM +0100, Miroslav Benes wrote:
> 
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> 
> [...]
> 
> > @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  	ops = container_of(fops, struct klp_ops, fops);
> >  
> >  	rcu_read_lock();
> > +
> >  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >  				      stack_node);
> > -	rcu_read_unlock();
> >  
> >  	if (WARN_ON_ONCE(!func))
> > -		return;
> > +		goto unlock;
> > +
> > +	if (unlikely(func->transition)) {
> > +		/* corresponding smp_wmb() is in klp_init_transition() */
> > +		smp_rmb();
> > +
> > +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> > +			/*
> > +			 * Use the previously patched version of the function.
> > +			 * If no previous patches exist, use the original
> > +			 * function.
> > +			 */
> > +			func = list_entry_rcu(func->stack_node.next,
> > +					      struct klp_func, stack_node);
> > +
> > +			if (&func->stack_node == &ops->func_stack)
> > +				goto unlock;
> > +		}
> > +	}
> >  
> >  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> > +unlock:
> > +	rcu_read_unlock();
> >  }
> 
> I decided to understand the code more before answering the email about the 
> race and found another problem. I think.
> 
> Imagine we patched some function foo() with foo_1() from patch_1 and now 
> we'd like to patch it again with foo_2() in patch_2. __klp_enable_patch 
> calls klp_init_transition which sets klp_universe for all processes to 
> KLP_UNIVERSE_OLD and marks the foo_2() for transition (it is gonna be 1). 
> Then __klp_enable_patch adds foo_2() to the RCU-protected list for foo(). 
> BUT what if somebody calls foo() right between klp_init_transition and 
> the loop in __klp_enable_patch? The ftrace handler first returns the 
> first entry in the list which is foo_1() (foo_2() is still not present), 
> then it checks for func->transition. It is 1.

No, actually foo_1()'s func->transition will be 0.  Only foo_2()'s
func->transition will be 1.

> It checks for 
> current->klp_universe which is KLP_UNIVERSE_OLD and so the next entry is 
> retrieved. There is no such and therefore foo() is called. This is 
> obviously wrong because foo_1() was expected.
> 
> Everything would work fine if one would call foo() before 
> klp_start_transition and after the loop in __klp_enable_patch. The 
> solution might be to move the setting of func->transition to 
> klp_start_transition, but this could break something different. I don't 
> know yet.
> 
> Am I wrong?
> 
> Miroslav

-- 
Josh

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-11 16:28       ` Miroslav Benes
@ 2015-02-11 20:23         ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-11 20:23 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Wed, Feb 11, 2015 at 05:28:13PM +0100, Miroslav Benes wrote:
> On Tue, 10 Feb 2015, Josh Poimboeuf wrote:
> 
> > On Tue, Feb 10, 2015 at 04:59:17PM +0100, Miroslav Benes wrote:
> > > 
> > > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > > 
> > > > Add a basic per-task consistency model.  This is the foundation which
> > > > will eventually enable us to patch those ~10% of security patches which
> > > > change function prototypes and/or data semantics.
> > > > 
> > > > When a patch is enabled, livepatch enters into a transition state where
> > > > tasks are converging from the old universe to the new universe.  If a
> > > > given task isn't using any of the patched functions, it's switched to
> > > > the new universe.  Once all the tasks have been converged to the new
> > > > universe, patching is complete.
> > > > 
> > > > The same sequence occurs when a patch is disabled, except the tasks
> > > > converge from the new universe to the old universe.
> > > > 
> > > > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > > > is in transition.  Only a single patch (the topmost patch on the stack)
> > > > can be in transition at a given time.  A patch can remain in the
> > > > transition state indefinitely, if any of the tasks are stuck in the
> > > > previous universe.
> > > > 
> > > > A transition can be reversed and effectively canceled by writing the
> > > > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > > > the transition is in progress.  Then all the tasks will attempt to
> > > > converge back to the original universe.
> > > 
> > > Hi Josh,
> > > 
> > > first, thanks a lot for great work. I'm starting to go through it and it's 
> > > gonna take me some time to do and send a complete review.
> > 
> > I know there are a lot of details to look at, please take your time.  I
> > really appreciate your review.  (And everybody else's, for that matter
> > :-)
> > 
> > > > +	/* success! unpatch obsolete functions and do some cleanup */
> > > > +
> > > > +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > > > +		klp_unpatch_objects(klp_transition_patch);
> > > > +
> > > > +		/* prevent ftrace handler from reading old func->transition */
> > > > +		synchronize_rcu();
> > > > +	}
> > > > +
> > > > +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > > > +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > > > +							  "unpatching");
> > > > +
> > > > +	klp_complete_transition();
> > > > +}
> > > 
> > > ...synchronize_rcu() could be insufficient. There still can be some  
> > > process in our ftrace handler after the call.
> > > 
> > > Consider the following scenario:
> > > 
> > > When synchronize_rcu is called some process could have been preempted on 
> > > some other cpu somewhere at the start of the ftrace handler before  
> > > rcu_read_lock. synchronize_rcu waits for the grace period to pass, but that 
> > > does not mean anything for our process in the handler, because it is not 
> > > in rcu critical section. There is no guarantee that after synchronize_rcu 
> > > the process would be away from the handler. 
> > > 
> > > "Meanwhile" klp_try_complete_transition continues and calls 
> > > klp_complete_transition. This clears func->transition flags. Now the 
> > > process in the handler could be scheduled again. It reads the wrong value 
> > > of func->transition and redirection to the wrong function is done.
> > > 
> > > What do you think? I hope I made myself clear.
> > 
> > You really made me think.  But I don't think there's a race here.
> > 
> > Consider the two separate cases, patching and unpatching:
> > 
> > 1. patching has completed: klp_universe_goal and all tasks'
> >    klp_universes are at KLP_UNIVERSE_NEW.  In this case, the value of
> >    func->transition doesn't matter, because we want to use the func at
> >    the top of the stack, and if klp_universe is NEW, the ftrace handler
> >    will do that, regardless of the value of func->transition.  This is
> >    why I didn't do the rcu_synchronize() in this case.  But maybe you're
> >    not worried about this case anyway, I just described it for the sake
> >    of completeness :-)
> 
> Yes, this case shouldn't be a problem :)
> 
> > 2. unpatching has completed: klp_universe_goal and all tasks'
> >    klp_universes are at KLP_UNIVERSE_OLD.  In this case, the value of
> >    func->transition _does_ matter.  However, notice that
> >    klp_unpatch_objects() is called before rcu_synchronize().  That
> >    removes the "new" func from the klp_ops stack.  Since the ftrace
> >    handler accesses the list _after_ calling rcu_read_lock(), it will
> >    never see the "new" func, and thus func->transition will never be
> >    set.
> 
> Hm, so indeed I messed it up. Let me rework the scenario a bit. We have a 
> function foo(), which has been already patched with foo_1() from patch_1 
> and foo_2() from patch_2. Now we would like to unpatch patch_2. It is 
> successfully completed and klp_try_complete_transition calls 
> klp_unpatch_objects and synchronize_rcu. Thus foo_2() is removed from the 
> RCU list in ops. 
> 
> Now to the funny part. After synchronize_rcu() and before 
> klp_complete_transition some process might get to the ftrace handler (it 
> is still there because of the patch_1 still being present). It gets foo_1 
> from the list_first_or_null_rcu, sees that func->transition is 1 (it 
> hasn't been cleared yet)

Same answer as the other email, foo_1()'s func->transition will be 0 :-)

When patching, only the new klp_func gets transition set to 1.

When unpatching, only the klp_func being removed gets transition set to
1.
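
That follows from klp_init_transition() in the patch, which only walks
the funcs of the patch being enabled or disabled; condensed (the wrapper
function name here is hypothetical):

/* only the funcs of @patch are marked; funcs belonging to earlier
 * patches (e.g. foo_1 from patch_1) keep transition == 0
 */
static void klp_mark_funcs_in_transition(struct klp_patch *patch)
{
	struct klp_object *obj;
	struct klp_func *func;

	for (obj = patch->objs; obj->funcs; obj++)
		for (func = obj->funcs; func->old_name; func++)
			func->transition = 1;
}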

> , current->klp_universe is KLP_UNIVERSE_OLD... so 
> it tries to get previous function. There is none and foo() is called. This 
> is incorrect.
> 
> It is very similar scenario to the one in my other email earlier this day. 
> I think we need to clear func->transition before calling 
> klp_unpatch_objects. More or less.

-- 
Josh

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
                     ` (3 preceding siblings ...)
  2015-02-11 10:21   ` Miroslav Benes
@ 2015-02-12  3:21   ` Josh Poimboeuf
  2015-02-12 11:56     ` Peter Zijlstra
  2015-02-12 13:26     ` Jiri Slaby
  2015-02-14 11:40   ` Jiri Slaby
  2015-02-16 14:19   ` Miroslav Benes
  6 siblings, 2 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-12  3:21 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Masami Hiramatsu, live-patching, linux-kernel, Seth Jennings,
	Jiri Kosina, Vojtech Pavlik

Ingo, Peter,

Would you have any objections to making task_rq_lock/unlock() non-static
(or moving them to kernel/sched/sched.h) so they can be called by the
livepatch code?

To provide some background, I'm looking for a way to temporarily prevent
a sleeping task from running while its stack is examined, to decide
whether it can be safely switched to the new patching "universe".  For
more details see klp_transition_task() in the patch below.

Using task_rq_lock() is the most straightforward way I could find to
achieve that.
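
Condensed, the locking pattern looks like this (a restatement of
klp_transition_task() from the patch quoted below, with the debug output
and early-exit check trimmed):

static bool klp_transition_task(struct task_struct *t)
{
	unsigned long flags;
	struct rq *rq;
	int ret = 0;
	bool success = false;

	rq = task_rq_lock(t, &flags);		/* keep the task off the CPU */

	if (!task_running(rq, t) || t == current) {
		dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
		if (!ret) {			/* no patched func on its stack */
			klp_update_task_universe(t);
			success = true;
		}
	}

	task_rq_unlock(rq, t, &flags);
	return success;
}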

On Mon, Feb 09, 2015 at 11:31:18AM -0600, Josh Poimboeuf wrote:
> Add a basic per-task consistency model.  This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
> 
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe.  If a
> given task isn't using any of the patched functions, it's switched to
> the new universe.  Once all the tasks have been converged to the new
> universe, patching is complete.
> 
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
> 
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition.  Only a single patch (the topmost patch on the stack)
> can be in transition at a given time.  A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
> 
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress.  Then all the tasks will attempt to
> converge back to the original universe.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  include/linux/livepatch.h     |  18 ++-
>  include/linux/sched.h         |   3 +
>  kernel/fork.c                 |   2 +
>  kernel/livepatch/Makefile     |   2 +-
>  kernel/livepatch/core.c       |  71 ++++++----
>  kernel/livepatch/patch.c      |  34 ++++-
>  kernel/livepatch/patch.h      |   1 +
>  kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
>  kernel/livepatch/transition.h |  16 +++
>  kernel/sched/core.c           |   2 +
>  10 files changed, 423 insertions(+), 26 deletions(-)
>  create mode 100644 kernel/livepatch/transition.c
>  create mode 100644 kernel/livepatch/transition.h
> 
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 0e65b4d..b8c2f15 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -40,6 +40,7 @@
>   * @old_size:	size of the old function
>   * @new_size:	size of the new function
>   * @patched:	the func has been added to the klp_ops list
> + * @transition:	the func is currently being applied or reverted
>   */
>  struct klp_func {
>  	/* external */
> @@ -60,6 +61,7 @@ struct klp_func {
>  	struct list_head stack_node;
>  	unsigned long old_size, new_size;
>  	int patched;
> +	int transition;
>  };
>  
>  /**
> @@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
>  extern int klp_enable_patch(struct klp_patch *);
>  extern int klp_disable_patch(struct klp_patch *);
>  
> -#endif /* CONFIG_LIVEPATCH */
> +extern int klp_universe_goal;
> +
> +static inline void klp_update_task_universe(struct task_struct *t)
> +{
> +	/* corresponding smp_wmb() is in klp_set_universe_goal() */
> +	smp_rmb();
> +
> +	t->klp_universe = klp_universe_goal;
> +}
> +
> +#else /* !CONFIG_LIVEPATCH */
> +
> +static inline void klp_update_task_universe(struct task_struct *t) {}
> +
> +#endif /* !CONFIG_LIVEPATCH */
>  
>  #endif /* _LINUX_LIVEPATCH_H_ */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8db31ef..a95e59a 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1701,6 +1701,9 @@ struct task_struct {
>  #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
>  	unsigned long	task_state_change;
>  #endif
> +#ifdef CONFIG_LIVEPATCH
> +	int klp_universe;
> +#endif
>  };
>  
>  /* Future-safe accessor for struct task_struct's cpus_allowed. */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 4dc2dda..1dcbebe 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -74,6 +74,7 @@
>  #include <linux/uprobes.h>
>  #include <linux/aio.h>
>  #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/pgalloc.h>
> @@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>  	total_forks++;
>  	spin_unlock(&current->sighand->siglock);
>  	syscall_tracepoint_update(p);
> +	klp_update_task_universe(p);
>  	write_unlock_irq(&tasklist_lock);
>  
>  	proc_fork_connector(p);
> diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
> index e136dad..2b8bdb1 100644
> --- a/kernel/livepatch/Makefile
> +++ b/kernel/livepatch/Makefile
> @@ -1,3 +1,3 @@
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o
>  
> -livepatch-objs := core.o patch.o
> +livepatch-objs := core.o patch.o transition.o
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 85d4ef7..790dc10 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -28,14 +28,17 @@
>  #include <linux/kallsyms.h>
>  
>  #include "patch.h"
> +#include "transition.h"
>  
>  /*
> - * The klp_mutex protects the global lists and state transitions of any
> - * structure reachable from them.  References to any structure must be obtained
> - * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
> - * ensure it gets consistent data).
> + * The klp_mutex is a coarse lock which serializes access to klp data.  All
> + * accesses to klp-related variables and structures must have mutex protection,
> + * except within the following functions which carefully avoid the need for it:
> + *
> + * - klp_ftrace_handler()
> + * - klp_update_task_universe()
>   */
> -static DEFINE_MUTEX(klp_mutex);
> +DEFINE_MUTEX(klp_mutex);
>  
>  static LIST_HEAD(klp_patches);
>  
> @@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
>  	mutex_unlock(&module_mutex);
>  }
>  
> -/* klp_mutex must be held by caller */
>  static bool klp_is_patch_registered(struct klp_patch *patch)
>  {
>  	struct klp_patch *mypatch;
> @@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,
>  
>  static int __klp_disable_patch(struct klp_patch *patch)
>  {
> -	struct klp_object *obj;
> +	if (klp_transition_patch)
> +		return -EBUSY;
>  
>  	/* enforce stacking: only the last enabled patch can be disabled */
>  	if (!list_is_last(&patch->list, &klp_patches) &&
>  	    list_next_entry(patch, list)->enabled)
>  		return -EBUSY;
>  
> -	pr_notice("disabling patch '%s'\n", patch->mod->name);
> -
> -	for (obj = patch->objs; obj->funcs; obj++)
> -		if (obj->patched)
> -			klp_unpatch_object(obj);
> +	klp_init_transition(patch, KLP_UNIVERSE_NEW);
> +	klp_start_transition(KLP_UNIVERSE_OLD);
> +	klp_try_complete_transition();
>  
>  	patch->enabled = 0;
>  
> @@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	struct klp_object *obj;
>  	int ret;
>  
> +	if (klp_transition_patch)
> +		return -EBUSY;
> +
>  	if (WARN_ON(patch->enabled))
>  		return -EINVAL;
>  
> @@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
>  	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
>  
> -	pr_notice("enabling patch '%s'\n", patch->mod->name);
> +	klp_init_transition(patch, KLP_UNIVERSE_OLD);
>  
>  	for (obj = patch->objs; obj->funcs; obj++) {
>  		klp_find_object_module(obj);
> @@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  			continue;
>  
>  		ret = klp_patch_object(obj);
> -		if (ret)
> -			goto unregister;
> +		if (ret) {
> +			pr_warn("failed to enable patch '%s'\n",
> +				patch->mod->name);
> +
> +			klp_unpatch_objects(patch);
> +			klp_complete_transition();
> +
> +			return ret;
> +		}
>  	}
>  
> +	klp_start_transition(KLP_UNIVERSE_NEW);
> +
> +	klp_try_complete_transition();
> +
>  	patch->enabled = 1;
>  
>  	return 0;
> -
> -unregister:
> -	WARN_ON(__klp_disable_patch(patch));
> -	return ret;
>  }
>  
>  /**
> @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
>   * /sys/kernel/livepatch
>   * /sys/kernel/livepatch/<patch>
>   * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/transition
>   * /sys/kernel/livepatch/<patch>/<object>
>   * /sys/kernel/livepatch/<patch>/<object>/<func>
>   */
> @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  		goto err;
>  	}
>  
> -	if (val) {
> +	if (klp_transition_patch == patch) {
> +		klp_reverse_transition();
> +	} else if (val) {
>  		ret = __klp_enable_patch(patch);
>  		if (ret)
>  			goto err;
> @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
>  	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
>  }
>  
> +static ssize_t transition_show(struct kobject *kobj,
> +			       struct kobj_attribute *attr, char *buf)
> +{
> +	struct klp_patch *patch;
> +
> +	patch = container_of(kobj, struct klp_patch, kobj);
> +	return snprintf(buf, PAGE_SIZE-1, "%d\n",
> +			klp_transition_patch == patch);
> +}
> +
>  static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
>  static struct attribute *klp_patch_attrs[] = {
>  	&enabled_kobj_attr.attr,
> +	&transition_kobj_attr.attr,
>  	NULL
>  };
>  
> @@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
>  {
>  	INIT_LIST_HEAD(&func->stack_node);
>  	func->patched = 0;
> +	func->transition = 0;
>  
>  	return kobject_init_and_add(&func->kobj, &klp_ktype_func,
>  				    obj->kobj, func->old_name);
> @@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
>  	if (ret)
>  		goto err;
>  
> -	if (!patch->enabled)
> +	if (!patch->enabled && klp_transition_patch != patch)
>  		return;
>  
>  	pr_notice("applying patch '%s' to loading module '%s'\n",
> @@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
>  	struct module *pmod = patch->mod;
>  	struct module *mod = obj->mod;
>  
> -	if (!patch->enabled)
> +	if (!patch->enabled && klp_transition_patch != patch)
>  		goto free;
>  
>  	pr_notice("reverting patch '%s' on unloading module '%s'\n",
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 281fbca..f12256b 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -24,6 +24,7 @@
>  #include <linux/slab.h>
>  
>  #include "patch.h"
> +#include "transition.h"
>  
>  static LIST_HEAD(klp_ops);
>  
> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  	ops = container_of(fops, struct klp_ops, fops);
>  
>  	rcu_read_lock();
> +
>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>  				      stack_node);
> -	rcu_read_unlock();
>  
>  	if (WARN_ON_ONCE(!func))
> -		return;
> +		goto unlock;
> +
> +	if (unlikely(func->transition)) {
> +		/* corresponding smp_wmb() is in klp_init_transition() */
> +		smp_rmb();
> +
> +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> +			/*
> +			 * Use the previously patched version of the function.
> +			 * If no previous patches exist, use the original
> +			 * function.
> +			 */
> +			func = list_entry_rcu(func->stack_node.next,
> +					      struct klp_func, stack_node);
> +
> +			if (&func->stack_node == &ops->func_stack)
> +				goto unlock;
> +		}
> +	}
>  
>  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> +	rcu_read_unlock();
>  }
>  
>  struct klp_ops *klp_find_ops(unsigned long old_addr)
> @@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)
>  
>  	return 0;
>  }
> +
> +void klp_unpatch_objects(struct klp_patch *patch)
> +{
> +	struct klp_object *obj;
> +
> +	for (obj = patch->objs; obj->funcs; obj++)
> +		if (obj->patched)
> +			klp_unpatch_object(obj);
> +}
> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> index bb34bd3..1648259 100644
> --- a/kernel/livepatch/patch.h
> +++ b/kernel/livepatch/patch.h
> @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
>  
>  extern int klp_patch_object(struct klp_object *obj);
>  extern void klp_unpatch_object(struct klp_object *obj);
> +extern void klp_unpatch_objects(struct klp_patch *patch);
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 0000000..2630296
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> @@ -0,0 +1,300 @@
> +/*
> + * transition.c - Kernel Live Patching transition functions
> + *
> + * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/cpu.h>
> +#include <asm/stacktrace.h>
> +#include "../sched/sched.h"
> +
> +#include "patch.h"
> +#include "transition.h"
> +
> +static void klp_transition_work_fn(struct work_struct *);
> +static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
> +
> +struct klp_patch *klp_transition_patch;
> +
> +int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
> +
> +static void klp_set_universe_goal(int universe)
> +{
> +	klp_universe_goal = universe;
> +
> +	/* corresponding smp_rmb() is in klp_update_task_universe() */
> +	smp_wmb();
> +}
> +
> +/*
> + * The transition to the universe goal is complete.  Clean up the data
> + * structures.
> + */
> +void klp_complete_transition(void)
> +{
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +
> +	for (obj = klp_transition_patch->objs; obj->funcs; obj++)
> +		for (func = obj->funcs; func->old_name; func++)
> +			func->transition = 0;
> +
> +	klp_transition_patch = NULL;
> +}
> +
> +static int klp_stacktrace_address_verify_func(struct klp_func *func,
> +					      unsigned long address)
> +{
> +	unsigned long func_addr, func_size;
> +
> +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> +		 /* check the to-be-unpatched function (the func itself) */
> +		func_addr = (unsigned long)func->new_func;
> +		func_size = func->new_size;
> +	} else {
> +		/* check the to-be-patched function (previous func) */
> +		struct klp_ops *ops;
> +
> +		ops = klp_find_ops(func->old_addr);
> +
> +		if (list_is_singular(&ops->func_stack)) {
> +			/* original function */
> +			func_addr = func->old_addr;
> +			func_size = func->old_size;
> +		} else {
> +			/* previously patched function */
> +			struct klp_func *prev;
> +
> +			prev = list_next_entry(func, stack_node);
> +			func_addr = (unsigned long)prev->new_func;
> +			func_size = prev->new_size;
> +		}
> +	}
> +
> +	if (address >= func_addr && address < func_addr + func_size)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +/*
> + * Determine whether the given return address on the stack is within a
> + * to-be-patched or to-be-unpatched function.
> + */
> +static void klp_stacktrace_address_verify(void *data, unsigned long address,
> +					  int reliable)
> +{
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +	int *ret = data;
> +
> +	if (*ret)
> +		return;
> +
> +	for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
> +		if (!obj->patched)
> +			continue;
> +		for (func = obj->funcs; func->old_name; func++) {
> +			if (klp_stacktrace_address_verify_func(func, address)) {
> +				*ret = -1;
> +				return;
> +			}
> +		}
> +	}
> +}
> +
> +static int klp_stacktrace_stack(void *data, char *name)
> +{
> +	return 0;
> +}
> +
> +static const struct stacktrace_ops klp_stacktrace_ops = {
> +	.address = klp_stacktrace_address_verify,
> +	.stack = klp_stacktrace_stack,
> +	.walk_stack = print_context_stack_bp,
> +};
> +
> +/*
> + * Try to safely transition a task to the universe goal.  If the task is
> + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> + * function, return false.
> + */
> +static bool klp_transition_task(struct task_struct *t)
> +{
> +	struct rq *rq;
> +	unsigned long flags;
> +	int ret;
> +	bool success = false;
> +
> +	if (t->klp_universe == klp_universe_goal)
> +		return true;
> +
> +	rq = task_rq_lock(t, &flags);
> +
> +	if (task_running(rq, t) && t != current) {
> +		pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> +			 t->comm);
> +		goto done;
> +	}
> +
> +	ret = 0;
> +	dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
> +	if (ret) {
> +		pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
> +			 __func__, t->pid, t->comm);
> +		goto done;
> +	}
> +
> +	klp_update_task_universe(t);
> +
> +	success = true;
> +done:
> +	task_rq_unlock(rq, t, &flags);
> +	return success;
> +}
> +
> +/*
> + * Try to transition all tasks to the universe goal.  If any tasks are still
> + * stuck in the original universe, schedule a retry.
> + */
> +void klp_try_complete_transition(void)
> +{
> +	unsigned int cpu;
> +	struct task_struct *g, *t;
> +	bool complete = true;
> +
> +	/* try to transition all normal tasks */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(g, t)
> +		if (!klp_transition_task(t))
> +			complete = false;
> +	read_unlock(&tasklist_lock);
> +
> +	/* try to transition the idle "swapper" tasks */
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		if (!klp_transition_task(idle_task(cpu)))
> +			complete = false;
> +	put_online_cpus();
> +
> +	/* if not complete, try again later */
> +	if (!complete) {
> +		schedule_delayed_work(&klp_transition_work,
> +				      round_jiffies_relative(HZ));
> +		return;
> +	}
> +
> +	/* success! unpatch obsolete functions and do some cleanup */
> +
> +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> +		klp_unpatch_objects(klp_transition_patch);
> +
> +		/* prevent ftrace handler from reading old func->transition */
> +		synchronize_rcu();
> +	}
> +
> +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> +							  "unpatching");
> +
> +	klp_complete_transition();
> +}
> +
> +static void klp_transition_work_fn(struct work_struct *work)
> +{
> +	mutex_lock(&klp_mutex);
> +
> +	if (klp_transition_patch)
> +		klp_try_complete_transition();
> +
> +	mutex_unlock(&klp_mutex);
> +}
> +
> +/*
> + * Start the transition to the specified universe so tasks can begin switching
> + * to it.
> + */
> +void klp_start_transition(int universe)
> +{
> +	if (WARN_ON(klp_universe_goal == universe))
> +		return;
> +
> +	pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> +		  universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
> +
> +	klp_set_universe_goal(universe);
> +}
> +
> +/*
> + * Can be called in the middle of an existing transition to reverse the
> + * direction of the universe goal.  This can be done to effectively cancel an
> + * existing enable or disable operation if there are any tasks which are stuck
> + * in the original universe.
> + */
> +void klp_reverse_transition(void)
> +{
> +	struct klp_patch *patch = klp_transition_patch;
> +
> +	klp_start_transition(!klp_universe_goal);
> +	klp_try_complete_transition();
> +
> +	patch->enabled = !patch->enabled;
> +}
> +
> +/*
> + * Reset the universe goal and all tasks to the starting universe, and set all
> + * func->transition's to 1 to prepare for patching.
> + */
> +void klp_init_transition(struct klp_patch *patch, int universe)
> +{
> +	struct task_struct *g, *t;
> +	unsigned int cpu;
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +
> +	klp_transition_patch = patch;
> +
> +	/*
> +	 * If the previous transition was in the opposite direction, we may
> +	 * already be in the requested initial universe.
> +	 */
> +	if (klp_universe_goal == universe)
> +		goto init_funcs;
> +
> +	klp_set_universe_goal(universe);
> +
> +	/* init all normal task universes */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(g, t)
> +		klp_update_task_universe(t);
> +	read_unlock(&tasklist_lock);
> +
> +	/* init all idle "swapper" task universes */
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		klp_update_task_universe(idle_task(cpu));
> +	put_online_cpus();
> +
> +init_funcs:
> +	/* corresponding smp_rmb() is in klp_ftrace_handler() */
> +	smp_wmb();
> +
> +	for (obj = patch->objs; obj->funcs; obj++)
> +		for (func = obj->funcs; func->old_name; func++)
> +			func->transition = 1;
> +}
> diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> new file mode 100644
> index 0000000..ba9a55c
> --- /dev/null
> +++ b/kernel/livepatch/transition.h
> @@ -0,0 +1,16 @@
> +#include <linux/livepatch.h>
> +
> +enum {
> +	KLP_UNIVERSE_UNDEFINED = -1,
> +	KLP_UNIVERSE_OLD,
> +	KLP_UNIVERSE_NEW,
> +};
> +
> +extern struct mutex klp_mutex;
> +extern struct klp_patch *klp_transition_patch;
> +
> +extern void klp_init_transition(struct klp_patch *patch, int universe);
> +extern void klp_start_transition(int universe);
> +extern void klp_reverse_transition(void);
> +extern void klp_try_complete_transition(void);
> +extern void klp_complete_transition(void);
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d91e6..7b877f4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -74,6 +74,7 @@
>  #include <linux/binfmts.h>
>  #include <linux/context_tracking.h>
>  #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>  
>  #include <asm/switch_to.h>
>  #include <asm/tlb.h>
> @@ -4601,6 +4602,7 @@ void init_idle(struct task_struct *idle, int cpu)
>  #if defined(CONFIG_SMP)
>  	sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
>  #endif
> +	klp_update_task_universe(idle);
>  }
>  
>  int cpuset_cpumask_can_shrink(const struct cpumask *cur,
> -- 
> 2.1.0
> 

-- 
Josh

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-11 20:19     ` Josh Poimboeuf
@ 2015-02-12 10:45       ` Miroslav Benes
  0 siblings, 0 replies; 106+ messages in thread
From: Miroslav Benes @ 2015-02-12 10:45 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Wed, 11 Feb 2015, Josh Poimboeuf wrote:

> On Wed, Feb 11, 2015 at 11:21:51AM +0100, Miroslav Benes wrote:
> > 
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > 
> > [...]
> > 
> > > @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > >  	ops = container_of(fops, struct klp_ops, fops);
> > >  
> > >  	rcu_read_lock();
> > > +
> > >  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > >  				      stack_node);
> > > -	rcu_read_unlock();
> > >  
> > >  	if (WARN_ON_ONCE(!func))
> > > -		return;
> > > +		goto unlock;
> > > +
> > > +	if (unlikely(func->transition)) {
> > > +		/* corresponding smp_wmb() is in klp_init_transition() */
> > > +		smp_rmb();
> > > +
> > > +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> > > +			/*
> > > +			 * Use the previously patched version of the function.
> > > +			 * If no previous patches exist, use the original
> > > +			 * function.
> > > +			 */
> > > +			func = list_entry_rcu(func->stack_node.next,
> > > +					      struct klp_func, stack_node);
> > > +
> > > +			if (&func->stack_node == &ops->func_stack)
> > > +				goto unlock;
> > > +		}
> > > +	}
> > >  
> > >  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> > > +unlock:
> > > +	rcu_read_unlock();
> > >  }
> > 
> > I decided to understand the code more before answering the email about the 
> > race and found another problem. I think.
> > 
> > Imagine we patched some function foo() with foo_1() from patch_1 and now 
> > we'd like to patch it again with foo_2() in patch_2. __klp_enable_patch 
> > calls klp_init_transition which sets klp_universe for all processes to 
> > KLP_UNIVERSE_OLD and marks the foo_2() for transition (it is gonna be 1). 
> > Then __klp_enable_patch adds foo_2() to the RCU-protected list for foo(). 
> > BUT what if somebody calls foo() right between klp_init_transition and 
> > the loop in __klp_enable_patch? The ftrace handler first returns the 
> > first entry in the list which is foo_1() (foo_2() is still not present), 
> > then it checks for func->transition. It is 1.
> 
> No, actually foo_1()'s func->transition will be 0.  Only foo_2()'s
> func->transition will be 1.

Ah, you're right in both cases. Sorry for the noise.

Miroslav

> 
> > It checks for 
> > current->klp_universe which is KLP_UNIVERSE_OLD and so the next entry is 
> > retrieved. There is no such and therefore foo() is called. This is 
> > obviously wrong because foo_1() was expected.
> > 
> > Everything would work fine if one would call foo() before 
> > klp_start_transition and after the loop in __klp_enable_patch. The 
> > solution might be to move the setting of func->transition to 
> > klp_start_transition, but this could break something different. I don't 
> > know yet.
> > 
> > Am I wrong?
> > 
> > Miroslav
> 
> -- 
> Josh
> 

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12  3:21   ` Josh Poimboeuf
@ 2015-02-12 11:56     ` Peter Zijlstra
  2015-02-12 12:25       ` Jiri Kosina
  2015-02-12 12:51       ` Josh Poimboeuf
  2015-02-12 13:26     ` Jiri Slaby
  1 sibling, 2 replies; 106+ messages in thread
From: Peter Zijlstra @ 2015-02-12 11:56 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Ingo Molnar, Masami Hiramatsu, live-patching, linux-kernel,
	Seth Jennings, Jiri Kosina, Vojtech Pavlik

On Wed, Feb 11, 2015 at 09:21:21PM -0600, Josh Poimboeuf wrote:
> Ingo, Peter,
> 
> Would you have any objections to making task_rq_lock/unlock() non-static
> (or moving them to kernel/sched/sched.h) so they can be called by the
> livepatch code?

Basically yes. I really don't want to expose that. And
kernel/sched/sched.h is very much not intended for use outside of
kernel/sched/ so even that is a no go.

> To provide some background, I'm looking for a way to temporarily prevent
> a sleeping task from running while its stack is examined, to decide
> whether it can be safely switched to the new patching "universe".  For
> more details see klp_transition_task() in the patch below.
> 
> Using task_rq_lock() is the most straightforward way I could find to
> achieve that.

It's not at all clear to me how all this would work. And I'm not
motivated enough to go try and reverse engineer your patch; IMO
livepatching is utter fail.

If your infrastructure relies on the uptime of a single machine you've
lost already.

FWIW, the barriers in klp_update_task_universe() and
klp_set_universe_goal() look like complete crack, and their comments are
seriously deficient.


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 11:56     ` Peter Zijlstra
@ 2015-02-12 12:25       ` Jiri Kosina
  2015-02-12 12:36         ` Peter Zijlstra
  2015-02-12 12:39         ` Peter Zijlstra
  2015-02-12 12:51       ` Josh Poimboeuf
  1 sibling, 2 replies; 106+ messages in thread
From: Jiri Kosina @ 2015-02-12 12:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> It's not at all clear to me how all this would work. And I'm not
> motivated enough to go try and reverse engineer your patch; IMO
> livepatching is utter fail.
> 
> If your infrastructure relies on the uptime of a single machine you've
> lost already.

Well, the indisputable fact is that there is demand for this. It's 
not about one machine, it's about scheduling downtimes of datacentres.

But if this needs to be discussed, it should be done outside of this 
thread I guess.

> FWIW, the barriers in klp_update_task_universe() and 
> klp_set_universe_goal() look like complete crack, and their comments are 
> seriously deficient.

These particular barriers seem correct to me; you basically need to make 
sure that whenever a thread with TIF_KLP_NEED_UPDATE set goes through 
do_notify_resume(), it sees the proper universe number to convert to.

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 12:25       ` Jiri Kosina
@ 2015-02-12 12:36         ` Peter Zijlstra
  2015-02-12 12:39           ` Jiri Kosina
  2015-02-12 12:39         ` Peter Zijlstra
  1 sibling, 1 reply; 106+ messages in thread
From: Peter Zijlstra @ 2015-02-12 12:36 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, Feb 12, 2015 at 01:25:14PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > FWIW, the barriers in klp_update_task_universe() and 
> > klp_set_universe_goal() look like complete crack, and their comments are 
> > seriously deficient.
> 
> These particular barriers seem correct to me; you basically need to make 
> sure that whenever a thread with TIF_KLP_NEED_UPDATE set goes through 
> do_notify_resume(), it sees the proper universe number to convert to.

I'm not seeing how they're going to help with that.

The comment should describe the data race and how the barriers are
making it not happen.

Putting a wmb after a store and an rmb before a read doesn't prevent the
reader from seeing the old value in any universe I know of.

Barriers are about order, you need two consecutive stores for a wmb to
make sense, and two consecutive reads for an rmb, and if they're paired
the stores and reads need to be to the same addresses.

Without that they're pointless.

The comment doesn't describe which two variables are ordered how.
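
For reference, the canonical pairing is something like:

	CPU 0			CPU 1
	x = 1;			r1 = y;
	smp_wmb();		smp_rmb();
	y = 1;			r2 = x;

	/* if r1 == 1, then r2 == 1 is guaranteed */

Two stores on one side, two loads on the other, on the same two
variables.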


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 12:25       ` Jiri Kosina
  2015-02-12 12:36         ` Peter Zijlstra
@ 2015-02-12 12:39         ` Peter Zijlstra
  2015-02-12 12:42           ` Jiri Kosina
  1 sibling, 1 reply; 106+ messages in thread
From: Peter Zijlstra @ 2015-02-12 12:39 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, Feb 12, 2015 at 01:25:14PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
> 
> > It's not at all clear to me how all this would work. And I'm not
> > motivated enough to go try and reverse engineer your patch; IMO
> > livepatching is utter fail.
> > 
> > If your infrastructure relies on the uptime of a single machine you've
> > lost already.
> 
> Well, the indisputable fact is that there is demand for this. It's 
> not about one machine, it's about scheduling downtimes of datacentres.

The changelog says:

 > ... A patch can remain in the
 > transition state indefinitely, if any of the tasks are stuck in the
 > previous universe.

Therefore there is no scheduling anything. Without timeliness guarantees
you can't make a schedule.

Might as well just reboot, at least that's fairly well guaranteed to
happen.


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 12:36         ` Peter Zijlstra
@ 2015-02-12 12:39           ` Jiri Kosina
  0 siblings, 0 replies; 106+ messages in thread
From: Jiri Kosina @ 2015-02-12 12:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > > FWIW, the barriers in klp_update_task_universe() and 
> > > klp_set_universe_goal() look like complete crack, and their comments are 
> > > seriously deficient.
> > 
> > These particular barriers seem correct to me; you basically need to make 
> > sure that whenever a thread with TIF_KLP_NEED_UPDATE set goes through 
> > do_notify_resume(), it sees the proper universe number to convert to.
> 
> I'm not seeing how they're going to help with that.
> 
> The comment should describe the data race and how the barriers are
> making it not happen.
> 
> putting wmb after a store and rmb before a read doesn't avoid the reader
> seeing the old value in any universe I know of.

This is about the dependency between klp_universe_goal and 
TIF_KLP_NEED_UPDATE in the threadinfo flags.

What is confusing here is that threadinfo flags are not set in 
klp_set_universe_goal() directly, but in the caller 
(klp_start_transition()).
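
Schematically, the intended pairing is (a sketch of the idea, not the 
literal patch code):

	/* writer, in klp_start_transition(): */
	klp_universe_goal = universe;			/* store A */
	smp_wmb();					/* order A before B */
	set_tsk_thread_flag(task, TIF_KLP_NEED_UPDATE);	/* store B */

	/* reader, on the do_notify_resume() path: */
	if (test_and_clear_thread_flag(TIF_KLP_NEED_UPDATE)) {	/* load B */
		smp_rmb();				/* order B before A */
		current->klp_universe = klp_universe_goal;	/* load A */
	}

If the reader sees the flag (B), the barriers guarantee it also sees the 
goal (A) that was written before it.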

I fully agree with you that this deserves better comment though.

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 12:39         ` Peter Zijlstra
@ 2015-02-12 12:42           ` Jiri Kosina
  2015-02-12 13:01             ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Jiri Kosina @ 2015-02-12 12:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > Well, the indisputable fact is that there is demand for this. It's 
> > not about one machine, it's about scheduling downtimes of datacentres.
> 
> The changelog says:
> 
>  > ... A patch can remain in the
>  > transition state indefinitely, if any of the tasks are stuck in the
>  > previous universe.
> 
> Therefore there is no scheduling anything. Without timeliness guarantees
> you can't make a schedule.
> 
> Might as well just reboot, at least that's fairly well guaranteed to
> happen.

All running (reasonably alive) tasks will be running patched code though. 

You can't just claim complete victory (and get ready for accepting another 
patch, etc) if there is a long-time sleeper that hasn't been converted 
yet.

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 11:56     ` Peter Zijlstra
  2015-02-12 12:25       ` Jiri Kosina
@ 2015-02-12 12:51       ` Josh Poimboeuf
  2015-02-12 13:08         ` Peter Zijlstra
  1 sibling, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-12 12:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, live-patching, linux-kernel,
	Seth Jennings, Jiri Kosina, Vojtech Pavlik

On Thu, Feb 12, 2015 at 12:56:28PM +0100, Peter Zijlstra wrote:
> On Wed, Feb 11, 2015 at 09:21:21PM -0600, Josh Poimboeuf wrote:
> > Ingo, Peter,
> > 
> > Would you have any objections to making task_rq_lock/unlock() non-static
> > (or moving them to kernel/sched/sched.h) so they can be called by the
> > livepatch code?
> 
> Basically yes. I really don't want to expose that. And
> kernel/sched/sched.h is very much not intended for use outside of
> kernel/sched/ so even that is a no go.
> 
> > To provide some background, I'm looking for a way to temporarily prevent
> > a sleeping task from running while its stack is examined, to decide
> > whether it can be safely switched to the new patching "universe".  For
> > more details see klp_transition_task() in the patch below.
> > 
> > Using task_rq_lock() is the most straightforward way I could find to
> > achieve that.
> 
> It's not at all clear to me how all this would work. And I'm not
> motivated enough to go try and reverse engineer your patch;

The short answer is: I need a way to ensure that a task isn't sleeping
on any of the functions we're trying to patch.  If it's not, then I can
switch the task over to start using new versions of functions.

Obviously, there are many more details than that.  If you have specific
questions I can try to answer them.
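
In rough pseudo-C, the core check looks something like this (a 
simplified sketch only: the locking and error handling are elided, and 
MAX_STACK_ENTRIES and the old_size field are illustrative -- the real 
patch looks up function sizes via kallsyms):

	#define MAX_STACK_ENTRIES 64	/* illustrative */

	static bool klp_stack_is_safe(struct klp_patch *patch,
				      struct task_struct *task)
	{
		unsigned long entries[MAX_STACK_ENTRIES];
		struct stack_trace trace = {
			.entries	= entries,
			.max_entries	= MAX_STACK_ENTRIES,
		};
		struct klp_object *obj;
		struct klp_func *func;
		int i;

		/* the task must be prevented from running here */
		save_stack_trace_tsk(task, &trace);

		for (i = 0; i < trace.nr_entries; i++) {
			unsigned long addr = trace.entries[i];

			for (obj = patch->objs; obj->funcs; obj++)
				for (func = obj->funcs; func->old_name; func++)
					if (addr >= func->old_addr &&
					    addr < func->old_addr + func->old_size)
						return false;	/* sleeping in a patched func */
		}

		return true;	/* safe to switch this task's universe */
	}

The rq lock is what provides the "must be prevented from running" part.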

> IMO livepatching is utter fail.
> 
> If your infrastructure relies on the uptime of a single machine you've
> lost already.

It's not always about uptime.  IMO it's usually more about decoupling
your reboot schedule from your distro's kernel release schedule.

Most users want to plan in advance when they're going to reboot, rather
than being at the mercy of when CVEs and kernel fixes are released.

Rebooting is costly and risky, even (or often especially) for large
systems for which you have to stagger the reboots.  You want to do it at
a time when you're ready for something bad to happen, without having to
also worry about security in the meantime while you're waiting for your
reboot window.

> FWIW, the barriers in klp_update_task_universe() and
> klp_set_universe_goal() look like complete crack, and their comments are
> seriously deficient.

Ok, I'll try to improve the comments for the barriers.

-- 
Josh


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 12:42           ` Jiri Kosina
@ 2015-02-12 13:01             ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-12 13:01 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, Feb 12, 2015 at 01:42:01PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
> 
> > > Well, the indisputable fact is that there is demand for this. It's 
> > > not about one machine, it's about scheduling downtimes of datacentres.
> > 
> > The changelog says:
> > 
> >  > ... A patch can remain in the
> >  > transition state indefinitely, if any of the tasks are stuck in the
> >  > previous universe.
> > 
> > Therefore there is no scheduling anything. Without timeliness guarantees
> > you can't make a schedule.
> > 
> > Might as well just reboot, at least that's fairly well guaranteed to
> > happen.
> 
> All running (reasonably alive) tasks will be running patched code though. 
> 
> You can't just claim complete victory (and get ready for accepting another 
> patch, etc) if there is a long-time sleeper that hasn't been converted 
> yet.

Agreed.  And also we have several strategies for reducing the time
needed to get all tasks to a patched state (see patch 9 of this series
for more details).  The goal is to not leave systems in limbo for more
than a few seconds.

-- 
Josh


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 12:51       ` Josh Poimboeuf
@ 2015-02-12 13:08         ` Peter Zijlstra
  2015-02-12 13:16           ` Jiri Kosina
                             ` (2 more replies)
  0 siblings, 3 replies; 106+ messages in thread
From: Peter Zijlstra @ 2015-02-12 13:08 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Ingo Molnar, Masami Hiramatsu, live-patching, linux-kernel,
	Seth Jennings, Jiri Kosina, Vojtech Pavlik

On Thu, Feb 12, 2015 at 06:51:49AM -0600, Josh Poimboeuf wrote:
> > > To provide some background, I'm looking for a way to temporarily prevent
> > > a sleeping task from running while its stack is examined, to decide
> > > whether it can be safely switched to the new patching "universe".  For
> > > more details see klp_transition_task() in the patch below.
> > > 
> > > Using task_rq_lock() is the most straightforward way I could find to
> > > achieve that.
> > 
> > It's not at all clear to me how all this would work. And I'm not
> > motivated enough to go try and reverse engineer your patch;
> 
> The short answer is: I need a way to ensure that a task isn't sleeping
> on any of the functions we're trying to patch.  If it's not, then I can
> switch the task over to start using new versions of functions.
> 
> Obviously, there are many more details than that.  If you have specific
> questions I can try to answer them.

How can one task run new and another task old functions? Once you patch
any indirect function pointer any task will see the new call.

And what's wrong with using known good spots like the freezer?


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 13:08         ` Peter Zijlstra
@ 2015-02-12 13:16           ` Jiri Kosina
  2015-02-12 14:20             ` Josh Poimboeuf
  2015-02-12 13:16           ` Jiri Slaby
  2015-02-12 14:32           ` Jiri Kosina
  2 siblings, 1 reply; 106+ messages in thread
From: Jiri Kosina @ 2015-02-12 13:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > The short answer is: I need a way to ensure that a task isn't sleeping
> > on any of the functions we're trying to patch.  If it's not, then I can
> > switch the task over to start using new versions of functions.
> > 
> > Obviously, there are many more details than that.  If you have specific
> > questions I can try to answer them.
> 
> How can one task run new and another task old functions? Once you patch
> any indirect function pointer any task will see the new call.

Patched functions are redirected through the ftrace trampoline, and the 
decision about which function (old or new) to redirect to is made there.

Function calls through a pointer always go first to the original 
function, and get redirected from its __fentry__ site.

Once the system is in the fully patched state, the overhead of the 
trampoline is reduced to a minimum (no expensive decision-making to be 
made there, etc).

Sure, you will never get 100% of the unpatched kernel's performance for 
redirected functions; the indirect call through the trampoline will 
always be there (although ftrace with dynamic trampolines really 
minimizes this penalty to a few extra instructions, one extra call and 
one extra ret being the expensive ones).
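
Schematically, the redirection looks like this:

	call foo()  (directly or via a function pointer)
	  -> foo()'s __fentry__ call site
	    -> ftrace trampoline
	      -> klp ftrace handler: pick the top of ops->func_stack,
	         honouring func->transition and task->klp_universe
	      -> klp_arch_set_pc(regs, new_func)
	  -> execution resumes in the chosen function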

> And what's wrong with using known good spots like the freezer?

It has undefined semantics when it comes to what you want to achieve here.

Say for example you have a kernel thread which does something like

while (some_condition) {
	ret = foo();
	...
	try_to_freeze();
	...
}

and you have a livepatch patching foo() and changing its return value 
semantics. Then the freezer doesn't really help.

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 13:08         ` Peter Zijlstra
  2015-02-12 13:16           ` Jiri Kosina
@ 2015-02-12 13:16           ` Jiri Slaby
  2015-02-12 13:35             ` Peter Zijlstra
  2015-02-12 14:32           ` Jiri Kosina
  2 siblings, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-12 13:16 UTC (permalink / raw)
  To: Peter Zijlstra, Josh Poimboeuf
  Cc: Ingo Molnar, Masami Hiramatsu, live-patching, linux-kernel,
	Seth Jennings, Jiri Kosina, Vojtech Pavlik

Hi,

On 02/12/2015, 02:08 PM, Peter Zijlstra wrote:
> How can one task run new and another task old functions?

because this is how it is designed to work in one of the consistency models.

> Once you patch
> any indirect function pointer any task will see the new call.

It does not patch any pointers. Callees' fentrys are "patched" using ftrace.

> And what's wrong with using known good spots like the freezer?

This was already discussed too. Please STA.

thanks,
-- 
js
suse labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12  3:21   ` Josh Poimboeuf
  2015-02-12 11:56     ` Peter Zijlstra
@ 2015-02-12 13:26     ` Jiri Slaby
  2015-02-12 15:48       ` Josh Poimboeuf
  1 sibling, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-12 13:26 UTC (permalink / raw)
  To: Josh Poimboeuf, Ingo Molnar, Peter Zijlstra
  Cc: Masami Hiramatsu, live-patching, linux-kernel, Seth Jennings,
	Jiri Kosina, Vojtech Pavlik

On 02/12/2015, 04:21 AM, Josh Poimboeuf wrote:
> Ingo, Peter,
> 
> Would you have any objections to making task_rq_lock/unlock() non-static
> (or moving them to kernel/sched/sched.h) so they can be called by the
> livepatch code?
> 
> To provide some background, I'm looking for a way to temporarily prevent
> a sleeping task from running while its stack is examined, to decide
> whether it can be safely switched to the new patching "universe".  For
> more details see klp_transition_task() in the patch below.
> 
> Using task_rq_lock() is the most straightforward way I could find to
> achieve that.

Hi, I cannot say whether it is the proper way or not.

But if so, would it make sense to do the opposite: expose an API to walk
through the process's stack and make the decision? Concretely, move
parts of klp_stacktrace_address_verify_func to sched.c or somewhere in
kernel/sched/ and leave task_rq_lock untouched.
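
Something with a shape like this, perhaps (completely hypothetical, 
names made up):

	/*
	 * Walk @task's stack while the scheduler guarantees the task
	 * cannot start running; @fn is called for each stack address.
	 */
	extern int sched_task_stack_walk(struct task_struct *task,
					 int (*fn)(unsigned long addr,
						   void *data),
					 void *data);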

regards,
-- 
js
suse labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 13:16           ` Jiri Slaby
@ 2015-02-12 13:35             ` Peter Zijlstra
  2015-02-12 14:08               ` Jiri Kosina
  2015-02-12 14:20               ` Jiri Slaby
  0 siblings, 2 replies; 106+ messages in thread
From: Peter Zijlstra @ 2015-02-12 13:35 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Jiri Kosina, Vojtech Pavlik

On Thu, Feb 12, 2015 at 02:16:28PM +0100, Jiri Slaby wrote:
> > And what's wrong with using known good spots like the freezer?
> 
> This was already discussed too. Please STA.

WTF is STA? You guys want something from me; I don't have the time, nor
the inclination, to go hunt down whatever dark corner of the interweb
contains your ramblings.

If you can't be arsed to explain things, I certainly cannot be arsed to
consider your request.

So you now have my full NAK on touching the scheduler, have at it, go
deal with someone else.


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 13:35             ` Peter Zijlstra
@ 2015-02-12 14:08               ` Jiri Kosina
  2015-02-12 15:24                 ` Josh Poimboeuf
  2015-02-12 14:20               ` Jiri Slaby
  1 sibling, 1 reply; 106+ messages in thread
From: Jiri Kosina @ 2015-02-12 14:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Slaby, Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu,
	live-patching, linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > > And what's wrong with using known good spots like the freezer?
> > 
> > This was already discussed too. Please STA.
> 
> WTF is STA? You guys want something from me; I don't have the time, nor
> the inclination, to go hunt down whatever dark corner of the interweb
> contains your ramblings.
> 
> If you can't be arsed to explain things, I certainly cannot be arsed to
> consider your request.

I believe I have provided an answer to the freezer question in my 
previous mail, so let's please continue the discussion there if needed.

> So you now have my full NAK on touching the scheduler, have at it, go
> deal with someone else.

I personally am not a big fan of exposing task_rq_lock() publicly 
either. What might be generally useful though (not only for livepatching) 
would be an API that allows for a "safe" stack dump (where "safe" means 
it is guaranteed not to be interfered with by the process waking up in 
the middle of the dump). Does that sound like an even remotely 
acceptable idea to you?

Thanks,

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 13:35             ` Peter Zijlstra
  2015-02-12 14:08               ` Jiri Kosina
@ 2015-02-12 14:20               ` Jiri Slaby
  1 sibling, 0 replies; 106+ messages in thread
From: Jiri Slaby @ 2015-02-12 14:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Jiri Kosina, Vojtech Pavlik

On 02/12/2015, 02:35 PM, Peter Zijlstra wrote:
> On Thu, Feb 12, 2015 at 02:16:28PM +0100, Jiri Slaby wrote:
>>> And what's wrong with using known good spots like the freezer?
>>
>> This was already discussed too. Please STA.
> 
> WTF is STA? You guys want something from me; I don't have the time, nor
> the inclination, to go hunt down whatever dark corner of the interweb
> contains your ramblings.

You definitely do not need STA if you don't want to know the details. I
think repeating the whole thread would not be productive for any of us.

The short answer, which you can read from the above, is: it is not
possible. On top of that, Jiri provided you with a simple example showing
why.

> If you can't be arsed to explain things, I certainly cannot be arsed to
> consider your request.

Please see above.

> So you now have my full NAK on touching the scheduler, have at it, go
> deal with someone else.

Ok, we already got your attitude towards live patching expressed. This
is not the kind of input we were hoping for, though. Could you comment on
the technical aspects and the proposed solutions instead?

thanks,
-- 
js
suse labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 13:16           ` Jiri Kosina
@ 2015-02-12 14:20             ` Josh Poimboeuf
  2015-02-12 14:27               ` Jiri Kosina
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-12 14:20 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, Feb 12, 2015 at 02:16:07PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
> 
> > > The short answer is: I need a way to ensure that a task isn't sleeping
> > > on any of the functions we're trying to patch.  If it's not, then I can
> > > switch the task over to start using new versions of functions.
> > > 
> > > Obviously, there are many more details than that.  If you have specific
> > > questions I can try to answer them.
> > 
> > How can one task run new and another task old functions? Once you patch
> > any indirect function pointer any task will see the new call.
> 
> Patched functions are redirected through the ftrace trampoline, and the 
> decision about which function (old or new) to redirect to is made there.
> 
> Function calls through a pointer always go first to the original 
> function, and get redirected from its __fentry__ site.
> 
> Once the system is in the fully patched state, the overhead of the 
> trampoline is reduced to a minimum (no expensive decision-making to be 
> made there, etc).
> 
> Sure, you will never get 100% of the unpatched kernel's performance for 
> redirected functions; the indirect call through the trampoline will 
> always be there (although ftrace with dynamic trampolines really 
> minimizes this penalty to a few extra instructions, one extra call and 
> one extra ret being the expensive ones).
> 
> > And what's wrong with using known good spots like the freezer?
> 
> It has undefined semantics when it comes to what you want to achieve here.
> 
> Say for example you have a kernel thread which does something like
> 
> while (some_condition) {
> 	ret = foo();
> 	...
> 	try_to_freeze();
> 	...
> }
> 
> and you have a livepatch patching foo() and changing its return value 
> semantics. Then the freezer doesn't really help.

Don't we have the same issue with livepatch?  For example:

while (some_condition) {
	ret = foo();
	...
	schedule(); <-- switch to the new universe while it sleeps
	...
	// use ret in an unexpected way
}

I think it's not really a problem, just something the patch author needs
to be aware of regardless.  It should be part of the checklist.  You
always need to be extremely careful when changing a function's return
semantics.

IIRC, when I looked at the freezer before, the biggest problems I found
were that it's too disruptive to the process, and that not all kthreads
are freezable.  And I don't see anything inherently safer about it
compared to just stack checking.

-- 
Josh


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 14:20             ` Josh Poimboeuf
@ 2015-02-12 14:27               ` Jiri Kosina
  0 siblings, 0 replies; 106+ messages in thread
From: Jiri Kosina @ 2015-02-12 14:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, 12 Feb 2015, Josh Poimboeuf wrote:

> > and you have a livepatch patching foo() and changing its return value 
> > semantics. Then the freezer doesn't really help.
> 
> Don't we have the same issue with livepatch?  For example:
> 
> while (some_condition) {
> 	ret = foo();
> 	...
> 	schedule(); <-- switch to the new universe while it sleeps
> 	...
> 	// use ret in an unexpected way
> }

Well, if ret is changing semantics, the livepatch will also have to patch 
the calling function (so that it handles the new semantics properly), and 
therefore by looking at the stacks you would see that and wouldn't 
migrate the scheduled-out task to the new universe.

> I think it's not really a problem, just something the patch author needs 
> to be aware of regardless.  

Exactly; it's up to the patch author to understand the semantic aspects 
of the patch he is writing, and to make the appropriate consistency model 
choice.

Thanks,

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 13:08         ` Peter Zijlstra
  2015-02-12 13:16           ` Jiri Kosina
  2015-02-12 13:16           ` Jiri Slaby
@ 2015-02-12 14:32           ` Jiri Kosina
  2015-02-18 20:17             ` Ingo Molnar
  2 siblings, 1 reply; 106+ messages in thread
From: Jiri Kosina @ 2015-02-12 14:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> And what's wrong with using known good spots like the freezer?

Quoting Tejun from the thread Jiri Slaby likely had on mind:

"The fact that they may coincide often can be useful as a guideline or 
whatever but I'm completely against just mushing it together when it isn't 
correct.  This kind of things quickly lead to ambiguous situations where 
people are not sure about the specific semantics or guarantees of the 
construct and implement weird voodoo code followed by voodoo fixes.  We 
already had a full round of that with the kernel freezer itself, where 
people thought that the freezer magically makes PM work properly for a 
subsystem.  Let's please not do that again."

The whole thread begins here, in case everything hasn't been covered here 
yet:

	https://lkml.org/lkml/2014/7/2/328

Thanks again for looking into this,

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-10 19:02   ` Jiri Slaby
  2015-02-10 19:57     ` Josh Poimboeuf
@ 2015-02-12 15:22     ` Miroslav Benes
  2015-02-13 12:44       ` Josh Poimboeuf
  2015-02-13 16:04       ` Josh Poimboeuf
  1 sibling, 2 replies; 106+ messages in thread
From: Miroslav Benes @ 2015-02-12 15:22 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, linux-kernel

On Tue, 10 Feb 2015, Jiri Slaby wrote:

> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> ...
> > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> >  
> >  static void klp_kobj_release_patch(struct kobject *kobj)
> >  {
> > -	/*
> > -	 * Once we have a consistency model we'll need to module_put() the
> > -	 * patch module here.  See klp_register_patch() for more details.
> > -	 */
> 
> I deliberately let you write the note in there :). What happens when I
> leave some attribute in /sys open and you remove the module in the meantime?

And if that attribute is <enabled> it can even lead to a deadlock. You 
can try it yourself with the patchset applied and lockdep on: a simple 
series of insmod, disable and rmmod of the patch.
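
Something like this (using the sample patch module from 
samples/livepatch; any patch module will do):

	# insmod livepatch-sample.ko
	# echo 0 > /sys/kernel/livepatch/livepatch_sample/enabled
	# rmmod livepatch_sample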

Just for the sake of completeness...

Miroslav

> 
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -54,6 +54,9 @@ void klp_complete_transition(void)
> >  		for (func = obj->funcs; func->old_name; func++)
> >  			func->transition = 0;
> >  
> > +	if (klp_universe_goal == KLP_UNIVERSE_OLD)
> > +		module_put(klp_transition_patch->mod);
> > +
> >  	klp_transition_patch = NULL;
> >  }


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 14:08               ` Jiri Kosina
@ 2015-02-12 15:24                 ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-12 15:24 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Zijlstra, Jiri Slaby, Ingo Molnar, Masami Hiramatsu,
	live-patching, linux-kernel, Seth Jennings, Vojtech Pavlik

On Thu, Feb 12, 2015 at 03:08:38PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
> I personally am not a big fan of exposing task_rq_lock() publicly 
> either. What might be generally useful though (not only for livepatching) 
> would be an API that allows for a "safe" stack dump (where "safe" means 
> it is guaranteed not to be interfered with by the process waking up in 
> the middle of the dump).

In general, I think a safe stack dump is needed.  A lot of the stack
dumping in the kernel seems dangerous.  For example, it looks like doing
a `cat /proc/pid/stack` while the process is writing the stack could
easily go off into the weeds.

But I don't see how it would help the livepatch case.  What happens if
the process starts running in the to-be-patched function after we call
the "safe" dump_stack() but before switching it to the new universe?

-- 
Josh


* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 13:26     ` Jiri Slaby
@ 2015-02-12 15:48       ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-12 15:48 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Ingo Molnar, Peter Zijlstra, Masami Hiramatsu, live-patching,
	linux-kernel, Seth Jennings, Jiri Kosina, Vojtech Pavlik

On Thu, Feb 12, 2015 at 02:26:42PM +0100, Jiri Slaby wrote:
> On 02/12/2015, 04:21 AM, Josh Poimboeuf wrote:
> > Ingo, Peter,
> > 
> > Would you have any objections to making task_rq_lock/unlock() non-static
> > (or moving them to kernel/sched/sched.h) so they can be called by the
> > livepatch code?
> > 
> > To provide some background, I'm looking for a way to temporarily prevent
> > a sleeping task from running while its stack is examined, to decide
> > whether it can be safely switched to the new patching "universe".  For
> > more details see klp_transition_task() in the patch below.
> > 
> > Using task_rq_lock() is the most straightforward way I could find to
> > achieve that.
> 
> Hi, I cannot say whether it is the proper way or not.
> 
> But if so, would it make sense to do the opposite: expose an API to walk
> through the process's stack and make the decision? Concretely, move
> parts of klp_stacktrace_address_verify_func to sched.c or somewhere in
> kernel/sched/ and leave task_rq_lock untouched.

Yeah, it makes sense in theory.  But I'm not sure how to do that in a
way that prevents races when switching the task's universe.  I think we
need the rq locked for both the stack walk and the universe switch.

In general, I agree it would be good to find a way to keep the rq
locking functions in sched.c.

-- 
Josh


* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
                   ` (11 preceding siblings ...)
  2015-02-10 11:16 ` Masami Hiramatsu
@ 2015-02-13 10:14 ` Jiri Kosina
  2015-02-13 14:19   ` Josh Poimboeuf
  2015-03-10 16:23 ` Josh Poimboeuf
  13 siblings, 1 reply; 106+ messages in thread
From: Jiri Kosina @ 2015-02-13 10:14 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> My biggest concerns and questions related to this patch set are:
> 
> 1) To safely examine the task stacks, the transition code locks each task's rq
>    struct, which requires using the scheduler's internal rq locking functions.
>    It seems to work well, but I'm not sure if there's a cleaner way to safely
>    do stack checking without stop_machine().

How about we take a slightly different approach -- put a probe (or ftrace) 
on __switch_to() during a klp transition period, and examine stacktraces 
for tasks that are just about to start running from there?

The only tasks that would not be covered by this would be purely CPU-bound 
tasks that never schedule. But we are likely in trouble with those anyway, 
because odds are that non-rescheduling CPU-bound tasks are also 
RT-priority tasks running on isolated CPUs, which we will fail to handle 
anyway.

I think Masami used a similar trick in his kpatch-without-stopmachine 
approach.
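
Very roughly (a sketch only; extracting 'next' from the registers is 
arch-specific -- on x86_64 it is __switch_to()'s second argument -- the 
stack-check helper is hypothetical, and the klp_* names are borrowed 
from the RFC with the exact signatures glossed over):

	static int klp_switch_to_pre(struct kprobe *p, struct pt_regs *regs)
	{
		/* x86_64: second argument of __switch_to(prev, next) */
		struct task_struct *next = (struct task_struct *)regs->si;

		if (next->klp_universe != klp_universe_goal &&
		    klp_task_stack_is_safe(next))	/* hypothetical helper */
			klp_update_task_universe(next);

		return 0;
	}

	static struct kprobe klp_switch_to_kp = {
		.symbol_name	= "__switch_to",
		.pre_handler	= klp_switch_to_pre,
	};

register_kprobe() when the transition starts, unregister_kprobe() once 
every task has been converted.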

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 1/9] livepatch: simplify disable error path
  2015-02-09 17:31 ` [RFC PATCH 1/9] livepatch: simplify disable error path Josh Poimboeuf
@ 2015-02-13 12:25   ` Miroslav Benes
  2015-02-18 17:03     ` Petr Mladek
  2015-02-18 20:07   ` Jiri Kosina
  1 sibling, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-13 12:25 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, Linux Kernel Mailing List

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> If registering the function with ftrace has previously succeeded,
> unregistering will almost never fail.  Even if it does, it's not a fatal
> error.  We can still carry on and disable the klp_func from being used
> by removing it from the klp_ops func stack.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>

This makes sense, so

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

I think this patch could be taken independently of the consistency model. 
If no one else has any objection...

Miroslav

> ---
>  kernel/livepatch/core.c | 67 +++++++++++++------------------------------------
>  1 file changed, 17 insertions(+), 50 deletions(-)
> 
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 9adf86b..081df77 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -322,32 +322,20 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
>  }
>  
> -static int klp_disable_func(struct klp_func *func)
> +static void klp_disable_func(struct klp_func *func)
>  {
>  	struct klp_ops *ops;
> -	int ret;
> -
> -	if (WARN_ON(func->state != KLP_ENABLED))
> -		return -EINVAL;
>  
> -	if (WARN_ON(!func->old_addr))
> -		return -EINVAL;
> +	WARN_ON(func->state != KLP_ENABLED);
> +	WARN_ON(!func->old_addr);
>  
>  	ops = klp_find_ops(func->old_addr);
>  	if (WARN_ON(!ops))
> -		return -EINVAL;
> +		return;
>  
>  	if (list_is_singular(&ops->func_stack)) {
> -		ret = unregister_ftrace_function(&ops->fops);
> -		if (ret) {
> -			pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
> -			       func->old_name, ret);
> -			return ret;
> -		}
> -
> -		ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
> -		if (ret)
> -			pr_warn("function unregister succeeded but failed to clear the filter\n");
> +		WARN_ON(unregister_ftrace_function(&ops->fops));
> +		WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
>  
>  		list_del_rcu(&func->stack_node);
>  		list_del(&ops->node);
> @@ -357,8 +345,6 @@ static int klp_disable_func(struct klp_func *func)
>  	}
>  
>  	func->state = KLP_DISABLED;
> -
> -	return 0;
>  }
>  
>  static int klp_enable_func(struct klp_func *func)
> @@ -419,23 +405,15 @@ err:
>  	return ret;
>  }
>  
> -static int klp_disable_object(struct klp_object *obj)
> +static void klp_disable_object(struct klp_object *obj)
>  {
>  	struct klp_func *func;
> -	int ret;
>  
> -	for (func = obj->funcs; func->old_name; func++) {
> -		if (func->state != KLP_ENABLED)
> -			continue;
> -
> -		ret = klp_disable_func(func);
> -		if (ret)
> -			return ret;
> -	}
> +	for (func = obj->funcs; func->old_name; func++)
> +		if (func->state == KLP_ENABLED)
> +			klp_disable_func(func);
>  
>  	obj->state = KLP_DISABLED;
> -
> -	return 0;
>  }
>  
>  static int klp_enable_object(struct klp_object *obj)
> @@ -451,22 +429,19 @@ static int klp_enable_object(struct klp_object *obj)
>  
>  	for (func = obj->funcs; func->old_name; func++) {
>  		ret = klp_enable_func(func);
> -		if (ret)
> -			goto unregister;
> +		if (ret) {
> +			klp_disable_object(obj);
> +			return ret;
> +		}
>  	}
>  	obj->state = KLP_ENABLED;
>  
>  	return 0;
> -
> -unregister:
> -	WARN_ON(klp_disable_object(obj));
> -	return ret;
>  }
>  
>  static int __klp_disable_patch(struct klp_patch *patch)
>  {
>  	struct klp_object *obj;
> -	int ret;
>  
>  	/* enforce stacking: only the last enabled patch can be disabled */
>  	if (!list_is_last(&patch->list, &klp_patches) &&
> @@ -476,12 +451,8 @@ static int __klp_disable_patch(struct klp_patch *patch)
>  	pr_notice("disabling patch '%s'\n", patch->mod->name);
>  
>  	for (obj = patch->objs; obj->funcs; obj++) {
> -		if (obj->state != KLP_ENABLED)
> -			continue;
> -
> -		ret = klp_disable_object(obj);
> -		if (ret)
> -			return ret;
> +		if (obj->state == KLP_ENABLED)
> +			klp_disable_object(obj);
>  	}
>  
>  	patch->state = KLP_DISABLED;
> @@ -931,7 +902,6 @@ static void klp_module_notify_going(struct klp_patch *patch,
>  {
>  	struct module *pmod = patch->mod;
>  	struct module *mod = obj->mod;
> -	int ret;
>  
>  	if (patch->state == KLP_DISABLED)
>  		goto disabled;
> @@ -939,10 +909,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
>  	pr_notice("reverting patch '%s' on unloading module '%s'\n",
>  		  pmod->name, mod->name);
>  
> -	ret = klp_disable_object(obj);
> -	if (ret)
> -		pr_warn("failed to revert patch '%s' on module '%s' (%d)\n",
> -			pmod->name, mod->name, ret);
> +	klp_disable_object(obj);
>  
>  disabled:
>  	klp_free_object_loaded(obj);
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-12 15:22     ` Miroslav Benes
@ 2015-02-13 12:44       ` Josh Poimboeuf
  2015-02-13 16:04       ` Josh Poimboeuf
  1 sibling, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-13 12:44 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Jiri Slaby, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, linux-kernel

On Thu, Feb 12, 2015 at 04:22:24PM +0100, Miroslav Benes wrote:
> On Tue, 10 Feb 2015, Jiri Slaby wrote:
> 
> > On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > ...
> > > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> > >  
> > >  static void klp_kobj_release_patch(struct kobject *kobj)
> > >  {
> > > -	/*
> > > -	 * Once we have a consistency model we'll need to module_put() the
> > > -	 * patch module here.  See klp_register_patch() for more details.
> > > -	 */
> > 
> > I deliberately let you write the note in there :). What happens when I
> > leave some attribute in /sys open and you remove the module in the meantime?
> 
> And if that attribute is <enabled> it can even lead to a deadlock. You 
> can try it yourself with the patchset applied and lockdep on: a simple 
> series of insmod, disable and rmmod of the patch.
> 
> Just for the sake of completeness...

Ouch, thanks.

-- 
Josh


* Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states
  2015-02-09 17:31 ` [RFC PATCH 2/9] livepatch: separate enabled and patched states Josh Poimboeuf
  2015-02-10 16:44   ` Jiri Slaby
@ 2015-02-13 12:57   ` Miroslav Benes
  2015-02-13 14:39     ` Josh Poimboeuf
  1 sibling, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-13 12:57 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> Once we have a consistency model, patches and their objects will be
> enabled and disabled at different times.  For example, when a patch is
> disabled, its loaded objects' funcs can remain registered with ftrace
> indefinitely until the unpatching operation is complete and they're no
> longer in use.
> 
> It's less confusing if we give them different names: patches can be
> enabled or disabled; objects (and their funcs) can be patched or
> unpatched:
> 
> - Enabled means that a patch is logically enabled (but not necessarily
>   fully applied).
> 
> - Patched means that an object's funcs are registered with ftrace and
>   added to the klp_ops func stack.
> 
> Also, since these states are binary, represent them with boolean-type
> variables instead of enums.

They are binary now, but will that also hold in the future? I cannot come 
up with any other possible state of the function right now, but that 
doesn't mean there isn't one. It would be sad to have to go back to enums 
one day :)

Also, would it be useful to expose the patched variable for functions and 
objects in sysfs?

Two small things below...

> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  include/linux/livepatch.h | 15 ++++-----
>  kernel/livepatch/core.c   | 79 +++++++++++++++++++++++------------------------
>  2 files changed, 45 insertions(+), 49 deletions(-)
> 
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 95023fd..22a67d1 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -28,11 +28,6 @@
>  
>  #include <asm/livepatch.h>
>  
> -enum klp_state {
> -	KLP_DISABLED,
> -	KLP_ENABLED
> -};
> -
>  /**
>   * struct klp_func - function structure for live patching
>   * @old_name:	name of the function to be patched
> @@ -42,6 +37,7 @@ enum klp_state {
>   * @kobj:	kobject for sysfs resources
>   * @state:	tracks function-level patch application state
>   * @stack_node:	list node for klp_ops func_stack list
> + * @patched:	the func has been added to the klp_ops list
>   */
>  struct klp_func {
>  	/* external */
> @@ -59,8 +55,8 @@ struct klp_func {
>  
>  	/* internal */
>  	struct kobject kobj;
> -	enum klp_state state;
>  	struct list_head stack_node;
> +	int patched;
>  };

@state remains in the comment above

>  /**
> @@ -90,7 +86,7 @@ struct klp_reloc {
>   * @kobj:	kobject for sysfs resources
>   * @mod:	kernel module associated with the patched object
>   * 		(NULL for vmlinux)
> - * @state:	tracks object-level patch application state
> + * @patched:	the object's funcs have been added to the klp_ops list
>   */
>  struct klp_object {
>  	/* external */
> @@ -101,7 +97,7 @@ struct klp_object {
>  	/* internal */
>  	struct kobject *kobj;
>  	struct module *mod;
> -	enum klp_state state;
> +	int patched;
>  };
>  
>  /**
> @@ -111,6 +107,7 @@ struct klp_object {
>   * @list:	list node for global list of registered patches
>   * @kobj:	kobject for sysfs resources
>   * @state:	tracks patch-level application state
> + * @enabled:	the patch is enabled (but operation may be incomplete)
>   */
>  struct klp_patch {
>  	/* external */
> @@ -120,7 +117,7 @@ struct klp_patch {
>  	/* internal */
>  	struct list_head list;
>  	struct kobject kobj;
> -	enum klp_state state;
> +	int enabled;
>  };

Ditto

Miroslav


* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-13 10:14 ` Jiri Kosina
@ 2015-02-13 14:19   ` Josh Poimboeuf
  2015-02-13 14:22     ` Jiri Kosina
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-13 14:19 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Fri, Feb 13, 2015 at 11:14:01AM +0100, Jiri Kosina wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> 
> > My biggest concerns and questions related to this patch set are:
> > 
> > 1) To safely examine the task stacks, the transition code locks each task's rq
> >    struct, which requires using the scheduler's internal rq locking functions.
> >    It seems to work well, but I'm not sure if there's a cleaner way to safely
> >    do stack checking without stop_machine().
> 
> How about we take a slightly different approach -- put a probe (or ftrace) 
> on __switch_to() during a klp transition period, and examine stacktraces 
> for tasks that are just about to start running from there?
> 
> The only tasks that would not be covered by this would be purely CPU-bound 
> tasks that never schedule. But we are likely in trouble with those anyway, 
> because odds are that non-rescheduling CPU-bound tasks are also 
> RT-priority tasks running on isolated CPUs, which we will fail to handle 
> anyway.
> 
> I think Masami used a similar trick in his kpatch-without-stopmachine 
> approach.

Yeah, that's definitely an option, though I'm really not too crazy about
it.  Hooking into the scheduler is kind of scary and disruptive.  We'd
also have to wake up all the sleeping processes.

-- 
Josh


* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-13 14:19   ` Josh Poimboeuf
@ 2015-02-13 14:22     ` Jiri Kosina
  2015-02-13 14:40       ` Miroslav Benes
  2015-02-13 14:41       ` Josh Poimboeuf
  0 siblings, 2 replies; 106+ messages in thread
From: Jiri Kosina @ 2015-02-13 14:22 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Fri, 13 Feb 2015, Josh Poimboeuf wrote:

> > How about we take a slightly different approach -- put a probe (or ftrace) 
> > on __switch_to() during a klp transition period, and examine stacktraces 
> > for tasks that are just about to start running from there?
> > 
> > The only tasks that would not be covered by this would be purely CPU-bound 
> > tasks that never schedule. But we are likely in trouble with those anyway, 
> > because odds are that non-rescheduling CPU-bound tasks are also 
> > RT-priority tasks running on isolated CPUs, which we will fail to handle 
> > anyway.
> > 
> > I think Masami used a similar trick in his kpatch-without-stopmachine 
> > approach.
> 
> Yeah, that's definitely an option, though I'm really not too crazy about
> it.  Hooking into the scheduler is kind of scary and disruptive.  

This is basically about running a stack check on ->next before 
switching to it, i.e. a read-only operation (admittedly inducing some 
latency, but that's the same as with locking the runqueue), and only 
during the transition phase.

> We'd also have to wake up all the sleeping processes.

Yes, I don't think there is a way around that.

Thanks,

-- 
Jiri Kosina
SUSE Labs


* Re: [RFC PATCH 3/9] livepatch: move patching functions into patch.c
  2015-02-09 17:31 ` [RFC PATCH 3/9] livepatch: move patching functions into patch.c Josh Poimboeuf
  2015-02-10 18:27   ` Jiri Slaby
@ 2015-02-13 14:28   ` Miroslav Benes
  2015-02-13 15:09     ` Josh Poimboeuf
  1 sibling, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-13 14:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> Move functions related to the actual patching of functions and objects
> into a new patch.c file.

I am definitely for splitting the code into several different files; 
otherwise it would soon be unmanageable. However, I don't know if this 
patch is the best possible split. Maybe it is just nitpicking, so let's 
not spend too much time on this :)

Without this patch there are several different groups of functions in 
core.c:
1. infrastructure such as global variables, klp_init and some helper 
   functions
2. (un)registration and initialization of the patch
3. enable/disable with patching/unpatching, ftrace handler
4. sysfs code
5. module notifier
6. relocations

I would move the sysfs code out to a separate file. 

If we decide to move the patching code, I think it would make sense to 
move the enable/disable functions along with it, or perhaps only 
__klp_enable_patch and __klp_disable_patch. It is possible, though, that 
the result would be much worse.

Or we can move some other group of functions...

[...]

> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> new file mode 100644
> index 0000000..bb34bd3
> --- /dev/null
> +++ b/kernel/livepatch/patch.h
> @@ -0,0 +1,25 @@
> +#include <linux/livepatch.h>
> +
> +/**
> + * struct klp_ops - structure for tracking registered ftrace ops structs
> + *
> + * A single ftrace_ops is shared between all enabled replacement functions
> + * (klp_func structs) which have the same old_addr.  This allows the switch
> + * between function versions to happen instantaneously by updating the klp_ops
> + * struct's func_stack list.  The winner is the klp_func at the top of the
> + * func_stack (front of the list).
> + *
> + * @node:	node for the global klp_ops list
> + * @func_stack:	list head for the stack of klp_func's (active func is on top)
> + * @fops:	registered ftrace ops struct
> + */
> +struct klp_ops {
> +	struct list_head node;
> +	struct list_head func_stack;
> +	struct ftrace_ops fops;
> +};
> +
> +struct klp_ops *klp_find_ops(unsigned long old_addr);
> +
> +extern int klp_patch_object(struct klp_object *obj);
> +extern void klp_unpatch_object(struct klp_object *obj);

Is there a reason why klp_find_ops is not declared extern while the other 
two functions are? I think the keyword is redundant, and it is better to 
be consistent.

Regards,
Miroslav


* Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states
  2015-02-13 12:57   ` Miroslav Benes
@ 2015-02-13 14:39     ` Josh Poimboeuf
  2015-02-13 14:46       ` Miroslav Benes
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-13 14:39 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Fri, Feb 13, 2015 at 01:57:38PM +0100, Miroslav Benes wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> 
> > Once we have a consistency model, patches and their objects will be
> > enabled and disabled at different times.  For example, when a patch is
> > disabled, its loaded objects' funcs can remain registered with ftrace
> > indefinitely until the unpatching operation is complete and they're no
> > longer in use.
> > 
> > It's less confusing if we give them different names: patches can be
> > enabled or disabled; objects (and their funcs) can be patched or
> > unpatched:
> > 
> > - Enabled means that a patch is logically enabled (but not necessarily
> >   fully applied).
> > 
> > - Patched means that an object's funcs are registered with ftrace and
> >   added to the klp_ops func stack.
> > 
> > Also, since these states are binary, represent them with boolean-type
> > variables instead of enums.
> 
> They are binary now, but will that also hold in the future? I cannot come 
> up with any other possible state of the function right now, but that 
> doesn't mean there isn't one. It would be sad to have to go back to enums 
> one day :)

I really can't think of any reason why they would become non-binary.
IMO it's more likely we could add more boolean variables, but if that
got out of hand we could just switch to using bit flags.
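
e.g., something like (hypothetical, just to illustrate):

	#define KLP_FUNC_PATCHED	(1 << 0)
	#define KLP_FUNC_TRANSITION	(1 << 1)

	unsigned long flags;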

Either way I don't see a problem with changing them later if we need to.

> Also, would it be useful to expose the patched variable for functions and 
> objects in sysfs?

Not that I know of.  Do you have a use case in mind?  I view "patched"
as an internal variable, corresponding to whether the object or its
functions are registered with ftrace/klp_ops.  It doesn't mean "patched"
in a way that would really make sense to the user, because of the
gradual nature of the patching process.

> 
> Two small things below...

Agreed to both, thanks.

> 
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  include/linux/livepatch.h | 15 ++++-----
> >  kernel/livepatch/core.c   | 79 +++++++++++++++++++++++------------------------
> >  2 files changed, 45 insertions(+), 49 deletions(-)
> > 
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 95023fd..22a67d1 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -28,11 +28,6 @@
> >  
> >  #include <asm/livepatch.h>
> >  
> > -enum klp_state {
> > -	KLP_DISABLED,
> > -	KLP_ENABLED
> > -};
> > -
> >  /**
> >   * struct klp_func - function structure for live patching
> >   * @old_name:	name of the function to be patched
> > @@ -42,6 +37,7 @@ enum klp_state {
> >   * @kobj:	kobject for sysfs resources
> >   * @state:	tracks function-level patch application state
> >   * @stack_node:	list node for klp_ops func_stack list
> > + * @patched:	the func has been added to the klp_ops list
> >   */
> >  struct klp_func {
> >  	/* external */
> > @@ -59,8 +55,8 @@ struct klp_func {
> >  
> >  	/* internal */
> >  	struct kobject kobj;
> > -	enum klp_state state;
> >  	struct list_head stack_node;
> > +	int patched;
> >  };
> 
> @state remains in the comment above
> 
> >  /**
> > @@ -90,7 +86,7 @@ struct klp_reloc {
> >   * @kobj:	kobject for sysfs resources
> >   * @mod:	kernel module associated with the patched object
> >   * 		(NULL for vmlinux)
> > - * @state:	tracks object-level patch application state
> > + * @patched:	the object's funcs have been added to the klp_ops list
> >   */
> >  struct klp_object {
> >  	/* external */
> > @@ -101,7 +97,7 @@ struct klp_object {
> >  	/* internal */
> >  	struct kobject *kobj;
> >  	struct module *mod;
> > -	enum klp_state state;
> > +	int patched;
> >  };
> >  
> >  /**
> > @@ -111,6 +107,7 @@ struct klp_object {
> >   * @list:	list node for global list of registered patches
> >   * @kobj:	kobject for sysfs resources
> >   * @state:	tracks patch-level application state
> > + * @enabled:	the patch is enabled (but operation may be incomplete)
> >   */
> >  struct klp_patch {
> >  	/* external */
> > @@ -120,7 +117,7 @@ struct klp_patch {
> >  	/* internal */
> >  	struct list_head list;
> >  	struct kobject kobj;
> > -	enum klp_state state;
> > +	int enabled;
> >  };
> 
> Ditto
> 
> Miroslav

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-13 14:22     ` Jiri Kosina
@ 2015-02-13 14:40       ` Miroslav Benes
  2015-02-13 14:55         ` Josh Poimboeuf
  2015-02-13 14:41       ` Josh Poimboeuf
  1 sibling, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-13 14:40 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Josh Poimboeuf, Seth Jennings, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Fri, 13 Feb 2015, Jiri Kosina wrote:

> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> 
> > > How about we take a slightly different approach -- put a probe (or ftrace) 
> > > on __switch_to() during a klp transition period, and examine stacktraces 
> > > for tasks that are just about to start running from there?
> > > 
> > > The only tasks that would not be covered by this would be purely CPU-bound 
> > > tasks that never schedule. But we are likely in trouble with those anyway, 
> > > because odds are that non-rescheduling CPU-bound tasks are also 
> > > RT-priority tasks running on isolated CPUs, which we will fail to handle 
> > > anyway.
> > > 
> > > I think Masami used a similar trick in his kpatch-without-stopmachine 
> > > approach.
> > 
> > Yeah, that's definitely an option, though I'm really not too crazy about
> > it.  Hooking into the scheduler is kind of scary and disruptive.  
> 
> This is basically about running a stack check on ->next before 
> switching to it, i.e. a read-only operation (admittedly inducing some 
> latency, but that's the same with locking the runqueue). And only when in 
> the transition phase.
> 
> > We'd also have to wake up all the sleeping processes.
> 
> Yes, I don't think there is a way around that.

I think there are two options for how to do it, if I understand you correctly.

1. we would put a probe on __switch_to and afterwards wake up all the 
   sleeping processes.

2. we would do it in an asynchronous manner. We would put a probe and let 
   the processes wake up on their own. The transition's delayed workqueue 
   would only check whether any process has not yet migrated. Of course, if 
   some process sleeps for a long time, it would take a long time to 
   complete the patching. It would be up to the user to send a signal to 
   wake the process up.

Does that make sense? If yes, I cannot decide which approach is better.
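
For what it's worth, the re-check loop in option 2 is roughly what your
RFC already does with a delayed workqueue (condensed from patch 6/9, just
as a sketch):

	static void klp_transition_work_fn(struct work_struct *work)
	{
		mutex_lock(&klp_mutex);

		/* klp_try_complete_transition() reschedules the delayed
		 * work itself if some tasks have not migrated yet */
		if (klp_transition_patch)
			klp_try_complete_transition();

		mutex_unlock(&klp_mutex);
	}
	static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);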

Miroslav

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-13 14:22     ` Jiri Kosina
  2015-02-13 14:40       ` Miroslav Benes
@ 2015-02-13 14:41       ` Josh Poimboeuf
  2015-02-24 11:27         ` Masami Hiramatsu
  1 sibling, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-13 14:41 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Fri, Feb 13, 2015 at 03:22:15PM +0100, Jiri Kosina wrote:
> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> 
> > > How about we take a slightly different approach -- put a probe (or ftrace) 
> > > on __switch_to() during a klp transition period, and examine stacktraces 
> > > for tasks that are just about to start running from there?
> > > 
> > > The only tasks that would not be covered by this would be purely CPU-bound 
> > > tasks that never schedule. But we are likely in trouble with those anyway, 
> > > because odds are that non-rescheduling CPU-bound tasks are also 
> > > RT-priority tasks running on isolated CPUs, which we will fail to handle 
> > > anyway.
> > > 
> > > I think Masami used a similar trick in his kpatch-without-stopmachine 
> > > approach.
> > 
> > Yeah, that's definitely an option, though I'm really not too crazy about
> > it.  Hooking into the scheduler is kind of scary and disruptive.  
> 
> This is basically about running a stack check on ->next before 
> switching to it, i.e. a read-only operation (admittedly inducing some 
> latency, but that's the same with locking the runqueue). And only when in 
> the transition phase.

Yes, but it would introduce much more latency than locking rq, since
there would be at least some added latency to every schedule() call
during the transition phase.  Locking the rq would only add latency in
those cases where another CPU is trying to do a context switch while
we're holding the lock.

It also seems much more dangerous.  A bug in __switch_to() could easily
do a lot of damage.

> > We'd also have to wake up all the sleeping processes.
> 
> Yes, I don't think there is a way around that.

Actually this patch set is a way around that :-)

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states
  2015-02-13 14:39     ` Josh Poimboeuf
@ 2015-02-13 14:46       ` Miroslav Benes
  0 siblings, 0 replies; 106+ messages in thread
From: Miroslav Benes @ 2015-02-13 14:46 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Fri, 13 Feb 2015, Josh Poimboeuf wrote:

> On Fri, Feb 13, 2015 at 01:57:38PM +0100, Miroslav Benes wrote:
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > Once we have a consistency model, patches and their objects will be
> > > enabled and disabled at different times.  For example, when a patch is
> > > disabled, its loaded objects' funcs can remain registered with ftrace
> > > indefinitely until the unpatching operation is complete and they're no
> > > longer in use.
> > > 
> > > It's less confusing if we give them different names: patches can be
> > > enabled or disabled; objects (and their funcs) can be patched or
> > > unpatched:
> > > 
> > > - Enabled means that a patch is logically enabled (but not necessarily
> > >   fully applied).
> > > 
> > > - Patched means that an object's funcs are registered with ftrace and
> > >   added to the klp_ops func stack.
> > > 
> > > Also, since these states are binary, represent them with boolean-type
> > > variables instead of enums.
> > 
> > They are binary now, but will that also hold in the future? I cannot 
> > come up with any other possible state of the function right now, but 
> > that doesn't mean there isn't one. It would be sad to have to go back to 
> > enums one day :)
> 
> I really can't think of any reason why they would become non-binary.
> IMO it's more likely we could add more boolean variables, but if that
> got out of hand we could just switch to using bit flags.
>
> Either way I don't see a problem with changing them later if we need to.

Agreed. 
 
> > Also, would it be useful to expose the patched variable for functions and 
> > objects in sysfs?
> 
> Not that I know of.  Do you have a use case in mind?  I view "patched"
> as an internal variable, corresponding to whether the object or its
> functions are registered with ftrace/klp_ops.  It doesn't mean "patched"
> in a way that would really make sense to the user, because of the
> gradual nature of the patching process.

The only reasonable use case I could think of is error handling. If 
something bad happens, it could be useful to know which state the functions 
are in (patched/unpatched). Anyway, it is nothing of importance right now 
and we can add it any time later if we decide it is useful.

Miroslav

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-13 14:40       ` Miroslav Benes
@ 2015-02-13 14:55         ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-13 14:55 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Jiri Kosina, Seth Jennings, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Fri, Feb 13, 2015 at 03:40:14PM +0100, Miroslav Benes wrote:
> On Fri, 13 Feb 2015, Jiri Kosina wrote:
> 
> > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > > How about we take a slightly different approach -- put a probe (or ftrace) 
> > > > on __switch_to() during a klp transition period, and examine stacktraces 
> > > > for tasks that are just about to start running from there?
> > > > 
> > > > The only tasks that would not be covered by this would be purely CPU-bound 
> > > > tasks that never schedule. But we are likely in trouble with those anyway, 
> > > > because odds are that non-rescheduling CPU-bound tasks are also 
> > > > RT-priority tasks running on isolated CPUs, which we will fail to handle 
> > > > anyway.
> > > > 
> > > > I think Masami used a similar trick in his kpatch-without-stopmachine 
> > > > approach.
> > > 
> > > Yeah, that's definitely an option, though I'm really not too crazy about
> > > it.  Hooking into the scheduler is kind of scary and disruptive.  
> > 
> > This is basically about running a stack check on ->next before 
> > switching to it, i.e. a read-only operation (admittedly inducing some 
> > latency, but that's the same with locking the runqueue). And only when in 
> > the transition phase.
> > 
> > > We'd also have to wake up all the sleeping processes.
> > 
> > Yes, I don't think there is a way around that.
> 
> I think there are two options for how to do it, if I understand you correctly.
> 
> 1. we would put a probe on __switch_to and afterwards wake up all the 
>    sleeping processes.
> 
> 2. we would do it in an asynchronous manner. We would put a probe and let 
>    the processes wake up on their own. The transition's delayed workqueue 
>    would only check whether any process has not yet migrated. Of course, if 
>    some process sleeps for a long time, it would take a long time to 
>    complete the patching. It would be up to the user to send a signal to 
>    wake the process up.
> 
> Does that make sense? If yes, I cannot decide which approach is better.

Option 2 wouldn't really work for kthreads because you can't signal them
to wake up from user space.  And I really want to avoid having to leave
the system in a partially patched state for a long period of time.

But also option 1 wouldn't necessarily result in the system being
immediately patched, since you could have some CPU-bound tasks.  So some
asynchronous patching is still needed.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 3/9] livepatch: move patching functions into patch.c
  2015-02-13 14:28   ` Miroslav Benes
@ 2015-02-13 15:09     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-13 15:09 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Fri, Feb 13, 2015 at 03:28:28PM +0100, Miroslav Benes wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> 
> > Move functions related to the actual patching of functions and objects
> > into a new patch.c file.
> 
> I am definitely for splitting the code into several different files. 
> Otherwise it would soon become unmanageable. However, I don't know if this 
> patch's split is the best possible one. Maybe it is just nitpicking, so 
> let's not spend too much time on this :)
> 
> Without this patch there are several different groups of functions in 
> core.c:
> 1. infrastructure such as global variables, klp_init and some helper 
>    functions
> 2. (un)registration and initialization of the patch
> 3. enable/disable with patching/unpatching, ftrace handler
> 4. sysfs code
> 5. module notifier
> 6. relocations
> 
> I would move the sysfs code away into a separate file. 

I'm not sure about moving the sysfs code to its own file, mainly because
of enabled_store():

1. It needs the klp_mutex.  It's really nice and clean to keep the
   klp_mutex a static variable in core.c (which I plan on doing in v2 of
   the patch set).

2. It's one of the main entry points into the klp code, along with
   register/unregister and enable/disable.  It makes a lot of sense to
   keep all of those entry points in the same file IMO.

> If we decide to move the patching code, I think it would make sense to 
> move the enable/disable functions along with it. Or perhaps only 
> __klp_enable_patch and __klp_disable_patch. It is possible, though, that 
> the result would be much worse.

I would vote to keep enable/disable in core.c for the same reasons as
stated above for enabled_store().  It's possible that
__klp_enable_patch() and __klp_disable_patch() could be moved elsewhere.
Personally I like them where they are, since they call into both
"transition" functions and "patch" functions.

So, big surprise, I agree with my own code splitting decisions ;-)

> 
> Or we can move some other group of functions...
> 
> [...]
> 
> > diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> > new file mode 100644
> > index 0000000..bb34bd3
> > --- /dev/null
> > +++ b/kernel/livepatch/patch.h
> > @@ -0,0 +1,25 @@
> > +#include <linux/livepatch.h>
> > +
> > +/**
> > + * struct klp_ops - structure for tracking registered ftrace ops structs
> > + *
> > + * A single ftrace_ops is shared between all enabled replacement functions
> > + * (klp_func structs) which have the same old_addr.  This allows the switch
> > + * between function versions to happen instantaneously by updating the klp_ops
> > + * struct's func_stack list.  The winner is the klp_func at the top of the
> > + * func_stack (front of the list).
> > + *
> > + * @node:	node for the global klp_ops list
> > + * @func_stack:	list head for the stack of klp_func's (active func is on top)
> > + * @fops:	registered ftrace ops struct
> > + */
> > +struct klp_ops {
> > +	struct list_head node;
> > +	struct list_head func_stack;
> > +	struct ftrace_ops fops;
> > +};
> > +
> > +struct klp_ops *klp_find_ops(unsigned long old_addr);
> > +
> > +extern int klp_patch_object(struct klp_object *obj);
> > +extern void klp_unpatch_object(struct klp_object *obj);
> 
> Is there a reason why klp_find_ops is not declared extern while the other 
> two functions are? I think the keyword is redundant anyway and it is 
> better to be consistent.

Good catch, thanks.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-12 15:22     ` Miroslav Benes
  2015-02-13 12:44       ` Josh Poimboeuf
@ 2015-02-13 16:04       ` Josh Poimboeuf
  2015-02-13 16:17         ` Miroslav Benes
  1 sibling, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-13 16:04 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Jiri Slaby, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, linux-kernel

On Thu, Feb 12, 2015 at 04:22:24PM +0100, Miroslav Benes wrote:
> On Tue, 10 Feb 2015, Jiri Slaby wrote:
> 
> > On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > ...
> > > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> > >  
> > >  static void klp_kobj_release_patch(struct kobject *kobj)
> > >  {
> > > -	/*
> > > -	 * Once we have a consistency model we'll need to module_put() the
> > > -	 * patch module here.  See klp_register_patch() for more details.
> > > -	 */
> > 
> > I deliberately let you write the note in there :). What happens when I
> > leave some attribute in /sys open and you remove the module in the meantime?
> 
> And if that attribute is <enabled>, it can even lead to a deadlock. You 
> can try it yourself with the patchset applied and lockdep on. A simple 
> series of insmod, disable and rmmod of the patch suffices.
> 
> Just for the sake of completeness...

Hm, even with Jiri Slaby's suggested fix to add the completion to the
unregister path, I still get a lockdep warning.  This looks more insidious,
related to the locking order of a kernfs lock and the klp lock.  I'll need to
look at this some more...
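
For reference, my understanding of the suggested completion fix (the
'finished' field name is mine, and init_completion() would happen at
patch init time):

	static void klp_kobj_release_patch(struct kobject *kobj)
	{
		struct klp_patch *patch;

		patch = container_of(kobj, struct klp_patch, kobj);
		complete(&patch->finished);
	}

	/* and at the end of klp_unregister_patch(): */
	kobject_put(&patch->kobj);

	/* wait until the last sysfs reference is dropped */
	wait_for_completion(&patch->finished);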


[26244.952692] ======================================================
[26244.954469] [ INFO: possible circular locking dependency detected ]
[26244.954469] 3.19.0-rc1+ #99 Tainted: G        W   E K
[26244.954469] -------------------------------------------------------
[26244.954469] rmmod/1270 is trying to acquire lock:
[26244.954469]  (s_active#70){++++.+}, at: [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
[26244.954469] 
[26244.954469] but task is already holding lock:
[26244.954469]  (klp_mutex){+.+.+.}, at: [<ffffffff81130503>] klp_unregister_patch+0x23/0xc0
[26244.954469] 
[26244.954469] which lock already depends on the new lock.
[26244.954469] 
[26244.954469] 
[26244.954469] the existing dependency chain (in reverse order) is:
[26244.954469] 
-> #1 (klp_mutex){+.+.+.}:
[26244.954469]        [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
[26244.954469]        [<ffffffff8184ea5d>] mutex_lock_nested+0x7d/0x430
[26244.954469]        [<ffffffff811303cf>] enabled_store+0x5f/0xf0
[26244.954469]        [<ffffffff8141b98f>] kobj_attr_store+0xf/0x20
[26244.954469]        [<ffffffff812fe759>] sysfs_kf_write+0x49/0x60
[26244.954469]        [<ffffffff812fe050>] kernfs_fop_write+0x140/0x1a0
[26244.954469]        [<ffffffff8126fb1a>] vfs_write+0xba/0x200
[26244.954469]        [<ffffffff8127080c>] SyS_write+0x5c/0xd0
[26244.954469]        [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
[26244.954469] 
-> #0 (s_active#70){++++.+}:
[26244.954469]        [<ffffffff8110c5de>] __lock_acquire+0x1c5e/0x1de0
[26244.954469]        [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
[26244.954469]        [<ffffffff812fbacb>] __kernfs_remove+0x27b/0x390
[26244.954469]        [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
[26244.954469]        [<ffffffff812ff041>] sysfs_remove_dir+0x51/0x90
[26244.954469]        [<ffffffff8141bbc8>] kobject_del+0x18/0x50
[26244.954469]        [<ffffffff8141bc5a>] kobject_release+0x5a/0x1c0
[26244.954469]        [<ffffffff8141bb25>] kobject_put+0x35/0x70
[26244.954469]        [<ffffffff8113056a>] klp_unregister_patch+0x8a/0xc0
[26244.954469]        [<ffffffffa034d0c5>] livepatch_exit+0x25/0xf60 [livepatch_sample]
[26244.954469]        [<ffffffff81155ddf>] SyS_delete_module+0x1cf/0x280
[26244.954469]        [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
[26244.954469] 
[26244.954469] other info that might help us debug this:
[26244.954469] 
[26244.954469]  Possible unsafe locking scenario:
[26244.954469] 
[26244.954469]        CPU0                    CPU1
[26244.954469]        ----                    ----
[26244.954469]   lock(klp_mutex);
[26244.954469]                                lock(s_active#70);
[26244.954469]                                lock(klp_mutex);
[26244.954469]   lock(s_active#70);
[26244.954469] 
[26244.954469]  *** DEADLOCK ***
[26244.954469] 
[26244.954469] 1 lock held by rmmod/1270:
[26244.954469]  #0:  (klp_mutex){+.+.+.}, at: [<ffffffff81130503>] klp_unregister_patch+0x23/0xc0
[26244.954469] 
[26244.954469] stack backtrace:
[26244.954469] CPU: 1 PID: 1270 Comm: rmmod Tainted: G        W   E K 3.19.0-rc1+ #99
[26244.954469] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[26244.954469]  0000000000000000 000000001f4deaad ffff880079877bf8 ffffffff81849fd2
[26244.954469]  0000000000000000 ffffffff82aea9c0 ffff880079877c48 ffffffff8184710b
[26244.954469]  00000000001d6640 ffff880079877ca8 ffff8800788525c0 ffff880078852e90
[26244.954469] Call Trace:
[26244.954469]  [<ffffffff81849fd2>] dump_stack+0x4c/0x65
[26244.954469]  [<ffffffff8184710b>] print_circular_bug+0x202/0x213
[26244.954469]  [<ffffffff8110c5de>] __lock_acquire+0x1c5e/0x1de0
[26244.954469]  [<ffffffff81247b3d>] ? __slab_free+0xbd/0x390
[26244.954469]  [<ffffffff810e8765>] ? sched_clock_local+0x25/0x90
[26244.954469]  [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
[26244.954469]  [<ffffffff812fcb07>] ? kernfs_remove+0x27/0x40
[26244.954469]  [<ffffffff812fbacb>] __kernfs_remove+0x27b/0x390
[26244.954469]  [<ffffffff812fcb07>] ? kernfs_remove+0x27/0x40
[26244.954469]  [<ffffffff811071cf>] ? lock_release_holdtime.part.29+0xf/0x200
[26244.954469]  [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
[26244.954469]  [<ffffffff812ff041>] sysfs_remove_dir+0x51/0x90
[26244.954469]  [<ffffffff8141bbc8>] kobject_del+0x18/0x50
[26244.954469]  [<ffffffff8141bc5a>] kobject_release+0x5a/0x1c0
[26244.954469]  [<ffffffff8141bb25>] kobject_put+0x35/0x70
[26244.954469]  [<ffffffff8113056a>] klp_unregister_patch+0x8a/0xc0
[26244.954469]  [<ffffffffa034d0c5>] livepatch_exit+0x25/0xf60 [livepatch_sample]
[26244.954469]  [<ffffffff81155ddf>] SyS_delete_module+0x1cf/0x280
[26244.954469]  [<ffffffff81428a9b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[26244.954469]  [<ffffffff818541a9>] system_call_fastpath+0x12/0x17


To recreate:

insmod livepatch-sample.ko

# wait for patching to complete

~/a.out &  <-- simple program which opens the "enabled" file in the background

echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled

# wait for unpatch to complete

rmmod livepatch-sample.ko

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-13 16:04       ` Josh Poimboeuf
@ 2015-02-13 16:17         ` Miroslav Benes
  2015-02-13 20:49           ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-13 16:17 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Slaby, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, linux-kernel

On Fri, 13 Feb 2015, Josh Poimboeuf wrote:

> On Thu, Feb 12, 2015 at 04:22:24PM +0100, Miroslav Benes wrote:
> > On Tue, 10 Feb 2015, Jiri Slaby wrote:
> > 
> > > On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > > > --- a/kernel/livepatch/core.c
> > > > +++ b/kernel/livepatch/core.c
> > > ...
> > > > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> > > >  
> > > >  static void klp_kobj_release_patch(struct kobject *kobj)
> > > >  {
> > > > -	/*
> > > > -	 * Once we have a consistency model we'll need to module_put() the
> > > > -	 * patch module here.  See klp_register_patch() for more details.
> > > > -	 */
> > > 
> > > I deliberately let you write the note in there :). What happens when I
> > > leave some attribute in /sys open and you remove the module in the meantime?
> > 
> > And if that attribute is <enabled>, it can even lead to a deadlock. You 
> > can try it yourself with the patchset applied and lockdep on. A simple 
> > series of insmod, disable and rmmod of the patch suffices.
> > 
> > Just for the sake of completeness...
> 
> Hm, even with Jiri Slaby's suggested fix to add the completion to the
> unregister path, I still get a lockdep warning.  This looks more insidious,
> related to the locking order of a kernfs lock and the klp lock.  I'll need to
> look at this some more...

Yes, I was afraid of this. The lockdep warning is a separate bug. It is 
caused by taking klp_mutex in enabled_store(). During rmmod, 
klp_unregister_patch() takes klp_mutex and destroys the sysfs structure. If 
somebody writes to enabled just after unregister takes the mutex and before 
the sysfs removal, they cause the deadlock, because enabled_store() takes 
the "sysfs lock" and then klp_mutex. That is exactly what lockdep tells us 
below.

We can look for inspiration elsewhere. Grepping for s_active through the 
mainline git log turns up several commits which dealt with exactly this. I 
will browse through those...

> [26244.952692] ======================================================
> [26244.954469] [ INFO: possible circular locking dependency detected ]
> [26244.954469] 3.19.0-rc1+ #99 Tainted: G        W   E K
> [26244.954469] -------------------------------------------------------
> [26244.954469] rmmod/1270 is trying to acquire lock:
> [26244.954469]  (s_active#70){++++.+}, at: [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
> [26244.954469] 
> [26244.954469] but task is already holding lock:
> [26244.954469]  (klp_mutex){+.+.+.}, at: [<ffffffff81130503>] klp_unregister_patch+0x23/0xc0
> [26244.954469] 
> [26244.954469] which lock already depends on the new lock.
> [26244.954469] 
> [26244.954469] 
> [26244.954469] the existing dependency chain (in reverse order) is:
> [26244.954469] 
> -> #1 (klp_mutex){+.+.+.}:
> [26244.954469]        [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
> [26244.954469]        [<ffffffff8184ea5d>] mutex_lock_nested+0x7d/0x430
> [26244.954469]        [<ffffffff811303cf>] enabled_store+0x5f/0xf0
> [26244.954469]        [<ffffffff8141b98f>] kobj_attr_store+0xf/0x20
> [26244.954469]        [<ffffffff812fe759>] sysfs_kf_write+0x49/0x60
> [26244.954469]        [<ffffffff812fe050>] kernfs_fop_write+0x140/0x1a0
> [26244.954469]        [<ffffffff8126fb1a>] vfs_write+0xba/0x200
> [26244.954469]        [<ffffffff8127080c>] SyS_write+0x5c/0xd0
> [26244.954469]        [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
> [26244.954469] 
> -> #0 (s_active#70){++++.+}:
> [26244.954469]        [<ffffffff8110c5de>] __lock_acquire+0x1c5e/0x1de0
> [26244.954469]        [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
> [26244.954469]        [<ffffffff812fbacb>] __kernfs_remove+0x27b/0x390
> [26244.954469]        [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
> [26244.954469]        [<ffffffff812ff041>] sysfs_remove_dir+0x51/0x90
> [26244.954469]        [<ffffffff8141bbc8>] kobject_del+0x18/0x50
> [26244.954469]        [<ffffffff8141bc5a>] kobject_release+0x5a/0x1c0
> [26244.954469]        [<ffffffff8141bb25>] kobject_put+0x35/0x70
> [26244.954469]        [<ffffffff8113056a>] klp_unregister_patch+0x8a/0xc0
> [26244.954469]        [<ffffffffa034d0c5>] livepatch_exit+0x25/0xf60 [livepatch_sample]
> [26244.954469]        [<ffffffff81155ddf>] SyS_delete_module+0x1cf/0x280
> [26244.954469]        [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
> [26244.954469] 
> [26244.954469] other info that might help us debug this:
> [26244.954469] 
> [26244.954469]  Possible unsafe locking scenario:
> [26244.954469] 
> [26244.954469]        CPU0                    CPU1
> [26244.954469]        ----                    ----
> [26244.954469]   lock(klp_mutex);
> [26244.954469]                                lock(s_active#70);
> [26244.954469]                                lock(klp_mutex);
> [26244.954469]   lock(s_active#70);
> [26244.954469] 
> [26244.954469]  *** DEADLOCK ***
> [26244.954469] 
> [26244.954469] 1 lock held by rmmod/1270:
> [26244.954469]  #0:  (klp_mutex){+.+.+.}, at: [<ffffffff81130503>] klp_unregister_patch+0x23/0xc0
> [26244.954469] 
> [26244.954469] stack backtrace:
> [26244.954469] CPU: 1 PID: 1270 Comm: rmmod Tainted: G        W   E K 3.19.0-rc1+ #99
> [26244.954469] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
> [26244.954469]  0000000000000000 000000001f4deaad ffff880079877bf8 ffffffff81849fd2
> [26244.954469]  0000000000000000 ffffffff82aea9c0 ffff880079877c48 ffffffff8184710b
> [26244.954469]  00000000001d6640 ffff880079877ca8 ffff8800788525c0 ffff880078852e90
> [26244.954469] Call Trace:
> [26244.954469]  [<ffffffff81849fd2>] dump_stack+0x4c/0x65
> [26244.954469]  [<ffffffff8184710b>] print_circular_bug+0x202/0x213
> [26244.954469]  [<ffffffff8110c5de>] __lock_acquire+0x1c5e/0x1de0
> [26244.954469]  [<ffffffff81247b3d>] ? __slab_free+0xbd/0x390
> [26244.954469]  [<ffffffff810e8765>] ? sched_clock_local+0x25/0x90
> [26244.954469]  [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
> [26244.954469]  [<ffffffff812fcb07>] ? kernfs_remove+0x27/0x40
> [26244.954469]  [<ffffffff812fbacb>] __kernfs_remove+0x27b/0x390
> [26244.954469]  [<ffffffff812fcb07>] ? kernfs_remove+0x27/0x40
> [26244.954469]  [<ffffffff811071cf>] ? lock_release_holdtime.part.29+0xf/0x200
> [26244.954469]  [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
> [26244.954469]  [<ffffffff812ff041>] sysfs_remove_dir+0x51/0x90
> [26244.954469]  [<ffffffff8141bbc8>] kobject_del+0x18/0x50
> [26244.954469]  [<ffffffff8141bc5a>] kobject_release+0x5a/0x1c0
> [26244.954469]  [<ffffffff8141bb25>] kobject_put+0x35/0x70
> [26244.954469]  [<ffffffff8113056a>] klp_unregister_patch+0x8a/0xc0
> [26244.954469]  [<ffffffffa034d0c5>] livepatch_exit+0x25/0xf60 [livepatch_sample]
> [26244.954469]  [<ffffffff81155ddf>] SyS_delete_module+0x1cf/0x280
> [26244.954469]  [<ffffffff81428a9b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [26244.954469]  [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
> 
> 
> To recreate:
> 
> insmod livepatch-sample.ko
> 
> # wait for patching to complete
> 
> ~/a.out &  <-- simple program which opens the "enabled" file in the background

I didn't even need such a program. Lockdep warned me with just insmod, 
echo and rmmod. It is magically clever.

Miroslav

> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
> 
> # wait for unpatch to complete
> 
> rmmod livepatch-sample.ko


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-13 16:17         ` Miroslav Benes
@ 2015-02-13 20:49           ` Josh Poimboeuf
  2015-02-16 16:06             ` Miroslav Benes
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-13 20:49 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Jiri Slaby, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, linux-kernel

On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > unregister path, I still get a lockdep warning.  This looks more insidious,
> > related to the locking order of a kernfs lock and the klp lock.  I'll need to
> > look at this some more...
> 
> Yes, I was afraid of this. The lockdep warning is a separate bug. It is 
> caused by taking klp_mutex in enabled_store(). During rmmod, 
> klp_unregister_patch() takes klp_mutex and destroys the sysfs structure. 
> If somebody writes to enabled just after unregister takes the mutex and 
> before the sysfs removal, they cause the deadlock, because enabled_store() 
> takes the "sysfs lock" and then klp_mutex. That is exactly what lockdep 
> tells us below.
> 
> We can look for inspiration elsewhere. Grepping for s_active through the 
> mainline git log turns up several commits which dealt with exactly this. I 
> will browse through those...

Thanks Miroslav, please let me know what you find.  It wouldn't surprise
me if this were a very common problem.

One option would be to move the enabled_store() work out to a workqueue
or something.
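
A rough sketch of what I mean, with invented 'enabled_work' and
'enabled_target' fields (and with the obvious downside that errors can no
longer be reported back to the writer):

	static void klp_enabled_work_fn(struct work_struct *work)
	{
		struct klp_patch *patch;

		patch = container_of(work, struct klp_patch, enabled_work);

		mutex_lock(&klp_mutex);
		if (patch->enabled_target)
			WARN_ON(__klp_enable_patch(patch));
		else
			WARN_ON(__klp_disable_patch(patch));
		mutex_unlock(&klp_mutex);
	}

enabled_store() would then just record the requested value and call
schedule_work(&patch->enabled_work), so klp_mutex would never be taken
while the kernfs active reference is held.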

> > 
> > To recreate:
> > 
> > insmod livepatch-sample.ko
> > 
> > # wait for patching to complete
> > 
> > ~/a.out &  <-- simple program which opens the "enabled" file in the background
> 
> I didn't even need such a program. Lockdep warned me with just insmod, 
> echo and rmmod. It is magically clever.

Ah, even easier... lockdep is awesome.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
                     ` (4 preceding siblings ...)
  2015-02-12  3:21   ` Josh Poimboeuf
@ 2015-02-14 11:40   ` Jiri Slaby
  2015-02-17 14:59     ` Josh Poimboeuf
  2015-02-16 14:19   ` Miroslav Benes
  6 siblings, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-14 11:40 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Add a basic per-task consistency model.  This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
> 
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe.  If a
> given task isn't using any of the patched functions, it's switched to
> the new universe.  Once all the tasks have been converged to the new
> universe, patching is complete.
> 
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
> 
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition.  Only a single patch (the topmost patch on the stack)
> can be in transition at a given time.  A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
> 
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress.  Then all the tasks will attempt to
> converge back to the original universe.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  include/linux/livepatch.h     |  18 ++-
>  include/linux/sched.h         |   3 +
>  kernel/fork.c                 |   2 +
>  kernel/livepatch/Makefile     |   2 +-
>  kernel/livepatch/core.c       |  71 ++++++----
>  kernel/livepatch/patch.c      |  34 ++++-
>  kernel/livepatch/patch.h      |   1 +
>  kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
>  kernel/livepatch/transition.h |  16 +++
>  kernel/sched/core.c           |   2 +
>  10 files changed, 423 insertions(+), 26 deletions(-)
>  create mode 100644 kernel/livepatch/transition.c
>  create mode 100644 kernel/livepatch/transition.h
> 
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 0e65b4d..b8c2f15 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -40,6 +40,7 @@
>   * @old_size:	size of the old function
>   * @new_size:	size of the new function
>   * @patched:	the func has been added to the klp_ops list
> + * @transition:	the func is currently being applied or reverted
>   */
>  struct klp_func {
>  	/* external */
> @@ -60,6 +61,7 @@ struct klp_func {
>  	struct list_head stack_node;
>  	unsigned long old_size, new_size;
>  	int patched;
> +	int transition;
>  };
>  
>  /**
> @@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
>  extern int klp_enable_patch(struct klp_patch *);
>  extern int klp_disable_patch(struct klp_patch *);
>  
> -#endif /* CONFIG_LIVEPATCH */
> +extern int klp_universe_goal;
> +
> +static inline void klp_update_task_universe(struct task_struct *t)
> +{
> +	/* corresponding smp_wmb() is in klp_set_universe_goal() */
> +	smp_rmb();
> +
> +	t->klp_universe = klp_universe_goal;
> +}
> +
> +#else /* !CONFIG_LIVEPATCH */
> +
> +static inline void klp_update_task_universe(struct task_struct *t) {}
> +
> +#endif /* !CONFIG_LIVEPATCH */
>  
>  #endif /* _LINUX_LIVEPATCH_H_ */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8db31ef..a95e59a 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1701,6 +1701,9 @@ struct task_struct {
>  #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
>  	unsigned long	task_state_change;
>  #endif
> +#ifdef CONFIG_LIVEPATCH
> +	int klp_universe;
> +#endif
>  };
>  
>  /* Future-safe accessor for struct task_struct's cpus_allowed. */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 4dc2dda..1dcbebe 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -74,6 +74,7 @@
>  #include <linux/uprobes.h>
>  #include <linux/aio.h>
>  #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/pgalloc.h>
> @@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>  	total_forks++;
>  	spin_unlock(&current->sighand->siglock);
>  	syscall_tracepoint_update(p);
> +	klp_update_task_universe(p);
>  	write_unlock_irq(&tasklist_lock);
>  
>  	proc_fork_connector(p);
> diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
> index e136dad..2b8bdb1 100644
> --- a/kernel/livepatch/Makefile
> +++ b/kernel/livepatch/Makefile
> @@ -1,3 +1,3 @@
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o
>  
> -livepatch-objs := core.o patch.o
> +livepatch-objs := core.o patch.o transition.o
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 85d4ef7..790dc10 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -28,14 +28,17 @@
>  #include <linux/kallsyms.h>
>  
>  #include "patch.h"
> +#include "transition.h"
>  
>  /*
> - * The klp_mutex protects the global lists and state transitions of any
> - * structure reachable from them.  References to any structure must be obtained
> - * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
> - * ensure it gets consistent data).
> + * The klp_mutex is a coarse lock which serializes access to klp data.  All
> + * accesses to klp-related variables and structures must have mutex protection,
> + * except within the following functions which carefully avoid the need for it:
> + *
> + * - klp_ftrace_handler()
> + * - klp_update_task_universe()
>   */
> -static DEFINE_MUTEX(klp_mutex);
> +DEFINE_MUTEX(klp_mutex);
>  
>  static LIST_HEAD(klp_patches);
>  
> @@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
>  	mutex_unlock(&module_mutex);
>  }
>  
> -/* klp_mutex must be held by caller */
>  static bool klp_is_patch_registered(struct klp_patch *patch)
>  {
>  	struct klp_patch *mypatch;
> @@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,
>  
>  static int __klp_disable_patch(struct klp_patch *patch)
>  {
> -	struct klp_object *obj;
> +	if (klp_transition_patch)
> +		return -EBUSY;
>  
>  	/* enforce stacking: only the last enabled patch can be disabled */
>  	if (!list_is_last(&patch->list, &klp_patches) &&
>  	    list_next_entry(patch, list)->enabled)
>  		return -EBUSY;
>  
> -	pr_notice("disabling patch '%s'\n", patch->mod->name);
> -
> -	for (obj = patch->objs; obj->funcs; obj++)
> -		if (obj->patched)
> -			klp_unpatch_object(obj);
> +	klp_init_transition(patch, KLP_UNIVERSE_NEW);
> +	klp_start_transition(KLP_UNIVERSE_OLD);
> +	klp_try_complete_transition();
>  
>  	patch->enabled = 0;
>  
> @@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	struct klp_object *obj;
>  	int ret;
>  
> +	if (klp_transition_patch)
> +		return -EBUSY;
> +
>  	if (WARN_ON(patch->enabled))
>  		return -EINVAL;
>  
> @@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
>  	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
>  
> -	pr_notice("enabling patch '%s'\n", patch->mod->name);
> +	klp_init_transition(patch, KLP_UNIVERSE_OLD);
>  
>  	for (obj = patch->objs; obj->funcs; obj++) {
>  		klp_find_object_module(obj);
> @@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  			continue;
>  
>  		ret = klp_patch_object(obj);
> -		if (ret)
> -			goto unregister;
> +		if (ret) {
> +			pr_warn("failed to enable patch '%s'\n",
> +				patch->mod->name);
> +
> +			klp_unpatch_objects(patch);
> +			klp_complete_transition();
> +
> +			return ret;
> +		}
>  	}
>  
> +	klp_start_transition(KLP_UNIVERSE_NEW);
> +
> +	klp_try_complete_transition();
> +
>  	patch->enabled = 1;
>  
>  	return 0;
> -
> -unregister:
> -	WARN_ON(__klp_disable_patch(patch));
> -	return ret;
>  }
>  
>  /**
> @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
>   * /sys/kernel/livepatch
>   * /sys/kernel/livepatch/<patch>
>   * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/transition
>   * /sys/kernel/livepatch/<patch>/<object>
>   * /sys/kernel/livepatch/<patch>/<object>/<func>
>   */
> @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  		goto err;
>  	}
>  
> -	if (val) {
> +	if (klp_transition_patch == patch) {
> +		klp_reverse_transition();
> +	} else if (val) {
>  		ret = __klp_enable_patch(patch);
>  		if (ret)
>  			goto err;
> @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
>  	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
>  }
>  
> +static ssize_t transition_show(struct kobject *kobj,
> +			       struct kobj_attribute *attr, char *buf)
> +{
> +	struct klp_patch *patch;
> +
> +	patch = container_of(kobj, struct klp_patch, kobj);
> +	return snprintf(buf, PAGE_SIZE-1, "%d\n",
> +			klp_transition_patch == patch);
> +}
> +
>  static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
>  static struct attribute *klp_patch_attrs[] = {
>  	&enabled_kobj_attr.attr,
> +	&transition_kobj_attr.attr,
>  	NULL
>  };
>  
> @@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
>  {
>  	INIT_LIST_HEAD(&func->stack_node);
>  	func->patched = 0;
> +	func->transition = 0;
>  
>  	return kobject_init_and_add(&func->kobj, &klp_ktype_func,
>  				    obj->kobj, func->old_name);
> @@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
>  	if (ret)
>  		goto err;
>  
> -	if (!patch->enabled)
> +	if (!patch->enabled && klp_transition_patch != patch)
>  		return;
>  
>  	pr_notice("applying patch '%s' to loading module '%s'\n",
> @@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
>  	struct module *pmod = patch->mod;
>  	struct module *mod = obj->mod;
>  
> -	if (!patch->enabled)
> +	if (!patch->enabled && klp_transition_patch != patch)
>  		goto free;
>  
>  	pr_notice("reverting patch '%s' on unloading module '%s'\n",
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 281fbca..f12256b 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -24,6 +24,7 @@
>  #include <linux/slab.h>
>  
>  #include "patch.h"
> +#include "transition.h"
>  
>  static LIST_HEAD(klp_ops);
>  
> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  	ops = container_of(fops, struct klp_ops, fops);
>  
>  	rcu_read_lock();
> +
>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>  				      stack_node);
> -	rcu_read_unlock();
>  
>  	if (WARN_ON_ONCE(!func))
> -		return;
> +		goto unlock;
> +
> +	if (unlikely(func->transition)) {
> +		/* corresponding smp_wmb() is in klp_init_transition() */
> +		smp_rmb();
> +
> +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> +			/*
> +			 * Use the previously patched version of the function.
> +			 * If no previous patches exist, use the original
> +			 * function.
> +			 */
> +			func = list_entry_rcu(func->stack_node.next,
> +					      struct klp_func, stack_node);
> +
> +			if (&func->stack_node == &ops->func_stack)
> +				goto unlock;
> +		}
> +	}
>  
>  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> +	rcu_read_unlock();
>  }
>  
>  struct klp_ops *klp_find_ops(unsigned long old_addr)
> @@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)
>  
>  	return 0;
>  }
> +
> +void klp_unpatch_objects(struct klp_patch *patch)
> +{
> +	struct klp_object *obj;
> +
> +	for (obj = patch->objs; obj->funcs; obj++)
> +		if (obj->patched)
> +			klp_unpatch_object(obj);
> +}
> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> index bb34bd3..1648259 100644
> --- a/kernel/livepatch/patch.h
> +++ b/kernel/livepatch/patch.h
> @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
>  
>  extern int klp_patch_object(struct klp_object *obj);
>  extern void klp_unpatch_object(struct klp_object *obj);
> +extern void klp_unpatch_objects(struct klp_patch *patch);
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 0000000..2630296
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> @@ -0,0 +1,300 @@
> +/*
> + * transition.c - Kernel Live Patching transition functions
> + *
> + * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/cpu.h>
> +#include <asm/stacktrace.h>
> +#include "../sched/sched.h"
> +
> +#include "patch.h"
> +#include "transition.h"
> +
> +static void klp_transition_work_fn(struct work_struct *);
> +static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
> +
> +struct klp_patch *klp_transition_patch;
> +
> +int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
> +
> +static void klp_set_universe_goal(int universe)
> +{
> +	klp_universe_goal = universe;
> +
> +	/* corresponding smp_rmb() is in klp_update_task_universe() */
> +	smp_wmb();
> +}
> +
> +/*
> + * The transition to the universe goal is complete.  Clean up the data
> + * structures.
> + */
> +void klp_complete_transition(void)
> +{
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +
> +	for (obj = klp_transition_patch->objs; obj->funcs; obj++)
> +		for (func = obj->funcs; func->old_name; func++)
> +			func->transition = 0;
> +
> +	klp_transition_patch = NULL;
> +}
> +
> +static int klp_stacktrace_address_verify_func(struct klp_func *func,
> +					      unsigned long address)
> +{
> +	unsigned long func_addr, func_size;
> +
> +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> +		 /* check the to-be-unpatched function (the func itself) */
> +		func_addr = (unsigned long)func->new_func;
> +		func_size = func->new_size;
> +	} else {
> +		/* check the to-be-patched function (previous func) */
> +		struct klp_ops *ops;
> +
> +		ops = klp_find_ops(func->old_addr);
> +
> +		if (list_is_singular(&ops->func_stack)) {
> +			/* original function */
> +			func_addr = func->old_addr;
> +			func_size = func->old_size;
> +		} else {
> +			/* previously patched function */
> +			struct klp_func *prev;
> +
> +			prev = list_next_entry(func, stack_node);
> +			func_addr = (unsigned long)prev->new_func;
> +			func_size = prev->new_size;
> +		}
> +	}
> +
> +	if (address >= func_addr && address < func_addr + func_size)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +/*
> + * Determine whether the given return address on the stack is within a
> + * to-be-patched or to-be-unpatched function.
> + */
> +static void klp_stacktrace_address_verify(void *data, unsigned long address,
> +					  int reliable)
> +{
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +	int *ret = data;
> +
> +	if (*ret)
> +		return;
> +
> +	for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
> +		if (!obj->patched)
> +			continue;
> +		for (func = obj->funcs; func->old_name; func++) {
> +			if (klp_stacktrace_address_verify_func(func, address)) {
> +				*ret = -1;
> +				return;
> +			}
> +		}
> +	}
> +}
> +
> +static int klp_stacktrace_stack(void *data, char *name)
> +{
> +	return 0;
> +}
> +
> +static const struct stacktrace_ops klp_stacktrace_ops = {
> +	.address = klp_stacktrace_address_verify,
> +	.stack = klp_stacktrace_stack,
> +	.walk_stack = print_context_stack_bp,
> +};
> +
> +/*
> + * Try to safely transition a task to the universe goal.  If the task is
> + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> + * function, return false.
> + */
> +static bool klp_transition_task(struct task_struct *t)
> +{
> +	struct rq *rq;
> +	unsigned long flags;
> +	int ret;
> +	bool success = false;
> +
> +	if (t->klp_universe == klp_universe_goal)
> +		return true;
> +
> +	rq = task_rq_lock(t, &flags);
> +
> +	if (task_running(rq, t) && t != current) {
> +		pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> +			 t->comm);
> +		goto done;
> +	}
> +
> +	ret = 0;
> +	dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
> +	if (ret) {
> +		pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
> +			 __func__, t->pid, t->comm);
> +		goto done;
> +	}
> +
> +	klp_update_task_universe(t);
> +
> +	success = true;
> +done:
> +	task_rq_unlock(rq, t, &flags);
> +	return success;
> +}
> +
> +/*
> + * Try to transition all tasks to the universe goal.  If any tasks are still
> + * stuck in the original universe, schedule a retry.
> + */
> +void klp_try_complete_transition(void)
> +{
> +	unsigned int cpu;
> +	struct task_struct *g, *t;
> +	bool complete = true;
> +
> +	/* try to transition all normal tasks */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(g, t)
> +		if (!klp_transition_task(t))
> +			complete = false;
> +	read_unlock(&tasklist_lock);
> +
> +	/* try to transition the idle "swapper" tasks */
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		if (!klp_transition_task(idle_task(cpu)))
> +			complete = false;
> +	put_online_cpus();
> +
> +	/* if not complete, try again later */
> +	if (!complete) {
> +		schedule_delayed_work(&klp_transition_work,
> +				      round_jiffies_relative(HZ));
> +		return;
> +	}
> +
> +	/* success! unpatch obsolete functions and do some cleanup */
> +
> +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> +		klp_unpatch_objects(klp_transition_patch);
> +
> +		/* prevent ftrace handler from reading old func->transition */
> +		synchronize_rcu();
> +	}
> +
> +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> +							  "unpatching");
> +
> +	klp_complete_transition();
> +}
> +
> +static void klp_transition_work_fn(struct work_struct *work)
> +{
> +	mutex_lock(&klp_mutex);
> +
> +	if (klp_transition_patch)
> +		klp_try_complete_transition();
> +
> +	mutex_unlock(&klp_mutex);
> +}
> +
> +/*
> + * Start the transition to the specified universe so tasks can begin switching
> + * to it.
> + */
> +void klp_start_transition(int universe)
> +{
> +	if (WARN_ON(klp_universe_goal == universe))
> +		return;
> +
> +	pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> +		  universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
> +
> +	klp_set_universe_goal(universe);
> +}
> +
> +/*
> + * Can be called in the middle of an existing transition to reverse the
> + * direction of the universe goal.  This can be done to effectively cancel an
> + * existing enable or disable operation if there are any tasks which are stuck
> + * in the original universe.
> + */
> +void klp_reverse_transition(void)
> +{
> +	struct klp_patch *patch = klp_transition_patch;
> +
> +	klp_start_transition(!klp_universe_goal);
> +	klp_try_complete_transition();
> +
> +	patch->enabled = !patch->enabled;
> +}
> +
> +/*
> + * Reset the universe goal and all tasks to the starting universe, and set all
> + * func->transition's to 1 to prepare for patching.
> + */
> +void klp_init_transition(struct klp_patch *patch, int universe)
> +{
> +	struct task_struct *g, *t;
> +	unsigned int cpu;
> +	struct klp_object *obj;
> +	struct klp_func *func;
> +
> +	klp_transition_patch = patch;
> +
> +	/*
> +	 * If the previous transition was in the opposite direction, we may
> +	 * already be in the requested initial universe.
> +	 */
> +	if (klp_universe_goal == universe)
> +		goto init_funcs;
> +
> +	klp_set_universe_goal(universe);
> +
> +	/* init all normal task universes */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(g, t)
> +		klp_update_task_universe(t);
> +	read_unlock(&tasklist_lock);
> +
> +	/* init all idle "swapper" task universes */
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		klp_update_task_universe(idle_task(cpu));
> +	put_online_cpus();
> +
> +init_funcs:
> +	/* corresponding smp_rmb() is in klp_ftrace_handler() */
> +	smp_wmb();
> +
> +	for (obj = patch->objs; obj->funcs; obj++)
> +		for (func = obj->funcs; func->old_name; func++)
> +			func->transition = 1;

So I finally got to reviewing this one. I have only two concerns:
1) it removes the ability for the user to use 'no consistency model'.
But you don't need to worry about this; I plan to implement that as soon
as you send v2 of these.

2) How is this 'transition = 1' store above guaranteed to reach other
CPUs before you start registering ftrace handlers? The CPUs need not see
the update when some handler is already invoked before start_transition
AFAICS.
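
In other words, the interleaving I am worried about is something like
this (illustrative only):

	CPU 0					CPU 1
	-----					-----
	func->transition = 1;
	klp_patch_object()
	  (new func becomes visible)
						klp_ftrace_handler() runs,
						but may still see
						func->transition == 0 and
						so skips the universe check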

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 9/9] livepatch: update task universe when exiting kernel
  2015-02-09 17:31 ` [RFC PATCH 9/9] livepatch: update task universe when exiting kernel Josh Poimboeuf
@ 2015-02-16 10:16   ` Jiri Slaby
  2015-02-17 14:58     ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Jiri Slaby @ 2015-02-16 10:16 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Update a task's universe when returning from a system call or user
> space interrupt, or after handling a signal.
> 
> This greatly increases the chances of a patch operation succeeding.  If
> a task is I/O bound, it can switch universes when returning from a
> system call.  If a task is CPU bound, it can switch universes when
> returning from an interrupt.  If a task is sleeping on a to-be-patched
> function, the user can send SIGSTOP and SIGCONT to force it to switch.
> 
> Since the idle "swapper" tasks don't ever exit the kernel, they're
> updated from within the idle loop.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  arch/x86/include/asm/thread_info.h |  4 +++-
>  arch/x86/kernel/signal.c           |  4 ++++
>  include/linux/livepatch.h          |  2 ++
>  kernel/livepatch/transition.c      | 15 +++++++++++++++
>  kernel/sched/idle.c                |  4 ++++
...
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -7,6 +7,7 @@
>  #include <linux/tick.h>
>  #include <linux/mm.h>
>  #include <linux/stackprotector.h>
> +#include <linux/livepatch.h>
>  
>  #include <asm/tlb.h>
>  
> @@ -250,6 +251,9 @@ static void cpu_idle_loop(void)
>  
>  		sched_ttwu_pending();
>  		schedule_preempt_disabled();
> +
> +		if (unlikely(test_thread_flag(TIF_KLP_NEED_UPDATE)))
> +			klp_update_task_universe(current);

Oh, this is indeed broken on non-x86 archs, as kbuild reports
(TIF_KLP_NEED_UPDATE is undefined).

We need a klp_maybe_update_task_universe() inline or something like that,
defined as a no-op for non-LIVEPATCH configs.
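
Something along these lines, perhaps (sketch only, with the names
suggested above):

	#ifdef CONFIG_LIVEPATCH
	static inline void klp_maybe_update_task_universe(struct task_struct *t)
	{
		if (unlikely(test_tsk_thread_flag(t, TIF_KLP_NEED_UPDATE)))
			klp_update_task_universe(t);
	}
	#else
	static inline void klp_maybe_update_task_universe(struct task_struct *t) {}
	#endif

The idle loop would then call it unconditionally.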

regards,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
                     ` (5 preceding siblings ...)
  2015-02-14 11:40   ` Jiri Slaby
@ 2015-02-16 14:19   ` Miroslav Benes
  2015-02-17 15:10     ` Josh Poimboeuf
  6 siblings, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-16 14:19 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> Add a basic per-task consistency model.  This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
> 
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe.  If a
> given task isn't using any of the patched functions, it's switched to
> the new universe.  Once all the tasks have been converged to the new
> universe, patching is complete.
> 
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
> 
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition.  Only a single patch (the topmost patch on the stack)
> can be in transition at a given time.  A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
> 
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress.  Then all the tasks will attempt to
> converge back to the original universe.

I finally managed to go through this patch and I have only a few comments 
apart from what Jiri has already written...

I think it would be useful to add more comments throughout the code.

[...]

>  /**
> @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
>   * /sys/kernel/livepatch
>   * /sys/kernel/livepatch/<patch>
>   * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/transition
>   * /sys/kernel/livepatch/<patch>/<object>
>   * /sys/kernel/livepatch/<patch>/<object>/<func>
>   */
> @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  		goto err;
>  	}
>  
> -	if (val) {
> +	if (klp_transition_patch == patch) {
> +		klp_reverse_transition();
> +	} else if (val) {
>  		ret = __klp_enable_patch(patch);
>  		if (ret)
>  			goto err;
> @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
>  	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
>  }
>  
> +static ssize_t transition_show(struct kobject *kobj,
> +			       struct kobj_attribute *attr, char *buf)
> +{
> +	struct klp_patch *patch;
> +
> +	patch = container_of(kobj, struct klp_patch, kobj);
> +	return snprintf(buf, PAGE_SIZE-1, "%d\n",
> +			klp_transition_patch == patch);
> +}
> +
>  static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
>  static struct attribute *klp_patch_attrs[] = {
>  	&enabled_kobj_attr.attr,
> +	&transition_kobj_attr.attr,
>  	NULL
>  };

The sysfs documentation (Documentation/ABI/testing/sysfs-kernel-livepatch) 
should be updated as well. Also the meaning of the enabled attribute was 
changed a bit (by a different patch of the set, though).
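
E.g. something along these lines for the new attribute (the description
wording is just a draft):

	What:		/sys/kernel/livepatch/<patch>/transition
	Date:		<TBD>
	KernelVersion:	<TBD>
	Contact:	live-patching@vger.kernel.org
	Description:
			An attribute which indicates whether the patch is
			currently in transition, i.e. whether there are still
			tasks which have not yet been switched to the target
			universe.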

[...]

> +
> +void klp_unpatch_objects(struct klp_patch *patch)
> +{
> +	struct klp_object *obj;
> +
> +	for (obj = patch->objs; obj->funcs; obj++)
> +		if (obj->patched)
> +			klp_unpatch_object(obj);
> +}

Maybe we should introduce for_each_* macros which could be used in the 
code and avoid such functions. I do not have a strong opinion about it.

> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> index bb34bd3..1648259 100644
> --- a/kernel/livepatch/patch.h
> +++ b/kernel/livepatch/patch.h
> @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
>  
>  extern int klp_patch_object(struct klp_object *obj);
>  extern void klp_unpatch_object(struct klp_object *obj);
> +extern void klp_unpatch_objects(struct klp_patch *patch);

[...]

> diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> new file mode 100644
> index 0000000..ba9a55c
> --- /dev/null
> +++ b/kernel/livepatch/transition.h
> @@ -0,0 +1,16 @@
> +#include <linux/livepatch.h>
> +
> +enum {
> +	KLP_UNIVERSE_UNDEFINED = -1,
> +	KLP_UNIVERSE_OLD,
> +	KLP_UNIVERSE_NEW,
> +};
> +
> +extern struct mutex klp_mutex;
> +extern struct klp_patch *klp_transition_patch;
> +
> +extern void klp_init_transition(struct klp_patch *patch, int universe);
> +extern void klp_start_transition(int universe);
> +extern void klp_reverse_transition(void);
> +extern void klp_try_complete_transition(void);
> +extern void klp_complete_transition(void);

Double inclusion protection is missing, and the externs for functions are 
redundant.
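
For the include guards, e.g. (guard name is just a suggestion):

	#ifndef _LIVEPATCH_TRANSITION_H
	#define _LIVEPATCH_TRANSITION_H

	#include <linux/livepatch.h>

	/* ... the declarations ... */

	#endif /* _LIVEPATCH_TRANSITION_H */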

Otherwise it looks quite ok.

Miroslav

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-13 20:49           ` Josh Poimboeuf
@ 2015-02-16 16:06             ` Miroslav Benes
  2015-02-17 15:55               ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-16 16:06 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Slaby, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, linux-kernel

On Fri, 13 Feb 2015, Josh Poimboeuf wrote:

> On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > > unregister path, I still get a lockdep warning.  This looks more insidious,
> > > related to the locking order of a kernfs lock and the klp lock.  I'll need to
> > > look at this some more...
> > 
> > Yes, I was afraid of this. The lockdep warning is a separate bug. It is 
> > caused by taking klp_mutex in enabled_store. During rmmod, 
> > klp_unregister_patch takes klp_mutex and destroys the sysfs structure. If 
> > somebody writes to enabled just after unregister takes the mutex and 
> > before the sysfs removal, he would cause the deadlock, because 
> > enabled_store takes the "sysfs lock" and then klp_mutex. That is exactly 
> > what lockdep tells us below.
> > 
> > We can look for inspiration elsewhere. Grepping for s_active through the 
> > git log of mainline offers several commits which dealt exactly with this. 
> > I will browse through that...
> 
> Thanks Miroslav, please let me know what you find.  It wouldn't surprise
> me if this were a very common problem.
> 
> One option would be to move the enabled_store() work out to a workqueue
> or something.

Yes, that is one possibility. It is not the only one.

1. we could replace mutex_lock in enabled_store with mutex_trylock. If the 
lock was not acquired, we would return -EBUSY. Or could we 'return 
restart_syscall()' (maybe after some tiny msleep)? See the sketch below.

2. we could reorganize klp_unregister_patch somehow and move sysfs removal 
out of mutex protection.
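
For option 1, the core of the change in enabled_store() would be
something like this (untested):

	-	mutex_lock(&klp_mutex);
	+	if (!mutex_trylock(&klp_mutex))
	+		return restart_syscall();	/* or -EBUSY */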

Miroslav

> > > 
> > > To recreate:
> > > 
> > > insmod livepatch-sample.ko
> > > 
> > > # wait for patching to complete
> > > 
> > > ~/a.out &  <-- simple program which opens the "enabled" file in the background
> > 
> > I didn't even need such a program. Lockdep warned me with just insmod, 
> > echo and rmmod. It is magically clever.
> 
> Ah, even easier... lockdep is awesome.
> 
> -- 
> Josh
> 

--
Miroslav Benes
SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 9/9] livepatch: update task universe when exiting kernel
  2015-02-16 10:16   ` Jiri Slaby
@ 2015-02-17 14:58     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-17 14:58 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Mon, Feb 16, 2015 at 11:16:11AM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > Update a task's universe when returning from a system call or user
> > space interrupt, or after handling a signal.
> > 
> > This greatly increases the chances of a patch operation succeeding.  If
> > a task is I/O bound, it can switch universes when returning from a
> > system call.  If a task is CPU bound, it can switch universes when
> > returning from an interrupt.  If a task is sleeping on a to-be-patched
> > function, the user can send SIGSTOP and SIGCONT to force it to switch.
> > 
> > Since the idle "swapper" tasks don't ever exit the kernel, they're
> > updated from within the idle loop.
> > 
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  arch/x86/include/asm/thread_info.h |  4 +++-
> >  arch/x86/kernel/signal.c           |  4 ++++
> >  include/linux/livepatch.h          |  2 ++
> >  kernel/livepatch/transition.c      | 15 +++++++++++++++
> >  kernel/sched/idle.c                |  4 ++++
> ...
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/tick.h>
> >  #include <linux/mm.h>
> >  #include <linux/stackprotector.h>
> > +#include <linux/livepatch.h>
> >  
> >  #include <asm/tlb.h>
> >  
> > @@ -250,6 +251,9 @@ static void cpu_idle_loop(void)
> >  
> >  		sched_ttwu_pending();
> >  		schedule_preempt_disabled();
> > +
> > +		if (unlikely(test_thread_flag(TIF_KLP_NEED_UPDATE)))
> > +			klp_update_task_universe(current);
> 
> Oh, this is indeed broken on non-x86 archs as kbuild reports.
> (TIF_KLP_NEED_UPDATE undefined)
> 
> We need a klp_maybe_update_task_universe inline or something like that
> and define it void for non-LIVEPATCH configs.

Doh, thanks.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-14 11:40   ` Jiri Slaby
@ 2015-02-17 14:59     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-17 14:59 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Sat, Feb 14, 2015 at 12:40:01PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > Add a basic per-task consistency model.  This is the foundation which
> > will eventually enable us to patch those ~10% of security patches which
> > change function prototypes and/or data semantics.
> > 
> > When a patch is enabled, livepatch enters into a transition state where
> > tasks are converging from the old universe to the new universe.  If a
> > given task isn't using any of the patched functions, it's switched to
> > the new universe.  Once all the tasks have been converged to the new
> > universe, patching is complete.
> > 
> > The same sequence occurs when a patch is disabled, except the tasks
> > converge from the new universe to the old universe.
> > 
> > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > is in transition.  Only a single patch (the topmost patch on the stack)
> > can be in transition at a given time.  A patch can remain in the
> > transition state indefinitely, if any of the tasks are stuck in the
> > previous universe.
> > 
> > A transition can be reversed and effectively canceled by writing the
> > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > the transition is in progress.  Then all the tasks will attempt to
> > converge back to the original universe.
> > 
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  include/linux/livepatch.h     |  18 ++-
> >  include/linux/sched.h         |   3 +
> >  kernel/fork.c                 |   2 +
> >  kernel/livepatch/Makefile     |   2 +-
> >  kernel/livepatch/core.c       |  71 ++++++----
> >  kernel/livepatch/patch.c      |  34 ++++-
> >  kernel/livepatch/patch.h      |   1 +
> >  kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
> >  kernel/livepatch/transition.h |  16 +++
> >  kernel/sched/core.c           |   2 +
> >  10 files changed, 423 insertions(+), 26 deletions(-)
> >  create mode 100644 kernel/livepatch/transition.c
> >  create mode 100644 kernel/livepatch/transition.h
> > 
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 0e65b4d..b8c2f15 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -40,6 +40,7 @@
> >   * @old_size:	size of the old function
> >   * @new_size:	size of the new function
> >   * @patched:	the func has been added to the klp_ops list
> > + * @transition:	the func is currently being applied or reverted
> >   */
> >  struct klp_func {
> >  	/* external */
> > @@ -60,6 +61,7 @@ struct klp_func {
> >  	struct list_head stack_node;
> >  	unsigned long old_size, new_size;
> >  	int patched;
> > +	int transition;
> >  };
> >  
> >  /**
> > @@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
> >  extern int klp_enable_patch(struct klp_patch *);
> >  extern int klp_disable_patch(struct klp_patch *);
> >  
> > -#endif /* CONFIG_LIVEPATCH */
> > +extern int klp_universe_goal;
> > +
> > +static inline void klp_update_task_universe(struct task_struct *t)
> > +{
> > +	/* corresponding smp_wmb() is in klp_set_universe_goal() */
> > +	smp_rmb();
> > +
> > +	t->klp_universe = klp_universe_goal;
> > +}
> > +
> > +#else /* !CONFIG_LIVEPATCH */
> > +
> > +static inline void klp_update_task_universe(struct task_struct *t) {}
> > +
> > +#endif /* !CONFIG_LIVEPATCH */
> >  
> >  #endif /* _LINUX_LIVEPATCH_H_ */
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 8db31ef..a95e59a 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1701,6 +1701,9 @@ struct task_struct {
> >  #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> >  	unsigned long	task_state_change;
> >  #endif
> > +#ifdef CONFIG_LIVEPATCH
> > +	int klp_universe;
> > +#endif
> >  };
> >  
> >  /* Future-safe accessor for struct task_struct's cpus_allowed. */
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 4dc2dda..1dcbebe 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -74,6 +74,7 @@
> >  #include <linux/uprobes.h>
> >  #include <linux/aio.h>
> >  #include <linux/compiler.h>
> > +#include <linux/livepatch.h>
> >  
> >  #include <asm/pgtable.h>
> >  #include <asm/pgalloc.h>
> > @@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> >  	total_forks++;
> >  	spin_unlock(&current->sighand->siglock);
> >  	syscall_tracepoint_update(p);
> > +	klp_update_task_universe(p);
> >  	write_unlock_irq(&tasklist_lock);
> >  
> >  	proc_fork_connector(p);
> > diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
> > index e136dad..2b8bdb1 100644
> > --- a/kernel/livepatch/Makefile
> > +++ b/kernel/livepatch/Makefile
> > @@ -1,3 +1,3 @@
> >  obj-$(CONFIG_LIVEPATCH) += livepatch.o
> >  
> > -livepatch-objs := core.o patch.o
> > +livepatch-objs := core.o patch.o transition.o
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index 85d4ef7..790dc10 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -28,14 +28,17 @@
> >  #include <linux/kallsyms.h>
> >  
> >  #include "patch.h"
> > +#include "transition.h"
> >  
> >  /*
> > - * The klp_mutex protects the global lists and state transitions of any
> > - * structure reachable from them.  References to any structure must be obtained
> > - * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
> > - * ensure it gets consistent data).
> > + * The klp_mutex is a coarse lock which serializes access to klp data.  All
> > + * accesses to klp-related variables and structures must have mutex protection,
> > + * except within the following functions which carefully avoid the need for it:
> > + *
> > + * - klp_ftrace_handler()
> > + * - klp_update_task_universe()
> >   */
> > -static DEFINE_MUTEX(klp_mutex);
> > +DEFINE_MUTEX(klp_mutex);
> >  
> >  static LIST_HEAD(klp_patches);
> >  
> > @@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
> >  	mutex_unlock(&module_mutex);
> >  }
> >  
> > -/* klp_mutex must be held by caller */
> >  static bool klp_is_patch_registered(struct klp_patch *patch)
> >  {
> >  	struct klp_patch *mypatch;
> > @@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,
> >  
> >  static int __klp_disable_patch(struct klp_patch *patch)
> >  {
> > -	struct klp_object *obj;
> > +	if (klp_transition_patch)
> > +		return -EBUSY;
> >  
> >  	/* enforce stacking: only the last enabled patch can be disabled */
> >  	if (!list_is_last(&patch->list, &klp_patches) &&
> >  	    list_next_entry(patch, list)->enabled)
> >  		return -EBUSY;
> >  
> > -	pr_notice("disabling patch '%s'\n", patch->mod->name);
> > -
> > -	for (obj = patch->objs; obj->funcs; obj++)
> > -		if (obj->patched)
> > -			klp_unpatch_object(obj);
> > +	klp_init_transition(patch, KLP_UNIVERSE_NEW);
> > +	klp_start_transition(KLP_UNIVERSE_OLD);
> > +	klp_try_complete_transition();
> >  
> >  	patch->enabled = 0;
> >  
> > @@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
> >  	struct klp_object *obj;
> >  	int ret;
> >  
> > +	if (klp_transition_patch)
> > +		return -EBUSY;
> > +
> >  	if (WARN_ON(patch->enabled))
> >  		return -EINVAL;
> >  
> > @@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
> >  	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> >  	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
> >  
> > -	pr_notice("enabling patch '%s'\n", patch->mod->name);
> > +	klp_init_transition(patch, KLP_UNIVERSE_OLD);
> >  
> >  	for (obj = patch->objs; obj->funcs; obj++) {
> >  		klp_find_object_module(obj);
> > @@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
> >  			continue;
> >  
> >  		ret = klp_patch_object(obj);
> > -		if (ret)
> > -			goto unregister;
> > +		if (ret) {
> > +			pr_warn("failed to enable patch '%s'\n",
> > +				patch->mod->name);
> > +
> > +			klp_unpatch_objects(patch);
> > +			klp_complete_transition();
> > +
> > +			return ret;
> > +		}
> >  	}
> >  
> > +	klp_start_transition(KLP_UNIVERSE_NEW);
> > +
> > +	klp_try_complete_transition();
> > +
> >  	patch->enabled = 1;
> >  
> >  	return 0;
> > -
> > -unregister:
> > -	WARN_ON(__klp_disable_patch(patch));
> > -	return ret;
> >  }
> >  
> >  /**
> > @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
> >   * /sys/kernel/livepatch
> >   * /sys/kernel/livepatch/<patch>
> >   * /sys/kernel/livepatch/<patch>/enabled
> > + * /sys/kernel/livepatch/<patch>/transition
> >   * /sys/kernel/livepatch/<patch>/<object>
> >   * /sys/kernel/livepatch/<patch>/<object>/<func>
> >   */
> > @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> >  		goto err;
> >  	}
> >  
> > -	if (val) {
> > +	if (klp_transition_patch == patch) {
> > +		klp_reverse_transition();
> > +	} else if (val) {
> >  		ret = __klp_enable_patch(patch);
> >  		if (ret)
> >  			goto err;
> > @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
> >  	return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
> >  }
> >  
> > +static ssize_t transition_show(struct kobject *kobj,
> > +			       struct kobj_attribute *attr, char *buf)
> > +{
> > +	struct klp_patch *patch;
> > +
> > +	patch = container_of(kobj, struct klp_patch, kobj);
> > +	return snprintf(buf, PAGE_SIZE-1, "%d\n",
> > +			klp_transition_patch == patch);
> > +}
> > +
> >  static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> > +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
> >  static struct attribute *klp_patch_attrs[] = {
> >  	&enabled_kobj_attr.attr,
> > +	&transition_kobj_attr.attr,
> >  	NULL
> >  };
> >  
> > @@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> >  {
> >  	INIT_LIST_HEAD(&func->stack_node);
> >  	func->patched = 0;
> > +	func->transition = 0;
> >  
> >  	return kobject_init_and_add(&func->kobj, &klp_ktype_func,
> >  				    obj->kobj, func->old_name);
> > @@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
> >  	if (ret)
> >  		goto err;
> >  
> > -	if (!patch->enabled)
> > +	if (!patch->enabled && klp_transition_patch != patch)
> >  		return;
> >  
> >  	pr_notice("applying patch '%s' to loading module '%s'\n",
> > @@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
> >  	struct module *pmod = patch->mod;
> >  	struct module *mod = obj->mod;
> >  
> > -	if (!patch->enabled)
> > +	if (!patch->enabled && klp_transition_patch != patch)
> >  		goto free;
> >  
> >  	pr_notice("reverting patch '%s' on unloading module '%s'\n",
> > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > index 281fbca..f12256b 100644
> > --- a/kernel/livepatch/patch.c
> > +++ b/kernel/livepatch/patch.c
> > @@ -24,6 +24,7 @@
> >  #include <linux/slab.h>
> >  
> >  #include "patch.h"
> > +#include "transition.h"
> >  
> >  static LIST_HEAD(klp_ops);
> >  
> > @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  	ops = container_of(fops, struct klp_ops, fops);
> >  
> >  	rcu_read_lock();
> > +
> >  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >  				      stack_node);
> > -	rcu_read_unlock();
> >  
> >  	if (WARN_ON_ONCE(!func))
> > -		return;
> > +		goto unlock;
> > +
> > +	if (unlikely(func->transition)) {
> > +		/* corresponding smp_wmb() is in klp_init_transition() */
> > +		smp_rmb();
> > +
> > +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> > +			/*
> > +			 * Use the previously patched version of the function.
> > +			 * If no previous patches exist, use the original
> > +			 * function.
> > +			 */
> > +			func = list_entry_rcu(func->stack_node.next,
> > +					      struct klp_func, stack_node);
> > +
> > +			if (&func->stack_node == &ops->func_stack)
> > +				goto unlock;
> > +		}
> > +	}
> >  
> >  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> > +unlock:
> > +	rcu_read_unlock();
> >  }
> >  
> >  struct klp_ops *klp_find_ops(unsigned long old_addr)
> > @@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)
> >  
> >  	return 0;
> >  }
> > +
> > +void klp_unpatch_objects(struct klp_patch *patch)
> > +{
> > +	struct klp_object *obj;
> > +
> > +	for (obj = patch->objs; obj->funcs; obj++)
> > +		if (obj->patched)
> > +			klp_unpatch_object(obj);
> > +}
> > diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> > index bb34bd3..1648259 100644
> > --- a/kernel/livepatch/patch.h
> > +++ b/kernel/livepatch/patch.h
> > @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
> >  
> >  extern int klp_patch_object(struct klp_object *obj);
> >  extern void klp_unpatch_object(struct klp_object *obj);
> > +extern void klp_unpatch_objects(struct klp_patch *patch);
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > new file mode 100644
> > index 0000000..2630296
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.c
> > @@ -0,0 +1,300 @@
> > +/*
> > + * transition.c - Kernel Live Patching transition functions
> > + *
> > + * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version 2
> > + * of the License, or (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
> > +#include <linux/cpu.h>
> > +#include <asm/stacktrace.h>
> > +#include "../sched/sched.h"
> > +
> > +#include "patch.h"
> > +#include "transition.h"
> > +
> > +static void klp_transition_work_fn(struct work_struct *);
> > +static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
> > +
> > +struct klp_patch *klp_transition_patch;
> > +
> > +int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
> > +
> > +static void klp_set_universe_goal(int universe)
> > +{
> > +	klp_universe_goal = universe;
> > +
> > +	/* corresponding smp_rmb() is in klp_update_task_universe() */
> > +	smp_wmb();
> > +}
> > +
> > +/*
> > + * The transition to the universe goal is complete.  Clean up the data
> > + * structures.
> > + */
> > +void klp_complete_transition(void)
> > +{
> > +	struct klp_object *obj;
> > +	struct klp_func *func;
> > +
> > +	for (obj = klp_transition_patch->objs; obj->funcs; obj++)
> > +		for (func = obj->funcs; func->old_name; func++)
> > +			func->transition = 0;
> > +
> > +	klp_transition_patch = NULL;
> > +}
> > +
> > +static int klp_stacktrace_address_verify_func(struct klp_func *func,
> > +					      unsigned long address)
> > +{
> > +	unsigned long func_addr, func_size;
> > +
> > +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > +		 /* check the to-be-unpatched function (the func itself) */
> > +		func_addr = (unsigned long)func->new_func;
> > +		func_size = func->new_size;
> > +	} else {
> > +		/* check the to-be-patched function (previous func) */
> > +		struct klp_ops *ops;
> > +
> > +		ops = klp_find_ops(func->old_addr);
> > +
> > +		if (list_is_singular(&ops->func_stack)) {
> > +			/* original function */
> > +			func_addr = func->old_addr;
> > +			func_size = func->old_size;
> > +		} else {
> > +			/* previously patched function */
> > +			struct klp_func *prev;
> > +
> > +			prev = list_next_entry(func, stack_node);
> > +			func_addr = (unsigned long)prev->new_func;
> > +			func_size = prev->new_size;
> > +		}
> > +	}
> > +
> > +	if (address >= func_addr && address < func_addr + func_size)
> > +		return -1;
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> > + * Determine whether the given return address on the stack is within a
> > + * to-be-patched or to-be-unpatched function.
> > + */
> > +static void klp_stacktrace_address_verify(void *data, unsigned long address,
> > +					  int reliable)
> > +{
> > +	struct klp_object *obj;
> > +	struct klp_func *func;
> > +	int *ret = data;
> > +
> > +	if (*ret)
> > +		return;
> > +
> > +	for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
> > +		if (!obj->patched)
> > +			continue;
> > +		for (func = obj->funcs; func->old_name; func++) {
> > +			if (klp_stacktrace_address_verify_func(func, address)) {
> > +				*ret = -1;
> > +				return;
> > +			}
> > +		}
> > +	}
> > +}
> > +
> > +static int klp_stacktrace_stack(void *data, char *name)
> > +{
> > +	return 0;
> > +}
> > +
> > +static const struct stacktrace_ops klp_stacktrace_ops = {
> > +	.address = klp_stacktrace_address_verify,
> > +	.stack = klp_stacktrace_stack,
> > +	.walk_stack = print_context_stack_bp,
> > +};
> > +
> > +/*
> > + * Try to safely transition a task to the universe goal.  If the task is
> > + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> > + * function, return false.
> > + */
> > +static bool klp_transition_task(struct task_struct *t)
> > +{
> > +	struct rq *rq;
> > +	unsigned long flags;
> > +	int ret;
> > +	bool success = false;
> > +
> > +	if (t->klp_universe == klp_universe_goal)
> > +		return true;
> > +
> > +	rq = task_rq_lock(t, &flags);
> > +
> > +	if (task_running(rq, t) && t != current) {
> > +		pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> > +			 t->comm);
> > +		goto done;
> > +	}
> > +
> > +	ret = 0;
> > +	dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
> > +	if (ret) {
> > +		pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
> > +			 __func__, t->pid, t->comm);
> > +		goto done;
> > +	}
> > +
> > +	klp_update_task_universe(t);
> > +
> > +	success = true;
> > +done:
> > +	task_rq_unlock(rq, t, &flags);
> > +	return success;
> > +}
> > +
> > +/*
> > + * Try to transition all tasks to the universe goal.  If any tasks are still
> > + * stuck in the original universe, schedule a retry.
> > + */
> > +void klp_try_complete_transition(void)
> > +{
> > +	unsigned int cpu;
> > +	struct task_struct *g, *t;
> > +	bool complete = true;
> > +
> > +	/* try to transition all normal tasks */
> > +	read_lock(&tasklist_lock);
> > +	for_each_process_thread(g, t)
> > +		if (!klp_transition_task(t))
> > +			complete = false;
> > +	read_unlock(&tasklist_lock);
> > +
> > +	/* try to transition the idle "swapper" tasks */
> > +	get_online_cpus();
> > +	for_each_online_cpu(cpu)
> > +		if (!klp_transition_task(idle_task(cpu)))
> > +			complete = false;
> > +	put_online_cpus();
> > +
> > +	/* if not complete, try again later */
> > +	if (!complete) {
> > +		schedule_delayed_work(&klp_transition_work,
> > +				      round_jiffies_relative(HZ));
> > +		return;
> > +	}
> > +
> > +	/* success! unpatch obsolete functions and do some cleanup */
> > +
> > +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > +		klp_unpatch_objects(klp_transition_patch);
> > +
> > +		/* prevent ftrace handler from reading old func->transition */
> > +		synchronize_rcu();
> > +	}
> > +
> > +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > +							  "unpatching");
> > +
> > +	klp_complete_transition();
> > +}
> > +
> > +static void klp_transition_work_fn(struct work_struct *work)
> > +{
> > +	mutex_lock(&klp_mutex);
> > +
> > +	if (klp_transition_patch)
> > +		klp_try_complete_transition();
> > +
> > +	mutex_unlock(&klp_mutex);
> > +}
> > +
> > +/*
> > + * Start the transition to the specified universe so tasks can begin switching
> > + * to it.
> > + */
> > +void klp_start_transition(int universe)
> > +{
> > +	if (WARN_ON(klp_universe_goal == universe))
> > +		return;
> > +
> > +	pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > +		  universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
> > +
> > +	klp_set_universe_goal(universe);
> > +}
> > +
> > +/*
> > + * Can be called in the middle of an existing transition to reverse the
> > + * direction of the universe goal.  This can be done to effectively cancel an
> > + * existing enable or disable operation if there are any tasks which are stuck
> > + * in the original universe.
> > + */
> > +void klp_reverse_transition(void)
> > +{
> > +	struct klp_patch *patch = klp_transition_patch;
> > +
> > +	klp_start_transition(!klp_universe_goal);
> > +	klp_try_complete_transition();
> > +
> > +	patch->enabled = !patch->enabled;
> > +}
> > +
> > +/*
> > + * Reset the universe goal and all tasks to the starting universe, and set all
> > + * func->transition's to 1 to prepare for patching.
> > + */
> > +void klp_init_transition(struct klp_patch *patch, int universe)
> > +{
> > +	struct task_struct *g, *t;
> > +	unsigned int cpu;
> > +	struct klp_object *obj;
> > +	struct klp_func *func;
> > +
> > +	klp_transition_patch = patch;
> > +
> > +	/*
> > +	 * If the previous transition was in the opposite direction, we may
> > +	 * already be in the requested initial universe.
> > +	 */
> > +	if (klp_universe_goal == universe)
> > +		goto init_funcs;
> > +
> > +	klp_set_universe_goal(universe);
> > +
> > +	/* init all normal task universes */
> > +	read_lock(&tasklist_lock);
> > +	for_each_process_thread(g, t)
> > +		klp_update_task_universe(t);
> > +	read_unlock(&tasklist_lock);
> > +
> > +	/* init all idle "swapper" task universes */
> > +	get_online_cpus();
> > +	for_each_online_cpu(cpu)
> > +		klp_update_task_universe(idle_task(cpu));
> > +	put_online_cpus();
> > +
> > +init_funcs:
> > +	/* corresponding smp_rmb() is in klp_ftrace_handler() */
> > +	smp_wmb();
> > +
> > +	for (obj = patch->objs; obj->funcs; obj++)
> > +		for (func = obj->funcs; func->old_name; func++)
> > +			func->transition = 1;
> 
> So I finally got to the review of this one. I have only two concerns:
> 1) it removes the ability for the user to use 'no consistency model'.
> But you don't need to worry about this, I plan to implement this as soon
> as you send v2 of these.

Ok, sounds good.

> 2) How is this 'transition = 1' store above guaranteed to reach other
> CPUs before you start registering ftrace handlers? The CPUs need not see
> the update when some handler is already invoked before start_transition,
> AFAICS.

Yeah, I think the order of the 'transition = 1' store and adding the func
to the ops stack list should be enforced.  Also I'll probably rework the
barriers a little bit in v2 so that they're more explicit and better
commented.
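
I.e. roughly something like (pseudo-code; it may turn out that the
implied barrier in list_add_rcu() already gives us this ordering, in
which case the explicit smp_wmb() would be redundant):

	func->transition = 1;

	/*
	 * Make sure a handler which can already find the func on the ops
	 * stack also sees func->transition == 1.
	 */
	smp_wmb();

	list_add_rcu(&func->stack_node, &ops->func_stack);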

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-16 14:19   ` Miroslav Benes
@ 2015-02-17 15:10     ` Josh Poimboeuf
  2015-02-17 15:48       ` Miroslav Benes
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-17 15:10 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> 
> > Add a basic per-task consistency model.  This is the foundation which
> > will eventually enable us to patch those ~10% of security patches which
> > change function prototypes and/or data semantics.
> > 
> > When a patch is enabled, livepatch enters into a transition state where
> > tasks are converging from the old universe to the new universe.  If a
> > given task isn't using any of the patched functions, it's switched to
> > the new universe.  Once all the tasks have been converged to the new
> > universe, patching is complete.
> > 
> > The same sequence occurs when a patch is disabled, except the tasks
> > converge from the new universe to the old universe.
> > 
> > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > is in transition.  Only a single patch (the topmost patch on the stack)
> > can be in transition at a given time.  A patch can remain in the
> > transition state indefinitely, if any of the tasks are stuck in the
> > previous universe.
> > 
> > A transition can be reversed and effectively canceled by writing the
> > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > the transition is in progress.  Then all the tasks will attempt to
> > converge back to the original universe.
> 
> I finally managed to go through this patch and I have only a few comments 
> apart from what Jiri has already written...
> 
> I think it would be useful to add more comments throughout the code.

Ok, I'll try to add more comments throughout.

> The sysfs documentation (Documentation/ABI/testing/sysfs-kernel-livepatch) 
> should be updated as well. Also the meaning of the enabled attribute was 
> changed a bit (by a different patch of the set, though).

Ok.

> > +
> > +void klp_unpatch_objects(struct klp_patch *patch)
> > +{
> > +	struct klp_object *obj;
> > +
> > +	for (obj = patch->objs; obj->funcs; obj++)
> > +		if (obj->patched)
> > +			klp_unpatch_object(obj);
> > +}
> 
> Maybe we should introduce for_each_* macros which could be used in the 
> code and avoid such functions. I do not have a strong opinion about it.

Yeah, but each such loop seems to differ a little bit, so I'm not quite
sure how to structure the macros such that they'd be useful.  Maybe for
a future patch.

> > diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> > index bb34bd3..1648259 100644
> > --- a/kernel/livepatch/patch.h
> > +++ b/kernel/livepatch/patch.h
> > @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
> >  
> >  extern int klp_patch_object(struct klp_object *obj);
> >  extern void klp_unpatch_object(struct klp_object *obj);
> > +extern void klp_unpatch_objects(struct klp_patch *patch);
> 
> [...]
> 
> > diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> > new file mode 100644
> > index 0000000..ba9a55c
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.h
> > @@ -0,0 +1,16 @@
> > +#include <linux/livepatch.h>
> > +
> > +enum {
> > +	KLP_UNIVERSE_UNDEFINED = -1,
> > +	KLP_UNIVERSE_OLD,
> > +	KLP_UNIVERSE_NEW,
> > +};
> > +
> > +extern struct mutex klp_mutex;
> > +extern struct klp_patch *klp_transition_patch;
> > +
> > +extern void klp_init_transition(struct klp_patch *patch, int universe);
> > +extern void klp_start_transition(int universe);
> > +extern void klp_reverse_transition(void);
> > +extern void klp_try_complete_transition(void);
> > +extern void klp_complete_transition(void);
> 
> Double inclusion protection is missing

Ok.

> and externs for functions are redundant.

I agree, but it seems to be the norm in Linux.  I have no idea why.  I'm
just following the existing convention.

> Otherwise it looks quite ok.

Thanks!

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-17 15:10     ` Josh Poimboeuf
@ 2015-02-17 15:48       ` Miroslav Benes
  2015-02-17 16:01         ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-17 15:48 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, 17 Feb 2015, Josh Poimboeuf wrote:

> On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > 

[...]

> > > +
> > > +void klp_unpatch_objects(struct klp_patch *patch)
> > > +{
> > > +	struct klp_object *obj;
> > > +
> > > +	for (obj = patch->objs; obj->funcs; obj++)
> > > +		if (obj->patched)
> > > +			klp_unpatch_object(obj);
> > > +}
> > 
> > Maybe we should introduce for_each_* macros which could be used in the 
> > code and avoid such functions. I do not have a strong opinion about it.
> 
> Yeah, but each such loop seems to differ a little bit, so I'm not quite
> sure how to structure the macros such that they'd be useful.  Maybe for
> a future patch.

Yes, that is correct. The code in the caller of klp_unpatch_objects would 
look something like this:

klp_for_each_object(obj, patch->objs)
	if (obj->patched)
		klp_unpatch_object(obj);

So there is in fact no change (compared to open-coding of 
klp_unpatch_objects), but IMO it is more legible. The upside is 
that we wouldn't introduce functions with similar names which could be 
confusing in the future AND we could use such macros throughout the code.

One step further could be a macro klp_for_each_patched_object which would 
include the check.

However, it is a nitpick and a matter of taste, so it is up to you.
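
For concreteness, the basic macros could be as simple as (just a sketch,
following the existing loop patterns):

	#define klp_for_each_object(obj, objs) \
		for (obj = (objs); obj->funcs; obj++)

	#define klp_for_each_func(func, funcs) \
		for (func = (funcs); func->old_name; func++)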

> 
> > > diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> > > index bb34bd3..1648259 100644
> > > --- a/kernel/livepatch/patch.h
> > > +++ b/kernel/livepatch/patch.h
> > > @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
> > >  
> > >  extern int klp_patch_object(struct klp_object *obj);
> > >  extern void klp_unpatch_object(struct klp_object *obj);
> > > +extern void klp_unpatch_objects(struct klp_patch *patch);
> > 
> > [...]
> > 
> > > diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> > > new file mode 100644
> > > index 0000000..ba9a55c
> > > --- /dev/null
> > > +++ b/kernel/livepatch/transition.h
> > > @@ -0,0 +1,16 @@
> > > +#include <linux/livepatch.h>
> > > +
> > > +enum {
> > > +	KLP_UNIVERSE_UNDEFINED = -1,
> > > +	KLP_UNIVERSE_OLD,
> > > +	KLP_UNIVERSE_NEW,
> > > +};
> > > +
> > > +extern struct mutex klp_mutex;
> > > +extern struct klp_patch *klp_transition_patch;
> > > +
> > > +extern void klp_init_transition(struct klp_patch *patch, int universe);
> > > +extern void klp_start_transition(int universe);
> > > +extern void klp_reverse_transition(void);
> > > +extern void klp_try_complete_transition(void);
> > > +extern void klp_complete_transition(void);
> > 
> > Double inclusion protection is missing
> 
> Ok.
> 
> > and externs for functions are redundant.
> 
> I agree, but it seems to be the norm in Linux.  I have no idea why.  I'm
> just following the existing convention.

Yes, I know. It seems that each author does it differently. You can find 
both forms even in one header file in the kernel. There is no functional 
difference AFAIK (it is not the case for variables of course). So as long 
as we are consistent I do not care. And since we have externs already in 
livepatch.h... you can scratch this remark if you want to :)

Miroslav

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-16 16:06             ` Miroslav Benes
@ 2015-02-17 15:55               ` Josh Poimboeuf
  2015-02-17 16:38                 ` Miroslav Benes
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-17 15:55 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Jiri Slaby, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, linux-kernel

On Mon, Feb 16, 2015 at 05:06:15PM +0100, Miroslav Benes wrote:
> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> 
> > On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> > > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > > > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > > > unregister path, I still get a lockdep warning.  This looks more insidious,
> > > > related to the locking order of a kernfs lock and the klp lock.  I'll need to
> > > > look at this some more...
> > > 
> > > Yes, I was afraid of this. Lockdep warning is a separate bug. It is caused 
> > > by taking klp_mutex in enabled_store. During rmmod klp_unregister_patch 
> > > takes klp_mutex and destroys the sysfs structure. If somebody writes to 
> > > enabled just after unregister takes the mutex and before the sysfs 
> > > removal, he would cause the deadlock, because enabled_store takes the 
> > > "sysfs lock" and then klp_mutex. That is exactly what the lockdep tells us 
> > > below.
> > > 
> > > We can look for inspiration elsewhere. Grep for s_active through git log 
> > > of the mainline offers several commits which dealt exactly with this. Will 
> > > browse through that...
> > 
> > Thanks Miroslav, please let me know what you find.  It wouldn't surprise
> > me if this were a very common problem.
> > 
> > One option would be to move the enabled_store() work out to a workqueue
> > or something.
> 
> Yes, that is one possibility. It is not the only one.
> 
> 1. we could replace mutex_lock in enabled_store with mutex_trylock. If the 
> lock was not acquired, we would return -EBUSY. Or could we 'return 
> restart_syscall()' (maybe after some tiny msleep)?

Hm, doesn't that still violate the locking order rules?  I thought locks
always had to be taken in the same order -- always sysfs before klp, or
klp before sysfs.  Not sure if there would still be any deadlocks
lurking, but lockdep might still complain.

> 2. we could reorganize klp_unregister_patch somehow and move sysfs removal 
> out of mutex protection.

Yeah, I was thinking about this too.  Pretty sure we'd have to remove
both the sysfs add and the sysfs removal from mutex protection.  I like
this option if we can get it to work.
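
Very roughly, the unregister side could then look like this (just to
illustrate the idea; error handling and the registered/enabled checks
are omitted):

	int klp_unregister_patch(struct klp_patch *patch)
	{
		mutex_lock(&klp_mutex);
		/* unlink the patch so nobody else can find it */
		list_del(&patch->list);
		mutex_unlock(&klp_mutex);

		/* sysfs/kobject teardown without klp_mutex held */
		kobject_put(&patch->kobj);

		return 0;
	}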

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-17 15:48       ` Miroslav Benes
@ 2015-02-17 16:01         ` Josh Poimboeuf
  2015-02-18 12:42           ` Miroslav Benes
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-17 16:01 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, Feb 17, 2015 at 04:48:39PM +0100, Miroslav Benes wrote:
> On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> 
> > On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> > > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > > 
> 
> [...]
> 
> > > > +
> > > > +void klp_unpatch_objects(struct klp_patch *patch)
> > > > +{
> > > > +	struct klp_object *obj;
> > > > +
> > > > +	for (obj = patch->objs; obj->funcs; obj++)
> > > > +		if (obj->patched)
> > > > +			klp_unpatch_object(obj);
> > > > +}
> > > 
> > > Maybe we should introduce for_each_* macros which could be used in the 
> > > code and avoid such functions. I do not have a strong opinion about it.
> > 
> > Yeah, but each such loop seems to differ a little bit, so I'm not quite
> > sure how to structure the macros such that they'd be useful.  Maybe for
> > a future patch.
> 
> Yes, that is correct. The code in the caller of klp_unpatch_objects would 
> look something like this
> 
> klp_for_each_object(obj, patch->objs)
> 	if (obj->patched)
> 		klp_unpatch_object(obj);

Yeah, that is slightly more readable and less error prone.  I'll do it.

> > > and externs for functions are redundant.
> > 
> > I agree, but it seems to be the norm in Linux.  I have no idea why.  I'm
> > just following the existing convention.
> 
> Yes, I know. It seems that each author does it differently. You can find 
> both forms even in one header file in the kernel. There is no functional 
> difference AFAIK (it is not the case for variables of course). So as long 
> as we are consistent I do not care. And since we have externs already in 
> livepatch.h... you can scratch this remark if you want to :)

Ok.  If there are no objections, let's stick with our existing
nonsensical convention for now :-)

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed
  2015-02-17 15:55               ` Josh Poimboeuf
@ 2015-02-17 16:38                 ` Miroslav Benes
  0 siblings, 0 replies; 106+ messages in thread
From: Miroslav Benes @ 2015-02-17 16:38 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Slaby, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, linux-kernel

On Tue, 17 Feb 2015, Josh Poimboeuf wrote:

> On Mon, Feb 16, 2015 at 05:06:15PM +0100, Miroslav Benes wrote:
> > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> > > > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > > > > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > > > > unregister path, I still get a lockdep warning.  This looks more insidious,
> > > > > related to the locking order of a kernfs lock and the klp lock.  I'll need to
> > > > > look at this some more...
> > > > 
> > > > Yes, I was afraid of this. Lockdep warning is a separate bug. It is caused 
> > > > by taking klp_mutex in enabled_store. During rmmod klp_unregister_patch 
> > > > takes klp_mutex and destroys the sysfs structure. If somebody writes to 
> > > > enabled just after unregister takes the mutex and before the sysfs 
> > > > removal, he would cause the deadlock, because enabled_store takes the 
> > > > "sysfs lock" and then klp_mutex. That is exactly what the lockdep tells us 
> > > > below.
> > > > 
> > > > We can look for inspiration elsewhere. Grep for s_active through git log 
> > > > of the mainline offers several commits which dealt exactly with this. Will 
> > > > browse through that...
> > > 
> > > Thanks Miroslav, please let me know what you find.  It wouldn't surprise
> > > me if this were a very common problem.
> > > 
> > > One option would be to move the enabled_store() work out to a workqueue
> > > or something.
> > 
> > Yes, that is one possibility. It is not the only one.
> > 
> > 1. we could replace mutex_lock in enabled_store with mutex_trylock. If the 
> > lock was not acquired, we would return -EBUSY. Or could we 'return 
> > restart_syscall()' (maybe after some tiny msleep)?
> 
> Hm, doesn't that still violate the locking order rules?  I thought locks
> always had to be taken in the same order -- always sysfs before klp, or
> klp before sysfs.  Not sure if there would still be any deadlocks
> lurking, but lockdep might still complain.

Yes, but in this case the trylock breaks the possible deadlock cycle. From 
the lockdep report...

   CPU0                    CPU1
   ----                    ----
   lock(klp_mutex);
                           lock(s_active#70);
                           lock(klp_mutex);
   lock(s_active#70);

CPU0 called klp_unregister_patch and CPU1 possibly enabled_store in the race 
window. The deadlock wouldn't happen because mutex_trylock(&klp_mutex) on 
CPU1 would return 0 and enabled_store would thus return -EBUSY. And in every 
other scenario the trylock would prevent the deadlock too, or 
klp_unregister_patch would simply wait on klp_mutex (I hope I did not miss 
anything).

I tried it and lockdep did not complain. 

And you can look at commits 36c38fb7144aa941dc072ba8f58b2dbe509c0345 or 
5e33bc4165f3edd558d9633002465a95230effc1. They dealt with it the same way 
(not that that proves anything by itself).

It would need more testing to be sure though.

> > 2. we could reorganize klp_unregister_patch somehow and move sysfs removal 
> > out of mutex protection.
> 
> Yeah, I was thinking about this too.  Pretty sure we'd have to remove
> both the sysfs add and the sysfs removal from mutex protection.  I like
> this option if we can get it to work.

Yes, why not.

Maybe someone else will share his opinion on this...

Miroslav

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-17 16:01         ` Josh Poimboeuf
@ 2015-02-18 12:42           ` Miroslav Benes
  2015-02-18 13:15             ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Miroslav Benes @ 2015-02-18 12:42 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Tue, 17 Feb 2015, Josh Poimboeuf wrote:

> On Tue, Feb 17, 2015 at 04:48:39PM +0100, Miroslav Benes wrote:
> > On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> 
> > > > and externs for functions are redundant.
> > > 
> > > I agree, but it seems to be the norm in Linux.  I have no idea why.  I'm
> > > just following the existing convention.
> > 
> > Yes, I know. It seems that each author does it differently. You can find 
> > both forms even in one header file in the kernel. There is no functional 
> > difference AFAIK (it is not the case for variables of course). So as long 
> > as we are consistent I do not care. And since we have externs already in 
> > livepatch.h... you can scratch this remark if you want to :)
> 
> Ok.  If there are no objections, let's stick with our existing
> nonsensical convention for now :-)

So I was thinking about it again, and we should not use bad patterns in our 
code from the beginning. The externs do not make sense, so let's get rid of 
them everywhere (i.e. in the consistency model and also in livepatch.h). 

The C specification talks about extern in the context of internal and 
external linkage, or in the context of inline functions, but it does not 
make any sense to me. Could you look at the specification and tell me if 
it makes any sense to you, please?

Jiri, Vojtech, do you have any opinion about this?

Miroslav

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-18 12:42           ` Miroslav Benes
@ 2015-02-18 13:15             ` Josh Poimboeuf
  2015-02-18 13:42               ` Miroslav Benes
  0 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-02-18 13:15 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Wed, Feb 18, 2015 at 01:42:56PM +0100, Miroslav Benes wrote:
> On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> 
> > On Tue, Feb 17, 2015 at 04:48:39PM +0100, Miroslav Benes wrote:
> > > On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> > > 
> > > > On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> > 
> > > > > and externs for functions are redundant.
> > > > 
> > > > I agree, but it seems to be the norm in Linux.  I have no idea why.  I'm
> > > > just following the existing convention.
> > > 
> > > Yes, I know. It seems that each author does it differently. You can find 
> > > both forms even in one header file in the kernel. There is no functional 
> > > difference AFAIK (it is not the case for variables of course). So as long 
> > > as we are consistent I do not care. And since we have externs already in 
> > > livepatch.h... you can scratch this remark if you want to :)
> > 
> > Ok.  If there are no objections, let's stick with our existing
> > nonsensical convention for now :-)
> 
> So I was thinking about it again, and we should not use bad patterns in our 
> code from the beginning. The externs do not make sense, so let's get rid of 
> them everywhere (i.e. in the consistency model and also in livepatch.h). 
> 
> The C specification talks about extern in the context of internal and 
> external linkage, or in the context of inline functions, but it does not 
> make any sense to me. Could you look at the specification and tell me if 
> it makes any sense to you, please?

Relevant parts from C11:

	For an identifier declared with the storage-class specifier extern in a
	scope in which a prior declaration of that identifier is visible, if the
	prior declaration specifies internal or external linkage, the linkage of
	the identifier at the later declaration is the same as the linkage
	specified at the prior declaration.  If no prior declaration is visible,
	or if the prior declaration specifies no linkage, then the identifier
	has external linkage.

	If the declaration of an identifier for a function has no storage-class
	specifier, its linkage is determined exactly as if it were declared with
	the storage-class specifier extern.  If the declaration of an identifier
	for an object has file scope and no storage-class specifier, its linkage
	is external.

Sounds to me like "extern" is redundant for functions.  I'm fine with
removing it.  Care to work up a patch for livepatch.h?
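
I.e. at file scope these two declare exactly the same thing:

	extern int klp_enable_patch(struct klp_patch *);
	int klp_enable_patch(struct klp_patch *);	/* same external linkage */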

> 
> Jiri, Vojtech, do you have any opinion about this?
> 
> Miroslav

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-18 13:15             ` Josh Poimboeuf
@ 2015-02-18 13:42               ` Miroslav Benes
  0 siblings, 0 replies; 106+ messages in thread
From: Miroslav Benes @ 2015-02-18 13:42 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, Masami Hiramatsu,
	live-patching, linux-kernel

On Wed, 18 Feb 2015, Josh Poimboeuf wrote:

> On Wed, Feb 18, 2015 at 01:42:56PM +0100, Miroslav Benes wrote:
> > On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > On Tue, Feb 17, 2015 at 04:48:39PM +0100, Miroslav Benes wrote:
> > > > On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> > > > 
> > > > > On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> > > 
> > > > > > and externs for functions are redundant.
> > > > > 
> > > > > I agree, but it seems to be the norm in Linux.  I have no idea why.  I'm
> > > > > just following the existing convention.
> > > > 
> > > > Yes, I know. It seems that each author does it differently. You can find 
> > > > both forms even in one header file in the kernel. There is no functional 
> > > > difference AFAIK (it is not the case for variables of course). So as long 
> > > > as we are consistent I do not care. And since we have externs already in 
> > > > livepatch.h... you can scratch this remark if you want to :)
> > > 
> > > Ok.  If there are no objections, let's stick with our existing
> > > nonsensical convention for now :-)
> > 
> > So I was thinking about it again, and we should not use bad patterns in our 
> > code from the beginning. The externs do not make sense, so let's get rid of 
> > them everywhere (i.e. in the consistency model and also in livepatch.h). 
> > 
> > The C specification talks about extern in the context of internal and 
> > external linkage, or in the context of inline functions, but it does not 
> > make any sense to me. Could you look at the specification and tell me if 
> > it makes any sense to you, please?
> 
> Relevant parts from C11:
> 
> 	For an identifier declared with the storage-class specifier extern in a
> 	scope in which a prior declaration of that identifier is visible, if the
> 	prior declaration specifies internal or external linkage, the linkage of
> 	the identifier at the later declaration is the same as the linkage
> 	specified at the prior declaration.  If no prior declaration is visible,
> 	or if the prior declaration specifies no linkage, then the identifier
> 	has external linkage.
> 
> 	If the declaration of an identifier for a function has no storage-class
> 	specifier, its linkage is determined exactly as if it were declared with
> 	the storage-class specifier extern.  If the declaration of an identifier
> 	for an object has file scope and no storage-class specifier, its linkage
> 	is external.
> 
> Sounds to me like "extern" is redundant for functions.  I'm fine with
> removing it.  Care to work up a patch for livepatch.h?

Agreed. I'll do that. Thanks.

Miroslav

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 1/9] livepatch: simplify disable error path
  2015-02-13 12:25   ` Miroslav Benes
@ 2015-02-18 17:03     ` Petr Mladek
  0 siblings, 0 replies; 106+ messages in thread
From: Petr Mladek @ 2015-02-18 17:03 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	Masami Hiramatsu, live-patching, Linux Kernel Mailing List

On Fri 2015-02-13 13:25:35, Miroslav Benes wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> 
> > If registering the function with ftrace has previously succeeded,
> > unregistering will almost never fail.  Even if it does, it's not a fatal
> > error.  We can still carry on and disable the klp_func from being used
> > by removing it from the klp_ops func stack.
> > 
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> 
> This makes sense, so
> 
> Reviewed-by: Miroslav Benes <mbenes@suse.cz>
> 
> I think this patch could be taken independently of the consistency model. 
> If no one else has any objection...

Yup, it looks good to me.

Reviewed-by: Petr Mladek <pmladek@suse.cz>
 
> Miroslav
> 
> > ---
> >  kernel/livepatch/core.c | 67 +++++++++++++------------------------------------
> >  1 file changed, 17 insertions(+), 50 deletions(-)
> > 
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index 9adf86b..081df77 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -322,32 +322,20 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> >  }
> >  
> > -static int klp_disable_func(struct klp_func *func)
> > +static void klp_disable_func(struct klp_func *func)
> >  {
> >  	struct klp_ops *ops;
> > -	int ret;
> > -
> > -	if (WARN_ON(func->state != KLP_ENABLED))
> > -		return -EINVAL;
> >  
> > -	if (WARN_ON(!func->old_addr))
> > -		return -EINVAL;
> > +	WARN_ON(func->state != KLP_ENABLED);
> > +	WARN_ON(!func->old_addr);
> >  
> >  	ops = klp_find_ops(func->old_addr);
> >  	if (WARN_ON(!ops))
> > -		return -EINVAL;
> > +		return;
> >  
> >  	if (list_is_singular(&ops->func_stack)) {
> > -		ret = unregister_ftrace_function(&ops->fops);
> > -		if (ret) {
> > -			pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
> > -			       func->old_name, ret);
> > -			return ret;
> > -		}
> > -
> > -		ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
> > -		if (ret)
> > -			pr_warn("function unregister succeeded but failed to clear the filter\n");
> > +		WARN_ON(unregister_ftrace_function(&ops->fops));
> > +		WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
> >  
> >  		list_del_rcu(&func->stack_node);
> >  		list_del(&ops->node);
> > @@ -357,8 +345,6 @@ static int klp_disable_func(struct klp_func *func)
> >  	}
> >  
> >  	func->state = KLP_DISABLED;
> > -
> > -	return 0;
> >  }
> >  
> >  static int klp_enable_func(struct klp_func *func)
> > @@ -419,23 +405,15 @@ err:
> >  	return ret;
> >  }
> >  
> > -static int klp_disable_object(struct klp_object *obj)
> > +static void klp_disable_object(struct klp_object *obj)
> >  {
> >  	struct klp_func *func;
> > -	int ret;
> >  
> > -	for (func = obj->funcs; func->old_name; func++) {
> > -		if (func->state != KLP_ENABLED)
> > -			continue;
> > -
> > -		ret = klp_disable_func(func);
> > -		if (ret)
> > -			return ret;
> > -	}
> > +	for (func = obj->funcs; func->old_name; func++)
> > +		if (func->state == KLP_ENABLED)
> > +			klp_disable_func(func);
> >  
> >  	obj->state = KLP_DISABLED;
> > -
> > -	return 0;
> >  }
> >  
> >  static int klp_enable_object(struct klp_object *obj)
> > @@ -451,22 +429,19 @@ static int klp_enable_object(struct klp_object *obj)
> >  
> >  	for (func = obj->funcs; func->old_name; func++) {
> >  		ret = klp_enable_func(func);
> > -		if (ret)
> > -			goto unregister;
> > +		if (ret) {
> > +			klp_disable_object(obj);
> > +			return ret;
> > +		}
> >  	}
> >  	obj->state = KLP_ENABLED;
> >  
> >  	return 0;
> > -
> > -unregister:
> > -	WARN_ON(klp_disable_object(obj));
> > -	return ret;
> >  }
> >  
> >  static int __klp_disable_patch(struct klp_patch *patch)
> >  {
> >  	struct klp_object *obj;
> > -	int ret;
> >  
> >  	/* enforce stacking: only the last enabled patch can be disabled */
> >  	if (!list_is_last(&patch->list, &klp_patches) &&
> > @@ -476,12 +451,8 @@ static int __klp_disable_patch(struct klp_patch *patch)
> >  	pr_notice("disabling patch '%s'\n", patch->mod->name);
> >  
> >  	for (obj = patch->objs; obj->funcs; obj++) {
> > -		if (obj->state != KLP_ENABLED)
> > -			continue;
> > -
> > -		ret = klp_disable_object(obj);
> > -		if (ret)
> > -			return ret;
> > +		if (obj->state == KLP_ENABLED)
> > +			klp_disable_object(obj);
> >  	}
> >  
> >  	patch->state = KLP_DISABLED;
> > @@ -931,7 +902,6 @@ static void klp_module_notify_going(struct klp_patch *patch,
> >  {
> >  	struct module *pmod = patch->mod;
> >  	struct module *mod = obj->mod;
> > -	int ret;
> >  
> >  	if (patch->state == KLP_DISABLED)
> >  		goto disabled;
> > @@ -939,10 +909,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
> >  	pr_notice("reverting patch '%s' on unloading module '%s'\n",
> >  		  pmod->name, mod->name);
> >  
> > -	ret = klp_disable_object(obj);
> > -	if (ret)
> > -		pr_warn("failed to revert patch '%s' on module '%s' (%d)\n",
> > -			pmod->name, mod->name, ret);
> > +	klp_disable_object(obj);
> >  
> >  disabled:
> >  	klp_free_object_loaded(obj);
> > -- 
> > 2.1.0
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe live-patching" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 1/9] livepatch: simplify disable error path
  2015-02-09 17:31 ` [RFC PATCH 1/9] livepatch: simplify disable error path Josh Poimboeuf
  2015-02-13 12:25   ` Miroslav Benes
@ 2015-02-18 20:07   ` Jiri Kosina
  1 sibling, 0 replies; 106+ messages in thread
From: Jiri Kosina @ 2015-02-18 20:07 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> If registering the function with ftrace has previously succeeded,
> unregistering will almost never fail.  Even if it does, it's not a fatal
> error.  We can still carry on and disable the klp_func from being used
> by removing it from the klp_ops func stack.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>

Applied to for-3.21/core, thanks.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-12 14:32           ` Jiri Kosina
@ 2015-02-18 20:17             ` Ingo Molnar
  2015-02-18 20:44               ` Vojtech Pavlik
  0 siblings, 1 reply; 106+ messages in thread
From: Ingo Molnar @ 2015-02-18 20:17 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Zijlstra, Josh Poimboeuf, Ingo Molnar, Masami Hiramatsu,
	live-patching, linux-kernel, Seth Jennings, Vojtech Pavlik


* Jiri Kosina <jkosina@suse.cz> wrote:

> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
> 
> > And what's wrong with using known good spots like the freezer?
> 
Quoting Tejun from the thread Jiri Slaby likely had in 
mind:
> 
> "The fact that they may coincide often can be useful as a 
> guideline or whatever but I'm completely against just 
> mushing it together when it isn't correct.  This kind of 
> things quickly lead to ambiguous situations where people 
> are not sure about the specific semantics or guarantees 
> of the construct and implement weird voodoo code followed 
> by voodoo fixes.  We already had a full round of that 
> with the kernel freezer itself, where people thought that 
> the freezer magically makes PM work properly for a 
> subsystem.  Let's please not do that again."

I don't follow this vague argument.

The concept of 'freezing' all userspace execution is pretty 
unambiguous: tasks that are running are trapped out at 
known safe points such as context switch points or syscall 
entry. Once all tasks have stopped, the system is frozen in 
the sense that only the code we want is running, so you can 
run special code without worrying about races.

What's the problem with that? Why would it be fundamentally 
unsuitable for live patching?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-18 20:17             ` Ingo Molnar
@ 2015-02-18 20:44               ` Vojtech Pavlik
  2015-02-19  9:52                 ` Peter Zijlstra
  0 siblings, 1 reply; 106+ messages in thread
From: Vojtech Pavlik @ 2015-02-18 20:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jiri Kosina, Peter Zijlstra, Josh Poimboeuf, Ingo Molnar,
	Masami Hiramatsu, live-patching, linux-kernel, Seth Jennings

On Wed, Feb 18, 2015 at 09:17:55PM +0100, Ingo Molnar wrote:
> 
> * Jiri Kosina <jkosina@suse.cz> wrote:
> 
> > On Thu, 12 Feb 2015, Peter Zijlstra wrote:
> > 
> > > And what's wrong with using known good spots like the freezer?
> > 
> > Quoting Tejun from the thread Jiri Slaby likely had in 
> > mind:
> > 
> > "The fact that they may coincide often can be useful as a 
> > guideline or whatever but I'm completely against just 
> > mushing it together when it isn't correct.  This kind of 
> > things quickly lead to ambiguous situations where people 
> > are not sure about the specific semantics or guarantees 
> > of the construct and implement weird voodoo code followed 
> > by voodoo fixes.  We already had a full round of that 
> > with the kernel freezer itself, where people thought that 
> > the freezer magically makes PM work properly for a 
> > subsystem.  Let's please not do that again."
> 
> I don't follow this vague argument.
> 
> The concept of 'freezing' all userspace execution is pretty 
> unambiguous: tasks that are running are trapped out at 
> known safe points such as context switch points or syscall 
> entry. Once all tasks have stopped, the system is frozen in 
> the sense that only the code we want is running, so you can 
> run special code without worrying about races.
> 
> What's the problem with that? Why would it be fundamentally 
> unsuitable for live patching?

For live patching it doesn't matter whether code is running, sleeping or
frozen.

What matters is whether there is state before patching that may not be
valid after patching.

For userspace tasks, the exit from a syscall is a perfect moment for
switching to the "after" state, as all stacks, and thus all local
variables are gone and no local state exists in the kernel for the
thread.
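
Schematically (hypothetical helper names, not necessarily what patch 9/9
implements):

	/* on the syscall return path; the task holds no kernel stack
	 * state here, so switching its universe is always safe */
	if (klp_transition_in_progress(current))
		klp_switch_task_universe(current);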

The freezer is a logical choice for kernel threads, however, given that
kernel threads have no defined entry/exit point and execute within a
single main function, local variables stay and thus local state persists
from before to after freezing.

Defining that no local state within a kernel thread may be relied upon
after exiting from the freezer is certainly possible, and is already
true for many kernel threads.

It isn't a given property of the freezer itself, though. And it isn't
obvious to authors of new kernel threads either.

The ideal solution would be to convert the majority of kernel threads to
workqueues, because then there is a defined entry/exit point over which
state isn't transferred. That is a lot of work, though, and has other
drawbacks, particularly in the realtime space.
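
To illustrate the contrast, a sketch (do_step() is a made-up work
function):

	#include <linux/freezer.h>
	#include <linux/kthread.h>
	#include <linux/workqueue.h>

	/* kthread: local state survives the freeze point */
	static int my_kthread(void *data)
	{
		long cached = 0;	/* persists across try_to_freeze() */

		set_freezable();
		while (!kthread_should_stop()) {
			try_to_freeze();	/* may freeze here, but 'cached'
						 * still holds pre-freeze state
						 * after thawing */
			cached = do_step(cached);
		}
		return 0;
	}

	/* workqueue item: every invocation starts on a fresh stack,
	 * so no local state can cross the entry/exit boundary */
	static void my_work_fn(struct work_struct *work)
	{
		long cached = 0;	/* dies when the function returns */

		cached = do_step(cached);
	}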

-- 
Vojtech Pavlik
Director SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-18 20:44               ` Vojtech Pavlik
@ 2015-02-19  9:52                 ` Peter Zijlstra
  2015-02-19 10:11                   ` Vojtech Pavlik
  0 siblings, 1 reply; 106+ messages in thread
From: Peter Zijlstra @ 2015-02-19  9:52 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Ingo Molnar, Jiri Kosina, Josh Poimboeuf, Ingo Molnar,
	Masami Hiramatsu, live-patching, linux-kernel, Seth Jennings

On Wed, Feb 18, 2015 at 09:44:44PM +0100, Vojtech Pavlik wrote:
> For live patching it doesn't matter whether code is running, sleeping or
> frozen.
> 
> What matters is whether there is state before patching that may not be
> valid after patching.
> 
> For userspace tasks, the exit from a syscall is a perfect moment for
> switching to the "after" state, as all stacks, and thus all local
> variables are gone and no local state exists in the kernel for the
> thread.
> 
> The freezer is a logical choice for kernel threads, however, given that
> kernel threads have no defined entry/exit point and execute within a
> single main function, local variables stay and thus local state persists
> from before to after freezing.
> 
> Defining that no local state within a kernel thread may be relied upon
> after exiting from the freezer is certainly possible, and is already
> true for many kernel threads.
> 
> It isn't a given property of the freezer itself, though. And it isn't
> obvious to authors of new kernel threads either.
> 
> The ideal solution would be to convert the majority of kernel threads to
> workqueues, because then there is a defined entry/exit point over which
> state isn't transferred. That is a lot of work, though, and has other
> drawbacks, particularly in the realtime space.

kthread_park() functionality seems to be exactly what you want.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-19  9:52                 ` Peter Zijlstra
@ 2015-02-19 10:11                   ` Vojtech Pavlik
  2015-02-19 10:51                     ` Peter Zijlstra
  0 siblings, 1 reply; 106+ messages in thread
From: Vojtech Pavlik @ 2015-02-19 10:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Jiri Kosina, Josh Poimboeuf, Ingo Molnar,
	Masami Hiramatsu, live-patching, linux-kernel, Seth Jennings

On Thu, Feb 19, 2015 at 10:52:51AM +0100, Peter Zijlstra wrote:

> On Wed, Feb 18, 2015 at 09:44:44PM +0100, Vojtech Pavlik wrote:
> > For live patching it doesn't matter whether code is running, sleeping or
> > frozen.
> > 
> > What matters is whether there is state before patching that may not be
> > valid after patching.
> > 
> > For userspace tasks, the exit from a syscall is a perfect moment for
> > switching to the "after" state, as all stacks, and thus all local
> > variables are gone and no local state exists in the kernel for the
> > thread.
> > 
> > The freezer is a logical choice for kernel threads, however, given that
> > kernel threads have no defined entry/exit point and execute within a
> > single main function, local variables stay and thus local state persists
> > from before to after freezing.
> > 
> > Defining that no local state within a kernel thread may be relied upon
> > after exiting from the freezer is certainly possible, and is already
> > true for many kernel threads.
> > 
> > It isn't a given property of the freezer itself, though. And it isn't
> > obvious to authors of new kernel threads either.
> > 
> > The ideal solution would be to convert the majority of kernel threads to
> > workqueues, because then there is a defined entry/exit point over which
> > state isn't transferred. That is a lot of work, though, and has other
> > drawbacks, particularly in the realtime space.
> 
> kthread_park() functionality seems to be exactly what you want.

It might be exactly that, indeed. The requirement of not just cleaning
up, but also not using the contents of local variables from before parking,
would need to be documented.

And kernel threads would need to start using it, too. I have been able
to find only one instance where this functionality is actually used. So it is
again a matter of a massive patch adding that, like with the approach of
converting kernel threads to workqueues.

By the way, if kthread_park() was implemented all through the kernel,
would we still need the freezer for kernel threads at all? Since parking
seems to be stronger than freezing, it could also be used for that
purpose.

Vojtech

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
  2015-02-19 10:11                   ` Vojtech Pavlik
@ 2015-02-19 10:51                     ` Peter Zijlstra
  0 siblings, 0 replies; 106+ messages in thread
From: Peter Zijlstra @ 2015-02-19 10:51 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Ingo Molnar, Jiri Kosina, Josh Poimboeuf, Ingo Molnar,
	Masami Hiramatsu, live-patching, linux-kernel, Seth Jennings

On Thu, Feb 19, 2015 at 11:11:53AM +0100, Vojtech Pavlik wrote:
> On Thu, Feb 19, 2015 at 10:52:51AM +0100, Peter Zijlstra wrote:
> > kthread_park() functionality seems to be exactly what you want.
> 
> It might be exactly that, indeed. The requirement of not just cleaning
> up, but also not using the contents of local variables from before parking,
> would need to be documented.
> 
> And kernel threads would need to start using it, too. I have been able
> to find only one instance where this functionality is actually used. 

Yeah, there's work to be done there. It was introduced for the cpu
hotplug stuff, and some per-cpu threads use this through the smpboot
infrastructure.

More need to be converted. It would be relatively straightforward to park
threaded IRQs on irq-suspend-like activity, for example.

> So it is
> again a matter of a massive patch adding that, like with the approach of
> converting kernel threads to workqueues.

Yeah, but not nearly all kthreads can be converted to workqueues. And
there are various problems with workqueues that make them undesirable for
some even where conversion is possible.

> By the way, if kthread_park() was implemented all through the kernel,
> would we still need the freezer for kernel threads at all? Since parking
> seems to be stronger than freezing, it could also be used for that
> purpose.

I think not; there might of course be horrible exceptions but in general
parking should be good enough indeed.
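
For reference, the parking pattern being discussed looks roughly like
this (a sketch; do_unit_of_work() is made up):

	#include <linux/kthread.h>

	/* thread side: parks itself at a known-safe point */
	static int my_kthread(void *data)
	{
		while (!kthread_should_stop()) {
			if (kthread_should_park())
				kthread_parkme();	/* sleeps until unparked */
			do_unit_of_work(data);
		}
		return 0;
	}

	/* controller side, e.g. during a patch transition */
	kthread_park(task);	/* returns once the thread is parked */
	/* no thread-local work is in flight at this point */
	kthread_unpark(task);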

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-13 14:41       ` Josh Poimboeuf
@ 2015-02-24 11:27         ` Masami Hiramatsu
  0 siblings, 0 replies; 106+ messages in thread
From: Masami Hiramatsu @ 2015-02-24 11:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Kosina, Seth Jennings, Vojtech Pavlik, live-patching, linux-kernel

(2015/02/13 23:41), Josh Poimboeuf wrote:
> On Fri, Feb 13, 2015 at 03:22:15PM +0100, Jiri Kosina wrote:
>> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
>>
>>>> How about we take a slightly different approach -- put a probe (or ftrace) 
>>>> on __switch_to() during a klp transition period, and examine stacktraces 
>>>> for tasks that are just about to start running from there?
>>>>
>>>> The only tasks that would not be covered by this would be purely CPU-bound 
>>>> tasks that never schedule. But we are likely in trouble with those anyway, 
>>>> because odds are that non-rescheduling CPU-bound tasks are also 
>>>> RT-priority tasks running on isolated CPUs, which we will fail to handle 
>>>> anyway.
>>>>
>>>> I think Masami used a similar trick in his kpatch-without-stopmachine 
>>>> approach.
>>>
>>> Yeah, that's definitely an option, though I'm really not too crazy about
>>> it.  Hooking into the scheduler is kind of scary and disruptive.  
>>
>> This is basically about running a stack check on ->next before 
>> switching to it, i.e. a read-only operation (admittedly inducing some 
>> latency, but that's the same with locking the runqueue). And only when in 
>> the transition phase.
> 
> Yes, but it would introduce much more latency than locking rq, since
> there would be at least some added latency to every schedule() call
> during the transition phase.  Locking the rq would only add latency in
> those cases where another CPU is trying to do a context switch while
> we're holding the lock.

If we can implement the checking routine at the entry to the switching
process, it will not have such a big cost. My prototype code used kprobes
just as a hack, but we can do it in the scheduler too.

>
> It also seems much more dangerous.  A bug in __switch_to() could easily
> do a lot of damage.

Indeed. It requires per-task locking in the scheduler for safety when
switching, to avoid concurrent stack checking.
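
Roughly, with made-up names (this is not code from my prototype):

	/* in the scheduler, just before switching to 'next', and only
	 * while a patch transition is in flight */
	if (klp_transition_in_progress && !klp_task_converged(next) &&
	    klp_stack_check_safe(next))	/* read-only backtrace walk */
		klp_switch_task_universe(next);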

>>> We'd also have to wake up all the sleeping processes.
>>
>> Yes, I don't think there is a way around that.
> 
> Actually this patch set is a way around that :-)

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
                   ` (12 preceding siblings ...)
  2015-02-13 10:14 ` Jiri Kosina
@ 2015-03-10 16:23 ` Josh Poimboeuf
  2015-03-10 21:02   ` Jiri Kosina
  13 siblings, 1 reply; 106+ messages in thread
From: Josh Poimboeuf @ 2015-03-10 16:23 UTC (permalink / raw)
  To: Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: Masami Hiramatsu, live-patching, linux-kernel, Peter Zijlstra,
	Ingo Molnar

On Mon, Feb 09, 2015 at 11:31:12AM -0600, Josh Poimboeuf wrote:
> This patch set implements a livepatch consistency model, targeted for 3.21.
> Now that we have a solid livepatch code base, this is the biggest remaining
> missing piece.
> 
> This code stems from the design proposal made by Vojtech [1] in November.  It
> makes live patching safer in general.  Specifically, it allows you to apply
> patches which change function prototypes.  It also lays the groundwork for
> future code changes which will enable data and data semantic changes.
> 
> It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
> checking with kGraft's per-task consistency.  When patching, tasks are
> carefully transitioned from the old universe to the new universe.  A task can
> only be switched to the new universe if it's not using a function that is to be
> patched or unpatched.  After all tasks have moved to the new universe, the
> patching process is complete.
[...]

Just an update on the status of this RFC.  Thanks to everybody for all
the useful comments.  I plan to incorporate the resulting changes in an
eventual v2 of this patch set.

But, as Peter and Ingo have pointed out, stack traces are indeed
unreliable.  I have some ideas about how to improve them, coming soon in
another RFC, which will be a prerequisite for this patch set.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-03-10 16:23 ` Josh Poimboeuf
@ 2015-03-10 21:02   ` Jiri Kosina
  2015-03-10 21:30     ` Josh Poimboeuf
  0 siblings, 1 reply; 106+ messages in thread
From: Jiri Kosina @ 2015-03-10 21:02 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel, Peter Zijlstra, Ingo Molnar, jslaby, mbenes

On Tue, 10 Mar 2015, Josh Poimboeuf wrote:

> Just an update on the status of this RFC.  Thanks to everybody for all 
> the useful comments.  I plan to incorporate the resulting changes in an 
> eventual v2 of this patch set.
> 
> But, as Peter and Ingo have pointed out, stack traces are indeed
> unreliable.  I have some ideas about how to improve them, coming soon in
> another RFC, which will be a prerequisite for this patch set.

Thanks for the update. Just FYI, in parallel, Jiri Slaby (with help from a 
few other people, added to CC) is working on an RFC for per-thread patching 
on top of the livepatching core, so that we can actually compare the pros 
and cons of both approaches and implementations.

It might still take some time before it's finalized and sent out as an RFC, 
as I'd like it to also contain the "fake signal" task handling suggested 
by Ingo. Miroslav is working on that part.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH 0/9] livepatch: consistency model
  2015-03-10 21:02   ` Jiri Kosina
@ 2015-03-10 21:30     ` Josh Poimboeuf
  0 siblings, 0 replies; 106+ messages in thread
From: Josh Poimboeuf @ 2015-03-10 21:30 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Vojtech Pavlik, Masami Hiramatsu, live-patching,
	linux-kernel, Peter Zijlstra, Ingo Molnar, jslaby, mbenes

On Tue, Mar 10, 2015 at 05:02:20PM -0400, Jiri Kosina wrote:
> On Tue, 10 Mar 2015, Josh Poimboeuf wrote:
> 
> > Just an update on the status of this RFC.  Thanks to everybody for all 
> > the useful comments.  I plan to incorporate the resulting changes in an 
> > eventual v2 of this patch set.
> > 
> > But, as Peter and Ingo have pointed out, stack traces are indeed
> > unreliable.  I have some ideas about how to improve them, coming soon in
> > another RFC, which will be a prerequisite for this patch set.
> 
> Thanks for the update. Just FYI, in parallel, Jiri Slaby (with help from a 
> few other people) (added to CC) is working on RFC on a per-thread patching 
> on top of the livepatching core, so that we can actually compare pros and 
> cons of both aproaches and implementations.
> 
> It might still take some time before its finalized and sent out as a RFC, 
> as I'd like it to also contain the "fake signal" task handling suggested 
> by Ingo. Miroslav is working on that part.

Ok, thanks for the heads up.

I think the two approaches are complementary.  _If_ we can make stack
checking safe, then IMO it's by far the most lightweight and least
disruptive option, so it should be the first wave of attack.  We can use
that to transition most of the tasks.

Any remaining "straggler" tasks (which are either sleeping on a patched
function or have a potentially unreliable stack) can use something else
like fake signals.

-- 
Josh

^ permalink raw reply	[flat|nested] 106+ messages in thread

end of thread, other threads:[~2015-03-10 21:30 UTC | newest]

Thread overview: 106+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-09 17:31 [RFC PATCH 0/9] livepatch: consistency model Josh Poimboeuf
2015-02-09 17:31 ` [RFC PATCH 1/9] livepatch: simplify disable error path Josh Poimboeuf
2015-02-13 12:25   ` Miroslav Benes
2015-02-18 17:03     ` Petr Mladek
2015-02-18 20:07   ` Jiri Kosina
2015-02-09 17:31 ` [RFC PATCH 2/9] livepatch: separate enabled and patched states Josh Poimboeuf
2015-02-10 16:44   ` Jiri Slaby
2015-02-10 17:21     ` Josh Poimboeuf
2015-02-13 12:57   ` Miroslav Benes
2015-02-13 14:39     ` Josh Poimboeuf
2015-02-13 14:46       ` Miroslav Benes
2015-02-09 17:31 ` [RFC PATCH 3/9] livepatch: move patching functions into patch.c Josh Poimboeuf
2015-02-10 18:27   ` Jiri Slaby
2015-02-10 18:50     ` Josh Poimboeuf
2015-02-13 14:28   ` Miroslav Benes
2015-02-13 15:09     ` Josh Poimboeuf
2015-02-09 17:31 ` [RFC PATCH 4/9] livepatch: get function sizes Josh Poimboeuf
2015-02-10 18:30   ` Jiri Slaby
2015-02-10 18:53     ` Josh Poimboeuf
2015-02-09 17:31 ` [RFC PATCH 5/9] sched: move task rq locking functions to sched.h Josh Poimboeuf
2015-02-10 10:48   ` Masami Hiramatsu
2015-02-10 14:54     ` Josh Poimboeuf
2015-02-09 17:31 ` [RFC PATCH 6/9] livepatch: create per-task consistency model Josh Poimboeuf
2015-02-10 10:58   ` Masami Hiramatsu
2015-02-10 14:59     ` Josh Poimboeuf
2015-02-10 15:59   ` Miroslav Benes
2015-02-10 16:56     ` Josh Poimboeuf
2015-02-11 16:28       ` Miroslav Benes
2015-02-11 20:23         ` Josh Poimboeuf
2015-02-10 19:27   ` Seth Jennings
2015-02-10 19:32     ` Josh Poimboeuf
2015-02-11 10:21   ` Miroslav Benes
2015-02-11 20:19     ` Josh Poimboeuf
2015-02-12 10:45       ` Miroslav Benes
2015-02-12  3:21   ` Josh Poimboeuf
2015-02-12 11:56     ` Peter Zijlstra
2015-02-12 12:25       ` Jiri Kosina
2015-02-12 12:36         ` Peter Zijlstra
2015-02-12 12:39           ` Jiri Kosina
2015-02-12 12:39         ` Peter Zijlstra
2015-02-12 12:42           ` Jiri Kosina
2015-02-12 13:01             ` Josh Poimboeuf
2015-02-12 12:51       ` Josh Poimboeuf
2015-02-12 13:08         ` Peter Zijlstra
2015-02-12 13:16           ` Jiri Kosina
2015-02-12 14:20             ` Josh Poimboeuf
2015-02-12 14:27               ` Jiri Kosina
2015-02-12 13:16           ` Jiri Slaby
2015-02-12 13:35             ` Peter Zijlstra
2015-02-12 14:08               ` Jiri Kosina
2015-02-12 15:24                 ` Josh Poimboeuf
2015-02-12 14:20               ` Jiri Slaby
2015-02-12 14:32           ` Jiri Kosina
2015-02-18 20:17             ` Ingo Molnar
2015-02-18 20:44               ` Vojtech Pavlik
2015-02-19  9:52                 ` Peter Zijlstra
2015-02-19 10:11                   ` Vojtech Pavlik
2015-02-19 10:51                     ` Peter Zijlstra
2015-02-12 13:26     ` Jiri Slaby
2015-02-12 15:48       ` Josh Poimboeuf
2015-02-14 11:40   ` Jiri Slaby
2015-02-17 14:59     ` Josh Poimboeuf
2015-02-16 14:19   ` Miroslav Benes
2015-02-17 15:10     ` Josh Poimboeuf
2015-02-17 15:48       ` Miroslav Benes
2015-02-17 16:01         ` Josh Poimboeuf
2015-02-18 12:42           ` Miroslav Benes
2015-02-18 13:15             ` Josh Poimboeuf
2015-02-18 13:42               ` Miroslav Benes
2015-02-09 17:31 ` [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status Josh Poimboeuf
2015-02-10 18:47   ` Jiri Slaby
2015-02-10 18:57     ` Josh Poimboeuf
2015-02-09 17:31 ` [RFC PATCH 8/9] livepatch: allow patch modules to be removed Josh Poimboeuf
2015-02-10 19:02   ` Jiri Slaby
2015-02-10 19:57     ` Josh Poimboeuf
2015-02-11 10:55       ` Jiri Slaby
2015-02-11 18:39         ` Josh Poimboeuf
2015-02-12 15:22     ` Miroslav Benes
2015-02-13 12:44       ` Josh Poimboeuf
2015-02-13 16:04       ` Josh Poimboeuf
2015-02-13 16:17         ` Miroslav Benes
2015-02-13 20:49           ` Josh Poimboeuf
2015-02-16 16:06             ` Miroslav Benes
2015-02-17 15:55               ` Josh Poimboeuf
2015-02-17 16:38                 ` Miroslav Benes
2015-02-09 17:31 ` [RFC PATCH 9/9] livepatch: update task universe when exiting kernel Josh Poimboeuf
2015-02-16 10:16   ` Jiri Slaby
2015-02-17 14:58     ` Josh Poimboeuf
2015-02-09 23:15 ` [RFC PATCH 0/9] livepatch: consistency model Jiri Kosina
2015-02-10  3:05   ` Josh Poimboeuf
2015-02-10  7:21     ` Jiri Kosina
2015-02-10  8:57 ` Jiri Kosina
2015-02-10 14:43   ` Josh Poimboeuf
2015-02-10 11:16 ` Masami Hiramatsu
2015-02-10 15:59   ` Josh Poimboeuf
2015-02-10 17:29     ` Josh Poimboeuf
2015-02-13 10:14 ` Jiri Kosina
2015-02-13 14:19   ` Josh Poimboeuf
2015-02-13 14:22     ` Jiri Kosina
2015-02-13 14:40       ` Miroslav Benes
2015-02-13 14:55         ` Josh Poimboeuf
2015-02-13 14:41       ` Josh Poimboeuf
2015-02-24 11:27         ` Masami Hiramatsu
2015-03-10 16:23 ` Josh Poimboeuf
2015-03-10 21:02   ` Jiri Kosina
2015-03-10 21:30     ` Josh Poimboeuf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).