[RFC PATCH] livepatch: allow removal of a disabled patch

* [RFC PATCH] livepatch: allow removal of a disabled patch
@ 2016-05-02 11:57 Miroslav Benes
  2016-05-02 15:08 ` Josh Poimboeuf
  2016-05-03 21:37 ` Josh Poimboeuf
  0 siblings, 2 replies; 21+ messages in thread
From: Miroslav Benes @ 2016-05-02 11:57 UTC (permalink / raw)
  To: jpoimboe, jeyu, jikos, pmladek, jslaby
  Cc: live-patching, linux-kernel, huawei.libin, minfei.huang, Miroslav Benes

Currently we do not allow patch module to unload since there is no
method to determine if a task is still running in the patched code.

The consistency model gives us the way because when the patching
finishes we know that all tasks were marked as safe to call a new
patched function. Thus every new call to the function calls the new
patched code and at the same time no task can be somewhere in the old
code, because it had to leave that code to be marked as safe.

We can safely let the patch module go after that.

Completion is used for synchronization between module removal and sysfs
infrastructure. Note that we still do not allow the removal for
immediate model, that is no consistency model.

With this change a call to try_module_get() is moved to
__klp_enable_patch from klp_register_patch to make module reference
counting symmetric (module_put() is in a patch disable path) and to
allow to take a new reference to a disabled module when being enabled.

Also all kobject_put(&patch->kobj) calls are moved outside of klp_mutex
lock protection to prevent a deadlock situation when
klp_unregister_patch is called and sysfs directories are removed. There
is no need to do the same for other kobject_put() callsites as we
currently do not have their sysfs counterparts.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
---
Based on Josh's v2.

There are still two things which need to be discussed.

1. Do we really need a completion? If I am not missing something
kobject_del() always waits for sysfs callers to leave thanks to kernfs
active protection. The only problem is in kobject_release() when
CONFIG_DEBUG_KOBJECT_RELEASE=y. In this case kobject_delayed_cleanup()
is delay-scheduled, which is a problem for us. For this we definitely
need the completion, right? JiriS, is this correct?

2. I refuse to remove a module when patch->immediate is set or one of
its func->immediate is set. We can do slightly better, because we can
allow the module to go if patch->immediate is false and all func with
immediate set to true are from some object which is not loaded. This is
for discussion because it needs more changes in the code (I'd like to
call klp_is_object_loaded() which is somewhere else and so on).


 Documentation/livepatch/livepatch.txt | 29 ++++---------
 include/linux/livepatch.h             |  3 ++
 kernel/livepatch/core.c               | 80 ++++++++++++++++++++++-------------
 kernel/livepatch/transition.c         | 12 +++++-
 samples/livepatch/livepatch-sample.c  |  1 -
 5 files changed, 72 insertions(+), 53 deletions(-)

diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
index bee86d0fe854..a05124258a73 100644
--- a/Documentation/livepatch/livepatch.txt
+++ b/Documentation/livepatch/livepatch.txt
@@ -272,8 +272,15 @@ section "Livepatch life-cycle" below for more details about these
 two operations.
 
 Module removal is only safe when there are no users of the underlying
-functions.  The immediate consistency model is not able to detect this;
-therefore livepatch modules cannot be removed. See "Limitations" below.
+functions. The immediate consistency model is not able to detect this. The
+code just redirects the functions at the very beginning and it does not
+check if the functions are in use. In other words, it knows when the
+functions get called but it does not know when the functions return.
+Therefore it cannot be decided when the livepatch module can be safely
+removed. This is solved by a hybrid consistency model. When the system is
+transitioned to a new patch state (patched/unpatched) it is guaranteed that
+no task sleeps or runs in the old code.
+
 
 5. Livepatch life-cycle
 =======================
@@ -444,24 +451,6 @@ See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.
     There is work in progress to remove this limitation.
 
 
-  + Livepatch modules can not be removed.
-
-    The current implementation just redirects the functions at the very
-    beginning. It does not check if the functions are in use. In other
-    words, it knows when the functions get called but it does not
-    know when the functions return. Therefore it can not decide when
-    the livepatch module can be safely removed.
-
-    This will get most likely solved once a more complex consistency model
-    is supported. The idea is that a safe state for patching should also
-    mean a safe state for removing the patch.
-
-    Note that the patch itself might get disabled by writing zero
-    to /sys/kernel/livepatch/<patch>/enabled. It causes that the new
-    code will not longer get called. But it does not guarantee
-    that anyone is not sleeping anywhere in the new code.
-
-
   + Livepatch works reliably only when the dynamic ftrace is located at
     the very beginning of the function.
 
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index cd2dfd95e109..a9651c6e1b52 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -23,6 +23,7 @@
 
 #include <linux/module.h>
 #include <linux/ftrace.h>
+#include <linux/completion.h>
 
 #if IS_ENABLED(CONFIG_LIVEPATCH)
 
@@ -114,6 +115,7 @@ struct klp_object {
  * @list:	list node for global list of registered patches
  * @kobj:	kobject for sysfs resources
  * @enabled:	the patch is enabled (but operation may be incomplete)
+ * @finish:	for waiting till it is safe to remove the patch module
  */
 struct klp_patch {
 	/* external */
@@ -125,6 +127,7 @@ struct klp_patch {
 	struct list_head list;
 	struct kobject kobj;
 	bool enabled;
+	struct completion finish;
 };
 
 #define klp_for_each_object(patch, obj) \
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index d55701222b49..a649186fb4af 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -29,6 +29,7 @@
 #include <linux/livepatch.h>
 #include <linux/elf.h>
 #include <linux/moduleloader.h>
+#include <linux/completion.h>
 #include <asm/cacheflush.h>
 #include "patch.h"
 #include "transition.h"
@@ -371,6 +372,18 @@ static int __klp_enable_patch(struct klp_patch *patch)
 	    !list_prev_entry(patch, list)->enabled)
 		return -EBUSY;
 
+	/*
+	* A reference is taken on the patch module to prevent it from being
+	* unloaded.
+	*
+	* Note: For immediate (no consistency model) patches we don't allow
+	* patch modules to unload since there is no safe/sane method to
+	* determine if a thread is still running in the patched code contained
+	* in the patch module once the ftrace registration is successful.
+	*/
+	if (!try_module_get(patch->mod))
+		return -ENODEV;
+
 	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
 	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
 
@@ -459,6 +472,15 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 
 	mutex_lock(&klp_mutex);
 
+	if (!klp_is_patch_registered(patch)) {
+		/*
+		 * Module with the patch could either disappear meanwhile or is
+		 * not properly initialized yet.
+		 */
+		ret = -EINVAL;
+		goto err;
+	}
+
 	if (patch->enabled == val) {
 		/* already in requested state */
 		ret = -EINVAL;
@@ -515,10 +537,10 @@ static struct attribute *klp_patch_attrs[] = {
 
 static void klp_kobj_release_patch(struct kobject *kobj)
 {
-	/*
-	 * Once we have a consistency model we'll need to module_put() the
-	 * patch module here.  See klp_register_patch() for more details.
-	 */
+	struct klp_patch *patch;
+
+	patch = container_of(kobj, struct klp_patch, kobj);
+	complete(&patch->finish);
 }
 
 static struct kobj_type klp_ktype_patch = {
@@ -589,7 +611,6 @@ static void klp_free_patch(struct klp_patch *patch)
 	klp_free_objects_limited(patch, NULL);
 	if (!list_empty(&patch->list))
 		list_del(&patch->list);
-	kobject_put(&patch->kobj);
 }
 
 static int klp_init_func(struct klp_object *obj, struct klp_func *func)
@@ -700,11 +721,14 @@ static int klp_init_patch(struct klp_patch *patch)
 	mutex_lock(&klp_mutex);
 
 	patch->enabled = false;
+	init_completion(&patch->finish);
 
 	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
 				   klp_root_kobj, "%s", patch->mod->name);
-	if (ret)
-		goto unlock;
+	if (ret) {
+		mutex_unlock(&klp_mutex);
+		return ret;
+	}
 
 	klp_for_each_object(patch, obj) {
 		ret = klp_init_object(patch, obj);
@@ -720,9 +744,12 @@ static int klp_init_patch(struct klp_patch *patch)
 
 free:
 	klp_free_objects_limited(patch, obj);
-	kobject_put(&patch->kobj);
-unlock:
+
 	mutex_unlock(&klp_mutex);
+
+	kobject_put(&patch->kobj);
+	wait_for_completion(&patch->finish);
+
 	return ret;
 }
 
@@ -736,23 +763,29 @@ static int klp_init_patch(struct klp_patch *patch)
  */
 int klp_unregister_patch(struct klp_patch *patch)
 {
-	int ret = 0;
+	int ret;
 
 	mutex_lock(&klp_mutex);
 
 	if (!klp_is_patch_registered(patch)) {
 		ret = -EINVAL;
-		goto out;
+		goto err;
 	}
 
 	if (patch->enabled) {
 		ret = -EBUSY;
-		goto out;
+		goto err;
 	}
 
 	klp_free_patch(patch);
 
-out:
+	mutex_unlock(&klp_mutex);
+
+	kobject_put(&patch->kobj);
+	wait_for_completion(&patch->finish);
+
+	return 0;
+err:
 	mutex_unlock(&klp_mutex);
 	return ret;
 }
@@ -765,12 +798,13 @@ EXPORT_SYMBOL_GPL(klp_unregister_patch);
  * Initializes the data structure associated with the patch and
  * creates the sysfs interface.
  *
+ * There is no need to take the reference on the patch module here. It is done
+ * later when the patch is enabled.
+ *
  * Return: 0 on success, otherwise error
  */
 int klp_register_patch(struct klp_patch *patch)
 {
-	int ret;
-
 	if (!patch || !patch->mod)
 		return -EINVAL;
 
@@ -783,21 +817,7 @@ int klp_register_patch(struct klp_patch *patch)
 	if (!klp_initialized())
 		return -ENODEV;
 
-	/*
-	 * A reference is taken on the patch module to prevent it from being
-	 * unloaded.  Right now, we don't allow patch modules to unload since
-	 * there is currently no method to determine if a thread is still
-	 * running in the patched code contained in the patch module once
-	 * the ftrace registration is successful.
-	 */
-	if (!try_module_get(patch->mod))
-		return -ENODEV;
-
-	ret = klp_init_patch(patch);
-	if (ret)
-		module_put(patch->mod);
-
-	return ret;
+	return klp_init_patch(patch);
 }
 EXPORT_SYMBOL_GPL(klp_register_patch);
 
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 92819bb0961b..6cc49d253195 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -197,13 +197,21 @@ void klp_complete_transition(void)
 	struct klp_func *func;
 	struct task_struct *g, *task;
 	unsigned int cpu;
+	bool is_immediate = false;
 
 	if (klp_transition_patch->immediate)
 		goto done;
 
-	klp_for_each_object(klp_transition_patch, obj)
-		klp_for_each_func(obj, func)
+	klp_for_each_object(klp_transition_patch, obj) {
+		klp_for_each_func(obj, func) {
 			func->transition = false;
+			if (func->immediate)
+				is_immediate = true;
+		}
+	}
+
+	if (klp_target_state == KLP_UNPATCHED && !is_immediate)
+		module_put(klp_transition_patch->mod);
 
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, task) {
diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
index e34f871e69b1..a84676aa7c62 100644
--- a/samples/livepatch/livepatch-sample.c
+++ b/samples/livepatch/livepatch-sample.c
@@ -82,7 +82,6 @@ static int livepatch_init(void)
 
 static void livepatch_exit(void)
 {
-	WARN_ON(klp_disable_patch(&patch));
 	WARN_ON(klp_unregister_patch(&patch));
 }
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread