linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] livepatch: allow removal of a disabled patch
@ 2016-06-01  8:31 Miroslav Benes
  2016-06-01 11:29 ` Christopher Arges
  2016-06-07  9:36 ` Petr Mladek
  0 siblings, 2 replies; 8+ messages in thread
From: Miroslav Benes @ 2016-06-01  8:31 UTC (permalink / raw)
  To: jpoimboe, jeyu, jikos, pmladek
  Cc: jslaby, live-patching, linux-kernel, huawei.libin, minfei.huang,
	Miroslav Benes

Currently we do not allow patch module to unload since there is no
method to determine if a task is still running in the patched code.

The consistency model gives us the way because when the unpatching
finishes we know that all tasks were marked as safe to call an original
function. Thus every new call to the function calls the original code
and at the same time no task can be somewhere in the patched code,
because it had to leave that code to be marked as safe.

We can safely let the patch module go after that.

Completion is used for synchronization between module removal and sysfs
infrastructure in a similar way to commit 942e443127e9 ("module: Fix
mod->mkobj.kobj potentially freed too early").

Note that we still do not allow the removal for immediate model, that is
no consistency model. The module refcount may increase in this case if
somebody disables and enables the patch several times. This should not
cause any harm.

With this change a call to try_module_get() is moved to
__klp_enable_patch from klp_register_patch to make module reference
counting symmetric (module_put() is in a patch disable path) and to
allow to take a new reference to a disabled module when being enabled.

Also all kobject_put(&patch->kobj) calls are moved outside of klp_mutex
lock protection to prevent a deadlock situation when
klp_unregister_patch is called and sysfs directories are removed. There
is no need to do the same for other kobject_put() callsites as we
currently do not have their sysfs counterparts.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
---
Based on Josh's v2 of the consistency model.

 Documentation/livepatch/livepatch.txt | 29 ++++---------
 include/linux/livepatch.h             |  3 ++
 kernel/livepatch/core.c               | 80 ++++++++++++++++++++++-------------
 kernel/livepatch/transition.c         | 12 +++++-
 samples/livepatch/livepatch-sample.c  |  1 -
 5 files changed, 72 insertions(+), 53 deletions(-)

diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
index bee86d0fe854..a05124258a73 100644
--- a/Documentation/livepatch/livepatch.txt
+++ b/Documentation/livepatch/livepatch.txt
@@ -272,8 +272,15 @@ section "Livepatch life-cycle" below for more details about these
 two operations.
 
 Module removal is only safe when there are no users of the underlying
-functions.  The immediate consistency model is not able to detect this;
-therefore livepatch modules cannot be removed. See "Limitations" below.
+functions. The immediate consistency model is not able to detect this. The
+code just redirects the functions at the very beginning and it does not
+check if the functions are in use. In other words, it knows when the
+functions get called but it does not know when the functions return.
+Therefore it cannot be decided when the livepatch module can be safely
+removed. This is solved by a hybrid consistency model. When the system is
+transitioned to a new patch state (patched/unpatched) it is guaranteed that
+no task sleeps or runs in the old code.
+
 
 5. Livepatch life-cycle
 =======================
@@ -444,24 +451,6 @@ See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.
     There is work in progress to remove this limitation.
 
 
-  + Livepatch modules can not be removed.
-
-    The current implementation just redirects the functions at the very
-    beginning. It does not check if the functions are in use. In other
-    words, it knows when the functions get called but it does not
-    know when the functions return. Therefore it can not decide when
-    the livepatch module can be safely removed.
-
-    This will get most likely solved once a more complex consistency model
-    is supported. The idea is that a safe state for patching should also
-    mean a safe state for removing the patch.
-
-    Note that the patch itself might get disabled by writing zero
-    to /sys/kernel/livepatch/<patch>/enabled. It causes that the new
-    code will not longer get called. But it does not guarantee
-    that anyone is not sleeping anywhere in the new code.
-
-
   + Livepatch works reliably only when the dynamic ftrace is located at
     the very beginning of the function.
 
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index cd2dfd95e109..a9651c6e1b52 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -23,6 +23,7 @@
 
 #include <linux/module.h>
 #include <linux/ftrace.h>
+#include <linux/completion.h>
 
 #if IS_ENABLED(CONFIG_LIVEPATCH)
 
@@ -114,6 +115,7 @@ struct klp_object {
  * @list:	list node for global list of registered patches
  * @kobj:	kobject for sysfs resources
  * @enabled:	the patch is enabled (but operation may be incomplete)
+ * @finish:	for waiting till it is safe to remove the patch module
  */
 struct klp_patch {
 	/* external */
@@ -125,6 +127,7 @@ struct klp_patch {
 	struct list_head list;
 	struct kobject kobj;
 	bool enabled;
+	struct completion finish;
 };
 
 #define klp_for_each_object(patch, obj) \
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index d55701222b49..a649186fb4af 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -29,6 +29,7 @@
 #include <linux/livepatch.h>
 #include <linux/elf.h>
 #include <linux/moduleloader.h>
+#include <linux/completion.h>
 #include <asm/cacheflush.h>
 #include "patch.h"
 #include "transition.h"
@@ -371,6 +372,18 @@ static int __klp_enable_patch(struct klp_patch *patch)
 	    !list_prev_entry(patch, list)->enabled)
 		return -EBUSY;
 
+	/*
+	* A reference is taken on the patch module to prevent it from being
+	* unloaded.
+	*
+	* Note: For immediate (no consistency model) patches we don't allow
+	* patch modules to unload since there is no safe/sane method to
+	* determine if a thread is still running in the patched code contained
+	* in the patch module once the ftrace registration is successful.
+	*/
+	if (!try_module_get(patch->mod))
+		return -ENODEV;
+
 	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
 	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
 
@@ -459,6 +472,15 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 
 	mutex_lock(&klp_mutex);
 
+	if (!klp_is_patch_registered(patch)) {
+		/*
+		 * Module with the patch could either disappear meanwhile or is
+		 * not properly initialized yet.
+		 */
+		ret = -EINVAL;
+		goto err;
+	}
+
 	if (patch->enabled == val) {
 		/* already in requested state */
 		ret = -EINVAL;
@@ -515,10 +537,10 @@ static struct attribute *klp_patch_attrs[] = {
 
 static void klp_kobj_release_patch(struct kobject *kobj)
 {
-	/*
-	 * Once we have a consistency model we'll need to module_put() the
-	 * patch module here.  See klp_register_patch() for more details.
-	 */
+	struct klp_patch *patch;
+
+	patch = container_of(kobj, struct klp_patch, kobj);
+	complete(&patch->finish);
 }
 
 static struct kobj_type klp_ktype_patch = {
@@ -589,7 +611,6 @@ static void klp_free_patch(struct klp_patch *patch)
 	klp_free_objects_limited(patch, NULL);
 	if (!list_empty(&patch->list))
 		list_del(&patch->list);
-	kobject_put(&patch->kobj);
 }
 
 static int klp_init_func(struct klp_object *obj, struct klp_func *func)
@@ -700,11 +721,14 @@ static int klp_init_patch(struct klp_patch *patch)
 	mutex_lock(&klp_mutex);
 
 	patch->enabled = false;
+	init_completion(&patch->finish);
 
 	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
 				   klp_root_kobj, "%s", patch->mod->name);
-	if (ret)
-		goto unlock;
+	if (ret) {
+		mutex_unlock(&klp_mutex);
+		return ret;
+	}
 
 	klp_for_each_object(patch, obj) {
 		ret = klp_init_object(patch, obj);
@@ -720,9 +744,12 @@ static int klp_init_patch(struct klp_patch *patch)
 
 free:
 	klp_free_objects_limited(patch, obj);
-	kobject_put(&patch->kobj);
-unlock:
+
 	mutex_unlock(&klp_mutex);
+
+	kobject_put(&patch->kobj);
+	wait_for_completion(&patch->finish);
+
 	return ret;
 }
 
@@ -736,23 +763,29 @@ static int klp_init_patch(struct klp_patch *patch)
  */
 int klp_unregister_patch(struct klp_patch *patch)
 {
-	int ret = 0;
+	int ret;
 
 	mutex_lock(&klp_mutex);
 
 	if (!klp_is_patch_registered(patch)) {
 		ret = -EINVAL;
-		goto out;
+		goto err;
 	}
 
 	if (patch->enabled) {
 		ret = -EBUSY;
-		goto out;
+		goto err;
 	}
 
 	klp_free_patch(patch);
 
-out:
+	mutex_unlock(&klp_mutex);
+
+	kobject_put(&patch->kobj);
+	wait_for_completion(&patch->finish);
+
+	return 0;
+err:
 	mutex_unlock(&klp_mutex);
 	return ret;
 }
@@ -765,12 +798,13 @@ EXPORT_SYMBOL_GPL(klp_unregister_patch);
  * Initializes the data structure associated with the patch and
  * creates the sysfs interface.
  *
+ * There is no need to take the reference on the patch module here. It is done
+ * later when the patch is enabled.
+ *
  * Return: 0 on success, otherwise error
  */
 int klp_register_patch(struct klp_patch *patch)
 {
-	int ret;
-
 	if (!patch || !patch->mod)
 		return -EINVAL;
 
@@ -783,21 +817,7 @@ int klp_register_patch(struct klp_patch *patch)
 	if (!klp_initialized())
 		return -ENODEV;
 
-	/*
-	 * A reference is taken on the patch module to prevent it from being
-	 * unloaded.  Right now, we don't allow patch modules to unload since
-	 * there is currently no method to determine if a thread is still
-	 * running in the patched code contained in the patch module once
-	 * the ftrace registration is successful.
-	 */
-	if (!try_module_get(patch->mod))
-		return -ENODEV;
-
-	ret = klp_init_patch(patch);
-	if (ret)
-		module_put(patch->mod);
-
-	return ret;
+	return klp_init_patch(patch);
 }
 EXPORT_SYMBOL_GPL(klp_register_patch);
 
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 92819bb0961b..6cc49d253195 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -197,13 +197,21 @@ void klp_complete_transition(void)
 	struct klp_func *func;
 	struct task_struct *g, *task;
 	unsigned int cpu;
+	bool is_immediate = false;
 
 	if (klp_transition_patch->immediate)
 		goto done;
 
-	klp_for_each_object(klp_transition_patch, obj)
-		klp_for_each_func(obj, func)
+	klp_for_each_object(klp_transition_patch, obj) {
+		klp_for_each_func(obj, func) {
 			func->transition = false;
+			if (func->immediate)
+				is_immediate = true;
+		}
+	}
+
+	if (klp_target_state == KLP_UNPATCHED && !is_immediate)
+		module_put(klp_transition_patch->mod);
 
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, task) {
diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
index e34f871e69b1..a84676aa7c62 100644
--- a/samples/livepatch/livepatch-sample.c
+++ b/samples/livepatch/livepatch-sample.c
@@ -82,7 +82,6 @@ static int livepatch_init(void)
 
 static void livepatch_exit(void)
 {
-	WARN_ON(klp_disable_patch(&patch));
 	WARN_ON(klp_unregister_patch(&patch));
 }
 
-- 
2.8.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] livepatch: allow removal of a disabled patch
  2016-06-01  8:31 [PATCH v2] livepatch: allow removal of a disabled patch Miroslav Benes
@ 2016-06-01 11:29 ` Christopher Arges
  2016-06-01 11:43   ` Miroslav Benes
  2016-06-07  9:36 ` Petr Mladek
  1 sibling, 1 reply; 8+ messages in thread
From: Christopher Arges @ 2016-06-01 11:29 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: jpoimboe, jeyu, jikos, pmladek, jslaby, live-patching,
	linux-kernel, huawei.libin, minfei.huang

On Wed, Jun 01, 2016 at 10:31:59AM +0200, Miroslav Benes wrote:
> Currently we do not allow patch module to unload since there is no
> method to determine if a task is still running in the patched code.
> 
> The consistency model gives us the way because when the unpatching
> finishes we know that all tasks were marked as safe to call an original
> function. Thus every new call to the function calls the original code
> and at the same time no task can be somewhere in the patched code,
> because it had to leave that code to be marked as safe.
> 
> We can safely let the patch module go after that.
> 
> Completion is used for synchronization between module removal and sysfs
> infrastructure in a similar way to commit 942e443127e9 ("module: Fix
> mod->mkobj.kobj potentially freed too early").
> 
> Note that we still do not allow the removal for immediate model, that is
> no consistency model. The module refcount may increase in this case if
> somebody disables and enables the patch several times. This should not
> cause any harm.
> 
> With this change a call to try_module_get() is moved to
> __klp_enable_patch from klp_register_patch to make module reference
> counting symmetric (module_put() is in a patch disable path) and to
> allow to take a new reference to a disabled module when being enabled.
> 
> Also all kobject_put(&patch->kobj) calls are moved outside of klp_mutex
> lock protection to prevent a deadlock situation when
> klp_unregister_patch is called and sysfs directories are removed. There
> is no need to do the same for other kobject_put() callsites as we
> currently do not have their sysfs counterparts.
> 
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> ---
> Based on Josh's v2 of the consistency model.
> 
>  Documentation/livepatch/livepatch.txt | 29 ++++---------
>  include/linux/livepatch.h             |  3 ++
>  kernel/livepatch/core.c               | 80 ++++++++++++++++++++++-------------
>  kernel/livepatch/transition.c         | 12 +++++-
>  samples/livepatch/livepatch-sample.c  |  1 -
>  5 files changed, 72 insertions(+), 53 deletions(-)
> 
> diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
> index bee86d0fe854..a05124258a73 100644
> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt
> @@ -272,8 +272,15 @@ section "Livepatch life-cycle" below for more details about these
>  two operations.
>  
>  Module removal is only safe when there are no users of the underlying
> -functions.  The immediate consistency model is not able to detect this;
> -therefore livepatch modules cannot be removed. See "Limitations" below.
> +functions. The immediate consistency model is not able to detect this. The
> +code just redirects the functions at the very beginning and it does not
> +check if the functions are in use. In other words, it knows when the
> +functions get called but it does not know when the functions return.
> +Therefore it cannot be decided when the livepatch module can be safely
> +removed. This is solved by a hybrid consistency model. When the system is
> +transitioned to a new patch state (patched/unpatched) it is guaranteed that
> +no task sleeps or runs in the old code.
> +
>  
>  5. Livepatch life-cycle
>  =======================
> @@ -444,24 +451,6 @@ See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.
>      There is work in progress to remove this limitation.
>  
>  
> -  + Livepatch modules can not be removed.
> -
> -    The current implementation just redirects the functions at the very
> -    beginning. It does not check if the functions are in use. In other
> -    words, it knows when the functions get called but it does not
> -    know when the functions return. Therefore it can not decide when
> -    the livepatch module can be safely removed.
> -
> -    This will get most likely solved once a more complex consistency model
> -    is supported. The idea is that a safe state for patching should also
> -    mean a safe state for removing the patch.
> -
> -    Note that the patch itself might get disabled by writing zero
> -    to /sys/kernel/livepatch/<patch>/enabled. It causes that the new
> -    code will not longer get called. But it does not guarantee
> -    that anyone is not sleeping anywhere in the new code.
> -
> -
>    + Livepatch works reliably only when the dynamic ftrace is located at
>      the very beginning of the function.
>  
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index cd2dfd95e109..a9651c6e1b52 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -23,6 +23,7 @@
>  
>  #include <linux/module.h>
>  #include <linux/ftrace.h>
> +#include <linux/completion.h>
>  
>  #if IS_ENABLED(CONFIG_LIVEPATCH)
>  
> @@ -114,6 +115,7 @@ struct klp_object {
>   * @list:	list node for global list of registered patches
>   * @kobj:	kobject for sysfs resources
>   * @enabled:	the patch is enabled (but operation may be incomplete)
> + * @finish:	for waiting till it is safe to remove the patch module
>   */
>  struct klp_patch {
>  	/* external */
> @@ -125,6 +127,7 @@ struct klp_patch {
>  	struct list_head list;
>  	struct kobject kobj;
>  	bool enabled;
> +	struct completion finish;
>  };
>  
>  #define klp_for_each_object(patch, obj) \
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index d55701222b49..a649186fb4af 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -29,6 +29,7 @@
>  #include <linux/livepatch.h>
>  #include <linux/elf.h>
>  #include <linux/moduleloader.h>
> +#include <linux/completion.h>
>  #include <asm/cacheflush.h>
>  #include "patch.h"
>  #include "transition.h"
> @@ -371,6 +372,18 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	    !list_prev_entry(patch, list)->enabled)
>  		return -EBUSY;
>  
> +	/*
> +	* A reference is taken on the patch module to prevent it from being
> +	* unloaded.
> +	*
> +	* Note: For immediate (no consistency model) patches we don't allow
> +	* patch modules to unload since there is no safe/sane method to
> +	* determine if a thread is still running in the patched code contained
> +	* in the patch module once the ftrace registration is successful.
> +	*/
> +	if (!try_module_get(patch->mod))
> +		return -ENODEV;
> +
>  	pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
>  	add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
>  
> @@ -459,6 +472,15 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  
>  	mutex_lock(&klp_mutex);
>  
> +	if (!klp_is_patch_registered(patch)) {
> +		/*
> +		 * Module with the patch could either disappear meanwhile or is
> +		 * not properly initialized yet.
> +		 */
> +		ret = -EINVAL;
> +		goto err;
> +	}
> +
>  	if (patch->enabled == val) {
>  		/* already in requested state */
>  		ret = -EINVAL;
> @@ -515,10 +537,10 @@ static struct attribute *klp_patch_attrs[] = {
>  
>  static void klp_kobj_release_patch(struct kobject *kobj)
>  {
> -	/*
> -	 * Once we have a consistency model we'll need to module_put() the
> -	 * patch module here.  See klp_register_patch() for more details.
> -	 */
> +	struct klp_patch *patch;
> +
> +	patch = container_of(kobj, struct klp_patch, kobj);
> +	complete(&patch->finish);
>  }
>  
>  static struct kobj_type klp_ktype_patch = {
> @@ -589,7 +611,6 @@ static void klp_free_patch(struct klp_patch *patch)
>  	klp_free_objects_limited(patch, NULL);
>  	if (!list_empty(&patch->list))
>  		list_del(&patch->list);
> -	kobject_put(&patch->kobj);
>  }
>  
>  static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> @@ -700,11 +721,14 @@ static int klp_init_patch(struct klp_patch *patch)
>  	mutex_lock(&klp_mutex);
>  
>  	patch->enabled = false;
> +	init_completion(&patch->finish);
>  
>  	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
>  				   klp_root_kobj, "%s", patch->mod->name);
> -	if (ret)
> -		goto unlock;
> +	if (ret) {
> +		mutex_unlock(&klp_mutex);
> +		return ret;
> +	}
>  
>  	klp_for_each_object(patch, obj) {
>  		ret = klp_init_object(patch, obj);
> @@ -720,9 +744,12 @@ static int klp_init_patch(struct klp_patch *patch)
>  
>  free:
>  	klp_free_objects_limited(patch, obj);
> -	kobject_put(&patch->kobj);
> -unlock:
> +
>  	mutex_unlock(&klp_mutex);
> +
> +	kobject_put(&patch->kobj);
> +	wait_for_completion(&patch->finish);
> +
>  	return ret;
>  }
>  
> @@ -736,23 +763,29 @@ static int klp_init_patch(struct klp_patch *patch)
>   */
>  int klp_unregister_patch(struct klp_patch *patch)
>  {
> -	int ret = 0;
> +	int ret;
>  
>  	mutex_lock(&klp_mutex);
>  
>  	if (!klp_is_patch_registered(patch)) {
>  		ret = -EINVAL;
> -		goto out;
> +		goto err;
>  	}
>  
>  	if (patch->enabled) {
>  		ret = -EBUSY;
> -		goto out;
> +		goto err;
>  	}
>  
>  	klp_free_patch(patch);
>  
> -out:
> +	mutex_unlock(&klp_mutex);
> +
> +	kobject_put(&patch->kobj);
> +	wait_for_completion(&patch->finish);
> +
> +	return 0;
> +err:
>  	mutex_unlock(&klp_mutex);
>  	return ret;
>  }
> @@ -765,12 +798,13 @@ EXPORT_SYMBOL_GPL(klp_unregister_patch);
>   * Initializes the data structure associated with the patch and
>   * creates the sysfs interface.
>   *
> + * There is no need to take the reference on the patch module here. It is done
> + * later when the patch is enabled.
> + *
>   * Return: 0 on success, otherwise error
>   */
>  int klp_register_patch(struct klp_patch *patch)
>  {
> -	int ret;
> -
>  	if (!patch || !patch->mod)
>  		return -EINVAL;
>  
> @@ -783,21 +817,7 @@ int klp_register_patch(struct klp_patch *patch)
>  	if (!klp_initialized())
>  		return -ENODEV;
>  
> -	/*
> -	 * A reference is taken on the patch module to prevent it from being
> -	 * unloaded.  Right now, we don't allow patch modules to unload since
> -	 * there is currently no method to determine if a thread is still
> -	 * running in the patched code contained in the patch module once
> -	 * the ftrace registration is successful.
> -	 */
> -	if (!try_module_get(patch->mod))
> -		return -ENODEV;
> -
> -	ret = klp_init_patch(patch);
> -	if (ret)
> -		module_put(patch->mod);
> -
> -	return ret;
> +	return klp_init_patch(patch);
>  }
>  EXPORT_SYMBOL_GPL(klp_register_patch);
>  
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 92819bb0961b..6cc49d253195 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -197,13 +197,21 @@ void klp_complete_transition(void)
>  	struct klp_func *func;
>  	struct task_struct *g, *task;
>  	unsigned int cpu;
> +	bool is_immediate = false;
>  
>  	if (klp_transition_patch->immediate)
>  		goto done;
>  
> -	klp_for_each_object(klp_transition_patch, obj)
> -		klp_for_each_func(obj, func)
> +	klp_for_each_object(klp_transition_patch, obj) {
> +		klp_for_each_func(obj, func) {
>  			func->transition = false;
> +			if (func->immediate)
> +				is_immediate = true;

Once an immediate function is found you could return from this function since
releasing a reference from the livepatch module is no longer possible.

--chris

> +		}
> +	}
> +
> +	if (klp_target_state == KLP_UNPATCHED && !is_immediate)
> +		module_put(klp_transition_patch->mod);
>  
>  	read_lock(&tasklist_lock);
>  	for_each_process_thread(g, task) {
> diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
> index e34f871e69b1..a84676aa7c62 100644
> --- a/samples/livepatch/livepatch-sample.c
> +++ b/samples/livepatch/livepatch-sample.c
> @@ -82,7 +82,6 @@ static int livepatch_init(void)
>  
>  static void livepatch_exit(void)
>  {
> -	WARN_ON(klp_disable_patch(&patch));
>  	WARN_ON(klp_unregister_patch(&patch));
>  }
>  
> -- 
> 2.8.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe live-patching" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] livepatch: allow removal of a disabled patch
  2016-06-01 11:29 ` Christopher Arges
@ 2016-06-01 11:43   ` Miroslav Benes
  0 siblings, 0 replies; 8+ messages in thread
From: Miroslav Benes @ 2016-06-01 11:43 UTC (permalink / raw)
  To: Christopher Arges
  Cc: jpoimboe, jeyu, jikos, pmladek, jslaby, live-patching,
	linux-kernel, huawei.libin, minfei.huang

On Wed, 1 Jun 2016, Christopher Arges wrote:

> On Wed, Jun 01, 2016 at 10:31:59AM +0200, Miroslav Benes wrote:
> >
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > index 92819bb0961b..6cc49d253195 100644
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -197,13 +197,21 @@ void klp_complete_transition(void)
> >  	struct klp_func *func;
> >  	struct task_struct *g, *task;
> >  	unsigned int cpu;
> > +	bool is_immediate = false;
> >  
> >  	if (klp_transition_patch->immediate)
> >  		goto done;
> >  
> > -	klp_for_each_object(klp_transition_patch, obj)
> > -		klp_for_each_func(obj, func)
> > +	klp_for_each_object(klp_transition_patch, obj) {
> > +		klp_for_each_func(obj, func) {
> >  			func->transition = false;
> > +			if (func->immediate)
> > +				is_immediate = true;
> 
> Once an immediate function is found you could return from this function since
> releasing a reference from the livepatch module is no longer possible.

That is true, but I think we should tidy up after the patching process 
even in this case. That is clear all func->transition, task->patch_state 
and task->TIF_PATCH_PENDING flags. These all could be set if 
patch->immediate is false and one of the func->immediate is true.

Regards,
Miroslav

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] livepatch: allow removal of a disabled patch
  2016-06-01  8:31 [PATCH v2] livepatch: allow removal of a disabled patch Miroslav Benes
  2016-06-01 11:29 ` Christopher Arges
@ 2016-06-07  9:36 ` Petr Mladek
  2016-06-07 22:39   ` Jessica Yu
  1 sibling, 1 reply; 8+ messages in thread
From: Petr Mladek @ 2016-06-07  9:36 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: jpoimboe, jeyu, jikos, jslaby, live-patching, linux-kernel,
	huawei.libin, minfei.huang

On Wed 2016-06-01 10:31:59, Miroslav Benes wrote:
> Currently we do not allow patch module to unload since there is no
> method to determine if a task is still running in the patched code.
> 
> The consistency model gives us the way because when the unpatching
> finishes we know that all tasks were marked as safe to call an original
> function. Thus every new call to the function calls the original code
> and at the same time no task can be somewhere in the patched code,
> because it had to leave that code to be marked as safe.
> 
> We can safely let the patch module go after that.
> 
> Completion is used for synchronization between module removal and sysfs
> infrastructure in a similar way to commit 942e443127e9 ("module: Fix
> mod->mkobj.kobj potentially freed too early").
> 
> Note that we still do not allow the removal for immediate model, that is
> no consistency model. The module refcount may increase in this case if
> somebody disables and enables the patch several times. This should not
> cause any harm.
> 
> With this change a call to try_module_get() is moved to
> __klp_enable_patch from klp_register_patch to make module reference
> counting symmetric (module_put() is in a patch disable path) and to
> allow to take a new reference to a disabled module when being enabled.
> 
> Also all kobject_put(&patch->kobj) calls are moved outside of klp_mutex
> lock protection to prevent a deadlock situation when
> klp_unregister_patch is called and sysfs directories are removed. There
> is no need to do the same for other kobject_put() callsites as we
> currently do not have their sysfs counterparts.
> 
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> ---
> Based on Josh's v2 of the consistency model.
> 
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index d55701222b49..a649186fb4af 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -459,6 +472,15 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  
>  	mutex_lock(&klp_mutex);
>  
> +	if (!klp_is_patch_registered(patch)) {
> +		/*
> +		 * Module with the patch could either disappear meanwhile or is
> +		 * not properly initialized yet.
> +		 */
> +		ret = -EINVAL;
> +		goto err;
> +	}
> +
>  	if (patch->enabled == val) {
>  		/* already in requested state */
>  		ret = -EINVAL;
> @@ -700,11 +721,14 @@ static int klp_init_patch(struct klp_patch *patch)
>  	mutex_lock(&klp_mutex);
>  
>  	patch->enabled = false;
> +	init_completion(&patch->finish);
>  
>  	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
>  				   klp_root_kobj, "%s", patch->mod->name);
> -	if (ret)
> -		goto unlock;
> +	if (ret) {
> +		mutex_unlock(&klp_mutex);
> +		return ret;
> +	}
>  
>  	klp_for_each_object(patch, obj) {
>  		ret = klp_init_object(patch, obj);
> @@ -720,9 +744,12 @@ static int klp_init_patch(struct klp_patch *patch)
>  
>  free:
>  	klp_free_objects_limited(patch, obj);
> -	kobject_put(&patch->kobj);
> -unlock:
> +
>  	mutex_unlock(&klp_mutex);
> +

Just for record. The sysfs entry exists but patch->list is not
initialized in this error path. Therefore we could write into

   /sys/.../livepatch_foo/enable

and call enabled_store(). It is safe because enabled_store() calls
klp_is_patch_registered() and it does not need patch->list for this
check. Kudos for the klp_is_patch_registered() implementation.

I write this because it is not obvious and it took me some time
to verify that we are on the safe side.

Well, I would feel more comfortable if we initialize patch->list
above.


> +	kobject_put(&patch->kobj);
> +	wait_for_completion(&patch->finish);
> +
>  	return ret;
>  }
>  
> diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
> index e34f871e69b1..a84676aa7c62 100644
> --- a/samples/livepatch/livepatch-sample.c
> +++ b/samples/livepatch/livepatch-sample.c
> @@ -82,7 +82,6 @@ static int livepatch_init(void)
>  
>  static void livepatch_exit(void)
>  {
> -	WARN_ON(klp_disable_patch(&patch));
>  	WARN_ON(klp_unregister_patch(&patch));

Just for record. I was curious if the following race:

CPU0				CPU1

rmmod livepatch_foo

  livepatch_exit()

				echo 1> /sys/.../livepatch_foo/enable

				  __klp_enable_patch()

				    lock(&klp_mutex)
				    ...
				    patch->enabled = true;
				    ...
				    unlock(&klp_mutex)

     klp_unregister_patch()

       lock(&klp_mutex);

       if (patch->enabled)
	 ret = -EBUSY;

       unlock(&klp_mutex);


Fortunately, this is not possible. livepatch_exit() is called when
the module is in MODULE_STATE_GOING. It means that try_module_get()
will fail and therefore __klp_enable_patch() will fail.


All in all, the patch looks good to me. I am fine with the completion.
In fact, I like it.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: livepatch: allow removal of a disabled patch
  2016-06-07  9:36 ` Petr Mladek
@ 2016-06-07 22:39   ` Jessica Yu
  2016-06-08  8:10     ` Petr Mladek
  0 siblings, 1 reply; 8+ messages in thread
From: Jessica Yu @ 2016-06-07 22:39 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Miroslav Benes, jpoimboe, jikos, jslaby, live-patching,
	linux-kernel, huawei.libin, minfei.huang

+++ Petr Mladek [07/06/16 11:36 +0200]:
>On Wed 2016-06-01 10:31:59, Miroslav Benes wrote:
>> Currently we do not allow patch module to unload since there is no
>> method to determine if a task is still running in the patched code.
>>
>> The consistency model gives us the way because when the unpatching
>> finishes we know that all tasks were marked as safe to call an original
>> function. Thus every new call to the function calls the original code
>> and at the same time no task can be somewhere in the patched code,
>> because it had to leave that code to be marked as safe.
>>
>> We can safely let the patch module go after that.
>>
>> Completion is used for synchronization between module removal and sysfs
>> infrastructure in a similar way to commit 942e443127e9 ("module: Fix
>> mod->mkobj.kobj potentially freed too early").
>>
>> Note that we still do not allow the removal for immediate model, that is
>> no consistency model. The module refcount may increase in this case if
>> somebody disables and enables the patch several times. This should not
>> cause any harm.
>>
>> With this change a call to try_module_get() is moved to
>> __klp_enable_patch from klp_register_patch to make module reference
>> counting symmetric (module_put() is in a patch disable path) and to
>> allow to take a new reference to a disabled module when being enabled.
>>
>> Also all kobject_put(&patch->kobj) calls are moved outside of klp_mutex
>> lock protection to prevent a deadlock situation when
>> klp_unregister_patch is called and sysfs directories are removed. There
>> is no need to do the same for other kobject_put() callsites as we
>> currently do not have their sysfs counterparts.
>>
>> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
>> ---
>> Based on Josh's v2 of the consistency model.
>>
>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
>> index d55701222b49..a649186fb4af 100644
>> --- a/kernel/livepatch/core.c
>> +++ b/kernel/livepatch/core.c
>> @@ -459,6 +472,15 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>>
>>  	mutex_lock(&klp_mutex);
>>
>> +	if (!klp_is_patch_registered(patch)) {
>> +		/*
>> +		 * Module with the patch could either disappear meanwhile or is
>> +		 * not properly initialized yet.
>> +		 */
>> +		ret = -EINVAL;
>> +		goto err;
>> +	}
>> +
>>  	if (patch->enabled == val) {
>>  		/* already in requested state */
>>  		ret = -EINVAL;
>> @@ -700,11 +721,14 @@ static int klp_init_patch(struct klp_patch *patch)
>>  	mutex_lock(&klp_mutex);
>>
>>  	patch->enabled = false;
>> +	init_completion(&patch->finish);
>>
>>  	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
>>  				   klp_root_kobj, "%s", patch->mod->name);
>> -	if (ret)
>> -		goto unlock;
>> +	if (ret) {
>> +		mutex_unlock(&klp_mutex);
>> +		return ret;
>> +	}
>>
>>  	klp_for_each_object(patch, obj) {
>>  		ret = klp_init_object(patch, obj);
>> @@ -720,9 +744,12 @@ static int klp_init_patch(struct klp_patch *patch)
>>
>>  free:
>>  	klp_free_objects_limited(patch, obj);
>> -	kobject_put(&patch->kobj);
>> -unlock:
>> +
>>  	mutex_unlock(&klp_mutex);
>> +
>
>Just for record. The sysfs entry exists but patch->list is not
>initialized in this error path. Therefore we could write into
>
>   /sys/.../livepatch_foo/enable
>
>and call enabled_store(). It is safe because enabled_store() calls
>klp_is_patch_registered() and it does not need patch->list for this
>check. Kudos for the klp_is_patch_registered() implementation.
>
>I write this because it is not obvious and it took me some time
>to verify that we are on the safe side.
>
>Well, I would feel more comfortable if we initialize patch->list
>above.

Hi Petr,

I'm a bit unclear on this, can you clarify what you mean by initialize
patch->list? I don't think any extra initialization is needed here to
be able use a patch->list node with an existing list (klp_patches),
since this field is embedded and the klp_patches list header is
already statically initialized with LIST_HEAD.

Jessica

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: livepatch: allow removal of a disabled patch
  2016-06-07 22:39   ` Jessica Yu
@ 2016-06-08  8:10     ` Petr Mladek
  0 siblings, 0 replies; 8+ messages in thread
From: Petr Mladek @ 2016-06-08  8:10 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Miroslav Benes, jpoimboe, jikos, jslaby, live-patching,
	linux-kernel, huawei.libin, minfei.huang

On Tue 2016-06-07 18:39:51, Jessica Yu wrote:
> +++ Petr Mladek [07/06/16 11:36 +0200]:
> >On Wed 2016-06-01 10:31:59, Miroslav Benes wrote:
> >>Currently we do not allow patch module to unload since there is no
> >>method to determine if a task is still running in the patched code.
> >>
> >>The consistency model gives us the way because when the unpatching
> >>finishes we know that all tasks were marked as safe to call an original
> >>function. Thus every new call to the function calls the original code
> >>and at the same time no task can be somewhere in the patched code,
> >>because it had to leave that code to be marked as safe.
> >>
> >>We can safely let the patch module go after that.
> >>
> >>Completion is used for synchronization between module removal and sysfs
> >>infrastructure in a similar way to commit 942e443127e9 ("module: Fix
> >>mod->mkobj.kobj potentially freed too early").
> >>
> >>Note that we still do not allow the removal for immediate model, that is
> >>no consistency model. The module refcount may increase in this case if
> >>somebody disables and enables the patch several times. This should not
> >>cause any harm.
> >>
> >>With this change a call to try_module_get() is moved to
> >>__klp_enable_patch from klp_register_patch to make module reference
> >>counting symmetric (module_put() is in a patch disable path) and to
> >>allow to take a new reference to a disabled module when being enabled.
> >>
> >>Also all kobject_put(&patch->kobj) calls are moved outside of klp_mutex
> >>lock protection to prevent a deadlock situation when
> >>klp_unregister_patch is called and sysfs directories are removed. There
> >>is no need to do the same for other kobject_put() callsites as we
> >>currently do not have their sysfs counterparts.
> >>
> >>Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> >>---
> >>Based on Josh's v2 of the consistency model.
> >>
> >>diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> >>index d55701222b49..a649186fb4af 100644
> >>--- a/kernel/livepatch/core.c
> >>+++ b/kernel/livepatch/core.c
> >>@@ -459,6 +472,15 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> >>
> >> 	mutex_lock(&klp_mutex);
> >>
> >>+	if (!klp_is_patch_registered(patch)) {
> >>+		/*
> >>+		 * Module with the patch could either disappear meanwhile or is
> >>+		 * not properly initialized yet.
> >>+		 */
> >>+		ret = -EINVAL;
> >>+		goto err;
> >>+	}
> >>+
> >> 	if (patch->enabled == val) {
> >> 		/* already in requested state */
> >> 		ret = -EINVAL;
> >>@@ -700,11 +721,14 @@ static int klp_init_patch(struct klp_patch *patch)
> >> 	mutex_lock(&klp_mutex);
> >>
> >> 	patch->enabled = false;
> >>+	init_completion(&patch->finish);
> >>
> >> 	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
> >> 				   klp_root_kobj, "%s", patch->mod->name);
> >>-	if (ret)
> >>-		goto unlock;
> >>+	if (ret) {
> >>+		mutex_unlock(&klp_mutex);
> >>+		return ret;
> >>+	}
> >>
> >> 	klp_for_each_object(patch, obj) {
> >> 		ret = klp_init_object(patch, obj);
> >>@@ -720,9 +744,12 @@ static int klp_init_patch(struct klp_patch *patch)
> >>
> >> free:
> >> 	klp_free_objects_limited(patch, obj);
> >>-	kobject_put(&patch->kobj);
> >>-unlock:
> >>+
> >> 	mutex_unlock(&klp_mutex);
> >>+
> >
> >Just for record. The sysfs entry exists but patch->list is not
> >initialized in this error path. Therefore we could write into
> >
> >  /sys/.../livepatch_foo/enable
> >
> >and call enabled_store(). It is safe because enabled_store() calls
> >klp_is_patch_registered() and it does not need patch->list for this
> >check. Kudos for the klp_is_patch_registered() implementation.
> >
> >I write this because it is not obvious and it took me some time
> >to verify that we are on the safe side.
> >
> >Well, I would feel more comfortable if we initialize patch->list
> >above.
> 
> Hi Petr,
> 
> I'm a bit unclear on this, can you clarify what you mean by initialize
> patch->list? I don't think any extra initialization is needed here to
> be able use a patch->list node with an existing list (klp_patches),
> since this field is embedded and the klp_patches list header is
> already statically initialized with LIST_HEAD.

I meant to call

	INIT_LIST_HEAD(&patch->list);

around the existing initialization:

	patch->enabled = false;
	init_completion(&patch->finish);


The list is zeroed by default. This is true at least in
the livepatch_sample.c. It means that it is not a valid
list head without an extra initialization.

Also note that we do a check for list_empty(&patch->list) in
klp_free_patch(). This check is useless if we do not initialize
the list.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: livepatch: allow removal of a disabled patch
  2016-05-06  0:42                   ` Jessica Yu
@ 2016-05-06  7:51                     ` Miroslav Benes
  0 siblings, 0 replies; 8+ messages in thread
From: Miroslav Benes @ 2016-05-06  7:51 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Josh Poimboeuf, Jiri Kosina, pmladek, jslaby, live-patching,
	Linux Kernel Mailing List, huawei.libin, minfei.huang

On Thu, 5 May 2016, Jessica Yu wrote:

> +++ Josh Poimboeuf [05/05/16 10:04 -0500]:
> > On Thu, May 05, 2016 at 04:25:48PM +0200, Miroslav Benes wrote:
> > > On Thu, 5 May 2016, Josh Poimboeuf wrote:
> > > 
> > > > On Thu, May 05, 2016 at 10:28:12AM +0200, Miroslav Benes wrote:
> > > > > I think it boils down to the following problem.
> > > > >
> > > > > 1. CONFIG_DEBUG_KOBJECT_RELEASE=y
> > > > >
> > > > > 2. we have dynamic kobjects, so there is a pointer in klp_patch to
> > > struct
> > > > > kobject
> > > > >
> > > > > 3. it is allocated during klp_init_patch() and all is fine
> > > > >
> > > > > 4. now we want to remove the patch module. It is disabled and
> > > module_put()
> > > > > is called. User calls rmmod on the module.
> > > > >
> > > > > 5. klp_unregister_patch() is called in __exit method.
> > > > >
> > > > > 6. klp_free_patch() is called.
> > > > >
> > > > > 7. kobject_put(patch->kobj) is called.
> > > > >
> > > > > ...now it gets interesting...
> > > > >
> > > > > 8. among others kobject_cleanup() is scheduled as a delayed work (this
> > > is
> > > > > important).
> > > > >
> > > > > 9. there is no completion, so kobject_put returns and the module goes
> > > > > away.
> > > > >
> > > > > 10. someone calls patch enabled_store attribute (for example). They
> > > can
> > > > > because kobject_cleanup() has not been called yet. It is delayed
> > > > > scheduled.
> > > > >
> > > > > ...crash...
> > > >
> > > > But what exactly causes the crash?  In enabled_store() we can see that
> > > > the patch isn't in the list, so we can return -EINVAL.
> > > 
> > > Ok, bad example. Take enabled_show() instead. It could be fixed in the
> > > same way, but I am not sure it is the right thing to do. It does not scale
> > > because the problem is elsewhere.
> > > 
> > > Anyway, it is (even if theoretically) there in my opinion and we
> > > have two options.
> > > 
> > > 1. We could forget about CONFIG_DEBUG_KOBJECT_RELEASE and all is ok
> > > without completion and regardless of dynamic/static kobject allocation.
> > > 
> > > 2. We introduce completion and we are ok even with
> > > CONFIG_DEBUG_KOBJECT_RELEASE=y and again regardless of dynamic/static
> > > kobject allocation.
> > 
> > I would disagree with the statement that the dynamic kobject doesn't
> > scale.  We would just need a helper function to get from a kobject to
> > its klp_patch.
> > 
> > In fact, to me it seems like the right way to do it.  It doesn't make
> > sense for the code which creates the kobject to be different from the
> > code which initializes it.  It's slightly out of context, but
> > kobject.txt does say:
> > 
> >  "Code which creates a kobject must, of course, initialize that object."
> > 
> > I view the completion as a hack to compensate for the fact that we're
> > abusing the kobject interface.  And so it makes sense to me that
> > CONFIG_DEBUG_KOBJECT_RELEASE would cause problems, because we're using
> > kobjects in the wrong way.
> > 
> > So in my view, the two options are:
> > 
> > 1. Convert the kobject to dynamic as I described.
> > 
> > 2. Change the klp_register() interface so that klp_patch gets allocated
> >   in livepatch code.
> > 
> > I'd be curious to hear what others think.
> 
> So, I think both of these solutions would enable us to get rid of
> the completion. Let me try to summarize..

Whoa, thanks for a good summary. That's exactly what was needed :)

> For solution #1, if we dynamically allocate the kobject, i.e. we have a
> pointer now instead of having it embedded in the klp_patch struct, we no
> longer need to worry if the corresponding klp_patch gets deallocated under
> our nose. Since the kobject_cleanup function is delayed w/
> CONFIG_DEBUG_KOBJECT_RELEASE, it is possible to have sysfs entries that
> refer to a klp_patch that no longer exists. Thus if any of the sysfs
> functions get called, we would have to take care to ensure that the
> klp_patch struct corresponding to the kobject in question actually still
> exists. In this case, all sysfs functions would require an extra check to
> make sure the matching klp_patch is still on the patches list and return an
> error if it isn't found. The "pro" is that this change would be simple, the
> "con" is that now kobjects are decoupled and managed completely separately
> from the object (klp_patch) with which they are associated, which doesn't
> feel 100% right.

Yes, I've thought a lot about it during the night (I've even dreams about 
kobjects now) and this doesn't seem as a good solution. We would need to 
make sure that we have appropriate checks everytime we change something in 
the code. Moreover I don't like walking through the list of patches each 
time. It is not critical path, but it is not nice anyway. This is what I 
meant with scaling problem previously.

I agree that while compaction makes all the problems go away, it is more 
like 'after a fashion' solution. Which leads me to... 

> For solution #2, we could have livepatch manage the (de)allocation of
> klp_patch objects internally. Maybe in this scenario the caller would need
> to request a klp_patch object be allocated and the caller would fill out the
> returned klp_patch struct as appropriate. In this case, we would be able
> leave the kobject embedded in the klp_patch struct (and dynamic kobjects
> wouldn't be needed), as livepatch would now have control of both structures.

I think this is the way to go.

> Then during the patch module exit path, when kobject_put is called, the
> klp_patch struct would be freed in its kobject's release function. We
> wouldn't have to hold up rmmod, and delayed execution of kobject_cleanup
> wouldn't break anything, because a klp_patch would then have the same
> "lifespan"
> as its corresponding kobject, and therefore it would be safe to invoke
> enabled_store & co. up until kobject_cleanup is finally executed. We'd be
> able to use container_of in this case as well. In addition, we wouldn't
> have to force all sysfs functions to support an awkward edge-case (i.e.
> checking if the corresponding klp_patch still exists). I think this
> solution also matches better with the typical use-case of the kobject
> release method, as described in kobject.txt (replacing 'my_object' with
> klp_patch):
> ---
> (...)
> This notification is done through a kobject's release() method.
> Usually such a method has a form like:
> 
>   void my_object_release(struct kobject *kobj)
>    {
>        struct my_object *mine = container_of(kobj, struct my_object, kobj);
>        /* Perform any additional cleanup on this object, then... */
>        kfree(mine);
>    }
> ---
> Apologies for the giant wall of text. In any case I feel like solution #2
> is actually closer in line with how kobjects are normally used, embedded in
> the structures they refer to, which get deallocated once their refcount
> hits 0. What do people think?

I agree. Josh and jikos proposed this as well, so I think we have an 
agreement. I'm gonna prepare a patch independent of the consistency model 
as this is a separate issue.

Thanks a lot,
Miroslav

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: livepatch: allow removal of a disabled patch
  2016-05-05 15:04                 ` Josh Poimboeuf
@ 2016-05-06  0:42                   ` Jessica Yu
  2016-05-06  7:51                     ` Miroslav Benes
  0 siblings, 1 reply; 8+ messages in thread
From: Jessica Yu @ 2016-05-06  0:42 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Miroslav Benes, Jiri Kosina, pmladek, jslaby, live-patching,
	linux-kernel, huawei.libin, minfei.huang

+++ Josh Poimboeuf [05/05/16 10:04 -0500]:
>On Thu, May 05, 2016 at 04:25:48PM +0200, Miroslav Benes wrote:
>> On Thu, 5 May 2016, Josh Poimboeuf wrote:
>>
>> > On Thu, May 05, 2016 at 10:28:12AM +0200, Miroslav Benes wrote:
>> > > I think it boils down to the following problem.
>> > >
>> > > 1. CONFIG_DEBUG_KOBJECT_RELEASE=y
>> > >
>> > > 2. we have dynamic kobjects, so there is a pointer in klp_patch to struct
>> > > kobject
>> > >
>> > > 3. it is allocated during klp_init_patch() and all is fine
>> > >
>> > > 4. now we want to remove the patch module. It is disabled and module_put()
>> > > is called. User calls rmmod on the module.
>> > >
>> > > 5. klp_unregister_patch() is called in __exit method.
>> > >
>> > > 6. klp_free_patch() is called.
>> > >
>> > > 7. kobject_put(patch->kobj) is called.
>> > >
>> > > ...now it gets interesting...
>> > >
>> > > 8. among others kobject_cleanup() is scheduled as a delayed work (this is
>> > > important).
>> > >
>> > > 9. there is no completion, so kobject_put returns and the module goes
>> > > away.
>> > >
>> > > 10. someone calls patch enabled_store attribute (for example). They can
>> > > because kobject_cleanup() has not been called yet. It is delayed
>> > > scheduled.
>> > >
>> > > ...crash...
>> >
>> > But what exactly causes the crash?  In enabled_store() we can see that
>> > the patch isn't in the list, so we can return -EINVAL.
>>
>> Ok, bad example. Take enabled_show() instead. It could be fixed in the
>> same way, but I am not sure it is the right thing to do. It does not scale
>> because the problem is elsewhere.
>>
>> Anyway, it is (even if theoretically) there in my opinion and we
>> have two options.
>>
>> 1. We could forget about CONFIG_DEBUG_KOBJECT_RELEASE and all is ok
>> without completion and regardless of dynamic/static kobject allocation.
>>
>> 2. We introduce completion and we are ok even with
>> CONFIG_DEBUG_KOBJECT_RELEASE=y and again regardless of dynamic/static
>> kobject allocation.
>
>I would disagree with the statement that the dynamic kobject doesn't
>scale.  We would just need a helper function to get from a kobject to
>its klp_patch.
>
>In fact, to me it seems like the right way to do it.  It doesn't make
>sense for the code which creates the kobject to be different from the
>code which initializes it.  It's slightly out of context, but
>kobject.txt does say:
>
>  "Code which creates a kobject must, of course, initialize that object."
>
>I view the completion as a hack to compensate for the fact that we're
>abusing the kobject interface.  And so it makes sense to me that
>CONFIG_DEBUG_KOBJECT_RELEASE would cause problems, because we're using
>kobjects in the wrong way.
>
>So in my view, the two options are:
>
>1. Convert the kobject to dynamic as I described.
>
>2. Change the klp_register() interface so that klp_patch gets allocated
>   in livepatch code.
>
>I'd be curious to hear what others think.

So, I think both of these solutions would enable us to get rid of
the completion. Let me try to summarize..

For solution #1, if we dynamically allocate the kobject, i.e. we have a
pointer now instead of having it embedded in the klp_patch struct, we no
longer need to worry if the corresponding klp_patch gets deallocated under
our nose. Since the kobject_cleanup function is delayed w/
CONFIG_DEBUG_KOBJECT_RELEASE, it is possible to have sysfs entries that
refer to a klp_patch that no longer exists. Thus if any of the sysfs
functions get called, we would have to take care to ensure that the
klp_patch struct corresponding to the kobject in question actually still
exists. In this case, all sysfs functions would require an extra check to
make sure the matching klp_patch is still on the patches list and return an
error if it isn't found. The "pro" is that this change would be simple, the
"con" is that now kobjects are decoupled and managed completely separately
from the object (klp_patch) with which they are associated, which doesn't
feel 100% right.

For solution #2, we could have livepatch manage the (de)allocation of
klp_patch objects internally. Maybe in this scenario the caller would need
to request a klp_patch object be allocated and the caller would fill out the
returned klp_patch struct as appropriate. In this case, we would be able
leave the kobject embedded in the klp_patch struct (and dynamic kobjects
wouldn't be needed), as livepatch would now have control of both structures.

Then during the patch module exit path, when kobject_put is called, the
klp_patch struct would be freed in its kobject's release function. We
wouldn't have to hold up rmmod, and delayed execution of kobject_cleanup
wouldn't break anything, because a klp_patch would then have the same "lifespan"
as its corresponding kobject, and therefore it would be safe to invoke
enabled_store & co. up until kobject_cleanup is finally executed. We'd be
able to use container_of in this case as well. In addition, we wouldn't
have to force all sysfs functions to support an awkward edge-case (i.e.
checking if the corresponding klp_patch still exists). I think this
solution also matches better with the typical use-case of the kobject
release method, as described in kobject.txt (replacing 'my_object' with
klp_patch):
---
(...)
This notification is done through a kobject's release() method.
Usually such a method has a form like:

   void my_object_release(struct kobject *kobj)
    {
        struct my_object *mine = container_of(kobj, struct my_object, kobj);
        /* Perform any additional cleanup on this object, then... */
        kfree(mine);
    }
---
Apologies for the giant wall of text. In any case I feel like solution #2
is actually closer in line with how kobjects are normally used, embedded in
the structures they refer to, which get deallocated once their refcount
hits 0. What do people think?

Jessica

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-06-08  8:10 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-01  8:31 [PATCH v2] livepatch: allow removal of a disabled patch Miroslav Benes
2016-06-01 11:29 ` Christopher Arges
2016-06-01 11:43   ` Miroslav Benes
2016-06-07  9:36 ` Petr Mladek
2016-06-07 22:39   ` Jessica Yu
2016-06-08  8:10     ` Petr Mladek
  -- strict thread matches above, loose matches on Subject: below --
2016-05-04  2:39 [RFC PATCH] " Josh Poimboeuf
2016-05-04  3:36 ` Josh Poimboeuf
2016-05-04 11:58   ` Miroslav Benes
2016-05-04 13:14     ` Josh Poimboeuf
2016-05-04 14:35       ` Miroslav Benes
2016-05-04 16:14         ` Josh Poimboeuf
2016-05-05  8:28           ` Miroslav Benes
2016-05-05 13:27             ` Josh Poimboeuf
2016-05-05 14:25               ` Miroslav Benes
2016-05-05 15:04                 ` Josh Poimboeuf
2016-05-06  0:42                   ` Jessica Yu
2016-05-06  7:51                     ` Miroslav Benes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).