live-patching.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V4 0/3] livepatch: cleanup kpl_patch kobject release
@ 2021-11-02 14:59 Ming Lei
  2021-11-02 14:59 ` [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch Ming Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Ming Lei @ 2021-11-02 14:59 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek, live-patching
  Cc: linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence,
	Ming Lei

Hello,

The 1st patch moves module_put() to release handler of klp_patch
kobject.

The 2nd patch changes to free klp_patch and other kobjects without
klp_mutex.

The 3rd patch switches to synchronous kobject release for klp_patch.


V4:
	- add freeing klp_patch after klp_enable_patch() returns, since
	patch still can be freed there in case of ->replace is set

V3:
	- one line fix on check of list_empty() in enabled_store(), 3/3

V2:
	- remove enabled attribute before deleting this klp_patch kobject,
	for avoiding deadlock in deleting me


Ming Lei (3):
  livepatch: remove 'struct completion finish' from klp_patch
  livepatch: free klp_patch object without holding klp_mutex
  livepatch: free klp_patch object synchronously

 include/linux/livepatch.h     |  2 -
 kernel/livepatch/core.c       | 73 +++++++++++++++++------------------
 kernel/livepatch/core.h       |  3 +-
 kernel/livepatch/transition.c | 23 ++++++++---
 kernel/livepatch/transition.h |  2 +-
 5 files changed, 54 insertions(+), 49 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch
  2021-11-02 14:59 [PATCH V4 0/3] livepatch: cleanup kpl_patch kobject release Ming Lei
@ 2021-11-02 14:59 ` Ming Lei
  2021-11-02 15:56   ` Petr Mladek
  2021-11-08 17:46   ` Miroslav Benes
  2021-11-02 14:59 ` [PATCH V4 2/3] livepatch: free klp_patch object without holding klp_mutex Ming Lei
  2021-11-02 14:59 ` [PATCH V4 3/3] livepatch: free klp_patch object synchronously Ming Lei
  2 siblings, 2 replies; 13+ messages in thread
From: Ming Lei @ 2021-11-02 14:59 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek, live-patching
  Cc: linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence,
	Ming Lei

The completion finish is just for waiting release of the klp_patch
object, then releases module refcnt. We can simply drop the module
refcnt in the kobject release handler of klp_patch.

This way also helps to support allocating klp_patch from heap.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/livepatch.h |  1 -
 kernel/livepatch/core.c   | 12 +++---------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 2614247a9781..9712818997c5 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -170,7 +170,6 @@ struct klp_patch {
 	bool enabled;
 	bool forced;
 	struct work_struct free_work;
-	struct completion finish;
 };
 
 #define klp_for_each_object_static(patch, obj) \
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 335d988bd811..b967b4b0071b 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -551,10 +551,10 @@ static int klp_add_nops(struct klp_patch *patch)
 
 static void klp_kobj_release_patch(struct kobject *kobj)
 {
-	struct klp_patch *patch;
+	struct klp_patch *patch = container_of(kobj, struct klp_patch, kobj);
 
-	patch = container_of(kobj, struct klp_patch, kobj);
-	complete(&patch->finish);
+	if (!patch->forced)
+		module_put(patch->mod);
 }
 
 static struct kobj_type klp_ktype_patch = {
@@ -678,11 +678,6 @@ static void klp_free_patch_finish(struct klp_patch *patch)
 	 * cannot get enabled again.
 	 */
 	kobject_put(&patch->kobj);
-	wait_for_completion(&patch->finish);
-
-	/* Put the module after the last access to struct klp_patch. */
-	if (!patch->forced)
-		module_put(patch->mod);
 }
 
 /*
@@ -876,7 +871,6 @@ static int klp_init_patch_early(struct klp_patch *patch)
 	patch->enabled = false;
 	patch->forced = false;
 	INIT_WORK(&patch->free_work, klp_free_patch_work_fn);
-	init_completion(&patch->finish);
 
 	klp_for_each_object_static(patch, obj) {
 		if (!obj->funcs)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH V4 2/3] livepatch: free klp_patch object without holding klp_mutex
  2021-11-02 14:59 [PATCH V4 0/3] livepatch: cleanup kpl_patch kobject release Ming Lei
  2021-11-02 14:59 ` [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch Ming Lei
@ 2021-11-02 14:59 ` Ming Lei
  2021-11-02 14:59 ` [PATCH V4 3/3] livepatch: free klp_patch object synchronously Ming Lei
  2 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2021-11-02 14:59 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek, live-patching
  Cc: linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence,
	Ming Lei

kobject_del() is called from kobject_put(), and after the klp_patch
kobject is deleted, any show()/store() are done.

Once the klp_patch object is removed from list and prepared for
releasing, no need to hold the global mutex of klp_mutex, so
move the freeing outside of klp_mutex.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/livepatch/core.c       | 36 +++++++++++++++++++++--------------
 kernel/livepatch/core.h       |  3 +--
 kernel/livepatch/transition.c | 23 ++++++++++++++++------
 kernel/livepatch/transition.h |  2 +-
 4 files changed, 41 insertions(+), 23 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index b967b4b0071b..27768bb5a38c 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -327,7 +327,8 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
  * /sys/kernel/livepatch/<patch>/<object>
  * /sys/kernel/livepatch/<patch>/<object>/<function,sympos>
  */
-static int __klp_disable_patch(struct klp_patch *patch);
+static int __klp_disable_patch(struct klp_patch *patch,
+		struct list_head *to_free);
 
 static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 			     const char *buf, size_t count)
@@ -335,6 +336,7 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 	struct klp_patch *patch;
 	int ret;
 	bool enabled;
+	LIST_HEAD(to_free);
 
 	ret = kstrtobool(buf, &enabled);
 	if (ret)
@@ -360,13 +362,15 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (patch == klp_transition_patch)
 		klp_reverse_transition();
 	else if (!enabled)
-		ret = __klp_disable_patch(patch);
+		ret = __klp_disable_patch(patch, &to_free);
 	else
 		ret = -EINVAL;
 
 out:
 	mutex_unlock(&klp_mutex);
 
+	klp_free_patches_async(&to_free);
+
 	if (ret)
 		return ret;
 	return count;
@@ -693,20 +697,19 @@ static void klp_free_patch_work_fn(struct work_struct *work)
 	klp_free_patch_finish(patch);
 }
 
-void klp_free_patch_async(struct klp_patch *patch)
+static void klp_free_patch_async(struct klp_patch *patch)
 {
 	klp_free_patch_start(patch);
 	schedule_work(&patch->free_work);
 }
 
-void klp_free_replaced_patches_async(struct klp_patch *new_patch)
+void klp_free_patches_async(struct list_head *to_free)
 {
-	struct klp_patch *old_patch, *tmp_patch;
+	struct klp_patch *patch, *tmp_patch;
 
-	klp_for_each_patch_safe(old_patch, tmp_patch) {
-		if (old_patch == new_patch)
-			return;
-		klp_free_patch_async(old_patch);
+	list_for_each_entry_safe(patch, tmp_patch, to_free, list) {
+		list_del_init(&patch->list);
+		klp_free_patch_async(patch);
 	}
 }
 
@@ -915,7 +918,8 @@ static int klp_init_patch(struct klp_patch *patch)
 	return 0;
 }
 
-static int __klp_disable_patch(struct klp_patch *patch)
+static int __klp_disable_patch(struct klp_patch *patch,
+		struct list_head *to_free)
 {
 	struct klp_object *obj;
 
@@ -942,12 +946,13 @@ static int __klp_disable_patch(struct klp_patch *patch)
 
 	klp_start_transition();
 	patch->enabled = false;
-	klp_try_complete_transition();
+	klp_try_complete_transition(to_free);
 
 	return 0;
 }
 
-static int __klp_enable_patch(struct klp_patch *patch)
+static int __klp_enable_patch(struct klp_patch *patch,
+		struct list_head *to_free)
 {
 	struct klp_object *obj;
 	int ret;
@@ -992,7 +997,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
 
 	klp_start_transition();
 	patch->enabled = true;
-	klp_try_complete_transition();
+	klp_try_complete_transition(to_free);
 
 	return 0;
 err:
@@ -1018,6 +1023,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
 int klp_enable_patch(struct klp_patch *patch)
 {
 	int ret;
+	LIST_HEAD(to_free);
 
 	if (!patch || !patch->mod)
 		return -EINVAL;
@@ -1055,12 +1061,14 @@ int klp_enable_patch(struct klp_patch *patch)
 	if (ret)
 		goto err;
 
-	ret = __klp_enable_patch(patch);
+	ret = __klp_enable_patch(patch, &to_free);
 	if (ret)
 		goto err;
 
 	mutex_unlock(&klp_mutex);
 
+	klp_free_patches_async(&to_free);
+
 	return 0;
 
 err:
diff --git a/kernel/livepatch/core.h b/kernel/livepatch/core.h
index 38209c7361b6..8ff97745ba40 100644
--- a/kernel/livepatch/core.h
+++ b/kernel/livepatch/core.h
@@ -13,8 +13,7 @@ extern struct list_head klp_patches;
 #define klp_for_each_patch(patch)	\
 	list_for_each_entry(patch, &klp_patches, list)
 
-void klp_free_patch_async(struct klp_patch *patch);
-void klp_free_replaced_patches_async(struct klp_patch *new_patch);
+void klp_free_patches_async(struct list_head *to_free);
 void klp_unpatch_replaced_patches(struct klp_patch *new_patch);
 void klp_discard_nops(struct klp_patch *new_patch);
 
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 5683ac0d2566..0c1857405c17 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -31,12 +31,16 @@ static unsigned int klp_signals_cnt;
  */
 static void klp_transition_work_fn(struct work_struct *work)
 {
+	LIST_HEAD(to_free);
+
 	mutex_lock(&klp_mutex);
 
 	if (klp_transition_patch)
-		klp_try_complete_transition();
+		klp_try_complete_transition(&to_free);
 
 	mutex_unlock(&klp_mutex);
+
+	klp_free_patches_async(&to_free);
 }
 static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
 
@@ -382,7 +386,7 @@ static void klp_send_signals(void)
  *
  * If any tasks are still stuck in the initial patch state, schedule a retry.
  */
-void klp_try_complete_transition(void)
+void klp_try_complete_transition(struct list_head *to_free)
 {
 	unsigned int cpu;
 	struct task_struct *g, *task;
@@ -450,10 +454,17 @@ void klp_try_complete_transition(void)
 	 * klp_complete_transition() but it is called also
 	 * from klp_cancel_transition().
 	 */
-	if (!patch->enabled)
-		klp_free_patch_async(patch);
-	else if (patch->replace)
-		klp_free_replaced_patches_async(patch);
+	if (!patch->enabled) {
+		list_move(&patch->list, to_free);
+	} else if (patch->replace) {
+		struct klp_patch *old_patch, *tmp_patch;
+
+		klp_for_each_patch_safe(old_patch, tmp_patch) {
+			if (old_patch == patch)
+				break;
+			list_move(&old_patch->list, to_free);
+		}
+	}
 }
 
 /*
diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
index 322db16233de..20e3a5a0cbce 100644
--- a/kernel/livepatch/transition.h
+++ b/kernel/livepatch/transition.h
@@ -9,7 +9,7 @@ extern struct klp_patch *klp_transition_patch;
 void klp_init_transition(struct klp_patch *patch, int state);
 void klp_cancel_transition(void);
 void klp_start_transition(void);
-void klp_try_complete_transition(void);
+void klp_try_complete_transition(struct list_head *to_free);
 void klp_reverse_transition(void);
 void klp_force_transition(void);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH V4 3/3] livepatch: free klp_patch object synchronously
  2021-11-02 14:59 [PATCH V4 0/3] livepatch: cleanup kpl_patch kobject release Ming Lei
  2021-11-02 14:59 ` [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch Ming Lei
  2021-11-02 14:59 ` [PATCH V4 2/3] livepatch: free klp_patch object without holding klp_mutex Ming Lei
@ 2021-11-02 14:59 ` Ming Lei
  2021-11-03 13:55   ` Petr Mladek
  2 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2021-11-02 14:59 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek, live-patching
  Cc: linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence,
	Ming Lei

klp_mutex isn't acquired before calling kobject_put(klp_patch), so it is
fine to free klp_patch object synchronously.

One issue is that enabled store() method, in which the klp_patch kobject
itself is deleted & released. However, sysfs has provided APIs for dealing
with this corner case, so use sysfs_break_active_protection() and
sysfs_unbreak_active_protection() for releasing klp_patch kobject from
enabled_store(), meantime the enabled attribute has to be removed
before deleting the klp_patch kobject.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/livepatch.h     |  1 -
 kernel/livepatch/core.c       | 37 +++++++++++++++--------------------
 kernel/livepatch/core.h       |  2 +-
 kernel/livepatch/transition.c |  2 +-
 4 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 9712818997c5..4dcebf52fac5 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -169,7 +169,6 @@ struct klp_patch {
 	struct list_head obj_list;
 	bool enabled;
 	bool forced;
-	struct work_struct free_work;
 };
 
 #define klp_for_each_object_static(patch, obj) \
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 27768bb5a38c..36999cddc011 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -337,6 +337,7 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 	int ret;
 	bool enabled;
 	LIST_HEAD(to_free);
+	struct kernfs_node *kn = NULL;
 
 	ret = kstrtobool(buf, &enabled);
 	if (ret)
@@ -369,10 +370,18 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
 out:
 	mutex_unlock(&klp_mutex);
 
-	klp_free_patches_async(&to_free);
-
 	if (ret)
 		return ret;
+
+	if (!list_empty(&to_free)) {
+		kn = sysfs_break_active_protection(kobj, &attr->attr);
+		WARN_ON_ONCE(!kn);
+		sysfs_remove_file(kobj, &attr->attr);
+		klp_free_patches(&to_free);
+		if (kn)
+			sysfs_unbreak_active_protection(kn);
+	}
+
 	return count;
 }
 
@@ -684,32 +693,19 @@ static void klp_free_patch_finish(struct klp_patch *patch)
 	kobject_put(&patch->kobj);
 }
 
-/*
- * The livepatch might be freed from sysfs interface created by the patch.
- * This work allows to wait until the interface is destroyed in a separate
- * context.
- */
-static void klp_free_patch_work_fn(struct work_struct *work)
-{
-	struct klp_patch *patch =
-		container_of(work, struct klp_patch, free_work);
-
-	klp_free_patch_finish(patch);
-}
-
-static void klp_free_patch_async(struct klp_patch *patch)
+static void klp_free_patch(struct klp_patch *patch)
 {
 	klp_free_patch_start(patch);
-	schedule_work(&patch->free_work);
+	klp_free_patch_finish(patch);
 }
 
-void klp_free_patches_async(struct list_head *to_free)
+void klp_free_patches(struct list_head *to_free)
 {
 	struct klp_patch *patch, *tmp_patch;
 
 	list_for_each_entry_safe(patch, tmp_patch, to_free, list) {
 		list_del_init(&patch->list);
-		klp_free_patch_async(patch);
+		klp_free_patch(patch);
 	}
 }
 
@@ -873,7 +869,6 @@ static int klp_init_patch_early(struct klp_patch *patch)
 	kobject_init(&patch->kobj, &klp_ktype_patch);
 	patch->enabled = false;
 	patch->forced = false;
-	INIT_WORK(&patch->free_work, klp_free_patch_work_fn);
 
 	klp_for_each_object_static(patch, obj) {
 		if (!obj->funcs)
@@ -1067,7 +1062,7 @@ int klp_enable_patch(struct klp_patch *patch)
 
 	mutex_unlock(&klp_mutex);
 
-	klp_free_patches_async(&to_free);
+	klp_free_patches(&to_free);
 
 	return 0;
 
diff --git a/kernel/livepatch/core.h b/kernel/livepatch/core.h
index 8ff97745ba40..ea593f370049 100644
--- a/kernel/livepatch/core.h
+++ b/kernel/livepatch/core.h
@@ -13,7 +13,7 @@ extern struct list_head klp_patches;
 #define klp_for_each_patch(patch)	\
 	list_for_each_entry(patch, &klp_patches, list)
 
-void klp_free_patches_async(struct list_head *to_free);
+void klp_free_patches(struct list_head *to_free);
 void klp_unpatch_replaced_patches(struct klp_patch *new_patch);
 void klp_discard_nops(struct klp_patch *new_patch);
 
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 0c1857405c17..1a339a076dd4 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -40,7 +40,7 @@ static void klp_transition_work_fn(struct work_struct *work)
 
 	mutex_unlock(&klp_mutex);
 
-	klp_free_patches_async(&to_free);
+	klp_free_patches(&to_free);
 }
 static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch
  2021-11-02 14:59 ` [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch Ming Lei
@ 2021-11-02 15:56   ` Petr Mladek
  2021-11-03  0:51     ` Ming Lei
  2021-11-08 17:46   ` Miroslav Benes
  1 sibling, 1 reply; 13+ messages in thread
From: Petr Mladek @ 2021-11-02 15:56 UTC (permalink / raw)
  To: Ming Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence

On Tue 2021-11-02 22:59:30, Ming Lei wrote:
> The completion finish is just for waiting release of the klp_patch
> object, then releases module refcnt. We can simply drop the module
> refcnt in the kobject release handler of klp_patch.
> 
> This way also helps to support allocating klp_patch from heap.

IMHO, this is wrong assumption. kobject_put() might do everyting
asynchronously, see:

   kobject_put()
     kobject_release()
       INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup);
       schedule_delayed_work(&kobj->release, delay);

   asynchronously:

     kobject_delayed_cleanup()
      kobject_cleanup()
	__kobject_del()


> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  include/linux/livepatch.h |  1 -
>  kernel/livepatch/core.c   | 12 +++---------
>  2 files changed, 3 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 2614247a9781..9712818997c5 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -170,7 +170,6 @@ struct klp_patch {
>  	bool enabled;
>  	bool forced;
>  	struct work_struct free_work;
> -	struct completion finish;
>  };
>  
>  #define klp_for_each_object_static(patch, obj) \
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 335d988bd811..b967b4b0071b 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -551,10 +551,10 @@ static int klp_add_nops(struct klp_patch *patch)
>  
>  static void klp_kobj_release_patch(struct kobject *kobj)
>  {
> -	struct klp_patch *patch;
> +	struct klp_patch *patch = container_of(kobj, struct klp_patch, kobj);
>  
> -	patch = container_of(kobj, struct klp_patch, kobj);
> -	complete(&patch->finish);
> +	if (!patch->forced)
> +		module_put(patch->mod);
>  }
>  
>  static struct kobj_type klp_ktype_patch = {
> @@ -678,11 +678,6 @@ static void klp_free_patch_finish(struct klp_patch *patch)
>  	 * cannot get enabled again.
>  	 */
>  	kobject_put(&patch->kobj);
> -	wait_for_completion(&patch->finish);
> -
> -	/* Put the module after the last access to struct klp_patch. */
> -	if (!patch->forced)
> -		module_put(patch->mod);

klp_free_patch_finish() does not longer wait until the release
callbacks are called.

klp_free_patch_finish() is called also in klp_enable_patch() error
path.

klp_enable_patch() is called in module_init(). For example, see
samples/livepatch/livepatch-sample.c

The module must not get removed until the release callbacks are called.
Does the module loader check the module reference counter when
module_init() fails?

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch
  2021-11-02 15:56   ` Petr Mladek
@ 2021-11-03  0:51     ` Ming Lei
  2021-11-03 12:52       ` Petr Mladek
  0 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2021-11-03  0:51 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence,
	ming.lei

On Tue, Nov 02, 2021 at 04:56:10PM +0100, Petr Mladek wrote:
> On Tue 2021-11-02 22:59:30, Ming Lei wrote:
> > The completion finish is just for waiting release of the klp_patch
> > object, then releases module refcnt. We can simply drop the module
> > refcnt in the kobject release handler of klp_patch.
> > 
> > This way also helps to support allocating klp_patch from heap.
> 
> IMHO, this is wrong assumption. kobject_put() might do everyting
> asynchronously, see:
> 
>    kobject_put()
>      kobject_release()
>        INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup);
>        schedule_delayed_work(&kobj->release, delay);
> 
>    asynchronously:
> 
>      kobject_delayed_cleanup()
>       kobject_cleanup()
> 	__kobject_del()

OK, this is one generic kobject release vs. module unloading issue to
solve, not unique for klp module, and there should be lots of drivers
suffering from it.

> 
> 
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  include/linux/livepatch.h |  1 -
> >  kernel/livepatch/core.c   | 12 +++---------
> >  2 files changed, 3 insertions(+), 10 deletions(-)
> > 
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 2614247a9781..9712818997c5 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -170,7 +170,6 @@ struct klp_patch {
> >  	bool enabled;
> >  	bool forced;
> >  	struct work_struct free_work;
> > -	struct completion finish;
> >  };
> >  
> >  #define klp_for_each_object_static(patch, obj) \
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index 335d988bd811..b967b4b0071b 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -551,10 +551,10 @@ static int klp_add_nops(struct klp_patch *patch)
> >  
> >  static void klp_kobj_release_patch(struct kobject *kobj)
> >  {
> > -	struct klp_patch *patch;
> > +	struct klp_patch *patch = container_of(kobj, struct klp_patch, kobj);
> >  
> > -	patch = container_of(kobj, struct klp_patch, kobj);
> > -	complete(&patch->finish);
> > +	if (!patch->forced)
> > +		module_put(patch->mod);
> >  }
> >  
> >  static struct kobj_type klp_ktype_patch = {
> > @@ -678,11 +678,6 @@ static void klp_free_patch_finish(struct klp_patch *patch)
> >  	 * cannot get enabled again.
> >  	 */
> >  	kobject_put(&patch->kobj);
> > -	wait_for_completion(&patch->finish);
> > -
> > -	/* Put the module after the last access to struct klp_patch. */
> > -	if (!patch->forced)
> > -		module_put(patch->mod);
> 
> klp_free_patch_finish() does not longer wait until the release
> callbacks are called.
> 
> klp_free_patch_finish() is called also in klp_enable_patch() error
> path.
> 
> klp_enable_patch() is called in module_init(). For example, see
> samples/livepatch/livepatch-sample.c
> 
> The module must not get removed until the release callbacks are called.
> Does the module loader check the module reference counter when
> module_init() fails?

Good catch, that is really one corner case, in which the kobject has to
be cleaned up before returning from mod->init(), cause there can't be
module unloading in case of mod->init() failure. 

Yeah, it should be more related with async kobject_put().

Also looks it is reasonable to add check when cleaning module loading
failure.


thanks,
Ming


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch
  2021-11-03  0:51     ` Ming Lei
@ 2021-11-03 12:52       ` Petr Mladek
  2021-11-05 12:04         ` Ming Lei
  0 siblings, 1 reply; 13+ messages in thread
From: Petr Mladek @ 2021-11-03 12:52 UTC (permalink / raw)
  To: Ming Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence

On Wed 2021-11-03 08:51:32, Ming Lei wrote:
> On Tue, Nov 02, 2021 at 04:56:10PM +0100, Petr Mladek wrote:
> > On Tue 2021-11-02 22:59:30, Ming Lei wrote:
> > > The completion finish is just for waiting release of the klp_patch
> > > object, then releases module refcnt. We can simply drop the module
> > > refcnt in the kobject release handler of klp_patch.
> > > 
> > > This way also helps to support allocating klp_patch from heap.

First, I am sorry for confusion. The above description is correct.
I does not say anything about that kobject_put() is synchronous.

> > IMHO, this is wrong assumption. kobject_put() might do everyting
> > asynchronously, see:

I see that you are aware of this behavior.

> >    kobject_put()
> >      kobject_release()
> >        INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup);
> >        schedule_delayed_work(&kobj->release, delay);
> > 
> >    asynchronously:
> > 
> >      kobject_delayed_cleanup()
> >       kobject_cleanup()
> > 	__kobject_del()
> 
> OK, this is one generic kobject release vs. module unloading issue to
> solve, not unique for klp module, and there should be lots of drivers
> suffering from it.

Yup, the problem is generic. It would be nice to have a generic
solution. For example, add kobject_release_sync() that would return
only when the object is really released.

> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > > @@ -678,11 +678,6 @@ static void klp_free_patch_finish(struct klp_patch *patch)
> > >  	 * cannot get enabled again.
> > >  	 */
> > >  	kobject_put(&patch->kobj);
> > > -	wait_for_completion(&patch->finish);
> > > -
> > > -	/* Put the module after the last access to struct klp_patch. */
> > > -	if (!patch->forced)
> > > -		module_put(patch->mod);
> > 
> > klp_free_patch_finish() does not longer wait until the release
> > callbacks are called.
> > 
> > klp_free_patch_finish() is called also in klp_enable_patch() error
> > path.
> > 
> > klp_enable_patch() is called in module_init(). For example, see
> > samples/livepatch/livepatch-sample.c
> > 
> > The module must not get removed until the release callbacks are called.
> > Does the module loader check the module reference counter when
> > module_init() fails?
> 
> Good catch, that is really one corner case, in which the kobject has to
> be cleaned up before returning from mod->init(), cause there can't be
> module unloading in case of mod->init() failure.

Just to be sure. We want to keep the safe behavior in this case.
There are many situations when klp_enable() might fail. And the error
handling must be safe.

In general, livepatch developers are very conservative.
Livepatches are not easy to create. They are used only by people
who really want to avoid reboot. We want to keep the livepatch kernel
framework as safe as possible to avoid any potential damage.

> Yeah, it should be more related with async kobject_put().

Yup, it would be nice to have some sychronous variant provided
by kobject API.

> Also looks it is reasonable to add check when cleaning module loading
> failure.

Which means that the completion has to stay until there is any generic
solution. And this patch should be dropped for now.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 3/3] livepatch: free klp_patch object synchronously
  2021-11-02 14:59 ` [PATCH V4 3/3] livepatch: free klp_patch object synchronously Ming Lei
@ 2021-11-03 13:55   ` Petr Mladek
  2021-11-05  7:59     ` Ming Lei
  0 siblings, 1 reply; 13+ messages in thread
From: Petr Mladek @ 2021-11-03 13:55 UTC (permalink / raw)
  To: Ming Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence

On Tue 2021-11-02 22:59:32, Ming Lei wrote:
> klp_mutex isn't acquired before calling kobject_put(klp_patch), so it is
> fine to free klp_patch object synchronously.
> 
> One issue is that enabled store() method, in which the klp_patch kobject
> itself is deleted & released. However, sysfs has provided APIs for dealing
> with this corner case, so use sysfs_break_active_protection() and
> sysfs_unbreak_active_protection() for releasing klp_patch kobject from
> enabled_store(), meantime the enabled attribute has to be removed
> before deleting the klp_patch kobject.
> 
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -369,10 +370,18 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
>  out:
>  	mutex_unlock(&klp_mutex);
>  
> -	klp_free_patches_async(&to_free);
> -
>  	if (ret)
>  		return ret;
> +
> +	if (!list_empty(&to_free)) {
> +		kn = sysfs_break_active_protection(kobj, &attr->attr);
> +		WARN_ON_ONCE(!kn);
> +		sysfs_remove_file(kobj, &attr->attr);
> +		klp_free_patches(&to_free);
> +		if (kn)
> +			sysfs_unbreak_active_protection(kn);
> +	}

I agree that using workqueues for free_work looks like a hack.
But this looks even more tricky and fragile to me. It feels like
playing with sysfs/kernfs internals.

It might look less tricky when using sysfs_remove_file_self().

Anyway, there are only few users of these APIs:

   + sysfs_break_active_protection() is used only scsi
   + kernfs_break_active_protection() is used by cgroups, cpusets, and rdtgroup.
   + sysfs_remove_file_self() is used by some RDMA-related stuff.

It means that there are some users but it is not widely used API.

I would personally prefer to keep it as is. I do not see any
fundamental advantage of the new code. But I might be biased
because the current code was written by me ;-)

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 3/3] livepatch: free klp_patch object synchronously
  2021-11-03 13:55   ` Petr Mladek
@ 2021-11-05  7:59     ` Ming Lei
  2021-11-10 14:42       ` Petr Mladek
  0 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2021-11-05  7:59 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence,
	ming.lei

On Wed, Nov 03, 2021 at 02:55:19PM +0100, Petr Mladek wrote:
> On Tue 2021-11-02 22:59:32, Ming Lei wrote:
> > klp_mutex isn't acquired before calling kobject_put(klp_patch), so it is
> > fine to free klp_patch object synchronously.
> > 
> > One issue is that enabled store() method, in which the klp_patch kobject
> > itself is deleted & released. However, sysfs has provided APIs for dealing
> > with this corner case, so use sysfs_break_active_protection() and
> > sysfs_unbreak_active_protection() for releasing klp_patch kobject from
> > enabled_store(), meantime the enabled attribute has to be removed
> > before deleting the klp_patch kobject.
> > 
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -369,10 +370,18 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> >  out:
> >  	mutex_unlock(&klp_mutex);
> >  
> > -	klp_free_patches_async(&to_free);
> > -
> >  	if (ret)
> >  		return ret;
> > +
> > +	if (!list_empty(&to_free)) {
> > +		kn = sysfs_break_active_protection(kobj, &attr->attr);
> > +		WARN_ON_ONCE(!kn);
> > +		sysfs_remove_file(kobj, &attr->attr);
> > +		klp_free_patches(&to_free);
> > +		if (kn)
> > +			sysfs_unbreak_active_protection(kn);
> > +	}
> 
> I agree that using workqueues for free_work looks like a hack.
> But this looks even more tricky and fragile to me. It feels like
> playing with sysfs/kernfs internals.
> 
> It might look less tricky when using sysfs_remove_file_self().

The protection needs to cover removing both 'enabled' attribute and
the patch kobject, so sysfs_remove_file_self() isn't good here.

> 
> Anyway, there are only few users of these APIs:
> 
>    + sysfs_break_active_protection() is used only scsi
>    + kernfs_break_active_protection() is used by cgroups, cpusets, and rdtgroup.
>    + sysfs_remove_file_self() is used by some RDMA-related stuff.
> 
> It means that there are some users but it is not widely used API.

It is used by generic pci device and scsi device, both are the most popular
devices in the world, either one of the two subsystem should have huge amount
of users, so it means the interface itself has been proved/verified for long
time by many enough real users.

> 
> I would personally prefer to keep it as is. I do not see any
> fundamental advantage of the new code. But I might be biased
> because the current code was written by me ;-)

The fundamental advantage is that the API has been used/verified by
enough real users. Also killing attribute/kobject itself isn't unique
for livepatch, that is actually one common pattern, so it needn't
such hacky implementation.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch
  2021-11-03 12:52       ` Petr Mladek
@ 2021-11-05 12:04         ` Ming Lei
  2021-11-10 13:57           ` Petr Mladek
  0 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2021-11-05 12:04 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence,
	ming.lei

On Wed, Nov 03, 2021 at 01:52:29PM +0100, Petr Mladek wrote:
> On Wed 2021-11-03 08:51:32, Ming Lei wrote:
> > On Tue, Nov 02, 2021 at 04:56:10PM +0100, Petr Mladek wrote:
> > > On Tue 2021-11-02 22:59:30, Ming Lei wrote:
> > > > The completion finish is just for waiting release of the klp_patch
> > > > object, then releases module refcnt. We can simply drop the module
> > > > refcnt in the kobject release handler of klp_patch.
> > > > 
> > > > This way also helps to support allocating klp_patch from heap.
> 
> First, I am sorry for confusion. The above description is correct.
> I does not say anything about that kobject_put() is synchronous.
> 
> > > IMHO, this is wrong assumption. kobject_put() might do everyting
> > > asynchronously, see:
> 
> I see that you are aware of this behavior.
> 
> > >    kobject_put()
> > >      kobject_release()
> > >        INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup);
> > >        schedule_delayed_work(&kobj->release, delay);
> > > 
> > >    asynchronously:
> > > 
> > >      kobject_delayed_cleanup()
> > >       kobject_cleanup()
> > > 	__kobject_del()
> > 
> > OK, this is one generic kobject release vs. module unloading issue to
> > solve, not unique for klp module, and there should be lots of drivers
> > suffering from it.
> 
> Yup, the problem is generic. It would be nice to have a generic
> solution. For example, add kobject_release_sync() that would return
> only when the object is really released.

The generic solution has been posted out:

https://lore.kernel.org/lkml/20211105063710.4092936-1-ming.lei@redhat.com/

which needn't any generic API change, just flushes all scheduled kobject
cleanup work before freeing module, and the change is transparent for
drivers.

IMO, kobject_release_sync() is one wrong direction for fixing this
issue, since it is basically impossible to audit if one kobject_put()
need to be replaced with kobject_release_sync().

> 
> > > > --- a/kernel/livepatch/core.c
> > > > +++ b/kernel/livepatch/core.c
> > > > @@ -678,11 +678,6 @@ static void klp_free_patch_finish(struct klp_patch *patch)
> > > >  	 * cannot get enabled again.
> > > >  	 */
> > > >  	kobject_put(&patch->kobj);
> > > > -	wait_for_completion(&patch->finish);
> > > > -
> > > > -	/* Put the module after the last access to struct klp_patch. */
> > > > -	if (!patch->forced)
> > > > -		module_put(patch->mod);
> > > 
> > > klp_free_patch_finish() does not longer wait until the release
> > > callbacks are called.
> > > 
> > > klp_free_patch_finish() is called also in klp_enable_patch() error
> > > path.
> > > 
> > > klp_enable_patch() is called in module_init(). For example, see
> > > samples/livepatch/livepatch-sample.c
> > > 
> > > The module must not get removed until the release callbacks are called.
> > > Does the module loader check the module reference counter when
> > > module_init() fails?
> > 
> > Good catch, that is really one corner case, in which the kobject has to
> > be cleaned up before returning from mod->init(), cause there can't be
> > module unloading in case of mod->init() failure.
> 
> Just to be sure. We want to keep the safe behavior in this case.
> There are many situations when klp_enable() might fail. And the error
> handling must be safe.
> 
> In general, livepatch developers are very conservative.
> Livepatches are not easy to create. They are used only by people
> who really want to avoid reboot. We want to keep the livepatch kernel
> framework as safe as possible to avoid any potential damage.

The posted patch can cover this situation in which module_init() fails.



Thanks,
Ming


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch
  2021-11-02 14:59 ` [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch Ming Lei
  2021-11-02 15:56   ` Petr Mladek
@ 2021-11-08 17:46   ` Miroslav Benes
  1 sibling, 0 replies; 13+ messages in thread
From: Miroslav Benes @ 2021-11-08 17:46 UTC (permalink / raw)
  To: Ming Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Petr Mladek, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence

On Tue, 2 Nov 2021, Ming Lei wrote:

> The completion finish is just for waiting release of the klp_patch
> object, then releases module refcnt. We can simply drop the module
> refcnt in the kobject release handler of klp_patch.
> 
> This way also helps to support allocating klp_patch from heap.
> 
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  include/linux/livepatch.h |  1 -
>  kernel/livepatch/core.c   | 12 +++---------
>  2 files changed, 3 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 2614247a9781..9712818997c5 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -170,7 +170,6 @@ struct klp_patch {
>  	bool enabled;
>  	bool forced;
>  	struct work_struct free_work;
> -	struct completion finish;
>  };
>  
>  #define klp_for_each_object_static(patch, obj) \

Petr pointed out the main problem. I'll just add that if it comes to it, 
you could also remove the inclusion of completion.h header file and a 
description of finish struct member.

Miroslav

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch
  2021-11-05 12:04         ` Ming Lei
@ 2021-11-10 13:57           ` Petr Mladek
  0 siblings, 0 replies; 13+ messages in thread
From: Petr Mladek @ 2021-11-10 13:57 UTC (permalink / raw)
  To: Ming Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence

On Fri 2021-11-05 20:04:03, Ming Lei wrote:
> On Wed, Nov 03, 2021 at 01:52:29PM +0100, Petr Mladek wrote:
> > On Wed 2021-11-03 08:51:32, Ming Lei wrote:
> > > On Tue, Nov 02, 2021 at 04:56:10PM +0100, Petr Mladek wrote:
> > I see that you are aware of this behavior.
> > 
> > > >    kobject_put()
> > > >      kobject_release()
> > > >        INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup);
> > > >        schedule_delayed_work(&kobj->release, delay);
> > > > 
> > > >    asynchronously:
> > > > 
> > > >      kobject_delayed_cleanup()
> > > >       kobject_cleanup()
> > > > 	__kobject_del()
> > > 
> > > OK, this is one generic kobject release vs. module unloading issue to
> > > solve, not unique for klp module, and there should be lots of drivers
> > > suffering from it.
> > 
> > Yup, the problem is generic. It would be nice to have a generic
> > solution. For example, add kobject_release_sync() that would return
> > only when the object is really released.
> 
> The generic solution has been posted out:
> 
> https://lore.kernel.org/lkml/20211105063710.4092936-1-ming.lei@redhat.com/
> 
> which needn't any generic API change, just flushes all scheduled kobject
> cleanup work before freeing module, and the change is transparent for
> drivers.

No, it is not enough. The proposed "generic solution" solves only
the problem introduced by CONFIG_DEBUG_KOBJECT_RELEASE. It does not
prevent unloading the module from another reasons. I mean that it
does not help when the last reference is not released in time
or never from some reasons.


> IMO, kobject_release_sync() is one wrong direction for fixing this
> issue, since it is basically impossible to audit if one kobject_put()
> need to be replaced with kobject_release_sync().
> 
> > 
> > > > > --- a/kernel/livepatch/core.c
> > > > > +++ b/kernel/livepatch/core.c
> > > > > @@ -678,11 +678,6 @@ static void klp_free_patch_finish(struct klp_patch *patch)
> > > > >  	 * cannot get enabled again.
> > > > >  	 */
> > > > >  	kobject_put(&patch->kobj);
> > > > > -	wait_for_completion(&patch->finish);
> > > > > -
> > > > > -	/* Put the module after the last access to struct klp_patch. */
> > > > > -	if (!patch->forced)
> > > > > -		module_put(patch->mod);
> > > > 
> > > > klp_free_patch_finish() does not longer wait until the release
> > > > callbacks are called.
> > > > 
> > > > klp_free_patch_finish() is called also in klp_enable_patch() error
> > > > path.
> > > > 
> > > > klp_enable_patch() is called in module_init(). For example, see
> > > > samples/livepatch/livepatch-sample.c
> > > > 
> > > > The module must not get removed until the release callbacks are called.
> > > > Does the module loader check the module reference counter when
> > > > module_init() fails?
> > > 
> > > Good catch, that is really one corner case, in which the kobject has to
> > > be cleaned up before returning from mod->init(), cause there can't be
> > > module unloading in case of mod->init() failure.
> > 
> > Just to be sure. We want to keep the safe behavior in this case.
> > There are many situations when klp_enable() might fail. And the error
> > handling must be safe.
> > 
> > In general, livepatch developers are very conservative.
> > Livepatches are not easy to create. They are used only by people
> > who really want to avoid reboot. We want to keep the livepatch kernel
> > framework as safe as possible to avoid any potential damage.
> 
> The posted patch can cover this situation in which module_init() fails.

No, it works only when the last reference was dropped, the delayed
release queued, and the kobject added into kobj_cleanup_list.

The current livepatch code is much more safe. klp_free_patch_finish()
waits for the completion. It will allows to remove the module only
when the kobject was really released. It will block the module
removal as long as needed, even forever, if there is a bug somewhere.

If we use wait_for_completion_timeout() in a while cycle, we could
even report when it takes suspiciously long and something likely
went wrong. This is one way to "always" catch and report the problem.

I am not sure if similar approach is usable for device drivers. But I
had this in mind when I proposed the kobject_release_sync() API.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V4 3/3] livepatch: free klp_patch object synchronously
  2021-11-05  7:59     ` Ming Lei
@ 2021-11-10 14:42       ` Petr Mladek
  0 siblings, 0 replies; 13+ messages in thread
From: Petr Mladek @ 2021-11-10 14:42 UTC (permalink / raw)
  To: Ming Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, live-patching,
	linux-kernel, Greg Kroah-Hartman, Luis Chamberlain, Joe Lawrence

On Fri 2021-11-05 15:59:56, Ming Lei wrote:
> On Wed, Nov 03, 2021 at 02:55:19PM +0100, Petr Mladek wrote:
> > On Tue 2021-11-02 22:59:32, Ming Lei wrote:
> > > klp_mutex isn't acquired before calling kobject_put(klp_patch), so it is
> > > fine to free klp_patch object synchronously.
> > > 
> > > One issue is that enabled store() method, in which the klp_patch kobject
> > > itself is deleted & released. However, sysfs has provided APIs for dealing
> > > with this corner case, so use sysfs_break_active_protection() and
> > > sysfs_unbreak_active_protection() for releasing klp_patch kobject from
> > > enabled_store(), meantime the enabled attribute has to be removed
> > > before deleting the klp_patch kobject.
> > > 
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > > @@ -369,10 +370,18 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> > >  out:
> > >  	mutex_unlock(&klp_mutex);
> > >  
> > > -	klp_free_patches_async(&to_free);
> > > -
> > >  	if (ret)
> > >  		return ret;
> > > +
> > > +	if (!list_empty(&to_free)) {
> > > +		kn = sysfs_break_active_protection(kobj, &attr->attr);
> > > +		WARN_ON_ONCE(!kn);
> > > +		sysfs_remove_file(kobj, &attr->attr);
> > > +		klp_free_patches(&to_free);
> > > +		if (kn)
> > > +			sysfs_unbreak_active_protection(kn);
> > > +	}
> > 
> > I agree that using workqueues for free_work looks like a hack.
> > But this looks even more tricky and fragile to me. It feels like
> > playing with sysfs/kernfs internals.
> > 
> > It might look less tricky when using sysfs_remove_file_self().
> 
> The protection needs to cover removing both 'enabled' attribute and
> the patch kobject, so sysfs_remove_file_self() isn't good here.

I see.

> > Anyway, there are only few users of these APIs:
> > 
> >    + sysfs_break_active_protection() is used only scsi
> >    + kernfs_break_active_protection() is used by cgroups, cpusets, and rdtgroup.
> >    + sysfs_remove_file_self() is used by some RDMA-related stuff.
> > 
> > It means that there are some users but it is not widely used API.
> 
> It is used by generic pci device and scsi device, both are the most popular
> devices in the world, either one of the two subsystem should have huge amount
> of users, so it means the interface itself has been proved/verified for long
> time by many enough real users.

Good to know. It means that if there is a regression then scsi users
should find it quickly.


> > > +		kn = sysfs_break_active_protection(kobj, &attr->attr);
> > > +		WARN_ON_ONCE(!kn);
> > > +		sysfs_remove_file(kobj, &attr->attr);
> > > +		klp_free_patches(&to_free);
> > > +		if (kn)
> > > +			sysfs_unbreak_active_protection(kn);


> > I would personally prefer to keep it as is. I do not see any
> > fundamental advantage of the new code. But I might be biased
> > because the current code was written by me ;-)
> 
> The fundamental advantage is that the API has been used/verified by
> enough real users. Also killing attribute/kobject itself isn't unique
> for livepatch, that is actually one common pattern, so it needn't
> such hacky implementation.

I am not sure what you mean by many users:

   + sysfs_break_active_protection() is used only once
     by sdev_store_delete()

   + sysfs_remove_file_self() seems to be used 7x in kernel sources.


It all goes down to kernfs_break_active_protection() that has
a bit scary description:

 * This function releases the active reference of @kn the caller is
 * holding.  Once this function is called, @kn may be removed at any point
 * and the caller is solely responsible for ensuring that the objects it
 * dereferences are accessible.


and the related kernfs_unbreak_active_protection() has even more
scarry description:

 * If kernfs_break_active_protection() was called, this function must be
 * invoked before finishing the kernfs operation.  Note that while this
 * function restores the active reference, it doesn't and can't actually
 * restore the active protection - @kn may already or be in the process of
 * being removed.  Once kernfs_break_active_protection() is invoked, that
 * protection is irreversibly gone for the kernfs operation instance.
 *
 * While this function may be called at any point after
 * kernfs_break_active_protection() is invoked, its most useful location
 * would be right before the enclosing kernfs operation returns.


It feels like this API allows you to cut the branch you are staying on.
You have to be sure that you do it in the right order and remove
the spot under your feet as the very last piece. While normally
this is guranteed by the refecence counters.

In compare, the workqueue approach looks less risky. You just ask
someone (worker) to remove your branch after you leave. It will be
naturally done only when nobody is on the branch and in the right
order thanks to the reference counters.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-11-10 14:42 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-02 14:59 [PATCH V4 0/3] livepatch: cleanup kpl_patch kobject release Ming Lei
2021-11-02 14:59 ` [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch Ming Lei
2021-11-02 15:56   ` Petr Mladek
2021-11-03  0:51     ` Ming Lei
2021-11-03 12:52       ` Petr Mladek
2021-11-05 12:04         ` Ming Lei
2021-11-10 13:57           ` Petr Mladek
2021-11-08 17:46   ` Miroslav Benes
2021-11-02 14:59 ` [PATCH V4 2/3] livepatch: free klp_patch object without holding klp_mutex Ming Lei
2021-11-02 14:59 ` [PATCH V4 3/3] livepatch: free klp_patch object synchronously Ming Lei
2021-11-03 13:55   ` Petr Mladek
2021-11-05  7:59     ` Ming Lei
2021-11-10 14:42       ` Petr Mladek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).