* [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched()
@ 2023-02-09 19:17 Josh Poimboeuf
  2023-02-09 19:17 ` [PATCH 1/3] livepatch: Skip task_call_func() for current task Josh Poimboeuf
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Josh Poimboeuf @ 2023-02-09 19:17 UTC (permalink / raw)
  To: live-patching
  Cc: linux-kernel, Seth Forshee, Peter Zijlstra, Song Liu,
	Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

Fix patching stalls caused by busy kthreads.

Josh Poimboeuf (3):
  livepatch: Skip task_call_func() for current task
  livepatch,sched: Add livepatch task switching to cond_resched()
  vhost: Fix livepatch timeouts in vhost_worker()

 drivers/vhost/vhost.c           |  3 +-
 include/linux/livepatch.h       |  1 +
 include/linux/livepatch_sched.h | 29 +++++++++++++++
 include/linux/sched.h           | 20 ++++++++---
 kernel/livepatch/transition.c   | 60 ++++++++++++++++++++++++++++++-
 kernel/sched/core.c             | 64 ++++++++++++++++++++++++++++-----
 6 files changed, 161 insertions(+), 16 deletions(-)
 create mode 100644 include/linux/livepatch_sched.h

-- 
2.39.0



* [PATCH 1/3] livepatch: Skip task_call_func() for current task
  2023-02-09 19:17 [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched() Josh Poimboeuf
@ 2023-02-09 19:17 ` Josh Poimboeuf
  2023-02-15  9:56   ` Petr Mladek
  2023-02-09 19:17 ` [PATCH 2/3] livepatch,sched: Add livepatch task switching to cond_resched() Josh Poimboeuf
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Josh Poimboeuf @ 2023-02-09 19:17 UTC (permalink / raw)
  To: live-patching
  Cc: linux-kernel, Seth Forshee, Peter Zijlstra, Song Liu,
	Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

The current task doesn't need the scheduler's protection to unwind its
own stack.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 kernel/livepatch/transition.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index f1b25ec581e0..4d1f443778f7 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -307,7 +307,11 @@ static bool klp_try_switch_task(struct task_struct *task)
 	 * functions.  If all goes well, switch the task to the target patch
 	 * state.
 	 */
-	ret = task_call_func(task, klp_check_and_switch_task, &old_name);
+	if (task == current)
+		ret = klp_check_and_switch_task(current, &old_name);
+	else
+		ret = task_call_func(task, klp_check_and_switch_task, &old_name);
+
 	switch (ret) {
 	case 0:		/* success */
 		break;
-- 
2.39.0



* [PATCH 2/3] livepatch,sched: Add livepatch task switching to cond_resched()
  2023-02-09 19:17 [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched() Josh Poimboeuf
  2023-02-09 19:17 ` [PATCH 1/3] livepatch: Skip task_call_func() for current task Josh Poimboeuf
@ 2023-02-09 19:17 ` Josh Poimboeuf
  2023-02-15 13:30   ` Petr Mladek
  2023-02-09 19:17 ` [PATCH 3/3] vhost: Fix livepatch timeouts in vhost_worker() Josh Poimboeuf
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Josh Poimboeuf @ 2023-02-09 19:17 UTC (permalink / raw)
  To: live-patching
  Cc: linux-kernel, Seth Forshee, Peter Zijlstra, Song Liu,
	Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

There have been reports [1][2] of live patches failing to complete
within a reasonable amount of time due to CPU-bound kthreads.

Fix it by patching tasks in cond_resched().

There are four different flavors of cond_resched(), depending on the
kernel configuration.  Hook into all of them.

A more elegant solution might be to use a preempt notifier.  However,
non-ORC unwinders can't unwind a preempted task reliably.

[1] https://lore.kernel.org/lkml/20220507174628.2086373-1-song@kernel.org/
[2] https://lkml.kernel.org/lkml/20230120-vhost-klp-switching-v1-0-7c2b65519c43@kernel.org
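
To make the problem concrete, a CPU-bound kthread loop of the kind
described in the reports looks roughly like the sketch below
(illustrative only; busy_kthread() and do_some_work() are hypothetical
names, not code from this series):

	static int busy_kthread(void *unused)
	{
		while (!kthread_should_stop()) {
			/*
			 * Stand-in for the kthread's real work; it never
			 * blocks, so stack checking from another task
			 * keeps failing.
			 */
			do_some_work();

			/*
			 * With this series, cond_resched() also attempts
			 * the livepatch task switch for the running task.
			 */
			cond_resched();
		}
		return 0;
	}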

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 include/linux/livepatch.h       |  1 +
 include/linux/livepatch_sched.h | 29 +++++++++++++++
 include/linux/sched.h           | 20 ++++++++---
 kernel/livepatch/transition.c   | 54 ++++++++++++++++++++++++++++
 kernel/sched/core.c             | 64 ++++++++++++++++++++++++++++-----
 5 files changed, 155 insertions(+), 13 deletions(-)
 create mode 100644 include/linux/livepatch_sched.h

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 293e29960c6e..9b9b38e89563 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -13,6 +13,7 @@
 #include <linux/ftrace.h>
 #include <linux/completion.h>
 #include <linux/list.h>
+#include <linux/livepatch_sched.h>
 
 #if IS_ENABLED(CONFIG_LIVEPATCH)
 
diff --git a/include/linux/livepatch_sched.h b/include/linux/livepatch_sched.h
new file mode 100644
index 000000000000..013794fb5da0
--- /dev/null
+++ b/include/linux/livepatch_sched.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _LINUX_LIVEPATCH_SCHED_H_
+#define _LINUX_LIVEPATCH_SCHED_H_
+
+#include <linux/jump_label.h>
+#include <linux/static_call_types.h>
+
+#ifdef CONFIG_LIVEPATCH
+
+void __klp_sched_try_switch(void);
+
+#if !defined(CONFIG_PREEMPT_DYNAMIC) || !defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+
+DECLARE_STATIC_KEY_FALSE(klp_sched_try_switch_key);
+
+static __always_inline void klp_sched_try_switch(void)
+{
+	if (static_branch_unlikely(&klp_sched_try_switch_key))
+		__klp_sched_try_switch();
+}
+
+#endif /* !CONFIG_PREEMPT_DYNAMIC || !CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
+
+#else /* !CONFIG_LIVEPATCH */
+static inline void klp_sched_try_switch(void) {}
+static inline void __klp_sched_try_switch(void) {}
+#endif /* CONFIG_LIVEPATCH */
+
+#endif /* _LINUX_LIVEPATCH_SCHED_H_ */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4df2b3e76b30..2bc5e925a6d8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -36,6 +36,7 @@
 #include <linux/seqlock.h>
 #include <linux/kcsan.h>
 #include <linux/rv.h>
+#include <linux/livepatch_sched.h>
 #include <asm/kmap_size.h>
 
 /* task_struct member predeclarations (sorted alphabetically): */
@@ -2070,6 +2071,9 @@ extern int __cond_resched(void);
 
 #if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
 
+void sched_dynamic_klp_enable(void);
+void sched_dynamic_klp_disable(void);
+
 DECLARE_STATIC_CALL(cond_resched, __cond_resched);
 
 static __always_inline int _cond_resched(void)
@@ -2078,6 +2082,7 @@ static __always_inline int _cond_resched(void)
 }
 
 #elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+
 extern int dynamic_cond_resched(void);
 
 static __always_inline int _cond_resched(void)
@@ -2085,20 +2090,25 @@ static __always_inline int _cond_resched(void)
 	return dynamic_cond_resched();
 }
 
-#else
+#else /* !CONFIG_PREEMPTION */
 
 static inline int _cond_resched(void)
 {
+	klp_sched_try_switch();
 	return __cond_resched();
 }
 
-#endif /* CONFIG_PREEMPT_DYNAMIC */
+#endif /* PREEMPT_DYNAMIC && CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
 
-#else
+#else /* CONFIG_PREEMPTION && !CONFIG_PREEMPT_DYNAMIC */
 
-static inline int _cond_resched(void) { return 0; }
+static inline int _cond_resched(void)
+{
+	klp_sched_try_switch();
+	return 0;
+}
 
-#endif /* !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC) */
+#endif /* !CONFIG_PREEMPTION || CONFIG_PREEMPT_DYNAMIC */
 
 #define cond_resched() ({			\
 	__might_resched(__FILE__, __LINE__, 0);	\
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 4d1f443778f7..e629966c6fbe 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -9,6 +9,7 @@
 
 #include <linux/cpu.h>
 #include <linux/stacktrace.h>
+#include <linux/static_call.h>
 #include "core.h"
 #include "patch.h"
 #include "transition.h"
@@ -24,6 +25,25 @@ static int klp_target_state = KLP_UNDEFINED;
 
 static unsigned int klp_signals_cnt;
 
+/*
+ * When a livepatch is in progress, enable klp stack checking in
+ * cond_resched().  This helps CPU-bound kthreads get patched.
+ */
+#if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+
+#define klp_cond_resched_enable() sched_dynamic_klp_enable()
+#define klp_cond_resched_disable() sched_dynamic_klp_disable()
+
+#else /* !CONFIG_PREEMPT_DYNAMIC || !CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
+
+DEFINE_STATIC_KEY_FALSE(klp_sched_try_switch_key);
+EXPORT_SYMBOL(klp_sched_try_switch_key);
+
+#define klp_cond_resched_enable() static_branch_enable(&klp_sched_try_switch_key)
+#define klp_cond_resched_disable() static_branch_disable(&klp_sched_try_switch_key)
+
+#endif /* CONFIG_PREEMPT_DYNAMIC && CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
+
 /*
  * This work can be performed periodically to finish patching or unpatching any
  * "straggler" tasks which failed to transition in the first attempt.
@@ -76,6 +96,8 @@ static void klp_complete_transition(void)
 		 klp_transition_patch->mod->name,
 		 klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
 
+	klp_cond_resched_disable();
+
 	if (klp_transition_patch->replace && klp_target_state == KLP_PATCHED) {
 		klp_unpatch_replaced_patches(klp_transition_patch);
 		klp_discard_nops(klp_transition_patch);
@@ -338,6 +360,36 @@ static bool klp_try_switch_task(struct task_struct *task)
 	return !ret;
 }
 
+void __klp_sched_try_switch(void)
+{
+	if (likely(!klp_patch_pending(current)))
+		return;
+
+	/*
+	 * This function is called from cond_resched() which is called in many
+	 * places throughout the kernel.  Using the klp_mutex here might
+	 * deadlock.
+	 *
+	 * Instead, disable preemption to prevent racing with other callers of
+	 * klp_try_switch_task().  Thanks to task_call_func() they won't be
+	 * able to switch this task while it's running.
+	 */
+	preempt_disable();
+
+	/*
+	 * Make sure current didn't get patched between the above check and
+	 * preempt_disable().
+	 */
+	if (unlikely(!klp_patch_pending(current)))
+		goto out;
+
+	klp_try_switch_task(current);
+
+out:
+	preempt_enable();
+}
+EXPORT_SYMBOL(__klp_sched_try_switch);
+
 /*
  * Sends a fake signal to all non-kthread tasks with TIF_PATCH_PENDING set.
  * Kthreads with TIF_PATCH_PENDING set are woken up.
@@ -496,6 +548,8 @@ void klp_start_transition(void)
 			set_tsk_thread_flag(task, TIF_PATCH_PENDING);
 	}
 
+	klp_cond_resched_enable();
+
 	klp_signals_cnt = 0;
 }
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3a0ef2fefbd5..ed5bf5ec0477 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8500,6 +8500,7 @@ EXPORT_STATIC_CALL_TRAMP(might_resched);
 static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
 int __sched dynamic_cond_resched(void)
 {
+	klp_sched_try_switch();
 	if (!static_branch_unlikely(&sk_dynamic_cond_resched))
 		return 0;
 	return __cond_resched();
@@ -8648,13 +8649,17 @@ int sched_dynamic_mode(const char *str)
 #error "Unsupported PREEMPT_DYNAMIC mechanism"
 #endif
 
-void sched_dynamic_update(int mode)
+DEFINE_MUTEX(sched_dynamic_mutex);
+static bool klp_override;
+
+static void __sched_dynamic_update(int mode)
 {
 	/*
 	 * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
 	 * the ZERO state, which is invalid.
 	 */
-	preempt_dynamic_enable(cond_resched);
+	if (!klp_override)
+		preempt_dynamic_enable(cond_resched);
 	preempt_dynamic_enable(might_resched);
 	preempt_dynamic_enable(preempt_schedule);
 	preempt_dynamic_enable(preempt_schedule_notrace);
@@ -8662,36 +8667,79 @@ void sched_dynamic_update(int mode)
 
 	switch (mode) {
 	case preempt_dynamic_none:
-		preempt_dynamic_enable(cond_resched);
+		if (!klp_override)
+			preempt_dynamic_enable(cond_resched);
 		preempt_dynamic_disable(might_resched);
 		preempt_dynamic_disable(preempt_schedule);
 		preempt_dynamic_disable(preempt_schedule_notrace);
 		preempt_dynamic_disable(irqentry_exit_cond_resched);
-		pr_info("Dynamic Preempt: none\n");
+		if (mode != preempt_dynamic_mode)
+			pr_info("Dynamic Preempt: none\n");
 		break;
 
 	case preempt_dynamic_voluntary:
-		preempt_dynamic_enable(cond_resched);
+		if (!klp_override)
+			preempt_dynamic_enable(cond_resched);
 		preempt_dynamic_enable(might_resched);
 		preempt_dynamic_disable(preempt_schedule);
 		preempt_dynamic_disable(preempt_schedule_notrace);
 		preempt_dynamic_disable(irqentry_exit_cond_resched);
-		pr_info("Dynamic Preempt: voluntary\n");
+		if (mode != preempt_dynamic_mode)
+			pr_info("Dynamic Preempt: voluntary\n");
 		break;
 
 	case preempt_dynamic_full:
-		preempt_dynamic_disable(cond_resched);
+		if (!klp_override)
+			preempt_dynamic_disable(cond_resched);
 		preempt_dynamic_disable(might_resched);
 		preempt_dynamic_enable(preempt_schedule);
 		preempt_dynamic_enable(preempt_schedule_notrace);
 		preempt_dynamic_enable(irqentry_exit_cond_resched);
-		pr_info("Dynamic Preempt: full\n");
+		if (mode != preempt_dynamic_mode)
+			pr_info("Dynamic Preempt: full\n");
 		break;
 	}
 
 	preempt_dynamic_mode = mode;
 }
 
+void sched_dynamic_update(int mode)
+{
+	mutex_lock(&sched_dynamic_mutex);
+	__sched_dynamic_update(mode);
+	mutex_unlock(&sched_dynamic_mutex);
+}
+
+#ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
+
+static int klp_cond_resched(void)
+{
+	__klp_sched_try_switch();
+	return __cond_resched();
+}
+
+void sched_dynamic_klp_enable(void)
+{
+	mutex_lock(&sched_dynamic_mutex);
+
+	klp_override = true;
+	static_call_update(cond_resched, klp_cond_resched);
+
+	mutex_unlock(&sched_dynamic_mutex);
+}
+
+void sched_dynamic_klp_disable(void)
+{
+	mutex_lock(&sched_dynamic_mutex);
+
+	klp_override = false;
+	__sched_dynamic_update(preempt_dynamic_mode);
+
+	mutex_unlock(&sched_dynamic_mutex);
+}
+
+#endif /* CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
+
 static int __init setup_preempt_mode(char *str)
 {
 	int mode = sched_dynamic_mode(str);
-- 
2.39.0



* [PATCH 3/3] vhost: Fix livepatch timeouts in vhost_worker()
  2023-02-09 19:17 [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched() Josh Poimboeuf
  2023-02-09 19:17 ` [PATCH 1/3] livepatch: Skip task_call_func() for current task Josh Poimboeuf
  2023-02-09 19:17 ` [PATCH 2/3] livepatch,sched: Add livepatch task switching to cond_resched() Josh Poimboeuf
@ 2023-02-09 19:17 ` Josh Poimboeuf
  2023-02-15 13:31   ` Petr Mladek
  2023-02-10 21:08 ` [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched() Seth Forshee
  2023-02-14 12:08 ` Peter Zijlstra
  4 siblings, 1 reply; 12+ messages in thread
From: Josh Poimboeuf @ 2023-02-09 19:17 UTC (permalink / raw)
  To: live-patching
  Cc: linux-kernel, Seth Forshee, Peter Zijlstra, Song Liu,
	Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

Livepatch timeouts were reported due to busy vhost_worker() kthreads.
Now that cond_resched() can do livepatch task switching, use
cond_resched() in vhost_worker().  That's the better way to
conditionally call schedule() anyway.

Reported-by: Seth Forshee (DigitalOcean) <sforshee@digitalocean.com>
Link: https://lkml.kernel.org/lkml/20230120-vhost-klp-switching-v1-0-7c2b65519c43@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 drivers/vhost/vhost.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index cbe72bfd2f1f..424c0c939f57 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -363,8 +363,7 @@ static int vhost_worker(void *data)
 			kcov_remote_start_common(dev->kcov_handle);
 			work->fn(work);
 			kcov_remote_stop();
-			if (need_resched())
-				schedule();
+			cond_resched();
 		}
 	}
 	kthread_unuse_mm(dev->mm);
-- 
2.39.0



* Re: [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched()
  2023-02-09 19:17 [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched() Josh Poimboeuf
                   ` (2 preceding siblings ...)
  2023-02-09 19:17 ` [PATCH 3/3] vhost: Fix livepatch timeouts in vhost_worker() Josh Poimboeuf
@ 2023-02-10 21:08 ` Seth Forshee
  2023-02-14 12:08 ` Peter Zijlstra
  4 siblings, 0 replies; 12+ messages in thread
From: Seth Forshee @ 2023-02-10 21:08 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: live-patching, linux-kernel, Peter Zijlstra, Song Liu,
	Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

On Thu, Feb 09, 2023 at 11:17:46AM -0800, Josh Poimboeuf wrote:
> Fix patching stalls caused by busy kthreads.
> 
> Josh Poimboeuf (3):
>   livepatch: Skip task_call_func() for current task
>   livepatch,sched: Add livepatch task switching to cond_resched()
>   vhost: Fix livepatch timeouts in vhost_worker()

This is working perfectly in my testing. The busy vhost threads get
switched immediately. Thanks for working up these patches!

Tested-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>

Thanks,
Seth


* Re: [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched()
  2023-02-09 19:17 [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched() Josh Poimboeuf
                   ` (3 preceding siblings ...)
  2023-02-10 21:08 ` [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched() Seth Forshee
@ 2023-02-14 12:08 ` Peter Zijlstra
  2023-02-14 19:05   ` Josh Poimboeuf
  4 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2023-02-14 12:08 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: live-patching, linux-kernel, Seth Forshee, Song Liu,
	Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

On Thu, Feb 09, 2023 at 11:17:46AM -0800, Josh Poimboeuf wrote:
> Fix patching stalls caused by busy kthreads.
> 
> Josh Poimboeuf (3):
>   livepatch: Skip task_call_func() for current task
>   livepatch,sched: Add livepatch task switching to cond_resched()
>   vhost: Fix livepatch timeouts in vhost_worker()

Seems reasonable, you want me to take them through the sched tree?


* Re: [PATCH 0/3] livepatch,sched: Add livepatch task switching to cond_resched()
  2023-02-14 12:08 ` Peter Zijlstra
@ 2023-02-14 19:05   ` Josh Poimboeuf
  0 siblings, 0 replies; 12+ messages in thread
From: Josh Poimboeuf @ 2023-02-14 19:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: live-patching, linux-kernel, Seth Forshee, Song Liu,
	Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

On Tue, Feb 14, 2023 at 01:08:22PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 09, 2023 at 11:17:46AM -0800, Josh Poimboeuf wrote:
> > Fix patching stalls caused by busy kthreads.
> > 
> > Josh Poimboeuf (3):
> >   livepatch: Skip task_call_func() for current task
> >   livepatch,sched: Add livepatch task switching to cond_resched()
> >   vhost: Fix livepatch timeouts in vhost_worker()
> 
> Seems reasonable, you want me to take them through the sched tree?

That would be great, though first I'd like to get an ack from at least
one other livepatch maintainer.

-- 
Josh


* Re: [PATCH 1/3] livepatch: Skip task_call_func() for current task
  2023-02-09 19:17 ` [PATCH 1/3] livepatch: Skip task_call_func() for current task Josh Poimboeuf
@ 2023-02-15  9:56   ` Petr Mladek
  0 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2023-02-15  9:56 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: live-patching, linux-kernel, Seth Forshee, Peter Zijlstra,
	Song Liu, Mark Rutland, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

On Thu 2023-02-09 11:17:47, Josh Poimboeuf wrote:
> The current task doesn't need the scheduler's protection to unwind its
> own stack.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

I do not see any problem with it:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr


* Re: [PATCH 2/3] livepatch,sched: Add livepatch task switching to cond_resched()
  2023-02-09 19:17 ` [PATCH 2/3] livepatch,sched: Add livepatch task switching to cond_resched() Josh Poimboeuf
@ 2023-02-15 13:30   ` Petr Mladek
  2023-02-16  2:26     ` Josh Poimboeuf
  0 siblings, 1 reply; 12+ messages in thread
From: Petr Mladek @ 2023-02-15 13:30 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: live-patching, linux-kernel, Seth Forshee, Peter Zijlstra,
	Song Liu, Mark Rutland, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

On Thu 2023-02-09 11:17:48, Josh Poimboeuf wrote:
> There have been reports [1][2] of live patches failing to complete
> within a reasonable amount of time due to CPU-bound kthreads.
> 
> Fix it by patching tasks in cond_resched().
> 
> There are four different flavors of cond_resched(), depending on the
> kernel configuration.  Hook into all of them.
> 
> A more elegant solution might be to use a preempt notifier.  However,
> non-ORC unwinders can't unwind a preempted task reliably.
> 
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2085,20 +2090,25 @@ static __always_inline int _cond_resched(void)
>  	return dynamic_cond_resched();
>  }
>  
> -#else
> +#else /* !CONFIG_PREEMPTION */
>  
>  static inline int _cond_resched(void)
>  {
> +	klp_sched_try_switch();
>  	return __cond_resched();

My only concern is if it might cause any performance problems.

On one hand, cond_resched() is used in code paths that are slow
on their own. Also it will do nothing most of the time.

On the other hand, cond_resched() is typically used in loops.
One iteration might be fast. The code might be slow because there
are many iterations. Repeating the same failing test on each
iteration might prolong the time significantly.

An idea is to try the switch only when __cond_resched() really
rescheduled the task. Something like:

static inline int _cond_resched(void)
{
	int scheduled;

	scheduled = __cond_resched();
	if (scheduled)
		klp_sched_try_switch();

	return scheduled;
}

But it would make it less reliable/predictable. Also it won't work
in configurations when cond_resched() is always a nop.

I am probably too careful. We might keep it simple until any real
life problems are reported.

> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -76,6 +96,8 @@ static void klp_complete_transition(void)
>  		 klp_transition_patch->mod->name,
>  		 klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
>  
> +	klp_cond_resched_disable();
> +

Nit: Strictly speaking, this is not needed when klp_complete_transition()
     is called from klp_cancel_transition(). In this case,
     klp_cond_resched_enable() was not called. So it might be moved into
     klp_try_complete_transition().
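
     I mean something like this (just a sketch to show the placement;
     the rest of klp_try_complete_transition() is elided and unchanged):

	static bool klp_try_complete_transition(void)
	{
		...
		/* Success: all tasks are in the target patch state. */
		klp_cond_resched_disable();

		klp_complete_transition();
		...
	}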


A more important thing, thinking out loud:

We need to make sure that no task is in the middle of
__klp_sched_try_switch() when we modify anything that is used there.

We seem to be on the safe side in klp_complete_transition(). We are
here only when all tasks have TIF_PATCH_PENDING cleared. In this case,
__klp_sched_try_switch() just returns. klp_complete_transition() also calls
klp_synchronize_transition() so that all tasks finish the critical part
in __klp_sched_try_switch() before any new transition starts.

But that is not the case in klp_reverse_transition(). It modifies
klp_target_state while __klp_sched_try_switch() might be in the middle
of klp_check_stack(), which might then give a wrong result.

klp_reverse_transition() already solves a similar race with
klp_update_patch_state() by clearing all TIF_PATCH_PENDING flags
and calling klp_synchronize_transition(). We just need to do
it earlier. Something like:

--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -642,14 +642,11 @@ void klp_reverse_transition(void)
 		 klp_target_state == KLP_PATCHED ? "patching to unpatching" :
 						   "unpatching to patching");
 
-	klp_transition_patch->enabled = !klp_transition_patch->enabled;
-
-	klp_target_state = !klp_target_state;
 
 	/*
 	 * Clear all TIF_PATCH_PENDING flags to prevent races caused by
-	 * klp_update_patch_state() running in parallel with
-	 * klp_start_transition().
+	 * klp_update_patch_state() and __klp_sched_try_switch()
+	 * running in parallel.
 	 */
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, task)
@@ -659,9 +656,16 @@ void klp_reverse_transition(void)
 	for_each_possible_cpu(cpu)
 		clear_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
 
-	/* Let any remaining calls to klp_update_patch_state() complete */
+	/*
+	 * Make sure that both klp_update_patch_state() and
+	 * __klp_sched_try_switch() see the TIF_PATCH_PENDING cleared.
+	 */
 	klp_synchronize_transition();
 
+	klp_transition_patch->enabled = !klp_transition_patch->enabled;
+
+	klp_target_state = !klp_target_state;
+
 	klp_start_transition();
 }
 


>  	if (klp_transition_patch->replace && klp_target_state == KLP_PATCHED) {
>  		klp_unpatch_replaced_patches(klp_transition_patch);
>  		klp_discard_nops(klp_transition_patch);


Otherwise, the patch looks fine. I looked at it primarily from
the livepatching code side.

Best Regards,
Petr


* Re: [PATCH 3/3] vhost: Fix livepatch timeouts in vhost_worker()
  2023-02-09 19:17 ` [PATCH 3/3] vhost: Fix livepatch timeouts in vhost_worker() Josh Poimboeuf
@ 2023-02-15 13:31   ` Petr Mladek
  0 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2023-02-15 13:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: live-patching, linux-kernel, Seth Forshee, Peter Zijlstra,
	Song Liu, Mark Rutland, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

On Thu 2023-02-09 11:17:49, Josh Poimboeuf wrote:
> Livepatch timeouts were reported due to busy vhost_worker() kthreads.
> Now that cond_resched() can do livepatch task switching, use
> cond_resched() in vhost_worker().  That's the better way to
> conditionally call schedule() anyway.
> 
> Reported-by: Seth Forshee (DigitalOcean) <sforshee@digitalocean.com>
> Link: https://lkml.kernel.org/lkml/20230120-vhost-klp-switching-v1-0-7c2b65519c43@kernel.org
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr


* Re: [PATCH 2/3] livepatch,sched: Add livepatch task switching to cond_resched()
  2023-02-15 13:30   ` Petr Mladek
@ 2023-02-16  2:26     ` Josh Poimboeuf
  2023-02-17  9:14       ` Petr Mladek
  0 siblings, 1 reply; 12+ messages in thread
From: Josh Poimboeuf @ 2023-02-16  2:26 UTC (permalink / raw)
  To: Petr Mladek
  Cc: live-patching, linux-kernel, Seth Forshee, Peter Zijlstra,
	Song Liu, Mark Rutland, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

On Wed, Feb 15, 2023 at 02:30:36PM +0100, Petr Mladek wrote:
> >  static inline int _cond_resched(void)
> >  {
> > +	klp_sched_try_switch();
> >  	return __cond_resched();
> 
> My only concern is if it might cause any performance problems.
> 
> On one hand, cond_resched() is used in code paths that are slow
> on their own. Also it will do nothing most of the time.
> 
> On the other hand, cond_resched() is typically used in loops.
> One iteration might be fast. The code might be slow because there
> are many iterations. Repeating the same failing test on each
> iteration might prolong the time significantly.

Yes, but it should hopefully be very rare to patch a function in the
call stack of a kthread loop.  In general it's a good idea for the patch
author to avoid that.

> An idea is to try the switch only when __cond_resched() really
> rescheduled the task. Something like:
> 
> static inline int _cond_resched(void)
> {
> 	int scheduled;
> 
> 	scheduled = __cond_resched();
> 	if (scheduled)
> 		klp_sched_try_switch();
> 
> 	return scheduled;
> }
> 
> But it would make it less reliable/predictable. Also it won't work
> in configurations when cond_resched() is always a nop.
> 
> I am probably too careful. We might keep it simple until any real
> life problems are reported.

If we can get away with it, I much prefer the simple unconditional
klp_sched_try_switch() because of the predictability and quickness with
which the kthread gets patched.

> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -76,6 +96,8 @@ static void klp_complete_transition(void)
> >  		 klp_transition_patch->mod->name,
> >  		 klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> >  
> > +	klp_cond_resched_disable();
> > +
> 
> Nit: Strictly speaking, this is not needed when klp_complete_transition()
>      is called from klp_cancel_transition(). In this case,
>      klp_cond_resched_enable() was not called. So it might be moved into
>      klp_try_complete_transition().

Argh, I always forget about that pesky klp_cancel_transition().

> A more important thing, thinking out loud:
> 
> We need to make sure that no task is in the middle of
> __klp_sched_try_switch() when we modify anything that is used there.
> 
> We seem to be on the safe side in klp_complete_transition(). We are
> here only when all tasks have TIF_PATCH_PENDING cleared. In this case,
> __klp_sched_try_switch() just returns. klp_complete_transition() also calls
> klp_synchronize_transition() so that all tasks finish the critical part
> in __klp_sched_try_switch() before any new transition starts.
> 
> But that is not the case in klp_reverse_transition(). It modifies
> klp_target_state while __klp_sched_try_switch() might be in the middle
> of klp_check_stack(), which might then give a wrong result.
> 
> klp_reverse_transition() already solves a similar race with
> klp_update_patch_state() by clearing all TIF_PATCH_PENDING flags
> and calling klp_synchronize_transition(). We just need to do
> it earlier. Something like:

Yes!  Thanks, I can always count on you to find the race conditions ;-)

This highlights the similarities between klp_update_patch_state(current)
and __klp_sched_try_switch(): they both access TIF_PATCH_PENDING
out-of-band.

Also, I'll update the comment in klp_copy_process(). It should be safe
with respect to __klp_sched_try_switch() for the same reason as
klp_update_patch_state(current): they both only work on 'current'.
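
Maybe something along these lines for the updated comment (just a
sketch; the rest of klp_copy_process() would stay as-is):

	/*
	 * The operation is serialized against all klp_*_transition()
	 * operations by the tasklist_lock.  The only exceptions are
	 * klp_update_patch_state(current) and __klp_sched_try_switch(),
	 * but they cannot race with us here because they only operate
	 * on "current".
	 */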

-- 
Josh


* Re: [PATCH 2/3] livepatch,sched: Add livepatch task switching to cond_resched()
  2023-02-16  2:26     ` Josh Poimboeuf
@ 2023-02-17  9:14       ` Petr Mladek
  0 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2023-02-17  9:14 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: live-patching, linux-kernel, Seth Forshee, Peter Zijlstra,
	Song Liu, Mark Rutland, Joe Lawrence, Miroslav Benes,
	Jiri Kosina, Ingo Molnar

On Wed 2023-02-15 18:26:30, Josh Poimboeuf wrote:
> On Wed, Feb 15, 2023 at 02:30:36PM +0100, Petr Mladek wrote:
> > >  static inline int _cond_resched(void)
> > >  {
> > > +	klp_sched_try_switch();
> > >  	return __cond_resched();
> > 
> > My only concern is if it might cause any performance problems.
> > 
> > On one hand, cond_resched() is used in code paths that are slow
> > on their own. Also it will do nothing most of the time.
> > 
> > On the other hand, cond_resched() is typically used in loops.
> > One iteration might be fast. The code might be slow because there
> > are many iterations. Repeating the same failing test on each
> > iteration might prolong the time significantly.
> 
> Yes, but it should hopefully be very rare to patch a function in the
> call stack of a kthread loop.  In general it's a good idea for the patch
> author to avoid that.

I do not have any data about it.

> > An idea is to try the switch only when __cond_resched() really
> > rescheduled the task. Something like:
> > 
> > static inline int _cond_resched(void)
> > {
> > 	int scheduled;
> > 
> > 	scheduled = __cond_resched();
> > 	if (scheduled)
> > 		klp_sched_try_switch();
> > 
> > 	return scheduled;
> > }
> > 
> > But it would make it less reliable/predictable. Also it won't work
> > in configurations when cond_resched() is always a nop.
> > 
> > I am probably too careful. We might keep it simple until any real
> > life problems are reported.
> 
> If we can get away with it, I much prefer the simple unconditional
> klp_sched_try_switch() because of the predictability and quickness with
> which the kthread gets patched.

I am fine with keeping it simple now. We could always change it when
it causes problems.

I primarily wanted to point out the potential problem and check what
others think about it.

Best Regards,
Petr

