live-patching.vger.kernel.org archive mirror
* [PATCH] livepatch: Fix idle cpu's tasks transition
@ 2021-09-09  9:16 Vasily Gorbik
  2021-09-10  9:17 ` Petr Mladek
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Vasily Gorbik @ 2021-09-09  9:16 UTC
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek
  Cc: Joe Lawrence, Ingo Molnar, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Heiko Carstens, Sven Schnelle, Sumanth Korikkar,
	live-patching, linux-kernel

On an idle system with a large number of CPUs, it might happen that
klp_update_patch_state() is not reached in do_idle() for long periods
of time. With debug messages enabled, the log is filled with:
[  499.442643] livepatch: klp_try_switch_task: swapper/63:0 is running

without any sign of progress, and the transition eventually ends with
"failed to complete transition".

On an s390 LPAR with 128 CPUs, not a single transition is able to
complete and the livepatch kselftests fail. Tests on an idling x86 KVM
instance with 128 CPUs demonstrate similar symptoms with and without
CONFIG_NO_HZ.

To deal with that, since the runqueue is already locked in
klp_try_switch_task(), identify idling CPUs and trigger rescheduling,
potentially waking them up and making sure idle tasks break out of the
do_idle() inner loop and reach klp_update_patch_state(). This helps to
speed up transition time while avoiding unnecessary extra system load.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
---
Previous discussion and RFC PATCH:
lkml.kernel.org/r/patch.git-b76842ceb035.your-ad-here.call-01625661932-ext-1304@work.hours

 kernel/livepatch/transition.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 3a4beb9395c4..c5832b2dd081 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -308,6 +308,8 @@ static bool klp_try_switch_task(struct task_struct *task)
 	rq = task_rq_lock(task, &flags);
 
 	if (task_running(rq, task) && task != current) {
+		if (is_idle_task(task))
+			resched_curr(rq);
 		snprintf(err_buf, STACK_ERR_BUF_SIZE,
 			 "%s: %s:%d is running\n", __func__, task->comm,
 			 task->pid);
-- 
2.25.4
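
For illustration only, the effect of the two added lines can be modelled
in userspace. The sketch below is a toy model, not kernel code:
struct model_cpu, model_idle_pass(), model_try_switch() and
model_transition() are hypothetical names, the need_resched flag merely
stands in for what resched_curr() requests, and the retry limit is an
assumption. It only shows why an idle task that is never asked to
reschedule never reaches the point where its patch state is updated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model values; these are not the kernel's KLP_* constants. */
enum { MODEL_UNPATCHED = 0, MODEL_PATCHED = 1 };

/* Toy stand-in for one CPU that is running only its idle task. */
struct model_cpu {
	bool need_resched;	/* models the reschedule request on the idle task */
	int patch_state;	/* patch state of this CPU's idle task */
};

/*
 * One pass of the modelled idle task. Without a reschedule request it
 * stays in the inner idle loop and never updates its patch state.
 */
static void model_idle_pass(struct model_cpu *cpu, int target)
{
	if (!cpu->need_resched)
		return;
	cpu->need_resched = false;
	cpu->patch_state = target;	/* models klp_update_patch_state() */
}

/*
 * Models klp_try_switch_task() for an idle task running on another CPU:
 * the task cannot be switched from here, so report failure and, if
 * 'kick' is set, request a reschedule (models resched_curr(rq)).
 */
static bool model_try_switch(struct model_cpu *cpu, int target, bool kick)
{
	if (cpu->patch_state == target)
		return true;
	if (kick)
		cpu->need_resched = true;
	return false;
}

/*
 * Retry the transition up to max_tries times. Returns the attempt on
 * which every CPU was found switched, or -1 if it never completed.
 */
static int model_transition(struct model_cpu *cpus, int n, int target,
			    bool kick, int max_tries)
{
	assert(cpus != NULL && n > 0);

	for (int attempt = 1; attempt <= max_tries; attempt++) {
		bool complete = true;

		for (int i = 0; i < n; i++)
			if (!model_try_switch(&cpus[i], target, kick))
				complete = false;
		if (complete)
			return attempt;

		/* Let every modelled idle task run one pass before retrying. */
		for (int i = 0; i < n; i++)
			model_idle_pass(&cpus[i], target);
	}
	return -1;
}
```

With 128 zero-initialised model CPUs and a limit of 10 attempts,
model_transition() returns -1 when kick is false and 2 when kick is
true. The model ignores timer ticks and real wakeups, so it says nothing
about how long a real transition takes.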


* Re: [PATCH] livepatch: Fix idle cpu's tasks transition
From: Petr Mladek @ 2021-09-10  9:17 UTC
  To: Vasily Gorbik
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	Ingo Molnar, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Heiko Carstens, Sven Schnelle, Sumanth Korikkar,
	live-patching, linux-kernel

On Thu 2021-09-09 11:16:01, Vasily Gorbik wrote:
> On an idle system with a large number of CPUs, it might happen that
> klp_update_patch_state() is not reached in do_idle() for long periods
> of time. With debug messages enabled, the log is filled with:
> [  499.442643] livepatch: klp_try_switch_task: swapper/63:0 is running
> 
> without any sign of progress, and the transition eventually ends with
> "failed to complete transition".
> 
> On an s390 LPAR with 128 CPUs, not a single transition is able to
> complete and the livepatch kselftests fail. Tests on an idling x86 KVM
> instance with 128 CPUs demonstrate similar symptoms with and without
> CONFIG_NO_HZ.
> 
> To deal with that, since the runqueue is already locked in
> klp_try_switch_task(), identify idling CPUs and trigger rescheduling,
> potentially waking them up and making sure idle tasks break out of the
> do_idle() inner loop and reach klp_update_patch_state(). This helps to
> speed up transition time while avoiding unnecessary extra system load.
> 
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> ---
> Previous discussion and RFC PATCH:
> lkml.kernel.org/r/patch.git-b76842ceb035.your-ad-here.call-01625661932-ext-1304@work.hours
> 
>  kernel/livepatch/transition.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 3a4beb9395c4..c5832b2dd081 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -308,6 +308,8 @@ static bool klp_try_switch_task(struct task_struct *task)

Please also update the comment above klp_try_switch_task(). I would
write something like:

/*
 * Try to safely switch a task to the target patch state.  If it's currently
 * running, or it's sleeping on a to-be-patched or to-be-unpatched function, or
 * if the stack is unreliable, return false.
 *
 * Idle tasks are switched in the main loop when running.
 */

>  	rq = task_rq_lock(task, &flags);
>  
>  	if (task_running(rq, task) && task != current) {

This would deserve a comment, for example:

		/*
		 * Idle tasks might stay running for a long time. Switch them
		 * in the main loop.
		 */

> +		if (is_idle_task(task))
> +			resched_curr(rq);
>  		snprintf(err_buf, STACK_ERR_BUF_SIZE,
>  			 "%s: %s:%d is running\n", __func__, task->comm,
>  			 task->pid);

Otherwise, it looks good to me. With the two comments added:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr


* Re: [PATCH] livepatch: Fix idle cpu's tasks transition
From: Miroslav Benes @ 2021-09-10  9:32 UTC
  To: Vasily Gorbik
  Cc: Josh Poimboeuf, Jiri Kosina, Petr Mladek, Joe Lawrence,
	Ingo Molnar, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Heiko Carstens, Sven Schnelle, Sumanth Korikkar,
	live-patching, linux-kernel

On Thu, 9 Sep 2021, Vasily Gorbik wrote:

> On an idle system with a large number of CPUs, it might happen that
> klp_update_patch_state() is not reached in do_idle() for long periods
> of time. With debug messages enabled, the log is filled with:
> [  499.442643] livepatch: klp_try_switch_task: swapper/63:0 is running
> 
> without any sign of progress, and the transition eventually ends with
> "failed to complete transition".
> 
> On an s390 LPAR with 128 CPUs, not a single transition is able to
> complete and the livepatch kselftests fail. Tests on an idling x86 KVM
> instance with 128 CPUs demonstrate similar symptoms with and without
> CONFIG_NO_HZ.
> 
> To deal with that, since the runqueue is already locked in
> klp_try_switch_task(), identify idling CPUs and trigger rescheduling,
> potentially waking them up and making sure idle tasks break out of the
> do_idle() inner loop and reach klp_update_patch_state(). This helps to
> speed up transition time while avoiding unnecessary extra system load.
> 
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

Seems reasonable to me.

Acked-by: Miroslav Benes <mbenes@suse.cz>

M


* Re: [PATCH] livepatch: Fix idle cpu's tasks transition
From: Josh Poimboeuf @ 2021-09-10 16:03 UTC
  To: Vasily Gorbik
  Cc: Jiri Kosina, Miroslav Benes, Petr Mladek, Joe Lawrence,
	Ingo Molnar, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Heiko Carstens, Sven Schnelle, Sumanth Korikkar,
	live-patching, linux-kernel

On Thu, Sep 09, 2021 at 11:16:01AM +0200, Vasily Gorbik wrote:
> On an idle system with a large number of CPUs, it might happen that
> klp_update_patch_state() is not reached in do_idle() for long periods
> of time. With debug messages enabled, the log is filled with:
> [  499.442643] livepatch: klp_try_switch_task: swapper/63:0 is running
> 
> without any sign of progress, and the transition eventually ends with
> "failed to complete transition".
> 
> On an s390 LPAR with 128 CPUs, not a single transition is able to
> complete and the livepatch kselftests fail. Tests on an idling x86 KVM
> instance with 128 CPUs demonstrate similar symptoms with and without
> CONFIG_NO_HZ.
> 
> To deal with that, since the runqueue is already locked in
> klp_try_switch_task(), identify idling CPUs and trigger rescheduling,
> potentially waking them up and making sure idle tasks break out of the
> do_idle() inner loop and reach klp_update_patch_state(). This helps to
> speed up transition time while avoiding unnecessary extra system load.
> 
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

Looks ok to me, but we should get an ack from Ingo or Peter since
livepatch would be calling another private scheduler interface.

> ---
> Previous discussion and RFC PATCH:
> lkml.kernel.org/r/patch.git-b76842ceb035.your-ad-here.call-01625661932-ext-1304@work.hours
> 
>  kernel/livepatch/transition.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 3a4beb9395c4..c5832b2dd081 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -308,6 +308,8 @@ static bool klp_try_switch_task(struct task_struct *task)
>  	rq = task_rq_lock(task, &flags);
>  
>  	if (task_running(rq, task) && task != current) {
> +		if (is_idle_task(task))
> +			resched_curr(rq);
>  		snprintf(err_buf, STACK_ERR_BUF_SIZE,
>  			 "%s: %s:%d is running\n", __func__, task->comm,
>  			 task->pid);
> -- 
> 2.25.4
> 

-- 
Josh

