From: Rik van Riel <riel@fb.com>
To: "pmladek@suse.com" <pmladek@suse.com>, Song Liu <songliubraving@fb.com>
Cc: "joe.lawrence@redhat.com" <joe.lawrence@redhat.com>,
	"song@kernel.org" <song@kernel.org>,
	"jpoimboe@redhat.com" <jpoimboe@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"live-patching@vger.kernel.org" <live-patching@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched
Date: Tue, 10 May 2022 13:33:13 +0000
Message-ID: <3a9bfb4a52b715bd8739d8834409c9549ec7f22f.camel@fb.com>
In-Reply-To: <YnoawYtoCSvrK7lb@alley>

On Tue, 2022-05-10 at 09:56 +0200, Petr Mladek wrote:
> 
> IMHO, the problem is that klp_transition_work_fn() tries the
> transition "only" once per second, see
> 
> void klp_try_complete_transition(void)
> {
> [...]
>                 schedule_delayed_work(&klp_transition_work,
>                                       round_jiffies_relative(HZ));
> [...]
> }
> 
> It means that there are "only" 60 attempts to migrate the busy
> process.
> It fails when the process is in the running state or sleeping in a
> livepatched function. There is a _non-zero_ chance of bad luck.
> 

We are definitely hitting that non-zero chance :)
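
As a rough back-of-the-envelope illustration of how large that
chance can be: the sketch below assumes each once-per-second
attempt fails independently with the same probability, which a
real workload will not satisfy exactly. The helper name is made
up for this example and is not a kernel function.

```c
/*
 * Hypothetical helper, not part of the kernel: probability that
 * every one of `attempts` transition tries fails, ASSUMING each try
 * fails independently with probability p_fail.
 */
double klp_all_attempts_fail(double p_fail, unsigned int attempts)
{
	double p = 1.0;
	unsigned int i;

	/* Multiply the per-attempt failure probability once per attempt. */
	for (i = 0; i < attempts; i++)
		p *= p_fail;

	return p;
}
```

Under that assumption, a task that is found running or sleeping
in a livepatched function 99% of the time still defeats all 60
attempts more than half of the time:
klp_all_attempts_fail(0.99, 60) is a little under 0.55.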

> Anyway, the limit 60s looks like a bad idea to me. It is too low.

That has its own issues, though. System management software
tracks whether kpatch succeeds, and a run of the system
management software will not complete until all of the commands
it has run have completed.

One reason for this is that you never want your system
management software to come even close to resembling a fork
bomb, which is what can happen if it is allowed to just fork
more and more things that might potentially get stuck :)

Rollout of the next config change to a system should not be
blocked on KLP completion.

I think the best approach for us might be to just track what
is causing the transition failures, and send in trivial patches
to make the outer loop in such kernel threads do the same KLP
transition the idle task already does.
