From: Seth Forshee <sforshee@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>,
Petr Mladek <pmladek@suse.com>, Jason Wang <jasowang@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Jiri Kosina <jikos@kernel.org>, Miroslav Benes <mbenes@suse.cz>,
Joe Lawrence <joe.lawrence@redhat.com>,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
netdev@vger.kernel.org, live-patching@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] vhost: improve livepatch switching for heavily loaded vhost worker kthreads
Date: Fri, 27 Jan 2023 14:02:40 -0600
Message-ID: <Y9Qt4BT2WXK2dToL@do-x1extreme>
In-Reply-To: <Y9OpTtqWjAkC2pal@hirez.programming.kicks-ass.net>
On Fri, Jan 27, 2023 at 11:37:02AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 26, 2023 at 08:43:55PM -0800, Josh Poimboeuf wrote:
> > On Thu, Jan 26, 2023 at 03:12:35PM -0600, Seth Forshee (DigitalOcean) wrote:
> > > On Thu, Jan 26, 2023 at 06:03:16PM +0100, Petr Mladek wrote:
> > > > On Fri 2023-01-20 16:12:20, Seth Forshee (DigitalOcean) wrote:
> > > > > > We've fairly regularly seen livepatches which cannot transition within kpatch's
> > > > > timeout period due to busy vhost worker kthreads.
> > > >
> > > > I have missed this detail. Miroslav told me that we have solved
> > > > something similar some time ago, see
> > > > https://lore.kernel.org/all/20220507174628.2086373-1-song@kernel.org/
> > >
> > > Interesting thread. I had thought about something along the lines of the
> > > original patch, but there are some ideas in there that I hadn't
> > > considered.
> >
> > Here's another idea, have we considered this? Have livepatch set
> > TIF_NEED_RESCHED on all kthreads to force them into schedule(), and then
> > have the scheduler call klp_try_switch_task() if TIF_PATCH_PENDING is
> > set.
> >
> > Not sure how scheduler folks would feel about that ;-)
>
> So, let me try and page all that back in.... :-)
>
> KLP needs to unwind the stack to see if any of the patched functions are
> active, if not, flip task to new set.
>
> Unwinding the stack of a task can be done when:
>
> - task is inactive (stable reg and stack) -- provided it stays inactive
> while unwinding etc..
>
> - task is current (guarantees stack doesn't dip below where we started
> due to being busy on top etc..)
>
> Can NOT be done from interrupt context, because can hit in the middle of
> setting up stack frames etc..
>
> The issue at hand is that some tasks run for a long time without passing
> through an explicit check.
>
> The thread above tried sticking something in cond_resched() which is a
> problem for PREEMPT=y since cond_resched() is a no-op.
>
> Preempt notifiers were raised, and those would actually be nice, except
> you can only install a notifier on current and you need some memory
> allocated per task, which makes it less than ideal. Plus ...
>
> ... putting something in finish_task_switch() wouldn't be the end of the
> world I suppose, but then you still need to force schedule the task --
> imagine it being the only runnable task on the CPU, there's nothing
> going to make it actually switch.
>
> Which then leads me to suggest something daft like this.. does that
> help?
The changes below are working well for me. klp_send_signals() holds a
read lock on tasklist_lock at that point, so it can't sleep; that's why
I'm using stop_one_cpu_nowait() with per-CPU state.
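As a rough illustration of that constraint: stop_one_cpu_nowait() only
queues the callback and returns, and it leaves it to the caller to make
sure the work buffer stays untouched until the stopper has picked it up.
The userspace sketch below models just that contract. Every name in it
(toy_work, toy_queue_nowait, toy_run_stopper, toy_count_call,
TOY_NR_CPUS) is hypothetical, and the "pending" flag is only the model's
way of making the rule visible; the real API does not check it for you.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4	/* hypothetical number of modeled CPUs */

typedef int (*toy_fn_t)(void *arg);

/* Hypothetical stand-in for a caller-owned work buffer. */
struct toy_work {
	toy_fn_t fn;
	void *arg;
	bool pending;		/* true from queueing until the stopper dequeues it */
	struct toy_work *next;
};

/* Head of the pending list for each modeled CPU. */
static struct toy_work *toy_queue[TOY_NR_CPUS];

/*
 * Queue fn(arg) for the given CPU and return immediately, without
 * waiting for it to run. Refuses a buffer that is still queued or a
 * CPU number that is out of range.
 */
static bool toy_queue_nowait(unsigned int cpu, toy_fn_t fn, void *arg,
			     struct toy_work *buf)
{
	struct toy_work **tail;

	if (cpu >= TOY_NR_CPUS || buf->pending)
		return false;

	buf->fn = fn;
	buf->arg = arg;
	buf->pending = true;
	buf->next = NULL;

	/* Append at the tail so callbacks run in queueing order. */
	for (tail = &toy_queue[cpu]; *tail; tail = &(*tail)->next)
		;
	*tail = buf;
	return true;
}

/* Model the stopper on one CPU; returns how many callbacks it ran. */
static int toy_run_stopper(unsigned int cpu)
{
	int ran = 0;

	if (cpu >= TOY_NR_CPUS)
		return 0;

	while (toy_queue[cpu]) {
		struct toy_work *w = toy_queue[cpu];

		toy_queue[cpu] = w->next;
		w->pending = false;	/* buffer may be reused from here on */
		w->fn(w->arg);
		ran++;
	}
	return ran;
}

/* Example callback: bump the counter that arg points to. */
static int toy_count_call(void *arg)
{
	++*(int *)arg;
	return 0;
}
```

In this model a single buffer can serve only one in-flight request at a
time, so a caller that may have several requests outstanding needs one
buffer per request.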
Based on Josh's comments it sounds like the klp_have_reliable_stack()
check probably isn't quite right, and we might want to add something
else for PREEMPT+!ORC. But I wanted to go ahead and see if this approach
seems reasonable to everyone.
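For anyone skimming the diff, the per-task decision it introduces in
klp_send_signals() can be summarized with the small userspace model
below. It is a sketch only: the toy_* names are hypothetical, the
boolean fields stand in for the kernel checks named in the comments, and
the reliable-stack condition is modeled exactly as the patch has it
today, which per the note above may still change.

```c
#include <stdbool.h>

/* Hypothetical labels for what the loop does with one task. */
enum toy_action {
	TOY_NONE,		/* no patch pending, task is skipped */
	TOY_FAKE_SIGNAL,	/* non-kthread task: send the fake signal */
	TOY_PREEMPT_ON_CPU,	/* running kthread: queue stopper work on its CPU */
	TOY_WAKE_UP,		/* other kthread: wake it if sleeping interruptibly */
};

/* Hypothetical snapshot of the task state the decision looks at. */
struct toy_task {
	bool patch_pending;	/* models TIF_PATCH_PENDING being set */
	bool is_kthread;	/* models task->flags & PF_KTHREAD */
	bool on_cpu;		/* models task_curr(task) */
};

/* reliable_stack models the result of klp_have_reliable_stack(). */
static enum toy_action toy_pick_action(const struct toy_task *t,
				       bool reliable_stack)
{
	if (!t->patch_pending)
		return TOY_NONE;
	if (!t->is_kthread)
		return TOY_FAKE_SIGNAL;
	if (t->on_cpu && reliable_stack)
		return TOY_PREEMPT_ON_CPU;
	return TOY_WAKE_UP;
}
```

A busy vhost worker corresponds to a task with all three fields set; in
the model it is preempted only when the stack is considered reliable and
otherwise falls back to the existing wake-up path.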
Thanks,
Seth
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index f1b25ec581e0..9f3898f02828 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -9,6 +9,7 @@
 
 #include <linux/cpu.h>
 #include <linux/stacktrace.h>
+#include <linux/stop_machine.h>
 #include "core.h"
 #include "patch.h"
 #include "transition.h"
@@ -334,9 +335,16 @@ static bool klp_try_switch_task(struct task_struct *task)
 	return !ret;
 }
 
+static int __try_switch_kthread(void *arg)
+{
+	return klp_try_switch_task(arg) ? 0 : -EBUSY;
+}
+
+static DEFINE_PER_CPU(struct cpu_stop_work, klp_stop_work);
+
 /*
  * Sends a fake signal to all non-kthread tasks with TIF_PATCH_PENDING set.
- * Kthreads with TIF_PATCH_PENDING set are woken up.
+ * Kthreads with TIF_PATCH_PENDING set are preempted or woken up.
  */
 static void klp_send_signals(void)
 {
@@ -357,11 +365,22 @@ static void klp_send_signals(void)
 		 * would be meaningless. It is not serious though.
 		 */
 		if (task->flags & PF_KTHREAD) {
-			/*
-			 * Wake up a kthread which sleeps interruptedly and
-			 * still has not been migrated.
-			 */
-			wake_up_state(task, TASK_INTERRUPTIBLE);
+			if (task_curr(task) && klp_have_reliable_stack()) {
+				/*
+				 * kthread is currently running on a CPU; try
+				 * to preempt it.
+				 */
+				stop_one_cpu_nowait(task_cpu(task),
+						    __try_switch_kthread,
+						    task,
+						    this_cpu_ptr(&klp_stop_work));
+			} else {
+				/*
+				 * Wake up a kthread which sleeps interruptedly
+				 * and still has not been migrated.
+				 */
+				wake_up_state(task, TASK_INTERRUPTIBLE);
+			}
 		} else {
 			/*
 			 * Send fake signal to all non-kthread tasks which are