From: Christian Brauner <brauner@kernel.org>
To: Mike Christie <michael.christie@oracle.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thorsten Leemhuis <linux@leemhuis.info>,
	nicolas.dichtel@6wind.com,
	Linux kernel regressions list <regressions@lists.linux.dev>,
	hch@infradead.org, stefanha@redhat.com, jasowang@redhat.com,
	mst@redhat.com, sgarzare@redhat.com,
	virtualization@lists.linux-foundation.org, ebiederm@xmission.com,
	konrad.wilk@oracle.com, linux-kernel@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH v11 8/8] vhost: use vhost_tasks for worker threads
Date: Tue, 16 May 2023 10:39:53 +0200
Message-ID: <20230516-weltmeere-backofen-27f12ae2c9e0@brauner>
In-Reply-To: <122b597e-a5fa-daf7-27bb-6f04fa98d496@oracle.com>

On Mon, May 15, 2023 at 05:23:12PM -0500, Mike Christie wrote:
> On 5/15/23 10:44 AM, Linus Torvalds wrote:
> > On Mon, May 15, 2023 at 7:23 AM Christian Brauner <brauner@kernel.org> wrote:
> >>
> >> So I think we will be able to address (1) and (2) by making vhost tasks
> >> proper threads and blocking every signal except for SIGKILL and SIGSTOP
> >> and then having vhost handle get_signal() - as you mentioned - the same
> >> way io_uring already does. We should also remove the ignore_signals
> >> thing completely imho. I don't think we ever want to do this with user
> >> workers.
> > 
> > Right. That's what IO_URING does:
> > 
> >         if (args->io_thread) {
> >                 /*
> >                  * Mark us an IO worker, and block any signal that isn't
> >                  * fatal or STOP
> >                  */
> >                 p->flags |= PF_IO_WORKER;
> >                 siginitsetinv(&p->blocked, sigmask(SIGKILL)|sigmask(SIGSTOP));
> >         }
> > 
> > and I really think that vhost should basically do exactly what io_uring does.
> > 
> > Not because io_uring fundamentally got this right - but simply because
> > io_uring had almost all the same bugs (and then some), and what the
> > io_uring worker threads ended up doing was to basically zoom in on
> > "this works".
> > 
> > And it zoomed in on it largely by just going for "make it look as much
> > as possible as a real user thread", because every time the kernel
> > thread did something different, it just caused problems.
> > 
> > So I think the patch should just look something like the attached.
> > Mike, can you test this on whatever vhost test-suite?
> 
> I tried that approach already and it doesn't work because io_uring and vhost
> differ in that vhost drivers implement a device where each device has a vhost_task
> and the drivers have a file_operations for the device. When the vhost_task's
> parent gets a signal like SIGKILL, then it will exit and call into the vhost
> driver's file_operations->release function. At this time, we need to do cleanup

But that's no reason why the vhost worker couldn't just be allowed to
exit cleanly on SIGKILL, similar to io_uring. That's just describing the
current architecture, which isn't a necessity afaict. And the helper
thread could, e.g., crash.

> like flush the device which uses the vhost_task. There is also the case where if
> the vhost_task gets a SIGKILL, we can just exit from under the vhost layer.

In a way I really don't like the patch below. Because this should be
solvable by adapting vhost workers. Right now, vhost is coming from a
kthread model and we ported it to a user worker model and the whole
point of this exercise has been that the workers behave more like
regular userspace processes. So my tendency is to not massage kernel
signal handling to now also include a special case for user workers in
addition to kthreads. That's just the wrong way around and then vhost
could've just stuck with kthreads in the first place.

So I'm fine with skipping over the freezing case for now but SIGKILL
should be handled imho. Only init and kthreads should get the luxury of
ignoring SIGKILL.
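
For reference, the io_uring mask quoted above (block everything that
isn't SIGKILL or SIGSTOP) can be modelled in userspace with the POSIX
sigset API. This is only a rough sketch of the idea, not kernel code:
worker_blocked_mask() is a hypothetical name, and the kernel itself uses
siginitsetinv() on p->blocked as shown in the quoted snippet.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Userspace analogue of
 *   siginitsetinv(&p->blocked, sigmask(SIGKILL)|sigmask(SIGSTOP));
 * Builds a set containing every signal except SIGKILL and SIGSTOP.
 * worker_blocked_mask() is a hypothetical helper, not a kernel function.
 * Returns 0 on success, -1 if any sigset call fails.
 */
static int worker_blocked_mask(sigset_t *mask)
{
	assert(mask != NULL);

	if (sigfillset(mask) != 0)	/* start with "block everything" */
		return -1;
	if (sigdelset(mask, SIGKILL) != 0)	/* fatal signal stays deliverable */
		return -1;
	if (sigdelset(mask, SIGSTOP) != 0)	/* so does STOP */
		return -1;
	return 0;
}
```

A worker set up with such a mask only ever has to deal with the fatal
and stop cases, which is what makes "exit cleanly on SIGKILL" a
tractable thing to ask of it.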

So, I'm afraid I'm asking some work of you here, but how feasible would
a model be where vhost_worker(), similar to io_wq_worker(), gracefully
handles SIGKILL? Yes, I see there's

net.c:   .release = vhost_net_release
scsi.c:  .release = vhost_scsi_release
test.c:  .release = vhost_test_release
vdpa.c:  .release = vhost_vdpa_release
vsock.c: .release = virtio_transport_release
vsock.c: .release = vhost_vsock_dev_release

but that means you have all the basic logic in place and all of those
drivers also support the VHOST_RESET_OWNER ioctl which also stops the
vhost worker. I'm confident that a lot of this can be leveraged to just
clean up on SIGKILL.

So it feels like this should be achievable by adding a callback to
struct vhost_worker that gets called when vhost_worker() gets SIGKILL
and that all the users of vhost workers are forced to implement.
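
To make the shape of that a bit more concrete, here is a small
userspace model of the idea. It is a sketch under stated assumptions,
not a proposed patch: every name in it (worker_model, worker_model_ops,
on_kill, kill_pending, demo_on_kill) is hypothetical, and the
kill_pending flag merely stands in for the worker noticing SIGKILL via
get_signal(). The two points it illustrates are that setup refuses a
user without the callback, and that the worker loop calls it and exits
instead of ignoring the signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this models the idea in userspace. */
struct worker_model;

struct worker_model_ops {
	/* Mandatory: stop new work and clean up what is still queued. */
	void (*on_kill)(struct worker_model *w);
};

struct worker_model {
	const struct worker_model_ops *ops;
	bool kill_pending;	/* stand-in for "the worker saw SIGKILL" */
	int queued;		/* work items still waiting */
	int done;		/* work items processed so far */
};

/* Refuse to set up a worker whose user did not implement the callback. */
static int worker_model_init(struct worker_model *w,
			     const struct worker_model_ops *ops, int queued)
{
	if (!ops || !ops->on_kill)
		return -1;
	w->ops = ops;
	w->kill_pending = false;
	w->queued = queued;
	w->done = 0;
	return 0;
}

/*
 * Worker loop: process queued work until idle. If a kill is pending,
 * hand control to the driver's callback and stop.
 * Returns true if the loop ended because of the kill.
 */
static bool worker_model_run(struct worker_model *w)
{
	assert(w->ops && w->ops->on_kill);

	while (w->queued > 0) {
		if (w->kill_pending) {
			w->ops->on_kill(w);
			return true;
		}
		w->queued--;
		w->done++;
	}
	return false;
}

/* Example driver callback: drop pending work as a teardown stand-in. */
static void demo_on_kill(struct worker_model *w)
{
	w->queued = 0;
}

static const struct worker_model_ops demo_ops = { .on_kill = demo_on_kill };
```

In the real thing the callback would presumably reuse whatever the
->release and VHOST_RESET_OWNER paths already do to stop the worker,
but that part is exactly the work being asked for and isn't modelled
here.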

Yes, it is more work but I think that's the right thing to do and not to
complicate our signal handling.

Worst case if this can't be done fast enough we'll have to revert the
vhost parts. I think the user worker parts are mostly sane and are
useful.
Thoughts?
