From: michael.christie@oracle.com
To: Jason Wang <jasowang@redhat.com>, target-devel@vger.kernel.org,
	linux-scsi@vger.kernel.org, stefanha@redhat.com, pbonzini@redhat.com,
	mst@redhat.com, sgarzare@redhat.com,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH V3 11/11] vhost: allow userspace to create workers
Date: Tue, 26 Oct 2021 11:49:37 -0500	[thread overview]
Message-ID: <4d33b7e1-5efb-3729-ee15-98608704f096@oracle.com> (raw)
In-Reply-To: <8aee8f07-76bd-f111-bc5f-fc5cad46ce56@redhat.com>

On 10/26/21 12:37 AM, Jason Wang wrote:
>
> On 2021/10/22 1:19 PM, Mike Christie wrote:
>> This patch allows userspace to create workers and bind them to vqs. You
>> can have N workers per dev and also share N workers with M vqs.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>
>
> A question: who is the best one to determine the binding? Is it the VMM
> (Qemu etc) or the management stack? If the latter, it looks to me it's
> better to expose this via sysfs?

I thought it would be where you have management app settings, then the
management app talks to the qemu control interface like it does when it
adds new devices on the fly.

A problem with the management app doing it is handling the RLIMIT_NPROC
review comment. This patchset:

https://lore.kernel.org/all/20211007214448.6282-1-michael.christie@oracle.com/

basically has the kernel do a clone() from the caller's context. So adding
a worker is like doing the VHOST_SET_OWNER ioctl, where it still has to be
done from a process from which we can inherit values like the mm, cgroups,
and now RLIMITs.

>> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
>> index f7f6a3a28977..af654e3cef0e 100644
>> --- a/include/uapi/linux/vhost_types.h
>> +++ b/include/uapi/linux/vhost_types.h
>> @@ -47,6 +47,18 @@ struct vhost_vring_addr {
>>       __u64 log_guest_addr;
>>   };
>> +#define VHOST_VRING_NEW_WORKER -1
>
>
> Do we need VHOST_VRING_FREE_WORKER?
> And I wonder if using dedicated ioctls would be better:
>
> VHOST_VRING_NEW/FREE_WORKER
> VHOST_VRING_ATTACH_WORKER

We didn't need a free worker, because the kernel handles it for userspace.
I tried to make it easy for userspace because in some cases it may not be
able to do syscalls like close on the device. For example, if qemu crashes,
or for vhost-scsi, we don't do an explicit close during VM shutdown.

So we start off with the default worker thread that's used by all vqs like
we do today. Userspace can then override it by creating a new worker. That
also unbinds/detaches the existing worker and does a put on the worker's
refcount. We also do a put on the worker when we stop using it during
device shutdown/closure/release. When the worker's refcount goes to zero
the kernel deletes it.

I think separating the calls could be helpful though.
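For reference, the userspace side of the proposal might look roughly like
the sketch below. The struct layout and the create-vs-attach semantics are
assumptions based on this thread, not the merged uapi, and mock_set_vring_worker()
stands in for a real ioctl(vhost_fd, VHOST_SET_VRING_WORKER, &w) on an open
vhost device fd:

```c
/* Hypothetical uapi sketch based on this thread; not the final interface. */
#include <assert.h>

#define VHOST_VRING_NEW_WORKER -1

struct vhost_vring_worker {
	unsigned int index;	/* vq index to bind */
	int worker_id;		/* in: VHOST_VRING_NEW_WORKER to create, or an
				 * existing id to share a worker between vqs.
				 * out: id of the worker now bound to the vq. */
};

static int next_worker_id;	/* ids the mock "kernel" has handed out */

/*
 * Stand-in for the real ioctl. The kernel side would clone() a worker from
 * the caller's context (inheriting the mm, cgroups, and now RLIMITs, as
 * with VHOST_SET_OWNER) when worker_id is VHOST_VRING_NEW_WORKER, and
 * otherwise attach the already-created worker to the vq.
 */
static int mock_set_vring_worker(struct vhost_vring_worker *w)
{
	if (w->worker_id == VHOST_VRING_NEW_WORKER)
		w->worker_id = next_worker_id++;	/* "create" a worker */
	else if (w->worker_id < 0 || w->worker_id >= next_worker_id)
		return -1;				/* no such worker */
	return 0;
}
```

Usage would be one call per vq: pass VHOST_VRING_NEW_WORKER to spawn a
worker for the first vq, then pass the returned id for any vq that should
share it.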
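The lifecycle described above (default worker, override by a new one, a put
on detach and on device release, delete at refcount zero) can be modeled
with a small refcount sketch. All names here are illustrative, not the
actual vhost internals:

```c
/* Illustrative model of the worker lifecycle; not the vhost implementation. */
#include <assert.h>
#include <stdlib.h>

struct worker {
	int refcount;
};

static int workers_deleted;	/* counts kernel-side deletions for the demo */

static struct worker *worker_create(void)
{
	struct worker *w = calloc(1, sizeof(*w));

	w->refcount = 1;	/* the device's own reference */
	return w;
}

static void worker_get(struct worker *w)
{
	w->refcount++;
}

static void worker_put(struct worker *w)
{
	if (--w->refcount == 0) {	/* kernel deletes at refcount zero */
		workers_deleted++;
		free(w);
	}
}

struct vq {
	struct worker *worker;
};

/* Binding a new worker detaches the old one and puts its refcount. */
static void vq_attach_worker(struct vq *vq, struct worker *w)
{
	worker_get(w);
	if (vq->worker)
		worker_put(vq->worker);
	vq->worker = w;
}

/* On device shutdown/closure/release every remaining reference is put. */
static void vq_release(struct vq *vq)
{
	if (vq->worker) {
		worker_put(vq->worker);
		vq->worker = NULL;
	}
}
```

This is why no explicit free ioctl is needed even if qemu crashes: once the
device's release path runs, every outstanding reference gets put and the
last put deletes the worker.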