From: michael.christie@oracle.com
To: Jason Wang <jasowang@redhat.com>,
	target-devel@vger.kernel.org, linux-scsi@vger.kernel.org,
	stefanha@redhat.com, pbonzini@redhat.com, mst@redhat.com,
	sgarzare@redhat.com, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH V3 11/11] vhost: allow userspace to create workers
Date: Tue, 26 Oct 2021 11:49:37 -0500
Message-ID: <4d33b7e1-5efb-3729-ee15-98608704f096@oracle.com>
In-Reply-To: <8aee8f07-76bd-f111-bc5f-fc5cad46ce56@redhat.com>

On 10/26/21 12:37 AM, Jason Wang wrote:
> 
> On 2021/10/22 1:19 PM, Mike Christie wrote:
>> This patch allows userspace to create workers and bind them to vqs. You
>> can have N workers per dev and also share N workers with M vqs.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> 
> 
> A question: who is best placed to determine the binding? Is it the VMM (Qemu etc.) or the management stack? If the latter, it looks to me that it would be better to expose this via sysfs?

I thought the settings would live in the management app, and the
management app would then talk to the qemu control interface, like it
does when it adds new devices on the fly.

A problem with having the management app do it is that, to handle the
RLIMIT_NPROC review comment, this patchset:

https://lore.kernel.org/all/20211007214448.6282-1-michael.christie@oracle.com/

basically has the kernel do a clone() from the caller's context. So adding
a worker is like doing the VHOST_SET_OWNER ioctl: it still has to be done
from a process we can inherit values from, like the mm, cgroups, and now
RLIMITs.
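
To make the ordering concrete, the VMM side would end up doing something
like the sketch below. This is only an illustration: VHOST_SET_VRING_WORKER
is the ioctl name from the companion QEMU patch in this series, and the
struct layout/field names are my guesses since the full hunk isn't quoted
here.

#include <sys/ioctl.h>
#include <linux/vhost.h>	/* assumes the patched uapi headers from this series */

static int vhost_setup_vq_workers(int vhost_fd, unsigned int nvqs)
{
	unsigned int i;

	/*
	 * Like VHOST_SET_OWNER, the worker ioctl has to be issued from the
	 * VMM's own context so the new threads inherit this process's mm,
	 * cgroups, and now RLIMITs.
	 */
	if (ioctl(vhost_fd, VHOST_SET_OWNER) < 0)
		return -1;

	for (i = 0; i < nvqs; i++) {
		/*
		 * Field names are a guess at the struct from the truncated
		 * hunk quoted below. worker_id = VHOST_VRING_NEW_WORKER asks
		 * the kernel to create a new worker and bind it to vq i.
		 */
		struct vhost_vring_worker w = {
			.index = i,
			.worker_id = VHOST_VRING_NEW_WORKER,
		};

		if (ioctl(vhost_fd, VHOST_SET_VRING_WORKER, &w) < 0)
			return -1;
	}

	return 0;
}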


>> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
>> index f7f6a3a28977..af654e3cef0e 100644
>> --- a/include/uapi/linux/vhost_types.h
>> +++ b/include/uapi/linux/vhost_types.h
>> @@ -47,6 +47,18 @@ struct vhost_vring_addr {
>>       __u64 log_guest_addr;
>>   };
>>   +#define VHOST_VRING_NEW_WORKER -1
> 
> 
> Do we need VHOST_VRING_FREE_WORKER? And I wonder if using dedicated ioctls would be better:
> 
> VHOST_VRING_NEW/FREE_WORKER
> VHOST_VRING_ATTACH_WORKER


We didn't need a free-worker call, because the kernel handles that for
userspace. I tried to make it easy for userspace, because in some cases it
may not be able to do syscalls like close on the device. For example, qemu
might crash, and for vhost-scsi we don't do an explicit close during VM
shutdown.

So we start off with the default worker thread that's used by all vqs, like
we do today. Userspace can then override it by creating a new worker. That
also unbinds/detaches the existing worker and does a put on that worker's
refcount. We also do a put on a worker when we stop using it during device
shutdown/closure/release. When a worker's refcount goes to zero, the kernel
deletes it.
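
In other words, the kernel-side lifetime handling is roughly like the
sketch below. This is simplified, and the helper and field names here are
illustrative rather than the exact code in the patch:

/* Simplified sketch of the lifetime handling, not the actual patch code. */
static void vhost_worker_put(struct vhost_worker *worker)
{
	if (!worker)
		return;
	/* Last user gone: stop the thread and free the worker. */
	if (refcount_dec_and_test(&worker->refcount)) {
		kthread_stop(worker->task);
		kfree(worker);
	}
}

static void vhost_vq_attach_worker(struct vhost_virtqueue *vq,
				   struct vhost_worker *new_worker)
{
	struct vhost_worker *old = vq->worker;

	vq->worker = new_worker;	/* bind the new worker to the vq */
	vhost_worker_put(old);		/* drop the vq's ref on the old one */
}

/*
 * Device release/close does the same put for every vq, so the final
 * reference always gets dropped even if userspace never cleans up.
 */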

I think separating the calls could be helpful though.
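Something like this, just to show the shape (the names, ioctl numbers, and
struct layouts below are placeholders, not a worked-out uapi proposal):

/* Hypothetical split of the interface -- everything here is a placeholder. */
struct vhost_worker_state {
	int worker_id;		/* returned by NEW, passed to FREE/ATTACH */
};

struct vhost_vring_worker_attach {
	unsigned int index;	/* vq index */
	int worker_id;		/* worker to attach to that vq */
};

#define VHOST_NEW_WORKER	  _IOR(VHOST_VIRTIO, 0x85, struct vhost_worker_state)
#define VHOST_FREE_WORKER	  _IOW(VHOST_VIRTIO, 0x86, struct vhost_worker_state)
#define VHOST_ATTACH_VRING_WORKER _IOW(VHOST_VIRTIO, 0x87, struct vhost_vring_worker_attach)

FREE would then just drop a reference, so the kernel could still do the
final cleanup on device release as described above.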


Thread overview: 74+ messages

2021-10-22  5:18 [PATCH V3 00/11] vhost: multiple worker support Mike Christie
2021-10-22  5:19 ` [PATCH] QEMU vhost-scsi: add support for VHOST_SET_VRING_WORKER Mike Christie
2021-10-22  5:19 ` [PATCH V3 01/11] vhost: add vhost_worker pointer to vhost_virtqueue Mike Christie
2021-10-22  5:19 ` [PATCH V3 02/11] vhost, vhost-net: add helper to check if vq has work Mike Christie
2021-10-22  5:19 ` [PATCH V3 03/11] vhost: take worker or vq instead of dev for queueing Mike Christie
2021-10-22  5:19 ` [PATCH V3 04/11] vhost: take worker or vq instead of dev for flushing Mike Christie
2021-10-22  5:19 ` [PATCH V3 05/11] vhost: convert poll work to be vq based Mike Christie
2021-10-22  5:19 ` [PATCH V3 06/11] vhost-sock: convert to vq helpers Mike Christie
2021-10-25  9:08   ` Stefano Garzarella
2021-10-25 16:09     ` michael.christie
2021-10-22  5:19 ` [PATCH V3 07/11] vhost-scsi: make SCSI cmd completion per vq Mike Christie
2021-10-22  5:19 ` [PATCH V3 08/11] vhost-scsi: convert to vq helpers Mike Christie
2021-10-22  5:19 ` [PATCH V3 09/11] vhost-scsi: flush IO vqs then send TMF rsp Mike Christie
2021-10-22  5:19 ` [PATCH V3 10/11] vhost: remove device wide queu/flushing helpers Mike Christie
2021-10-22  5:19 ` [PATCH V3 11/11] vhost: allow userspace to create workers Mike Christie
2021-10-22 10:47   ` Michael S. Tsirkin
2021-10-22 16:12     ` michael.christie
2021-10-22 18:17       ` michael.christie
2021-10-23 20:11         ` Michael S. Tsirkin
2021-10-25 16:04           ` michael.christie
2021-10-25 17:14             ` Michael S. Tsirkin
2021-10-26  5:37   ` Jason Wang
2021-10-26 13:09     ` Michael S. Tsirkin
2021-10-26 16:36       ` Stefan Hajnoczi
2021-10-26 15:44     ` Stefan Hajnoczi
2021-10-27  2:55       ` Jason Wang
2021-10-27  9:01         ` Stefan Hajnoczi
2021-10-26 16:49     ` michael.christie [this message]
2021-10-27  6:02       ` Jason Wang
2021-10-27  9:03       ` Stefan Hajnoczi
2021-10-26 15:22   ` Stefan Hajnoczi
2021-10-26 15:24   ` Stefan Hajnoczi
2021-10-22  6:02 ` [PATCH V3 00/11] vhost: multiple worker support michael.christie
2021-10-22  9:49   ` Michael S. Tsirkin
2021-10-22  9:48 ` Michael S. Tsirkin
2021-10-22 15:54   ` michael.christie
2021-10-23 20:12     ` Michael S. Tsirkin
