From: Mike Christie <michael.christie@oracle.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: target-devel@vger.kernel.org, linux-scsi@vger.kernel.org,
	pbonzini@redhat.com, jasowang@redhat.com, mst@redhat.com,
	sgarzare@redhat.com, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH 7/9] vhost: allow userspace to create workers
Date: Thu, 17 Jun 2021 21:49:07 -0500	[thread overview]
Message-ID: <c98dd627-f330-b49c-6716-1caf737c73da@oracle.com> (raw)
In-Reply-To: <YMHH+das0nmMBbt5@stefanha-x1.localdomain>

On 6/10/21 3:06 AM, Stefan Hajnoczi wrote:
> On Wed, Jun 09, 2021 at 04:03:55PM -0500, Mike Christie wrote:
>> On 6/7/21 10:19 AM, Stefan Hajnoczi wrote:
>>> My concern is that threads should probably be accounted against
>>> RLIMIT_NPROC and max_threads rather than something indirect like 128 *
>>> RLIMIT_NOFILE (a userspace process can only have RLIMIT_NOFILE
>>> vhost-user file descriptors open).
>>>
>>
>> Ah ok, I think I see what you want.
>>
>> Ok, I think the options are:
>>
>> 0. Nothing. Just use existing indirect/RLIMIT_NOFILE.
>>
>> 1. Do something like io_uring's create_io_thread/copy_process. If we call
>> copy_process from the vhost ioctl context, then the userspace process that
>> did the ioctl will have its process count incremented and checked against
>> its rlimit.
>>
>> The drawbacks:
>> - This gets a little more complicated than just calling copy_process though.
>> We end up duplicating a lot of the kthread API.
>> - We have to deal with new error cases like the parent exiting early.
>> - I think all devs sharing a worker have to have the same owner. kthread_use_mm
>> and kthread_unuse_mm to switch between mm's for different owners' devs seem to
>> be causing lots of errors. I'm still looking into this one though.
>>
>> 2. It's not really what you want, but for unbound work io_uring has a check for
>> RLIMIT_NPROC in the io_uring code. It does:
>>
>> wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
>> 					task_rlimit(current, RLIMIT_NPROC);
>>
>> then does:
>>
>> if (!ret && acct->nr_workers < acct->max_workers) {
>>
>> Drawbacks:
>> In vhost.c, we could do something similar. It would make sure that vhost.c does
>> not create more worker threads than the rlimit value, but we wouldn't be
>> incrementing the userspace process's process count. The userspace process could
>> then create RLIMIT_NPROC threads and vhost.c could also create RLIMIT_NPROC
>> threads, so we end up with 2 * RLIMIT_NPROC threads.
> 
> Yes, in that case we might as well go with Option 0, so I think this
> option can be eliminated.
> 
>> 3. Change the kthread and copy_process code so we can pass in the thread
>> (or its creds, or some struct holding the values to check) that needs to be
>> checked and updated.
>>
>> Drawback:
>> This might be considered too ugly given what a special case vhost is. For example, we
>> need checks/code like the io_thread/PF_IO_WORKER code in copy_process for io_uring.
>> I can see how adding that was justified for io_uring because it affects so many users,
>> but vhost may not be special enough.
> 
> This seems like the most general solution. If you try it and get
> negative feedback then maybe the maintainers can help suggest how to
> solve this problem :).

Hey, I implemented #3 here:

https://github.com/mikechristie/linux/commit/76f7a555a85147420a22d0163c15259e01e02193

in this patchset:

https://github.com/mikechristie/linux/commits/kthread-node-user

but before I post I wanted to bring up an option 4 someone mentioned to me
offlist.

Again it's io_uring. Check out fs/io_uring.c:__io_account_mem(). For RLIMIT_MEMLOCK
it just does the check and increments the user's counter itself. It's simple like
option 2, and it handles the issue where the process doing the ioctl wasn't having
its RLIMIT_NPROC checked/updated.






Thread overview: 64+ messages
2021-05-25 18:05 vhost: multiple worker support Mike Christie
2021-05-25 18:05 ` [PATCH 1/9] vhost: move worker thread fields to new struct Mike Christie
2021-06-03 10:16   ` Stefan Hajnoczi
2021-05-25 18:05 ` [PATCH 2/9] vhost: move vhost worker creation to kick setup Mike Christie
2021-06-03 10:28   ` Stefan Hajnoczi
2021-05-25 18:05 ` [PATCH 3/9] vhost: modify internal functions to take a vhost_worker Mike Christie
2021-06-03 10:45   ` Stefan Hajnoczi
2021-05-25 18:05 ` [PATCH 4/9] vhost: allow vhost_polls to use different vhost_workers Mike Christie
2021-06-03 13:51   ` Stefan Hajnoczi
2021-05-25 18:05 ` [PATCH 5/9] vhost-scsi: flush IO vqs then send TMF rsp Mike Christie
2021-06-03 13:54   ` Stefan Hajnoczi
2021-05-25 18:05 ` [PATCH 6/9] vhost-scsi: make SCSI cmd completion per vq Mike Christie
2021-06-03 13:57   ` Stefan Hajnoczi
2021-05-25 18:05 ` [PATCH 7/9] vhost: allow userspace to create workers Mike Christie
2021-06-03 14:30   ` Stefan Hajnoczi
2021-06-05 23:53     ` michael.christie
2021-06-07 15:19       ` Stefan Hajnoczi
2021-06-09 21:03         ` Mike Christie
2021-06-10  8:06           ` Stefan Hajnoczi
2021-06-18  2:49             ` Mike Christie [this message]
2021-06-21 13:41               ` Stefan Hajnoczi
2021-05-25 18:05 ` [PATCH 8/9] vhost: add vhost_dev pointer to vhost_work Mike Christie
2021-06-03 14:31   ` Stefan Hajnoczi
2021-05-25 18:06 ` [PATCH 9/9] vhost: support sharing workers across devs Mike Christie
2021-06-03 14:32   ` Stefan Hajnoczi
2021-06-07  2:18     ` Jason Wang
2021-06-03 10:13 ` vhost: multiple worker support Stefan Hajnoczi
2021-06-03 18:45   ` Mike Christie
2021-06-03 14:37 ` Stefan Hajnoczi
2021-06-03 22:16   ` Mike Christie
2021-06-05 22:40     ` michael.christie
2021-06-07 15:23       ` Stefan Hajnoczi
