From: "Michael S. Tsirkin" <mst@redhat.com>
To: Mike Christie <michael.christie@oracle.com>
Cc: target-devel@vger.kernel.org, linux-scsi@vger.kernel.org,
	stefanha@redhat.com, pbonzini@redhat.com, jasowang@redhat.com,
	sgarzare@redhat.com, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH V3 00/11] vhost: multiple worker support
Date: Fri, 22 Oct 2021 05:48:18 -0400	[thread overview]
Message-ID: <20211022054625-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20211022051911.108383-1-michael.christie@oracle.com>

On Fri, Oct 22, 2021 at 12:18:59AM -0500, Mike Christie wrote:
> The following patches apply on top of Linus's tree and this patchset:
> 
> https://lore.kernel.org/all/20211007214448.6282-1-michael.christie@oracle.com/
> 
> which allows us to check the vhost owner thread's RLIMITs.


Unfortunately that patchset in turn triggers kbuild warnings.
I was hoping you would address them; I don't think we can
merge that patchset before the kbuild issues are resolved.

It also doesn't have many acks, and I'm a bit apprehensive
about merging core changes like this through the vhost tree.
Could you CC more widely and ping people?

> It looks like that patchset has been ok'd by all the major parties
> and just needs a small cleanup to apply to Jens's and Paul's trees, so I
> wanted to post my threading patches based on it for review.
> 
> The following patches allow us to support multiple vhost workers per
> device. I ended up just doing Stefan's original idea, where userspace has
> the kernel create a worker and we pass back the pid. Compared to the
> workqueue and userspace-thread approaches, this has the benefit that we
> only have one-ish code path in the kernel: only setup needs to detect old
> tools, and the main IO paths and the device/vq setup/teardown paths all
> use common code.
> 
> I've also included a patch for qemu so you can get an idea of how it
> works. If we are ok with the kernel code, I'll break that qemu patch up
> into a patchset and send it to qemu-devel for review.
> 
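To make the flow described above concrete, here is a minimal userspace
sketch of creating a worker for one virtqueue and sharing it with another.
The struct layout, the "new worker" sentinel, and the ioctl number below
are placeholders invented for illustration; only the VHOST_SET_VRING_WORKER
name comes from this series, and the real UAPI is whatever patch 11/11 and
the QEMU patch define.

/*
 * Illustrative sketch only: the struct layout, the "new worker"
 * sentinel, and the ioctl number below are made up for this example.
 * See patch 11/11 and the QEMU patch for the real definitions.
 */
#include <sys/ioctl.h>          /* ioctl() */
#include <sys/types.h>          /* pid_t */
#include <linux/ioctl.h>        /* _IOWR() */

struct vhost_vring_worker {
        unsigned int index;     /* virtqueue index */
        pid_t pid;              /* worker pid, or the "new worker" sentinel */
};

#define VHOST_VRING_NEW_WORKER  (-1)    /* assumed "create a worker" value */
/* 0xAF is the vhost ioctl magic; 0x15 is a placeholder request number */
#define VHOST_SET_VRING_WORKER  _IOWR(0xAF, 0x15, struct vhost_vring_worker)

/* Ask the kernel to create a worker for vq_index and return its pid. */
static pid_t vhost_new_vq_worker(int vhost_fd, unsigned int vq_index)
{
        struct vhost_vring_worker w = {
                .index = vq_index,
                .pid = VHOST_VRING_NEW_WORKER,
        };

        if (ioctl(vhost_fd, VHOST_SET_VRING_WORKER, &w) < 0)
                return -1;
        return w.pid;           /* filled in by the kernel */
}

/* Bind another vq to a worker that was created earlier. */
static int vhost_bind_vq_worker(int vhost_fd, unsigned int vq_index, pid_t pid)
{
        struct vhost_vring_worker w = { .index = vq_index, .pid = pid };

        return ioctl(vhost_fd, VHOST_SET_VRING_WORKER, &w);
}

With an interface along these lines, a tool could give its heaviest vqs
dedicated workers while letting the rest share one, which matches the
"worker per vq" configuration in the tables below.
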
> Results:
> --------
> 
> fio jobs        1       2       4       8       12      16
> ----------------------------------------------------------
> 1 worker        84k    492k    510k    -       -       -
> worker per vq   184k   380k    744k    1422k   2256k   2434k
> 
> Notes:
> 0. This used a simple fio command:
> 
> fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
> --ioengine=libaio --iodepth=128  --numjobs=$JOBS_ABOVE
> 
> and I used a VM with 16 vCPUs and 16 virtqueues.
> 
> 1. The patches were tested with emulate_pr=0 and these patches:
> 
> https://lore.kernel.org/all/yq1tuhge4bg.fsf@ca-mkp.ca.oracle.com/t/
> 
> which are in mkp's scsi branches for the next kernel. They fix the perf
> issue where IOPS dropped at 12 vqs/jobs.
> 
> 2. Because we have a hard limit of 1024 cmds, when num jobs * iodepth
> would have exceeded 1024 I decreased the iodepth, so 12 jobs used an
> iodepth of 85 (12 * 85 = 1020) and 16 jobs used 64 (16 * 64 = 1024).
> 
> 3. The perf issue above at 2 jobs occurs because, when we only have 1
> worker, we execute more cmds per vhost_work since all vqs funnel into one
> worker. This results in fewer context switches and better batching without
> having to tweak any settings. I'm working on patches to add back batching
> during lio completion and to do polling on the submission side.
> 
> We will still want the threading patches, because if we batch at the fio
> level and also use the vhost threading patches, we see a big boost like
> the one below. So hopefully doing the batching in the kernel will let apps
> just work without having to be as smart as fio.
> 
> fio using io_uring and batching with the iodepth_batch* settings:
> 
> fio jobs        1       2       4       8       12      16
> -------------------------------------------------------------
> 1 worker        494k    520k    -       -       -       -
> worker per vq   496k    878k    1542k   2436k   2304k   2590k
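
For reference, fio's submission/completion batching is controlled with the
iodepth_batch* options; a command along these lines exercises them with the
io_uring engine (the specific batch values here are just an illustration,
not the exact job used for the numbers above):

fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k \
    --ioengine=io_uring --iodepth=128 --numjobs=$JOBS_ABOVE \
    --iodepth_batch_submit=16 --iodepth_batch_complete_min=16 \
    --iodepth_batch_complete_max=32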
> 
> V3:
> - fully convert the vhost code to use vq-based APIs instead of leaving it
> half per device and half per vq.
> - rebase against kernel worker API.
> - Drop delayed worker creation. We always create the default worker at
> VHOST_SET_OWNER time. Userspace can create and bind workers after that.
> 
> v2:
> - changed the loop in which we take a refcount on the worker.
> - replaced pid == -1 with a define.
> - fixed a tabbing/spacing coding style issue.
> - switched to a hash instead of a list to look up workers.
> - dropped the patch that added an ioctl cmd to get a vq's worker's pid;
> it sounds like we might do a generic netlink interface instead.
> 

Thread overview: 37+ messages
2021-10-22  5:18 [PATCH V3 00/11] vhost: multiple worker support Mike Christie
2021-10-22  5:19 ` [PATCH] QEMU vhost-scsi: add support for VHOST_SET_VRING_WORKER Mike Christie
2021-10-22  5:19 ` [PATCH V3 01/11] vhost: add vhost_worker pointer to vhost_virtqueue Mike Christie
2021-10-22  5:19 ` [PATCH V3 02/11] vhost, vhost-net: add helper to check if vq has work Mike Christie
2021-10-22  5:19 ` [PATCH V3 03/11] vhost: take worker or vq instead of dev for queueing Mike Christie
2021-10-22  5:19 ` [PATCH V3 04/11] vhost: take worker or vq instead of dev for flushing Mike Christie
2021-10-22  5:19 ` [PATCH V3 05/11] vhost: convert poll work to be vq based Mike Christie
2021-10-22  5:19 ` [PATCH V3 06/11] vhost-sock: convert to vq helpers Mike Christie
2021-10-25  9:08   ` Stefano Garzarella
2021-10-25 16:09     ` michael.christie
2021-10-22  5:19 ` [PATCH V3 07/11] vhost-scsi: make SCSI cmd completion per vq Mike Christie
2021-10-22  5:19 ` [PATCH V3 08/11] vhost-scsi: convert to vq helpers Mike Christie
2021-10-22  5:19 ` [PATCH V3 09/11] vhost-scsi: flush IO vqs then send TMF rsp Mike Christie
2021-10-22  5:19 ` [PATCH V3 10/11] vhost: remove device wide queu/flushing helpers Mike Christie
2021-10-22  5:19 ` [PATCH V3 11/11] vhost: allow userspace to create workers Mike Christie
2021-10-22 10:47   ` Michael S. Tsirkin
2021-10-22 16:12     ` michael.christie
2021-10-22 18:17       ` michael.christie
2021-10-23 20:11         ` Michael S. Tsirkin
2021-10-25 16:04           ` michael.christie
2021-10-25 17:14             ` Michael S. Tsirkin
2021-10-26  5:37   ` Jason Wang
2021-10-26 13:09     ` Michael S. Tsirkin
2021-10-26 16:36       ` Stefan Hajnoczi
2021-10-26 15:44     ` Stefan Hajnoczi
2021-10-27  2:55       ` Jason Wang
2021-10-27  9:01         ` Stefan Hajnoczi
2021-10-26 16:49     ` michael.christie
2021-10-27  6:02       ` Jason Wang
2021-10-27  9:03       ` Stefan Hajnoczi
2021-10-26 15:22   ` Stefan Hajnoczi
2021-10-26 15:24   ` Stefan Hajnoczi
2021-10-22  6:02 ` [PATCH V3 00/11] vhost: multiple worker support michael.christie
2021-10-22  9:49   ` Michael S. Tsirkin
2021-10-22  9:48 ` Michael S. Tsirkin [this message]
2021-10-22 15:54   ` michael.christie
2021-10-23 20:12     ` Michael S. Tsirkin
