From: Stefan Hajnoczi <stefanha@redhat.com>
To: Suwan Kim <suwan.kim027@gmail.com>
Cc: mst@redhat.com, jasowang@redhat.com, pbonzini@redhat.com, virtualization@lists.linux-foundation.org, linux-block@vger.kernel.org
Subject: Re: [PATCH] virtio-blk: support polling I/O
Date: Mon, 14 Mar 2022 15:19:01 +0000	[thread overview]
Message-ID: <Yi9c5bhdDrQ1pLDY@stefanha-x1.localdomain> (raw)
In-Reply-To: <20220311152832.17703-1-suwan.kim027@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4166 bytes --]

On Sat, Mar 12, 2022 at 12:28:32AM +0900, Suwan Kim wrote:
> This patch adds polling I/O support to the virtio-blk driver. Polling
> is enabled based on the "VIRTIO_BLK_F_MQ" feature, and the number
> of polling queues can be set with the QEMU virtio-blk-pci property
> "num-poll-queues=N". This patch improves polling I/O throughput
> and latency.
>
> The virtio-blk driver does not have a poll function or poll
> queues; it has been operating in interrupt-driven mode even when
> the polling function is called from the upper layer.
>
> virtio-blk polling is implemented on top of the block layer's
> 'batched completion'. virtblk_poll() queues completed requests on
> io_comp_batch->req_list and, later, virtblk_complete_batch() calls
> the unmap function and ends the requests in a batch.
>
> virtio-blk reads the number of queues and poll queues from the QEMU
> virtio-blk-pci properties ("num-queues=N", "num-poll-queues=M").
> It allocates N virtqueues to virtio_blk->vqs[N] and uses [0..(N-M-1)]
> as default queues and [(N-M)..(N-1)] as poll queues. Unlike the default
> queues, the poll queues have no callback function.
>
> Regarding HW-SW queue mapping, the default queue mapping uses the
> existing method, which considers the MSI irq vector. But a poll queue
> has no irq, so it uses the regular blk-mq cpu mapping.
>
> To enable poll queues, the "num-poll-queues=N" property of virtio-blk-pci
> needs to be added to the QEMU command line. For that, I temporarily
> implemented the property in QEMU.
> Please refer to the git repository below.
>
> git : https://github.com/asfaca/qemu.git #on master branch commit
>
> To verify the improvement, I ran a fio polling I/O performance test
> with the io_uring engine and the options below.
> (io_uring, hipri, randread, direct=1, bs=512, iodepth=64, numjobs=N)
> I set up the VM with 4 vcpus and 4 virtio-blk queues - 2 default
> queues and 2 poll queues.
> (-device virtio-blk-pci,num-queues=4,num-poll-queues=2)
> As a result, IOPS and average latency improved by about 10%.
>
> Test result:
>
> - Fio io_uring poll without virtio-blk poll support
> -- numjobs=1 : IOPS = 297K, avg latency = 214.59us
> -- numjobs=2 : IOPS = 360K, avg latency = 363.88us
> -- numjobs=4 : IOPS = 289K, avg latency = 885.42us
>
> - Fio io_uring poll with virtio-blk poll support
> -- numjobs=1 : IOPS = 332K, avg latency = 192.61us
> -- numjobs=2 : IOPS = 371K, avg latency = 348.31us
> -- numjobs=4 : IOPS = 321K, avg latency = 795.93us

Last year there was a patch series that switched regular queues into
polling queues when HIPRI requests were in flight:
https://lore.kernel.org/linux-block/20210520141305.355961-1-stefanha@redhat.com/T/

The advantage is that polling is possible without prior device
configuration, making it easier for users. However, the dynamic
approach is a bit more complex, and bugs can result in lost irqs
(hung I/O). Christoph Hellwig asked for dedicated polling queues,
which your patch series now delivers.

I think your patch series is worth merging once the comments others
have already made have been addressed. I'll keep an eye out for the
VIRTIO spec change to extend the virtio-blk configuration space, which
needs to be accepted before the Linux changes can be merged.
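For reference, the fio options listed above correspond to a job file
roughly like this (the job name and target filename are assumptions;
the device path depends on how the disk shows up inside the guest):

```ini
; Sketch of the fio job behind the numbers above.
; hipri=1 submits polled (HIPRI) I/O, which is what exercises
; the virtio-blk poll queues.
[vblk-poll-randread]
ioengine=io_uring
hipri=1
rw=randread
direct=1
bs=512
iodepth=64
numjobs=4          ; varied 1/2/4 in the results above
filename=/dev/vdb  ; hypothetical: the virtio-blk disk in the guest
```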
> @@ -728,16 +749,82 @@ static const struct attribute_group *virtblk_attr_groups[] = {
>  static int virtblk_map_queues(struct blk_mq_tag_set *set)
>  {
>  	struct virtio_blk *vblk = set->driver_data;
> +	int i, qoff;
> +
> +	for (i = 0, qoff = 0; i < set->nr_maps; i++) {
> +		struct blk_mq_queue_map *map = &set->map[i];
> +
> +		map->nr_queues = vblk->io_queues[i];
> +		map->queue_offset = qoff;
> +		qoff += map->nr_queues;
> +
> +		if (map->nr_queues == 0)
> +			continue;
> +
> +		if (i == HCTX_TYPE_DEFAULT)
> +			blk_mq_virtio_map_queues(&set->map[i], vblk->vdev, 0);
> +		else
> +			blk_mq_map_queues(&set->map[i]);

A comment would be nice here to explain that regular queues have
interrupts and hence CPU affinity is defined by the core virtio code,
but polling queues have no interrupts so we let the block layer assign
CPU affinity.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]