From: Stefan Hajnoczi <stefanha@redhat.com>
To: Suwan Kim <suwan.kim027@gmail.com>
Cc: mst@redhat.com, jasowang@redhat.com, pbonzini@redhat.com,
	virtualization@lists.linux-foundation.org,
	linux-block@vger.kernel.org
Subject: Re: [PATCH] virtio-blk: support polling I/O
Date: Mon, 14 Mar 2022 15:19:01 +0000	[thread overview]
Message-ID: <Yi9c5bhdDrQ1pLDY@stefanha-x1.localdomain> (raw)
In-Reply-To: <20220311152832.17703-1-suwan.kim027@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4166 bytes --]

On Sat, Mar 12, 2022 at 12:28:32AM +0900, Suwan Kim wrote:
> This patch adds polling I/O support to the virtio-blk driver. The
> polling feature is enabled based on the "VIRTIO_BLK_F_MQ" feature,
> and the number of polling queues can be set via the QEMU
> virtio-blk-pci property "num-poll-queues=N". This patch improves
> polling I/O throughput and latency.
> 
> The virtio-blk driver doesn't have a poll function or poll queues,
> and it has been operating in an interrupt-driven manner even when
> the polling function is called from the upper layer.
> 
> virtio-blk polling is implemented on top of the block layer's
> 'batched completion'. virtblk_poll() queues completed requests to
> io_comp_batch->req_list, and later virtblk_complete_batch() calls
> the unmap function and ends the requests in a batch.
> 
> virtio-blk reads the number of queues and poll queues from QEMU
> virtio-blk-pci properties ("num-queues=N", "num-poll-queues=M").
> It allocates N virtqueues to virtio_blk->vqs[N] and it uses [0..(N-M-1)]
> as default queues and [(N-M)..(N-1)] as poll queues. Unlike the default
> queues, the poll queues have no callback function.
> 
> Regarding HW-SW queue mapping, the default queues use the existing
> mapping method that considers the MSI irq vector. But the poll
> queues have no irq, so they use the regular blk-mq cpu mapping.
> 
> To enable poll queues, "num-poll-queues=N" property of virtio-blk-pci
> needs to be added to QEMU command line. For that, I temporarily
> implemented the property on QEMU. Please refer to the git repository below.
> 
> 	git : https://github.com/asfaca/qemu.git #on master branch commit
> 
> To verify the improvement, I ran fio polling I/O performance tests
> with the io_uring engine using the options below.
> (io_uring, hipri, randread, direct=1, bs=512, iodepth=64, numjobs=N)
> I set up the VM with 4 vcpus and 4 virtio-blk queues - 2 default
> queues and 2 poll queues.
> (-device virtio-blk-pci,num-queues=4,num-poll-queues=2)
> As a result, IOPS and average latency improved about 10%.
> 
> Test result:
> 
> - Fio io_uring poll without virtio-blk poll support
> 	-- numjobs=1 : IOPS = 297K, avg latency = 214.59us
> 	-- numjobs=2 : IOPS = 360K, avg latency = 363.88us
> 	-- numjobs=4 : IOPS = 289K, avg latency = 885.42us
> 
> - Fio io_uring poll with virtio-blk poll support
> 	-- numjobs=1 : IOPS = 332K, avg latency = 192.61us
> 	-- numjobs=2 : IOPS = 371K, avg latency = 348.31us
> 	-- numjobs=4 : IOPS = 321K, avg latency = 795.93us
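
For reference, a fio invocation matching the options quoted above might
look roughly like this; the device path /dev/vdb and the runtime are
assumptions for illustration, not taken from the patch.

```shell
# Hypothetical reconstruction of the quoted fio options; the target
# device and runtime are assumed, adjust them for your setup.
fio --name=virtblk-poll \
    --filename=/dev/vdb \
    --ioengine=io_uring \
    --hipri \
    --rw=randread \
    --direct=1 \
    --bs=512 \
    --iodepth=64 \
    --numjobs=4 \
    --time_based --runtime=30 \
    --group_reporting
```

The --hipri flag is what makes io_uring submit polled (IORING_SETUP_IOPOLL
style) I/O, which is the path exercised by the driver's new poll queues.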

Last year there was a patch series that switched regular queues into
polling queues when HIPRI requests were in flight:
https://lore.kernel.org/linux-block/20210520141305.355961-1-stefanha@redhat.com/T/

The advantage is that polling is possible without prior device
configuration, making it easier for users.

However, the dynamic approach is a bit more complex and bugs can result
in lost irqs (hung I/O). Christoph Hellwig asked for dedicated polling
queues, which your patch series now delivers.

I think your patch series is worth merging once the comments others have
already made have been addressed. I'll keep an eye out for the VIRTIO
spec change to extend the virtio-blk configuration space, which needs to
be accepted before the Linux changes can be merged.

> @@ -728,16 +749,82 @@ static const struct attribute_group *virtblk_attr_groups[] = {
>  static int virtblk_map_queues(struct blk_mq_tag_set *set)
>  {
>  	struct virtio_blk *vblk = set->driver_data;
> +	int i, qoff;
> +
> +	for (i = 0, qoff = 0; i < set->nr_maps; i++) {
> +		struct blk_mq_queue_map *map = &set->map[i];
> +
> +		map->nr_queues = vblk->io_queues[i];
> +		map->queue_offset = qoff;
> +		qoff += map->nr_queues;
> +
> +		if (map->nr_queues == 0)
> +			continue;
> +
> +		if (i == HCTX_TYPE_DEFAULT)
> +			blk_mq_virtio_map_queues(&set->map[i], vblk->vdev, 0);
> +		else
> +			blk_mq_map_queues(&set->map[i]);

A comment would be nice here to explain that regular queues have
interrupts and hence CPU affinity is defined by the core virtio code,
but polling queues have no interrupts so we let the block layer assign
CPU affinity.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


Thread overview: 49+ messages
2022-03-11 15:28 [PATCH] virtio-blk: support polling I/O Suwan Kim
2022-03-11 15:38 ` Michael S. Tsirkin
2022-03-11 16:07   ` Suwan Kim
2022-03-11 16:18     ` Michael S. Tsirkin
2022-03-11 16:38       ` Suwan Kim
2022-03-13 10:37     ` Max Gurtovoy
2022-03-14  9:43       ` Suwan Kim
2022-03-14 10:25         ` Max Gurtovoy
2022-03-14 11:15           ` Michael S. Tsirkin
2022-03-14 13:22             ` Suwan Kim
2022-03-14 15:12               ` Michael S. Tsirkin
2022-03-14 13:26             ` Max Gurtovoy
2022-03-14 15:15               ` Michael S. Tsirkin
2022-03-14 16:33                 ` Max Gurtovoy
2022-03-14 22:22                   ` Michael S. Tsirkin
2022-03-13 10:42 ` Max Gurtovoy
2022-03-14  9:55   ` Suwan Kim
2022-03-14 10:32     ` Max Gurtovoy
2022-03-14  6:14 ` Jason Wang
2022-03-14 12:33   ` Suwan Kim
2022-03-14 14:48     ` Stefan Hajnoczi
2022-03-15  8:59     ` Jason Wang
2022-03-15 14:43       ` Suwan Kim
2022-03-15 14:53         ` Michael S. Tsirkin
2022-03-16  2:02         ` Jason Wang
2022-03-16 11:25           ` Max Gurtovoy
2022-03-16 13:32           ` Suwan Kim
2022-03-16 15:32           ` Suwan Kim
2022-03-16 15:36             ` Max Gurtovoy
2022-03-16 16:00               ` Stefan Hajnoczi
2022-03-17  2:20             ` Jason Wang
2022-03-17 15:03               ` Suwan Kim
2022-03-14 15:19 ` Stefan Hajnoczi [this message]
2022-03-15 13:55   ` Suwan Kim
2022-03-16 12:20     ` Stefan Hajnoczi
