* [PATCH 0/1] improve brd performance with blk-mq
       [not found] <CGME20230203103122eucas1p161c0f0b674d26e23cf38466d5415420e@eucas1p1.samsung.com>
@ 2023-02-03 10:30 ` Pankaj Raghav
       [not found]   ` <CGME20230203103127eucas1p293a9fa97366fc89c62f18053be6aca1f@eucas1p2.samsung.com>
  2023-02-07  1:43   ` [PATCH 0/1] improve brd " Ming Lei
  0 siblings, 2 replies; 19+ messages in thread
From: Pankaj Raghav @ 2023-02-03 10:30 UTC (permalink / raw)
  To: axboe; +Cc: hch, mcgrof, gost.dev, linux-block, Pankaj Raghav

Hi Jens,
 brd is one of the few block drivers that still uses submit_bio instead
 of the blk-mq framework. The following patch converts brd to start
 using the blk-mq framework. Performance gains are pretty evident for
 read workloads. The performance numbers are attached as part of the
 commit log.

 Performance (WD=[read|randread|write|randwrite]):
 $ modprobe brd rd_size=1048576 rd_nr=1
 $ echo "none" > /sys/block/ram0/queue/scheduler
 $ fio --name=<WD> --ioengine=io_uring --iodepth=64 --rw=<WD> --size=1G \
   --io_size=20G --loop=4 --cpus_allowed=1 --filename=/dev/ram0 \
   --direct=[0/1]

  --direct=0
  -------------------------------------------------------
  |            | bio(base) | blk-mq            | delta  |
  -------------------------------------------------------
  | randread   | 133       | 223               | +75%   |
  -------------------------------------------------------
  | read       | 150       | 313               | +108%  |
  -------------------------------------------------------
  | randwrite  | 111       | 109               | -1.8%  |
  -------------------------------------------------------
  | write      | 118       | 117               | -0.8%  |
  -------------------------------------------------------

  --direct=1
  -------------------------------------------------------
  |            | bio(base) | blk-mq            | delta  |
  -------------------------------------------------------
  | randread   | 182       | 414               | +127%  |
  -------------------------------------------------------
  | read       | 190       | 429               | +125%  |
  -------------------------------------------------------
  | randwrite  | 378       | 387               | +2.38% |
  -------------------------------------------------------
  | write      | 443       | 420               | -5.1%  |
  -------------------------------------------------------

 I also added some simple blktests for the brd device to test the changes;
 they can be found here:
 https://github.com/Panky-codes/blktests/tree/brd

 I will send the blktests patches once this gets merged.
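
 For reference, a smoke test in the same spirit might look roughly like
 the plain-shell sketch below. This is an illustration only, not one of
 the actual tests in the branch above; it just checks that data written
 to a fresh brd device with O_DIRECT reads back intact.

 #!/bin/sh
 # Hypothetical brd smoke test (illustrative, not from the branch above).
 set -e
 modprobe brd rd_size=1048576 rd_nr=1
 dd if=/dev/urandom of=/tmp/pattern bs=1M count=16 status=none
 dd if=/tmp/pattern of=/dev/ram0 bs=1M oflag=direct status=none
 dd if=/dev/ram0 of=/tmp/readback bs=1M count=16 iflag=direct status=none
 cmp /tmp/pattern /tmp/readback && echo "brd smoke test OK"
 modprobe -r brd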

Pankaj Raghav (1):
  brd: improve performance with blk-mq

 drivers/block/brd.c | 64 +++++++++++++++++++++++++++++++++------------
 1 file changed, 47 insertions(+), 17 deletions(-)

-- 
2.39.0


* [PATCH] brd: improve performance with blk-mq
       [not found]   ` <CGME20230203103127eucas1p293a9fa97366fc89c62f18053be6aca1f@eucas1p2.samsung.com>
@ 2023-02-03 10:30     ` Pankaj Raghav
  2023-02-06 15:50       ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Pankaj Raghav @ 2023-02-03 10:30 UTC (permalink / raw)
  To: axboe; +Cc: hch, mcgrof, gost.dev, linux-block, Pankaj Raghav

Move to blk-mq based request processing, as brd is one of the few drivers
that still uses the submit_bio interface. The changes needed to start using
blk-mq are pretty trivial. Performance increases by up to 125% for direct IO
read workloads. There is a slight dip in performance for direct IO write
workloads, but considering the general performance gain with blk-mq
support, it is not a lot.

SW queues are mapped to one hw queue in the brd device, and the rest of the
IO processing is retained as is.

Performance results with none scheduler:

--direct=0
-------------------------------------------------------
|            | bio(base) | blk-mq            | delta  |
-------------------------------------------------------
| randread   | 133       | 223               | +75%   |
-------------------------------------------------------
| read       | 150       | 313               | +108%  |
-------------------------------------------------------
| randwrite  | 111       | 109               | -1.8%  |
-------------------------------------------------------
| write      | 118       | 117               | -0.8%  |
-------------------------------------------------------

--direct=1
-------------------------------------------------------
|            | bio(base) | blk-mq            | delta  |
-------------------------------------------------------
| randread   | 182       | 414               | +127%  |
-------------------------------------------------------
| read       | 190       | 429               | +125%  |
-------------------------------------------------------
| randwrite  | 378       | 387               | +2.38% |
-------------------------------------------------------
| write      | 443       | 420               | -5.1%  |
-------------------------------------------------------

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 drivers/block/brd.c | 64 +++++++++++++++++++++++++++++++++------------
 1 file changed, 47 insertions(+), 17 deletions(-)

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 37dce184eb56..99b37ac31532 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -16,6 +16,7 @@
 #include <linux/major.h>
 #include <linux/blkdev.h>
 #include <linux/bio.h>
+#include <linux/blk-mq.h>
 #include <linux/highmem.h>
 #include <linux/mutex.h>
 #include <linux/pagemap.h>
@@ -46,6 +47,7 @@ struct brd_device {
 	spinlock_t		brd_lock;
 	struct radix_tree_root	brd_pages;
 	u64			brd_nr_pages;
+	struct blk_mq_tag_set tag_set;
 };
 
 /*
@@ -282,36 +284,46 @@ static int brd_do_bvec(struct brd_device *brd, struct page *page,
 	return err;
 }
 
-static void brd_submit_bio(struct bio *bio)
+static blk_status_t brd_queue_rq(struct blk_mq_hw_ctx *hctx,
+				 const struct blk_mq_queue_data *bd)
 {
-	struct brd_device *brd = bio->bi_bdev->bd_disk->private_data;
-	sector_t sector = bio->bi_iter.bi_sector;
+	struct request *rq = bd->rq;
+	struct brd_device *brd = hctx->queue->queuedata;
+	sector_t sector = blk_rq_pos(rq);
 	struct bio_vec bvec;
-	struct bvec_iter iter;
+	struct req_iterator iter;
+	blk_status_t err = BLK_STS_OK;
 
-	bio_for_each_segment(bvec, bio, iter) {
+	blk_mq_start_request(bd->rq);
+	rq_for_each_segment(bvec, rq, iter) {
 		unsigned int len = bvec.bv_len;
-		int err;
+		int ret;
 
 		/* Don't support un-aligned buffer */
 		WARN_ON_ONCE((bvec.bv_offset & (SECTOR_SIZE - 1)) ||
-				(len & (SECTOR_SIZE - 1)));
+			     (len & (SECTOR_SIZE - 1)));
 
-		err = brd_do_bvec(brd, bvec.bv_page, len, bvec.bv_offset,
-				  bio_op(bio), sector);
-		if (err) {
-			bio_io_error(bio);
-			return;
+		ret = brd_do_bvec(brd, bvec.bv_page, len, bvec.bv_offset,
+				  req_op(rq), sector);
+		if (ret) {
+			err = BLK_STS_IOERR;
+			goto end_request;
 		}
 		sector += len >> SECTOR_SHIFT;
 	}
 
-	bio_endio(bio);
+end_request:
+	blk_mq_end_request(bd->rq, err);
+	return BLK_STS_OK;
 }
 
+static const struct blk_mq_ops brd_mq_ops = {
+	.queue_rq = brd_queue_rq,
+};
+
+
 static const struct block_device_operations brd_fops = {
 	.owner =		THIS_MODULE,
-	.submit_bio =		brd_submit_bio,
 };
 
 /*
@@ -355,7 +367,7 @@ static int brd_alloc(int i)
 	struct brd_device *brd;
 	struct gendisk *disk;
 	char buf[DISK_NAME_LEN];
-	int err = -ENOMEM;
+	int err = 0;
 
 	list_for_each_entry(brd, &brd_devices, brd_list)
 		if (brd->brd_number == i)
@@ -364,6 +376,14 @@ static int brd_alloc(int i)
 	if (!brd)
 		return -ENOMEM;
 	brd->brd_number		= i;
+	brd->tag_set.ops = &brd_mq_ops;
+	brd->tag_set.queue_depth = 128;
+	brd->tag_set.numa_node = NUMA_NO_NODE;
+	brd->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;
+	brd->tag_set.cmd_size = 0;
+	brd->tag_set.driver_data = brd;
+	brd->tag_set.nr_hw_queues = 1;
+
 	list_add_tail(&brd->brd_list, &brd_devices);
 
 	spin_lock_init(&brd->brd_lock);
@@ -374,9 +394,17 @@ static int brd_alloc(int i)
 		debugfs_create_u64(buf, 0444, brd_debugfs_dir,
 				&brd->brd_nr_pages);
 
-	disk = brd->brd_disk = blk_alloc_disk(NUMA_NO_NODE);
-	if (!disk)
+	err = blk_mq_alloc_tag_set(&brd->tag_set);
+	if (err) {
+		err = -ENOMEM;
 		goto out_free_dev;
+	}
+
+	disk = brd->brd_disk = blk_mq_alloc_disk(&brd->tag_set, brd);
+	if (IS_ERR(disk)) {
+		err = PTR_ERR(disk);
+		goto out_free_tags;
+	}
 
 	disk->major		= RAMDISK_MAJOR;
 	disk->first_minor	= i * max_part;
@@ -407,6 +435,8 @@ static int brd_alloc(int i)
 
 out_cleanup_disk:
 	put_disk(disk);
+out_free_tags:
+	blk_mq_free_tag_set(&brd->tag_set);
 out_free_dev:
 	list_del(&brd->brd_list);
 	kfree(brd);
-- 
2.39.0


* Re: [PATCH] brd: improve performance with blk-mq
  2023-02-03 10:30     ` [PATCH] brd: improve " Pankaj Raghav
@ 2023-02-06 15:50       ` Christoph Hellwig
  2023-02-06 16:10         ` Pankaj Raghav
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-02-06 15:50 UTC (permalink / raw)
  To: Pankaj Raghav; +Cc: axboe, hch, mcgrof, gost.dev, linux-block

On Fri, Feb 03, 2023 at 04:00:06PM +0530, Pankaj Raghav wrote:
> Move to blk-mq based request processing, as brd is one of the few drivers
> that still uses the submit_bio interface. The changes needed to start using
> blk-mq are pretty trivial. Performance increases by up to 125% for direct IO
> read workloads. There is a slight dip in performance for direct IO write
> workloads, but considering the general performance gain with blk-mq
> support, it is not a lot.

Can you find out why writes regress, and what improves for reads?

In general blk-mq is doing a lot more work for better batching, but much
of that batching should not matter for a simple ramdisk.  So the results
look a little odd to me, and extra numbers and explanations would
really help.

* Re: [PATCH] brd: improve performance with blk-mq
  2023-02-06 15:50       ` Christoph Hellwig
@ 2023-02-06 16:10         ` Pankaj Raghav
  0 siblings, 0 replies; 19+ messages in thread
From: Pankaj Raghav @ 2023-02-06 16:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, mcgrof, gost.dev, linux-block

On 2023-02-06 21:20, Christoph Hellwig wrote:
> On Fri, Feb 03, 2023 at 04:00:06PM +0530, Pankaj Raghav wrote:
>> Move to blk-mq based request processing, as brd is one of the few drivers
>> that still uses the submit_bio interface. The changes needed to start using
>> blk-mq are pretty trivial. Performance increases by up to 125% for direct IO
>> read workloads. There is a slight dip in performance for direct IO write
>> workloads, but considering the general performance gain with blk-mq
>> support, it is not a lot.
> 
> Can you find out why writes regress, and what improves for reads?
> 
Definitely. I tried similar experiments in null_blk with the submit_bio
and blk-mq backends and noticed a similar pattern in performance. I will
look into it next week, as I am OOO for the rest of the week.
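
For reference, the null_blk comparison can be reproduced along these
lines; queue_mode, memory_backed and gb are existing null_blk module
parameters, with queue_mode=0 selecting the bio path and queue_mode=2
the blk-mq path:

$ modprobe null_blk queue_mode=0 memory_backed=1 gb=1   # bio backend
$ # ... run the fio workloads against /dev/nullb0 ...
$ modprobe -r null_blk
$ modprobe null_blk queue_mode=2 memory_backed=1 gb=1   # blk-mq backend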

> In general blk-mq is doing a lot more work for better batching, but much
> of that batching should not matter for a simple ramdisk.  So the results
> look a little odd to me, and extra numbers and explanations would
> really help.

Let me know if you think I am missing something in the code that can cause
this behavior! Thanks.

* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-03 10:30 ` [PATCH 0/1] improve brd performance with blk-mq Pankaj Raghav
       [not found]   ` <CGME20230203103127eucas1p293a9fa97366fc89c62f18053be6aca1f@eucas1p2.samsung.com>
@ 2023-02-07  1:43   ` Ming Lei
  2023-02-13  5:56     ` Pankaj Raghav
  1 sibling, 1 reply; 19+ messages in thread
From: Ming Lei @ 2023-02-07  1:43 UTC (permalink / raw)
  To: Pankaj Raghav; +Cc: axboe, hch, mcgrof, gost.dev, linux-block

On Fri, Feb 03, 2023 at 04:00:05PM +0530, Pankaj Raghav wrote:
> Hi Jens,
>  brd is one of the few block drivers that still uses submit_bio instead
>  of the blk-mq framework. The following patch converts brd to start
>  using the blk-mq framework. Performance gains are pretty evident for
>  read workloads. The performance numbers are attached as part of the
>  commit log.
> 
>  Performance (WD=[read|randread|write|randwrite]):
>  $ modprobe brd rd_size=1048576 rd_nr=1
>  $ echo "none" > /sys/block/ram0/queue/scheduler
>  $ fio --name=<WD> --ioengine=io_uring --iodepth=64 --rw=<WD> --size=1G \
>    --io_size=20G --loop=4 --cpus_allowed=1 --filename=/dev/ram0 \
>    --direct=[0/1]
> 
>   --direct=0

Can you share perf data for other commonly used non-io_uring engines? The
thing is that we still have lots of non-io_uring workloads, which shouldn't
be hurt now.


Thanks
Ming


* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-07  1:43   ` [PATCH 0/1] improve brd " Ming Lei
@ 2023-02-13  5:56     ` Pankaj Raghav
  2023-02-13  8:10       ` Ming Lei
  0 siblings, 1 reply; 19+ messages in thread
From: Pankaj Raghav @ 2023-02-13  5:56 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, hch, mcgrof, gost.dev, linux-block

On 2023-02-07 07:13, Ming Lei wrote:
> On Fri, Feb 03, 2023 at 04:00:05PM +0530, Pankaj Raghav wrote:
>> Hi Jens,
>>  brd is one of the few block drivers that still uses submit_bio instead
>>  of the blk-mq framework. The following patch converts brd to start
>>  using the blk-mq framework. Performance gains are pretty evident for
>>  read workloads. The performance numbers are attached as part of the
>>  commit log.
>>
>>  Performance (WD=[read|randread|write|randwrite]):
>>  $ modprobe brd rd_size=1048576 rd_nr=1
>>  $ echo "none" > /sys/block/ram0/queue/scheduler
>>  $ fio --name=<WD> --ioengine=io_uring --iodepth=64 --rw=<WD> --size=1G \
>>    --io_size=20G --loop=4 --cpus_allowed=1 --filename=/dev/ram0 \
>>    --direct=[0/1]
>>
>>   --direct=0
> 
> Can you share perf data for other commonly used non-io_uring engines? The
> thing is that we still have lots of non-io_uring workloads, which shouldn't
> be hurt now.
> 
Sounds good. Do psync and libaio, along with io_uring, suffice?

> 
> Thanks
> Ming
> 

* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-13  5:56     ` Pankaj Raghav
@ 2023-02-13  8:10       ` Ming Lei
  2023-02-14 14:48         ` Pankaj Raghav
  0 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2023-02-13  8:10 UTC (permalink / raw)
  To: Pankaj Raghav; +Cc: axboe, hch, mcgrof, gost.dev, linux-block

On Mon, Feb 13, 2023 at 11:26:22AM +0530, Pankaj Raghav wrote:
> On 2023-02-07 07:13, Ming Lei wrote:
> > On Fri, Feb 03, 2023 at 04:00:05PM +0530, Pankaj Raghav wrote:
> >> Hi Jens,
> >>  brd is one of the few block drivers that still uses submit_bio instead
> >>  of the blk-mq framework. The following patch converts brd to start
> >>  using the blk-mq framework. Performance gains are pretty evident for
> >>  read workloads. The performance numbers are attached as part of the
> >>  commit log.
> >>
> >>  Performance (WD=[read|randread|write|randwrite]):
> >>  $ modprobe brd rd_size=1048576 rd_nr=1
> >>  $ echo "none" > /sys/block/ram0/queue/scheduler
> >>  $ fio --name=<WD> --ioengine=io_uring --iodepth=64 --rw=<WD> --size=1G \
> >>    --io_size=20G --loop=4 --cpus_allowed=1 --filename=/dev/ram0 \
> >>    --direct=[0/1]
> >>
> >>   --direct=0
> > 
> > Can you share perf data for other commonly used non-io_uring engines? The
> > thing is that we still have lots of non-io_uring workloads, which shouldn't
> > be hurt now.
> > 
> Sounds good. Do psync and libaio, along with io_uring, suffice?

Yeah, it should be enough.


Thanks,
Ming


* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-13  8:10       ` Ming Lei
@ 2023-02-14 14:48         ` Pankaj Raghav
  2023-02-15  6:47           ` Christoph Hellwig
  2023-02-15 23:38           ` Jens Axboe
  0 siblings, 2 replies; 19+ messages in thread
From: Pankaj Raghav @ 2023-02-14 14:48 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, hch, mcgrof, gost.dev, linux-block

Hi Ming,

On 2023-02-13 13:40, Ming Lei wrote:
>>>
>>> Can you share perf data for other commonly used non-io_uring engines? The
>>> thing is that we still have lots of non-io_uring workloads, which shouldn't
>>> be hurt now.
>>>
>> Sounds good. Do psync and libaio, along with io_uring, suffice?
> 
> Yeah, it should be enough.
>

A performance regression shows up for libaio and psync. I did the same
tests on null_blk with the bio and blk-mq backends, and noticed a similar
pattern.

Should we add a module parameter to switch between the bio and blk-mq
back-ends in brd, similar to null_blk? The default option would be bio, to
avoid regressions on existing workloads.
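
Purely as an illustration of the kind of switch being proposed (the
parameter name and values below are hypothetical, loosely modeled on
null_blk's queue_mode, and are not part of any posted brd patch):

#include <linux/module.h>

/* Hypothetical bio vs blk-mq switch for brd, in the style of null_blk. */
enum brd_queue_mode {
	BRD_Q_BIO = 0,	/* legacy submit_bio path (default) */
	BRD_Q_MQ  = 1,	/* blk-mq path */
};

static int queue_mode = BRD_Q_BIO;
module_param(queue_mode, int, 0444);
MODULE_PARM_DESC(queue_mode, "Queue mode: 0=bio (default), 1=blk-mq");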

There is a clear performance gain for some workloads with blk-mq support in
brd. Let me know your thoughts. See below for the performance results.

Results for brd with --direct enabled:

+-----------+-----------+--------+--------+
| io_uring  | bio(base) | blk-mq | delta  |
+-----------+-----------+--------+--------+
|   read    |    189    |  449   | 137.57 |
| randread  |    182    |  419   | 130.22 |
|   write   |    442    |  425   | -3.85  |
| randwrite |    377    |  384   |  1.86  |
+-----------+-----------+--------+--------+

+-----------+-----------+--------+--------+
|  libaio   | bio(base) | blk-mq | delta  |
+-----------+-----------+--------+--------+
|   read    |    415    |  339   | -18.31 |
| randread  |    392    |  325   | -17.09 |
|   write   |    410    |  340   | -17.07 |
| randwrite |    371    |  318   | -14.29 |
+-----------+-----------+--------+--------+

+-----------+-----------+--------+--------+
|   psync   | bio(base) | blk-mq | delta  |
+-----------+-----------+--------+--------+
|   read    |    700    |  554   | -20.86 |
| randread  |    633    |  497   | -21.48 |
|   write   |    705    |  181   | -74.33 |
| randwrite |    598    |  169   | -71.74 |
+-----------+-----------+--------+--------+

Both libaio and psync show a performance regression with blk-mq based brd.

Results for memory-backed null_blk device with --direct enabled:

+-----------+-----------+--------+--------+
| io_uring  | bio(base) | blk-mq | delta  |
+-----------+-----------+--------+--------+
|   read    |    189    |  433   | 129.1  |
| randread  |    196    |  408   | 108.16 |
|   write   |    457    |  414   | -9.41  |
| randwrite |    356    |  350   | -1.69  |
+-----------+-----------+--------+--------+

+-----------+-----------+--------+--------+
|  libaio   | bio(base) | blk-mq | delta  |
+-----------+-----------+--------+--------+
|   read    |    406    |  341   | -16.01 |
| randread  |    378    |  320   | -15.34 |
|   write   |    404    |  331   | -18.07 |
| randwrite |    359    |  309   | -13.93 |
+-----------+-----------+--------+--------+

+-----------+-----------+--------+--------+
|   psync   | bio(base) | blk-mq | delta  |
+-----------+-----------+--------+--------+
|   read    |    684    |  522   | -23.68 |
| randread  |    605    |  470   | -22.31 |
|   write   |    676    |  139   | -79.44 |
| randwrite |    558    |  135   | -75.81 |
+-----------+-----------+--------+--------+

Regards,
Pankaj

* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-14 14:48         ` Pankaj Raghav
@ 2023-02-15  6:47           ` Christoph Hellwig
  2023-02-15 23:38           ` Jens Axboe
  1 sibling, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2023-02-15  6:47 UTC (permalink / raw)
  To: Pankaj Raghav; +Cc: Ming Lei, axboe, hch, mcgrof, gost.dev, linux-block

On Tue, Feb 14, 2023 at 08:18:54PM +0530, Pankaj Raghav wrote:
> Should we add a module parameter to switch between the bio and blk-mq
> back-ends in brd, similar to null_blk? The default option would be bio, to
> avoid regressions on existing workloads.

No.  Duplicate code paths are always a bad idea.  Please drill down
further into what causes the differences.

* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-14 14:48         ` Pankaj Raghav
  2023-02-15  6:47           ` Christoph Hellwig
@ 2023-02-15 23:38           ` Jens Axboe
  2023-02-15 23:39             ` Jens Axboe
  2023-02-17 14:27             ` Pankaj Raghav
  1 sibling, 2 replies; 19+ messages in thread
From: Jens Axboe @ 2023-02-15 23:38 UTC (permalink / raw)
  To: Pankaj Raghav, Ming Lei; +Cc: hch, mcgrof, gost.dev, linux-block

On 2/14/23 7:48 AM, Pankaj Raghav wrote:
> Hi Ming,
> 
> On 2023-02-13 13:40, Ming Lei wrote:
>>>>
>>>> Can you share perf data for other commonly used non-io_uring engines? The
>>>> thing is that we still have lots of non-io_uring workloads, which shouldn't
>>>> be hurt now.
>>>>
>>> Sounds good. Do psync and libaio, along with io_uring, suffice?
>>
>> Yeah, it should be enough.
>>
> 
> A performance regression shows up for libaio and psync. I did the same
> tests on null_blk with the bio and blk-mq backends, and noticed a similar
> pattern.
>
> Should we add a module parameter to switch between the bio and blk-mq
> back-ends in brd, similar to null_blk? The default option would be bio, to
> avoid regressions on existing workloads.
> 
> There is a clear performance gain for some workloads with blk-mq support in
> brd. Let me know your thoughts. See below the performance results.
> 
> Results for brd with --direct enabled:

I think your numbers are skewed because brd isn't flagged nowait, can you
try with this?

I ran some quick testing here, using the current tree:

		without patch		with patch
io_uring	~430K IOPS		~3.4M IOPS
libaio		~895K IOPS		~895K IOPS

which is a pretty substantial difference...

-- 
Jens Axboe


* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-15 23:38           ` Jens Axboe
@ 2023-02-15 23:39             ` Jens Axboe
  2023-02-17 14:27             ` Pankaj Raghav
  1 sibling, 0 replies; 19+ messages in thread
From: Jens Axboe @ 2023-02-15 23:39 UTC (permalink / raw)
  To: Pankaj Raghav, Ming Lei; +Cc: hch, mcgrof, gost.dev, linux-block

On 2/15/23 4:38 PM, Jens Axboe wrote:
> On 2/14/23 7:48 AM, Pankaj Raghav wrote:
>> Hi Ming,
>>
>> On 2023-02-13 13:40, Ming Lei wrote:
>>>>>
>>>>> Can you share perf data for other commonly used non-io_uring engines? The
>>>>> thing is that we still have lots of non-io_uring workloads, which shouldn't
>>>>> be hurt now.
>>>>>
>>>> Sounds good. Do psync and libaio, along with io_uring, suffice?
>>>
>>> Yeah, it should be enough.
>>>
>>
>> A performance regression shows up for libaio and psync. I did the same
>> tests on null_blk with the bio and blk-mq backends, and noticed a similar
>> pattern.
>>
>> Should we add a module parameter to switch between the bio and blk-mq
>> back-ends in brd, similar to null_blk? The default option would be bio, to
>> avoid regressions on existing workloads.
>>
>> There is a clear performance gain for some workloads with blk-mq support in
>> brd. Let me know your thoughts. See below the performance results.
>>
>> Results for brd with --direct enabled:
> 
> I think your numbers are skewed because brd isn't flagged nowait, can you
> try with this?
> 
> I ran some quick testing here, using the current tree:
> 
> 		without patch		with patch
> io_uring	~430K IOPS		~3.4M IOPS
> libaio		~895K IOPS		~895K IOPS
> 
> which is a pretty substantial difference...

And here's the actual patch. FWIW, this doesn't make a difference
for libaio, because aio doesn't really care if it blocks or not.

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 20acc4a1fd6d..82419e345777 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -412,6 +412,7 @@ static int brd_alloc(int i)
 	/* Tell the block layer that this is not a rotational device */
 	blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue);
 	blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, disk->queue);
+	blk_queue_flag_set(QUEUE_FLAG_NOWAIT, disk->queue);
 	err = add_disk(disk);
 	if (err)
 		goto out_cleanup_disk;
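
For context: io_uring first tries to issue block I/O inline with nowait
semantics and falls back to its io-wq worker pool when the device cannot
guarantee non-blocking submission, which is roughly where the missing
io_uring IOPS above went. A minimal liburing reproducer along those
lines (an illustrative sketch, not code from this thread; it assumes
liburing is installed and /dev/ram0 exists):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <liburing.h>

#define QD 64
#define BS 4096

int main(void)
{
	struct io_uring ring;
	void *buf;
	int fd, i;

	/* O_DIRECT reads are what the fio runs above were issuing. */
	fd = open("/dev/ram0", O_RDONLY | O_DIRECT);
	if (fd < 0) {
		perror("open /dev/ram0");
		return 1;
	}
	if (posix_memalign(&buf, BS, BS) || io_uring_queue_init(QD, &ring, 0))
		return 1;

	/* Queue QD reads; whether these complete inline or get punted to
	 * io-wq depends on the device's nowait support. */
	for (i = 0; i < QD; i++) {
		struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

		io_uring_prep_read(sqe, fd, buf, BS, (unsigned long long)i * BS);
	}
	io_uring_submit(&ring);

	for (i = 0; i < QD; i++) {
		struct io_uring_cqe *cqe;

		io_uring_wait_cqe(&ring, &cqe);
		if (cqe->res < 0)
			fprintf(stderr, "read: %s\n", strerror(-cqe->res));
		io_uring_cqe_seen(&ring, cqe);
	}
	io_uring_queue_exit(&ring);
	close(fd);
	return 0;
}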

-- 
Jens Axboe


* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-15 23:38           ` Jens Axboe
  2023-02-15 23:39             ` Jens Axboe
@ 2023-02-17 14:27             ` Pankaj Raghav
  2023-02-17 14:40               ` Jens Axboe
  1 sibling, 1 reply; 19+ messages in thread
From: Pankaj Raghav @ 2023-02-17 14:27 UTC (permalink / raw)
  To: Jens Axboe; +Cc: hch, mcgrof, gost.dev, linux-block, Ming Lei

Hi Jens,

On 2023-02-16 05:08, Jens Axboe wrote:

> I think your numbers are skewed because brd isn't flagged nowait, can you
> try with this?
> 
> I ran some quick testing here, using the current tree:
> 
> 		without patch		with patch
> io_uring	~430K IOPS		~3.4M IOPS
> libaio		~895K IOPS		~895K IOPS
> 
> which is a pretty substantial difference...
> 

I rebased my blk-mq changes on top of your nowait patches, but I still see a
regression with blk-mq. When I traced and ran perf, nothing odd stood out,
except for the normal blk-mq overhead.

Could you try it in your setup and see if you notice a similar trend?
Based on the numbers you shared yesterday, I didn't see this regression.

The fio command I ran to benchmark:

$ fio --name=<workload> --rw=<workload> --ramp_time=5s --size=1G \
  --io_size=10G --loop=4 --cpus_allowed=1 --filename=/dev/ram0 --direct=1 \
  --iodepth=128 --ioengine=<engine>

+-----------+-----------+--------+--------+
| io_uring  | bio(base) | blk-mq | delta  |
+-----------+-----------+--------+--------+
|   read    |    577    |  446   | -22.7  |
| randread  |    504    |  416   | -17.46 |
|   write   |    554    |  424   | -23.47 |
| randwrite |    484    |  381   | -21.28 |
+-----------+-----------+--------+--------+

+-----------+-----------+--------+--------+
|  libaio   | bio(base) | blk-mq | delta  |
+-----------+-----------+--------+--------+
|   read    |    412    |  341   | -17.23 |
| randread  |    389    |  335   | -13.88 |
|   write   |    401    |  329   | -17.96 |
| randwrite |    351    |  304   | -13.39 |
+-----------+-----------+--------+--------+

My rebased blk-mq diff:

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 34177f1bd97d..726c4b94c7b6 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -16,6 +16,7 @@
 #include <linux/major.h>
 #include <linux/blkdev.h>
 #include <linux/bio.h>
+#include <linux/blk-mq.h>
 #include <linux/highmem.h>
 #include <linux/mutex.h>
 #include <linux/pagemap.h>
@@ -46,6 +47,7 @@ struct brd_device {
 	spinlock_t		brd_lock;
 	struct radix_tree_root	brd_pages;
 	u64			brd_nr_pages;
+	struct blk_mq_tag_set tag_set;
 };

 /*
@@ -284,40 +286,48 @@ static int brd_do_bvec(struct brd_device *brd, struct page *page,
 	return err;
 }

-static void brd_submit_bio(struct bio *bio)
+static blk_status_t brd_queue_rq(struct blk_mq_hw_ctx *hctx,
+				 const struct blk_mq_queue_data *bd)
 {
-	struct brd_device *brd = bio->bi_bdev->bd_disk->private_data;
-	sector_t sector = bio->bi_iter.bi_sector;
+	struct request *rq = bd->rq;
+	struct brd_device *brd = hctx->queue->queuedata;
+	sector_t sector = blk_rq_pos(rq);
 	struct bio_vec bvec;
-	struct bvec_iter iter;
+	struct req_iterator iter;
+	blk_status_t err = BLK_STS_OK;

-	bio_for_each_segment(bvec, bio, iter) {
+	blk_mq_start_request(bd->rq);
+	rq_for_each_segment(bvec, rq, iter) {
 		unsigned int len = bvec.bv_len;
-		int err;
+		int ret;

 		/* Don't support un-aligned buffer */
 		WARN_ON_ONCE((bvec.bv_offset & (SECTOR_SIZE - 1)) ||
-				(len & (SECTOR_SIZE - 1)));
+			     (len & (SECTOR_SIZE - 1)));

-		err = brd_do_bvec(brd, bvec.bv_page, len, bvec.bv_offset,
-				  bio->bi_opf, sector);
-		if (err) {
-			if (err == -ENOMEM && bio->bi_opf & REQ_NOWAIT) {
-				bio_wouldblock_error(bio);
-				return;
-			}
-			bio_io_error(bio);
-			return;
+		ret = brd_do_bvec(brd, bvec.bv_page, len, bvec.bv_offset,
+				  rq->cmd_flags, sector);
+		if (ret) {
+			if (ret == -ENOMEM && rq->cmd_flags & REQ_NOWAIT)
+				err = BLK_STS_AGAIN;
+			else
+				err = BLK_STS_IOERR;
+			goto end_request;
 		}
 		sector += len >> SECTOR_SHIFT;
 	}

-	bio_endio(bio);
+end_request:
+	blk_mq_end_request(bd->rq, err);
+	return BLK_STS_OK;
 }

+static const struct blk_mq_ops brd_mq_ops = {
+	.queue_rq = brd_queue_rq,
+};
+
 static const struct block_device_operations brd_fops = {
 	.owner =		THIS_MODULE,
-	.submit_bio =		brd_submit_bio,
 };

 /*
@@ -361,7 +371,7 @@ static int brd_alloc(int i)
 	struct brd_device *brd;
 	struct gendisk *disk;
 	char buf[DISK_NAME_LEN];
-	int err = -ENOMEM;
+	int err = 0;

 	list_for_each_entry(brd, &brd_devices, brd_list)
 		if (brd->brd_number == i)
@@ -370,6 +380,15 @@ static int brd_alloc(int i)
 	if (!brd)
 		return -ENOMEM;
 	brd->brd_number		= i;
+	brd->tag_set.ops = &brd_mq_ops;
+	brd->tag_set.queue_depth = 128;
+	brd->tag_set.numa_node = NUMA_NO_NODE;
+	brd->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING |
+			     BLK_MQ_F_NO_SCHED_BY_DEFAULT;
+	brd->tag_set.cmd_size = 0;
+	brd->tag_set.driver_data = brd;
+	brd->tag_set.nr_hw_queues = 1;
+
 	list_add_tail(&brd->brd_list, &brd_devices);

 	spin_lock_init(&brd->brd_lock);
@@ -380,9 +399,17 @@ static int brd_alloc(int i)
 		debugfs_create_u64(buf, 0444, brd_debugfs_dir,
 				&brd->brd_nr_pages);

-	disk = brd->brd_disk = blk_alloc_disk(NUMA_NO_NODE);
-	if (!disk)
+	err = blk_mq_alloc_tag_set(&brd->tag_set);
+	if (err) {
+		err = -ENOMEM;
 		goto out_free_dev;
+	}
+
+	disk = brd->brd_disk = blk_mq_alloc_disk(&brd->tag_set, brd);
+	if (IS_ERR(disk)) {
+		err = PTR_ERR(disk);
+		goto out_free_tags;
+	}

 	disk->major		= RAMDISK_MAJOR;
 	disk->first_minor	= i * max_part;
@@ -414,6 +441,8 @@ static int brd_alloc(int i)

 out_cleanup_disk:
 	put_disk(disk);
+out_free_tags:
+	blk_mq_free_tag_set(&brd->tag_set);
 out_free_dev:
 	list_del(&brd->brd_list);
 	kfree(brd);
-- 
2.39.1

* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-17 14:27             ` Pankaj Raghav
@ 2023-02-17 14:40               ` Jens Axboe
  2023-02-17 14:52                 ` Pankaj Raghav
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Axboe @ 2023-02-17 14:40 UTC (permalink / raw)
  To: Pankaj Raghav; +Cc: hch, mcgrof, gost.dev, linux-block, Ming Lei

On 2/17/23 7:27 AM, Pankaj Raghav wrote:
> Hi Jens,
> 
> On 2023-02-16 05:08, Jens Axboe wrote:
> 
>> I think your numbers are skewed because brd isn't flagged nowait, can you
>> try with this?
>>
>> I ran some quick testing here, using the current tree:
>>
>> 		without patch		with patch
>> io_uring	~430K IOPS		~3.4M IOPS
>> libaio		~895K IOPS		~895K IOPS
>>
>> which is a pretty substantial difference...
>>
> 
> I rebased my blk-mq changes on top of your nowait patches, but I still see a
> regression with blk-mq. When I traced and ran perf, nothing odd stood out,
> except for the normal blk-mq overhead.
>
> Could you try it in your setup and see if you notice a similar trend?
> Based on the numbers you shared yesterday, I didn't see this regression.
> 
> The fio command I ran to benchmark:
>
> $ fio --name=<workload> --rw=<workload> --ramp_time=5s --size=1G \
>   --io_size=10G --loop=4 --cpus_allowed=1 --filename=/dev/ram0 --direct=1 \
>   --iodepth=128 --ioengine=<engine>
> 
> +-----------+-----------+--------+--------+
> | io_uring  | bio(base) | blk-mq | delta  |
> +-----------+-----------+--------+--------+
> |   read    |    577    |  446   | -22.7  |
> | randread  |    504    |  416   | -17.46 |
> |   write   |    554    |  424   | -23.47 |
> | randwrite |    484    |  381   | -21.28 |
> +-----------+-----------+--------+--------+
> 
> +-----------+-----------+--------+--------+
> |  libaio   | bio(base) | blk-mq | delta  |
> +-----------+-----------+--------+--------+
> |   read    |    412    |  341   | -17.23 |
> | randread  |    389    |  335   | -13.88 |
> |   write   |    401    |  329   | -17.96 |
> | randwrite |    351    |  304   | -13.39 |
> +-----------+-----------+--------+--------+

This is pretty much expected, as blk-mq adds a bunch of things that
brd doesn't really care about. One such example is tag management.

My reaction to your initial report wasn't surprise that blk-mq
would be slower than bio based for this use case, but rather that
io_uring was slower than libaio.

-- 
Jens Axboe



* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-17 14:40               ` Jens Axboe
@ 2023-02-17 14:52                 ` Pankaj Raghav
  2023-02-21 21:59                   ` Luis Chamberlain
  0 siblings, 1 reply; 19+ messages in thread
From: Pankaj Raghav @ 2023-02-17 14:52 UTC (permalink / raw)
  To: Jens Axboe; +Cc: hch, mcgrof, gost.dev, linux-block, Ming Lei

>> +-----------+-----------+--------+--------+
>> | io_uring  | bio(base) | blk-mq | delta  |
>> +-----------+-----------+--------+--------+
>> |   read    |    577    |  446   | -22.7  |
>> | randread  |    504    |  416   | -17.46 |
>> |   write   |    554    |  424   | -23.47 |
>> | randwrite |    484    |  381   | -21.28 |
>> +-----------+-----------+--------+--------+
>>
>> +-----------+-----------+--------+--------+
>> |  libaio   | bio(base) | blk-mq | delta  |
>> +-----------+-----------+--------+--------+
>> |   read    |    412    |  341   | -17.23 |
>> | randread  |    389    |  335   | -13.88 |
>> |   write   |    401    |  329   | -17.96 |
>> | randwrite |    351    |  304   | -13.39 |
>> +-----------+-----------+--------+--------+
> 
> This is pretty much expected, as blk-mq adds a bunch of things that
> brd doesn't really care about. One such example is tag management.

Got it. That was my conclusion as well.

> My reaction to your initial report wasn't surprise that blk-mq
> would be slower than bio based for this use case, but rather that
> io_uring was slower than libaio.
> 

Yeah! So with the nowait option, that isn't the case anymore.

I will park this effort as blk-mq doesn't improve the performance for brd,
and we can retain the submit_bio interface.

Thanks for the input Jens, Christoph and Ming!

* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-17 14:52                 ` Pankaj Raghav
@ 2023-02-21 21:59                   ` Luis Chamberlain
  2023-02-21 23:51                     ` Jens Axboe
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Chamberlain @ 2023-02-21 21:59 UTC (permalink / raw)
  To: Pankaj Raghav; +Cc: Jens Axboe, hch, gost.dev, linux-block, Ming Lei

On Fri, Feb 17, 2023 at 08:22:15PM +0530, Pankaj Raghav wrote:
> I will park this effort as blk-mq doesn't improve the performance for brd,
> and we can retain the submit_bio interface.

I am not sure if the feedback was intended to suggest we shouldn't do
the blk-mq conversion, but rather to explain why in some workloads it
may not be as good as the old submit_bio() interface. Probably low
hanging fruit, if we *really* wanted to provide parity for the odd
workloads.

If we *mostly* see better performance with blk-mq, it would seem
reasonable to merge. Dozens of drivers were converted to blk-mq,
and *most* without *any* performance justification on them. I think
Ming's was the commit log that had the most elaborate performance
metrics, and I think it also showed some *minor* slowdown on some
workloads, but the dramatic gains made it worthwhile.

Most of the conversions to blk-mq didn't even have *any* metrics posted.

  Luis

* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-21 21:59                   ` Luis Chamberlain
@ 2023-02-21 23:51                     ` Jens Axboe
  2023-02-22 22:42                       ` Luis Chamberlain
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Axboe @ 2023-02-21 23:51 UTC (permalink / raw)
  To: Luis Chamberlain, Pankaj Raghav; +Cc: hch, gost.dev, linux-block, Ming Lei

On 2/21/23 2:59 PM, Luis Chamberlain wrote:
> On Fri, Feb 17, 2023 at 08:22:15PM +0530, Pankaj Raghav wrote:
>> I will park this effort as blk-mq doesn't improve the performance for brd,
>> and we can retain the submit_bio interface.
> 
> I am not sure if the feedback was intended to suggest we shouldn't do
> the blk-mq conversion, but rather to explain why in some workloads it
> may not be as good as the old submit_bio() interface. Probably low
> hanging fruit, if we *really* wanted to provide parity for the odd
> workloads.
>
> If we *mostly* see better performance with blk-mq, it would seem
> reasonable to merge. Dozens of drivers were converted to blk-mq,
> and *most* without *any* performance justification on them. I think
> Ming's was the commit log that had the most elaborate performance
> metrics, and I think it also showed some *minor* slowdown on some
> workloads, but the dramatic gains made it worthwhile.
>
> Most of the conversions to blk-mq didn't even have *any* metrics posted.

You're comparing apples and oranges. I don't want to get into (fairly)
ancient history at this point, but the original implementation was honed
with the nvme conversion - which is the most performant driver/hardware
we have available.

Converting something that doesn't need a scheduler, doesn't need
timeouts, doesn't benefit from merging, doesn't need tagging etc doesn't
make a lot of sense. If you need none of that, *of course* you're going
to see a slowdown from doing all of these extra things by default.
That's pretty obvious.

This isn't about workloads at all.

-- 
Jens Axboe


* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-21 23:51                     ` Jens Axboe
@ 2023-02-22 22:42                       ` Luis Chamberlain
  2023-02-22 22:47                         ` Jens Axboe
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Chamberlain @ 2023-02-22 22:42 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Pankaj Raghav, hch, gost.dev, linux-block, Ming Lei

On Tue, Feb 21, 2023 at 04:51:21PM -0700, Jens Axboe wrote:
> On 2/21/23 2:59 PM, Luis Chamberlain wrote:
> > On Fri, Feb 17, 2023 at 08:22:15PM +0530, Pankaj Raghav wrote:
> >> I will park this effort as blk-mq doesn't improve the performance for brd,
> >> and we can retain the submit_bio interface.
> > 
> > I am not sure if the feedback was intended to suggest we shouldn't do
> > the blk-mq conversion, but rather to explain why in some workloads it
> > may not be as good as the old submit_bio() interface. Probably low
> > hanging fruit, if we *really* wanted to provide parity for the odd
> > workloads.
> >
> > If we *mostly* see better performance with blk-mq, it would seem
> > reasonable to merge. Dozens of drivers were converted to blk-mq,
> > and *most* without *any* performance justification on them. I think
> > Ming's was the commit log that had the most elaborate performance
> > metrics, and I think it also showed some *minor* slowdown on some
> > workloads, but the dramatic gains made it worthwhile.
> >
> > Most of the conversions to blk-mq didn't even have *any* metrics posted.
> 
> You're comparing apples and oranges. I don't want to get into (fairly)
> ancient history at this point, but the original implementation was honed
> with the nvme conversion - which is the most performant driver/hardware
> we have available.
> 
> Converting something that doesn't need a scheduler, doesn't need
> timeouts, doesn't benefit from merging, doesn't need tagging etc doesn't
> make a lot of sense. If you need none of that, *of course* you're going
> to see a slowdown from doing all of these extra things by default.
> That's pretty obvious.
> 
> This isn't about workloads at all.

I'm not disputing that the mq design is over-architected for simple
devices. It is a given that if one doesn't need some of those things,
they can create a minor delta loss in performance. I'm asking and
suggesting whether, despite a few workloads being affected by a *minor
delta* in the brd mq conversion, the huge gains possible on some *other*
workloads suffice for it to be converted over.

We're talking about a ~125% performance boost for randreads.

  Luis

* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-22 22:42                       ` Luis Chamberlain
@ 2023-02-22 22:47                         ` Jens Axboe
  2023-02-22 22:54                           ` Luis Chamberlain
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Axboe @ 2023-02-22 22:47 UTC (permalink / raw)
  To: Luis Chamberlain; +Cc: Pankaj Raghav, hch, gost.dev, linux-block, Ming Lei

On 2/22/23 3:42 PM, Luis Chamberlain wrote:
> On Tue, Feb 21, 2023 at 04:51:21PM -0700, Jens Axboe wrote:
>> On 2/21/23 2:59 PM, Luis Chamberlain wrote:
>>> On Fri, Feb 17, 2023 at 08:22:15PM +0530, Pankaj Raghav wrote:
>>>> I will park this effort as blk-mq doesn't improve the performance for brd,
>>>> and we can retain the submit_bio interface.
>>>
>>> I am not sure if the feedback was intended to suggest we shouldn't do
>>> the blk-mq conversion, but rather to explain why in some workloads it
>>> may not be as good as the old submit_bio() interface. Probably low
>>> hanging fruit, if we *really* wanted to provide parity for the odd
>>> workloads.
>>>
>>> If we *mostly* see better performance with blk-mq, it would seem
>>> reasonable to merge. Dozens of drivers were converted to blk-mq,
>>> and *most* without *any* performance justification on them. I think
>>> Ming's was the commit log that had the most elaborate performance
>>> metrics, and I think it also showed some *minor* slowdown on some
>>> workloads, but the dramatic gains made it worthwhile.
>>>
>>> Most of the conversions to blk-mq didn't even have *any* metrics posted.
>>
>> You're comparing apples and oranges. I don't want to get into (fairly)
>> ancient history at this point, but the original implementation was honed
>> with the nvme conversion - which is the most performant driver/hardware
>> we have available.
>>
>> Converting something that doesn't need a scheduler, doesn't need
>> timeouts, doesn't benefit from merging, doesn't need tagging etc doesn't
>> make a lot of sense. If you need none of that, *of course* you're going
>> to see a slowdown from doing all of these extra things by default.
>> That's pretty obvious.
>>
>> This isn't about workloads at all.
> 
> I'm not disputing that the mq design is over-architected for simple
> devices. It is a given that if one doesn't need some of those things,
> they can create a minor delta loss in performance. I'm asking and
> suggesting whether, despite a few workloads being affected by a *minor
> delta* in the brd mq conversion, the huge gains possible on some *other*
> workloads suffice for it to be converted over.
>
> We're talking about a ~125% performance boost for randreads.

Please actually read the whole thread. The boost there was due to brd
not supporting nowait, which has since been corrected. The latest
numbers reflect that and show the expected outcome (bio > blk-mq for brd,
io_uring > aio for both).

-- 
Jens Axboe


* Re: [PATCH 0/1] improve brd performance with blk-mq
  2023-02-22 22:47                         ` Jens Axboe
@ 2023-02-22 22:54                           ` Luis Chamberlain
  0 siblings, 0 replies; 19+ messages in thread
From: Luis Chamberlain @ 2023-02-22 22:54 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Pankaj Raghav, hch, gost.dev, linux-block, Ming Lei

On Wed, Feb 22, 2023 at 03:47:43PM -0700, Jens Axboe wrote:
> On 2/22/23 3:42 PM, Luis Chamberlain wrote:
> > On Tue, Feb 21, 2023 at 04:51:21PM -0700, Jens Axboe wrote:
> >> On 2/21/23 2:59 PM, Luis Chamberlain wrote:
> >>> On Fri, Feb 17, 2023 at 08:22:15PM +0530, Pankaj Raghav wrote:
> >>>> I will park this effort as blk-mq doesn't improve the performance for brd,
> >>>> and we can retain the submit_bio interface.
> >>>
> >>> I am not sure if the feedback was intended to suggest we shouldn't do
> >>> the blk-mq conversion, but rather to explain why in some workloads it
> >>> may not be as good as the old submit_bio() interface. Probably low
> >>> hanging fruit, if we *really* wanted to provide parity for the odd
> >>> workloads.
> >>>
> >>> If we *mostly* see better performance with blk-mq, it would seem
> >>> reasonable to merge. Dozens of drivers were converted to blk-mq,
> >>> and *most* without *any* performance justification on them. I think
> >>> Ming's was the commit log that had the most elaborate performance
> >>> metrics, and I think it also showed some *minor* slowdown on some
> >>> workloads, but the dramatic gains made it worthwhile.
> >>>
> >>> Most of the conversions to blk-mq didn't even have *any* metrics posted.
> >>
> >> You're comparing apples and oranges. I don't want to get into (fairly)
> >> ancient history at this point, but the original implementation was honed
> >> with the nvme conversion - which is the most performant driver/hardware
> >> we have available.
> >>
> >> Converting something that doesn't need a scheduler, doesn't need
> >> timeouts, doesn't benefit from merging, doesn't need tagging etc doesn't
> >> make a lot of sense. If you need none of that, *of course* you're going
> >> to see a slowdown from doing all of these extra things by default.
> >> That's pretty obvious.
> >>
> >> This isn't about workloads at all.
> > 
> > I'm not disputing that the mq design is over-architected for simple
> > devices. It is a given that if one doesn't need some of those things,
> > they can create a minor delta loss in performance. I'm asking and
> > suggesting whether, despite a few workloads being affected by a *minor
> > delta* in the brd mq conversion, the huge gains possible on some *other*
> > workloads suffice for it to be converted over.
> >
> > We're talking about a ~125% performance boost for randreads.
> 
> Please actually read the whole thread. The boost there was due to brd
> not supporting nowait, which has since been corrected.

Ah, that's what happens when reading patches over the phone while *on
vacation*, coming back and thinking you *don't* have to re-read the
threads.

> The latest
> numbers reflect that and show the expected outcome (bio > blk-mq for brd,
> io_uring > aio for both).

Got it thanks!

  Luis

end of thread

Thread overview: 19+ messages
     [not found] <CGME20230203103122eucas1p161c0f0b674d26e23cf38466d5415420e@eucas1p1.samsung.com>
2023-02-03 10:30 ` [PATCH 0/1] improve brd performance with blk-mq Pankaj Raghav
     [not found]   ` <CGME20230203103127eucas1p293a9fa97366fc89c62f18053be6aca1f@eucas1p2.samsung.com>
2023-02-03 10:30     ` [PATCH] brd: improve " Pankaj Raghav
2023-02-06 15:50       ` Christoph Hellwig
2023-02-06 16:10         ` Pankaj Raghav
2023-02-07  1:43   ` [PATCH 0/1] improve brd " Ming Lei
2023-02-13  5:56     ` Pankaj Raghav
2023-02-13  8:10       ` Ming Lei
2023-02-14 14:48         ` Pankaj Raghav
2023-02-15  6:47           ` Christoph Hellwig
2023-02-15 23:38           ` Jens Axboe
2023-02-15 23:39             ` Jens Axboe
2023-02-17 14:27             ` Pankaj Raghav
2023-02-17 14:40               ` Jens Axboe
2023-02-17 14:52                 ` Pankaj Raghav
2023-02-21 21:59                   ` Luis Chamberlain
2023-02-21 23:51                     ` Jens Axboe
2023-02-22 22:42                       ` Luis Chamberlain
2023-02-22 22:47                         ` Jens Axboe
2023-02-22 22:54                           ` Luis Chamberlain
