All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] block/dm: add resubmit_bio_noacct for fixing iops throttling
@ 2022-01-10  7:51 ` Ming Lei
  0 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-10  7:51 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer
  Cc: linux-block, dm-devel, Ming Lei, lining, Tejun Heo, Chunguang Xu

Hello Guys,

Commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
only fixes iops throttle for blk-mq drivers. This patchset adds API of resubmit_bio_noacct
so that we can use it for fixing the same issue on bio based drivers.

Meantime fix the issue on device mapper via the added API, and the issue
is reported by lining.

Ming Lei (2):
  block: add resubmit_bio_noacct()
  dm: use resubmit_bio_noacct to submit split bio

 block/blk-core.c       | 12 ++++++++++++
 block/blk-merge.c      |  4 +---
 drivers/md/dm.c        |  2 +-
 include/linux/blkdev.h |  1 +
 4 files changed, 15 insertions(+), 4 deletions(-)

Cc: lining <lining2020x@163.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
-- 
2.31.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dm-devel] [PATCH 0/2] block/dm: add resubmit_bio_noacct for fixing iops throttling
@ 2022-01-10  7:51 ` Ming Lei
  0 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-10  7:51 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer
  Cc: Ming Lei, linux-block, dm-devel, lining, Tejun Heo, Chunguang Xu

Hello Guys,

Commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
only fixes iops throttle for blk-mq drivers. This patchset adds API of resubmit_bio_noacct
so that we can use it for fixing the same issue on bio based drivers.

Meantime fix the issue on device mapper via the added API, and the issue
is reported by lining.

Ming Lei (2):
  block: add resubmit_bio_noacct()
  dm: use resubmit_bio_noacct to submit split bio

 block/blk-core.c       | 12 ++++++++++++
 block/blk-merge.c      |  4 +---
 drivers/md/dm.c        |  2 +-
 include/linux/blkdev.h |  1 +
 4 files changed, 15 insertions(+), 4 deletions(-)

Cc: lining <lining2020x@163.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
-- 
2.31.1

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] block: add resubmit_bio_noacct()
  2022-01-10  7:51 ` [dm-devel] " Ming Lei
@ 2022-01-10  7:51   ` Ming Lei
  -1 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-10  7:51 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer
  Cc: linux-block, dm-devel, Ming Lei, lining, Tejun Heo, Chunguang Xu

Add block layer API of resubmit_bio_noacct() for handling blk-throttle
iops limit correctly. Typical use case is that bio split, and it isn't
good to export blk_throtl_charge_bio_split() for drivers, so add new API
for serving such purpose.

Cc: lining <lining2020x@163.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c       | 12 ++++++++++++
 block/blk-merge.c      |  4 +---
 include/linux/blkdev.h |  1 +
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fd029c86d6ac..733fec7dc5d6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -910,6 +910,18 @@ void submit_bio_noacct(struct bio *bio)
 }
 EXPORT_SYMBOL(submit_bio_noacct);
 
+/*
+ * Usually for submitting one bio which has been checked by
+ * submit_bio_checks already. The typical use case is for handling
+ * blk-throttle iops limit correctly.
+ */
+void resubmit_bio_noacct(struct bio *bio)
+{
+	submit_bio_noacct(bio);
+	blk_throtl_charge_bio_split(bio);
+}
+EXPORT_SYMBOL(resubmit_bio_noacct);
+
 /**
  * submit_bio - submit a bio to the block device layer for I/O
  * @bio: The &struct bio which describes the I/O
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 4de34a332c9f..acc786d872e6 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -366,10 +366,8 @@ void __blk_queue_split(struct request_queue *q, struct bio **bio,
 
 		bio_chain(split, *bio);
 		trace_block_split(split, (*bio)->bi_iter.bi_sector);
-		submit_bio_noacct(*bio);
+		resubmit_bio_noacct(*bio);
 		*bio = split;
-
-		blk_throtl_charge_bio_split(*bio);
 	}
 }
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 22746b2d6825..cce2db9fae1f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -600,6 +600,7 @@ static inline unsigned int blk_queue_depth(struct request_queue *q)
 extern int blk_register_queue(struct gendisk *disk);
 extern void blk_unregister_queue(struct gendisk *disk);
 void submit_bio_noacct(struct bio *bio);
+void resubmit_bio_noacct(struct bio *bio);
 
 extern int blk_lld_busy(struct request_queue *q);
 extern void blk_queue_split(struct bio **);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
@ 2022-01-10  7:51   ` Ming Lei
  0 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-10  7:51 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer
  Cc: Ming Lei, linux-block, dm-devel, lining, Tejun Heo, Chunguang Xu

Add block layer API of resubmit_bio_noacct() for handling blk-throttle
iops limit correctly. Typical use case is that bio split, and it isn't
good to export blk_throtl_charge_bio_split() for drivers, so add new API
for serving such purpose.

Cc: lining <lining2020x@163.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c       | 12 ++++++++++++
 block/blk-merge.c      |  4 +---
 include/linux/blkdev.h |  1 +
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fd029c86d6ac..733fec7dc5d6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -910,6 +910,18 @@ void submit_bio_noacct(struct bio *bio)
 }
 EXPORT_SYMBOL(submit_bio_noacct);
 
+/*
+ * Usually for submitting one bio which has been checked by
+ * submit_bio_checks already. The typical use case is for handling
+ * blk-throttle iops limit correctly.
+ */
+void resubmit_bio_noacct(struct bio *bio)
+{
+	submit_bio_noacct(bio);
+	blk_throtl_charge_bio_split(bio);
+}
+EXPORT_SYMBOL(resubmit_bio_noacct);
+
 /**
  * submit_bio - submit a bio to the block device layer for I/O
  * @bio: The &struct bio which describes the I/O
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 4de34a332c9f..acc786d872e6 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -366,10 +366,8 @@ void __blk_queue_split(struct request_queue *q, struct bio **bio,
 
 		bio_chain(split, *bio);
 		trace_block_split(split, (*bio)->bi_iter.bi_sector);
-		submit_bio_noacct(*bio);
+		resubmit_bio_noacct(*bio);
 		*bio = split;
-
-		blk_throtl_charge_bio_split(*bio);
 	}
 }
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 22746b2d6825..cce2db9fae1f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -600,6 +600,7 @@ static inline unsigned int blk_queue_depth(struct request_queue *q)
 extern int blk_register_queue(struct gendisk *disk);
 extern void blk_unregister_queue(struct gendisk *disk);
 void submit_bio_noacct(struct bio *bio);
+void resubmit_bio_noacct(struct bio *bio);
 
 extern int blk_lld_busy(struct request_queue *q);
 extern void blk_queue_split(struct bio **);
-- 
2.31.1

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] dm: use resubmit_bio_noacct to submit split bio
  2022-01-10  7:51 ` [dm-devel] " Ming Lei
@ 2022-01-10  7:51   ` Ming Lei
  -1 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-10  7:51 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer
  Cc: linux-block, dm-devel, Ming Lei, lining, Tejun Heo, Chunguang Xu

lining reported that blk-throttle iops limit doesn't work correctly
for dm-thin. Turns out it is same issue with the one addressed by commit
4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").

So use the new added block layer API for addressing the same issue.

Reported-by: lining <lining2020x@163.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/md/dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 280918cdcabd..8a58379e737c 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1562,7 +1562,7 @@ static void __split_and_process_bio(struct mapped_device *md,
 
 			bio_chain(b, bio);
 			trace_block_split(b, bio->bi_iter.bi_sector);
-			submit_bio_noacct(bio);
+			resubmit_bio_noacct(bio);
 		}
 	}
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [dm-devel] [PATCH 2/2] dm: use resubmit_bio_noacct to submit split bio
@ 2022-01-10  7:51   ` Ming Lei
  0 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-10  7:51 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer
  Cc: Ming Lei, linux-block, dm-devel, lining, Tejun Heo, Chunguang Xu

lining reported that blk-throttle iops limit doesn't work correctly
for dm-thin. Turns out it is same issue with the one addressed by commit
4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").

So use the new added block layer API for addressing the same issue.

Reported-by: lining <lining2020x@163.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/md/dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 280918cdcabd..8a58379e737c 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1562,7 +1562,7 @@ static void __split_and_process_bio(struct mapped_device *md,
 
 			bio_chain(b, bio);
 			trace_block_split(b, bio->bi_iter.bi_sector);
-			submit_bio_noacct(bio);
+			resubmit_bio_noacct(bio);
 		}
 	}
 
-- 
2.31.1

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] block: add resubmit_bio_noacct()
  2022-01-10  7:51   ` [dm-devel] " Ming Lei
@ 2022-01-10 17:35     ` Christoph Hellwig
  -1 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2022-01-10 17:35 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Mike Snitzer, linux-block, dm-devel, lining,
	Tejun Heo, Chunguang Xu

On Mon, Jan 10, 2022 at 03:51:40PM +0800, Ming Lei wrote:
> Add block layer API of resubmit_bio_noacct() for handling blk-throttle
> iops limit correctly. Typical use case is that bio split, and it isn't
> good to export blk_throtl_charge_bio_split() for drivers, so add new API
> for serving such purpose.

Umm, submit_bio_noacct is meant exactly for this case of resubmitting
a bio.  We should not need another API for that.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
@ 2022-01-10 17:35     ` Christoph Hellwig
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2022-01-10 17:35 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Mike Snitzer, linux-block, dm-devel, lining,
	Tejun Heo, Chunguang Xu

On Mon, Jan 10, 2022 at 03:51:40PM +0800, Ming Lei wrote:
> Add block layer API of resubmit_bio_noacct() for handling blk-throttle
> iops limit correctly. Typical use case is that bio split, and it isn't
> good to export blk_throtl_charge_bio_split() for drivers, so add new API
> for serving such purpose.

Umm, submit_bio_noacct is meant exactly for this case of resubmitting
a bio.  We should not need another API for that.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] block: add resubmit_bio_noacct()
  2022-01-10 17:35     ` [dm-devel] " Christoph Hellwig
@ 2022-01-10 19:03       ` Mike Snitzer
  -1 siblings, 0 replies; 18+ messages in thread
From: Mike Snitzer @ 2022-01-10 19:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ming Lei, Jens Axboe, linux-block, dm-devel, lining, Tejun Heo,
	Chunguang Xu

On Mon, Jan 10 2022 at 12:35P -0500,
Christoph Hellwig <hch@infradead.org> wrote:

> On Mon, Jan 10, 2022 at 03:51:40PM +0800, Ming Lei wrote:
> > Add block layer API of resubmit_bio_noacct() for handling blk-throttle
> > iops limit correctly. Typical use case is that bio split, and it isn't
> > good to export blk_throtl_charge_bio_split() for drivers, so add new API
> > for serving such purpose.
> 
> Umm, submit_bio_noacct is meant exactly for this case of resubmitting
> a bio.  We should not need another API for that.
> 

Ming is lifting code out of __blk_queue_split() for reuse (by DM in
this instance, because it has its own bio_split+bio_chain).

Are you saying submit_bio_noacct() should be made to call
blk_throtl_charge_bio_split() and blk_throtl_charge_bio_split() simply
return if not a split bio? (not sure bio has enough context to know,
other than looking at some side-effect change from bio_chain)

But Ming: your __blk_queue_split() change seems wrong.
Prior to your patch __blk_queue_split() did:

bio_chain(split, *bio);
submit_bio_noacct(*bio);
*bio = split;
blk_throtl_charge_bio_split(*bio);

After your patch (effectively):

bio_chain(split, *bio);
submit_bio_noacct(*bio);
blk_throtl_charge_bio_split(bio);
*bio = split;

Maybe that was intended? (or maybe it doesn't matter because bio_split
copies fields with bio_clone_fast())?  Regardless, it is subtle.

Should blk_throtl_charge_bio_split() just be pushed down to
bio_split()?

In general, such narrow hacks for how to properly resubmit split bios
are asking for further trouble.  As is, I'm having to triage new
reports of bio-based accounting issues (which has called into question
my hack/fix commit a1e1cb72d9649 ("dm: fix redundant IO accounting for
bios that need splitting") that papered over this bigger issue of
needing proper split IO accounting, so likely needs to be revisited).

We also have the much bigger issue of IO poll support (or
lack-there-of) for split bios.

Mike


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
@ 2022-01-10 19:03       ` Mike Snitzer
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Snitzer @ 2022-01-10 19:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Ming Lei, linux-block, dm-devel, lining, Tejun Heo,
	Chunguang Xu

On Mon, Jan 10 2022 at 12:35P -0500,
Christoph Hellwig <hch@infradead.org> wrote:

> On Mon, Jan 10, 2022 at 03:51:40PM +0800, Ming Lei wrote:
> > Add block layer API of resubmit_bio_noacct() for handling blk-throttle
> > iops limit correctly. Typical use case is that bio split, and it isn't
> > good to export blk_throtl_charge_bio_split() for drivers, so add new API
> > for serving such purpose.
> 
> Umm, submit_bio_noacct is meant exactly for this case of resubmitting
> a bio.  We should not need another API for that.
> 

Ming is lifting code out of __blk_queue_split() for reuse (by DM in
this instance, because it has its own bio_split+bio_chain).

Are you saying submit_bio_noacct() should be made to call
blk_throtl_charge_bio_split() and blk_throtl_charge_bio_split() simply
return if not a split bio? (not sure bio has enough context to know,
other than looking at some side-effect change from bio_chain)

But Ming: your __blk_queue_split() change seems wrong.
Prior to your patch __blk_queue_split() did:

bio_chain(split, *bio);
submit_bio_noacct(*bio);
*bio = split;
blk_throtl_charge_bio_split(*bio);

After your patch (effectively):

bio_chain(split, *bio);
submit_bio_noacct(*bio);
blk_throtl_charge_bio_split(bio);
*bio = split;

Maybe that was intended? (or maybe it doesn't matter because bio_split
copies fields with bio_clone_fast())?  Regardless, it is subtle.

Should blk_throtl_charge_bio_split() just be pushed down to
bio_split()?

In general, such narrow hacks for how to properly resubmit split bios
are asking for further trouble.  As is, I'm having to triage new
reports of bio-based accounting issues (which has called into question
my hack/fix commit a1e1cb72d9649 ("dm: fix redundant IO accounting for
bios that need splitting") that papered over this bigger issue of
needing proper split IO accounting, so likely needs to be revisited).

We also have the much bigger issue of IO poll support (or
lack-there-of) for split bios.

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] block: add resubmit_bio_noacct()
  2022-01-10 17:35     ` [dm-devel] " Christoph Hellwig
@ 2022-01-11  1:56       ` Ming Lei
  -1 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-11  1:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Mike Snitzer, linux-block, dm-devel, lining,
	Tejun Heo, Chunguang Xu

On Mon, Jan 10, 2022 at 09:35:22AM -0800, Christoph Hellwig wrote:
> On Mon, Jan 10, 2022 at 03:51:40PM +0800, Ming Lei wrote:
> > Add block layer API of resubmit_bio_noacct() for handling blk-throttle
> > iops limit correctly. Typical use case is that bio split, and it isn't
> > good to export blk_throtl_charge_bio_split() for drivers, so add new API
> > for serving such purpose.
> 
> Umm, submit_bio_noacct is meant exactly for this case of resubmitting
> a bio.  We should not need another API for that.

Follows the background of the issue first:

1) IOPS limit throttle needs to cover split bio since iostat accounts
split bio actually, and user space setup iops limit throttle by the
feedback from iostat/diskstat;

2) block throttle is block layer internal stuff, and we shouldn't expose
blk_throtl_charge_bio_split() to driver.

Maybe rename the new API as submit_split_bio_noacct(), but we can't
reuse submit_bio_noacct() simply, otherwise blk_throtl_charge_bio_split()
needs to be exported.

Another ides is to clearing BIO_THROTTLED before calling submit_bio_noacct(),
meantime blk-throttle code needs to change to avoid double accounting of bio
bytes, so the caller of submit_bio_noacct() still needs some change.
This way can get smooth IOPS throttle, but needs to call __blk_throtl_bio
for split bio one more time.

Or other idea for this bio split vs. iops limit issue?


Thanks,
Ming


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
@ 2022-01-11  1:56       ` Ming Lei
  0 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-11  1:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Mike Snitzer, linux-block, dm-devel, lining,
	Tejun Heo, Chunguang Xu

On Mon, Jan 10, 2022 at 09:35:22AM -0800, Christoph Hellwig wrote:
> On Mon, Jan 10, 2022 at 03:51:40PM +0800, Ming Lei wrote:
> > Add block layer API of resubmit_bio_noacct() for handling blk-throttle
> > iops limit correctly. Typical use case is that bio split, and it isn't
> > good to export blk_throtl_charge_bio_split() for drivers, so add new API
> > for serving such purpose.
> 
> Umm, submit_bio_noacct is meant exactly for this case of resubmitting
> a bio.  We should not need another API for that.

Follows the background of the issue first:

1) IOPS limit throttle needs to cover split bio since iostat accounts
split bio actually, and user space setup iops limit throttle by the
feedback from iostat/diskstat;

2) block throttle is block layer internal stuff, and we shouldn't expose
blk_throtl_charge_bio_split() to driver.

Maybe rename the new API as submit_split_bio_noacct(), but we can't
reuse submit_bio_noacct() simply, otherwise blk_throtl_charge_bio_split()
needs to be exported.

Another ides is to clearing BIO_THROTTLED before calling submit_bio_noacct(),
meantime blk-throttle code needs to change to avoid double accounting of bio
bytes, so the caller of submit_bio_noacct() still needs some change.
This way can get smooth IOPS throttle, but needs to call __blk_throtl_bio
for split bio one more time.

Or other idea for this bio split vs. iops limit issue?


Thanks,
Ming

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] block: add resubmit_bio_noacct()
  2022-01-10 19:03       ` [dm-devel] " Mike Snitzer
@ 2022-01-11  2:26         ` Ming Lei
  -1 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-11  2:26 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Christoph Hellwig, Jens Axboe, linux-block, dm-devel, lining,
	Tejun Heo, Chunguang Xu

On Mon, Jan 10, 2022 at 02:03:16PM -0500, Mike Snitzer wrote:
> On Mon, Jan 10 2022 at 12:35P -0500,
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Mon, Jan 10, 2022 at 03:51:40PM +0800, Ming Lei wrote:
> > > Add block layer API of resubmit_bio_noacct() for handling blk-throttle
> > > iops limit correctly. Typical use case is that bio split, and it isn't
> > > good to export blk_throtl_charge_bio_split() for drivers, so add new API
> > > for serving such purpose.
> > 
> > Umm, submit_bio_noacct is meant exactly for this case of resubmitting
> > a bio.  We should not need another API for that.
> > 
> 
> Ming is lifting code out of __blk_queue_split() for reuse (by DM in
> this instance, because it has its own bio_split+bio_chain).
> 
> Are you saying submit_bio_noacct() should be made to call
> blk_throtl_charge_bio_split() and blk_throtl_charge_bio_split() simply
> return if not a split bio? (not sure bio has enough context to know,

DM or MD may submit split bio to underlying queue directly, so we can't
do that simply. Also some FS may call bio_split() too.

Or we simply let blk_throtl_bio cover everything? That is basically
what V1 did, and the only issue is that we can't run the account
in case that submit_bio_noacct() is called from blk_throtl_dispatch_work_fn().

> other than looking at some side-effect change from bio_chain)
> 
> But Ming: your __blk_queue_split() change seems wrong.
> Prior to your patch __blk_queue_split() did:
> 
> bio_chain(split, *bio);
> submit_bio_noacct(*bio);
> *bio = split;
> blk_throtl_charge_bio_split(*bio);
> 
> After your patch (effectively):
> 
> bio_chain(split, *bio);
> submit_bio_noacct(*bio);
> blk_throtl_charge_bio_split(bio);
> *bio = split;
> 
> Maybe that was intended? (or maybe it doesn't matter because bio_split
> copies fields with bio_clone_fast())?  Regardless, it is subtle.

It doesn't matter, blk_throtl_charge_bio_split() just accounts bio
number, here the split bio is submitted to the same queue.

> 
> Should blk_throtl_charge_bio_split() just be pushed down to
> bio_split()?

It is fragile, will the bio allocated from bio_split() always
submitted finally? or submitted to same queue?

> 
> In general, such narrow hacks for how to properly resubmit split bios
> are asking for further trouble.

I don't think it is hacks, it is one approach which has been verified as
workable in blk-mq.

> As is, I'm having to triage new
> reports of bio-based accounting issues (which has called into question
> my hack/fix commit a1e1cb72d9649 ("dm: fix redundant IO accounting for
> bios that need splitting") that papered over this bigger issue of
> needing proper split IO accounting, so likely needs to be revisited).

Here the issue is just about bio number accounting.

BTW, maybe you can follow blk-mq's way: just account after io is split,
such as, move start_io_acct() to the end of __split_and_process_non_flush(),
then you can just account io start once.


Thanks, 
Ming


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
@ 2022-01-11  2:26         ` Ming Lei
  0 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2022-01-11  2:26 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Jens Axboe, linux-block, Christoph Hellwig, dm-devel, lining,
	Tejun Heo, Chunguang Xu

On Mon, Jan 10, 2022 at 02:03:16PM -0500, Mike Snitzer wrote:
> On Mon, Jan 10 2022 at 12:35P -0500,
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Mon, Jan 10, 2022 at 03:51:40PM +0800, Ming Lei wrote:
> > > Add block layer API of resubmit_bio_noacct() for handling blk-throttle
> > > iops limit correctly. Typical use case is that bio split, and it isn't
> > > good to export blk_throtl_charge_bio_split() for drivers, so add new API
> > > for serving such purpose.
> > 
> > Umm, submit_bio_noacct is meant exactly for this case of resubmitting
> > a bio.  We should not need another API for that.
> > 
> 
> Ming is lifting code out of __blk_queue_split() for reuse (by DM in
> this instance, because it has its own bio_split+bio_chain).
> 
> Are you saying submit_bio_noacct() should be made to call
> blk_throtl_charge_bio_split() and blk_throtl_charge_bio_split() simply
> return if not a split bio? (not sure bio has enough context to know,

DM or MD may submit split bio to underlying queue directly, so we can't
do that simply. Also some FS may call bio_split() too.

Or we simply let blk_throtl_bio cover everything? That is basically
what V1 did, and the only issue is that we can't run the account
in case that submit_bio_noacct() is called from blk_throtl_dispatch_work_fn().

> other than looking at some side-effect change from bio_chain)
> 
> But Ming: your __blk_queue_split() change seems wrong.
> Prior to your patch __blk_queue_split() did:
> 
> bio_chain(split, *bio);
> submit_bio_noacct(*bio);
> *bio = split;
> blk_throtl_charge_bio_split(*bio);
> 
> After your patch (effectively):
> 
> bio_chain(split, *bio);
> submit_bio_noacct(*bio);
> blk_throtl_charge_bio_split(bio);
> *bio = split;
> 
> Maybe that was intended? (or maybe it doesn't matter because bio_split
> copies fields with bio_clone_fast())?  Regardless, it is subtle.

It doesn't matter, blk_throtl_charge_bio_split() just accounts bio
number, here the split bio is submitted to the same queue.

> 
> Should blk_throtl_charge_bio_split() just be pushed down to
> bio_split()?

It is fragile, will the bio allocated from bio_split() always
submitted finally? or submitted to same queue?

> 
> In general, such narrow hacks for how to properly resubmit split bios
> are asking for further trouble.

I don't think it is hacks, it is one approach which has been verified as
workable in blk-mq.

> As is, I'm having to triage new
> reports of bio-based accounting issues (which has called into question
> my hack/fix commit a1e1cb72d9649 ("dm: fix redundant IO accounting for
> bios that need splitting") that papered over this bigger issue of
> needing proper split IO accounting, so likely needs to be revisited).

Here the issue is just about bio number accounting.

BTW, maybe you can follow blk-mq's way: just account after io is split,
such as, move start_io_acct() to the end of __split_and_process_non_flush(),
then you can just account io start once.


Thanks, 
Ming

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
  2022-01-10 19:03       ` [dm-devel] " Mike Snitzer
@ 2022-01-11  8:36         ` Christoph Hellwig
  -1 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2022-01-11  8:36 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Christoph Hellwig, Jens Axboe, Ming Lei, linux-block, dm-devel,
	lining, Tejun Heo, Chunguang Xu

On Mon, Jan 10, 2022 at 02:03:16PM -0500, Mike Snitzer wrote:
> Ming is lifting code out of __blk_queue_split() for reuse (by DM in
> this instance, because it has its own bio_split+bio_chain).
> 
> Are you saying submit_bio_noacct() should be made to call
> blk_throtl_charge_bio_split() and blk_throtl_charge_bio_split() simply
> return if not a split bio? (not sure bio has enough context to know,
> other than looking at some side-effect change from bio_chain)

Yes.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
@ 2022-01-11  8:36         ` Christoph Hellwig
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2022-01-11  8:36 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Jens Axboe, linux-block, Ming Lei, Christoph Hellwig, dm-devel,
	lining, Tejun Heo, Chunguang Xu

On Mon, Jan 10, 2022 at 02:03:16PM -0500, Mike Snitzer wrote:
> Ming is lifting code out of __blk_queue_split() for reuse (by DM in
> this instance, because it has its own bio_split+bio_chain).
> 
> Are you saying submit_bio_noacct() should be made to call
> blk_throtl_charge_bio_split() and blk_throtl_charge_bio_split() simply
> return if not a split bio? (not sure bio has enough context to know,
> other than looking at some side-effect change from bio_chain)

Yes.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
  2022-01-11  1:56       ` [dm-devel] " Ming Lei
@ 2022-01-11  8:42         ` Christoph Hellwig
  -1 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2022-01-11  8:42 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, Mike Snitzer, linux-block,
	dm-devel, lining, Tejun Heo, Chunguang Xu

On Tue, Jan 11, 2022 at 09:56:36AM +0800, Ming Lei wrote:
> 2) block throttle is block layer internal stuff, and we shouldn't expose
> blk_throtl_charge_bio_split() to driver.
> 
> Maybe rename the new API as submit_split_bio_noacct(), but we can't
> reuse submit_bio_noacct() simply, otherwise blk_throtl_charge_bio_split()
> needs to be exported.

submit_bio_noacct should only ever be used for resubmitting bios that
came up from the upper layer, although they might not always be "split".
blk_throtl_charge_bio_split

> Another ides is to clearing BIO_THROTTLED before calling submit_bio_noacct(),
> meantime blk-throttle code needs to change to avoid double accounting of bio
> bytes, so the caller of submit_bio_noacct() still needs some change.
> This way can get smooth IOPS throttle, but needs to call __blk_throtl_bio
> for split bio one more time.
> 
> Or other idea for this bio split vs. iops limit issue?

Well, if you want a helper specificly for splits, add one that actally
specifically handles splits and makes the callers life easier, something
like:

void bio_submit_splice(struct bio *split, struct bio *orig)
{
	split->bi_opf |= REQ_NOMERGE;
	trace_block_split(split, orig->bi_iter.bi_sector);
	submit_bio_noacct(orig);
	blk_throtl_charge_bio_split(orig);
}

including a proper kerneldoc comment.

But I still fail to grasp how a split is so different from just
resubmitting a not split bio.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dm-devel] [PATCH 1/2] block: add resubmit_bio_noacct()
@ 2022-01-11  8:42         ` Christoph Hellwig
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2022-01-11  8:42 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Mike Snitzer, Christoph Hellwig,
	dm-devel, lining, Tejun Heo, Chunguang Xu

On Tue, Jan 11, 2022 at 09:56:36AM +0800, Ming Lei wrote:
> 2) block throttle is block layer internal stuff, and we shouldn't expose
> blk_throtl_charge_bio_split() to driver.
> 
> Maybe rename the new API as submit_split_bio_noacct(), but we can't
> reuse submit_bio_noacct() simply, otherwise blk_throtl_charge_bio_split()
> needs to be exported.

submit_bio_noacct should only ever be used for resubmitting bios that
came up from the upper layer, although they might not always be "split".
blk_throtl_charge_bio_split

> Another ides is to clearing BIO_THROTTLED before calling submit_bio_noacct(),
> meantime blk-throttle code needs to change to avoid double accounting of bio
> bytes, so the caller of submit_bio_noacct() still needs some change.
> This way can get smooth IOPS throttle, but needs to call __blk_throtl_bio
> for split bio one more time.
> 
> Or other idea for this bio split vs. iops limit issue?

Well, if you want a helper specificly for splits, add one that actally
specifically handles splits and makes the callers life easier, something
like:

void bio_submit_splice(struct bio *split, struct bio *orig)
{
	split->bi_opf |= REQ_NOMERGE;
	trace_block_split(split, orig->bi_iter.bi_sector);
	submit_bio_noacct(orig);
	blk_throtl_charge_bio_split(orig);
}

including a proper kerneldoc comment.

But I still fail to grasp how a split is so different from just
resubmitting a not split bio.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-01-11  8:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-10  7:51 [PATCH 0/2] block/dm: add resubmit_bio_noacct for fixing iops throttling Ming Lei
2022-01-10  7:51 ` [dm-devel] " Ming Lei
2022-01-10  7:51 ` [PATCH 1/2] block: add resubmit_bio_noacct() Ming Lei
2022-01-10  7:51   ` [dm-devel] " Ming Lei
2022-01-10 17:35   ` Christoph Hellwig
2022-01-10 17:35     ` [dm-devel] " Christoph Hellwig
2022-01-10 19:03     ` Mike Snitzer
2022-01-10 19:03       ` [dm-devel] " Mike Snitzer
2022-01-11  2:26       ` Ming Lei
2022-01-11  2:26         ` [dm-devel] " Ming Lei
2022-01-11  8:36       ` Christoph Hellwig
2022-01-11  8:36         ` Christoph Hellwig
2022-01-11  1:56     ` Ming Lei
2022-01-11  1:56       ` [dm-devel] " Ming Lei
2022-01-11  8:42       ` Christoph Hellwig
2022-01-11  8:42         ` Christoph Hellwig
2022-01-10  7:51 ` [PATCH 2/2] dm: use resubmit_bio_noacct to submit split bio Ming Lei
2022-01-10  7:51   ` [dm-devel] " Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.