* [PATCH 0/4] dm: fix various issues with bio splitting code
@ 2019-01-19 18:05 ` Mike Snitzer
  0 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2019-01-19 18:05 UTC (permalink / raw)
  To: dm-devel; +Cc: NeilBrown, Ming Lei, axboe, linux-block

Hi,

These mostly stable@ patches will be sent to Linus for 5.0-rc4
inclusion (would've sent today for 5.0-rc3 but I don't yet have
linux-next coverage).

Just wanted to give a heads up because some of the problems fixed
could bite other users of the bio_split()+generic_make_request()
recursive pattern.  We should probably factor out some helpers that
all users could share (e.g. so they don't forget to set
BIO_QUEUE_ENTERED or call trace_block_split(), etc).

Thanks,
Mike

Mike Snitzer (4):
  dm: fix clone_bio() to trigger blk_recount_segments()
  dm: fix redundant IO accounting for bios that need splitting
  dm: fix missing bio_split() pattern code in __split_and_process_bio()
  dm: fix dm_wq_work() to only use __split_and_process_bio() if appropriate

 drivers/md/dm.c | 44 ++++++++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 12 deletions(-)

-- 
2.15.0


* [PATCH 1/4] dm: fix clone_bio() to trigger blk_recount_segments()
  2019-01-19 18:05 ` Mike Snitzer
@ 2019-01-19 18:05   ` Mike Snitzer
  -1 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2019-01-19 18:05 UTC (permalink / raw)
  To: dm-devel; +Cc: NeilBrown, Ming Lei, axboe, linux-block

Convert DM's clone_bio() to use bio_trim().  This fixes the fact that
clone_bio() wasn't clearing BIO_SEG_VALID like bio_trim() does; clearing
that flag is what triggers blk_recount_segments() via bio_phys_segments().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index d67c95ef8d7e..fcb97b0a5743 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1320,7 +1320,7 @@ static int clone_bio(struct dm_target_io *tio, struct bio *bio,
 
 	__bio_clone_fast(clone, bio);
 
-	if (unlikely(bio_integrity(bio) != NULL)) {
+	if (bio_integrity(bio)) {
 		int r;
 
 		if (unlikely(!dm_target_has_integrity(tio->ti->type) &&
@@ -1336,11 +1336,7 @@ static int clone_bio(struct dm_target_io *tio, struct bio *bio,
 			return r;
 	}
 
-	bio_advance(clone, to_bytes(sector - clone->bi_iter.bi_sector));
-	clone->bi_iter.bi_size = to_bytes(len);
-
-	if (unlikely(bio_integrity(bio) != NULL))
-		bio_integrity_trim(clone);
+	bio_trim(clone, sector - clone->bi_iter.bi_sector, len);
 
 	return 0;
 }
-- 
2.15.0
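
[Editor's illustration, not part of the patch.]  The effect described
above can be sketched in user space.  This is a toy model only:
struct toy_clone, toy_bio_trim() and the seg_valid field are
hypothetical stand-ins for the clone bio, bio_trim() and BIO_SEG_VALID,
and the early return for an unchanged range is an assumption about how
a trim helper of this kind would behave.  The point it shows is that
trimming both narrows the clone and invalidates the cached segment
count, which the open-coded bio_advance() + bi_size assignment did not.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SECTOR_SHIFT 9	/* 512-byte sectors */

/* Hypothetical stand-in for the fields of a cloned bio that matter here. */
struct toy_clone {
	unsigned long long sector;	/* start sector of the clone */
	unsigned int size_bytes;	/* remaining payload in bytes */
	bool seg_valid;			/* models BIO_SEG_VALID */
};

/*
 * Trim the clone to len_sectors starting offset_sectors into it.
 * Whenever the range actually changes, the cached segment count is
 * marked invalid so that it would be recomputed on next use.
 */
static void toy_bio_trim(struct toy_clone *c, unsigned int offset_sectors,
			 unsigned int len_sectors)
{
	unsigned int size_bytes = len_sectors << TOY_SECTOR_SHIFT;

	/* The requested window must lie inside the clone. */
	assert(offset_sectors + len_sectors <=
	       (c->size_bytes >> TOY_SECTOR_SHIFT));

	/* Assumption: nothing to do if the range is unchanged. */
	if (offset_sectors == 0 && size_bytes == c->size_bytes)
		return;

	c->seg_valid = false;		/* force a segment recount */
	c->sector += offset_sectors;	/* advance to the new start */
	c->size_bytes = size_bytes;	/* clamp to the new length */
}
```

In this model, trimming a 128-sector clone to the 32 sectors starting
at offset 64 leaves it at sector 64 with seg_valid cleared, while a
trim that covers the whole clone leaves seg_valid untouched.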


* [PATCH 2/4] dm: fix redundant IO accounting for bios that need splitting
  2019-01-19 18:05 ` Mike Snitzer
@ 2019-01-19 18:05   ` Mike Snitzer
  -1 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2019-01-19 18:05 UTC (permalink / raw)
  To: dm-devel; +Cc: NeilBrown, Ming Lei, axboe, linux-block

The risk of redundant IO accounting was not taken into consideration
when commit 18a25da84354 ("dm: ensure bio submission follows a
depth-first tree walk") introduced IO splitting in terms of recursion
via generic_make_request().

Fix this by subtracting the split bio's payload from the IO stats that
were already accounted for by start_io_acct() upon dm_make_request()
entry.  This repeated oscillation of the IO accounting, up then down,
isn't ideal, but refactoring DM core's IO splitting to pre-split bios
_before_ they are accounted turned out to be an excessive amount of
change that will need a full development cycle to refine and verify.

Before this fix:

  /dev/mapper/stripe_dev is a 4-way stripe using a 32k chunksize, so
  bios are split on 32k boundaries.

  # fio --name=16M --filename=/dev/mapper/stripe_dev --rw=write --bs=64k --size=16M \
    	--iodepth=1 --ioengine=libaio --direct=1 --refill_buffers

  with debugging added:
  [103898.310264] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=0 len=128
  [103898.318704] device-mapper: core: __split_and_process_bio: recursing for following split bio:
  [103898.329136] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=64 len=64
  ...

  16M written yet 136M (278528 * 512b) accounted:
  # cat /sys/block/dm-2/stat | awk '{ print $7 }'
  278528

After this fix:

  16M written and 16M (32768 * 512b) accounted:
  # cat /sys/block/dm-2/stat | awk '{ print $7 }'
  32768

Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.16+
Reported-by: Bryan Gurney <bgurney@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index fcb97b0a5743..fbadda68e23b 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1584,6 +1584,9 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
 	ci->sector = bio->bi_iter.bi_sector;
 }
 
+#define __dm_part_stat_sub(part, field, subnd)	\
+	(part_stat_get(part, field) -= (subnd))
+
 /*
  * Entry point to split a bio into clones and submit them to the targets.
  */
@@ -1638,6 +1641,19 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 				struct bio *b = bio_split(bio, bio_sectors(bio) - ci.sector_count,
 							  GFP_NOIO, &md->queue->bio_split);
 				ci.io->orig_bio = b;
+
+				/*
+				 * Adjust IO stats for each split, otherwise upon queue
+				 * reentry there will be redundant IO accounting.
+				 * NOTE: this is a stop-gap fix, a proper fix involves
+				 * significant refactoring of DM core's bio splitting
+				 * (by eliminating DM's splitting and just using bio_split)
+				 */
+				part_stat_lock();
+				__dm_part_stat_sub(&dm_disk(md)->part0,
+						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
+				part_stat_unlock();
+
 				bio_chain(b, bio);
 				ret = generic_make_request(bio);
 				break;
-- 
2.15.0
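
[Editor's illustration, not part of the patch.]  The up-then-down
accounting described above can be sketched as a small user-space
model.  Everything here is hypothetical: toy_accounted_sectors()
assumes a bio that starts on a chunk boundary, that each (re)entry
accounts the whole bio it sees (as start_io_acct() does in the debug
output above), and that the remainder after the first chunk is
resubmitted.  It does not attempt to reproduce the exact 278528-sector
figure reported above, which depends on how the bios were actually
built and split; toy_sectors_to_mib() only restates the
"sectors * 512b" conversion used in the commit message.

```c
#include <assert.h>

/*
 * Sectors accounted for one bio of nr_sectors that is split on
 * chunk_sectors boundaries.  With apply_fix set, the remainder is
 * subtracted before it is resubmitted, mirroring the stop-gap fix.
 */
static unsigned long long toy_accounted_sectors(unsigned long nr_sectors,
						 unsigned long chunk_sectors,
						 int apply_fix)
{
	unsigned long long stat = 0;
	unsigned long remaining = nr_sectors;

	assert(chunk_sectors > 0);

	while (remaining > 0) {
		/* Entry: the whole bio seen on this pass is accounted. */
		stat += remaining;

		if (remaining <= chunk_sectors)
			break;	/* fits in one chunk, no split needed */

		/* The front chunk is mapped; the rest will be resubmitted. */
		remaining -= chunk_sectors;

		/* Fix: take the remainder back out before re-entry. */
		if (apply_fix)
			stat -= remaining;
	}

	return stat;
}

/* Convert a count of 512-byte sectors to whole MiB. */
static unsigned long long toy_sectors_to_mib(unsigned long long sectors)
{
	return sectors * 512ULL / (1024ULL * 1024ULL);
}
```

For the single bio in the debug output (len=128, split at 64 sectors)
the model accounts 128 + 64 sectors without the subtraction and 128
with it.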


* [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-19 18:05 ` Mike Snitzer
@ 2019-01-19 18:05   ` Mike Snitzer
  -1 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2019-01-19 18:05 UTC (permalink / raw)
  To: dm-devel; +Cc: NeilBrown, Ming Lei, axboe, linux-block

Use the same BIO_QUEUE_ENTERED pattern that was established by commit
cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
recursing via generic_make_request().

Also add trace_block_split() because it provides useful context about
bio splits in blktrace.

Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.16+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index fbadda68e23b..6e29c2d99b99 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
 				part_stat_unlock();
 
+				bio_set_flag(bio, BIO_QUEUE_ENTERED);
 				bio_chain(b, bio);
+				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
 				ret = generic_make_request(bio);
 				break;
 			}
-- 
2.15.0
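
[Editor's illustration, not part of the patch.]  The ordering this
patch relies on (split, flag the remainder, chain, then resubmit) is
the kind of thing a shared helper could enforce.  Below is a
user-space sketch under stated assumptions: struct toy_bio,
TOY_QUEUE_ENTERED, toy_bio_split() and toy_split_for_resubmit() are
all hypothetical names that only model bio_split(), bio_set_flag() and
bio_chain(); tracing and the actual resubmission are left out.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_ENTERED (1u << 0)	/* models BIO_QUEUE_ENTERED */

/* Hypothetical, heavily reduced stand-in for struct bio. */
struct toy_bio {
	unsigned long sector;		/* start sector */
	unsigned long nr_sectors;	/* length in sectors */
	unsigned int flags;
	struct toy_bio *parent;		/* set when chained to another bio */
};

/* Carve front_sectors off the front of bio; bio keeps the remainder. */
static void toy_bio_split(struct toy_bio *bio, unsigned long front_sectors,
			  struct toy_bio *front)
{
	assert(front_sectors > 0 && front_sectors < bio->nr_sectors);

	front->sector = bio->sector;
	front->nr_sectors = front_sectors;
	front->flags = 0;
	front->parent = NULL;

	bio->sector += front_sectors;
	bio->nr_sectors -= front_sectors;
}

/*
 * Prepare a remainder for resubmission: split, mark the remainder as
 * having already entered the queue, then chain the front part to it.
 * The caller would resubmit 'bio' afterwards.
 */
static void toy_split_for_resubmit(struct toy_bio *bio,
				   unsigned long front_sectors,
				   struct toy_bio *front)
{
	toy_bio_split(bio, front_sectors, front);
	bio->flags |= TOY_QUEUE_ENTERED;	/* set before recursing */
	front->parent = bio;			/* front completes into bio */
}
```

Keeping the three steps in one function is the design point: a caller
cannot forget the flag or the chaining when it splits.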


* [PATCH 4/4] dm: fix dm_wq_work() to only use __split_and_process_bio() if appropriate
  2019-01-19 18:05 ` Mike Snitzer
@ 2019-01-19 18:05   ` Mike Snitzer
  -1 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2019-01-19 18:05 UTC (permalink / raw)
  To: dm-devel; +Cc: NeilBrown, Ming Lei, axboe, linux-block

Otherwise targets that don't support/expect IO splitting could resubmit
bios using code paths with unnecessary IO splitting complexity.

Depends-on: 24113d487843 ("dm: avoid indirect call in __dm_make_request")
Fixes: 978e51ba38e00 ("dm: optimize bio-based NVMe IO submission")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 6e29c2d99b99..aa7e429646b3 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1727,6 +1727,15 @@ static blk_qc_t __process_bio(struct mapped_device *md,
 	return ret;
 }
 
+static blk_qc_t dm_process_bio(struct mapped_device *md,
+			       struct dm_table *map, struct bio *bio)
+{
+	if (dm_get_md_type(md) == DM_TYPE_NVME_BIO_BASED)
+		return __process_bio(md, map, bio);
+	else
+		return __split_and_process_bio(md, map, bio);
+}
+
 static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct mapped_device *md = q->queuedata;
@@ -1747,10 +1756,7 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
 		return ret;
 	}
 
-	if (dm_get_md_type(md) == DM_TYPE_NVME_BIO_BASED)
-		ret = __process_bio(md, map, bio);
-	else
-		ret = __split_and_process_bio(md, map, bio);
+	ret = dm_process_bio(md, map, bio);
 
 	dm_put_live_table(md, srcu_idx);
 	return ret;
@@ -2429,9 +2435,9 @@ static void dm_wq_work(struct work_struct *work)
 			break;
 
 		if (dm_request_based(md))
-			generic_make_request(c);
+			(void) generic_make_request(c);
 		else
-			__split_and_process_bio(md, map, c);
+			(void) dm_process_bio(md, map, c);
 	}
 
 	dm_put_live_table(md, srcu_idx);
-- 
2.15.0


* Re: [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-19 18:05   ` Mike Snitzer
@ 2019-01-21  3:21     ` Ming Lei
  -1 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-01-21  3:21 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel, NeilBrown, axboe, linux-block

On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> recursing via generic_make_request().
> 
> Also add trace_block_split() because it provides useful context about
> bio splits in blktrace.
> 
> Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> Cc: stable@vger.kernel.org # 4.16+
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fbadda68e23b..6e29c2d99b99 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
>  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
>  				part_stat_unlock();
>  
> +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  				bio_chain(b, bio);
> +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
>  				ret = generic_make_request(bio);
>  				break;
>  			}

In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
called from generic_make_request(). However, it may also be called from
dm_wq_work(), and setting the flag on that path might cause trouble for
operations on q->q_usage_counter.

Thanks,
Ming
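
[Editor's illustration, not part of the reply.]  One way to picture
the concern is a toy model of a queue usage counter.  This is a sketch
under assumptions, not the block layer's code: struct toy_queue,
toy_queue_enter(), toy_queue_enter_live() and toy_submit() are
hypothetical, and the model assumes that a bio flagged as "queue
already entered" takes its reference without the frozen-queue check
that an unflagged bio goes through.  If that assumption holds, a
flagged bio coming from a context that never entered the queue (such
as a workqueue) would bypass the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of a request queue that matter. */
struct toy_queue {
	bool frozen;	/* queue is being frozen/drained */
	long usage;	/* models q->q_usage_counter */
};

/* Normal entry: refused (the real thing would wait) while frozen. */
static int toy_queue_enter(struct toy_queue *q)
{
	if (q->frozen)
		return -1;
	q->usage++;
	return 0;
}

/* "Live" entry: the caller promises it already holds a reference. */
static void toy_queue_enter_live(struct toy_queue *q)
{
	q->usage++;
}

/* Submission path: a flagged bio skips the frozen check. */
static int toy_submit(struct toy_queue *q, bool queue_entered_flag)
{
	if (queue_entered_flag) {
		toy_queue_enter_live(q);
		return 0;
	}
	return toy_queue_enter(q);
}
```

In this model, an unflagged submission to a frozen queue is refused
and leaves the counter alone, while a flagged one is accepted and
bumps the counter even though nobody held a reference beforehand.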

* Re: [PATCH 1/4] dm: fix clone_bio() to trigger blk_recount_segments()
  2019-01-19 18:05   ` Mike Snitzer
@ 2019-01-21  3:25     ` Ming Lei
  -1 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-01-21  3:25 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel, NeilBrown, axboe, linux-block

On Sat, Jan 19, 2019 at 01:05:03PM -0500, Mike Snitzer wrote:
> DM's clone_bio() now benefits from using bio_trim() by fixing the fact
> that clone_bio() wasn't clearing BIO_SEG_VALID like bio_trim() does;
> which triggers blk_recount_segments() via bio_phys_segments().
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index d67c95ef8d7e..fcb97b0a5743 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1320,7 +1320,7 @@ static int clone_bio(struct dm_target_io *tio, struct bio *bio,
>  
>  	__bio_clone_fast(clone, bio);
>  
> -	if (unlikely(bio_integrity(bio) != NULL)) {
> +	if (bio_integrity(bio)) {
>  		int r;
>  
>  		if (unlikely(!dm_target_has_integrity(tio->ti->type) &&
> @@ -1336,11 +1336,7 @@ static int clone_bio(struct dm_target_io *tio, struct bio *bio,
>  			return r;
>  	}
>  
> -	bio_advance(clone, to_bytes(sector - clone->bi_iter.bi_sector));
> -	clone->bi_iter.bi_size = to_bytes(len);
> -
> -	if (unlikely(bio_integrity(bio) != NULL))
> -		bio_integrity_trim(clone);
> +	bio_trim(clone, sector - clone->bi_iter.bi_sector, len);
>  
>  	return 0;
>  }
> -- 
> 2.15.0
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming

* Re: [PATCH 2/4] dm: fix redundant IO accounting for bios that need splitting
  2019-01-19 18:05   ` Mike Snitzer
@ 2019-01-21  3:52     ` Ming Lei
  -1 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-01-21  3:52 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel, NeilBrown, axboe, linux-block

On Sat, Jan 19, 2019 at 01:05:04PM -0500, Mike Snitzer wrote:
> The risk of redundant IO accounting was not taken into consideration
> when commit 18a25da84354 ("dm: ensure bio submission follows a
> depth-first tree walk") introduced IO splitting in terms of recursion
> via generic_make_request().
> 
> Fix this by subtracting the split bio's payload from the IO stats that
> were already accounted for by start_io_acct() upon dm_make_request()
> entry.  This repeat oscillation of the IO accounting, up then down,
> isn't ideal but refactoring DM core's IO splitting to pre-split bios
> _before_ they are accounted turned out to be an excessive amount of
> change that will need a full development cycle to refine and verify.
> 
> Before this fix:
> 
>   /dev/mapper/stripe_dev is a 4-way stripe using a 32k chunksize, so
>   bios are split on 32k boundaries.
> 
>   # fio --name=16M --filename=/dev/mapper/stripe_dev --rw=write --bs=64k --size=16M \
>     	--iodepth=1 --ioengine=libaio --direct=1 --refill_buffers
> 
>   with debugging added:
>   [103898.310264] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=0 len=128
>   [103898.318704] device-mapper: core: __split_and_process_bio: recursing for following split bio:
>   [103898.329136] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=64 len=64
>   ...
> 
>   16M written yet 136M (278528 * 512b) accounted:
>   # cat /sys/block/dm-2/stat | awk '{ print $7 }'
>   278528
> 
> After this fix:
> 
>   16M written and 16M (32768 * 512b) accounted:
>   # cat /sys/block/dm-2/stat | awk '{ print $7 }'
>   32768
> 
> Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> Cc: stable@vger.kernel.org # 4.16+
> Reported-by: Bryan Gurney <bgurney@redhat.com>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fcb97b0a5743..fbadda68e23b 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1584,6 +1584,9 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
>  	ci->sector = bio->bi_iter.bi_sector;
>  }
>  
> +#define __dm_part_stat_sub(part, field, subnd)	\
> +	(part_stat_get(part, field) -= (subnd))
> +
>  /*
>   * Entry point to split a bio into clones and submit them to the targets.
>   */
> @@ -1638,6 +1641,19 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
>  				struct bio *b = bio_split(bio, bio_sectors(bio) - ci.sector_count,
>  							  GFP_NOIO, &md->queue->bio_split);
>  				ci.io->orig_bio = b;
> +
> +				/*
> +				 * Adjust IO stats for each split, otherwise upon queue
> +				 * reentry there will be redundant IO accounting.
> +				 * NOTE: this is a stop-gap fix, a proper fix involves
> +				 * significant refactoring of DM core's bio splitting
> +				 * (by eliminating DM's splitting and just using bio_split)
> +				 */
> +				part_stat_lock();
> +				__dm_part_stat_sub(&dm_disk(md)->part0,
> +						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> +				part_stat_unlock();
> +
>  				bio_chain(b, bio);
>  				ret = generic_make_request(bio);
>  				break;

This way is a bit ugly, but it looks like it works and it is simple, especially
since a DM target may accept a partial bio, so:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dm-devel] [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-19 18:05   ` Mike Snitzer
@ 2019-01-21  4:39     ` NeilBrown
  -1 siblings, 0 replies; 30+ messages in thread
From: NeilBrown @ 2019-01-21  4:39 UTC (permalink / raw)
  To: Mike Snitzer, dm-devel; +Cc: axboe, linux-block, Ming Lei

On Sat, Jan 19 2019, Mike Snitzer wrote:

> Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> recursing via generic_make_request().
>
> Also add trace_block_split() because it provides useful context about
> bio splits in blktrace.
>
> Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> Cc: stable@vger.kernel.org # 4.16+
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fbadda68e23b..6e29c2d99b99 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
>  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
>  				part_stat_unlock();
>  
> +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  				bio_chain(b, bio);
> +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
>  				ret = generic_make_request(bio);
>  				break;
>  			}

Thanks Mike...

If I understand this correctly, then we need to make the same change for
all other callers of bio_split(), except blk_queue_split().
Maybe we should just set the flag and do the trace in bio_split().
Do you see any harm in doing it that way (in the next merge window; I'm
not suggesting you change this patch)?

Thanks,
NeilBrown


^ permalink raw reply	[flat|nested] 30+ messages in thread
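
The suggestion above (do the flag and the trace in one common place) can
be sketched in userspace roughly as follows. Everything here is an
assumption made for illustration: struct model_bio,
MODEL_BIO_QUEUE_ENTERED, model_split_events and model_split_and_mark()
are hypothetical names, the event counter stands in for
trace_block_split(), and starting the front half with clear flags is a
modelling choice, not a statement about what bio_split() copies.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_BIO_QUEUE_ENTERED (1u << 0)	/* hypothetical flag bit */

/* Hypothetical, heavily reduced stand-in for struct bio. */
struct model_bio {
	uint64_t sector;	/* start sector */
	uint32_t nr_sectors;	/* length in sectors */
	unsigned int flags;
};

/* Stand-in for the split tracepoint: counts how many splits were recorded. */
static unsigned int model_split_events;

/*
 * Split the first `sectors` of `bio` into `front`; `bio` becomes the
 * remainder.  The remainder is marked as "already entered" and the split is
 * recorded here, so that no caller can forget either step.
 * Returns 0 on success, -1 if the split size is not strictly inside the bio.
 */
static int model_split_and_mark(struct model_bio *bio, uint32_t sectors,
				struct model_bio *front)
{
	if (sectors == 0 || sectors >= bio->nr_sectors)
		return -1;

	front->sector = bio->sector;
	front->nr_sectors = sectors;
	front->flags = 0;		/* modelling choice, see lead-in */

	bio->sector += sectors;
	bio->nr_sectors -= sectors;
	bio->flags |= MODEL_BIO_QUEUE_ENTERED;

	model_split_events++;		/* analogue of trace_block_split() */
	return 0;
}
```

A caller would then only chain the two halves and resubmit the remainder;
whether the real helper should live in bio_split() itself or in a wrapper
is exactly the open question in this thread.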

* Re: [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-21  3:21     ` Ming Lei
@ 2019-01-21 16:02       ` Mike Snitzer
  -1 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2019-01-21 16:02 UTC (permalink / raw)
  To: Ming Lei; +Cc: dm-devel, NeilBrown, axboe, linux-block

On Sun, Jan 20 2019 at 10:21P -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > recursing via generic_make_request().
> > 
> > Also add trace_block_split() because it provides useful context about
> > bio splits in blktrace.
> > 
> > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > Cc: stable@vger.kernel.org # 4.16+
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  drivers/md/dm.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index fbadda68e23b..6e29c2d99b99 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> >  				part_stat_unlock();
> >  
> > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  				bio_chain(b, bio);
> > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> >  				ret = generic_make_request(bio);
> >  				break;
> >  			}
> 
> In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> called from generic_make_request(). However, it may be called from dm_wq_work(),
> this way might cause trouble on operation to q->q_usage_counter.

Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
dm_make_request().

And to Neil's point: yes, these changes really do need to be made
common since it appears all bio_split() callers do go on to call
generic_make_request().

Anyway, here is the updated patch that is now staged in linux-next:

From: Mike Snitzer <snitzer@redhat.com>
Date: Fri, 18 Jan 2019 01:21:11 -0500
Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()

Use the same BIO_QUEUE_ENTERED pattern that was established by commit
cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
recursing via generic_make_request().

Also add trace_block_split() because it provides useful context about
bio splits in blktrace.

Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.16+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index fbadda68e23b..25884f833a32 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
 				part_stat_unlock();
 
+				bio_set_flag(bio, BIO_QUEUE_ENTERED);
 				bio_chain(b, bio);
+				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
 				ret = generic_make_request(bio);
 				break;
 			}
@@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
 
 	map = dm_get_live_table(md, &srcu_idx);
 
+	/*
+	 * Clear the bio-reentered-generic_make_request() flag,
+	 * will be set again as needed if bio needs to be split.
+	 */
+	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
+		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
+
 	/* if we're suspended, we have to queue this io for later */
 	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
 		dm_put_live_table(md, srcu_idx);
-- 
2.15.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread
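
The flag lifecycle in the v2 patch above (clear on entry, set again only
when a split is about to be resubmitted) can be modelled in userspace as
below. This is a sketch under assumptions: struct model_bio,
MODEL_BIO_QUEUE_ENTERED and model_make_request() are hypothetical names,
and the split is reduced to shrinking a sector count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BIO_QUEUE_ENTERED (1u << 0)	/* hypothetical flag bit */

/* Hypothetical, heavily reduced stand-in for struct bio. */
struct model_bio {
	uint32_t nr_sectors;
	unsigned int flags;
};

/*
 * Model of the entry point: any flag left over from a previous pass is
 * cleared first, and it is set again only if this pass has to split and
 * resubmit the remainder.
 * Returns true when the remainder must be resubmitted.
 */
static bool model_make_request(struct model_bio *bio, uint32_t max_sectors)
{
	assert(max_sectors > 0);

	/* Analogue of the bio_clear_flag() hunk in dm_make_request(). */
	bio->flags &= ~MODEL_BIO_QUEUE_ENTERED;

	if (bio->nr_sectors <= max_sectors)
		return false;		/* fits in one piece; flag stays clear */

	/* Analogue of the split path in __split_and_process_bio(). */
	bio->nr_sectors -= max_sectors;
	bio->flags |= MODEL_BIO_QUEUE_ENTERED;
	return true;
}
```

For a 192-sector bio and a 64-sector boundary, the flag is set after the
first two passes and clear after the third, so a bio that no longer needs
splitting does not carry a stale flag forward.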

* Re: [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-21 16:02       ` Mike Snitzer
@ 2019-01-22  2:46         ` Ming Lei
  -1 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-01-22  2:46 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel, NeilBrown, axboe, linux-block

On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> On Sun, Jan 20 2019 at 10:21P -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > recursing via generic_make_request().
> > > 
> > > Also add trace_block_split() because it provides useful context about
> > > bio splits in blktrace.
> > > 
> > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > Cc: stable@vger.kernel.org # 4.16+
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > ---
> > >  drivers/md/dm.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index fbadda68e23b..6e29c2d99b99 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > >  				part_stat_unlock();
> > >  
> > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  				bio_chain(b, bio);
> > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > >  				ret = generic_make_request(bio);
> > >  				break;
> > >  			}
> > 
> > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > called from generic_make_request(). However, it may be called from dm_wq_work(),
> > this way might cause trouble on operation to q->q_usage_counter.
> 
> Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> dm_make_request().
> 
> And to Neil's point: yes, these changes really do need to made
> common since it appears all bio_split() callers do go on to call
> generic_make_request().
> 
> Anyway, here is the updated patch that is now staged in linux-next:
> 
> From: Mike Snitzer <snitzer@redhat.com>
> Date: Fri, 18 Jan 2019 01:21:11 -0500
> Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> 
> Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> recursing via generic_make_request().
> 
> Also add trace_block_split() because it provides useful context about
> bio splits in blktrace.
> 
> Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> Cc: stable@vger.kernel.org # 4.16+
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fbadda68e23b..25884f833a32 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
>  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
>  				part_stat_unlock();
>  
> +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  				bio_chain(b, bio);
> +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
>  				ret = generic_make_request(bio);
>  				break;
>  			}
> @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
>  
>  	map = dm_get_live_table(md, &srcu_idx);
>  
> +	/*
> +	 * Clear the bio-reentered-generic_make_request() flag,
> +	 * will be set again as needed if bio needs to be split.
> +	 */
> +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> +
>  	/* if we're suspended, we have to queue this io for later */
>  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
>  		dm_put_live_table(md, srcu_idx);
> -- 
> 2.15.0
> 

Hi Mike,

I'd suggest fixing this kind of issue in the following way, so that we
can avoid touching this flag from drivers:

diff --git a/block/blk-core.c b/block/blk-core.c
index 3c5f61ceeb67..e70103560ac2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
 		else
 			bio_io_error(bio);
 		return ret;
+	} else {
+		bio_set_flag(bio, BIO_QUEUE_ENTERED);
 	}
 
 	if (!generic_make_request_checks(bio))
@@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
 			if (blk_queue_enter(q, flags) < 0) {
 				enter_succeeded = false;
 				q = NULL;
+			} else {
+				bio_set_flag(bio, BIO_QUEUE_ENTERED);
 			}
 		}
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index b990853f6de7..8777e286bd3f 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
 		/* there isn't chance to merge the splitted bio */
 		split->bi_opf |= REQ_NOMERGE;
 
-		/*
-		 * Since we're recursing into make_request here, ensure
-		 * that we mark this bio as already having entered the queue.
-		 * If not, and the queue is going away, we can get stuck
-		 * forever on waiting for the queue reference to drop. But
-		 * that will never happen, as we're already holding a
-		 * reference to it.
-		 */
-		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
-
 		bio_chain(split, *bio);
 		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
 		generic_make_request(*bio);

Thanks,
Ming

^ permalink raw reply related	[flat|nested] 30+ messages in thread
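
A userspace sketch of the idea in the proposal above: the submission path
itself marks a bio once it has successfully entered the queue, so drivers
never touch the flag. All names (struct model_queue, struct model_bio,
MODEL_BIO_QUEUE_ENTERED, model_queue_enter(),
model_generic_make_request()) are hypothetical, the usage counter is a
plain int rather than q->q_usage_counter, and a frozen queue makes the
model fail immediately where the real code could block instead.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BIO_QUEUE_ENTERED (1u << 0)	/* hypothetical flag bit */

/* Hypothetical stand-ins; not kernel types. */
struct model_queue {
	int usage;	/* models the queue usage reference count */
	bool frozen;	/* models a queue that is going away or frozen */
};

struct model_bio {
	unsigned int flags;
};

/* Regular entry: refused while the queue is frozen (simplified). */
static int model_queue_enter(struct model_queue *q)
{
	if (q->frozen)
		return -1;
	q->usage++;
	return 0;
}

/*
 * Model of the submission path with the proposed change: a bio that is
 * already marked takes an extra reference unconditionally (its submitter
 * already holds one); any other bio goes through the regular entry and is
 * marked only once that entry has succeeded.
 */
static int model_generic_make_request(struct model_queue *q,
				      struct model_bio *bio)
{
	if (bio->flags & MODEL_BIO_QUEUE_ENTERED) {
		q->usage++;
	} else if (model_queue_enter(q) < 0) {
		return -1;
	} else {
		bio->flags |= MODEL_BIO_QUEUE_ENTERED;	/* the proposed hunk */
	}
	return 0;
}
```

In this model a bio that entered before the freeze can still be
resubmitted afterwards, while a fresh bio is refused and stays unmarked.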

* Re: [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-22  2:46         ` Ming Lei
@ 2019-01-22  3:17           ` Mike Snitzer
  -1 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2019-01-22  3:17 UTC (permalink / raw)
  To: Ming Lei; +Cc: dm-devel, NeilBrown, axboe, linux-block

On Mon, Jan 21 2019 at  9:46pm -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> > On Sun, Jan 20 2019 at 10:21P -0500,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > recursing via generic_make_request().
> > > > 
> > > > Also add trace_block_split() because it provides useful context about
> > > > bio splits in blktrace.
> > > > 
> > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > Cc: stable@vger.kernel.org # 4.16+
> > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > ---
> > > >  drivers/md/dm.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > index fbadda68e23b..6e29c2d99b99 100644
> > > > --- a/drivers/md/dm.c
> > > > +++ b/drivers/md/dm.c
> > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > >  				part_stat_unlock();
> > > >  
> > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > >  				bio_chain(b, bio);
> > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > >  				ret = generic_make_request(bio);
> > > >  				break;
> > > >  			}
> > > 
> > > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > > called from generic_make_request(). However, it may be called from dm_wq_work(),
> > > this way might cause trouble on operation to q->q_usage_counter.
> > 
> > Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> > dm_make_request().
> > 
> > And to Neil's point: yes, these changes really do need to made
> > common since it appears all bio_split() callers do go on to call
> > generic_make_request().
> > 
> > Anyway, here is the updated patch that is now staged in linux-next:
> > 
> > From: Mike Snitzer <snitzer@redhat.com>
> > Date: Fri, 18 Jan 2019 01:21:11 -0500
> > Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> > 
> > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > recursing via generic_make_request().
> > 
> > Also add trace_block_split() because it provides useful context about
> > bio splits in blktrace.
> > 
> > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > Cc: stable@vger.kernel.org # 4.16+
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  drivers/md/dm.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index fbadda68e23b..25884f833a32 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> >  				part_stat_unlock();
> >  
> > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  				bio_chain(b, bio);
> > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> >  				ret = generic_make_request(bio);
> >  				break;
> >  			}
> > @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
> >  
> >  	map = dm_get_live_table(md, &srcu_idx);
> >  
> > +	/*
> > +	 * Clear the bio-reentered-generic_make_request() flag,
> > +	 * will be set again as needed if bio needs to be split.
> > +	 */
> > +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> > +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> > +
> >  	/* if we're suspended, we have to queue this io for later */
> >  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
> >  		dm_put_live_table(md, srcu_idx);
> > -- 
> > 2.15.0
> > 
> 
> Hi Mike,
> 
> I'd suggest to fix this kind issue in the following way, then we
> can avoid to touch this flag from drivers:
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 3c5f61ceeb67..e70103560ac2 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
>  		else
>  			bio_io_error(bio);
>  		return ret;
> +	} else {
> +		bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  	}
>  
>  	if (!generic_make_request_checks(bio))
> @@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
>  			if (blk_queue_enter(q, flags) < 0) {
>  				enter_succeeded = false;
>  				q = NULL;
> +			} else {
> +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  			}
>  		}
>  
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index b990853f6de7..8777e286bd3f 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
>  		/* there isn't chance to merge the splitted bio */
>  		split->bi_opf |= REQ_NOMERGE;
>  
> -		/*
> -		 * Since we're recursing into make_request here, ensure
> -		 * that we mark this bio as already having entered the queue.
> -		 * If not, and the queue is going away, we can get stuck
> -		 * forever on waiting for the queue reference to drop. But
> -		 * that will never happen, as we're already holding a
> -		 * reference to it.
> -		 */
> -		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
> -
>  		bio_chain(split, *bio);
>  		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
>  		generic_make_request(*bio);
> 

Not opposed to this.

Thanks,
Mike

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-22  3:17           ` Mike Snitzer
@ 2019-01-22  3:35             ` Mike Snitzer
  -1 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2019-01-22  3:35 UTC (permalink / raw)
  To: Ming Lei; +Cc: dm-devel, NeilBrown, axboe, linux-block

On Mon, Jan 21 2019 at 10:17pm -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Mon, Jan 21 2019 at  9:46pm -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> > > On Sun, Jan 20 2019 at 10:21P -0500,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > > 
> > > > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > > recursing via generic_make_request().
> > > > > 
> > > > > Also add trace_block_split() because it provides useful context about
> > > > > bio splits in blktrace.
> > > > > 
> > > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > > Cc: stable@vger.kernel.org # 4.16+
> > > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > > ---
> > > > >  drivers/md/dm.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > index fbadda68e23b..6e29c2d99b99 100644
> > > > > --- a/drivers/md/dm.c
> > > > > +++ b/drivers/md/dm.c
> > > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > > >  				part_stat_unlock();
> > > > >  
> > > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > > >  				bio_chain(b, bio);
> > > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > > >  				ret = generic_make_request(bio);
> > > > >  				break;
> > > > >  			}
> > > > 
> > > > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > > > called from generic_make_request(). However, it may also be called from dm_wq_work();
> > > > in that case, setting it might cause trouble for operations on q->q_usage_counter.
> > > 
> > > Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> > > dm_make_request().
> > > 
> > > And to Neil's point: yes, these changes really do need to be made
> > > common since it appears all bio_split() callers do go on to call
> > > generic_make_request().
> > > 
> > > Anyway, here is the updated patch that is now staged in linux-next:
> > > 
> > > From: Mike Snitzer <snitzer@redhat.com>
> > > Date: Fri, 18 Jan 2019 01:21:11 -0500
> > > Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> > > 
> > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > recursing via generic_make_request().
> > > 
> > > Also add trace_block_split() because it provides useful context about
> > > bio splits in blktrace.
> > > 
> > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > Cc: stable@vger.kernel.org # 4.16+
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > ---
> > >  drivers/md/dm.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index fbadda68e23b..25884f833a32 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > >  				part_stat_unlock();
> > >  
> > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  				bio_chain(b, bio);
> > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > >  				ret = generic_make_request(bio);
> > >  				break;
> > >  			}
> > > @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
> > >  
> > >  	map = dm_get_live_table(md, &srcu_idx);
> > >  
> > > +	/*
> > > +	 * Clear the bio-reentered-generic_make_request() flag,
> > > +	 * will be set again as needed if bio needs to be split.
> > > +	 */
> > > +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> > > +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> > > +
> > >  	/* if we're suspended, we have to queue this io for later */
> > >  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
> > >  		dm_put_live_table(md, srcu_idx);
> > > -- 
> > > 2.15.0
> > > 
> > 
> > Hi Mike,
> > 
> > I'd suggest fixing this kind of issue in the following way, so that we
> > can avoid touching this flag from drivers:
> > 
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index 3c5f61ceeb67..e70103560ac2 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> >  		else
> >  			bio_io_error(bio);
> >  		return ret;
> > +	} else {
> > +		bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  	}
> >  
> >  	if (!generic_make_request_checks(bio))
> > @@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> >  			if (blk_queue_enter(q, flags) < 0) {
> >  				enter_succeeded = false;
> >  				q = NULL;
> > +			} else {
> > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  			}
> >  		}
> >  
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index b990853f6de7..8777e286bd3f 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
> >  		/* there isn't chance to merge the splitted bio */
> >  		split->bi_opf |= REQ_NOMERGE;
> >  
> > -		/*
> > -		 * Since we're recursing into make_request here, ensure
> > -		 * that we mark this bio as already having entered the queue.
> > -		 * If not, and the queue is going away, we can get stuck
> > -		 * forever on waiting for the queue reference to drop. But
> > -		 * that will never happen, as we're already holding a
> > -		 * reference to it.
> > -		 */
> > -		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
> > -
> >  		bio_chain(split, *bio);
> >  		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
> >  		generic_make_request(*bio);
> > 
> 
> Not opposed to this.

But thinking further: when you have a stack of cascading
q->make_request_fn it could easily be that work done at the next layer
down ends up causing the bio to recurse to generic_make_request() but not
directly (e.g. dm_wq_work)... yet BIO_QUEUE_ENTERED will still be set
when it really isn't appropriate.

Getting too cute with setting bio flags but not clearing them on
different device boundaries could render the flags useless (or worse:
incorrect).
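
As a rough illustration of that hazard, here is a small userspace C model
(not kernel code).  Every model_* name, the "frozen" field and the "usage"
counter are assumptions invented for this sketch.  The only thing borrowed
from the kernel is the idea, described in the blk_queue_split() comment
quoted above, that a bio marked BIO_QUEUE_ENTERED is let into the queue
without the usual check because its submitter is assumed to hold a
reference already.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the BIO_QUEUE_ENTERED bio flag. */
#define MODEL_BIO_QUEUE_ENTERED (1u << 0)

struct model_queue {
	bool frozen;	/* queue is going away: new entries must not get in */
	int usage;	/* stands in for q->q_usage_counter */
};

struct model_bio {
	unsigned int flags;
	struct model_queue *q;	/* queue the bio is currently aimed at */
};

/* Unchecked entry: valid only if the caller already holds a reference. */
static void model_enter_live(struct model_queue *q)
{
	q->usage++;
}

/* Checked entry: refuses a frozen queue (the model fails, it never waits). */
static int model_enter(struct model_queue *q)
{
	if (q->frozen)
		return -1;
	q->usage++;
	return 0;
}

/* Models the entry decision made on submit: trust the flag if present. */
static int model_submit(struct model_bio *bio)
{
	if (bio->flags & MODEL_BIO_QUEUE_ENTERED) {
		model_enter_live(bio->q);
		return 0;
	}
	return model_enter(bio->q);
}

/*
 * Remap the bio to a lower device and submit it there.  clear_flag
 * selects whether the flag is dropped at the device boundary.
 */
static int model_remap_and_submit(struct model_bio *bio,
				  struct model_queue *lower, bool clear_flag)
{
	bio->q = lower;
	if (clear_flag)
		bio->flags &= ~MODEL_BIO_QUEUE_ENTERED;
	return model_submit(bio);
}
```

With lower.frozen set, model_remap_and_submit(&bio, &lower, false) returns
0 and bumps lower.usage even though nothing held a reference on the lower
queue, whereas passing true takes the checked path and returns -1.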

I'm not looking to engage in a focused audit/churn in this area that
becomes a slippery slope during the rest of 5.0-rcX.

That is why I was going for a local DM change for 5.0 and, in parallel,
work on the more generic fixes for 5.1.

So I'm back to preferring that...

But if you, Jens or others feel strongly about it I'm open to discuss it
further.

I think we need to set REQ_NOMERGE in the split too (like
blk_queue_split() is doing).  Again, a comprehensive cleanup and
consolidation of the bio_split+generic_make_request pattern is needed.  MD
has a lot of it, DM has it, and then there is blk_queue_split().
Basically blk_queue_split()'s bio_split+bio_chain+generic_make_request
and all the flags that get set in between should be factored out for all
to use.
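
To make the shape of such a shared helper concrete, here is a minimal
userspace sketch in C.  It is not a proposed kernel patch: struct
model_bio, model_split_and_resubmit() and the two counters are
hypothetical names, and the trace and resubmit steps are reduced to
counters.  The order of the steps follows the blk_queue_split() hunk
quoted above: mark the split as REQ_NOMERGE, flag the remainder, chain,
trace, resubmit.

```c
#include <stddef.h>

/* Hypothetical stand-ins for REQ_NOMERGE and BIO_QUEUE_ENTERED. */
#define MODEL_REQ_NOMERGE       (1u << 0)
#define MODEL_BIO_QUEUE_ENTERED (1u << 1)

struct model_bio {
	unsigned long sector;		/* start sector */
	unsigned int nr_sectors;	/* length in sectors */
	unsigned int opf;		/* operation flags */
	unsigned int flags;		/* bio state flags */
	struct model_bio *parent;	/* set by the chaining step */
};

static unsigned int model_split_events;	/* stands in for trace_block_split() */
static unsigned int model_resubmits;	/* stands in for generic_make_request() */

/* Carve the first 'sectors' sectors of 'bio' off into 'split'. */
static void model_bio_split(struct model_bio *bio, unsigned int sectors,
			    struct model_bio *split)
{
	*split = *bio;
	split->nr_sectors = sectors;
	split->flags = 0;
	split->parent = NULL;
	bio->sector += sectors;
	bio->nr_sectors -= sectors;
}

/*
 * One helper that does every step callers tend to forget.
 * Returns 1 if a split happened, 0 if the bio already fits.
 */
static int model_split_and_resubmit(struct model_bio *bio,
				    unsigned int sectors,
				    struct model_bio *split)
{
	if (sectors == 0 || sectors >= bio->nr_sectors)
		return 0;

	model_bio_split(bio, sectors, split);
	split->opf |= MODEL_REQ_NOMERGE;	/* the split won't be merged */
	bio->flags |= MODEL_BIO_QUEUE_ENTERED;	/* remainder re-enters the queue */
	split->parent = bio;			/* models bio_chain(split, bio) */
	model_split_events++;			/* models the split tracepoint */
	model_resubmits++;			/* models resubmitting the remainder */
	return 1;
}
```

A caller would pass the bio, the number of sectors it can handle now and
storage for the front piece, then carry on with the split while the
remainder goes back through the submission path.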

Mike

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-22  3:35             ` Mike Snitzer
@ 2019-01-22  3:49               ` Ming Lei
  -1 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-01-22  3:49 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel, NeilBrown, axboe, linux-block

On Mon, Jan 21, 2019 at 10:35:11PM -0500, Mike Snitzer wrote:
> On Mon, Jan 21 2019 at 10:17pm -0500,
> Mike Snitzer <snitzer@redhat.com> wrote:
> 
> > On Mon, Jan 21 2019 at  9:46pm -0500,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> > > > On Sun, Jan 20 2019 at 10:21P -0500,
> > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > 
> > > > > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > > > recursing via generic_make_request().
> > > > > > 
> > > > > > Also add trace_block_split() because it provides useful context about
> > > > > > bio splits in blktrace.
> > > > > > 
> > > > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > > > Cc: stable@vger.kernel.org # 4.16+
> > > > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > > > ---
> > > > > >  drivers/md/dm.c | 2 ++
> > > > > >  1 file changed, 2 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > > index fbadda68e23b..6e29c2d99b99 100644
> > > > > > --- a/drivers/md/dm.c
> > > > > > +++ b/drivers/md/dm.c
> > > > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > > > >  				part_stat_unlock();
> > > > > >  
> > > > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > > > >  				bio_chain(b, bio);
> > > > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > > > >  				ret = generic_make_request(bio);
> > > > > >  				break;
> > > > > >  			}
> > > > > 
> > > > > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > > > > called from generic_make_request(). However, it may also be called from dm_wq_work();
> > > > > in that case, setting it might cause trouble for operations on q->q_usage_counter.
> > > > 
> > > > Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> > > > dm_make_request().
> > > > 
> > > > And to Neil's point: yes, these changes really do need to be made
> > > > common since it appears all bio_split() callers do go on to call
> > > > generic_make_request().
> > > > 
> > > > Anyway, here is the updated patch that is now staged in linux-next:
> > > > 
> > > > From: Mike Snitzer <snitzer@redhat.com>
> > > > Date: Fri, 18 Jan 2019 01:21:11 -0500
> > > > Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> > > > 
> > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > recursing via generic_make_request().
> > > > 
> > > > Also add trace_block_split() because it provides useful context about
> > > > bio splits in blktrace.
> > > > 
> > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > Cc: stable@vger.kernel.org # 4.16+
> > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > ---
> > > >  drivers/md/dm.c | 9 +++++++++
> > > >  1 file changed, 9 insertions(+)
> > > > 
> > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > index fbadda68e23b..25884f833a32 100644
> > > > --- a/drivers/md/dm.c
> > > > +++ b/drivers/md/dm.c
> > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > >  				part_stat_unlock();
> > > >  
> > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > >  				bio_chain(b, bio);
> > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > >  				ret = generic_make_request(bio);
> > > >  				break;
> > > >  			}
> > > > @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
> > > >  
> > > >  	map = dm_get_live_table(md, &srcu_idx);
> > > >  
> > > > +	/*
> > > > +	 * Clear the bio-reentered-generic_make_request() flag,
> > > > +	 * will be set again as needed if bio needs to be split.
> > > > +	 */
> > > > +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> > > > +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> > > > +
> > > >  	/* if we're suspended, we have to queue this io for later */
> > > >  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
> > > >  		dm_put_live_table(md, srcu_idx);
> > > > -- 
> > > > 2.15.0
> > > > 
> > > 
> > > Hi Mike,
> > > 
> > > I'd suggest fixing this kind of issue in the following way, so that we
> > > can avoid touching this flag from drivers:
> > > 
> > > diff --git a/block/blk-core.c b/block/blk-core.c
> > > index 3c5f61ceeb67..e70103560ac2 100644
> > > --- a/block/blk-core.c
> > > +++ b/block/blk-core.c
> > > @@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> > >  		else
> > >  			bio_io_error(bio);
> > >  		return ret;
> > > +	} else {
> > > +		bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  	}
> > >  
> > >  	if (!generic_make_request_checks(bio))
> > > @@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> > >  			if (blk_queue_enter(q, flags) < 0) {
> > >  				enter_succeeded = false;
> > >  				q = NULL;
> > > +			} else {
> > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  			}
> > >  		}
> > >  
> > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > index b990853f6de7..8777e286bd3f 100644
> > > --- a/block/blk-merge.c
> > > +++ b/block/blk-merge.c
> > > @@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
> > >  		/* there isn't chance to merge the splitted bio */
> > >  		split->bi_opf |= REQ_NOMERGE;
> > >  
> > > -		/*
> > > -		 * Since we're recursing into make_request here, ensure
> > > -		 * that we mark this bio as already having entered the queue.
> > > -		 * If not, and the queue is going away, we can get stuck
> > > -		 * forever on waiting for the queue reference to drop. But
> > > -		 * that will never happen, as we're already holding a
> > > -		 * reference to it.
> > > -		 */
> > > -		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
> > > -
> > >  		bio_chain(split, *bio);
> > >  		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
> > >  		generic_make_request(*bio);
> > > 
> > 
> > Not opposed to this.
> 
> But thinking further: when you have a stack of cascading
> q->make_request_fn it could easily be that work done at the next layer
> down ends up causing the bio to recurse to generic_make_request() but not
> directly (e.g. dm_wq_work)... yet BIO_QUEUE_ENTERED will still be set
> when it really isn't appropriate.

That is true. In theory, we need a per-queue stack variable to record
whether the queue usage counter is held. But that is quite hard to do in
the kernel because we don't have a stack variable allocator; otherwise
this issue could be solved cleanly and simply.

> 
> Getting too cute with setting bio flags but not clearing them on
> different device boundaries could render the flags useless (or worse:
> incorrect).

How about clearing the flag right after q->make_request_fn() returns in
generic_make_request()?
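
A minimal userspace sketch of that idea follows, under stated
assumptions: the model_* names are hypothetical, and the model ignores
bio lifetime entirely.  In the kernel the bio may already have been
completed by the time make_request_fn returns, so a real change would
need more care than this shows.

```c
/* Hypothetical stand-in for the BIO_QUEUE_ENTERED bio flag. */
#define MODEL_BIO_QUEUE_ENTERED (1u << 0)

struct model_bio {
	unsigned int flags;
};

typedef void (*model_make_request_fn)(struct model_bio *bio);

/* Records what the driver saw, so the behaviour can be inspected. */
static unsigned int model_flag_seen_by_driver;

/* A trivial driver hook that only looks at the flag. */
static void model_driver(struct model_bio *bio)
{
	model_flag_seen_by_driver = bio->flags & MODEL_BIO_QUEUE_ENTERED;
}

/*
 * Set the flag once the queue has been entered, hand the bio to the
 * driver, then clear the flag so it cannot be carried to another queue.
 */
static void model_generic_make_request(struct model_bio *bio,
				       model_make_request_fn fn)
{
	bio->flags |= MODEL_BIO_QUEUE_ENTERED;
	fn(bio);
	bio->flags &= ~MODEL_BIO_QUEUE_ENTERED;
}
```

The driver sees the flag for the duration of its make_request_fn call,
and the bio leaves model_generic_make_request() with the flag cleared.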

> 
> I'm not looking to engage in a focused audit/churn in this area that
> becomes a slippery slope during the rest of 5.0-rcX.
> 
> That is why I was going for a local DM change for 5.0 and, in parallel,
> work on the more generic fixes for 5.1.
> 
> So I'm back to preferring that...
> 
> But if you, Jens or others feel strongly about it I'm open to discuss it
> further.

One concern is that if this flag starts to be used by drivers, sooner or
later it may be difficult to maintain.

> 
> Think we need to set REQ_NOMERGE in the split too (like
> blk_queue_split() is doing).  Again, a comprehensive cleanup and
> consolidation of bio_split+generic_make_request pattern is needed.  MD
> has a lot of it, DM has it, and then there is blk_queue_split().
> Basically blk_queue_split()'s bio_split+bio_chain+generic_make_request
> and all the flags that get set in between should be factored out for all
> to use.

Sounds like a good topic, and I am interested in it. Maybe you can submit
an LSF/MM proposal. :-)
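
As a rough illustration of what such a shared helper might bundle, here
is a toy user-space sketch in C. It is not kernel code and not a proposed
API: toy_bio, toy_split_and_chain() and the TOY_* flags are hypothetical
stand-ins for the bio_split() + REQ_NOMERGE + BIO_QUEUE_ENTERED +
bio_chain() steps discussed above, and the model folds both flags into a
single field for brevity. Tracing and the final resubmission of the
remainder are left out.

```c
/* Toy user-space model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_ENTERED 0x1u	/* stands in for BIO_QUEUE_ENTERED */
#define TOY_NOMERGE       0x2u	/* stands in for REQ_NOMERGE */

struct toy_bio {
	unsigned long long sector;	/* first sector covered */
	unsigned int nr_sectors;	/* number of sectors covered */
	unsigned int flags;
	struct toy_bio *parent;		/* set by the chaining step */
};

/*
 * Split the first `sectors` sectors off the front of `bio` into `split`,
 * and apply every step of the pattern in one place so that callers cannot
 * forget one. Returns 0 on success, -1 if no split is needed or possible.
 */
int toy_split_and_chain(struct toy_bio *bio, unsigned int sectors,
			struct toy_bio *split)
{
	assert(bio != NULL && split != NULL && bio != split);

	if (sectors == 0 || sectors >= bio->nr_sectors)
		return -1;

	/* Front piece: same start sector, shortened, never merged. */
	*split = *bio;
	split->nr_sectors = sectors;
	split->flags |= TOY_NOMERGE;
	split->parent = bio;		/* models bio_chain(split, bio) */

	/* Remainder: advanced past the front piece and marked as re-entering. */
	bio->sector += sectors;
	bio->nr_sectors -= sectors;
	bio->flags |= TOY_QUEUE_ENTERED;
	return 0;
}
```

A caller would then process `split` itself and resubmit the remainder,
which is the step that recurses in the pattern described above.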


Thanks,
Ming


* Re: [PATCH 2/4] dm: fix redundant IO accounting for bios that need splitting
  2019-01-19 18:05   ` Mike Snitzer
  (?)
  (?)
@ 2019-01-22 15:55   ` Sasha Levin
  -1 siblings, 0 replies; 30+ messages in thread
From: Sasha Levin @ 2019-01-22 15:55 UTC (permalink / raw)
  To: Sasha Levin, Mike Snitzer, dm-devel; +Cc: NeilBrown, stable, Ming Lei

Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 18a25da84354 dm: ensure bio submission follows a depth-first tree walk.

The bot has tested the following trees: v4.20.3, v4.19.16.

v4.20.3: Build failed! Errors:
    drivers/md/dm.c:1582:3: error: implicit declaration of function ‘part_stat_get’; did you mean ‘part_stat_dec’? [-Werror=implicit-function-declaration]
    drivers/md/dm.c:1639:10: error: ‘sectors’ undeclared (first use in this function); did you mean ‘Sector’?

v4.19.16: Build failed! Errors:
    drivers/md/dm.c:1581:3: error: implicit declaration of function ‘part_stat_get’; did you mean ‘part_stat_dec’? [-Werror=implicit-function-declaration]
    drivers/md/dm.c:1641:10: error: ‘sectors’ undeclared (first use in this function); did you mean ‘Sector’?


How should we proceed with this patch?

--
Thanks,
Sasha



* Re: [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
  2019-01-19 18:05   ` Mike Snitzer
                     ` (2 preceding siblings ...)
  (?)
@ 2019-01-22 15:56   ` Sasha Levin
  -1 siblings, 0 replies; 30+ messages in thread
From: Sasha Levin @ 2019-01-22 15:56 UTC (permalink / raw)
  To: Sasha Levin, Mike Snitzer, dm-devel; +Cc: NeilBrown, stable, Ming Lei

Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 18a25da84354 dm: ensure bio submission follows a depth-first tree walk.

The bot has tested the following trees: v4.20.3, v4.19.16.

v4.20.3: Failed to apply! Possible dependencies:
    7ed15898163a ("dm: fix redundant IO accounting for bios that need splitting")

v4.19.16: Failed to apply! Possible dependencies:
    7ed15898163a ("dm: fix redundant IO accounting for bios that need splitting")


How should we proceed with this patch?

--
Thanks,
Sasha


Thread overview:
2019-01-19 18:05 [PATCH 0/4] dm: fix various issues with bio splitting code Mike Snitzer
2019-01-19 18:05 ` [PATCH 1/4] dm: fix clone_bio() to trigger blk_recount_segments() Mike Snitzer
2019-01-21  3:25   ` Ming Lei
2019-01-19 18:05 ` [PATCH 2/4] dm: fix redundant IO accounting for bios that need splitting Mike Snitzer
2019-01-21  3:52   ` Ming Lei
2019-01-22 15:55   ` Sasha Levin
2019-01-19 18:05 ` [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio() Mike Snitzer
2019-01-21  3:21   ` Ming Lei
2019-01-21 16:02     ` Mike Snitzer
2019-01-22  2:46       ` Ming Lei
2019-01-22  3:17         ` Mike Snitzer
2019-01-22  3:35           ` Mike Snitzer
2019-01-22  3:49             ` Ming Lei
2019-01-21  4:39   ` [dm-devel] " NeilBrown
2019-01-22 15:56   ` Sasha Levin
2019-01-19 18:05 ` [PATCH 4/4] dm: fix dm_wq_work() to only use __split_and_process_bio() if appropriate Mike Snitzer
