* [PATCH v2 0/4] block: a couple chunk_sectors fixes/improvements
@ 2020-09-15 17:23 ` Mike Snitzer
0 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2020-09-15 17:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, Vijayendra Suman, dm-devel, linux-block
Hi,
This v2 drops a patch from v1 and fixes the chunk_sectors check added to
blk_stack_limits() to convert chunk_sectors to bytes before comparing with
physical_block_size.
Jens, please feel free to pick up patches 1 and 2.
DM patches 3 and 4 are provided just to give context for how DM will be
updated to use chunk_sectors.
Mike Snitzer (4):
block: use lcm_not_zero() when stacking chunk_sectors
block: allow 'chunk_sectors' to be non-power-of-2
dm table: stack 'chunk_sectors' limit to account for target-specific splitting
dm: unconditionally call blk_queue_split() in dm_process_bio()
block/blk-settings.c | 22 ++++++++++++----------
drivers/md/dm-table.c | 5 +++++
drivers/md/dm.c | 45 +--------------------------------------------
include/linux/blkdev.h | 12 +++++++++---
4 files changed, 27 insertions(+), 57 deletions(-)
--
2.15.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 1/4] block: use lcm_not_zero() when stacking chunk_sectors
2020-09-15 17:23 ` Mike Snitzer
@ 2020-09-15 17:23 ` Mike Snitzer
0 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2020-09-15 17:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, Vijayendra Suman, dm-devel, linux-block
Like 'io_opt', blk_stack_limits() should stack 'chunk_sectors' using
lcm_not_zero() rather than min_not_zero() -- otherwise the final
'chunk_sectors' could result in sub-optimal alignment of IO to
component devices in the IO stack.
Also, if 'chunk_sectors' isn't a multiple of 'physical_block_size'
then it is a bug in the driver and the device should be flagged as
'misaligned'.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
block/blk-settings.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 76a7e03bcd6c..b2e1a929a6db 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -534,6 +534,7 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
t->io_min = max(t->io_min, b->io_min);
t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
+ t->chunk_sectors = lcm_not_zero(t->chunk_sectors, b->chunk_sectors);
/* Physical block size a multiple of the logical block size? */
if (t->physical_block_size & (t->logical_block_size - 1)) {
@@ -556,6 +557,13 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
ret = -1;
}
+ /* chunk_sectors a multiple of the physical block size? */
+ if ((t->chunk_sectors << 9) & (t->physical_block_size - 1)) {
+ t->chunk_sectors = 0;
+ t->misaligned = 1;
+ ret = -1;
+ }
+
t->raid_partial_stripes_expensive =
max(t->raid_partial_stripes_expensive,
b->raid_partial_stripes_expensive);
@@ -594,10 +602,6 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
t->discard_granularity;
}
- if (b->chunk_sectors)
- t->chunk_sectors = min_not_zero(t->chunk_sectors,
- b->chunk_sectors);
-
t->zoned = max(t->zoned, b->zoned);
return ret;
}
--
2.15.0
* [PATCH v2 2/4] block: allow 'chunk_sectors' to be non-power-of-2
2020-09-15 17:23 ` Mike Snitzer
@ 2020-09-15 17:23 ` Mike Snitzer
0 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2020-09-15 17:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, Vijayendra Suman, dm-devel, linux-block
It is possible for a block device to use a non-power-of-2 chunk
size, which results in a full-stripe size that is also a
non-power-of-2.
Update blk_queue_chunk_sectors() and blk_max_size_offset() to
accommodate drivers that need a non-power-of-2 chunk_sectors.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
block/blk-settings.c | 10 ++++------
include/linux/blkdev.h | 12 +++++++++---
2 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index b2e1a929a6db..5ea3de48afba 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -172,15 +172,13 @@ EXPORT_SYMBOL(blk_queue_max_hw_sectors);
*
* Description:
* If a driver doesn't want IOs to cross a given chunk size, it can set
- * this limit and prevent merging across chunks. Note that the chunk size
- * must currently be a power-of-2 in sectors. Also note that the block
- * layer must accept a page worth of data at any offset. So if the
- * crossing of chunks is a hard limitation in the driver, it must still be
- * prepared to split single page bios.
+ * this limit and prevent merging across chunks. Note that the block layer
+ * must accept a page worth of data at any offset. So if the crossing of
+ * chunks is a hard limitation in the driver, it must still be prepared
+ * to split single page bios.
**/
void blk_queue_chunk_sectors(struct request_queue *q, unsigned int chunk_sectors)
{
- BUG_ON(!is_power_of_2(chunk_sectors));
q->limits.chunk_sectors = chunk_sectors;
}
EXPORT_SYMBOL(blk_queue_chunk_sectors);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bb5636cc17b9..bbfbda33e993 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1059,11 +1059,17 @@ static inline unsigned int blk_queue_get_max_sectors(struct request_queue *q,
static inline unsigned int blk_max_size_offset(struct request_queue *q,
sector_t offset)
{
- if (!q->limits.chunk_sectors)
+ unsigned int chunk_sectors = q->limits.chunk_sectors;
+
+ if (!chunk_sectors)
return q->limits.max_sectors;
- return min(q->limits.max_sectors, (unsigned int)(q->limits.chunk_sectors -
- (offset & (q->limits.chunk_sectors - 1))));
+ if (is_power_of_2(chunk_sectors))
+ chunk_sectors -= offset & (chunk_sectors - 1);
+ else
+ chunk_sectors -= sector_div(offset, chunk_sectors);
+
+ return min(q->limits.max_sectors, chunk_sectors);
}
static inline unsigned int blk_rq_get_max_sectors(struct request *rq,
--
2.15.0
* [PATCH v2 3/4] dm table: stack 'chunk_sectors' limit to account for target-specific splitting
2020-09-15 17:23 ` Mike Snitzer
@ 2020-09-15 17:23 ` Mike Snitzer
0 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2020-09-15 17:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, Vijayendra Suman, dm-devel, linux-block
If a target sets ti->max_io_len, it must be used when stacking the
DM device's queue_limits to establish a 'chunk_sectors' that is
compatible with the IO stack.
By using lcm_not_zero() care is taken to avoid blindly overriding the
chunk_sectors limit stacked up by blk_stack_limits().
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
drivers/md/dm-table.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 5edc3079e7c1..248c5a1074a7 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -18,6 +18,7 @@
#include <linux/mutex.h>
#include <linux/delay.h>
#include <linux/atomic.h>
+#include <linux/lcm.h>
#include <linux/blk-mq.h>
#include <linux/mount.h>
#include <linux/dax.h>
@@ -1502,6 +1503,10 @@ int dm_calculate_queue_limits(struct dm_table *table,
zone_sectors = ti_limits.chunk_sectors;
}
+ /* Stack chunk_sectors if target-specific splitting is required */
+ if (ti->max_io_len)
+ ti_limits.chunk_sectors = lcm_not_zero(ti->max_io_len,
+ ti_limits.chunk_sectors);
/* Set I/O hints portion of queue limits */
if (ti->type->io_hints)
ti->type->io_hints(ti, &ti_limits);
--
2.15.0
* [PATCH v2 4/4] dm: unconditionally call blk_queue_split() in dm_process_bio()
2020-09-15 17:23 ` Mike Snitzer
@ 2020-09-15 17:23 ` Mike Snitzer
-1 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2020-09-15 17:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, Vijayendra Suman, dm-devel, linux-block
blk_queue_split() has become compulsory from .submit_bio -- regardless
of whether it is recursing. Update DM core to always call
blk_queue_split().
dm_queue_split() is removed because __split_and_process_bio() handles
splitting as needed.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
drivers/md/dm.c | 45 +--------------------------------------------
1 file changed, 1 insertion(+), 44 deletions(-)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index fb0255d25e4b..0bae9f26dc8e 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1530,22 +1530,6 @@ static int __send_write_zeroes(struct clone_info *ci, struct dm_target *ti)
return __send_changing_extent_only(ci, ti, get_num_write_zeroes_bios(ti));
}
-static bool is_abnormal_io(struct bio *bio)
-{
- bool r = false;
-
- switch (bio_op(bio)) {
- case REQ_OP_DISCARD:
- case REQ_OP_SECURE_ERASE:
- case REQ_OP_WRITE_SAME:
- case REQ_OP_WRITE_ZEROES:
- r = true;
- break;
- }
-
- return r;
-}
-
static bool __process_abnormal_io(struct clone_info *ci, struct dm_target *ti,
int *result)
{
@@ -1723,23 +1707,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
return ret;
}
-static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
-{
- unsigned len, sector_count;
-
- sector_count = bio_sectors(*bio);
- len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
-
- if (sector_count > len) {
- struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
-
- bio_chain(split, *bio);
- trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
- submit_bio_noacct(*bio);
- *bio = split;
- }
-}
-
static blk_qc_t dm_process_bio(struct mapped_device *md,
struct dm_table *map, struct bio *bio)
{
@@ -1759,17 +1726,7 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
}
}
- /*
- * If in ->queue_bio we need to use blk_queue_split(), otherwise
- * queue_limits for abnormal requests (e.g. discard, writesame, etc)
- * won't be imposed.
- */
- if (current->bio_list) {
- if (is_abnormal_io(bio))
- blk_queue_split(&bio);
- else
- dm_queue_split(md, ti, &bio);
- }
+ blk_queue_split(&bio);
if (dm_get_md_type(md) == DM_TYPE_NVME_BIO_BASED)
return __process_bio(md, map, bio, ti);
--
2.15.0
* Re: [PATCH v2 4/4] dm: unconditionally call blk_queue_split() in dm_process_bio()
2020-09-15 17:23 ` Mike Snitzer
@ 2020-09-16 1:08 ` Ming Lei
2020-09-16 1:28 ` Mike Snitzer
0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2020-09-16 1:08 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, Vijayendra Suman, dm-devel, linux-block
On Tue, Sep 15, 2020 at 01:23:57PM -0400, Mike Snitzer wrote:
> blk_queue_split() has become compulsory from .submit_bio -- regardless
> of whether it is recursing. Update DM core to always call
> blk_queue_split().
>
> dm_queue_split() is removed because __split_and_process_bio() handles
> splitting as needed.
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> drivers/md/dm.c | 45 +--------------------------------------------
> 1 file changed, 1 insertion(+), 44 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fb0255d25e4b..0bae9f26dc8e 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1530,22 +1530,6 @@ static int __send_write_zeroes(struct clone_info *ci, struct dm_target *ti)
> return __send_changing_extent_only(ci, ti, get_num_write_zeroes_bios(ti));
> }
>
> -static bool is_abnormal_io(struct bio *bio)
> -{
> - bool r = false;
> -
> - switch (bio_op(bio)) {
> - case REQ_OP_DISCARD:
> - case REQ_OP_SECURE_ERASE:
> - case REQ_OP_WRITE_SAME:
> - case REQ_OP_WRITE_ZEROES:
> - r = true;
> - break;
> - }
> -
> - return r;
> -}
> -
> static bool __process_abnormal_io(struct clone_info *ci, struct dm_target *ti,
> int *result)
> {
> @@ -1723,23 +1707,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
> return ret;
> }
>
> -static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
> -{
> - unsigned len, sector_count;
> -
> - sector_count = bio_sectors(*bio);
> - len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
> -
> - if (sector_count > len) {
> - struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
> -
> - bio_chain(split, *bio);
> - trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
> - submit_bio_noacct(*bio);
> - *bio = split;
> - }
> -}
> -
> static blk_qc_t dm_process_bio(struct mapped_device *md,
> struct dm_table *map, struct bio *bio)
> {
> @@ -1759,17 +1726,7 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
> }
> }
>
> - /*
> - * If in ->queue_bio we need to use blk_queue_split(), otherwise
> - * queue_limits for abnormal requests (e.g. discard, writesame, etc)
> - * won't be imposed.
> - */
> - if (current->bio_list) {
> - if (is_abnormal_io(bio))
> - blk_queue_split(&bio);
> - else
> - dm_queue_split(md, ti, &bio);
> - }
> + blk_queue_split(&bio);
In max_io_len(), target boundary is taken into account when figuring out
the max io len. However, this info won't be used any more after
switching to blk_queue_split(). Is that one potential problem?
thanks,
Ming
* Re: [PATCH v2 4/4] dm: unconditionally call blk_queue_split() in dm_process_bio()
2020-09-16 1:08 ` Ming Lei
@ 2020-09-16 1:28 ` Mike Snitzer
2020-09-16 1:48 ` Ming Lei
0 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2020-09-16 1:28 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, Vijayendra Suman, dm-devel, linux-block
On Tue, Sep 15 2020 at 9:08pm -0400,
Ming Lei <ming.lei@redhat.com> wrote:
> On Tue, Sep 15, 2020 at 01:23:57PM -0400, Mike Snitzer wrote:
> > blk_queue_split() has become compulsory from .submit_bio -- regardless
> > of whether it is recursing. Update DM core to always call
> > blk_queue_split().
> >
> > dm_queue_split() is removed because __split_and_process_bio() handles
> > splitting as needed.
> >
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> > drivers/md/dm.c | 45 +--------------------------------------------
> > 1 file changed, 1 insertion(+), 44 deletions(-)
> >
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index fb0255d25e4b..0bae9f26dc8e 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1530,22 +1530,6 @@ static int __send_write_zeroes(struct clone_info *ci, struct dm_target *ti)
> > return __send_changing_extent_only(ci, ti, get_num_write_zeroes_bios(ti));
> > }
> >
> > -static bool is_abnormal_io(struct bio *bio)
> > -{
> > - bool r = false;
> > -
> > - switch (bio_op(bio)) {
> > - case REQ_OP_DISCARD:
> > - case REQ_OP_SECURE_ERASE:
> > - case REQ_OP_WRITE_SAME:
> > - case REQ_OP_WRITE_ZEROES:
> > - r = true;
> > - break;
> > - }
> > -
> > - return r;
> > -}
> > -
> > static bool __process_abnormal_io(struct clone_info *ci, struct dm_target *ti,
> > int *result)
> > {
> > @@ -1723,23 +1707,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
> > return ret;
> > }
> >
> > -static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
> > -{
> > - unsigned len, sector_count;
> > -
> > - sector_count = bio_sectors(*bio);
> > - len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
> > -
> > - if (sector_count > len) {
> > - struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
> > -
> > - bio_chain(split, *bio);
> > - trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
> > - submit_bio_noacct(*bio);
> > - *bio = split;
> > - }
> > -}
> > -
> > static blk_qc_t dm_process_bio(struct mapped_device *md,
> > struct dm_table *map, struct bio *bio)
> > {
> > @@ -1759,17 +1726,7 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
> > }
> > }
> >
> > - /*
> > - * If in ->queue_bio we need to use blk_queue_split(), otherwise
> > - * queue_limits for abnormal requests (e.g. discard, writesame, etc)
> > - * won't be imposed.
> > - */
> > - if (current->bio_list) {
> > - if (is_abnormal_io(bio))
> > - blk_queue_split(&bio);
> > - else
> > - dm_queue_split(md, ti, &bio);
> > - }
> > + blk_queue_split(&bio);
>
> In max_io_len(), target boundary is taken into account when figuring out
> the max io len. However, this info won't be used any more after
> switching to blk_queue_split(). Is that one potential problem?
Thanks for your review. But no, as the patch header says:
"dm_queue_split() is removed because __split_and_process_bio() handles
splitting as needed."
(__split_and_process_non_flush calls max_io_len, as does
__process_abnormal_io by calling __send_changing_extent_only)
So the blk_queue_split() bio will be further split if needed (due to
the DM target boundary, etc).
Mike
* Re: [PATCH v2 4/4] dm: unconditionally call blk_queue_split() in dm_process_bio()
2020-09-16 1:28 ` Mike Snitzer
@ 2020-09-16 1:48 ` Ming Lei
2020-09-16 3:39 ` Mike Snitzer
0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2020-09-16 1:48 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, Vijayendra Suman, dm-devel, linux-block
On Tue, Sep 15, 2020 at 09:28:14PM -0400, Mike Snitzer wrote:
> On Tue, Sep 15 2020 at 9:08pm -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
>
> > On Tue, Sep 15, 2020 at 01:23:57PM -0400, Mike Snitzer wrote:
> > > blk_queue_split() has become compulsory from .submit_bio -- regardless
> > > of whether it is recursing. Update DM core to always call
> > > blk_queue_split().
> > >
> > > dm_queue_split() is removed because __split_and_process_bio() handles
> > > splitting as needed.
> > >
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > ---
> > > drivers/md/dm.c | 45 +--------------------------------------------
> > > 1 file changed, 1 insertion(+), 44 deletions(-)
> > >
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index fb0255d25e4b..0bae9f26dc8e 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1530,22 +1530,6 @@ static int __send_write_zeroes(struct clone_info *ci, struct dm_target *ti)
> > > return __send_changing_extent_only(ci, ti, get_num_write_zeroes_bios(ti));
> > > }
> > >
> > > -static bool is_abnormal_io(struct bio *bio)
> > > -{
> > > - bool r = false;
> > > -
> > > - switch (bio_op(bio)) {
> > > - case REQ_OP_DISCARD:
> > > - case REQ_OP_SECURE_ERASE:
> > > - case REQ_OP_WRITE_SAME:
> > > - case REQ_OP_WRITE_ZEROES:
> > > - r = true;
> > > - break;
> > > - }
> > > -
> > > - return r;
> > > -}
> > > -
> > > static bool __process_abnormal_io(struct clone_info *ci, struct dm_target *ti,
> > > int *result)
> > > {
> > > @@ -1723,23 +1707,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
> > > return ret;
> > > }
> > >
> > > -static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
> > > -{
> > > - unsigned len, sector_count;
> > > -
> > > - sector_count = bio_sectors(*bio);
> > > - len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
> > > -
> > > - if (sector_count > len) {
> > > - struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
> > > -
> > > - bio_chain(split, *bio);
> > > - trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
> > > - submit_bio_noacct(*bio);
> > > - *bio = split;
> > > - }
> > > -}
> > > -
> > > static blk_qc_t dm_process_bio(struct mapped_device *md,
> > > struct dm_table *map, struct bio *bio)
> > > {
> > > @@ -1759,17 +1726,7 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
> > > }
> > > }
> > >
> > > - /*
> > > - * If in ->queue_bio we need to use blk_queue_split(), otherwise
> > > - * queue_limits for abnormal requests (e.g. discard, writesame, etc)
> > > - * won't be imposed.
> > > - */
> > > - if (current->bio_list) {
> > > - if (is_abnormal_io(bio))
> > > - blk_queue_split(&bio);
> > > - else
> > > - dm_queue_split(md, ti, &bio);
> > > - }
> > > + blk_queue_split(&bio);
> >
> > In max_io_len(), target boundary is taken into account when figuring out
> > the max io len. However, this info won't be used any more after
> > switching to blk_queue_split(). Is that one potential problem?
>
> Thanks for your review. But no, as the patch header says:
> "dm_queue_split() is removed because __split_and_process_bio() handles
> splitting as needed."
>
> (__split_and_process_non_flush calls max_io_len, as does
> __process_abnormal_io by calling __send_changing_extent_only)
>
> So the blk_queue_split() bio will be further split if needed (due to
> DM target boundary, etc).
Thanks for your explanation.
Then it looks like there is a double-split issue, since both blk_queue_split()
and __split_and_process_non_flush() may split a bio from the same bioset
(md->queue->bio_split), and this may cause a deadlock; see the comment for
bio_alloc_bioset(), especially the paragraph 'callers must never allocate
more than 1 bio at a time from this pool.'
Thanks,
Ming
* Re: [PATCH v2 4/4] dm: unconditionally call blk_queue_split() in dm_process_bio()
2020-09-16 1:48 ` Ming Lei
@ 2020-09-16 3:39 ` Mike Snitzer
2020-09-16 7:51 ` Ming Lei
0 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2020-09-16 3:39 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, Vijayendra Suman, dm-devel, linux-block
On Tue, Sep 15 2020 at 9:48pm -0400,
Ming Lei <ming.lei@redhat.com> wrote:
> On Tue, Sep 15, 2020 at 09:28:14PM -0400, Mike Snitzer wrote:
> > On Tue, Sep 15 2020 at 9:08pm -0400,
> > Ming Lei <ming.lei@redhat.com> wrote:
> >
> > > On Tue, Sep 15, 2020 at 01:23:57PM -0400, Mike Snitzer wrote:
> > > > blk_queue_split() has become compulsory from .submit_bio -- regardless
> > > > of whether it is recursing. Update DM core to always call
> > > > blk_queue_split().
> > > >
> > > > dm_queue_split() is removed because __split_and_process_bio() handles
> > > > splitting as needed.
> > > >
> > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > ---
> > > > drivers/md/dm.c | 45 +--------------------------------------------
> > > > 1 file changed, 1 insertion(+), 44 deletions(-)
> > > >
> > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > index fb0255d25e4b..0bae9f26dc8e 100644
> > > > --- a/drivers/md/dm.c
> > > > +++ b/drivers/md/dm.c
> > > > @@ -1530,22 +1530,6 @@ static int __send_write_zeroes(struct clone_info *ci, struct dm_target *ti)
> > > > return __send_changing_extent_only(ci, ti, get_num_write_zeroes_bios(ti));
> > > > }
> > > >
> > > > -static bool is_abnormal_io(struct bio *bio)
> > > > -{
> > > > - bool r = false;
> > > > -
> > > > - switch (bio_op(bio)) {
> > > > - case REQ_OP_DISCARD:
> > > > - case REQ_OP_SECURE_ERASE:
> > > > - case REQ_OP_WRITE_SAME:
> > > > - case REQ_OP_WRITE_ZEROES:
> > > > - r = true;
> > > > - break;
> > > > - }
> > > > -
> > > > - return r;
> > > > -}
> > > > -
> > > > static bool __process_abnormal_io(struct clone_info *ci, struct dm_target *ti,
> > > > int *result)
> > > > {
> > > > @@ -1723,23 +1707,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
> > > > return ret;
> > > > }
> > > >
> > > > -static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
> > > > -{
> > > > - unsigned len, sector_count;
> > > > -
> > > > - sector_count = bio_sectors(*bio);
> > > > - len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
> > > > -
> > > > - if (sector_count > len) {
> > > > - struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
> > > > -
> > > > - bio_chain(split, *bio);
> > > > - trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
> > > > - submit_bio_noacct(*bio);
> > > > - *bio = split;
> > > > - }
> > > > -}
> > > > -
> > > > static blk_qc_t dm_process_bio(struct mapped_device *md,
> > > > struct dm_table *map, struct bio *bio)
> > > > {
> > > > @@ -1759,17 +1726,7 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
> > > > }
> > > > }
> > > >
> > > > - /*
> > > > - * If in ->queue_bio we need to use blk_queue_split(), otherwise
> > > > - * queue_limits for abnormal requests (e.g. discard, writesame, etc)
> > > > - * won't be imposed.
> > > > - */
> > > > - if (current->bio_list) {
> > > > - if (is_abnormal_io(bio))
> > > > - blk_queue_split(&bio);
> > > > - else
> > > > - dm_queue_split(md, ti, &bio);
> > > > - }
> > > > + blk_queue_split(&bio);
> > >
> > > In max_io_len(), the target boundary is taken into account when figuring out
> > > the max io len. However, this info won't be used any more after
> > > switching to blk_queue_split(). Is that a potential problem?
> >
> > Thanks for your review. But no, as the patch header says:
> > "dm_queue_split() is removed because __split_and_process_bio() handles
> > splitting as needed."
> >
> > (__split_and_process_non_flush calls max_io_len, as does
> > __process_abnormal_io by calling __send_changing_extent_only)
> >
> > So the blk_queue_split() bio will be further split if needed (due to
> > DM target boundary, etc).
>
> Thanks for your explanation.
>
> Then it looks like there is a double-split issue, since both blk_queue_split()
> and __split_and_process_non_flush() may split the bio from the same bioset (md->queue->bio_split),
> and this may cause a deadlock; see the comment of bio_alloc_bioset(), especially
> the paragraph 'callers must never allocate more than 1 bio at a time
> from this pool.'
Next sentence is:
"Callers that need to allocate more than 1 bio must always submit the
previously allocated bio for IO before attempting to allocate a new
one."
__split_and_process_non_flush -> __map_bio -> submit_bio_noacct
	bio_split
	submit_bio_noacct
With commit 18a25da84354c, NeilBrown wrote __split_and_process_bio()
with an eye toward depth-first submission to avoid the deadlock you're
concerned about. That commit header speaks to it directly.
I did go on to change Neil's code a bit with commit f21c601a2bb31 -- but
I _think_ the current code is still OK relative to bio_split mempool
use.
Mike
* Re: [PATCH v2 4/4] dm: unconditionally call blk_queue_split() in dm_process_bio()
2020-09-16 3:39 ` Mike Snitzer
@ 2020-09-16 7:51 ` Ming Lei
0 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2020-09-16 7:51 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, Vijayendra Suman, dm-devel, linux-block
On Tue, Sep 15, 2020 at 11:39:46PM -0400, Mike Snitzer wrote:
> On Tue, Sep 15 2020 at 9:48pm -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
>
> > On Tue, Sep 15, 2020 at 09:28:14PM -0400, Mike Snitzer wrote:
> > > On Tue, Sep 15 2020 at 9:08pm -0400,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > >
> > > > On Tue, Sep 15, 2020 at 01:23:57PM -0400, Mike Snitzer wrote:
> > > > > blk_queue_split() has become compulsory from .submit_bio -- regardless
> > > > > of whether it is recursing. Update DM core to always call
> > > > > blk_queue_split().
> > > > >
> > > > > dm_queue_split() is removed because __split_and_process_bio() handles
> > > > > splitting as needed.
> > > > >
> > > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > > ---
> > > > > drivers/md/dm.c | 45 +--------------------------------------------
> > > > > 1 file changed, 1 insertion(+), 44 deletions(-)
> > > > >
> > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > index fb0255d25e4b..0bae9f26dc8e 100644
> > > > > --- a/drivers/md/dm.c
> > > > > +++ b/drivers/md/dm.c
> > > > > @@ -1530,22 +1530,6 @@ static int __send_write_zeroes(struct clone_info *ci, struct dm_target *ti)
> > > > > return __send_changing_extent_only(ci, ti, get_num_write_zeroes_bios(ti));
> > > > > }
> > > > >
> > > > > -static bool is_abnormal_io(struct bio *bio)
> > > > > -{
> > > > > - bool r = false;
> > > > > -
> > > > > - switch (bio_op(bio)) {
> > > > > - case REQ_OP_DISCARD:
> > > > > - case REQ_OP_SECURE_ERASE:
> > > > > - case REQ_OP_WRITE_SAME:
> > > > > - case REQ_OP_WRITE_ZEROES:
> > > > > - r = true;
> > > > > - break;
> > > > > - }
> > > > > -
> > > > > - return r;
> > > > > -}
> > > > > -
> > > > > static bool __process_abnormal_io(struct clone_info *ci, struct dm_target *ti,
> > > > > int *result)
> > > > > {
> > > > > @@ -1723,23 +1707,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
> > > > > return ret;
> > > > > }
> > > > >
> > > > > -static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
> > > > > -{
> > > > > - unsigned len, sector_count;
> > > > > -
> > > > > - sector_count = bio_sectors(*bio);
> > > > > - len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
> > > > > -
> > > > > - if (sector_count > len) {
> > > > > - struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
> > > > > -
> > > > > - bio_chain(split, *bio);
> > > > > - trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
> > > > > - submit_bio_noacct(*bio);
> > > > > - *bio = split;
> > > > > - }
> > > > > -}
> > > > > -
> > > > > static blk_qc_t dm_process_bio(struct mapped_device *md,
> > > > > struct dm_table *map, struct bio *bio)
> > > > > {
> > > > > @@ -1759,17 +1726,7 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
> > > > > }
> > > > > }
> > > > >
> > > > > - /*
> > > > > - * If in ->queue_bio we need to use blk_queue_split(), otherwise
> > > > > - * queue_limits for abnormal requests (e.g. discard, writesame, etc)
> > > > > - * won't be imposed.
> > > > > - */
> > > > > - if (current->bio_list) {
> > > > > - if (is_abnormal_io(bio))
> > > > > - blk_queue_split(&bio);
> > > > > - else
> > > > > - dm_queue_split(md, ti, &bio);
> > > > > - }
> > > > > + blk_queue_split(&bio);
> > > >
> > > > In max_io_len(), the target boundary is taken into account when figuring out
> > > > the max io len. However, this info won't be used any more after
> > > > switching to blk_queue_split(). Is that a potential problem?
> > >
> > > Thanks for your review. But no, as the patch header says:
> > > "dm_queue_split() is removed because __split_and_process_bio() handles
> > > splitting as needed."
> > >
> > > (__split_and_process_non_flush calls max_io_len, as does
> > > __process_abnormal_io by calling __send_changing_extent_only)
> > >
> > > So the blk_queue_split() bio will be further split if needed (due to
> > > DM target boundary, etc).
> >
> > Thanks for your explanation.
> >
> > Then it looks like there is a double-split issue, since both blk_queue_split()
> > and __split_and_process_non_flush() may split the bio from the same bioset (md->queue->bio_split),
> > and this may cause a deadlock; see the comment of bio_alloc_bioset(), especially
> > the paragraph 'callers must never allocate more than 1 bio at a time
> > from this pool.'
>
> Next sentence is:
> "Callers that need to allocate more than 1 bio must always submit the
> previously allocated bio for IO before attempting to allocate a new
> one."
Yeah, I know that. That sentence actually means that the previous
submission must make forward progress, so the bio can be completed &
freed and the new allocation can move on.
However, in this situation, __split_and_process_non_flush() doesn't
provide such forward progress; see below.
>
> __split_and_process_non_flush -> __map_bio -> submit_bio_noacct
> bio_split
> submit_bio_noacct
Yeah, the above submission is done on the clone bio & the underlying queue.
What matters is whether the submission can make forward progress. After
__split_and_process_non_flush() returns, the split 'bio' (the original bio)
can't be completed by the previous submission, because this bio won't be freed
until dec_pending() from __split_and_process_bio() returns.
So when ci.sector_count hasn't reached zero and bio_split() is called again from
the same bio_set to allocate a new bio, that allocation may never be made, because
the original bio allocated from the same bio_set can't be freed while bio_split() waits.
Thanks,
Ming
end of thread, other threads:[~2020-09-16 7:51 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
2020-09-15 17:23 [PATCH v2 0/4] block: a couple chunk_sectors fixes/improvements Mike Snitzer
2020-09-15 17:23 ` Mike Snitzer
2020-09-15 17:23 ` [PATCH v2 1/4] block: use lcm_not_zero() when stacking chunk_sectors Mike Snitzer
2020-09-15 17:23 ` Mike Snitzer
2020-09-15 17:23 ` [PATCH v2 2/4] block: allow 'chunk_sectors' to be non-power-of-2 Mike Snitzer
2020-09-15 17:23 ` Mike Snitzer
2020-09-15 17:23 ` [PATCH v2 3/4] dm table: stack 'chunk_sectors' limit to account for target-specific splitting Mike Snitzer
2020-09-15 17:23 ` Mike Snitzer
2020-09-15 17:23 ` [PATCH v2 4/4] dm: unconditionally call blk_queue_split() in dm_process_bio() Mike Snitzer
2020-09-15 17:23 ` Mike Snitzer
2020-09-16 1:08 ` Ming Lei
2020-09-16 1:28 ` Mike Snitzer
2020-09-16 1:48 ` Ming Lei
2020-09-16 3:39 ` Mike Snitzer
2020-09-16 7:51 ` Ming Lei