* [PATCH 5/5] block_dev: make blkdev_dio_pool a non-rescuing bioset
2017-03-10 6:00 [PATCH 0/5] Updates following recent generic_make_request improvement NeilBrown
@ 2017-03-10 6:00 ` NeilBrown
2017-03-10 6:00 ` [PATCH 2/5] blk: remove bio_set arg from blk_queue_split() NeilBrown
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: NeilBrown @ 2017-03-10 6:00 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Mike Snitzer, LKML, linux-raid,
device-mapper development, Mikulas Patocka, Pavel Machek,
Jack Wang, Lars Ellenberg, Kent Overstreet
Allocations from blkdev_dio_pool are never made under
generic_make_request, so this bioset does not need a rescuer thread.
Signed-off-by: NeilBrown <neilb@suse.com>
---
fs/block_dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index c0ca5f0d0369..2eca00ec4370 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -436,7 +436,7 @@ blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
static __init int blkdev_init(void)
{
- blkdev_dio_pool = bioset_create_rescued(4, offsetof(struct blkdev_dio, bio));
+ blkdev_dio_pool = bioset_create(4, offsetof(struct blkdev_dio, bio));
if (!blkdev_dio_pool)
return -ENOMEM;
return 0;
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/5] blk: remove bio_set arg from blk_queue_split()
2017-03-10 6:00 [PATCH 0/5] Updates following recent generic_make_request improvement NeilBrown
2017-03-10 6:00 ` [PATCH 5/5] block_dev: make blkdev_dio_pool a non-rescuing bioset NeilBrown
@ 2017-03-10 6:00 ` NeilBrown
2017-03-10 6:00 ` [PATCH 3/5] blk: make the bioset rescue_workqueue optional NeilBrown
` (3 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: NeilBrown @ 2017-03-10 6:00 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Mike Snitzer, LKML, linux-raid,
device-mapper development, Mikulas Patocka, Pavel Machek,
Jack Wang, Lars Ellenberg, Kent Overstreet
blk_queue_split() is always called with the last arg being q->bio_split,
where 'q' is the first arg.
Also blk_queue_split() sometimes uses the passed-in 'bs' and sometimes uses
q->bio_split.
This is inconsistent and unnecessary. Remove the last arg and always use
q->bio_split inside blk_queue_split()
Signed-off-by: NeilBrown <neilb@suse.com>
---
block/blk-core.c | 2 +-
block/blk-merge.c | 7 +++----
block/blk-mq.c | 4 ++--
drivers/block/drbd/drbd_req.c | 2 +-
drivers/block/pktcdvd.c | 2 +-
drivers/block/ps3vram.c | 2 +-
drivers/block/rsxx/dev.c | 2 +-
drivers/block/umem.c | 2 +-
drivers/block/zram/zram_drv.c | 2 +-
drivers/lightnvm/rrpc.c | 2 +-
drivers/md/md.c | 2 +-
drivers/s390/block/dcssblk.c | 2 +-
drivers/s390/block/xpram.c | 2 +-
include/linux/blkdev.h | 3 +--
14 files changed, 17 insertions(+), 19 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index d772c221cc17..fae7966e1f98 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1635,7 +1635,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
*/
blk_queue_bounce(q, &bio);
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
bio->bi_error = -EIO;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2afa262425d1..ce8838aff7f7 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -188,8 +188,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
return do_split ? new : NULL;
}
-void blk_queue_split(struct request_queue *q, struct bio **bio,
- struct bio_set *bs)
+void blk_queue_split(struct request_queue *q, struct bio **bio)
{
struct bio *split, *res;
unsigned nsegs;
@@ -197,14 +196,14 @@ void blk_queue_split(struct request_queue *q, struct bio **bio,
switch (bio_op(*bio)) {
case REQ_OP_DISCARD:
case REQ_OP_SECURE_ERASE:
- split = blk_bio_discard_split(q, *bio, bs, &nsegs);
+ split = blk_bio_discard_split(q, *bio, q->bio_split, &nsegs);
break;
case REQ_OP_WRITE_ZEROES:
split = NULL;
nsegs = (*bio)->bi_phys_segments;
break;
case REQ_OP_WRITE_SAME:
- split = blk_bio_write_same_split(q, *bio, bs, &nsegs);
+ split = blk_bio_write_same_split(q, *bio, q->bio_split, &nsegs);
break;
default:
split = blk_bio_segment_split(q, *bio, q->bio_split, &nsegs);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 159187a28d66..e582d7f7511e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1502,7 +1502,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
return BLK_QC_T_NONE;
}
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
if (!is_flush_fua && !blk_queue_nomerges(q) &&
blk_attempt_plug_merge(q, bio, &request_count, &same_queue_rq))
@@ -1624,7 +1624,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
return BLK_QC_T_NONE;
}
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
if (!is_flush_fua && !blk_queue_nomerges(q)) {
if (blk_attempt_plug_merge(q, bio, &request_count, NULL))
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 652114ae1a8a..f6ed6f7f5ab2 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1554,7 +1554,7 @@ blk_qc_t drbd_make_request(struct request_queue *q, struct bio *bio)
struct drbd_device *device = (struct drbd_device *) q->queuedata;
unsigned long start_jif;
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
start_jif = jiffies;
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 66d846ba85a9..98394d034c29 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2414,7 +2414,7 @@ static blk_qc_t pkt_make_request(struct request_queue *q, struct bio *bio)
blk_queue_bounce(q, &bio);
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
pd = q->queuedata;
if (!pd) {
diff --git a/drivers/block/ps3vram.c b/drivers/block/ps3vram.c
index 456b4fe21559..48072c0c1010 100644
--- a/drivers/block/ps3vram.c
+++ b/drivers/block/ps3vram.c
@@ -606,7 +606,7 @@ static blk_qc_t ps3vram_make_request(struct request_queue *q, struct bio *bio)
dev_dbg(&dev->core, "%s\n", __func__);
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
spin_lock_irq(&priv->lock);
busy = !bio_list_empty(&priv->list);
diff --git a/drivers/block/rsxx/dev.c b/drivers/block/rsxx/dev.c
index f81d70b39d10..8c8024f616ae 100644
--- a/drivers/block/rsxx/dev.c
+++ b/drivers/block/rsxx/dev.c
@@ -151,7 +151,7 @@ static blk_qc_t rsxx_make_request(struct request_queue *q, struct bio *bio)
struct rsxx_bio_meta *bio_meta;
int st = -EINVAL;
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
might_sleep();
diff --git a/drivers/block/umem.c b/drivers/block/umem.c
index c141cc3be22b..c8d8a2f16f8e 100644
--- a/drivers/block/umem.c
+++ b/drivers/block/umem.c
@@ -529,7 +529,7 @@ static blk_qc_t mm_make_request(struct request_queue *q, struct bio *bio)
(unsigned long long)bio->bi_iter.bi_sector,
bio->bi_iter.bi_size);
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
spin_lock_irq(&card->lock);
*card->biotail = bio;
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index dceb5edd1e54..49d8bf37c4ef 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -880,7 +880,7 @@ static blk_qc_t zram_make_request(struct request_queue *queue, struct bio *bio)
{
struct zram *zram = queue->queuedata;
- blk_queue_split(queue, &bio, queue->bio_split);
+ blk_queue_split(queue, &bio);
if (!valid_io_request(zram, bio->bi_iter.bi_sector,
bio->bi_iter.bi_size)) {
diff --git a/drivers/lightnvm/rrpc.c b/drivers/lightnvm/rrpc.c
index e00b1d7b976f..701fa2fafbb2 100644
--- a/drivers/lightnvm/rrpc.c
+++ b/drivers/lightnvm/rrpc.c
@@ -999,7 +999,7 @@ static blk_qc_t rrpc_make_rq(struct request_queue *q, struct bio *bio)
struct nvm_rq *rqd;
int err;
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
if (bio_op(bio) == REQ_OP_DISCARD) {
rrpc_discard(rrpc, bio);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 548d1b8014f8..d7d2bb51a58d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -253,7 +253,7 @@ static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
unsigned int sectors;
int cpu;
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
if (mddev == NULL || mddev->pers == NULL) {
bio_io_error(bio);
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 415d10a67b7a..10ece6f3c7eb 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -829,7 +829,7 @@ dcssblk_make_request(struct request_queue *q, struct bio *bio)
unsigned long source_addr;
unsigned long bytes_done;
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
bytes_done = 0;
dev_info = bio->bi_bdev->bd_disk->private_data;
diff --git a/drivers/s390/block/xpram.c b/drivers/s390/block/xpram.c
index b9d7e755c8a3..a48f0d40c1d2 100644
--- a/drivers/s390/block/xpram.c
+++ b/drivers/s390/block/xpram.c
@@ -190,7 +190,7 @@ static blk_qc_t xpram_make_request(struct request_queue *q, struct bio *bio)
unsigned long page_addr;
unsigned long bytes;
- blk_queue_split(q, &bio, q->bio_split);
+ blk_queue_split(q, &bio);
if ((bio->bi_iter.bi_sector & 7) != 0 ||
(bio->bi_iter.bi_size & 4095) != 0)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 5a7da607ca04..8f98b4e2b9e6 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -933,8 +933,7 @@ extern int blk_insert_cloned_request(struct request_queue *q,
struct request *rq);
extern int blk_rq_append_bio(struct request *rq, struct bio *bio);
extern void blk_delay_queue(struct request_queue *, unsigned long);
-extern void blk_queue_split(struct request_queue *, struct bio **,
- struct bio_set *);
+extern void blk_queue_split(struct request_queue *, struct bio **);
extern void blk_recount_segments(struct request_queue *, struct bio *);
extern int scsi_verify_blk_ioctl(struct block_device *, unsigned int);
extern int scsi_cmd_blk_ioctl(struct block_device *, fmode_t,
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/5] blk: make the bioset rescue_workqueue optional.
2017-03-10 6:00 [PATCH 0/5] Updates following recent generic_make_request improvement NeilBrown
2017-03-10 6:00 ` [PATCH 5/5] block_dev: make blkdev_dio_pool a non-rescuing bioset NeilBrown
2017-03-10 6:00 ` [PATCH 2/5] blk: remove bio_set arg from blk_queue_split() NeilBrown
@ 2017-03-10 6:00 ` NeilBrown
2017-03-10 6:00 ` NeilBrown
` (2 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: NeilBrown @ 2017-03-10 6:00 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Mike Snitzer, LKML, linux-raid,
device-mapper development, Mikulas Patocka, Pavel Machek,
Jack Wang, Lars Ellenberg, Kent Overstreet
This patch converts bioset_create() and bioset_create_nobvec()
to not create a workqueue so alloctions will never trigger
punt_bios_to_rescuer().
It also introduces bioset_create_rescued() and bioset_create_nobvec_rescued()
which preserve the old behaviour.
*All* callers of bioset_create() and bioset_create_nobvec() are
converted to the _rescued() version, so that no change in behaviour
is experienced.
It is hoped that most, if not all, biosets can end up being the
non-rescued version.
Signed-off-by: NeilBrown <neilb@suse.com>
---
block/bio.c | 30 +++++++++++++++++++++++++-----
block/blk-core.c | 2 +-
drivers/block/drbd/drbd_main.c | 2 +-
drivers/md/bcache/super.c | 4 ++--
drivers/md/dm-crypt.c | 2 +-
drivers/md/dm-io.c | 2 +-
drivers/md/dm.c | 5 +++--
drivers/md/md.c | 2 +-
drivers/md/raid5-cache.c | 2 +-
drivers/target/target_core_iblock.c | 2 +-
fs/block_dev.c | 2 +-
fs/btrfs/extent_io.c | 4 ++--
fs/xfs/xfs_super.c | 2 +-
include/linux/bio.h | 2 ++
14 files changed, 43 insertions(+), 20 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index e75878f8b14a..3790c3f376b6 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -362,6 +362,8 @@ static void punt_bios_to_rescuer(struct bio_set *bs)
struct bio_list punt, nopunt;
struct bio *bio;
+ if (!WARN_ON_ONCE(!bs->rescue_workqueue))
+ return;
/*
* In order to guarantee forward progress we must punt only bios that
* were allocated from this bio_set; otherwise, if there was a bio on
@@ -472,7 +474,8 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
if (current->bio_list &&
(!bio_list_empty(¤t->bio_list[0]) ||
- !bio_list_empty(¤t->bio_list[1])))
+ !bio_list_empty(¤t->bio_list[1])) &&
+ bs->rescue_workqueue)
gfp_mask &= ~__GFP_DIRECT_RECLAIM;
p = mempool_alloc(bs->bio_pool, gfp_mask);
@@ -1941,7 +1944,8 @@ EXPORT_SYMBOL(bioset_free);
static struct bio_set *__bioset_create(unsigned int pool_size,
unsigned int front_pad,
- bool create_bvec_pool)
+ bool create_bvec_pool,
+ bool create_rescue_workqueue)
{
unsigned int back_pad = BIO_INLINE_VECS * sizeof(struct bio_vec);
struct bio_set *bs;
@@ -1972,6 +1976,9 @@ static struct bio_set *__bioset_create(unsigned int pool_size,
goto bad;
}
+ if (!create_rescue_workqueue)
+ return bs;
+
bs->rescue_workqueue = alloc_workqueue("bioset", WQ_MEM_RECLAIM, 0);
if (!bs->rescue_workqueue)
goto bad;
@@ -1997,10 +2004,16 @@ static struct bio_set *__bioset_create(unsigned int pool_size,
*/
struct bio_set *bioset_create(unsigned int pool_size, unsigned int front_pad)
{
- return __bioset_create(pool_size, front_pad, true);
+ return __bioset_create(pool_size, front_pad, true, false);
}
EXPORT_SYMBOL(bioset_create);
+struct bio_set *bioset_create_rescued(unsigned int pool_size, unsigned int front_pad)
+{
+ return __bioset_create(pool_size, front_pad, true, true);
+}
+EXPORT_SYMBOL(bioset_create_rescued);
+
/**
* bioset_create_nobvec - Create a bio_set without bio_vec mempool
* @pool_size: Number of bio to cache in the mempool
@@ -2012,10 +2025,17 @@ EXPORT_SYMBOL(bioset_create);
*/
struct bio_set *bioset_create_nobvec(unsigned int pool_size, unsigned int front_pad)
{
- return __bioset_create(pool_size, front_pad, false);
+ return __bioset_create(pool_size, front_pad, false, false);
}
EXPORT_SYMBOL(bioset_create_nobvec);
+struct bio_set *bioset_create_nobvec_rescued(unsigned int pool_size,
+ unsigned int front_pad)
+{
+ return __bioset_create(pool_size, front_pad, false, true);
+}
+EXPORT_SYMBOL(bioset_create_nobvec_rescued);
+
#ifdef CONFIG_BLK_CGROUP
/**
@@ -2130,7 +2150,7 @@ static int __init init_bio(void)
bio_integrity_init();
biovec_init_slabs();
- fs_bio_set = bioset_create(BIO_POOL_SIZE, 0);
+ fs_bio_set = bioset_create_rescued(BIO_POOL_SIZE, 0);
if (!fs_bio_set)
panic("bio: can't allocate bios\n");
diff --git a/block/blk-core.c b/block/blk-core.c
index fae7966e1f98..430c82f646eb 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -712,7 +712,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
if (q->id < 0)
goto fail_q;
- q->bio_split = bioset_create(BIO_POOL_SIZE, 0);
+ q->bio_split = bioset_create_rescued(BIO_POOL_SIZE, 0);
if (!q->bio_split)
goto fail_id;
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 92c60cbd04ee..2c69c2ab0fff 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2166,7 +2166,7 @@ static int drbd_create_mempools(void)
goto Enomem;
/* mempools */
- drbd_md_io_bio_set = bioset_create(DRBD_MIN_POOL_PAGES, 0);
+ drbd_md_io_bio_set = bioset_create_rescued(DRBD_MIN_POOL_PAGES, 0);
if (drbd_md_io_bio_set == NULL)
goto Enomem;
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 85e3f21c2514..6cb30792f0ed 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -786,7 +786,7 @@ static int bcache_device_init(struct bcache_device *d, unsigned block_size,
minor *= BCACHE_MINORS;
- if (!(d->bio_split = bioset_create(4, offsetof(struct bbio, bio))) ||
+ if (!(d->bio_split = bioset_create_rescued(4, offsetof(struct bbio, bio))) ||
!(d->disk = alloc_disk(BCACHE_MINORS))) {
ida_simple_remove(&bcache_minor, minor);
return -ENOMEM;
@@ -1520,7 +1520,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
sizeof(struct bbio) + sizeof(struct bio_vec) *
bucket_pages(c))) ||
!(c->fill_iter = mempool_create_kmalloc_pool(1, iter_size)) ||
- !(c->bio_split = bioset_create(4, offsetof(struct bbio, bio))) ||
+ !(c->bio_split = bioset_create_rescued(4, offsetof(struct bbio, bio))) ||
!(c->uuids = alloc_bucket_pages(GFP_KERNEL, c)) ||
!(c->moving_gc_wq = alloc_workqueue("bcache_gc",
WQ_MEM_RECLAIM, 0)) ||
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 389a3637ffcc..91a2d637d44f 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1936,7 +1936,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
goto bad;
}
- cc->bs = bioset_create(MIN_IOS, 0);
+ cc->bs = bioset_create_rescued(MIN_IOS, 0);
if (!cc->bs) {
ti->error = "Cannot allocate crypt bioset";
goto bad;
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 03940bf36f6c..fe1241c196b1 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -58,7 +58,7 @@ struct dm_io_client *dm_io_client_create(void)
if (!client->pool)
goto bad;
- client->bios = bioset_create(min_ios, 0);
+ client->bios = bioset_create_rescued(min_ios, 0);
if (!client->bios)
goto bad;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index dfb75979e455..41b1f033841f 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1002,7 +1002,8 @@ static void flush_current_bio_list(struct blk_plug_cb *cb, bool from_schedule)
while ((bio = bio_list_pop(&list))) {
struct bio_set *bs = bio->bi_pool;
- if (unlikely(!bs) || bs == fs_bio_set) {
+ if (unlikely(!bs) || bs == fs_bio_set ||
+ !bs->rescue_workqueue) {
bio_list_add(¤t->bio_list[i], bio);
continue;
}
@@ -2577,7 +2578,7 @@ struct dm_md_mempools *dm_alloc_md_mempools(struct mapped_device *md, unsigned t
BUG();
}
- pools->bs = bioset_create_nobvec(pool_size, front_pad);
+ pools->bs = bioset_create_nobvec_rescued(pool_size, front_pad);
if (!pools->bs)
goto out;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index d7d2bb51a58d..e5f08a195837 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5220,7 +5220,7 @@ int md_run(struct mddev *mddev)
}
if (mddev->bio_set == NULL) {
- mddev->bio_set = bioset_create(BIO_POOL_SIZE, 0);
+ mddev->bio_set = bioset_create_rescued(BIO_POOL_SIZE, 0);
if (!mddev->bio_set)
return -ENOMEM;
}
diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 3f307be01b10..c95c6c046395 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -2831,7 +2831,7 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
if (!log->io_pool)
goto io_pool;
- log->bs = bioset_create(R5L_POOL_SIZE, 0);
+ log->bs = bioset_create_rescued(R5L_POOL_SIZE, 0);
if (!log->bs)
goto io_bs;
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index d316ed537d59..5bf3392195c6 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -93,7 +93,7 @@ static int iblock_configure_device(struct se_device *dev)
return -EINVAL;
}
- ib_dev->ibd_bio_set = bioset_create(IBLOCK_BIO_POOL_SIZE, 0);
+ ib_dev->ibd_bio_set = bioset_create_rescued(IBLOCK_BIO_POOL_SIZE, 0);
if (!ib_dev->ibd_bio_set) {
pr_err("IBLOCK: Unable to create bioset\n");
goto out;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 2eca00ec4370..c0ca5f0d0369 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -436,7 +436,7 @@ blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
static __init int blkdev_init(void)
{
- blkdev_dio_pool = bioset_create(4, offsetof(struct blkdev_dio, bio));
+ blkdev_dio_pool = bioset_create_rescued(4, offsetof(struct blkdev_dio, bio));
if (!blkdev_dio_pool)
return -ENOMEM;
return 0;
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 28e81922a21c..34aa8893790a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -173,8 +173,8 @@ int __init extent_io_init(void)
if (!extent_buffer_cache)
goto free_state_cache;
- btrfs_bioset = bioset_create(BIO_POOL_SIZE,
- offsetof(struct btrfs_io_bio, bio));
+ btrfs_bioset = bioset_create_rescued(BIO_POOL_SIZE,
+ offsetof(struct btrfs_io_bio, bio));
if (!btrfs_bioset)
goto free_buffer_cache;
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 890862f2447c..f4c4d6f41d91 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1756,7 +1756,7 @@ MODULE_ALIAS_FS("xfs");
STATIC int __init
xfs_init_zones(void)
{
- xfs_ioend_bioset = bioset_create(4 * MAX_BUF_PER_PAGE,
+ xfs_ioend_bioset = bioset_create_rescued(4 * MAX_BUF_PER_PAGE,
offsetof(struct xfs_ioend, io_inline_bio));
if (!xfs_ioend_bioset)
goto out;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 8e521194f6fc..05730603fcf1 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -379,7 +379,9 @@ static inline struct bio *bio_next_split(struct bio *bio, int sectors,
}
extern struct bio_set *bioset_create(unsigned int, unsigned int);
+extern struct bio_set *bioset_create_rescued(unsigned int, unsigned int);
extern struct bio_set *bioset_create_nobvec(unsigned int, unsigned int);
+extern struct bio_set *bioset_create_nobvec_rescued(unsigned int, unsigned int);
extern void bioset_free(struct bio_set *);
extern mempool_t *biovec_create_pool(int pool_entries);
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/5] blk: use non-rescuing bioset for q->bio_split.
2017-03-10 6:00 [PATCH 0/5] Updates following recent generic_make_request improvement NeilBrown
@ 2017-03-10 6:00 ` NeilBrown
2017-03-10 6:00 ` [PATCH 2/5] blk: remove bio_set arg from blk_queue_split() NeilBrown
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: NeilBrown @ 2017-03-10 6:00 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-raid, Mike Snitzer, LKML, linux-block,
device-mapper development, Mikulas Patocka, Pavel Machek,
Jack Wang, Lars Ellenberg, Kent Overstreet
A rescuing bioset is only useful if there might be bios from
that same bioset on the bio_list_on_stack queue at a time
when bio_alloc_bioset() is called. This never applies to
q->bio_split.
Allocations from q->bio_split are only ever made from
blk_queue_split() which is only ever called early in each of
various make_request_fn()s. The original bio (call this A)
is then passed to generic_make_request() and is placed on
the bio_list_on_stack queue, and the bio that was allocated
from q->bio_split (B) is processed.
The processing of this may cause other bios to be passed to
generic_make_request() or may even cause the bio B itself to
be passed, possible after some prefix has been split off
(using some other bioset).
generic_make_request() now guarantees that all of these bios
(B and dependants) will be fully processed before the tail
of the original bio A gets handled. None of these early bios
can possible trigger an allocation from the original
q->bio_split as they are either too small to require
splitting or (more likely) are destined for a different queue.
The next time that the original q->bio_split might be used
by this thread is when A is processed again, as it might
still be too big to handle directly. By this time there
cannot be any other bios allocated from q->bio_split in the
generic_make_request() queue. So no rescuing will ever be
needed.
Signed-off-by: NeilBrown <neilb@suse.com>
---
block/blk-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 430c82f646eb..fae7966e1f98 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -712,7 +712,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
if (q->id < 0)
goto fail_q;
- q->bio_split = bioset_create_rescued(BIO_POOL_SIZE, 0);
+ q->bio_split = bioset_create(BIO_POOL_SIZE, 0);
if (!q->bio_split)
goto fail_id;
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/5] blk: use non-rescuing bioset for q->bio_split.
@ 2017-03-10 6:00 ` NeilBrown
0 siblings, 0 replies; 9+ messages in thread
From: NeilBrown @ 2017-03-10 6:00 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Mike Snitzer, LKML, linux-raid,
device-mapper development, Mikulas Patocka, Pavel Machek,
Jack Wang, Lars Ellenberg, Kent Overstreet
A rescuing bioset is only useful if there might be bios from
that same bioset on the bio_list_on_stack queue at a time
when bio_alloc_bioset() is called. This never applies to
q->bio_split.
Allocations from q->bio_split are only ever made from
blk_queue_split() which is only ever called early in each of
various make_request_fn()s. The original bio (call this A)
is then passed to generic_make_request() and is placed on
the bio_list_on_stack queue, and the bio that was allocated
from q->bio_split (B) is processed.
The processing of this may cause other bios to be passed to
generic_make_request() or may even cause the bio B itself to
be passed, possible after some prefix has been split off
(using some other bioset).
generic_make_request() now guarantees that all of these bios
(B and dependants) will be fully processed before the tail
of the original bio A gets handled. None of these early bios
can possible trigger an allocation from the original
q->bio_split as they are either too small to require
splitting or (more likely) are destined for a different queue.
The next time that the original q->bio_split might be used
by this thread is when A is processed again, as it might
still be too big to handle directly. By this time there
cannot be any other bios allocated from q->bio_split in the
generic_make_request() queue. So no rescuing will ever be
needed.
Signed-off-by: NeilBrown <neilb@suse.com>
---
block/blk-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 430c82f646eb..fae7966e1f98 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -712,7 +712,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
if (q->id < 0)
goto fail_q;
- q->bio_split = bioset_create_rescued(BIO_POOL_SIZE, 0);
+ q->bio_split = bioset_create(BIO_POOL_SIZE, 0);
if (!q->bio_split)
goto fail_id;
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 1/5] blk: Ensure users for current->bio_list can see the full list.
2017-03-10 6:00 [PATCH 0/5] Updates following recent generic_make_request improvement NeilBrown
` (3 preceding siblings ...)
2017-03-10 6:00 ` NeilBrown
@ 2017-03-10 6:00 ` NeilBrown
2017-03-11 22:32 ` [PATCH 0/5] Updates following recent generic_make_request improvement Jens Axboe
5 siblings, 0 replies; 9+ messages in thread
From: NeilBrown @ 2017-03-10 6:00 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Mike Snitzer, LKML, linux-raid,
device-mapper development, Mikulas Patocka, Pavel Machek,
Jack Wang, Lars Ellenberg, Kent Overstreet
Commit 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
changed current->bio_list so that it did not contain *all* of the
queued bios, but only those submitted by the currently running
make_request_fn.
There are two places which walk the list and requeue selected bios,
and others that check if the list is empty. These are no longer
correct.
So redefine current->bio_list to point to an array of two lists, which
contain all queued bios, and adjust various code to test or walk both
lists.
Signed-off-by: NeilBrown <neilb@suse.com>
---
block/bio.c | 12 +++++++++---
block/blk-core.c | 30 ++++++++++++++++++------------
drivers/md/dm.c | 29 ++++++++++++++++-------------
drivers/md/raid10.c | 3 ++-
4 files changed, 45 insertions(+), 29 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 5eec5e08417f..e75878f8b14a 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -376,10 +376,14 @@ static void punt_bios_to_rescuer(struct bio_set *bs)
bio_list_init(&punt);
bio_list_init(&nopunt);
- while ((bio = bio_list_pop(current->bio_list)))
+ while ((bio = bio_list_pop(¤t->bio_list[0])))
bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
+ current->bio_list[0] = nopunt;
- *current->bio_list = nopunt;
+ bio_list_init(&nopunt);
+ while ((bio = bio_list_pop(¤t->bio_list[1])))
+ bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
+ current->bio_list[1] = nopunt;
spin_lock(&bs->rescue_lock);
bio_list_merge(&bs->rescue_list, &punt);
@@ -466,7 +470,9 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
* we retry with the original gfp_flags.
*/
- if (current->bio_list && !bio_list_empty(current->bio_list))
+ if (current->bio_list &&
+ (!bio_list_empty(¤t->bio_list[0]) ||
+ !bio_list_empty(¤t->bio_list[1])))
gfp_mask &= ~__GFP_DIRECT_RECLAIM;
p = mempool_alloc(bs->bio_pool, gfp_mask);
diff --git a/block/blk-core.c b/block/blk-core.c
index 0eeb99ef654f..d772c221cc17 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1973,7 +1973,14 @@ generic_make_request_checks(struct bio *bio)
*/
blk_qc_t generic_make_request(struct bio *bio)
{
- struct bio_list bio_list_on_stack;
+ /*
+ * bio_list_on_stack[0] contains bios submitted by the current
+ * make_request_fn.
+ * bio_list_on_stack[1] contains bios that were submitted before
+ * the current make_request_fn, but that haven't been processed
+ * yet.
+ */
+ struct bio_list bio_list_on_stack[2];
blk_qc_t ret = BLK_QC_T_NONE;
if (!generic_make_request_checks(bio))
@@ -1990,7 +1997,7 @@ blk_qc_t generic_make_request(struct bio *bio)
* should be added at the tail
*/
if (current->bio_list) {
- bio_list_add(current->bio_list, bio);
+ bio_list_add(¤t->bio_list[0], bio);
goto out;
}
@@ -2009,18 +2016,17 @@ blk_qc_t generic_make_request(struct bio *bio)
* bio_list, and call into ->make_request() again.
*/
BUG_ON(bio->bi_next);
- bio_list_init(&bio_list_on_stack);
- current->bio_list = &bio_list_on_stack;
+ bio_list_init(&bio_list_on_stack[0]);
+ current->bio_list = bio_list_on_stack;
do {
struct request_queue *q = bdev_get_queue(bio->bi_bdev);
if (likely(blk_queue_enter(q, false) == 0)) {
- struct bio_list hold;
struct bio_list lower, same;
/* Create a fresh bio_list for all subordinate requests */
- hold = bio_list_on_stack;
- bio_list_init(&bio_list_on_stack);
+ bio_list_on_stack[1] = bio_list_on_stack[0];
+ bio_list_init(&bio_list_on_stack[0]);
ret = q->make_request_fn(q, bio);
blk_queue_exit(q);
@@ -2030,19 +2036,19 @@ blk_qc_t generic_make_request(struct bio *bio)
*/
bio_list_init(&lower);
bio_list_init(&same);
- while ((bio = bio_list_pop(&bio_list_on_stack)) != NULL)
+ while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
if (q == bdev_get_queue(bio->bi_bdev))
bio_list_add(&same, bio);
else
bio_list_add(&lower, bio);
/* now assemble so we handle the lowest level first */
- bio_list_merge(&bio_list_on_stack, &lower);
- bio_list_merge(&bio_list_on_stack, &same);
- bio_list_merge(&bio_list_on_stack, &hold);
+ bio_list_merge(&bio_list_on_stack[0], &lower);
+ bio_list_merge(&bio_list_on_stack[0], &same);
+ bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
} else {
bio_io_error(bio);
}
- bio = bio_list_pop(current->bio_list);
+ bio = bio_list_pop(&bio_list_on_stack[0]);
} while (bio);
current->bio_list = NULL; /* deactivate */
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index f4ffd1eb8f44..dfb75979e455 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -989,26 +989,29 @@ static void flush_current_bio_list(struct blk_plug_cb *cb, bool from_schedule)
struct dm_offload *o = container_of(cb, struct dm_offload, cb);
struct bio_list list;
struct bio *bio;
+ int i;
INIT_LIST_HEAD(&o->cb.list);
if (unlikely(!current->bio_list))
return;
- list = *current->bio_list;
- bio_list_init(current->bio_list);
-
- while ((bio = bio_list_pop(&list))) {
- struct bio_set *bs = bio->bi_pool;
- if (unlikely(!bs) || bs == fs_bio_set) {
- bio_list_add(current->bio_list, bio);
- continue;
+ for (i = 0; i < 2; i++) {
+ list = current->bio_list[i];
+ bio_list_init(¤t->bio_list[i]);
+
+ while ((bio = bio_list_pop(&list))) {
+ struct bio_set *bs = bio->bi_pool;
+ if (unlikely(!bs) || bs == fs_bio_set) {
+ bio_list_add(¤t->bio_list[i], bio);
+ continue;
+ }
+
+ spin_lock(&bs->rescue_lock);
+ bio_list_add(&bs->rescue_list, bio);
+ queue_work(bs->rescue_workqueue, &bs->rescue_work);
+ spin_unlock(&bs->rescue_lock);
}
-
- spin_lock(&bs->rescue_lock);
- bio_list_add(&bs->rescue_list, bio);
- queue_work(bs->rescue_workqueue, &bs->rescue_work);
- spin_unlock(&bs->rescue_lock);
}
}
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 063c43d83b72..0536658c9d40 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -974,7 +974,8 @@ static void wait_barrier(struct r10conf *conf)
!conf->barrier ||
(atomic_read(&conf->nr_pending) &&
current->bio_list &&
- !bio_list_empty(current->bio_list)),
+ (!bio_list_empty(¤t->bio_list[0]) ||
+ !bio_list_empty(¤t->bio_list[1]))),
conf->resync_lock);
conf->nr_waiting--;
if (!conf->nr_waiting)
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 0/5] Updates following recent generic_make_request improvement
2017-03-10 6:00 [PATCH 0/5] Updates following recent generic_make_request improvement NeilBrown
` (4 preceding siblings ...)
2017-03-10 6:00 ` [PATCH 1/5] blk: Ensure users for current->bio_list can see the full list NeilBrown
@ 2017-03-11 22:32 ` Jens Axboe
2017-03-12 21:52 ` [dm-devel] " NeilBrown
5 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2017-03-11 22:32 UTC (permalink / raw)
To: NeilBrown
Cc: linux-block, Mike Snitzer, LKML, linux-raid,
device-mapper development, Mikulas Patocka, Pavel Machek,
Jack Wang, Lars Ellenberg, Kent Overstreet
On 03/09/2017 11:00 PM, NeilBrown wrote:
> This is a rebase of the series I sent earlier, based on the
> very latest from Linus, which included my first patch.
>
> The first fixes a problem that patch introduced, and so should go to
> Linux promptly.
> The others are more general improvements and can go in the normal
> course of events.
>
> It is possible that the changes to btrfs and xfs can just be dropped
> as a subsequent patch will be needed to revert them anyway. They are
> there only to be able to say that "blk: make the bioset
> rescue_workqueue optional." doesn't change any functionality at all.
I have applied the first patch for this series, and added the Fixes
tag. We'll let the rest simmer a bit, then aim for 4.12 for those.
--
Jens Axboe
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dm-devel] [PATCH 0/5] Updates following recent generic_make_request improvement
2017-03-11 22:32 ` [PATCH 0/5] Updates following recent generic_make_request improvement Jens Axboe
@ 2017-03-12 21:52 ` NeilBrown
0 siblings, 0 replies; 9+ messages in thread
From: NeilBrown @ 2017-03-12 21:52 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-raid, Mike Snitzer, LKML, linux-block,
device-mapper development, Mikulas Patocka, Pavel Machek,
Jack Wang, Lars Ellenberg, Kent Overstreet
[-- Attachment #1: Type: text/plain, Size: 866 bytes --]
On Sat, Mar 11 2017, Jens Axboe wrote:
> On 03/09/2017 11:00 PM, NeilBrown wrote:
>> This is a rebase of the series I sent earlier, based on the
>> very latest from Linus, which included my first patch.
>>
>> The first fixes a problem that patch introduced, and so should go to
>> Linux promptly.
>> The others are more general improvements and can go in the normal
>> course of events.
>>
>> It is possible that the changes to btrfs and xfs can just be dropped
>> as a subsequent patch will be needed to revert them anyway. They are
>> there only to be able to say that "blk: make the bioset
>> rescue_workqueue optional." doesn't change any functionality at all.
>
> I have applied the first patch for this series, and added the Fixes
> tag. We'll let the rest simmer a bit, then aim for 4.12 for those.
>
Sounds good, thanks.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread