From: Ming Lei <ming.lei@redhat.com>
To: NeilBrown <neilb@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Ming Lei <tom.leiming@gmail.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 05/13] block: Improvements to bounce-buffer handling
Date: Tue, 2 May 2017 19:56:58 +0800
Message-ID: <20170502115657.GE1803@ming.t460p>
In-Reply-To: <149369654475.5146.215368176129979588.stgit@noble>

On Tue, May 02, 2017 at 01:42:24PM +1000, NeilBrown wrote:
> Since commit 23688bf4f830 ("block: ensure to split after potentially
> bouncing a bio") blk_queue_bounce() is called *before*
> blk_queue_split().
> This means that:
> 1/ the comments in blk_queue_split() about bounce buffers are
> irrelevant, and
> 2/ a very large bio (more than BIO_MAX_PAGES) will no longer be
> split before it arrives at blk_queue_bounce(), leading to the
> possibility that bio_clone_bioset() will fail and a NULL
> will be dereferenced.
>
> Separately, blk_queue_bounce() shouldn't use fs_bio_set as the bio
> being copied could be from the same set, and this could lead to a
> deadlock.
>
> So:
> - allocate 2 private biosets for blk_queue_bounce, one for
> splitting enormous bios and one for cloning bios.
> - add code to split a bio that exceeds BIO_MAX_PAGES.
> - Fix up the comments in blk_queue_split()
>
> Credit-to: Ming Lei <tom.leiming@gmail.com> (suggested using single bio_for_each_segment loop)
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
> block/blk-merge.c | 14 ++++----------
> block/bounce.c | 32 ++++++++++++++++++++++++++------
> 2 files changed, 30 insertions(+), 16 deletions(-)
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index d59074556703..51c84540d3bb 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -117,17 +117,11 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  		 * each holds at most BIO_MAX_PAGES bvecs because
>  		 * bio_clone() can fail to allocate big bvecs.
>  		 *
> -		 * It should have been better to apply the limit per
> -		 * request queue in which bio_clone() is involved,
> -		 * instead of globally. The biggest blocker is the
> -		 * bio_clone() in bio bounce.
> +		 * Those drivers which will need to use bio_clone()
> +		 * should tell us in some way.  For now, impose the
> +		 * BIO_MAX_PAGES limit on all queues.
>  		 *
> -		 * If bio is splitted by this reason, we should have
> -		 * allowed to continue bios merging, but don't do
> -		 * that now for making the change simple.
> -		 *
> -		 * TODO: deal with bio bounce's bio_clone() gracefully
> -		 * and convert the global limit into per-queue limit.
> +		 * TODO: handle users of bio_clone() differently.
>  		 */
>  		if (bvecs++ >= BIO_MAX_PAGES)
>  			goto split;
> diff --git a/block/bounce.c b/block/bounce.c
> index 1cb5dd3a5da1..087ecc2dc66c 100644
> --- a/block/bounce.c
> +++ b/block/bounce.c
> @@ -26,6 +26,7 @@
>  #define POOL_SIZE 64
>  #define ISA_POOL_SIZE 16
>
> +struct bio_set *bounce_bio_set, *bounce_bio_split;
>  static mempool_t *page_pool, *isa_page_pool;
>
>  #if defined(CONFIG_HIGHMEM) || defined(CONFIG_NEED_BOUNCE_POOL)
> @@ -40,6 +41,14 @@ static __init int init_emergency_pool(void)
>  	BUG_ON(!page_pool);
>  	pr_info("pool size: %d pages\n", POOL_SIZE);
>
> +	bounce_bio_set = bioset_create(BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS);
> +	BUG_ON(!bounce_bio_set);
> +	if (bioset_integrity_create(bounce_bio_set, BIO_POOL_SIZE))
> +		BUG_ON(1);
> +
> +	bounce_bio_split = bioset_create(BIO_POOL_SIZE, 0, 0);
> +	BUG_ON(!bounce_bio_split);
> +
>  	return 0;
>  }
>
> @@ -186,15 +195,26 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
>  	int rw = bio_data_dir(*bio_orig);
>  	struct bio_vec *to, from;
>  	struct bvec_iter iter;
> -	unsigned i;
> +	unsigned i = 0;
> +	bool bounce = false;
> +	int sectors = 0;
>
> -	bio_for_each_segment(from, *bio_orig, iter)
> +	bio_for_each_segment(from, *bio_orig, iter) {
> +		if (i++ < BIO_MAX_PAGES)
> +			sectors += from.bv_len >> 9;
>  		if (page_to_pfn(from.bv_page) > queue_bounce_pfn(q))
> -			goto bounce;
> +			bounce = true;
> +	}
> +	if (!bounce)
> +		return;
>
> -	return;
> -bounce:
> -	bio = bio_clone_bioset(*bio_orig, GFP_NOIO, fs_bio_set);
> +	if (sectors < bio_sectors(*bio_orig)) {
> +		bio = bio_split(*bio_orig, sectors, GFP_NOIO, bounce_bio_split);
> +		bio_chain(bio, *bio_orig);
> +		generic_make_request(*bio_orig);
> +		*bio_orig = bio;
> +	}
> +	bio = bio_clone_bioset(*bio_orig, GFP_NOIO, bounce_bio_set);
>
>  	bio_for_each_segment_all(to, bio, i) {
>  		struct page *page = to->bv_page;

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming
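
For readers following the thread, the split-point calculation in the
__blk_queue_bounce() hunk can be modeled in userspace. This is only a
minimal sketch, not kernel code: struct seg, MAX_SEGS,
count_head_sectors(), total_sectors() and needs_split() are hypothetical
names introduced for illustration. MAX_SEGS stands in for BIO_MAX_PAGES
(256 is assumed), and segment lengths are assumed to be multiples of
512 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BIO_MAX_PAGES; the value 256 is assumed. */
#define MAX_SEGS 256

/* Hypothetical, simplified segment: only the byte length matters here. */
struct seg {
	unsigned int len;	/* bytes, assumed to be a multiple of 512 */
};

/*
 * Sum the sectors covered by the first MAX_SEGS segments, mirroring the
 * "if (i++ < BIO_MAX_PAGES) sectors += from.bv_len >> 9;" step.
 */
unsigned int count_head_sectors(const struct seg *segs, size_t nsegs)
{
	unsigned int sectors = 0;
	size_t i;

	assert(segs != NULL || nsegs == 0);
	for (i = 0; i < nsegs && i < MAX_SEGS; i++)
		sectors += segs[i].len >> 9;
	return sectors;
}

/* Sum the sectors of every segment, the analogue of bio_sectors(). */
unsigned int total_sectors(const struct seg *segs, size_t nsegs)
{
	unsigned int sectors = 0;
	size_t i;

	assert(segs != NULL || nsegs == 0);
	for (i = 0; i < nsegs; i++)
		sectors += segs[i].len >> 9;
	return sectors;
}

/* A split is needed only when the head covers less than the whole. */
bool needs_split(const struct seg *segs, size_t nsegs)
{
	return count_head_sectors(segs, nsegs) < total_sectors(segs, nsegs);
}
```

With 300 segments of 4096 bytes each, the head covers 256 * 8 = 2048
sectors out of 2400, so needs_split() returns true. In the patch, that is
the case where the first 2048 sectors are split off with bio_split() and
the remainder is resubmitted through generic_make_request().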