Date: Tue, 2 May 2017 19:56:58 +0800
From: Ming Lei
To: NeilBrown
Cc: Jens Axboe, linux-block@vger.kernel.org, Ming Lei, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 05/13] block: Improvements to bounce-buffer handling
Message-ID: <20170502115657.GE1803@ming.t460p>
References: <149369628671.5146.4865312503373040039.stgit@noble>
 <149369654475.5146.215368176129979588.stgit@noble>
In-Reply-To: <149369654475.5146.215368176129979588.stgit@noble>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 02, 2017 at 01:42:24PM +1000, NeilBrown wrote:
> Since commit 23688bf4f830 ("block: ensure to split after potentially
> bouncing a bio") blk_queue_bounce() is called *before*
> blk_queue_split().
> This means that:
>  1/ the comments blk_queue_split() about bounce buffers are
>     irrelevant, and
>  2/ a very large bio (more than BIO_MAX_PAGES) will no longer be
>     split before it arrives at blk_queue_bounce(), leading to the
>     possibility that bio_clone_bioset() will fail and a NULL
>     will be dereferenced.
>
> Separately, blk_queue_bounce() shouldn't use fs_bio_set as the bio
> being copied could be from the same set, and this could lead to a
> deadlock.
>
> So:
>  - allocate 2 private biosets for blk_queue_bounce, one for
>    splitting enormous bios and one for cloning bios.
>  - add code to split a bio that exceeds BIO_MAX_PAGES.
>  - Fix up the comments in blk_queue_split()
>
> Credit-to: Ming Lei (suggested using single bio_for_each_segment loop)
> Signed-off-by: NeilBrown
> ---
>  block/blk-merge.c | 14 ++++----------
>  block/bounce.c    | 32 ++++++++++++++++++++++++++------
>  2 files changed, 30 insertions(+), 16 deletions(-)
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index d59074556703..51c84540d3bb 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -117,17 +117,11 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  		 * each holds at most BIO_MAX_PAGES bvecs because
>  		 * bio_clone() can fail to allocate big bvecs.
>  		 *
> -		 * It should have been better to apply the limit per
> -		 * request queue in which bio_clone() is involved,
> -		 * instead of globally. The biggest blocker is the
> -		 * bio_clone() in bio bounce.
> +		 * Those drivers which will need to use bio_clone()
> +		 * should tell us in some way.  For now, impose the
> +		 * BIO_MAX_PAGES limit on all queues.
>  		 *
> -		 * If bio is splitted by this reason, we should have
> -		 * allowed to continue bios merging, but don't do
> -		 * that now for making the change simple.
> -		 *
> -		 * TODO: deal with bio bounce's bio_clone() gracefully
> -		 * and convert the global limit into per-queue limit.
> +		 * TODO: handle users of bio_clone() differently.
>  		 */
>  		if (bvecs++ >= BIO_MAX_PAGES)
>  			goto split;
> diff --git a/block/bounce.c b/block/bounce.c
> index 1cb5dd3a5da1..087ecc2dc66c 100644
> --- a/block/bounce.c
> +++ b/block/bounce.c
> @@ -26,6 +26,7 @@
>  #define POOL_SIZE	64
>  #define ISA_POOL_SIZE	16
>
> +struct bio_set *bounce_bio_set, *bounce_bio_split;
>  static mempool_t *page_pool, *isa_page_pool;
>
>  #if defined(CONFIG_HIGHMEM) || defined(CONFIG_NEED_BOUNCE_POOL)
> @@ -40,6 +41,14 @@ static __init int init_emergency_pool(void)
>  	BUG_ON(!page_pool);
>  	pr_info("pool size: %d pages\n", POOL_SIZE);
>
> +	bounce_bio_set = bioset_create(BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS);
> +	BUG_ON(!bounce_bio_set);
> +	if (bioset_integrity_create(bounce_bio_set, BIO_POOL_SIZE))
> +		BUG_ON(1);
> +
> +	bounce_bio_split = bioset_create(BIO_POOL_SIZE, 0, 0);
> +	BUG_ON(!bounce_bio_split);
> +
>  	return 0;
>  }
>
> @@ -186,15 +195,26 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
>  	int rw = bio_data_dir(*bio_orig);
>  	struct bio_vec *to, from;
>  	struct bvec_iter iter;
> -	unsigned i;
> +	unsigned i = 0;
> +	bool bounce = false;
> +	int sectors = 0;
>
> -	bio_for_each_segment(from, *bio_orig, iter)
> +	bio_for_each_segment(from, *bio_orig, iter) {
> +		if (i++ < BIO_MAX_PAGES)
> +			sectors += from.bv_len >> 9;
>  		if (page_to_pfn(from.bv_page) > queue_bounce_pfn(q))
> -			goto bounce;
> +			bounce = true;
> +	}
> +	if (!bounce)
> +		return;
>
> -	return;
> -bounce:
> -	bio = bio_clone_bioset(*bio_orig, GFP_NOIO, fs_bio_set);
> +	if (sectors < bio_sectors(*bio_orig)) {
> +		bio = bio_split(*bio_orig, sectors, GFP_NOIO, bounce_bio_split);
> +		bio_chain(bio, *bio_orig);
> +		generic_make_request(*bio_orig);
> +		*bio_orig = bio;
> +	}
> +	bio = bio_clone_bioset(*bio_orig, GFP_NOIO, bounce_bio_set);
>
>  	bio_for_each_segment_all(to, bio, i) {
>  		struct page *page = to->bv_page;

Reviewed-by: Ming Lei

Thanks,
Ming
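
For anyone following the new loop in __blk_queue_bounce(), the split-point
arithmetic can be modeled in user space. This is only a rough sketch, not
kernel code: struct split_plan and plan_bounce_split() are hypothetical names,
segments are reduced to plain byte lengths, each length is assumed to be a
multiple of 512, and the segment limit is passed in as a parameter standing in
for BIO_MAX_PAGES.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: where to cut an oversized bio. */
struct split_plan {
	unsigned int head_sectors;   /* sectors in the first max_segs segments */
	unsigned int total_sectors;  /* sectors in the whole bio */
	bool need_split;             /* true when segments remain past the limit */
};

/*
 * Mirror the counting done in the patched loop: only the first
 * max_segs segments contribute to the head, and a split is needed
 * when the head covers fewer sectors than the whole bio.
 */
static struct split_plan plan_bounce_split(const unsigned int *seg_len,
					   size_t nsegs, size_t max_segs)
{
	struct split_plan p = { 0, 0, false };
	size_t i;

	assert(seg_len != NULL || nsegs == 0);

	for (i = 0; i < nsegs; i++) {
		unsigned int s = seg_len[i] >> 9;  /* bytes to 512-byte sectors */

		if (i < max_segs)
			p.head_sectors += s;
		p.total_sectors += s;
	}
	p.need_split = p.head_sectors < p.total_sectors;
	return p;
}
```

With three 4096-byte segments and a limit of two, the head holds 16 of the 24
sectors, so the model reports that a split is needed. With a limit at or above
the segment count, the head equals the total and no split happens.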