From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf0-f198.google.com (mail-pf0-f198.google.com [209.85.192.198]) by kanga.kvack.org (Postfix) with ESMTP id 0EE0D6B02E1 for ; Wed, 16 May 2018 01:06:26 -0400 (EDT) Received: by mail-pf0-f198.google.com with SMTP id e20-v6so1813048pff.14 for ; Tue, 15 May 2018 22:06:26 -0700 (PDT) Received: from smtp.codeaurora.org (smtp.codeaurora.org. [198.145.29.96]) by mx.google.com with ESMTPS id s24-v6si1679730pfm.257.2018.05.15.22.06.20 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 15 May 2018 22:06:20 -0700 (PDT) Subject: Re: [PATCH 01/33] block: add a lower-level bio_add_page interface References: <20180509074830.16196-1-hch@lst.de> <20180509074830.16196-2-hch@lst.de> From: Ritesh Harjani Message-ID: <37c16316-aa3a-e3df-79d0-9fca37a5996f@codeaurora.org> Date: Wed, 16 May 2018 10:36:14 +0530 MIME-Version: 1.0 In-Reply-To: <20180509074830.16196-2-hch@lst.de> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Christoph Hellwig , linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org On 5/9/2018 1:17 PM, Christoph Hellwig wrote: > For the upcoming removal of buffer heads in XFS we need to keep track of > the number of outstanding writeback requests per page. For this we need > to know if bio_add_page merged a region with the previous bvec or not. > Instead of adding additional arguments this refactors bio_add_page to > be implemented using three lower level helpers which users like XFS can > use directly if they care about the merge decisions. > > Signed-off-by: Christoph Hellwig > --- > block/bio.c | 87 ++++++++++++++++++++++++++++++--------------- > include/linux/bio.h | 9 +++++ > 2 files changed, 67 insertions(+), 29 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index 53e0f0a1ed94..6ceba6adbf42 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -773,7 +773,7 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page > return 0; > } > > - if (bio->bi_vcnt >= bio->bi_max_vecs) > + if (bio_full(bio)) > return 0; > > /* > @@ -820,6 +820,59 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page > } > EXPORT_SYMBOL(bio_add_pc_page); > > +/** > + * __bio_try_merge_page - try adding data to an existing bvec > + * @bio: destination bio > + * @page: page to add > + * @len: length of the range to add > + * @off: offset into @page > + * > + * Try adding the data described at @page + @offset to the last bvec of @bio. > + * Return %true on success or %false on failure. This can happen frequently > + * for file systems with a block size smaller than the page size. > + */ > +bool __bio_try_merge_page(struct bio *bio, struct page *page, > + unsigned int len, unsigned int off) > +{ > + if (bio->bi_vcnt > 0) { > + struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > + > + if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) { > + bv->bv_len += len; > + bio->bi_iter.bi_size += len; > + return true; > + } > + } > + return false; > +} > +EXPORT_SYMBOL_GPL(__bio_try_merge_page); > + > +/** > + * __bio_add_page - add page to a bio in a new segment > + * @bio: destination bio > + * @page: page to add > + * @len: length of the range to add > + * @off: offset into @page > + * > + * Add the data at @page + @offset to @bio as a new bvec. The caller must > + * ensure that @bio has space for another bvec. > + */ > +void __bio_add_page(struct bio *bio, struct page *page, > + unsigned int len, unsigned int off) > +{ > + struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt]; > + > + WARN_ON_ONCE(bio_full(bio)); Please correct my understanding here. I am still new at understanding this. 1. if bio_full is true that means no space in bio->bio_io_vec[] no? Than how come we are still proceeding ahead with only warning? While originally in bio_add_page we used to return after checking bio_full. Callers can still call __bio_add_page directly right. 2. Also the bio_io_vec size allocated will only be upto bio->bi_max_vecs right? I could not follow up very well with the bvec_alloc function, mainly when nr_iovec > inline_vecs. So how and where it is getting sure that we are getting _nr_iovecs_ allocated from the bvec_pool? hmm.. tricky. Please help me understand this. 1. So we have defined different slabs of different sizes in bvec_slabs. and when the allocation request of nr_iovecs come we try to grab the predefined(in terms of size) slab of bvec_slabs and return. In case if that allocation does not succeed from slab, we go for mempool_alloc. 2. IF above is correct why don't we set the bio->bi_max_vecs to the size of the slab instead of keeeping it to nr_iovecs which user requested? (in bio_alloc_bioset) 3. Could you please help understand why for cloned bio we still allow __bio_add_page to work? why not WARN and return like in original code? 4. Ok, I see that in patch 32 you are first checking bio_full and calling for xfs_chain_bio. But there also I think you are making sure that new ioend->io_bio is the new chained bio which is not full. Apologies if above doesn't make any sense. > + > + bv->bv_page = page; > + bv->bv_offset = off; > + bv->bv_len = len; > + > + bio->bi_iter.bi_size += len; > + bio->bi_vcnt++; > +} > +EXPORT_SYMBOL_GPL(__bio_add_page); > + > /** > * bio_add_page - attempt to add page to bio > * @bio: destination bio > @@ -833,40 +886,16 @@ EXPORT_SYMBOL(bio_add_pc_page); > int bio_add_page(struct bio *bio, struct page *page, > unsigned int len, unsigned int offset) > { > - struct bio_vec *bv; > - > /* > * cloned bio must not modify vec list > */ > if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED))) > return 0; > - > - /* > - * For filesystems with a blocksize smaller than the pagesize > - * we will often be called with the same page as last time and > - * a consecutive offset. Optimize this special case. > - */ > - if (bio->bi_vcnt > 0) { > - bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > - > - if (page == bv->bv_page && > - offset == bv->bv_offset + bv->bv_len) { > - bv->bv_len += len; > - goto done; > - } > + if (!__bio_try_merge_page(bio, page, len, offset)) { > + if (bio_full(bio)) > + return 0; > + __bio_add_page(bio, page, len, offset); > } > - > - if (bio->bi_vcnt >= bio->bi_max_vecs) > - return 0; Originally here we were supposed to return and not proceed further. Should __bio_add_page not have similar checks to safeguard crossing the bio_io_vec[] boundary? > - > - bv = &bio->bi_io_vec[bio->bi_vcnt]; > - bv->bv_page = page; > - bv->bv_len = len; > - bv->bv_offset = offset; > - > - bio->bi_vcnt++; > -done: > - bio->bi_iter.bi_size += len; > return len; > } > EXPORT_SYMBOL(bio_add_page); > diff --git a/include/linux/bio.h b/include/linux/bio.h > index ce547a25e8ae..3e73c8bc25ea 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -123,6 +123,11 @@ static inline void *bio_data(struct bio *bio) > return NULL; > } > > +static inline bool bio_full(struct bio *bio) > +{ > + return bio->bi_vcnt >= bio->bi_max_vecs; > +} > + > /* > * will die > */ > @@ -470,6 +475,10 @@ void bio_chain(struct bio *, struct bio *); > extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int); > extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *, > unsigned int, unsigned int); > +bool __bio_try_merge_page(struct bio *bio, struct page *page, > + unsigned int len, unsigned int off); > +void __bio_add_page(struct bio *bio, struct page *page, > + unsigned int len, unsigned int off); > int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter); > struct rq_map_data; > extern struct bio *bio_map_user_iov(struct request_queue *, > -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.