Subject: Re: [PATCH 01/33] block: add a lower-level bio_add_page interface
From: Ritesh Harjani
To: Christoph Hellwig <hch@lst.de>, linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org
Date: Wed, 16 May 2018 10:36:14 +0530
Message-ID: <37c16316-aa3a-e3df-79d0-9fca37a5996f@codeaurora.org>
In-Reply-To: <20180509074830.16196-2-hch@lst.de>
References: <20180509074830.16196-1-hch@lst.de> <20180509074830.16196-2-hch@lst.de>
List-Id: linux-block@vger.kernel.org

On 5/9/2018 1:17 PM, Christoph Hellwig wrote:
> For the upcoming removal of buffer heads in XFS we need to keep track of
> the number of outstanding writeback requests per page. For this we need
> to know if bio_add_page merged a region with the previous bvec or not.
> Instead of adding additional arguments this refactors bio_add_page to
> be implemented using three lower level helpers which users like XFS can
> use directly if they care about the merge decisions.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/bio.c         | 87 ++++++++++++++++++++++++++++++---------------
>  include/linux/bio.h |  9 +++++
>  2 files changed, 67 insertions(+), 29 deletions(-)
>
> diff --git a/block/bio.c b/block/bio.c
> index 53e0f0a1ed94..6ceba6adbf42 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -773,7 +773,7 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>  		return 0;
>  	}
>
> -	if (bio->bi_vcnt >= bio->bi_max_vecs)
> +	if (bio_full(bio))
>  		return 0;
>
>  	/*
> @@ -820,6 +820,59 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>  }
>  EXPORT_SYMBOL(bio_add_pc_page);
>
> +/**
> + * __bio_try_merge_page - try adding data to an existing bvec
> + * @bio: destination bio
> + * @page: page to add
> + * @len: length of the range to add
> + * @off: offset into @page
> + *
> + * Try adding the data described at @page + @offset to the last bvec of @bio.
> + * Return %true on success or %false on failure. This can happen frequently
> + * for file systems with a block size smaller than the page size.
> + */
> +bool __bio_try_merge_page(struct bio *bio, struct page *page,
> +		unsigned int len, unsigned int off)
> +{
> +	if (bio->bi_vcnt > 0) {
> +		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
> +
> +		if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) {
> +			bv->bv_len += len;
> +			bio->bi_iter.bi_size += len;
> +			return true;
> +		}
> +	}
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(__bio_try_merge_page);
> +
> +/**
> + * __bio_add_page - add page to a bio in a new segment
> + * @bio: destination bio
> + * @page: page to add
> + * @len: length of the range to add
> + * @off: offset into @page
> + *
> + * Add the data at @page + @offset to @bio as a new bvec. The caller must
> + * ensure that @bio has space for another bvec.
> + */
> +void __bio_add_page(struct bio *bio, struct page *page,
> +		unsigned int len, unsigned int off)
> +{
> +	struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt];
> +
> +	WARN_ON_ONCE(bio_full(bio));

Please correct my understanding here; I am still new to this code.

1. If bio_full() is true, doesn't that mean there is no space left in
bio->bi_io_vec[]? Then how come we still proceed ahead with only a
warning, while originally bio_add_page used to return after checking
bio_full? Callers can still call __bio_add_page directly, right?
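My mental model of the intended calling convention is roughly the
following (just a sketch on my side; submit_and_realloc() is a made-up
placeholder for whatever a caller does once the bio is full, e.g. the
bio chaining you do in patch 32):

	/* sketch: caller is responsible for the bio_full() check */
	if (!__bio_try_merge_page(bio, page, len, off)) {
		if (bio_full(bio))
			bio = submit_and_realloc(bio);	/* hypothetical */
		__bio_add_page(bio, page, len, off);
	}

Is that the idea, i.e. checking bio_full() is now entirely the caller's
job?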
2. Also, the bi_io_vec[] allocated will only be up to bio->bi_max_vecs
entries, right? I could not follow the bvec_alloc function very well,
mainly when nr_iovecs > inline_vecs: how and where is it made sure that
we get _nr_iovecs_ allocated from the bvec pool? hmm.. tricky. Please
help me understand this. My current understanding is:

   a. We have defined slabs of different sizes in bvec_slabs, and when
      an allocation request for nr_iovecs comes in, we grab the
      predefined (in terms of size) slab from bvec_slabs and return it.
      If that slab allocation does not succeed, we go for mempool_alloc.

   b. If the above is correct, why don't we set bio->bi_max_vecs to the
      size of the slab instead of keeping it at the nr_iovecs the user
      requested (in bio_alloc_bioset)?

3. Could you please help me understand why we still allow
__bio_add_page to work on a cloned bio? Why not WARN and return like
the original code does?

4. Ok, I see that in patch 32 you first check bio_full and then call
xfs_chain_bio. But there also, I think, you make sure that the new
ioend->io_bio is the newly chained bio, which is not full.

Apologies if the above doesn't make any sense.

> +
> +	bv->bv_page = page;
> +	bv->bv_offset = off;
> +	bv->bv_len = len;
> +
> +	bio->bi_iter.bi_size += len;
> +	bio->bi_vcnt++;
> +}
> +EXPORT_SYMBOL_GPL(__bio_add_page);
> +
>  /**
>   * bio_add_page - attempt to add page to bio
>   * @bio: destination bio
> @@ -833,40 +886,16 @@ EXPORT_SYMBOL(bio_add_pc_page);
>  int bio_add_page(struct bio *bio, struct page *page,
>  		 unsigned int len, unsigned int offset)
>  {
> -	struct bio_vec *bv;
> -
>  	/*
>  	 * cloned bio must not modify vec list
>  	 */
>  	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
>  		return 0;
> -
> -	/*
> -	 * For filesystems with a blocksize smaller than the pagesize
> -	 * we will often be called with the same page as last time and
> -	 * a consecutive offset. Optimize this special case.
> -	 */
> -	if (bio->bi_vcnt > 0) {
> -		bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
> -
> -		if (page == bv->bv_page &&
> -		    offset == bv->bv_offset + bv->bv_len) {
> -			bv->bv_len += len;
> -			goto done;
> -		}
> +	if (!__bio_try_merge_page(bio, page, len, offset)) {
> +		if (bio_full(bio))
> +			return 0;
> +		__bio_add_page(bio, page, len, offset);
>  	}
> -
> -	if (bio->bi_vcnt >= bio->bi_max_vecs)
> -		return 0;

Originally here we were supposed to return and not proceed further.
Should __bio_add_page not have a similar check to safeguard against
crossing the bi_io_vec[] boundary?
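In other words, I would have expected __bio_add_page to keep a check
along these lines (again only a sketch of what I mean, built from the
hunk above; not tested):

	void __bio_add_page(struct bio *bio, struct page *page,
			unsigned int len, unsigned int off)
	{
		struct bio_vec *bv;

		/* bail out instead of writing past the end of bi_io_vec[] */
		if (WARN_ON_ONCE(bio_full(bio)))
			return;

		bv = &bio->bi_io_vec[bio->bi_vcnt];
		bv->bv_page = page;
		bv->bv_offset = off;
		bv->bv_len = len;

		bio->bi_iter.bi_size += len;
		bio->bi_vcnt++;
	}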
> -
> -	bv		= &bio->bi_io_vec[bio->bi_vcnt];
> -	bv->bv_page	= page;
> -	bv->bv_len	= len;
> -	bv->bv_offset	= offset;
> -
> -	bio->bi_vcnt++;
> -done:
> -	bio->bi_iter.bi_size += len;
>  	return len;
>  }
>  EXPORT_SYMBOL(bio_add_page);
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index ce547a25e8ae..3e73c8bc25ea 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -123,6 +123,11 @@ static inline void *bio_data(struct bio *bio)
>  	return NULL;
>  }
>
> +static inline bool bio_full(struct bio *bio)
> +{
> +	return bio->bi_vcnt >= bio->bi_max_vecs;
> +}
> +
>  /*
>   * will die
>   */
> @@ -470,6 +475,10 @@ void bio_chain(struct bio *, struct bio *);
>  extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
>  extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
>  			   unsigned int, unsigned int);
> +bool __bio_try_merge_page(struct bio *bio, struct page *page,
> +		unsigned int len, unsigned int off);
> +void __bio_add_page(struct bio *bio, struct page *page,
> +		unsigned int len, unsigned int off);
>  int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
>  struct rq_map_data;
>  extern struct bio *bio_map_user_iov(struct request_queue *,

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.