Linux-Block Archive on lore.kernel.org
From: Ritesh Harjani <riteshh@codeaurora.org>
To: Christoph Hellwig <hch@lst.de>, linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 01/33] block: add a lower-level bio_add_page interface
Date: Wed, 16 May 2018 10:36:14 +0530
Message-ID: <37c16316-aa3a-e3df-79d0-9fca37a5996f@codeaurora.org>
In-Reply-To: <20180509074830.16196-2-hch@lst.de>



On 5/9/2018 1:17 PM, Christoph Hellwig wrote:
> For the upcoming removal of buffer heads in XFS we need to keep track of
> the number of outstanding writeback requests per page.  For this we need
> to know if bio_add_page merged a region with the previous bvec or not.
> Instead of adding additional arguments this refactors bio_add_page to
> be implemented using three lower level helpers which users like XFS can
> use directly if they care about the merge decisions.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/bio.c         | 87 ++++++++++++++++++++++++++++++---------------
>   include/linux/bio.h |  9 +++++
>   2 files changed, 67 insertions(+), 29 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 53e0f0a1ed94..6ceba6adbf42 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -773,7 +773,7 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>   			return 0;
>   	}
>   
> -	if (bio->bi_vcnt >= bio->bi_max_vecs)
> +	if (bio_full(bio))
>   		return 0;
>   
>   	/*
> @@ -820,6 +820,59 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>   }
>   EXPORT_SYMBOL(bio_add_pc_page);
>   
> +/**
> + * __bio_try_merge_page - try adding data to an existing bvec
> + * @bio: destination bio
> + * @page: page to add
> + * @len: length of the range to add
> + * @off: offset into @page
> + *
> + * Try adding the data described at @page + @offset to the last bvec of @bio.
> + * Return %true on success or %false on failure.  This can happen frequently
> + * for file systems with a block size smaller than the page size.
> + */
> +bool __bio_try_merge_page(struct bio *bio, struct page *page,
> +		unsigned int len, unsigned int off)
> +{
> +	if (bio->bi_vcnt > 0) {
> +		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
> +
> +		if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) {
> +			bv->bv_len += len;
> +			bio->bi_iter.bi_size += len;
> +			return true;
> +		}
> +	}
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(__bio_try_merge_page);
> +
> +/**
> + * __bio_add_page - add page to a bio in a new segment
> + * @bio: destination bio
> + * @page: page to add
> + * @len: length of the range to add
> + * @off: offset into @page
> + *
> + * Add the data at @page + @offset to @bio as a new bvec.  The caller must
> + * ensure that @bio has space for another bvec.
> + */
> +void __bio_add_page(struct bio *bio, struct page *page,
> +		unsigned int len, unsigned int off)
> +{
> +	struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt];
> +
> +	WARN_ON_ONCE(bio_full(bio));

Please correct my understanding here; I am still new to this code.

1. If bio_full() is true, that means there is no space left in
bio->bi_io_vec[], right? Then how come we proceed with only a warning,
while the original bio_add_page() used to return after checking
bio_full()? Callers can still call __bio_add_page() directly, right?

2. Also, the bi_io_vec[] array is only allocated up to bio->bi_max_vecs
entries, right?
I could not follow the bvec_alloc() function very well, mainly the case
where nr_iovecs > inline_vecs. How and where is it guaranteed that we
get nr_iovecs entries allocated from the bvec pool?

Hmm, tricky. Please help me understand this:
1. We define slabs of different sizes in bvec_slabs[], and when an
allocation request for nr_iovecs comes in, we try to grab the
predefined (in terms of size) slab from bvec_slabs and return it.
If that slab allocation does not succeed, we fall back to
mempool_alloc().

2. If the above is correct, why don't we set bio->bi_max_vecs to the
size of the slab instead of keeping it at the nr_iovecs the user
requested (in bio_alloc_bioset())?


3. Could you please help me understand why __bio_add_page() is still
allowed to work on a cloned bio? Why not WARN and return, as the
original code did?

4. OK, I see that in patch 32 you first check bio_full() and then call
xfs_chain_bio(). But there, too, I think you make sure that the new
ioend->io_bio is the freshly chained bio, which is not full.

Apologies if the above doesn't make sense.

> +
> +	bv->bv_page = page;
> +	bv->bv_offset = off;
> +	bv->bv_len = len;
> +
> +	bio->bi_iter.bi_size += len;
> +	bio->bi_vcnt++;
> +}
> +EXPORT_SYMBOL_GPL(__bio_add_page);
> +
>   /**
>    *	bio_add_page	-	attempt to add page to bio
>    *	@bio: destination bio
> @@ -833,40 +886,16 @@ EXPORT_SYMBOL(bio_add_pc_page);
>   int bio_add_page(struct bio *bio, struct page *page,
>   		 unsigned int len, unsigned int offset)
>   {
> -	struct bio_vec *bv;
> -
>   	/*
>   	 * cloned bio must not modify vec list
>   	 */
>   	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
>   		return 0;
> -
> -	/*
> -	 * For filesystems with a blocksize smaller than the pagesize
> -	 * we will often be called with the same page as last time and
> -	 * a consecutive offset.  Optimize this special case.
> -	 */
> -	if (bio->bi_vcnt > 0) {
> -		bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
> -
> -		if (page == bv->bv_page &&
> -		    offset == bv->bv_offset + bv->bv_len) {
> -			bv->bv_len += len;
> -			goto done;
> -		}
> +	if (!__bio_try_merge_page(bio, page, len, offset)) {
> +		if (bio_full(bio))
> +			return 0;
> +		__bio_add_page(bio, page, len, offset);
>   	}
> -
> -	if (bio->bi_vcnt >= bio->bi_max_vecs)
> -		return 0;
Originally we returned here and did not proceed further.
Shouldn't __bio_add_page() have a similar check to safeguard against
crossing the bi_io_vec[] boundary?


> -
> -	bv		= &bio->bi_io_vec[bio->bi_vcnt];
> -	bv->bv_page	= page;
> -	bv->bv_len	= len;
> -	bv->bv_offset	= offset;
> -
> -	bio->bi_vcnt++;
> -done:
> -	bio->bi_iter.bi_size += len;
>   	return len;
>   }
>   EXPORT_SYMBOL(bio_add_page);
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index ce547a25e8ae..3e73c8bc25ea 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -123,6 +123,11 @@ static inline void *bio_data(struct bio *bio)
>   	return NULL;
>   }
>   
> +static inline bool bio_full(struct bio *bio)
> +{
> +	return bio->bi_vcnt >= bio->bi_max_vecs;
> +}
> +
>   /*
>    * will die
>    */
> @@ -470,6 +475,10 @@ void bio_chain(struct bio *, struct bio *);
>   extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
>   extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
>   			   unsigned int, unsigned int);
> +bool __bio_try_merge_page(struct bio *bio, struct page *page,
> +		unsigned int len, unsigned int off);
> +void __bio_add_page(struct bio *bio, struct page *page,
> +		unsigned int len, unsigned int off);
>   int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
>   struct rq_map_data;
>   extern struct bio *bio_map_user_iov(struct request_queue *,
> 

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project.
