From: Dave Chinner <david@fromorbit.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 02/15] fs: Introduce i_blocks_per_page
Date: Wed, 25 Sep 2019 18:36:50 +1000
Message-ID: <20190925083650.GE804@dread.disaster.area>
In-Reply-To: <20190925005214.27240-3-willy@infradead.org>

On Tue, Sep 24, 2019 at 05:52:01PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> This helper is useful for both large pages in the page cache and for
> supporting block size larger than page size.  Convert some example
> users (we have a few different ways of writing this idiom).
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

I'm actually working on abstracting this code away from both block
size and page size via the helpers below. We have a need to support
block size > page size, and that requires touching much of the same
code as this patchset. I'm currently trying to combine your last
patch set with mine so I can easily test allocating 64k page cache
pages on a 64k block size filesystem on a 4k page size machine with
XFS....

/*
 * Return the chunk size we should use for page cache based operations.
 * This supports both large block sizes and variable page sizes based on the
 * restriction that order-n blocks and page cache pages are order-n file offset
 * aligned.
 *
 * This will return the inode block size for block size < page_size(page),
 * otherwise it will return page_size(page).
 */
static inline unsigned
iomap_chunk_size(struct inode *inode, struct page *page)
{
        return min_t(unsigned, page_size(page), i_blocksize(inode));
}

/* log2 of the chunk size for this inode/page pair. */
static inline unsigned
iomap_chunk_bits(struct inode *inode, struct page *page)
{
        return min_t(unsigned, page_shift(page), inode->i_blkbits);
}

/*
 * Shift by the chunk bits, not i_blkbits, so that block size >
 * page size yields one chunk per page rather than zero.
 */
static inline unsigned
iomap_chunks_per_page(struct inode *inode, struct page *page)
{
        return page_size(page) >> iomap_chunk_bits(inode, page);
}
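
For example, range calculations that currently divide an in-page
offset by i_blocksize() would divide by the chunk size instead
(illustrative only, not code from either patchset):

	unsigned chunk_size = iomap_chunk_size(inode, page);
	unsigned first = offset_in_page(pos) / chunk_size;
	unsigned last = offset_in_page(pos + length - 1) / chunk_size;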

Basically, the process is to convert the iomap code over to
iterating "chunks" rather than blocks or pages, and then allocate
a struct iomap_page according to the difference between page and
block size....
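
To make that concrete, here's a rough sketch of what a chunk-based
iomap_page_create() might end up looking like (the flexible uptodate
bitmap here is an assumption about where the combined series lands,
not code from either patchset):

static struct iomap_page *
iomap_page_create(struct inode *inode, struct page *page)
{
	struct iomap_page *iop = to_iomap_page(page);
	unsigned int nr_chunks = iomap_chunks_per_page(inode, page);

	/* No per-chunk state to track if the page maps a single chunk. */
	if (iop || nr_chunks <= 1)
		return iop;

	/* Size the uptodate bitmap by chunks, not PAGE_SIZE blocks. */
	iop = kzalloc(sizeof(*iop) +
			BITS_TO_LONGS(nr_chunks) * sizeof(unsigned long),
			GFP_NOFS | __GFP_NOFAIL);
	atomic_set(&iop->read_count, 0);
	atomic_set(&iop->write_count, 0);
	get_page(page);
	set_page_private(page, (unsigned long)iop);
	SetPagePrivate(page);
	return iop;
}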

> ---
>  fs/iomap/buffered-io.c  |  4 ++--
>  fs/jfs/jfs_metapage.c   |  2 +-
>  fs/xfs/xfs_aops.c       |  8 ++++----
>  include/linux/pagemap.h | 13 +++++++++++++
>  4 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index e25901ae3ff4..0e76a4b6d98a 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -24,7 +24,7 @@ iomap_page_create(struct inode *inode, struct page *page)
>  {
>  	struct iomap_page *iop = to_iomap_page(page);
>  
> -	if (iop || i_blocksize(inode) == PAGE_SIZE)
> +	if (iop || i_blocks_per_page(inode, page) <= 1)
>  		return iop;

That also means checks like these become:

	if (iop || iomap_chunks_per_page(inode, page) <= 1)

as a single file can now have multiple pages per block, a page per
block, or multiple blocks per page as the page size varies...
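
For concreteness, the helpers above give:

	page size	block size	chunk size	chunks/page
	4k		64k		4k		1
	64k		64k		64k		1
	64k		4k		4k		16

(In the first case it takes 16 pages to span the block, so the
per-chunk state degenerates to per-page state.)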

I'd like to only have to make one pass over this code to abstract
out page and block sizes, so I'm guessing we'll need to do some
co-ordination here....

> @@ -636,4 +636,17 @@ static inline unsigned long dir_pages(struct inode *inode)
>  			       PAGE_SHIFT;
>  }
>  
> +/**
> + * i_blocks_per_page - How many blocks fit in this page.
> + * @inode: The inode which contains the blocks.
> + * @page: The (potentially large) page.
> + *
> + * Context: Any context.
> + * Return: The number of filesystem blocks covered by this page.
> + */
> +static inline
> +unsigned int i_blocks_per_page(struct inode *inode, struct page *page)
> +{
> +	return page_size(page) >> inode->i_blkbits;
> +}
>  #endif /* _LINUX_PAGEMAP_H */

It also means that we largely don't need to touch mm headers as
all the helpers end up being iomap specific and private...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
