Linux-Fsdevel Archive on lore.kernel.org
From: Matthew Wilcox <willy@infradead.org>
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
	ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org
Subject: [PATCH v8 00/25] Change readahead API
Date: Tue, 25 Feb 2020 13:48:13 -0800
Message-ID: <20200225214838.30017-1-willy@infradead.org>

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

This series adds a readahead address_space operation to replace the
readpages operation.  The key difference is that pages are added to the
page cache as they are allocated (and then looked up by the filesystem)
instead of passing them on a list to the readpages operation and having
the filesystem add them to the page cache.  It's a net reduction in
code for each implementation, more efficient than walking a list, and
solves the direct-write vs buffered-read problem reported by yu kuai at
https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3@huawei.com/
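To illustrate the ordering described above, here is a toy userspace model,
not kernel code.  Every name in it (toy_page, toy_cache, toy_rac,
toy_core_prepare, toy_readahead_page, toy_fs_readahead) is hypothetical and
only stands in for the idea: the core puts each page into the cache as it
is allocated, and the filesystem then looks the pages up through a cursor
instead of receiving them on a list.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SLOTS 16

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct toy_page {
	unsigned long index;	/* position in the file, in pages */
	int uptodate;		/* set once the "filesystem" has filled it */
};

struct toy_cache {
	struct toy_page *slots[TOY_CACHE_SLOTS];	/* indexed by page index */
};

struct toy_rac {
	struct toy_cache *cache;
	unsigned long index;	/* next page to hand to the filesystem */
	unsigned int nr_pages;	/* pages not yet handed out */
};

/* Core side: each page enters the cache as soon as it is "allocated". */
static void toy_core_prepare(struct toy_cache *cache, struct toy_page *pages,
			     unsigned long start, unsigned int nr,
			     struct toy_rac *rac)
{
	assert(start + nr <= TOY_CACHE_SLOTS);
	for (unsigned int i = 0; i < nr; i++) {
		pages[i].index = start + i;
		pages[i].uptodate = 0;
		cache->slots[start + i] = &pages[i];
	}
	rac->cache = cache;
	rac->index = start;
	rac->nr_pages = nr;
}

/* Filesystem side: look up the next page in the cache, or NULL when done. */
static struct toy_page *toy_readahead_page(struct toy_rac *rac)
{
	struct toy_page *page;

	if (rac->nr_pages == 0)
		return NULL;
	page = rac->cache->slots[rac->index];
	rac->index++;
	rac->nr_pages--;
	return page;
}

/* A minimal "->readahead": no list walking, no cache insertion. */
static unsigned int toy_fs_readahead(struct toy_rac *rac)
{
	struct toy_page *page;
	unsigned int filled = 0;

	while ((page = toy_readahead_page(rac)) != NULL) {
		page->uptodate = 1;	/* pretend the I/O completed */
		filled++;
	}
	return filled;
}
```

In this model the filesystem never touches the cache's insertion path; it
only consumes pages that are already visible in the cache.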

The only unconverted filesystems are those which use fscache.
Their conversion is pending Dave Howells' rewrite which will make the
conversion substantially easier.

I want to thank the reviewers/testers; Dave Chinner, John Hubbard,
Eric Biggers, Johannes Thumshirn, Dave Sterba, Zi Yan and Christoph
Hellwig have done a marvellous job of providing constructive criticism.
I've tried to take it all on board, but I may have missed something
simply because you've done such a thorough job.

This series can also be found at
http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/tags/readahead_v8

v8:
 - btrfs, ext4 and xfs all survive an xfstests run (thanks to Kent Overstreet
   for providing the ktest framework)
 - iomap restructuring dropped due to Christoph's opposition and the
   redesign of readahead_page() meaning it wasn't needed any more.
 - f2fs_mpage_readpages() made static again
 - Made iomap_readahead() comment more useful
 - Added kernel-doc for the entire readahead_control API
 - Conditionally zero batch_count in readahead_page() (requested by John)
 - Hold RCU read lock while iterating over the xarray in readahead_page_batch()
 - Iterate over the correct pages in readahead_page_batch()
 - Correct the return type of readahead_index() (spotted by Zi Yan)
 - Added a 'skip_page' parameter to read_pages for better documentation
   purposes and so we can reuse the readahead_control higher in the call
   chain in future.
 - Removed the use_list bool (requested by Christoph)
 - Removed the explicit initialisation of _nr_pages to 0 (requested by
   Christoph & John)
 - Add comments explaining why nr_to_read is being capped (requested by John)
 - Reshuffled some of the patches:
   - Split out adding the readahead_control API from the three patches which
     added it piecemeal
   - Shift the final two mm patches to be with the other mm patches
   - Split the f2fs "pass the inode" patch from the "convert to readahead"
     patch, like ext4

v7:
 - Now passes an xfstests run on ext4!
 - Documentation improvements
 - Move the readahead prototypes out of mm.h (new patch)
 - readahead_for_each* iterators are gone; replaced with readahead_page()
   and readahead_page_batch()
 - page_cache_readahead_limit() renamed to page_cache_readahead_unbounded()
   and arguments changed
 - iomap_readahead_actor() restructured differently
 - The readahead code no longer uses the word 'offset' to reduce ambiguity
 - read_pages() now maintains the rac so we can just call it and continue
   instead of mucking around with branches
 - More assertions
 - More readahead functions return void

v6:
 - Name the private members of readahead_control with a leading underscore
   (suggested by Christoph Hellwig)
 - Fix whitespace in rst file
 - Remove misleading comment in btrfs patch
 - Add readahead_next() API and use it in iomap
 - Add iomap_readahead kerneldoc.
 - Fix the mpage_readahead kerneldoc
 - Make various readahead functions return void
 - Keep readahead_index() and readahead_offset() pointing to the start of
   this batch through the body.  No current user requires this, but it's
   less surprising.
 - Add kerneldoc for page_cache_readahead_limit
 - Make page_idx an unsigned long, and rename it to just 'i'
 - Get rid of page_offset local variable
 - Add patch to call memalloc_nofs_save() before allocating pages (suggested
   by Michal Hocko)
 - Resplit a lot of patches for more logical progression and easier review
   (suggested by John Hubbard)
 - Added sign-offs where received, and I deemed still relevant

v5 switched to passing a readahead_control struct (mirroring the
writeback_control struct passed to writepages).  This has a number of
advantages:
 - It fixes a number of bugs in various implementations, e.g. forgetting to
   increment 'start', an off-by-one error in 'nr_pages' or treating 'start'
   as a byte offset instead of a page offset.
 - It allows us to change the arguments without changing all the
   implementations of ->readahead which just call mpage_readahead() or
   iomap_readahead()
 - Figuring out which pages haven't been attempted by the implementation
   is more natural this way.
 - There's less code in each implementation.
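
A small userspace sketch of why a control struct helps with the bookkeeping
listed above.  The names (toy_ctl, toy_next, toy_pos, toy_impl) and the
4 KiB page size are assumptions made for illustration, not the kernel API.
The cursor advances inside the accessor, so an implementation cannot forget
to increment it; the byte position is derived in one place; and whatever
was not attempted is simply what remains in the struct.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12	/* assumes 4 KiB pages, for illustration only */

/* Hypothetical control struct: a page-granular cursor plus a count. */
struct toy_ctl {
	unsigned long index;	/* page index of the next page to hand out */
	unsigned int nr_pages;	/* pages not yet handed out */
};

/* Hand out the next page index; returns 0 once the request is exhausted. */
static int toy_next(struct toy_ctl *ctl, unsigned long *index)
{
	if (ctl->nr_pages == 0)
		return 0;
	*index = ctl->index++;
	ctl->nr_pages--;
	return 1;
}

/* The only place a page index is converted to a byte position. */
static unsigned long long toy_pos(unsigned long index)
{
	return (unsigned long long)index << TOY_PAGE_SHIFT;
}

/* An implementation that stops early after 'budget' pages. */
static unsigned int toy_impl(struct toy_ctl *ctl, unsigned int budget)
{
	unsigned long index;
	unsigned int done = 0;

	/* The budget is checked first so no page is consumed past it. */
	while (done < budget && toy_next(ctl, &index))
		done++;
	return done;
}
```

After toy_impl() returns early, the caller can read the unattempted range
straight out of the struct: it starts at ctl.index and is ctl.nr_pages long.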


Matthew Wilcox (Oracle) (25):
  mm: Move readahead prototypes from mm.h
  mm: Return void from various readahead functions
  mm: Ignore return value of ->readpages
  mm: Move readahead nr_pages check into read_pages
  mm: Add new readahead_control API
  mm: Use readahead_control to pass arguments
  mm: Rename various 'offset' parameters to 'index'
  mm: rename readahead loop variable to 'i'
  mm: Remove 'page_offset' from readahead loop
  mm: Put readahead pages in cache earlier
  mm: Add readahead address space operation
  mm: Move end_index check out of readahead loop
  mm: Add page_cache_readahead_unbounded
  mm: Document why we don't set PageReadahead
  mm: Use memalloc_nofs_save in readahead path
  fs: Convert mpage_readpages to mpage_readahead
  btrfs: Convert from readpages to readahead
  erofs: Convert uncompressed files from readpages to readahead
  erofs: Convert compressed files from readpages to readahead
  ext4: Convert from readpages to readahead
  ext4: Pass the inode to ext4_mpage_readpages
  f2fs: Convert from readpages to readahead
  f2fs: Pass the inode to f2fs_mpage_readpages
  fuse: Convert from readpages to readahead
  iomap: Convert from readpages to readahead

 Documentation/filesystems/locking.rst |   6 +-
 Documentation/filesystems/vfs.rst     |  15 ++
 block/blk-core.c                      |   1 +
 drivers/staging/exfat/exfat_super.c   |   7 +-
 fs/block_dev.c                        |   7 +-
 fs/btrfs/extent_io.c                  |  46 ++---
 fs/btrfs/extent_io.h                  |   3 +-
 fs/btrfs/inode.c                      |  16 +-
 fs/erofs/data.c                       |  39 ++--
 fs/erofs/zdata.c                      |  29 +--
 fs/ext2/inode.c                       |  10 +-
 fs/ext4/ext4.h                        |   5 +-
 fs/ext4/inode.c                       |  21 +-
 fs/ext4/readpage.c                    |  25 +--
 fs/ext4/verity.c                      |  35 +---
 fs/f2fs/data.c                        |  50 ++---
 fs/f2fs/f2fs.h                        |   3 -
 fs/f2fs/verity.c                      |  35 +---
 fs/fat/inode.c                        |   7 +-
 fs/fuse/file.c                        |  46 ++---
 fs/gfs2/aops.c                        |  23 +--
 fs/hpfs/file.c                        |   7 +-
 fs/iomap/buffered-io.c                |  92 +++------
 fs/iomap/trace.h                      |   2 +-
 fs/isofs/inode.c                      |   7 +-
 fs/jfs/inode.c                        |   7 +-
 fs/mpage.c                            |  38 +---
 fs/nilfs2/inode.c                     |  15 +-
 fs/ocfs2/aops.c                       |  34 ++--
 fs/omfs/file.c                        |   7 +-
 fs/qnx6/inode.c                       |   7 +-
 fs/reiserfs/inode.c                   |   8 +-
 fs/udf/inode.c                        |   7 +-
 fs/xfs/xfs_aops.c                     |  13 +-
 fs/zonefs/super.c                     |   7 +-
 include/linux/fs.h                    |   2 +
 include/linux/iomap.h                 |   3 +-
 include/linux/mm.h                    |  19 --
 include/linux/mpage.h                 |   4 +-
 include/linux/pagemap.h               | 151 ++++++++++++++
 include/trace/events/erofs.h          |   6 +-
 include/trace/events/f2fs.h           |   6 +-
 mm/fadvise.c                          |   6 +-
 mm/internal.h                         |  12 +-
 mm/migrate.c                          |   2 +-
 mm/readahead.c                        | 278 ++++++++++++++++----------
 46 files changed, 580 insertions(+), 589 deletions(-)


base-commit: 11a48a5a18c63fd7621bb050228cebf13566e4d8
-- 
2.25.0


Thread overview: 38+ messages
2020-02-25 21:48 Matthew Wilcox [this message]
2020-02-25 21:48 ` [PATCH v8 01/25] mm: Move readahead prototypes from mm.h Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 02/25] mm: Return void from various readahead functions Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 03/25] mm: Ignore return value of ->readpages Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 04/25] mm: Move readahead nr_pages check into read_pages Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 05/25] mm: Add new readahead_control API Matthew Wilcox
2020-02-26 17:01   ` Christoph Hellwig
2020-02-25 21:48 ` [PATCH v8 06/25] mm: Use readahead_control to pass arguments Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 07/25] mm: Rename various 'offset' parameters to 'index' Matthew Wilcox
2020-02-26 17:13   ` Zi Yan
2020-02-25 21:48 ` [PATCH v8 08/25] mm: rename readahead loop variable to 'i' Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 09/25] mm: Remove 'page_offset' from readahead loop Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 10/25] mm: Put readahead pages in cache earlier Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 11/25] mm: Add readahead address space operation Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 12/25] mm: Move end_index check out of readahead loop Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 13/25] mm: Add page_cache_readahead_unbounded Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 14/25] mm: Document why we don't set PageReadahead Matthew Wilcox
2020-02-26 17:03   ` Christoph Hellwig
2020-02-25 21:48 ` [PATCH v8 15/25] mm: Use memalloc_nofs_save in readahead path Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 16/25] fs: Convert mpage_readpages to mpage_readahead Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 17/25] btrfs: Convert from readpages to readahead Matthew Wilcox
2020-02-26 17:05   ` Christoph Hellwig
2020-02-28 14:08     ` David Sterba
2020-02-25 21:48 ` [PATCH v8 18/25] erofs: Convert uncompressed files " Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 19/25] erofs: Convert compressed " Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 20/25] ext4: Convert " Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 21/25] ext4: Pass the inode to ext4_mpage_readpages Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 22/25] f2fs: Convert from readpages to readahead Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 23/25] f2fs: Pass the inode to f2fs_mpage_readpages Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 24/25] fuse: Convert from readpages to readahead Matthew Wilcox
2020-02-25 21:48 ` [PATCH v8 25/25] iomap: " Matthew Wilcox
2020-02-26 17:04   ` Darrick J. Wong
2020-02-26 17:07     ` Christoph Hellwig
2020-02-26 17:10       ` Darrick J. Wong
2020-02-26 17:07   ` Christoph Hellwig
2020-03-19 10:20 ` [PATCH v8 00/25] Change readahead API Christoph Hellwig
2020-03-19 11:57   ` Matthew Wilcox
2020-03-19 11:49 ` William Kucharski
