* [PATCH 00/31] Convert most of ext4 to folios
@ 2023-01-26 20:23 Matthew Wilcox (Oracle)
  2023-01-26 20:23 ` [PATCH 01/31] fs: Add FGP_WRITEBEGIN Matthew Wilcox (Oracle)
                   ` (31 more replies)
  0 siblings, 32 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

This, on top of a number of patches currently in next and a few patches
sent to the mailing lists earlier today, converts most of ext4 to use
folios instead of pages.  It does not add support for large folios.
It does not convert mballoc to use folios.  write_begin() and write_end()
still take a page parameter instead of a folio.

It does convert a lot of code away from the page APIs that we're trying
to remove.  It does remove a lot of calls to compound_head().  I'd like
to see it land in 6.4, so I'm sending it out for review early.

Matthew Wilcox (Oracle) (31):
  fs: Add FGP_WRITEBEGIN
  fscrypt: Add some folio helper functions
  ext4: Convert ext4_bio_write_page() to use a folio
  ext4: Convert ext4_finish_bio() to use folios
  ext4: Convert ext4_writepage() to use a folio
  ext4: Turn mpage_process_page() into mpage_process_folio()
  ext4: Convert mpage_submit_page() to mpage_submit_folio()
  ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio()
  ext4: Convert ext4_readpage_inline() to take a folio
  ext4: Convert ext4_convert_inline_data_to_extent() to use a folio
  ext4: Convert ext4_try_to_write_inline_data() to use a folio
  ext4: Convert ext4_da_convert_inline_data_to_extent() to use a folio
  ext4: Convert ext4_da_write_inline_data_begin() to use a folio
  ext4: Convert ext4_read_inline_page() to ext4_read_inline_folio()
  ext4: Convert ext4_write_inline_data_end() to use a folio
  ext4: Convert ext4_write_begin() to use a folio
  ext4: Convert ext4_write_end() to use a folio
  ext4: Use a folio in ext4_journalled_write_end()
  ext4: Convert ext4_journalled_zero_new_buffers() to use a folio
  ext4: Convert __ext4_block_zero_page_range() to use a folio
  ext4: Convert __ext4_journalled_writepage() to take a folio
  ext4: Convert ext4_page_nomap_can_writeout() to take a folio
  ext4: Use a folio in ext4_da_write_begin()
  ext4: Convert ext4_mpage_readpages() to work on folios
  ext4: Convert ext4_block_write_begin() to take a folio
  ext4: Convert ext4_writepage() to take a folio
  ext4: Use a folio in ext4_page_mkwrite()
  ext4: Use a folio iterator in __read_end_io()
  ext4: Convert mext_page_mkuptodate() to take a folio
  ext4: Convert pagecache_read() to use a folio
  ext4: Use a folio in ext4_read_merkle_tree_page

 fs/ext4/ext4.h             |   9 +-
 fs/ext4/inline.c           | 171 ++++++++--------
 fs/ext4/inode.c            | 394 +++++++++++++++++++------------------
 fs/ext4/move_extent.c      |  33 ++--
 fs/ext4/page-io.c          |  98 +++++----
 fs/ext4/readpage.c         |  72 ++++---
 fs/ext4/verity.c           |  30 ++-
 fs/iomap/buffered-io.c     |   2 +-
 fs/netfs/buffered_read.c   |   3 +-
 include/linux/fscrypt.h    |  21 ++
 include/linux/page-flags.h |   5 -
 include/linux/pagemap.h    |   2 +
 mm/folio-compat.c          |   4 +-
 13 files changed, 424 insertions(+), 420 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 01/31] fs: Add FGP_WRITEBEGIN
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
@ 2023-01-26 20:23 ` Matthew Wilcox (Oracle)
  2023-03-05  8:53   ` Ritesh Harjani
  2023-03-14 22:00   ` Theodore Ts'o
  2023-01-26 20:23 ` [PATCH 02/31] fscrypt: Add some folio helper functions Matthew Wilcox (Oracle)
                   ` (30 subsequent siblings)
  31 siblings, 2 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

This particular combination of flags is used by most filesystems
in their ->write_begin method, although it does find use in a
few other places.  Before folios, it warranted its own function
(grab_cache_page_write_begin()), but I think that just having specialised
flags is enough.  It certainly helps the few places that have been
converted from grab_cache_page_write_begin() to __filemap_get_folio().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/move_extent.c    | 5 ++---
 fs/iomap/buffered-io.c   | 2 +-
 fs/netfs/buffered_read.c | 3 +--
 include/linux/pagemap.h  | 2 ++
 mm/folio-compat.c        | 4 +---
 5 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 2de9829aed63..0cb361f0a4fe 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -126,7 +126,6 @@ mext_folio_double_lock(struct inode *inode1, struct inode *inode2,
 {
 	struct address_space *mapping[2];
 	unsigned int flags;
-	unsigned fgp_flags = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE;
 
 	BUG_ON(!inode1 || !inode2);
 	if (inode1 < inode2) {
@@ -139,14 +138,14 @@ mext_folio_double_lock(struct inode *inode1, struct inode *inode2,
 	}
 
 	flags = memalloc_nofs_save();
-	folio[0] = __filemap_get_folio(mapping[0], index1, fgp_flags,
+	folio[0] = __filemap_get_folio(mapping[0], index1, FGP_WRITEBEGIN,
 			mapping_gfp_mask(mapping[0]));
 	if (!folio[0]) {
 		memalloc_nofs_restore(flags);
 		return -ENOMEM;
 	}
 
-	folio[1] = __filemap_get_folio(mapping[1], index2, fgp_flags,
+	folio[1] = __filemap_get_folio(mapping[1], index2, FGP_WRITEBEGIN,
 			mapping_gfp_mask(mapping[1]));
 	memalloc_nofs_restore(flags);
 	if (!folio[1]) {
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6f4c97a6d7e9..10a203515583 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -467,7 +467,7 @@ EXPORT_SYMBOL_GPL(iomap_is_partially_uptodate);
  */
 struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos)
 {
-	unsigned fgp = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE | FGP_NOFS;
+	unsigned fgp = FGP_WRITEBEGIN | FGP_NOFS;
 	struct folio *folio;
 
 	if (iter->flags & IOMAP_NOWAIT)
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 7679a68e8193..e3d754a9e1b0 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -341,14 +341,13 @@ int netfs_write_begin(struct netfs_inode *ctx,
 {
 	struct netfs_io_request *rreq;
 	struct folio *folio;
-	unsigned int fgp_flags = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE;
 	pgoff_t index = pos >> PAGE_SHIFT;
 	int ret;
 
 	DEFINE_READAHEAD(ractl, file, NULL, mapping, index);
 
 retry:
-	folio = __filemap_get_folio(mapping, index, fgp_flags,
+	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
 				    mapping_gfp_mask(mapping));
 	if (!folio)
 		return -ENOMEM;
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9f1081683771..47069662f4b8 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -507,6 +507,8 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping,
 #define FGP_ENTRY		0x00000080
 #define FGP_STABLE		0x00000100
 
+#define FGP_WRITEBEGIN		(FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE)
+
 struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		int fgp_flags, gfp_t gfp);
 struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 18c48b557926..668350748828 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -106,9 +106,7 @@ EXPORT_SYMBOL(pagecache_get_page);
 struct page *grab_cache_page_write_begin(struct address_space *mapping,
 					pgoff_t index)
 {
-	unsigned fgp_flags = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE;
-
-	return pagecache_get_page(mapping, index, fgp_flags,
+	return pagecache_get_page(mapping, index, FGP_WRITEBEGIN,
 			mapping_gfp_mask(mapping));
 }
 EXPORT_SYMBOL(grab_cache_page_write_begin);
-- 
2.35.1



* [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
  2023-01-26 20:23 ` [PATCH 01/31] fs: Add FGP_WRITEBEGIN Matthew Wilcox (Oracle)
@ 2023-01-26 20:23 ` Matthew Wilcox (Oracle)
  2023-01-27  3:02   ` Eric Biggers
  2023-03-05  9:06   ` Ritesh Harjani
  2023-01-26 20:23 ` [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio Matthew Wilcox (Oracle)
                   ` (29 subsequent siblings)
  31 siblings, 2 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

fscrypt_is_bounce_folio() is the equivalent of fscrypt_is_bounce_page()
and fscrypt_pagecache_folio() is the equivalent of fscrypt_pagecache_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/fscrypt.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 4f5f8a651213..c2c07d36fb3a 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -273,6 +273,16 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
 	return (struct page *)page_private(bounce_page);
 }
 
+static inline bool fscrypt_is_bounce_folio(struct folio *folio)
+{
+	return folio->mapping == NULL;
+}
+
+static inline struct folio *fscrypt_pagecache_folio(struct folio *bounce_folio)
+{
+	return bounce_folio->private;
+}
+
 void fscrypt_free_bounce_page(struct page *bounce_page);
 
 /* policy.c */
@@ -448,6 +458,17 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
 	return ERR_PTR(-EINVAL);
 }
 
+static inline bool fscrypt_is_bounce_folio(struct folio *folio)
+{
+	return false;
+}
+
+static inline struct folio *fscrypt_pagecache_folio(struct folio *bounce_folio)
+{
+	WARN_ON_ONCE(1);
+	return ERR_PTR(-EINVAL);
+}
+
 static inline void fscrypt_free_bounce_page(struct page *bounce_page)
 {
 }
-- 
2.35.1



* [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
  2023-01-26 20:23 ` [PATCH 01/31] fs: Add FGP_WRITEBEGIN Matthew Wilcox (Oracle)
  2023-01-26 20:23 ` [PATCH 02/31] fscrypt: Add some folio helper functions Matthew Wilcox (Oracle)
@ 2023-01-26 20:23 ` Matthew Wilcox (Oracle)
  2023-01-28 16:53   ` kernel test robot
                     ` (3 more replies)
  2023-01-26 20:23 ` [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios Matthew Wilcox (Oracle)
                   ` (28 subsequent siblings)
  31 siblings, 4 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Remove several calls to compound_head(), and remove the last caller of
set_page_writeback_keepwrite(), so the wrapper can be removed too.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/page-io.c          | 58 ++++++++++++++++++--------------------
 include/linux/page-flags.h |  5 ----
 2 files changed, 27 insertions(+), 36 deletions(-)

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index beaec6d81074..982791050892 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -409,11 +409,9 @@ static void io_submit_init_bio(struct ext4_io_submit *io,
 
 static void io_submit_add_bh(struct ext4_io_submit *io,
 			     struct inode *inode,
-			     struct page *page,
+			     struct folio *folio,
 			     struct buffer_head *bh)
 {
-	int ret;
-
 	if (io->io_bio && (bh->b_blocknr != io->io_next_block ||
 			   !fscrypt_mergeable_bio_bh(io->io_bio, bh))) {
 submit_and_retry:
@@ -421,10 +419,9 @@ static void io_submit_add_bh(struct ext4_io_submit *io,
 	}
 	if (io->io_bio == NULL)
 		io_submit_init_bio(io, bh);
-	ret = bio_add_page(io->io_bio, page, bh->b_size, bh_offset(bh));
-	if (ret != bh->b_size)
+	if (!bio_add_folio(io->io_bio, folio, bh->b_size, bh_offset(bh)))
 		goto submit_and_retry;
-	wbc_account_cgroup_owner(io->io_wbc, page, bh->b_size);
+	wbc_account_cgroup_owner(io->io_wbc, &folio->page, bh->b_size);
 	io->io_next_block++;
 }
 
@@ -432,8 +429,9 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 			struct page *page,
 			int len)
 {
-	struct page *bounce_page = NULL;
-	struct inode *inode = page->mapping->host;
+	struct folio *folio = page_folio(page);
+	struct folio *io_folio = folio;
+	struct inode *inode = folio->mapping->host;
 	unsigned block_start;
 	struct buffer_head *bh, *head;
 	int ret = 0;
@@ -441,30 +439,30 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 	struct writeback_control *wbc = io->io_wbc;
 	bool keep_towrite = false;
 
-	BUG_ON(!PageLocked(page));
-	BUG_ON(PageWriteback(page));
+	BUG_ON(!folio_test_locked(folio));
+	BUG_ON(folio_test_writeback(folio));
 
-	ClearPageError(page);
+	folio_clear_error(folio);
 
 	/*
 	 * Comments copied from block_write_full_page:
 	 *
-	 * The page straddles i_size.  It must be zeroed out on each and every
+	 * The folio straddles i_size.  It must be zeroed out on each and every
 	 * writepage invocation because it may be mmapped.  "A file is mapped
 	 * in multiples of the page size.  For a file that is not a multiple of
 	 * the page size, the remaining memory is zeroed when mapped, and
 	 * writes to that region are not written out to the file."
 	 */
-	if (len < PAGE_SIZE)
-		zero_user_segment(page, len, PAGE_SIZE);
+	if (len < folio_size(folio))
+		folio_zero_segment(folio, len, folio_size(folio));
 	/*
 	 * In the first loop we prepare and mark buffers to submit. We have to
-	 * mark all buffers in the page before submitting so that
-	 * end_page_writeback() cannot be called from ext4_end_bio() when IO
+	 * mark all buffers in the folio before submitting so that
+	 * folio_end_writeback() cannot be called from ext4_end_bio() when IO
 	 * on the first buffer finishes and we are still working on submitting
 	 * the second buffer.
 	 */
-	bh = head = page_buffers(page);
+	bh = head = folio_buffers(folio);
 	do {
 		block_start = bh_offset(bh);
 		if (block_start >= len) {
@@ -479,14 +477,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 				clear_buffer_dirty(bh);
 			/*
 			 * Keeping dirty some buffer we cannot write? Make sure
-			 * to redirty the page and keep TOWRITE tag so that
-			 * racing WB_SYNC_ALL writeback does not skip the page.
+			 * to redirty the folio and keep TOWRITE tag so that
+			 * racing WB_SYNC_ALL writeback does not skip the folio.
 			 * This happens e.g. when doing writeout for
 			 * transaction commit.
 			 */
 			if (buffer_dirty(bh)) {
-				if (!PageDirty(page))
-					redirty_page_for_writepage(wbc, page);
+				if (!folio_test_dirty(folio))
+					folio_redirty_for_writepage(wbc, folio);
 				keep_towrite = true;
 			}
 			continue;
@@ -498,11 +496,11 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 		nr_to_submit++;
 	} while ((bh = bh->b_this_page) != head);
 
-	/* Nothing to submit? Just unlock the page... */
+	/* Nothing to submit? Just unlock the folio... */
 	if (!nr_to_submit)
 		goto unlock;
 
-	bh = head = page_buffers(page);
+	bh = head = folio_buffers(folio);
 
 	/*
 	 * If any blocks are being written to an encrypted file, encrypt them
@@ -514,6 +512,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 	if (fscrypt_inode_uses_fs_layer_crypto(inode) && nr_to_submit) {
 		gfp_t gfp_flags = GFP_NOFS;
 		unsigned int enc_bytes = round_up(len, i_blocksize(inode));
+		struct page *bounce_page;
 
 		/*
 		 * Since bounce page allocation uses a mempool, we can only use
@@ -540,7 +539,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 			}
 
 			printk_ratelimited(KERN_ERR "%s: ret = %d\n", __func__, ret);
-			redirty_page_for_writepage(wbc, page);
+			folio_redirty_for_writepage(wbc, folio);
 			do {
 				if (buffer_async_write(bh)) {
 					clear_buffer_async_write(bh);
@@ -550,21 +549,18 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 			} while (bh != head);
 			goto unlock;
 		}
+		io_folio = page_folio(bounce_page);
 	}
 
-	if (keep_towrite)
-		set_page_writeback_keepwrite(page);
-	else
-		set_page_writeback(page);
+	__folio_start_writeback(folio, keep_towrite);
 
 	/* Now submit buffers to write */
 	do {
 		if (!buffer_async_write(bh))
 			continue;
-		io_submit_add_bh(io, inode,
-				 bounce_page ? bounce_page : page, bh);
+		io_submit_add_bh(io, inode, io_folio, bh);
 	} while ((bh = bh->b_this_page) != head);
 unlock:
-	unlock_page(page);
+	folio_unlock(folio);
 	return ret;
 }
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 0425f22a9c82..bba2a32031a2 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -766,11 +766,6 @@ bool set_page_writeback(struct page *page);
 #define folio_start_writeback_keepwrite(folio)	\
 	__folio_start_writeback(folio, true)
 
-static inline void set_page_writeback_keepwrite(struct page *page)
-{
-	folio_start_writeback_keepwrite(page_folio(page));
-}
-
 static inline bool test_set_page_writeback(struct page *page)
 {
 	return set_page_writeback(page);
-- 
2.35.1



* [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2023-01-26 20:23 ` [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio Matthew Wilcox (Oracle)
@ 2023-01-26 20:23 ` Matthew Wilcox (Oracle)
  2023-03-06  9:10   ` Ritesh Harjani
  2023-03-14 22:08   ` Theodore Ts'o
  2023-01-26 20:23 ` [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio Matthew Wilcox (Oracle)
                   ` (27 subsequent siblings)
  31 siblings, 2 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Prepare ext4 to support large folios in the page writeback path.
Also set the actual error in the mapping, not just -EIO.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/page-io.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 982791050892..fd6c0dca24b9 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -99,30 +99,30 @@ static void buffer_io_error(struct buffer_head *bh)
 
 static void ext4_finish_bio(struct bio *bio)
 {
-	struct bio_vec *bvec;
-	struct bvec_iter_all iter_all;
+	struct folio_iter fi;
 
-	bio_for_each_segment_all(bvec, bio, iter_all) {
-		struct page *page = bvec->bv_page;
-		struct page *bounce_page = NULL;
+	bio_for_each_folio_all(fi, bio) {
+		struct folio *folio = fi.folio;
+		struct folio *io_folio = NULL;
 		struct buffer_head *bh, *head;
-		unsigned bio_start = bvec->bv_offset;
-		unsigned bio_end = bio_start + bvec->bv_len;
+		size_t bio_start = fi.offset;
+		size_t bio_end = bio_start + fi.length;
 		unsigned under_io = 0;
 		unsigned long flags;
 
-		if (fscrypt_is_bounce_page(page)) {
-			bounce_page = page;
-			page = fscrypt_pagecache_page(bounce_page);
+		if (fscrypt_is_bounce_folio(folio)) {
+			io_folio = folio;
+			folio = fscrypt_pagecache_folio(folio);
 		}
 
 		if (bio->bi_status) {
-			SetPageError(page);
-			mapping_set_error(page->mapping, -EIO);
+			int err = blk_status_to_errno(bio->bi_status);
+			folio_set_error(folio);
+			mapping_set_error(folio->mapping, err);
 		}
-		bh = head = page_buffers(page);
+		bh = head = folio_buffers(folio);
 		/*
-		 * We check all buffers in the page under b_uptodate_lock
+		 * We check all buffers in the folio under b_uptodate_lock
 		 * to avoid races with other end io clearing async_write flags
 		 */
 		spin_lock_irqsave(&head->b_uptodate_lock, flags);
@@ -141,8 +141,8 @@ static void ext4_finish_bio(struct bio *bio)
 		} while ((bh = bh->b_this_page) != head);
 		spin_unlock_irqrestore(&head->b_uptodate_lock, flags);
 		if (!under_io) {
-			fscrypt_free_bounce_page(bounce_page);
-			end_page_writeback(page);
+			fscrypt_free_bounce_page(&io_folio->page);
+			folio_end_writeback(folio);
 		}
 	}
 }
-- 
2.35.1



* [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2023-01-26 20:23 ` [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios Matthew Wilcox (Oracle)
@ 2023-01-26 20:23 ` Matthew Wilcox (Oracle)
  2023-03-06 18:45   ` Ritesh Harjani
  2023-01-26 20:23 ` [PATCH 06/31] ext4: Turn mpage_process_page() into mpage_process_folio() Matthew Wilcox (Oracle)
                   ` (26 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Prepare for multi-page folios and save some instructions by converting
to the folio API.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b8b3e2e0d9fd..8e3d2cca1e0c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2027,26 +2027,25 @@ static int ext4_writepage(struct page *page,
 
 	trace_ext4_writepage(page);
 	size = i_size_read(inode);
-	if (page->index == size >> PAGE_SHIFT &&
+	len = folio_size(folio);
+	if (folio_pos(folio) + len > size &&
 	    !ext4_verity_in_progress(inode))
-		len = size & ~PAGE_MASK;
-	else
-		len = PAGE_SIZE;
+		len = size - folio_pos(folio);
 
+	page_bufs = folio_buffers(folio);
 	/* Should never happen but for bugs in other kernel subsystems */
-	if (!page_has_buffers(page)) {
+	if (!page_bufs) {
 		ext4_warning_inode(inode,
-		   "page %lu does not have buffers attached", page->index);
-		ClearPageDirty(page);
-		unlock_page(page);
+		   "page %lu does not have buffers attached", folio->index);
+		folio_clear_dirty(folio);
+		folio_unlock(folio);
 		return 0;
 	}
 
-	page_bufs = page_buffers(page);
 	/*
 	 * We cannot do block allocation or other extent handling in this
 	 * function. If there are buffers needing that, we have to redirty
-	 * the page. But we may reach here when we do a journal commit via
+	 * the folio. But we may reach here when we do a journal commit via
 	 * journal_submit_inode_data_buffers() and in that case we must write
 	 * allocated buffers to achieve data=ordered mode guarantees.
 	 *
@@ -2062,7 +2061,7 @@ static int ext4_writepage(struct page *page,
 	 */
 	if (ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len, NULL,
 				   ext4_bh_delay_or_unwritten)) {
-		redirty_page_for_writepage(wbc, page);
+		folio_redirty_for_writepage(wbc, folio);
 		if ((current->flags & PF_MEMALLOC) ||
 		    (inode->i_sb->s_blocksize == PAGE_SIZE)) {
 			/*
@@ -2072,12 +2071,12 @@ static int ext4_writepage(struct page *page,
 			 */
 			WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD))
 							== PF_MEMALLOC);
-			unlock_page(page);
+			folio_unlock(folio);
 			return 0;
 		}
 	}
 
-	if (PageChecked(page) && ext4_should_journal_data(inode))
+	if (folio_test_checked(folio) && ext4_should_journal_data(inode))
 		/*
 		 * It's mmapped pagecache.  Add buffers and journal it.  There
 		 * doesn't seem much point in redirtying the page here.
@@ -2087,8 +2086,8 @@ static int ext4_writepage(struct page *page,
 	ext4_io_submit_init(&io_submit, wbc);
 	io_submit.io_end = ext4_init_io_end(inode, GFP_NOFS);
 	if (!io_submit.io_end) {
-		redirty_page_for_writepage(wbc, page);
-		unlock_page(page);
+		folio_redirty_for_writepage(wbc, folio);
+		folio_unlock(folio);
 		return -ENOMEM;
 	}
 	ret = ext4_bio_write_page(&io_submit, page, len);
-- 
2.35.1



* [PATCH 06/31] ext4: Turn mpage_process_page() into mpage_process_folio()
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2023-01-26 20:23 ` [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio Matthew Wilcox (Oracle)
@ 2023-01-26 20:23 ` Matthew Wilcox (Oracle)
  2023-03-14 22:27   ` Theodore Ts'o
  2023-01-26 20:23 ` [PATCH 07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio() Matthew Wilcox (Oracle)
                   ` (25 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

The page/folio is only used to extract the buffers, so this is a
simple change.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8e3d2cca1e0c..e8f2918fd854 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2250,21 +2250,22 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
 }
 
 /*
- * mpage_process_page - update page buffers corresponding to changed extent and
- *		       may submit fully mapped page for IO
- *
- * @mpd		- description of extent to map, on return next extent to map
- * @m_lblk	- logical block mapping.
- * @m_pblk	- corresponding physical mapping.
- * @map_bh	- determines on return whether this page requires any further
+ * mpage_process_folio - update folio buffers corresponding to changed extent
+ *			 and may submit fully mapped page for IO
+ * @mpd: description of extent to map, on return next extent to map
+ * @folio: Contains these buffers.
+ * @m_lblk: logical block mapping.
+ * @m_pblk: corresponding physical mapping.
+ * @map_bh: determines on return whether this page requires any further
  *		  mapping or not.
- * Scan given page buffers corresponding to changed extent and update buffer
+ *
+ * Scan given folio buffers corresponding to changed extent and update buffer
  * state according to new extent state.
  * We map delalloc buffers to their physical location, clear unwritten bits.
- * If the given page is not fully mapped, we update @map to the next extent in
- * the given page that needs mapping & return @map_bh as true.
+ * If the given folio is not fully mapped, we update @mpd to the next extent in
+ * the given folio that needs mapping & return @map_bh as true.
  */
-static int mpage_process_page(struct mpage_da_data *mpd, struct page *page,
+static int mpage_process_folio(struct mpage_da_data *mpd, struct folio *folio,
 			      ext4_lblk_t *m_lblk, ext4_fsblk_t *m_pblk,
 			      bool *map_bh)
 {
@@ -2277,14 +2278,14 @@ static int mpage_process_page(struct mpage_da_data *mpd, struct page *page,
 	ssize_t io_end_size = 0;
 	struct ext4_io_end_vec *io_end_vec = ext4_last_io_end_vec(io_end);
 
-	bh = head = page_buffers(page);
+	bh = head = folio_buffers(folio);
 	do {
 		if (lblk < mpd->map.m_lblk)
 			continue;
 		if (lblk >= mpd->map.m_lblk + mpd->map.m_len) {
 			/*
 			 * Buffer after end of mapped extent.
-			 * Find next buffer in the page to map.
+			 * Find next buffer in the folio to map.
 			 */
 			mpd->map.m_len = 0;
 			mpd->map.m_flags = 0;
@@ -2357,9 +2358,9 @@ static int mpage_map_and_submit_buffers(struct mpage_da_data *mpd)
 		if (nr == 0)
 			break;
 		for (i = 0; i < nr; i++) {
-			struct page *page = &fbatch.folios[i]->page;
+			struct folio *folio = fbatch.folios[i];
 
-			err = mpage_process_page(mpd, page, &lblk, &pblock,
+			err = mpage_process_folio(mpd, folio, &lblk, &pblock,
 						 &map_bh);
 			/*
 			 * If map_bh is true, means page may require further bh
@@ -2369,7 +2370,7 @@ static int mpage_map_and_submit_buffers(struct mpage_da_data *mpd)
 			if (err < 0 || map_bh)
 				goto out;
 			/* Page fully mapped - let IO run! */
-			err = mpage_submit_page(mpd, page);
+			err = mpage_submit_page(mpd, &folio->page);
 			if (err < 0)
 				goto out;
 		}
-- 
2.35.1



* [PATCH 07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio()
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2023-01-26 20:23 ` [PATCH 06/31] ext4: Turn mpage_process_page() into mpage_process_folio() Matthew Wilcox (Oracle)
@ 2023-01-26 20:23 ` Matthew Wilcox (Oracle)
  2023-03-14 22:28   ` Theodore Ts'o
  2023-01-26 20:23 ` [PATCH 08/31] ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio() Matthew Wilcox (Oracle)
                   ` (24 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

All callers now have a folio so we can pass one in and use the folio
APIs to support large folios as well as save instructions by eliminating
calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index e8f2918fd854..8b91e325492f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2097,34 +2097,33 @@ static int ext4_writepage(struct page *page,
 	return ret;
 }
 
-static int mpage_submit_page(struct mpage_da_data *mpd, struct page *page)
+static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
 {
-	int len;
+	size_t len;
 	loff_t size;
 	int err;
 
-	BUG_ON(page->index != mpd->first_page);
-	clear_page_dirty_for_io(page);
+	BUG_ON(folio->index != mpd->first_page);
+	folio_clear_dirty_for_io(folio);
 	/*
 	 * We have to be very careful here!  Nothing protects writeback path
 	 * against i_size changes and the page can be writeably mapped into
 	 * page tables. So an application can be growing i_size and writing
-	 * data through mmap while writeback runs. clear_page_dirty_for_io()
+	 * data through mmap while writeback runs. folio_clear_dirty_for_io()
 	 * write-protects our page in page tables and the page cannot get
-	 * written to again until we release page lock. So only after
-	 * clear_page_dirty_for_io() we are safe to sample i_size for
+	 * written to again until we release folio lock. So only after
+	 * folio_clear_dirty_for_io() we are safe to sample i_size for
 	 * ext4_bio_write_page() to zero-out tail of the written page. We rely
 	 * on the barrier provided by TestClearPageDirty in
-	 * clear_page_dirty_for_io() to make sure i_size is really sampled only
+	 * folio_clear_dirty_for_io() to make sure i_size is really sampled only
 	 * after page tables are updated.
 	 */
 	size = i_size_read(mpd->inode);
-	if (page->index == size >> PAGE_SHIFT &&
+	len = folio_size(folio);
+	if (folio_pos(folio) + len > size &&
 	    !ext4_verity_in_progress(mpd->inode))
 		len = size & ~PAGE_MASK;
-	else
-		len = PAGE_SIZE;
-	err = ext4_bio_write_page(&mpd->io_submit, page, len);
+	err = ext4_bio_write_page(&mpd->io_submit, &folio->page, len);
 	if (!err)
 		mpd->wbc->nr_to_write--;
 	mpd->first_page++;
@@ -2238,7 +2237,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
 	} while (lblk++, (bh = bh->b_this_page) != head);
 	/* So far everything mapped? Submit the page for IO. */
 	if (mpd->map.m_len == 0) {
-		err = mpage_submit_page(mpd, head->b_page);
+		err = mpage_submit_folio(mpd, head->b_folio);
 		if (err < 0)
 			return err;
 	}
@@ -2370,7 +2369,7 @@ static int mpage_map_and_submit_buffers(struct mpage_da_data *mpd)
 			if (err < 0 || map_bh)
 				goto out;
 			/* Page fully mapped - let IO run! */
-			err = mpage_submit_page(mpd, &folio->page);
+			err = mpage_submit_folio(mpd, folio);
 			if (err < 0)
 				goto out;
 		}
@@ -2680,7 +2679,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			 */
 			if (!mpd->can_map) {
 				if (ext4_page_nomap_can_writeout(&folio->page)) {
-					err = mpage_submit_page(mpd, &folio->page);
+					err = mpage_submit_folio(mpd, folio);
 					if (err < 0)
 						goto out;
 				} else {
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH 08/31] ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio()
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2023-01-26 20:23 ` [PATCH 07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio() Matthew Wilcox (Oracle)
@ 2023-01-26 20:23 ` Matthew Wilcox (Oracle)
  2023-03-14 22:31   ` Theodore Ts'o
  2023-01-26 20:23 ` [PATCH 09/31] ext4: Convert ext4_readpage_inline() to take a folio Matthew Wilcox (Oracle)
                   ` (23 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Both callers now have a folio so pass it in directly and avoid the call
to page_folio() at the beginning.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/ext4.h    |  5 ++---
 fs/ext4/inode.c   | 18 +++++++++---------
 fs/ext4/page-io.c | 10 ++++------
 3 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 43e26e6f6e42..7a132e8648f4 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3756,9 +3756,8 @@ extern void ext4_io_submit_init(struct ext4_io_submit *io,
 				struct writeback_control *wbc);
 extern void ext4_end_io_rsv_work(struct work_struct *work);
 extern void ext4_io_submit(struct ext4_io_submit *io);
-extern int ext4_bio_write_page(struct ext4_io_submit *io,
-			       struct page *page,
-			       int len);
+int ext4_bio_write_folio(struct ext4_io_submit *io, struct folio *folio,
+		size_t len);
 extern struct ext4_io_end_vec *ext4_alloc_io_end_vec(ext4_io_end_t *io_end);
 extern struct ext4_io_end_vec *ext4_last_io_end_vec(ext4_io_end_t *io_end);
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8b91e325492f..fcd904123384 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2014,9 +2014,9 @@ static int ext4_writepage(struct page *page,
 	struct folio *folio = page_folio(page);
 	int ret = 0;
 	loff_t size;
-	unsigned int len;
+	size_t len;
 	struct buffer_head *page_bufs = NULL;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 	struct ext4_io_submit io_submit;
 
 	if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb)))) {
@@ -2052,12 +2052,12 @@ static int ext4_writepage(struct page *page,
 	 * Also, if there is only one buffer per page (the fs block
 	 * size == the page size), if one buffer needs block
 	 * allocation or needs to modify the extent tree to clear the
-	 * unwritten flag, we know that the page can't be written at
+	 * unwritten flag, we know that the folio can't be written at
 	 * all, so we might as well refuse the write immediately.
 	 * Unfortunately if the block size != page size, we can't as
 	 * easily detect this case using ext4_walk_page_buffers(), but
 	 * for the extremely common case, this is an optimization that
-	 * skips a useless round trip through ext4_bio_write_page().
+	 * skips a useless round trip through ext4_bio_write_folio().
 	 */
 	if (ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len, NULL,
 				   ext4_bh_delay_or_unwritten)) {
@@ -2079,7 +2079,7 @@ static int ext4_writepage(struct page *page,
 	if (folio_test_checked(folio) && ext4_should_journal_data(inode))
 		/*
 		 * It's mmapped pagecache.  Add buffers and journal it.  There
-		 * doesn't seem much point in redirtying the page here.
+		 * doesn't seem much point in redirtying the folio here.
 		 */
 		return __ext4_journalled_writepage(page, len);
 
@@ -2090,7 +2090,7 @@ static int ext4_writepage(struct page *page,
 		folio_unlock(folio);
 		return -ENOMEM;
 	}
-	ret = ext4_bio_write_page(&io_submit, page, len);
+	ret = ext4_bio_write_folio(&io_submit, folio, len);
 	ext4_io_submit(&io_submit);
 	/* Drop io_end reference we got from init */
 	ext4_put_io_end_defer(io_submit.io_end);
@@ -2113,8 +2113,8 @@ static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
 	 * write-protects our page in page tables and the page cannot get
 	 * written to again until we release folio lock. So only after
 	 * folio_clear_dirty_for_io() we are safe to sample i_size for
-	 * ext4_bio_write_page() to zero-out tail of the written page. We rely
-	 * on the barrier provided by TestClearPageDirty in
+	 * ext4_bio_write_folio() to zero-out tail of the written page. We rely
+	 * on the barrier provided by folio_test_clear_dirty() in
 	 * folio_clear_dirty_for_io() to make sure i_size is really sampled only
 	 * after page tables are updated.
 	 */
@@ -2123,7 +2123,7 @@ static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
 	if (folio_pos(folio) + len > size &&
 	    !ext4_verity_in_progress(mpd->inode))
 		len = size & ~PAGE_MASK;
-	err = ext4_bio_write_page(&mpd->io_submit, &folio->page, len);
+	err = ext4_bio_write_folio(&mpd->io_submit, folio, len);
 	if (!err)
 		mpd->wbc->nr_to_write--;
 	mpd->first_page++;
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index fd6c0dca24b9..c6da8800a49f 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -425,11 +425,9 @@ static void io_submit_add_bh(struct ext4_io_submit *io,
 	io->io_next_block++;
 }
 
-int ext4_bio_write_page(struct ext4_io_submit *io,
-			struct page *page,
-			int len)
+int ext4_bio_write_folio(struct ext4_io_submit *io, struct folio *folio,
+		size_t len)
 {
-	struct folio *folio = page_folio(page);
 	struct folio *io_folio = folio;
 	struct inode *inode = folio->mapping->host;
 	unsigned block_start;
@@ -522,8 +520,8 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 		if (io->io_bio)
 			gfp_flags = GFP_NOWAIT | __GFP_NOWARN;
 	retry_encrypt:
-		bounce_page = fscrypt_encrypt_pagecache_blocks(page, enc_bytes,
-							       0, gfp_flags);
+		bounce_page = fscrypt_encrypt_pagecache_blocks(&folio->page,
+					enc_bytes, 0, gfp_flags);
 		if (IS_ERR(bounce_page)) {
 			ret = PTR_ERR(bounce_page);
 			if (ret == -ENOMEM &&
-- 
2.35.1



* [PATCH 09/31] ext4: Convert ext4_readpage_inline() to take a folio
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Use the folio API in this function; this saves a few calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/ext4.h   |  2 +-
 fs/ext4/inline.c | 14 +++++++-------
 fs/ext4/inode.c  |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 7a132e8648f4..d2998800855c 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3549,7 +3549,7 @@ extern int ext4_init_inline_data(handle_t *handle, struct inode *inode,
 				 unsigned int len);
 extern int ext4_destroy_inline_data(handle_t *handle, struct inode *inode);
 
-extern int ext4_readpage_inline(struct inode *inode, struct page *page);
+int ext4_readpage_inline(struct inode *inode, struct folio *folio);
 extern int ext4_try_to_write_inline_data(struct address_space *mapping,
 					 struct inode *inode,
 					 loff_t pos, unsigned len,
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 2b42ececa46d..38f6282cc012 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -502,7 +502,7 @@ static int ext4_read_inline_page(struct inode *inode, struct page *page)
 	return ret;
 }
 
-int ext4_readpage_inline(struct inode *inode, struct page *page)
+int ext4_readpage_inline(struct inode *inode, struct folio *folio)
 {
 	int ret = 0;
 
@@ -516,16 +516,16 @@ int ext4_readpage_inline(struct inode *inode, struct page *page)
 	 * Current inline data can only exist in the 1st page,
 	 * So for all the other pages, just set them uptodate.
 	 */
-	if (!page->index)
-		ret = ext4_read_inline_page(inode, page);
-	else if (!PageUptodate(page)) {
-		zero_user_segment(page, 0, PAGE_SIZE);
-		SetPageUptodate(page);
+	if (!folio->index)
+		ret = ext4_read_inline_page(inode, &folio->page);
+	else if (!folio_test_uptodate(folio)) {
+		folio_zero_segment(folio, 0, PAGE_SIZE);
+		folio_mark_uptodate(folio);
 	}
 
 	up_read(&EXT4_I(inode)->xattr_sem);
 
-	unlock_page(page);
+	folio_unlock(folio);
 	return ret >= 0 ? 0 : ret;
 }
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index fcd904123384..c627686295e0 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3300,7 +3300,7 @@ static int ext4_read_folio(struct file *file, struct folio *folio)
 	trace_ext4_readpage(page);
 
 	if (ext4_has_inline_data(inode))
-		ret = ext4_readpage_inline(inode, page);
+		ret = ext4_readpage_inline(inode, folio);
 
 	if (ret == -EAGAIN)
 		return ext4_mpage_readpages(inode, NULL, page);
-- 
2.35.1



* [PATCH 10/31] ext4: Convert ext4_convert_inline_data_to_extent() to use a folio
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Saves a number of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inline.c | 40 +++++++++++++++++++---------------------
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 38f6282cc012..2091077e37dc 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -535,8 +535,7 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 	int ret, needed_blocks, no_expand;
 	handle_t *handle = NULL;
 	int retries = 0, sem_held = 0;
-	struct page *page = NULL;
-	unsigned int flags;
+	struct folio *folio = NULL;
 	unsigned from, to;
 	struct ext4_iloc iloc;
 
@@ -565,10 +564,9 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 
 	/* We cannot recurse into the filesystem as the transaction is already
 	 * started */
-	flags = memalloc_nofs_save();
-	page = grab_cache_page_write_begin(mapping, 0);
-	memalloc_nofs_restore(flags);
-	if (!page) {
+	folio = __filemap_get_folio(mapping, 0, FGP_WRITEBEGIN | FGP_NOFS,
+			mapping_gfp_mask(mapping));
+	if (!folio) {
 		ret = -ENOMEM;
 		goto out;
 	}
@@ -583,8 +581,8 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 
 	from = 0;
 	to = ext4_get_inline_size(inode);
-	if (!PageUptodate(page)) {
-		ret = ext4_read_inline_page(inode, page);
+	if (!folio_test_uptodate(folio)) {
+		ret = ext4_read_inline_page(inode, &folio->page);
 		if (ret < 0)
 			goto out;
 	}
@@ -594,21 +592,21 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 		goto out;
 
 	if (ext4_should_dioread_nolock(inode)) {
-		ret = __block_write_begin(page, from, to,
+		ret = __block_write_begin(&folio->page, from, to,
 					  ext4_get_block_unwritten);
 	} else
-		ret = __block_write_begin(page, from, to, ext4_get_block);
+		ret = __block_write_begin(&folio->page, from, to, ext4_get_block);
 
 	if (!ret && ext4_should_journal_data(inode)) {
-		ret = ext4_walk_page_buffers(handle, inode, page_buffers(page),
-					     from, to, NULL,
-					     do_journal_get_write_access);
+		ret = ext4_walk_page_buffers(handle, inode,
+					     folio_buffers(folio), from, to,
+					     NULL, do_journal_get_write_access);
 	}
 
 	if (ret) {
-		unlock_page(page);
-		put_page(page);
-		page = NULL;
+		folio_unlock(folio);
+		folio_put(folio);
+		folio = NULL;
 		ext4_orphan_add(handle, inode);
 		ext4_write_unlock_xattr(inode, &no_expand);
 		sem_held = 0;
@@ -628,12 +626,12 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
 		goto retry;
 
-	if (page)
-		block_commit_write(page, from, to);
+	if (folio)
+		block_commit_write(&folio->page, from, to);
 out:
-	if (page) {
-		unlock_page(page);
-		put_page(page);
+	if (folio) {
+		folio_unlock(folio);
+		folio_put(folio);
 	}
 	if (sem_held)
 		ext4_write_unlock_xattr(inode, &no_expand);
-- 
2.35.1



* [PATCH 11/31] ext4: Convert ext4_try_to_write_inline_data() to use a folio
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Saves a number of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inline.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 2091077e37dc..6d136353ccc2 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -654,8 +654,7 @@ int ext4_try_to_write_inline_data(struct address_space *mapping,
 {
 	int ret;
 	handle_t *handle;
-	unsigned int flags;
-	struct page *page;
+	struct folio *folio;
 	struct ext4_iloc iloc;
 
 	if (pos + len > ext4_get_max_inline_size(inode))
@@ -692,28 +691,27 @@ int ext4_try_to_write_inline_data(struct address_space *mapping,
 	if (ret)
 		goto out;
 
-	flags = memalloc_nofs_save();
-	page = grab_cache_page_write_begin(mapping, 0);
-	memalloc_nofs_restore(flags);
-	if (!page) {
+	folio = __filemap_get_folio(mapping, 0, FGP_WRITEBEGIN | FGP_NOFS,
+					mapping_gfp_mask(mapping));
+	if (!folio) {
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	*pagep = page;
+	*pagep = &folio->page;
 	down_read(&EXT4_I(inode)->xattr_sem);
 	if (!ext4_has_inline_data(inode)) {
 		ret = 0;
-		unlock_page(page);
-		put_page(page);
+		folio_unlock(folio);
+		folio_put(folio);
 		goto out_up_read;
 	}
 
-	if (!PageUptodate(page)) {
-		ret = ext4_read_inline_page(inode, page);
+	if (!folio_test_uptodate(folio)) {
+		ret = ext4_read_inline_page(inode, &folio->page);
 		if (ret < 0) {
-			unlock_page(page);
-			put_page(page);
+			folio_unlock(folio);
+			folio_put(folio);
 			goto out_up_read;
 		}
 	}
-- 
2.35.1



* [PATCH 12/31] ext4: Convert ext4_da_convert_inline_data_to_extent() to use a folio
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Saves a number of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inline.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 6d136353ccc2..99c77dd519f0 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -849,10 +849,11 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
 						 void **fsdata)
 {
 	int ret = 0, inline_size;
-	struct page *page;
+	struct folio *folio;
 
-	page = grab_cache_page_write_begin(mapping, 0);
-	if (!page)
+	folio = __filemap_get_folio(mapping, 0, FGP_WRITEBEGIN,
+					mapping_gfp_mask(mapping));
+	if (!folio)
 		return -ENOMEM;
 
 	down_read(&EXT4_I(inode)->xattr_sem);
@@ -863,32 +864,32 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
 
 	inline_size = ext4_get_inline_size(inode);
 
-	if (!PageUptodate(page)) {
-		ret = ext4_read_inline_page(inode, page);
+	if (!folio_test_uptodate(folio)) {
+		ret = ext4_read_inline_page(inode, &folio->page);
 		if (ret < 0)
 			goto out;
 	}
 
-	ret = __block_write_begin(page, 0, inline_size,
+	ret = __block_write_begin(&folio->page, 0, inline_size,
 				  ext4_da_get_block_prep);
 	if (ret) {
 		up_read(&EXT4_I(inode)->xattr_sem);
-		unlock_page(page);
-		put_page(page);
+		folio_unlock(folio);
+		folio_put(folio);
 		ext4_truncate_failed_write(inode);
 		return ret;
 	}
 
-	SetPageDirty(page);
-	SetPageUptodate(page);
+	folio_mark_dirty(folio);
+	folio_mark_uptodate(folio);
 	ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
 	*fsdata = (void *)CONVERT_INLINE_DATA;
 
 out:
 	up_read(&EXT4_I(inode)->xattr_sem);
-	if (page) {
-		unlock_page(page);
-		put_page(page);
+	if (folio) {
+		folio_unlock(folio);
+		folio_put(folio);
 	}
 	return ret;
 }
-- 
2.35.1



* [PATCH 13/31] ext4: Convert ext4_da_write_inline_data_begin() to use a folio
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Saves a number of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inline.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 99c77dd519f0..b8e22348dad2 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -910,10 +910,9 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
 {
 	int ret;
 	handle_t *handle;
-	struct page *page;
+	struct folio *folio;
 	struct ext4_iloc iloc;
 	int retries = 0;
-	unsigned int flags;
 
 	ret = ext4_get_inode_loc(inode, &iloc);
 	if (ret)
@@ -945,10 +944,9 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
 	 * We cannot recurse into the filesystem as the transaction
 	 * is already started.
 	 */
-	flags = memalloc_nofs_save();
-	page = grab_cache_page_write_begin(mapping, 0);
-	memalloc_nofs_restore(flags);
-	if (!page) {
+	folio = __filemap_get_folio(mapping, 0, FGP_WRITEBEGIN | FGP_NOFS,
+					mapping_gfp_mask(mapping));
+	if (!folio) {
 		ret = -ENOMEM;
 		goto out_journal;
 	}
@@ -959,8 +957,8 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
 		goto out_release_page;
 	}
 
-	if (!PageUptodate(page)) {
-		ret = ext4_read_inline_page(inode, page);
+	if (!folio_test_uptodate(folio)) {
+		ret = ext4_read_inline_page(inode, &folio->page);
 		if (ret < 0)
 			goto out_release_page;
 	}
@@ -970,13 +968,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
 		goto out_release_page;
 
 	up_read(&EXT4_I(inode)->xattr_sem);
-	*pagep = page;
+	*pagep = &folio->page;
 	brelse(iloc.bh);
 	return 1;
 out_release_page:
 	up_read(&EXT4_I(inode)->xattr_sem);
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 out_journal:
 	ext4_journal_stop(handle);
 out:
-- 
2.35.1



* [PATCH 14/31] ext4: Convert ext4_read_inline_page() to ext4_read_inline_folio()
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

All callers now have a folio, so pass it and use it.  The folio may
be large, although I doubt we'll want to use a large folio for an
inline file.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inline.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index b8e22348dad2..29294caa20a1 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -468,16 +468,16 @@ static int ext4_destroy_inline_data_nolock(handle_t *handle,
 	return error;
 }
 
-static int ext4_read_inline_page(struct inode *inode, struct page *page)
+static int ext4_read_inline_folio(struct inode *inode, struct folio *folio)
 {
 	void *kaddr;
 	int ret = 0;
 	size_t len;
 	struct ext4_iloc iloc;
 
-	BUG_ON(!PageLocked(page));
+	BUG_ON(!folio_test_locked(folio));
 	BUG_ON(!ext4_has_inline_data(inode));
-	BUG_ON(page->index);
+	BUG_ON(folio->index);
 
 	if (!EXT4_I(inode)->i_inline_off) {
 		ext4_warning(inode->i_sb, "inode %lu doesn't have inline data.",
@@ -490,12 +490,13 @@ static int ext4_read_inline_page(struct inode *inode, struct page *page)
 		goto out;
 
 	len = min_t(size_t, ext4_get_inline_size(inode), i_size_read(inode));
-	kaddr = kmap_atomic(page);
+	BUG_ON(len > PAGE_SIZE);
+	kaddr = kmap_local_folio(folio, 0);
 	ret = ext4_read_inline_data(inode, kaddr, len, &iloc);
-	flush_dcache_page(page);
-	kunmap_atomic(kaddr);
-	zero_user_segment(page, len, PAGE_SIZE);
-	SetPageUptodate(page);
+	flush_dcache_folio(folio);
+	kunmap_local(kaddr);
+	folio_zero_segment(folio, len, folio_size(folio));
+	folio_mark_uptodate(folio);
 	brelse(iloc.bh);
 
 out:
@@ -517,7 +518,7 @@ int ext4_readpage_inline(struct inode *inode, struct folio *folio)
 	 * So for all the other pages, just set them uptodate.
 	 */
 	if (!folio->index)
-		ret = ext4_read_inline_page(inode, &folio->page);
+		ret = ext4_read_inline_folio(inode, folio);
 	else if (!folio_test_uptodate(folio)) {
 		folio_zero_segment(folio, 0, PAGE_SIZE);
 		folio_mark_uptodate(folio);
@@ -582,7 +583,7 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 	from = 0;
 	to = ext4_get_inline_size(inode);
 	if (!folio_test_uptodate(folio)) {
-		ret = ext4_read_inline_page(inode, &folio->page);
+		ret = ext4_read_inline_folio(inode, folio);
 		if (ret < 0)
 			goto out;
 	}
@@ -708,7 +709,7 @@ int ext4_try_to_write_inline_data(struct address_space *mapping,
 	}
 
 	if (!folio_test_uptodate(folio)) {
-		ret = ext4_read_inline_page(inode, &folio->page);
+		ret = ext4_read_inline_folio(inode, folio);
 		if (ret < 0) {
 			folio_unlock(folio);
 			folio_put(folio);
@@ -865,7 +866,7 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
 	inline_size = ext4_get_inline_size(inode);
 
 	if (!folio_test_uptodate(folio)) {
-		ret = ext4_read_inline_page(inode, &folio->page);
+		ret = ext4_read_inline_folio(inode, folio);
 		if (ret < 0)
 			goto out;
 	}
@@ -958,7 +959,7 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
 	}
 
 	if (!folio_test_uptodate(folio)) {
-		ret = ext4_read_inline_page(inode, &folio->page);
+		ret = ext4_read_inline_folio(inode, folio);
 		if (ret < 0)
 			goto out_release_page;
 	}
-- 
2.35.1



* [PATCH 15/31] ext4: Convert ext4_write_inline_data_end() to use a folio
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:23 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Convert the incoming page to a folio so that we call compound_head()
only once instead of seven times.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inline.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 29294caa20a1..a06dd4f0d17b 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -733,20 +733,21 @@ int ext4_try_to_write_inline_data(struct address_space *mapping,
 int ext4_write_inline_data_end(struct inode *inode, loff_t pos, unsigned len,
 			       unsigned copied, struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	handle_t *handle = ext4_journal_current_handle();
 	int no_expand;
 	void *kaddr;
 	struct ext4_iloc iloc;
 	int ret = 0, ret2;
 
-	if (unlikely(copied < len) && !PageUptodate(page))
+	if (unlikely(copied < len) && !folio_test_uptodate(folio))
 		copied = 0;
 
 	if (likely(copied)) {
 		ret = ext4_get_inode_loc(inode, &iloc);
 		if (ret) {
-			unlock_page(page);
-			put_page(page);
+			folio_unlock(folio);
+			folio_put(folio);
 			ext4_std_error(inode->i_sb, ret);
 			goto out;
 		}
@@ -760,30 +761,30 @@ int ext4_write_inline_data_end(struct inode *inode, loff_t pos, unsigned len,
 		 */
 		(void) ext4_find_inline_data_nolock(inode);
 
-		kaddr = kmap_atomic(page);
+		kaddr = kmap_local_folio(folio, 0);
 		ext4_write_inline_data(inode, &iloc, kaddr, pos, copied);
-		kunmap_atomic(kaddr);
-		SetPageUptodate(page);
-		/* clear page dirty so that writepages wouldn't work for us. */
-		ClearPageDirty(page);
+		kunmap_local(kaddr);
+		folio_mark_uptodate(folio);
+		/* clear dirty flag so that writepages wouldn't work for us. */
+		folio_clear_dirty(folio);
 
 		ext4_write_unlock_xattr(inode, &no_expand);
 		brelse(iloc.bh);
 
 		/*
-		 * It's important to update i_size while still holding page
+		 * It's important to update i_size while still holding folio
 		 * lock: page writeout could otherwise come in and zero
 		 * beyond i_size.
 		 */
 		ext4_update_inode_size(inode, pos + copied);
 	}
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 
 	/*
-	 * Don't mark the inode dirty under page lock. First, it unnecessarily
-	 * makes the holding time of page lock longer. Second, it forces lock
-	 * ordering of page lock and transaction start for journaling
+	 * Don't mark the inode dirty under folio lock. First, it unnecessarily
+	 * makes the holding time of folio lock longer. Second, it forces lock
+	 * ordering of folio lock and transaction start for journaling
 	 * filesystems.
 	 */
 	if (likely(copied))
-- 
2.35.1



* [PATCH 16/31] ext4: Convert ext4_write_begin() to use a folio
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Remove a lot of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 53 +++++++++++++++++++++++++------------------------
 1 file changed, 27 insertions(+), 26 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c627686295e0..9233d6b68ebe 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1156,7 +1156,7 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
 	int ret, needed_blocks;
 	handle_t *handle;
 	int retries = 0;
-	struct page *page;
+	struct folio *folio;
 	pgoff_t index;
 	unsigned from, to;
 
@@ -1183,68 +1183,69 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
 	}
 
 	/*
-	 * grab_cache_page_write_begin() can take a long time if the
-	 * system is thrashing due to memory pressure, or if the page
+	 * __filemap_get_folio() can take a long time if the
+	 * system is thrashing due to memory pressure, or if the folio
 	 * is being written back.  So grab it first before we start
 	 * the transaction handle.  This also allows us to allocate
-	 * the page (if needed) without using GFP_NOFS.
+	 * the folio (if needed) without using GFP_NOFS.
 	 */
 retry_grab:
-	page = grab_cache_page_write_begin(mapping, index);
-	if (!page)
+	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
+					mapping_gfp_mask(mapping));
+	if (!folio)
 		return -ENOMEM;
 	/*
 	 * The same as page allocation, we prealloc buffer heads before
 	 * starting the handle.
 	 */
-	if (!page_has_buffers(page))
-		create_empty_buffers(page, inode->i_sb->s_blocksize, 0);
+	if (!folio_buffers(folio))
+		create_empty_buffers(&folio->page, inode->i_sb->s_blocksize, 0);
 
-	unlock_page(page);
+	folio_unlock(folio);
 
 retry_journal:
 	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, needed_blocks);
 	if (IS_ERR(handle)) {
-		put_page(page);
+		folio_put(folio);
 		return PTR_ERR(handle);
 	}
 
-	lock_page(page);
-	if (page->mapping != mapping) {
-		/* The page got truncated from under us */
-		unlock_page(page);
-		put_page(page);
+	folio_lock(folio);
+	if (folio->mapping != mapping) {
+		/* The folio got truncated from under us */
+		folio_unlock(folio);
+		folio_put(folio);
 		ext4_journal_stop(handle);
 		goto retry_grab;
 	}
-	/* In case writeback began while the page was unlocked */
-	wait_for_stable_page(page);
+	/* In case writeback began while the folio was unlocked */
+	folio_wait_stable(folio);
 
 #ifdef CONFIG_FS_ENCRYPTION
 	if (ext4_should_dioread_nolock(inode))
-		ret = ext4_block_write_begin(page, pos, len,
+		ret = ext4_block_write_begin(&folio->page, pos, len,
 					     ext4_get_block_unwritten);
 	else
-		ret = ext4_block_write_begin(page, pos, len,
+		ret = ext4_block_write_begin(&folio->page, pos, len,
 					     ext4_get_block);
 #else
 	if (ext4_should_dioread_nolock(inode))
-		ret = __block_write_begin(page, pos, len,
+		ret = __block_write_begin(&folio->page, pos, len,
 					  ext4_get_block_unwritten);
 	else
-		ret = __block_write_begin(page, pos, len, ext4_get_block);
+		ret = __block_write_begin(&folio->page, pos, len, ext4_get_block);
 #endif
 	if (!ret && ext4_should_journal_data(inode)) {
 		ret = ext4_walk_page_buffers(handle, inode,
-					     page_buffers(page), from, to, NULL,
-					     do_journal_get_write_access);
+					     folio_buffers(folio), from, to,
+					     NULL, do_journal_get_write_access);
 	}
 
 	if (ret) {
 		bool extended = (pos + len > inode->i_size) &&
 				!ext4_verity_in_progress(inode);
 
-		unlock_page(page);
+		folio_unlock(folio);
 		/*
 		 * __block_write_begin may have instantiated a few blocks
 		 * outside i_size.  Trim these off again. Don't need
@@ -1272,10 +1273,10 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
 		if (ret == -ENOSPC &&
 		    ext4_should_retry_alloc(inode->i_sb, &retries))
 			goto retry_journal;
-		put_page(page);
+		folio_put(folio);
 		return ret;
 	}
-	*pagep = page;
+	*pagep = &folio->page;
 	return ret;
 }
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread
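
The gain this patch claims ("remove a lot of calls to compound_head()") comes from the folio API contract: every legacy page function must first normalize a possible tail page to its head, while a `struct folio *` is the head by construction. A minimal userspace sketch of that difference, using stand-in types (these are NOT the real kernel definitions, just enough structure to show the hidden lookup):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins only: the real struct page/struct folio live in the kernel. */
struct page {
	struct page *head;	/* NULL for a base/head page in this sketch */
	int locked;
};

struct folio {
	struct page page;
};

/* Legacy page APIs must first resolve a possible tail page to its head. */
static struct page *compound_head(struct page *page)
{
	return page->head ? page->head : page;
}

static void unlock_page(struct page *page)
{
	compound_head(page)->locked = 0;	/* implicit extra work on every call */
}

/* A folio pointer is the head by definition, so no lookup is needed. */
static void folio_unlock(struct folio *folio)
{
	folio->page.locked = 0;
}
```

Converting a function like ext4_write_begin() to hold a folio from the start means every lock/unlock/put in its body skips that normalization step.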

* [PATCH 17/31] ext4: Convert ext4_write_end() to use a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 16/31] ext4: Convert ext4_write_begin() " Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-03-14 22:41   ` Theodore Ts'o
  2023-01-26 20:24 ` [PATCH 18/31] ext4: Use a folio in ext4_journalled_write_end() Matthew Wilcox (Oracle)
                   ` (14 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Convert the incoming struct page to a folio.  Replaces two implicit
calls to compound_head() with one explicit call.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9233d6b68ebe..ab6eb85a9506 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1306,6 +1306,7 @@ static int ext4_write_end(struct file *file,
 			  loff_t pos, unsigned len, unsigned copied,
 			  struct page *page, void *fsdata)
 {
+	struct folio *folio = page_folio(page);
 	handle_t *handle = ext4_journal_current_handle();
 	struct inode *inode = mapping->host;
 	loff_t old_size = inode->i_size;
@@ -1321,7 +1322,7 @@ static int ext4_write_end(struct file *file,
 
 	copied = block_write_end(file, mapping, pos, len, copied, page, fsdata);
 	/*
-	 * it's important to update i_size while still holding page lock:
+	 * it's important to update i_size while still holding folio lock:
 	 * page writeout could otherwise come in and zero beyond i_size.
 	 *
 	 * If FS_IOC_ENABLE_VERITY is running on this inode, then Merkle tree
@@ -1329,15 +1330,15 @@ static int ext4_write_end(struct file *file,
 	 */
 	if (!verity)
 		i_size_changed = ext4_update_inode_size(inode, pos + copied);
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 
 	if (old_size < pos && !verity)
 		pagecache_isize_extended(inode, old_size, pos);
 	/*
-	 * Don't mark the inode dirty under page lock. First, it unnecessarily
-	 * makes the holding time of page lock longer. Second, it forces lock
-	 * ordering of page lock and transaction start for journaling
+	 * Don't mark the inode dirty under folio lock. First, it unnecessarily
+	 * makes the holding time of folio lock longer. Second, it forces lock
+	 * ordering of folio lock and transaction start for journaling
 	 * filesystems.
 	 */
 	if (i_size_changed)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH 18/31] ext4: Use a folio in ext4_journalled_write_end()
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (16 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 17/31] ext4: Convert ext4_write_end() " Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-03-14 22:41   ` Theodore Ts'o
  2023-01-26 20:24 ` [PATCH 19/31] ext4: Convert ext4_journalled_zero_new_buffers() to use a folio Matthew Wilcox (Oracle)
                   ` (13 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Convert the incoming page to a folio to remove a few calls to
compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ab6eb85a9506..4f43d7434965 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1409,6 +1409,7 @@ static int ext4_journalled_write_end(struct file *file,
 				     loff_t pos, unsigned len, unsigned copied,
 				     struct page *page, void *fsdata)
 {
+	struct folio *folio = page_folio(page);
 	handle_t *handle = ext4_journal_current_handle();
 	struct inode *inode = mapping->host;
 	loff_t old_size = inode->i_size;
@@ -1427,25 +1428,26 @@ static int ext4_journalled_write_end(struct file *file,
 	if (ext4_has_inline_data(inode))
 		return ext4_write_inline_data_end(inode, pos, len, copied, page);
 
-	if (unlikely(copied < len) && !PageUptodate(page)) {
+	if (unlikely(copied < len) && !folio_test_uptodate(folio)) {
 		copied = 0;
 		ext4_journalled_zero_new_buffers(handle, inode, page, from, to);
 	} else {
 		if (unlikely(copied < len))
 			ext4_journalled_zero_new_buffers(handle, inode, page,
 							 from + copied, to);
-		ret = ext4_walk_page_buffers(handle, inode, page_buffers(page),
+		ret = ext4_walk_page_buffers(handle, inode,
+					     folio_buffers(folio),
 					     from, from + copied, &partial,
 					     write_end_fn);
 		if (!partial)
-			SetPageUptodate(page);
+			folio_mark_uptodate(folio);
 	}
 	if (!verity)
 		size_changed = ext4_update_inode_size(inode, pos + copied);
 	ext4_set_inode_state(inode, EXT4_STATE_JDATA);
 	EXT4_I(inode)->i_datasync_tid = handle->h_transaction->t_tid;
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 
 	if (old_size < pos && !verity)
 		pagecache_isize_extended(inode, old_size, pos);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH 19/31] ext4: Convert ext4_journalled_zero_new_buffers() to use a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (17 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 18/31] ext4: Use a folio in ext4_journalled_write_end() Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-03-14 22:46   ` Theodore Ts'o
  2023-01-26 20:24 ` [PATCH 20/31] ext4: Convert __ext4_block_zero_page_range() " Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Remove a call to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4f43d7434965..b79e591b7c8e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1376,24 +1376,24 @@ static int ext4_write_end(struct file *file,
  */
 static void ext4_journalled_zero_new_buffers(handle_t *handle,
 					    struct inode *inode,
-					    struct page *page,
+					    struct folio *folio,
 					    unsigned from, unsigned to)
 {
 	unsigned int block_start = 0, block_end;
 	struct buffer_head *head, *bh;
 
-	bh = head = page_buffers(page);
+	bh = head = folio_buffers(folio);
 	do {
 		block_end = block_start + bh->b_size;
 		if (buffer_new(bh)) {
 			if (block_end > from && block_start < to) {
-				if (!PageUptodate(page)) {
+				if (!folio_test_uptodate(folio)) {
 					unsigned start, size;
 
 					start = max(from, block_start);
 					size = min(to, block_end) - start;
 
-					zero_user(page, start, size);
+					folio_zero_range(folio, start, size);
 					write_end_fn(handle, inode, bh);
 				}
 				clear_buffer_new(bh);
@@ -1430,10 +1430,11 @@ static int ext4_journalled_write_end(struct file *file,
 
 	if (unlikely(copied < len) && !folio_test_uptodate(folio)) {
 		copied = 0;
-		ext4_journalled_zero_new_buffers(handle, inode, page, from, to);
+		ext4_journalled_zero_new_buffers(handle, inode, folio,
+						 from, to);
 	} else {
 		if (unlikely(copied < len))
-			ext4_journalled_zero_new_buffers(handle, inode, page,
+			ext4_journalled_zero_new_buffers(handle, inode, folio,
 							 from + copied, to);
 		ret = ext4_walk_page_buffers(handle, inode,
 					     folio_buffers(folio),
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread
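
The zeroing loop in this patch clamps the caller's [from, to) byte range to each buffer's [block_start, block_end) span before calling folio_zero_range(). The clamping arithmetic can be sketched as a standalone helper (`zero_span` is a hypothetical name for illustration, not a kernel function):

```c
#include <assert.h>

/* Same max()/min() clamping as the zeroing loop in
 * ext4_journalled_zero_new_buffers(): intersect the requested [from, to)
 * range with one buffer's [block_start, block_end) span and return the
 * offset/length handed to folio_zero_range().  Returns 0 on no overlap
 * (the loop guards with block_end > from && block_start < to). */
static unsigned zero_span(unsigned from, unsigned to,
			  unsigned block_start, unsigned block_end,
			  unsigned *start)
{
	if (block_end <= from || block_start >= to)
		return 0;
	*start = from > block_start ? from : block_start;	/* max(from, block_start) */
	return (to < block_end ? to : block_end) - *start;	/* min(to, block_end) - start */
}
```

For a 1 KiB-block folio and a write of bytes [100, 900), the first buffer zeroes [100, 512) and the second [512, 900); buffers entirely outside the range contribute nothing.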

* [PATCH 20/31] ext4: Convert __ext4_block_zero_page_range() to use a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (18 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 19/31] ext4: Convert ext4_journalled_zero_new_buffers() to use a folio Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-03-05 12:26   ` Ritesh Harjani
  2023-01-26 20:24 ` [PATCH 21/31] ext4: Convert __ext4_journalled_writepage() to take " Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Use folio APIs throughout.  Saves many calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b79e591b7c8e..727aa2e51a9d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3812,23 +3812,26 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 	ext4_lblk_t iblock;
 	struct inode *inode = mapping->host;
 	struct buffer_head *bh;
-	struct page *page;
+	struct folio *folio;
 	int err = 0;
 
-	page = find_or_create_page(mapping, from >> PAGE_SHIFT,
-				   mapping_gfp_constraint(mapping, ~__GFP_FS));
-	if (!page)
+	folio = __filemap_get_folio(mapping, from >> PAGE_SHIFT,
+				    FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
+				    mapping_gfp_constraint(mapping, ~__GFP_FS));
+	if (!folio)
 		return -ENOMEM;
 
 	blocksize = inode->i_sb->s_blocksize;
 
 	iblock = index << (PAGE_SHIFT - inode->i_sb->s_blocksize_bits);
 
-	if (!page_has_buffers(page))
-		create_empty_buffers(page, blocksize, 0);
+	bh = folio_buffers(folio);
+	if (!bh) {
+		create_empty_buffers(&folio->page, blocksize, 0);
+		bh = folio_buffers(folio);
+	}
 
 	/* Find the buffer that contains "offset" */
-	bh = page_buffers(page);
 	pos = blocksize;
 	while (offset >= pos) {
 		bh = bh->b_this_page;
@@ -3850,7 +3853,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 	}
 
 	/* Ok, it's mapped. Make sure it's up-to-date */
-	if (PageUptodate(page))
+	if (folio_test_uptodate(folio))
 		set_buffer_uptodate(bh);
 
 	if (!buffer_uptodate(bh)) {
@@ -3860,7 +3863,8 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 		if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
 			/* We expect the key to be set. */
 			BUG_ON(!fscrypt_has_encryption_key(inode));
-			err = fscrypt_decrypt_pagecache_blocks(page, blocksize,
+			err = fscrypt_decrypt_pagecache_blocks(&folio->page,
+							       blocksize,
 							       bh_offset(bh));
 			if (err) {
 				clear_buffer_uptodate(bh);
@@ -3875,7 +3879,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 		if (err)
 			goto unlock;
 	}
-	zero_user(page, offset, length);
+	folio_zero_range(folio, offset, length);
 	BUFFER_TRACE(bh, "zeroed end of block");
 
 	if (ext4_should_journal_data(inode)) {
@@ -3889,8 +3893,8 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 	}
 
 unlock:
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 	return err;
 }
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH 21/31] ext4: Convert __ext4_journalled_writepage() to take a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (19 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 20/31] ext4: Convert __ext4_block_zero_page_range() " Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-03-14 22:47   ` Theodore Ts'o
  2023-01-26 20:24 ` [PATCH 22/31] ext4: Convert ext4_page_nomap_can_writeout() " Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Use the folio APIs throughout and remove a PAGE_SIZE assumption.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 46 +++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 727aa2e51a9d..9b2c21d0e1f3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -136,7 +136,7 @@ static inline int ext4_begin_ordered_truncate(struct inode *inode,
 						   new_size);
 }
 
-static int __ext4_journalled_writepage(struct page *page, unsigned int len);
+static int __ext4_journalled_writepage(struct folio *folio, unsigned int len);
 static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
 				  int pextents);
 
@@ -1891,10 +1891,10 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 	return 0;
 }
 
-static int __ext4_journalled_writepage(struct page *page,
+static int __ext4_journalled_writepage(struct folio *folio,
 				       unsigned int len)
 {
-	struct address_space *mapping = page->mapping;
+	struct address_space *mapping = folio->mapping;
 	struct inode *inode = mapping->host;
 	handle_t *handle = NULL;
 	int ret = 0, err = 0;
@@ -1902,37 +1902,38 @@ static int __ext4_journalled_writepage(struct page *page,
 	struct buffer_head *inode_bh = NULL;
 	loff_t size;
 
-	ClearPageChecked(page);
+	folio_clear_checked(folio);
 
 	if (inline_data) {
-		BUG_ON(page->index != 0);
+		BUG_ON(folio->index != 0);
 		BUG_ON(len > ext4_get_max_inline_size(inode));
-		inode_bh = ext4_journalled_write_inline_data(inode, len, page);
+		inode_bh = ext4_journalled_write_inline_data(inode, len,
+							     &folio->page);
 		if (inode_bh == NULL)
 			goto out;
 	}
 	/*
-	 * We need to release the page lock before we start the
-	 * journal, so grab a reference so the page won't disappear
+	 * We need to release the folio lock before we start the
+	 * journal, so grab a reference so the folio won't disappear
 	 * out from under us.
 	 */
-	get_page(page);
-	unlock_page(page);
+	folio_get(folio);
+	folio_unlock(folio);
 
 	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE,
 				    ext4_writepage_trans_blocks(inode));
 	if (IS_ERR(handle)) {
 		ret = PTR_ERR(handle);
-		put_page(page);
+		folio_put(folio);
 		goto out_no_pagelock;
 	}
 	BUG_ON(!ext4_handle_valid(handle));
 
-	lock_page(page);
-	put_page(page);
+	folio_lock(folio);
+	folio_put(folio);
 	size = i_size_read(inode);
-	if (page->mapping != mapping || page_offset(page) > size) {
-		/* The page got truncated from under us */
+	if (folio->mapping != mapping || folio_pos(folio) > size) {
+		/* The folio got truncated from under us */
 		ext4_journal_stop(handle);
 		ret = 0;
 		goto out;
@@ -1941,12 +1942,11 @@ static int __ext4_journalled_writepage(struct page *page,
 	if (inline_data) {
 		ret = ext4_mark_inode_dirty(handle, inode);
 	} else {
-		struct buffer_head *page_bufs = page_buffers(page);
+		struct buffer_head *page_bufs = folio_buffers(folio);
 
-		if (page->index == size >> PAGE_SHIFT)
-			len = size & ~PAGE_MASK;
-		else
-			len = PAGE_SIZE;
+		len = folio_size(folio);
+		if (folio_pos(folio) + len > size)
+			len = size - folio_pos(folio);
 
 		ret = ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
 					     NULL, do_journal_get_write_access);
@@ -1956,7 +1956,7 @@ static int __ext4_journalled_writepage(struct page *page,
 	}
 	if (ret == 0)
 		ret = err;
-	err = ext4_jbd2_inode_add_write(handle, inode, page_offset(page), len);
+	err = ext4_jbd2_inode_add_write(handle, inode, folio_pos(folio), len);
 	if (ret == 0)
 		ret = err;
 	EXT4_I(inode)->i_datasync_tid = handle->h_transaction->t_tid;
@@ -1966,7 +1966,7 @@ static int __ext4_journalled_writepage(struct page *page,
 
 	ext4_set_inode_state(inode, EXT4_STATE_JDATA);
 out:
-	unlock_page(page);
+	folio_unlock(folio);
 out_no_pagelock:
 	brelse(inode_bh);
 	return ret;
@@ -2086,7 +2086,7 @@ static int ext4_writepage(struct page *page,
 		 * It's mmapped pagecache.  Add buffers and journal it.  There
 		 * doesn't seem much point in redirtying the folio here.
 		 */
-		return __ext4_journalled_writepage(page, len);
+		return __ext4_journalled_writepage(folio, len);
 
 	ext4_io_submit_init(&io_submit, wbc);
 	io_submit.io_end = ext4_init_io_end(inode, GFP_NOFS);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread
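
The "PAGE_SIZE assumption" this patch removes is the old length computation, which special-cased the page containing EOF with `size & ~PAGE_MASK`. The replacement starts from folio_size() and trims whatever hangs past i_size, which works for any folio size. A sketch of the new arithmetic (`journal_len` is a hypothetical name; in the patch this is open-coded):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the new length computation in __ext4_journalled_writepage():
 * take the whole folio, then trim the tail that lies beyond i_size.
 * No PAGE_SIZE appears, so large folios would work unchanged. */
static uint64_t journal_len(uint64_t folio_pos, uint64_t folio_size,
			    uint64_t i_size)
{
	uint64_t len = folio_size;

	if (folio_pos + len > i_size)
		len = i_size - folio_pos;
	return len;
}
```

A folio fully below EOF keeps its full size; the folio straddling EOF is trimmed to the valid bytes, whether it is 4 KiB or larger.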

* [PATCH 22/31] ext4: Convert ext4_page_nomap_can_writeout() to take a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (20 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 21/31] ext4: Convert __ext4_journalled_writepage() to take " Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-03-14 22:50   ` Theodore Ts'o
  2023-01-26 20:24 ` [PATCH 23/31] ext4: Use a folio in ext4_da_write_begin() Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Its one caller is already using a folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9b2c21d0e1f3..e7e8f2946012 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2563,11 +2563,11 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode)
 }
 
 /* Return true if the page needs to be written as part of transaction commit */
-static bool ext4_page_nomap_can_writeout(struct page *page)
+static bool ext4_page_nomap_can_writeout(struct folio *folio)
 {
 	struct buffer_head *bh, *head;
 
-	bh = head = page_buffers(page);
+	bh = head = folio_buffers(folio);
 	do {
 		if (buffer_dirty(bh) && buffer_mapped(bh) && !buffer_delay(bh))
 			return true;
@@ -2683,7 +2683,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			 * modify metadata is simple. Just submit the page.
 			 */
 			if (!mpd->can_map) {
-				if (ext4_page_nomap_can_writeout(&folio->page)) {
+				if (ext4_page_nomap_can_writeout(folio)) {
 					err = mpage_submit_folio(mpd, folio);
 					if (err < 0)
 						goto out;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread
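
The traversal ext4_page_nomap_can_writeout() performs relies on buffer heads attached to a folio forming a circular singly linked list: a do/while starting at folio_buffers() visits each buffer exactly once and stops when it wraps back to the head. A sketch with stand-in types (the real `struct buffer_head` and its flag tests are in the kernel; the condensed predicate field here is an assumption of this sketch):

```c
#include <assert.h>
#include <stddef.h>

struct buffer_head {
	struct buffer_head *b_this_page;	/* next buffer; list is circular */
	int dirty_mapped_not_delay;		/* stands in for the three flag tests */
};

/* Walk the ring once; return 1 if any buffer needs writeout. */
static int ring_needs_writeout(struct buffer_head *head)
{
	struct buffer_head *bh = head;

	do {
		if (bh->dirty_mapped_not_delay)
			return 1;
		bh = bh->b_this_page;
	} while (bh != head);
	return 0;
}
```

The conversion changes only where the head comes from (folio_buffers() instead of page_buffers()); the ring walk itself is identical.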

* [PATCH 23/31] ext4: Use a folio in ext4_da_write_begin()
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (21 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 22/31] ext4: Convert ext4_page_nomap_can_writeout() " Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-01-26 20:24 ` [PATCH 24/31] ext4: Convert ext4_mpage_readpages() to work on folios Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Remove a few calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index e7e8f2946012..8929add6808a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3046,7 +3046,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 			       struct page **pagep, void **fsdata)
 {
 	int ret, retries = 0;
-	struct page *page;
+	struct folio *folio;
 	pgoff_t index;
 	struct inode *inode = mapping->host;
 
@@ -3073,22 +3073,23 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 	}
 
 retry:
-	page = grab_cache_page_write_begin(mapping, index);
-	if (!page)
+	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
+			mapping_gfp_mask(mapping));
+	if (!folio)
 		return -ENOMEM;
 
-	/* In case writeback began while the page was unlocked */
-	wait_for_stable_page(page);
+	/* In case writeback began while the folio was unlocked */
+	folio_wait_stable(folio);
 
 #ifdef CONFIG_FS_ENCRYPTION
-	ret = ext4_block_write_begin(page, pos, len,
+	ret = ext4_block_write_begin(&folio->page, pos, len,
 				     ext4_da_get_block_prep);
 #else
-	ret = __block_write_begin(page, pos, len, ext4_da_get_block_prep);
+	ret = __block_write_begin(&folio->page, pos, len, ext4_da_get_block_prep);
 #endif
 	if (ret < 0) {
-		unlock_page(page);
-		put_page(page);
+		folio_unlock(folio);
+		folio_put(folio);
 		/*
 		 * block_write_begin may have instantiated a few blocks
 		 * outside i_size.  Trim these off again. Don't need
@@ -3103,7 +3104,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 		return ret;
 	}
 
-	*pagep = page;
+	*pagep = &folio->page;
 	return ret;
 }
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH 24/31] ext4: Convert ext4_mpage_readpages() to work on folios
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (22 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 23/31] ext4: Use a folio in ext4_da_write_begin() Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-01-27  4:15   ` Eric Biggers
  2023-01-26 20:24 ` [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

This definitely doesn't include support for large folios; there
are all kinds of assumptions about the number of buffers attached
to a folio.  But it does remove several calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/ext4.h     |  2 +-
 fs/ext4/inode.c    |  7 +++---
 fs/ext4/readpage.c | 58 +++++++++++++++++++++-------------------------
 3 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index d2998800855c..faa54035dbc8 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3646,7 +3646,7 @@ static inline void ext4_set_de_type(struct super_block *sb,
 
 /* readpages.c */
 extern int ext4_mpage_readpages(struct inode *inode,
-		struct readahead_control *rac, struct page *page);
+		struct readahead_control *rac, struct folio *folio);
 extern int __init ext4_init_post_read_processing(void);
 extern void ext4_exit_post_read_processing(void);
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8929add6808a..dbfc0670de75 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3299,17 +3299,16 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
 
 static int ext4_read_folio(struct file *file, struct folio *folio)
 {
-	struct page *page = &folio->page;
 	int ret = -EAGAIN;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 
-	trace_ext4_readpage(page);
+	trace_ext4_readpage(&folio->page);
 
 	if (ext4_has_inline_data(inode))
 		ret = ext4_readpage_inline(inode, folio);
 
 	if (ret == -EAGAIN)
-		return ext4_mpage_readpages(inode, NULL, page);
+		return ext4_mpage_readpages(inode, NULL, folio);
 
 	return ret;
 }
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index c61dc8a7c014..8092d2ace75e 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -218,7 +218,7 @@ static inline loff_t ext4_readpage_limit(struct inode *inode)
 }
 
 int ext4_mpage_readpages(struct inode *inode,
-		struct readahead_control *rac, struct page *page)
+		struct readahead_control *rac, struct folio *folio)
 {
 	struct bio *bio = NULL;
 	sector_t last_block_in_bio = 0;
@@ -247,16 +247,15 @@ int ext4_mpage_readpages(struct inode *inode,
 		int fully_mapped = 1;
 		unsigned first_hole = blocks_per_page;
 
-		if (rac) {
-			page = readahead_page(rac);
-			prefetchw(&page->flags);
-		}
+		if (rac)
+			folio = readahead_folio(rac);
+		prefetchw(&folio->flags);
 
-		if (page_has_buffers(page))
+		if (folio_buffers(folio))
 			goto confused;
 
 		block_in_file = next_block =
-			(sector_t)page->index << (PAGE_SHIFT - blkbits);
+			(sector_t)folio->index << (PAGE_SHIFT - blkbits);
 		last_block = block_in_file + nr_pages * blocks_per_page;
 		last_block_in_file = (ext4_readpage_limit(inode) +
 				      blocksize - 1) >> blkbits;
@@ -290,7 +289,7 @@ int ext4_mpage_readpages(struct inode *inode,
 
 		/*
 		 * Then do more ext4_map_blocks() calls until we are
-		 * done with this page.
+		 * done with this folio.
 		 */
 		while (page_block < blocks_per_page) {
 			if (block_in_file < last_block) {
@@ -299,11 +298,11 @@ int ext4_mpage_readpages(struct inode *inode,
 
 				if (ext4_map_blocks(NULL, inode, &map, 0) < 0) {
 				set_error_page:
-					SetPageError(page);
-					zero_user_segment(page, 0,
-							  PAGE_SIZE);
-					unlock_page(page);
-					goto next_page;
+					folio_set_error(folio);
+					folio_zero_segment(folio, 0,
+							  folio_size(folio));
+					folio_unlock(folio);
+					continue;
 				}
 			}
 			if ((map.m_flags & EXT4_MAP_MAPPED) == 0) {
@@ -333,22 +332,22 @@ int ext4_mpage_readpages(struct inode *inode,
 			}
 		}
 		if (first_hole != blocks_per_page) {
-			zero_user_segment(page, first_hole << blkbits,
-					  PAGE_SIZE);
+			folio_zero_segment(folio, first_hole << blkbits,
+					  folio_size(folio));
 			if (first_hole == 0) {
-				if (ext4_need_verity(inode, page->index) &&
-				    !fsverity_verify_page(page))
+				if (ext4_need_verity(inode, folio->index) &&
+				    !fsverity_verify_page(&folio->page))
 					goto set_error_page;
-				SetPageUptodate(page);
-				unlock_page(page);
-				goto next_page;
+				folio_mark_uptodate(folio);
+				folio_unlock(folio);
+				continue;
 			}
 		} else if (fully_mapped) {
-			SetPageMappedToDisk(page);
+			folio_set_mappedtodisk(folio);
 		}
 
 		/*
-		 * This page will go to BIO.  Do we need to send this
+		 * This folio will go to BIO.  Do we need to send this
 		 * BIO off first?
 		 */
 		if (bio && (last_block_in_bio != blocks[0] - 1 ||
@@ -366,7 +365,7 @@ int ext4_mpage_readpages(struct inode *inode,
 					REQ_OP_READ, GFP_KERNEL);
 			fscrypt_set_bio_crypt_ctx(bio, inode, next_block,
 						  GFP_KERNEL);
-			ext4_set_bio_post_read_ctx(bio, inode, page->index);
+			ext4_set_bio_post_read_ctx(bio, inode, folio->index);
 			bio->bi_iter.bi_sector = blocks[0] << (blkbits - 9);
 			bio->bi_end_io = mpage_end_io;
 			if (rac)
@@ -374,7 +373,7 @@ int ext4_mpage_readpages(struct inode *inode,
 		}
 
 		length = first_hole << blkbits;
-		if (bio_add_page(bio, page, length, 0) < length)
+		if (!bio_add_folio(bio, folio, length, 0))
 			goto submit_and_realloc;
 
 		if (((map.m_flags & EXT4_MAP_BOUNDARY) &&
@@ -384,19 +383,16 @@ int ext4_mpage_readpages(struct inode *inode,
 			bio = NULL;
 		} else
 			last_block_in_bio = blocks[blocks_per_page - 1];
-		goto next_page;
+		continue;
 	confused:
 		if (bio) {
 			submit_bio(bio);
 			bio = NULL;
 		}
-		if (!PageUptodate(page))
-			block_read_full_folio(page_folio(page), ext4_get_block);
+		if (!folio_test_uptodate(folio))
+			block_read_full_folio(folio, ext4_get_block);
 		else
-			unlock_page(page);
-	next_page:
-		if (rac)
-			put_page(page);
+			folio_unlock(folio);
 	}
 	if (bio)
 		submit_bio(bio);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread
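
The index math in this conversion is unchanged from the page version: the first file-relative block of a folio is its page-cache index scaled by blocks-per-page, where blkbits is log2 of the filesystem block size. A sketch, assuming 4 KiB pages (`first_block_of` is an illustrative name; the patch open-codes this as `(sector_t)folio->index << (PAGE_SHIFT - blkbits)`):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumed: 4 KiB pages */

/* First file-relative block covered by the folio at a given page-cache
 * index.  With 4 KiB blocks (blkbits == 12) there is one block per page;
 * with 1 KiB blocks (blkbits == 10) there are four. */
static uint64_t first_block_of(uint64_t index, unsigned blkbits)
{
	return index << (PAGE_SHIFT - blkbits);
}
```

The remaining large-folio work the commit message mentions would be in the per-folio bookkeeping (blocks_per_page, the blocks[] array), not in this scaling.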

* [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (23 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 24/31] ext4: Convert ext4_mpage_readpages() to work on folios Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-03-06  6:51   ` Ritesh Harjani
  2023-01-26 20:24 ` [PATCH 26/31] ext4: Convert ext4_writepage() " Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  31 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

All the callers now have a folio, so pass that in and operate on folios.
Removes four calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 41 +++++++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index dbfc0670de75..507c7f88d737 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1055,12 +1055,12 @@ int do_journal_get_write_access(handle_t *handle, struct inode *inode,
 }
 
 #ifdef CONFIG_FS_ENCRYPTION
-static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
+static int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
 				  get_block_t *get_block)
 {
 	unsigned from = pos & (PAGE_SIZE - 1);
 	unsigned to = from + len;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 	unsigned block_start, block_end;
 	sector_t block;
 	int err = 0;
@@ -1070,22 +1070,24 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
 	int nr_wait = 0;
 	int i;
 
-	BUG_ON(!PageLocked(page));
+	BUG_ON(!folio_test_locked(folio));
 	BUG_ON(from > PAGE_SIZE);
 	BUG_ON(to > PAGE_SIZE);
 	BUG_ON(from > to);
 
-	if (!page_has_buffers(page))
-		create_empty_buffers(page, blocksize, 0);
-	head = page_buffers(page);
+	head = folio_buffers(folio);
+	if (!head) {
+		create_empty_buffers(&folio->page, blocksize, 0);
+		head = folio_buffers(folio);
+	}
 	bbits = ilog2(blocksize);
-	block = (sector_t)page->index << (PAGE_SHIFT - bbits);
+	block = (sector_t)folio->index << (PAGE_SHIFT - bbits);
 
 	for (bh = head, block_start = 0; bh != head || !block_start;
 	    block++, block_start = block_end, bh = bh->b_this_page) {
 		block_end = block_start + blocksize;
 		if (block_end <= from || block_start >= to) {
-			if (PageUptodate(page)) {
+			if (folio_test_uptodate(folio)) {
 				set_buffer_uptodate(bh);
 			}
 			continue;
@@ -1098,19 +1100,20 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
 			if (err)
 				break;
 			if (buffer_new(bh)) {
-				if (PageUptodate(page)) {
+				if (folio_test_uptodate(folio)) {
 					clear_buffer_new(bh);
 					set_buffer_uptodate(bh);
 					mark_buffer_dirty(bh);
 					continue;
 				}
 				if (block_end > to || block_start < from)
-					zero_user_segments(page, to, block_end,
-							   block_start, from);
+					folio_zero_segments(folio, to,
+							    block_end,
+							    block_start, from);
 				continue;
 			}
 		}
-		if (PageUptodate(page)) {
+		if (folio_test_uptodate(folio)) {
 			set_buffer_uptodate(bh);
 			continue;
 		}
@@ -1130,13 +1133,13 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
 			err = -EIO;
 	}
 	if (unlikely(err)) {
-		page_zero_new_buffers(page, from, to);
+		page_zero_new_buffers(&folio->page, from, to);
 	} else if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
 		for (i = 0; i < nr_wait; i++) {
 			int err2;
 
-			err2 = fscrypt_decrypt_pagecache_blocks(page, blocksize,
-								bh_offset(wait[i]));
+			err2 = fscrypt_decrypt_pagecache_blocks(&folio->page,
+						blocksize, bh_offset(wait[i]));
 			if (err2) {
 				clear_buffer_uptodate(wait[i]);
 				err = err2;
@@ -1223,11 +1226,10 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
 
 #ifdef CONFIG_FS_ENCRYPTION
 	if (ext4_should_dioread_nolock(inode))
-		ret = ext4_block_write_begin(&folio->page, pos, len,
+		ret = ext4_block_write_begin(folio, pos, len,
 					     ext4_get_block_unwritten);
 	else
-		ret = ext4_block_write_begin(&folio->page, pos, len,
-					     ext4_get_block);
+		ret = ext4_block_write_begin(folio, pos, len, ext4_get_block);
 #else
 	if (ext4_should_dioread_nolock(inode))
 		ret = __block_write_begin(&folio->page, pos, len,
@@ -3082,8 +3084,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 	folio_wait_stable(folio);
 
 #ifdef CONFIG_FS_ENCRYPTION
-	ret = ext4_block_write_begin(&folio->page, pos, len,
-				     ext4_da_get_block_prep);
+	ret = ext4_block_write_begin(folio, pos, len, ext4_da_get_block_prep);
 #else
 	ret = __block_write_begin(&folio->page, pos, len, ext4_da_get_block_prep);
 #endif
-- 
2.35.1



* [PATCH 26/31] ext4: Convert ext4_writepage() to take a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (24 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-01-26 20:24 ` [PATCH 27/31] ext4: Use a folio in ext4_page_mkwrite() Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Its one caller already has a folio.  Saves a call to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 507c7f88d737..9bcf7459a0c0 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2015,10 +2015,9 @@ static int __ext4_journalled_writepage(struct folio *folio,
  * But since we don't do any block allocation we should not deadlock.
  * Page also have the dirty flag cleared so we don't get recurive page_lock.
  */
-static int ext4_writepage(struct page *page,
+static int ext4_writepage(struct folio *folio,
 			  struct writeback_control *wbc)
 {
-	struct folio *folio = page_folio(page);
 	int ret = 0;
 	loff_t size;
 	size_t len;
@@ -2032,7 +2031,7 @@ static int ext4_writepage(struct page *page,
 		return -EIO;
 	}
 
-	trace_ext4_writepage(page);
+	trace_ext4_writepage(&folio->page);
 	size = i_size_read(inode);
 	len = folio_size(folio);
 	if (folio_pos(folio) + len > size &&
@@ -2719,7 +2718,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 static int ext4_writepage_cb(struct folio *folio, struct writeback_control *wbc,
 			     void *data)
 {
-	return ext4_writepage(&folio->page, wbc);
+	return ext4_writepage(folio, wbc);
 }
 
 static int ext4_do_writepages(struct mpage_da_data *mpd)
-- 
2.35.1



* [PATCH 27/31] ext4: Use a folio in ext4_page_mkwrite()
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (25 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 26/31] ext4: Convert ext4_writepage() " Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-01-26 20:24 ` [PATCH 28/31] ext4: Use a folio iterator in __read_end_io() Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Convert to the folio API, saving a few calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/inode.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9bcf7459a0c0..dcb99121f1c1 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -6215,7 +6215,7 @@ static int ext4_bh_unmapped(handle_t *handle, struct inode *inode,
 vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
-	struct page *page = vmf->page;
+	struct folio *folio = page_folio(vmf->page);
 	loff_t size;
 	unsigned long len;
 	int err;
@@ -6259,19 +6259,18 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 		goto out_ret;
 	}
 
-	lock_page(page);
+	folio_lock(folio);
 	size = i_size_read(inode);
 	/* Page got truncated from under us? */
-	if (page->mapping != mapping || page_offset(page) > size) {
-		unlock_page(page);
+	if (folio->mapping != mapping || folio_pos(folio) > size) {
+		folio_unlock(folio);
 		ret = VM_FAULT_NOPAGE;
 		goto out;
 	}
 
-	if (page->index == size >> PAGE_SHIFT)
-		len = size & ~PAGE_MASK;
-	else
-		len = PAGE_SIZE;
+	len = folio_size(folio);
+	if (folio_pos(folio) + len > size)
+		len = size - folio_pos(folio);
 	/*
 	 * Return if we have all the buffers mapped. This avoids the need to do
 	 * journal_start/journal_stop which can block and take a long time
@@ -6279,17 +6278,17 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 	 * This cannot be done for data journalling, as we have to add the
 	 * inode to the transaction's list to writeprotect pages on commit.
 	 */
-	if (page_has_buffers(page)) {
-		if (!ext4_walk_page_buffers(NULL, inode, page_buffers(page),
+	if (folio_buffers(folio)) {
+		if (!ext4_walk_page_buffers(NULL, inode, folio_buffers(folio),
 					    0, len, NULL,
 					    ext4_bh_unmapped)) {
 			/* Wait so that we don't change page under IO */
-			wait_for_stable_page(page);
+			folio_wait_stable(folio);
 			ret = VM_FAULT_LOCKED;
 			goto out;
 		}
 	}
-	unlock_page(page);
+	folio_unlock(folio);
 	/* OK, we need to fill the hole... */
 	if (ext4_should_dioread_nolock(inode))
 		get_block = ext4_get_block_unwritten;
@@ -6310,36 +6309,35 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 	if (!ext4_should_journal_data(inode)) {
 		err = block_page_mkwrite(vma, vmf, get_block);
 	} else {
-		lock_page(page);
+		folio_lock(folio);
 		size = i_size_read(inode);
 		/* Page got truncated from under us? */
-		if (page->mapping != mapping || page_offset(page) > size) {
+		if (folio->mapping != mapping || folio_pos(folio) > size) {
 			ret = VM_FAULT_NOPAGE;
 			goto out_error;
 		}
 
-		if (page->index == size >> PAGE_SHIFT)
-			len = size & ~PAGE_MASK;
-		else
-			len = PAGE_SIZE;
+		len = folio_size(folio);
+		if (folio_pos(folio) + len > size)
+			len = size - folio_pos(folio);
 
-		err = __block_write_begin(page, 0, len, ext4_get_block);
+		err = __block_write_begin(&folio->page, 0, len, ext4_get_block);
 		if (!err) {
 			ret = VM_FAULT_SIGBUS;
 			if (ext4_walk_page_buffers(handle, inode,
-					page_buffers(page), 0, len, NULL,
+					folio_buffers(folio), 0, len, NULL,
 					do_journal_get_write_access))
 				goto out_error;
 			if (ext4_walk_page_buffers(handle, inode,
-					page_buffers(page), 0, len, NULL,
+					folio_buffers(folio), 0, len, NULL,
 					write_end_fn))
 				goto out_error;
 			if (ext4_jbd2_inode_add_write(handle, inode,
-						      page_offset(page), len))
+						      folio_pos(folio), len))
 				goto out_error;
 			ext4_set_inode_state(inode, EXT4_STATE_JDATA);
 		} else {
-			unlock_page(page);
+			folio_unlock(folio);
 		}
 	}
 	ext4_journal_stop(handle);
@@ -6352,7 +6350,7 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 	sb_end_pagefault(inode->i_sb);
 	return ret;
 out_error:
-	unlock_page(page);
+	folio_unlock(folio);
 	ext4_journal_stop(handle);
 	goto out;
 }
-- 
2.35.1



* [PATCH 28/31] ext4: Use a folio iterator in __read_end_io()
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (26 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 27/31] ext4: Use a folio in ext4_page_mkwrite() Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-01-26 20:24 ` [PATCH 29/31] ext4: Convert mext_page_mkuptodate() to take a folio Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Iterate once per folio, not once per page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/readpage.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index 8092d2ace75e..442f6c507016 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -68,18 +68,16 @@ struct bio_post_read_ctx {
 
 static void __read_end_io(struct bio *bio)
 {
-	struct page *page;
-	struct bio_vec *bv;
-	struct bvec_iter_all iter_all;
+	struct folio_iter fi;
 
-	bio_for_each_segment_all(bv, bio, iter_all) {
-		page = bv->bv_page;
+	bio_for_each_folio_all(fi, bio) {
+		struct folio *folio = fi.folio;
 
 		if (bio->bi_status)
-			ClearPageUptodate(page);
+			folio_clear_uptodate(folio);
 		else
-			SetPageUptodate(page);
-		unlock_page(page);
+			folio_mark_uptodate(folio);
+		folio_unlock(folio);
 	}
 	if (bio->bi_private)
 		mempool_free(bio->bi_private, bio_post_read_ctx_pool);
-- 
2.35.1



* [PATCH 29/31] ext4: Convert mext_page_mkuptodate() to take a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (27 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 28/31] ext4: Use a folio iterator in __read_end_io() Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-01-26 20:24 ` [PATCH 30/31] ext4: Convert pagecache_read() to use " Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Use a folio throughout.  Does not support large folios due to
an array sized for MAX_BUF_PER_PAGE, but it does remove a few
calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/move_extent.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 0cb361f0a4fe..e509c22a21ed 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -168,25 +168,27 @@ mext_folio_double_lock(struct inode *inode1, struct inode *inode2,
 
 /* Force page buffers uptodate w/o dropping page's lock */
 static int
-mext_page_mkuptodate(struct page *page, unsigned from, unsigned to)
+mext_page_mkuptodate(struct folio *folio, unsigned from, unsigned to)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 	sector_t block;
 	struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
 	unsigned int blocksize, block_start, block_end;
 	int i, err,  nr = 0, partial = 0;
-	BUG_ON(!PageLocked(page));
-	BUG_ON(PageWriteback(page));
+	BUG_ON(!folio_test_locked(folio));
+	BUG_ON(folio_test_writeback(folio));
 
-	if (PageUptodate(page))
+	if (folio_test_uptodate(folio))
 		return 0;
 
 	blocksize = i_blocksize(inode);
-	if (!page_has_buffers(page))
-		create_empty_buffers(page, blocksize, 0);
+	head = folio_buffers(folio);
+	if (!head) {
+		create_empty_buffers(&folio->page, blocksize, 0);
+		head = folio_buffers(folio);
+	}
 
-	head = page_buffers(page);
-	block = (sector_t)page->index << (PAGE_SHIFT - inode->i_blkbits);
+	block = (sector_t)folio->index << (PAGE_SHIFT - inode->i_blkbits);
 	for (bh = head, block_start = 0; bh != head || !block_start;
 	     block++, block_start = block_end, bh = bh->b_this_page) {
 		block_end = block_start + blocksize;
@@ -200,11 +202,11 @@ mext_page_mkuptodate(struct page *page, unsigned from, unsigned to)
 		if (!buffer_mapped(bh)) {
 			err = ext4_get_block(inode, block, bh, 0);
 			if (err) {
-				SetPageError(page);
+				folio_set_error(folio);
 				return err;
 			}
 			if (!buffer_mapped(bh)) {
-				zero_user(page, block_start, blocksize);
+				folio_zero_range(folio, block_start, blocksize);
 				set_buffer_uptodate(bh);
 				continue;
 			}
@@ -226,7 +228,7 @@ mext_page_mkuptodate(struct page *page, unsigned from, unsigned to)
 	}
 out:
 	if (!partial)
-		SetPageUptodate(page);
+		folio_mark_uptodate(folio);
 	return 0;
 }
 
@@ -354,7 +356,7 @@ move_extent_per_page(struct file *o_filp, struct inode *donor_inode,
 		goto unlock_folios;
 	}
 data_copy:
-	*err = mext_page_mkuptodate(&folio[0]->page, from, from + replaced_size);
+	*err = mext_page_mkuptodate(folio[0], from, from + replaced_size);
 	if (*err)
 		goto unlock_folios;
 
-- 
2.35.1



* [PATCH 30/31] ext4: Convert pagecache_read() to use a folio
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (28 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 29/31] ext4: Convert mext_page_mkuptodate() to take a folio Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-01-26 20:24 ` [PATCH 31/31] ext4: Use a folio in ext4_read_merkle_tree_page Matthew Wilcox (Oracle)
  2023-03-15 17:57 ` [PATCH 00/31] Convert most of ext4 to folios Theodore Ts'o
  31 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Use the folio API and support folios of arbitrary sizes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/verity.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index e4da1704438e..afe847c967a4 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -42,18 +42,16 @@ static int pagecache_read(struct inode *inode, void *buf, size_t count,
 			  loff_t pos)
 {
 	while (count) {
-		size_t n = min_t(size_t, count,
-				 PAGE_SIZE - offset_in_page(pos));
-		struct page *page;
+		struct folio *folio;
+		size_t n;
 
-		page = read_mapping_page(inode->i_mapping, pos >> PAGE_SHIFT,
+		folio = read_mapping_folio(inode->i_mapping, pos >> PAGE_SHIFT,
 					 NULL);
-		if (IS_ERR(page))
-			return PTR_ERR(page);
-
-		memcpy_from_page(buf, page, offset_in_page(pos), n);
+		if (IS_ERR(folio))
+			return PTR_ERR(folio);
 
-		put_page(page);
+		n = memcpy_from_file_folio(buf, folio, pos, count);
+		folio_put(folio);
 
 		buf += n;
 		pos += n;
-- 
2.35.1



* [PATCH 31/31] ext4: Use a folio in ext4_read_merkle_tree_page
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (29 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 30/31] ext4: Convert pagecache_read() to use " Matthew Wilcox (Oracle)
@ 2023-01-26 20:24 ` Matthew Wilcox (Oracle)
  2023-03-15 17:57 ` [PATCH 00/31] Convert most of ext4 to folios Theodore Ts'o
  31 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-26 20:24 UTC (permalink / raw)
  To: Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

This is an implementation of fsverity_operations read_merkle_tree_page,
so it must still return the precise page asked for, but we can use the
folio API to reduce the number of conversions between folios & pages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/verity.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index afe847c967a4..3b01247066dd 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -361,21 +361,21 @@ static struct page *ext4_read_merkle_tree_page(struct inode *inode,
 					       pgoff_t index,
 					       unsigned long num_ra_pages)
 {
-	struct page *page;
+	struct folio *folio;
 
 	index += ext4_verity_metadata_pos(inode) >> PAGE_SHIFT;
 
-	page = find_get_page_flags(inode->i_mapping, index, FGP_ACCESSED);
-	if (!page || !PageUptodate(page)) {
+	folio = __filemap_get_folio(inode->i_mapping, index, FGP_ACCESSED, 0);
+	if (!folio || !folio_test_uptodate(folio)) {
 		DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
 
-		if (page)
-			put_page(page);
+		if (folio)
+			folio_put(folio);
 		else if (num_ra_pages > 1)
 			page_cache_ra_unbounded(&ractl, num_ra_pages, 0);
-		page = read_mapping_page(inode->i_mapping, index, NULL);
+		folio = read_mapping_folio(inode->i_mapping, index, NULL);
 	}
-	return page;
+	return folio_file_page(folio, index);
 }
 
 static int ext4_write_merkle_tree_block(struct inode *inode, const void *buf,
-- 
2.35.1



* Re: [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-01-26 20:23 ` [PATCH 02/31] fscrypt: Add some folio helper functions Matthew Wilcox (Oracle)
@ 2023-01-27  3:02   ` Eric Biggers
  2023-01-27 16:13     ` Matthew Wilcox
  2023-03-05  9:06   ` Ritesh Harjani
  1 sibling, 1 reply; 83+ messages in thread
From: Eric Biggers @ 2023-01-27  3:02 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:46PM +0000, Matthew Wilcox (Oracle) wrote:
> fscrypt_is_bounce_folio() is the equivalent of fscrypt_is_bounce_page()
> and fscrypt_pagecache_folio() is the equivalent of fscrypt_pagecache_page().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/fscrypt.h | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> index 4f5f8a651213..c2c07d36fb3a 100644
> --- a/include/linux/fscrypt.h
> +++ b/include/linux/fscrypt.h
> @@ -273,6 +273,16 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
>  	return (struct page *)page_private(bounce_page);
>  }
>  
> +static inline bool fscrypt_is_bounce_folio(struct folio *folio)
> +{
> +	return folio->mapping == NULL;
> +}
> +
> +static inline struct folio *fscrypt_pagecache_folio(struct folio *bounce_folio)
> +{
> +	return bounce_folio->private;
> +}

ext4_bio_write_folio() is still doing:

	bounce_page = fscrypt_encrypt_pagecache_blocks(&folio->page, ...);

Should it be creating a "bounce folio" instead, or is that not in the scope of
this patchset?

- Eric


* Re: [PATCH 24/31] ext4: Convert ext4_mpage_readpages() to work on folios
  2023-01-26 20:24 ` [PATCH 24/31] ext4: Convert ext4_mpage_readpages() to work on folios Matthew Wilcox (Oracle)
@ 2023-01-27  4:15   ` Eric Biggers
  2023-01-27 16:08     ` Matthew Wilcox
  0 siblings, 1 reply; 83+ messages in thread
From: Eric Biggers @ 2023-01-27  4:15 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:24:08PM +0000, Matthew Wilcox (Oracle) wrote:
>  int ext4_mpage_readpages(struct inode *inode,
> -		struct readahead_control *rac, struct page *page)
> +		struct readahead_control *rac, struct folio *folio)
>  {
>  	struct bio *bio = NULL;
>  	sector_t last_block_in_bio = 0;
> @@ -247,16 +247,15 @@ int ext4_mpage_readpages(struct inode *inode,
>  		int fully_mapped = 1;
>  		unsigned first_hole = blocks_per_page;
>  
> -		if (rac) {
> -			page = readahead_page(rac);
> -			prefetchw(&page->flags);
> -		}
> +		if (rac)
> +			folio = readahead_folio(rac);
> +		prefetchw(&folio->flags);

Unlike readahead_page(), readahead_folio() puts the folio immediately.  Is that
really safe?

> @@ -299,11 +298,11 @@ int ext4_mpage_readpages(struct inode *inode,
>  
>  				if (ext4_map_blocks(NULL, inode, &map, 0) < 0) {
>  				set_error_page:
> -					SetPageError(page);
> -					zero_user_segment(page, 0,
> -							  PAGE_SIZE);
> -					unlock_page(page);
> -					goto next_page;
> +					folio_set_error(folio);
> +					folio_zero_segment(folio, 0,
> +							  folio_size(folio));
> +					folio_unlock(folio);
> +					continue;

This is 'continuing' the inner loop, not the outer loop as it should.

- Eric


* Re: [PATCH 24/31] ext4: Convert ext4_mpage_readpages() to work on folios
  2023-01-27  4:15   ` Eric Biggers
@ 2023-01-27 16:08     ` Matthew Wilcox
  2023-03-05 11:26       ` Ritesh Harjani
  0 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox @ 2023-01-27 16:08 UTC (permalink / raw)
  To: Eric Biggers; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:15:04PM -0800, Eric Biggers wrote:
> On Thu, Jan 26, 2023 at 08:24:08PM +0000, Matthew Wilcox (Oracle) wrote:
> >  int ext4_mpage_readpages(struct inode *inode,
> > -		struct readahead_control *rac, struct page *page)
> > +		struct readahead_control *rac, struct folio *folio)
> >  {
> >  	struct bio *bio = NULL;
> >  	sector_t last_block_in_bio = 0;
> > @@ -247,16 +247,15 @@ int ext4_mpage_readpages(struct inode *inode,
> >  		int fully_mapped = 1;
> >  		unsigned first_hole = blocks_per_page;
> >  
> > -		if (rac) {
> > -			page = readahead_page(rac);
> > -			prefetchw(&page->flags);
> > -		}
> > +		if (rac)
> > +			folio = readahead_folio(rac);
> > +		prefetchw(&folio->flags);
> 
> Unlike readahead_page(), readahead_folio() puts the folio immediately.  Is that
> really safe?

It's safe until we unlock the page.  The page cache holds a refcount,
and truncation has to lock the page before it can remove it from the
page cache.

Putting the refcount in readahead_folio() is a transitional step; once
all filesystems are converted to use readahead_folio(), I'll hoist the
refcount put to the caller.  Having ->readahead() and ->read_folio()
with different rules for who puts the folio is a long-standing mistake.

> > @@ -299,11 +298,11 @@ int ext4_mpage_readpages(struct inode *inode,
> >  
> >  				if (ext4_map_blocks(NULL, inode, &map, 0) < 0) {
> >  				set_error_page:
> > -					SetPageError(page);
> > -					zero_user_segment(page, 0,
> > -							  PAGE_SIZE);
> > -					unlock_page(page);
> > -					goto next_page;
> > +					folio_set_error(folio);
> > +					folio_zero_segment(folio, 0,
> > +							  folio_size(folio));
> > +					folio_unlock(folio);
> > +					continue;
> 
> This is 'continuing' the inner loop, not the outer loop as it should.

Oops.  Will fix.  I didn't get any extra failures from xfstests
with this bug, although I suspect I wasn't testing with block size <
page size, which is probably needed to make a difference.


* Re: [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-01-27  3:02   ` Eric Biggers
@ 2023-01-27 16:13     ` Matthew Wilcox
  2023-01-27 16:21       ` Eric Biggers
  2023-03-14 22:05       ` Theodore Ts'o
  0 siblings, 2 replies; 83+ messages in thread
From: Matthew Wilcox @ 2023-01-27 16:13 UTC (permalink / raw)
  To: Eric Biggers; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 07:02:14PM -0800, Eric Biggers wrote:
> On Thu, Jan 26, 2023 at 08:23:46PM +0000, Matthew Wilcox (Oracle) wrote:
> > fscrypt_is_bounce_folio() is the equivalent of fscrypt_is_bounce_page()
> > and fscrypt_pagecache_folio() is the equivalent of fscrypt_pagecache_page().
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  include/linux/fscrypt.h | 21 +++++++++++++++++++++
> >  1 file changed, 21 insertions(+)
> > 
> > diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> > index 4f5f8a651213..c2c07d36fb3a 100644
> > --- a/include/linux/fscrypt.h
> > +++ b/include/linux/fscrypt.h
> > @@ -273,6 +273,16 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
> >  	return (struct page *)page_private(bounce_page);
> >  }
> >  
> > +static inline bool fscrypt_is_bounce_folio(struct folio *folio)
> > +{
> > +	return folio->mapping == NULL;
> > +}
> > +
> > +static inline struct folio *fscrypt_pagecache_folio(struct folio *bounce_folio)
> > +{
> > +	return bounce_folio->private;
> > +}
> 
> ext4_bio_write_folio() is still doing:
> 
> 	bounce_page = fscrypt_encrypt_pagecache_blocks(&folio->page, ...);
> 
> Should it be creating a "bounce folio" instead, or is that not in the scope of
> this patchset?

It's out of scope for _this_ patchset.  I think it's a patchset that
could come either before or after, and is needed to support large folios
with ext4.  The biggest problem with doing that conversion is that
bounce pages are allocated from a mempool which obviously only allocates
order-0 folios.  I don't know what to do about that.  Have a mempool
for each order of folio that the filesystem supports?  Try to allocate
folios without a mempool and then split the folio if allocation fails?
Have a mempool containing PMD-order pages and split them ourselves if
we need to allocate from the mempool?

Nothing's really standing out to me as the perfect answer.  There are
probably other alternatives.


* Re: [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-01-27 16:13     ` Matthew Wilcox
@ 2023-01-27 16:21       ` Eric Biggers
  2023-01-27 16:37         ` Matthew Wilcox
  2023-03-14 22:05       ` Theodore Ts'o
  1 sibling, 1 reply; 83+ messages in thread
From: Eric Biggers @ 2023-01-27 16:21 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Fri, Jan 27, 2023 at 04:13:37PM +0000, Matthew Wilcox wrote:
> On Thu, Jan 26, 2023 at 07:02:14PM -0800, Eric Biggers wrote:
> > On Thu, Jan 26, 2023 at 08:23:46PM +0000, Matthew Wilcox (Oracle) wrote:
> > > fscrypt_is_bounce_folio() is the equivalent of fscrypt_is_bounce_page()
> > > and fscrypt_pagecache_folio() is the equivalent of fscrypt_pagecache_page().
> > > 
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > ---
> > >  include/linux/fscrypt.h | 21 +++++++++++++++++++++
> > >  1 file changed, 21 insertions(+)
> > > 
> > > diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> > > index 4f5f8a651213..c2c07d36fb3a 100644
> > > --- a/include/linux/fscrypt.h
> > > +++ b/include/linux/fscrypt.h
> > > @@ -273,6 +273,16 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
> > >  	return (struct page *)page_private(bounce_page);
> > >  }
> > >  
> > > +static inline bool fscrypt_is_bounce_folio(struct folio *folio)
> > > +{
> > > +	return folio->mapping == NULL;
> > > +}
> > > +
> > > +static inline struct folio *fscrypt_pagecache_folio(struct folio *bounce_folio)
> > > +{
> > > +	return bounce_folio->private;
> > > +}
> > 
> > ext4_bio_write_folio() is still doing:
> > 
> > 	bounce_page = fscrypt_encrypt_pagecache_blocks(&folio->page, ...);
> > 
> > Should it be creating a "bounce folio" instead, or is that not in the scope of
> > this patchset?
> 
> It's out of scope for _this_ patchset.  I think it's a patchset that
> could come either before or after, and is needed to support large folios
> with ext4.  The biggest problem with doing that conversion is that
> bounce pages are allocated from a mempool which obviously only allocates
> order-0 folios.  I don't know what to do about that.  Have a mempool
> for each order of folio that the filesystem supports?  Try to allocate
> folios without a mempool and then split the folio if allocation fails?
> Have a mempool containing PMD-order pages and split them ourselves if
> we need to allocate from the mempool?
> 
> Nothing's really standing out to me as the perfect answer.  There are
> probably other alternatives.

Would it be possible to keep using bounce *pages* all the time, even when the
pagecache contains large folios?

- Eric


* Re: [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-01-27 16:21       ` Eric Biggers
@ 2023-01-27 16:37         ` Matthew Wilcox
  0 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox @ 2023-01-27 16:37 UTC (permalink / raw)
  To: Eric Biggers; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Fri, Jan 27, 2023 at 08:21:46AM -0800, Eric Biggers wrote:
> On Fri, Jan 27, 2023 at 04:13:37PM +0000, Matthew Wilcox wrote:
> > On Thu, Jan 26, 2023 at 07:02:14PM -0800, Eric Biggers wrote:
> > > On Thu, Jan 26, 2023 at 08:23:46PM +0000, Matthew Wilcox (Oracle) wrote:
> > > > fscrypt_is_bounce_folio() is the equivalent of fscrypt_is_bounce_page()
> > > > and fscrypt_pagecache_folio() is the equivalent of fscrypt_pagecache_page().
> > > > 
> > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > > ---
> > > >  include/linux/fscrypt.h | 21 +++++++++++++++++++++
> > > >  1 file changed, 21 insertions(+)
> > > > 
> > > > diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> > > > index 4f5f8a651213..c2c07d36fb3a 100644
> > > > --- a/include/linux/fscrypt.h
> > > > +++ b/include/linux/fscrypt.h
> > > > @@ -273,6 +273,16 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
> > > >  	return (struct page *)page_private(bounce_page);
> > > >  }
> > > >  
> > > > +static inline bool fscrypt_is_bounce_folio(struct folio *folio)
> > > > +{
> > > > +	return folio->mapping == NULL;
> > > > +}
> > > > +
> > > > +static inline struct folio *fscrypt_pagecache_folio(struct folio *bounce_folio)
> > > > +{
> > > > +	return bounce_folio->private;
> > > > +}
> > > 
> > > ext4_bio_write_folio() is still doing:
> > > 
> > > 	bounce_page = fscrypt_encrypt_pagecache_blocks(&folio->page, ...);
> > > 
> > > Should it be creating a "bounce folio" instead, or is that not in the scope of
> > > this patchset?
> > 
> > It's out of scope for _this_ patchset.  I think it's a patchset that
> > could come either before or after, and is needed to support large folios
> > with ext4.  The biggest problem with doing that conversion is that
> > bounce pages are allocated from a mempool which obviously only allocates
> > order-0 folios.  I don't know what to do about that.  Have a mempool
> > for each order of folio that the filesystem supports?  Try to allocate
> > folios without a mempool and then split the folio if allocation fails?
> > Have a mempool containing PMD-order pages and split them ourselves if
> > we need to allocate from the mempool?
> > 
> > Nothing's really standing out to me as the perfect answer.  There are
> > probably other alternatives.
> 
> Would it be possible to keep using bounce *pages* all the time, even when the
> pagecache contains large folios?

I _think_ so.  Probably the best solution is to attempt to allocate an
order-N folio (with GFP_NOWAIT?) and then fall back to allocating 2^N
pages from the mempool.  It'll require some surgery to ext4_finish_bio()
as well as ext4_bio_write_folio(), fscrypt_encrypt_pagecache_blocks()
and fscrypt_free_bounce_page(), but I think it's doable.  I'll try
to whip something up.
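The fallback strategy described above can be modeled outside the kernel. The following stand-alone C sketch (every name here is an invented stand-in, none of it is a kernel API) shows the shape of "try one order-N allocation that may fail, else take 2^N order-0 chunks from a pool that cannot fail":

```c
#include <stdlib.h>

/*
 * Toy model of the proposed bounce-buffer plan: first attempt a single
 * large (order-N) allocation without blocking, and if that fails, fall
 * back to allocating 2^N small chunks from a guaranteed pool.
 */
#define CHUNK_SIZE 4096	/* models PAGE_SIZE */

/* Models folio_alloc(GFP_NOWAIT, order): may fail without blocking. */
static void *try_alloc_large(unsigned int order, int simulate_failure)
{
	if (simulate_failure)
		return NULL;
	return malloc((size_t)CHUNK_SIZE << order);
}

/* Models mempool_alloc() of one order-0 page: never fails in this toy. */
static void *pool_alloc_chunk(void)
{
	return malloc(CHUNK_SIZE);
}

/*
 * Fills bufs[] and returns the number of separate buffers covering an
 * order-N region: 1 if the large allocation succeeded, 2^N otherwise.
 */
static unsigned int alloc_bounce(unsigned int order, int simulate_failure,
				 void **bufs)
{
	unsigned int i, n = 1u << order;

	bufs[0] = try_alloc_large(order, simulate_failure);
	if (bufs[0])
		return 1;

	for (i = 0; i < n; i++)
		bufs[i] = pool_alloc_chunk();
	return n;
}
```

The completion side (ext4_finish_bio() in the real code) would then need to handle both the one-folio and the many-pages case, which is the "surgery" referred to above.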


* Re: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
  2023-01-26 20:23 ` [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio Matthew Wilcox (Oracle)
@ 2023-01-28 16:53   ` kernel test robot
  2023-01-28 19:07   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 83+ messages in thread
From: kernel test robot @ 2023-01-28 16:53 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: oe-kbuild-all, Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Hi Matthew,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20230127]
[cannot apply to tytso-ext4/dev xfs-linux/for-next linus/master v6.2-rc5 v6.2-rc4 v6.2-rc3 v6.2-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/fs-Add-FGP_WRITEBEGIN/20230128-150212
patch link:    https://lore.kernel.org/r/20230126202415.1682629-4-willy%40infradead.org
patch subject: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
config: loongarch-randconfig-r001-20230123 (https://download.01.org/0day-ci/archive/20230129/202301290044.oKaK49Hs-lkp@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/f6e4c5cfaf2ef7b8ee6c5354bbbd5f1ee758746f
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/fs-Add-FGP_WRITEBEGIN/20230128-150212
        git checkout f6e4c5cfaf2ef7b8ee6c5354bbbd5f1ee758746f
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "bio_add_folio" [fs/ext4/ext4.ko] undefined!

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* Re: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
  2023-01-26 20:23 ` [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio Matthew Wilcox (Oracle)
  2023-01-28 16:53   ` kernel test robot
@ 2023-01-28 19:07   ` kernel test robot
  2023-03-05 11:18   ` Ritesh Harjani
  2023-03-14 22:07   ` Theodore Ts'o
  3 siblings, 0 replies; 83+ messages in thread
From: kernel test robot @ 2023-01-28 19:07 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: llvm, oe-kbuild-all, Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

Hi Matthew,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20230127]
[cannot apply to tytso-ext4/dev xfs-linux/for-next linus/master v6.2-rc5 v6.2-rc4 v6.2-rc3 v6.2-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/fs-Add-FGP_WRITEBEGIN/20230128-150212
patch link:    https://lore.kernel.org/r/20230126202415.1682629-4-willy%40infradead.org
patch subject: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
config: x86_64-randconfig-a013-20230123 (https://download.01.org/0day-ci/archive/20230129/202301290216.BcfP0oGK-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/f6e4c5cfaf2ef7b8ee6c5354bbbd5f1ee758746f
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/fs-Add-FGP_WRITEBEGIN/20230128-150212
        git checkout f6e4c5cfaf2ef7b8ee6c5354bbbd5f1ee758746f
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "bio_add_folio" [fs/ext4/ext4.ko] undefined!

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* Re: [PATCH 01/31] fs: Add FGP_WRITEBEGIN
  2023-01-26 20:23 ` [PATCH 01/31] fs: Add FGP_WRITEBEGIN Matthew Wilcox (Oracle)
@ 2023-03-05  8:53   ` Ritesh Harjani
  2023-03-14 22:00   ` Theodore Ts'o
  1 sibling, 0 replies; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-05  8:53 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> This particular combination of flags is used by most filesystems
> in their ->write_begin method, although it does find use in a
> few other places.  Before folios, it warranted its own function
> (grab_cache_page_write_begin()), but I think that just having specialised
> flags is enough.  It certainly helps the few places that have been
> converted from grab_cache_page_write_begin() to __filemap_get_folio().
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks good to me, with a small comment below.

Please feel free to add -
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

> ---
>  fs/ext4/move_extent.c    | 5 ++---
>  fs/iomap/buffered-io.c   | 2 +-
>  fs/netfs/buffered_read.c | 3 +--
>  include/linux/pagemap.h  | 2 ++
>  mm/folio-compat.c        | 4 +---
>  5 files changed, 7 insertions(+), 9 deletions(-)

After the patch below landed in mainline, we should use the FGP_WRITEBEGIN
flag in fs/nfs/file.c as well.

54d99381b7371d2999566d1fb4ea88d46cf9d865
Author:     Trond Myklebust <trond.myklebust@hammerspace.com>
CommitDate: Tue Feb 14 14:22:32 2023 -0500

NFS: Convert nfs_write_begin/end to use folios


In fact, we don't even need the helper
(nfs_folio_grab_cache_write_begin()) anymore, since callers can pass the
FGP_WRITEBEGIN flag directly to __filemap_get_folio() themselves.

static struct folio *
nfs_folio_grab_cache_write_begin(struct address_space *mapping, pgoff_t index)
{
	unsigned fgp_flags = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE;

	return __filemap_get_folio(mapping, index, fgp_flags,
				   mapping_gfp_mask(mapping));
}


-ritesh

>
> diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
> index 2de9829aed63..0cb361f0a4fe 100644
> --- a/fs/ext4/move_extent.c
> +++ b/fs/ext4/move_extent.c
> @@ -126,7 +126,6 @@ mext_folio_double_lock(struct inode *inode1, struct inode *inode2,
>  {
>  	struct address_space *mapping[2];
>  	unsigned int flags;
> -	unsigned fgp_flags = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE;
>
>  	BUG_ON(!inode1 || !inode2);
>  	if (inode1 < inode2) {
> @@ -139,14 +138,14 @@ mext_folio_double_lock(struct inode *inode1, struct inode *inode2,
>  	}
>
>  	flags = memalloc_nofs_save();
> -	folio[0] = __filemap_get_folio(mapping[0], index1, fgp_flags,
> +	folio[0] = __filemap_get_folio(mapping[0], index1, FGP_WRITEBEGIN,
>  			mapping_gfp_mask(mapping[0]));
>  	if (!folio[0]) {
>  		memalloc_nofs_restore(flags);
>  		return -ENOMEM;
>  	}
>
> -	folio[1] = __filemap_get_folio(mapping[1], index2, fgp_flags,
> +	folio[1] = __filemap_get_folio(mapping[1], index2, FGP_WRITEBEGIN,
>  			mapping_gfp_mask(mapping[1]));
>  	memalloc_nofs_restore(flags);
>  	if (!folio[1]) {
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 6f4c97a6d7e9..10a203515583 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -467,7 +467,7 @@ EXPORT_SYMBOL_GPL(iomap_is_partially_uptodate);
>   */
>  struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos)
>  {
> -	unsigned fgp = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE | FGP_NOFS;
> +	unsigned fgp = FGP_WRITEBEGIN | FGP_NOFS;
>  	struct folio *folio;
>
>  	if (iter->flags & IOMAP_NOWAIT)
> diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
> index 7679a68e8193..e3d754a9e1b0 100644
> --- a/fs/netfs/buffered_read.c
> +++ b/fs/netfs/buffered_read.c
> @@ -341,14 +341,13 @@ int netfs_write_begin(struct netfs_inode *ctx,
>  {
>  	struct netfs_io_request *rreq;
>  	struct folio *folio;
> -	unsigned int fgp_flags = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE;
>  	pgoff_t index = pos >> PAGE_SHIFT;
>  	int ret;
>
>  	DEFINE_READAHEAD(ractl, file, NULL, mapping, index);
>
>  retry:
> -	folio = __filemap_get_folio(mapping, index, fgp_flags,
> +	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
>  				    mapping_gfp_mask(mapping));
>  	if (!folio)
>  		return -ENOMEM;
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 9f1081683771..47069662f4b8 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -507,6 +507,8 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping,
>  #define FGP_ENTRY		0x00000080
>  #define FGP_STABLE		0x00000100
>
> +#define FGP_WRITEBEGIN		(FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE)
> +
>  struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
>  		int fgp_flags, gfp_t gfp);
>  struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
> diff --git a/mm/folio-compat.c b/mm/folio-compat.c
> index 18c48b557926..668350748828 100644
> --- a/mm/folio-compat.c
> +++ b/mm/folio-compat.c
> @@ -106,9 +106,7 @@ EXPORT_SYMBOL(pagecache_get_page);
>  struct page *grab_cache_page_write_begin(struct address_space *mapping,
>  					pgoff_t index)
>  {
> -	unsigned fgp_flags = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE;
> -
> -	return pagecache_get_page(mapping, index, fgp_flags,
> +	return pagecache_get_page(mapping, index, FGP_WRITEBEGIN,
>  			mapping_gfp_mask(mapping));
>  }
>  EXPORT_SYMBOL(grab_cache_page_write_begin);
> --
> 2.35.1


* Re: [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-01-26 20:23 ` [PATCH 02/31] fscrypt: Add some folio helper functions Matthew Wilcox (Oracle)
  2023-01-27  3:02   ` Eric Biggers
@ 2023-03-05  9:06   ` Ritesh Harjani
  1 sibling, 0 replies; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-05  9:06 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> fscrypt_is_bounce_folio() is the equivalent of fscrypt_is_bounce_page()
> and fscrypt_pagecache_folio() is the equivalent of fscrypt_pagecache_page().
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/fscrypt.h | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)

A straightforward conversion. IIUC, even after this patchset we haven't
killed fscrypt_is_bounce_page() and fscrypt_pagecache_page(), because
there are still other users of them in f2fs and fscrypt.

Looks good to me. Please feel free to add -
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>


-ritesh

>
> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> index 4f5f8a651213..c2c07d36fb3a 100644
> --- a/include/linux/fscrypt.h
> +++ b/include/linux/fscrypt.h
> @@ -273,6 +273,16 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
>  	return (struct page *)page_private(bounce_page);
>  }
>
> +static inline bool fscrypt_is_bounce_folio(struct folio *folio)
> +{
> +	return folio->mapping == NULL;
> +}
> +
> +static inline struct folio *fscrypt_pagecache_folio(struct folio *bounce_folio)
> +{
> +	return bounce_folio->private;
> +}
> +
>  void fscrypt_free_bounce_page(struct page *bounce_page);
>
>  /* policy.c */
> @@ -448,6 +458,17 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
>  	return ERR_PTR(-EINVAL);
>  }
>
> +static inline bool fscrypt_is_bounce_folio(struct folio *folio)
> +{
> +	return false;
> +}
> +
> +static inline struct folio *fscrypt_pagecache_folio(struct folio *bounce_folio)
> +{
> +	WARN_ON_ONCE(1);
> +	return ERR_PTR(-EINVAL);
> +}
> +
>  static inline void fscrypt_free_bounce_page(struct page *bounce_page)
>  {
>  }
> --
> 2.35.1


* Re: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
  2023-01-26 20:23 ` [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio Matthew Wilcox (Oracle)
  2023-01-28 16:53   ` kernel test robot
  2023-01-28 19:07   ` kernel test robot
@ 2023-03-05 11:18   ` Ritesh Harjani
  2023-03-14 22:07   ` Theodore Ts'o
  3 siblings, 0 replies; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-05 11:18 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> Remove several calls to compound_head() and the last caller of
> set_page_writeback_keepwrite(), so remove the wrapper too.

A straightforward conversion.
Looks good to me. Please feel free to add -

Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  fs/ext4/page-io.c          | 58 ++++++++++++++++++--------------------
>  include/linux/page-flags.h |  5 ----
>  2 files changed, 27 insertions(+), 36 deletions(-)
>
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index beaec6d81074..982791050892 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -409,11 +409,9 @@ static void io_submit_init_bio(struct ext4_io_submit *io,
>
>  static void io_submit_add_bh(struct ext4_io_submit *io,
>  			     struct inode *inode,
> -			     struct page *page,
> +			     struct folio *folio,
>  			     struct buffer_head *bh)
>  {
> -	int ret;
> -
>  	if (io->io_bio && (bh->b_blocknr != io->io_next_block ||
>  			   !fscrypt_mergeable_bio_bh(io->io_bio, bh))) {
>  submit_and_retry:
> @@ -421,10 +419,9 @@ static void io_submit_add_bh(struct ext4_io_submit *io,
>  	}
>  	if (io->io_bio == NULL)
>  		io_submit_init_bio(io, bh);
> -	ret = bio_add_page(io->io_bio, page, bh->b_size, bh_offset(bh));
> -	if (ret != bh->b_size)
> +	if (!bio_add_folio(io->io_bio, folio, bh->b_size, bh_offset(bh)))
>  		goto submit_and_retry;
> -	wbc_account_cgroup_owner(io->io_wbc, page, bh->b_size);
> +	wbc_account_cgroup_owner(io->io_wbc, &folio->page, bh->b_size);
>  	io->io_next_block++;
>  }
>
> @@ -432,8 +429,9 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
>  			struct page *page,
>  			int len)
>  {
> -	struct page *bounce_page = NULL;
> -	struct inode *inode = page->mapping->host;
> +	struct folio *folio = page_folio(page);
> +	struct folio *io_folio = folio;
> +	struct inode *inode = folio->mapping->host;
>  	unsigned block_start;
>  	struct buffer_head *bh, *head;
>  	int ret = 0;
> @@ -441,30 +439,30 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
>  	struct writeback_control *wbc = io->io_wbc;
>  	bool keep_towrite = false;
>
> -	BUG_ON(!PageLocked(page));
> -	BUG_ON(PageWriteback(page));
> +	BUG_ON(!folio_test_locked(folio));
> +	BUG_ON(folio_test_writeback(folio));
>
> -	ClearPageError(page);
> +	folio_clear_error(folio);
>
>  	/*
>  	 * Comments copied from block_write_full_page:
>  	 *
> -	 * The page straddles i_size.  It must be zeroed out on each and every
> +	 * The folio straddles i_size.  It must be zeroed out on each and every
>  	 * writepage invocation because it may be mmapped.  "A file is mapped
>  	 * in multiples of the page size.  For a file that is not a multiple of
>  	 * the page size, the remaining memory is zeroed when mapped, and
>  	 * writes to that region are not written out to the file."
>  	 */
> -	if (len < PAGE_SIZE)
> -		zero_user_segment(page, len, PAGE_SIZE);
> +	if (len < folio_size(folio))
> +		folio_zero_segment(folio, len, folio_size(folio));
>  	/*
>  	 * In the first loop we prepare and mark buffers to submit. We have to
> -	 * mark all buffers in the page before submitting so that
> -	 * end_page_writeback() cannot be called from ext4_end_bio() when IO
> +	 * mark all buffers in the folio before submitting so that
> +	 * folio_end_writeback() cannot be called from ext4_end_bio() when IO
>  	 * on the first buffer finishes and we are still working on submitting
>  	 * the second buffer.
>  	 */
> -	bh = head = page_buffers(page);
> +	bh = head = folio_buffers(folio);
>  	do {
>  		block_start = bh_offset(bh);
>  		if (block_start >= len) {
> @@ -479,14 +477,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
>  				clear_buffer_dirty(bh);
>  			/*
>  			 * Keeping dirty some buffer we cannot write? Make sure
> -			 * to redirty the page and keep TOWRITE tag so that
> -			 * racing WB_SYNC_ALL writeback does not skip the page.
> +			 * to redirty the folio and keep TOWRITE tag so that
> +			 * racing WB_SYNC_ALL writeback does not skip the folio.
>  			 * This happens e.g. when doing writeout for
>  			 * transaction commit.
>  			 */
>  			if (buffer_dirty(bh)) {
> -				if (!PageDirty(page))
> -					redirty_page_for_writepage(wbc, page);
> +				if (!folio_test_dirty(folio))
> +					folio_redirty_for_writepage(wbc, folio);
>  				keep_towrite = true;
>  			}
>  			continue;
> @@ -498,11 +496,11 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
>  		nr_to_submit++;
>  	} while ((bh = bh->b_this_page) != head);
>
> -	/* Nothing to submit? Just unlock the page... */
> +	/* Nothing to submit? Just unlock the folio... */
>  	if (!nr_to_submit)
>  		goto unlock;
>
> -	bh = head = page_buffers(page);
> +	bh = head = folio_buffers(folio);
>
>  	/*
>  	 * If any blocks are being written to an encrypted file, encrypt them
> @@ -514,6 +512,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
>  	if (fscrypt_inode_uses_fs_layer_crypto(inode) && nr_to_submit) {
>  		gfp_t gfp_flags = GFP_NOFS;
>  		unsigned int enc_bytes = round_up(len, i_blocksize(inode));
> +		struct page *bounce_page;
>
>  		/*
>  		 * Since bounce page allocation uses a mempool, we can only use
> @@ -540,7 +539,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
>  			}
>
>  			printk_ratelimited(KERN_ERR "%s: ret = %d\n", __func__, ret);
> -			redirty_page_for_writepage(wbc, page);
> +			folio_redirty_for_writepage(wbc, folio);
>  			do {
>  				if (buffer_async_write(bh)) {
>  					clear_buffer_async_write(bh);
> @@ -550,21 +549,18 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
>  			} while (bh != head);
>  			goto unlock;
>  		}
> +		io_folio = page_folio(bounce_page);
>  	}
>
> -	if (keep_towrite)
> -		set_page_writeback_keepwrite(page);
> -	else
> -		set_page_writeback(page);
> +	__folio_start_writeback(folio, keep_towrite);
>
>  	/* Now submit buffers to write */
>  	do {
>  		if (!buffer_async_write(bh))
>  			continue;
> -		io_submit_add_bh(io, inode,
> -				 bounce_page ? bounce_page : page, bh);
> +		io_submit_add_bh(io, inode, io_folio, bh);
>  	} while ((bh = bh->b_this_page) != head);
>  unlock:
> -	unlock_page(page);
> +	folio_unlock(folio);
>  	return ret;
>  }
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 0425f22a9c82..bba2a32031a2 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -766,11 +766,6 @@ bool set_page_writeback(struct page *page);
>  #define folio_start_writeback_keepwrite(folio)	\
>  	__folio_start_writeback(folio, true)
>
> -static inline void set_page_writeback_keepwrite(struct page *page)
> -{
> -	folio_start_writeback_keepwrite(page_folio(page));
> -}
> -
>  static inline bool test_set_page_writeback(struct page *page)
>  {
>  	return set_page_writeback(page);
> --
> 2.35.1


* Re: [PATCH 24/31] ext4: Convert ext4_mpage_readpages() to work on folios
  2023-01-27 16:08     ` Matthew Wilcox
@ 2023-03-05 11:26       ` Ritesh Harjani
  0 siblings, 0 replies; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-05 11:26 UTC (permalink / raw)
  To: Matthew Wilcox, Eric Biggers
  Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

Matthew Wilcox <willy@infradead.org> writes:

> On Thu, Jan 26, 2023 at 08:15:04PM -0800, Eric Biggers wrote:
>> On Thu, Jan 26, 2023 at 08:24:08PM +0000, Matthew Wilcox (Oracle) wrote:
>> >  int ext4_mpage_readpages(struct inode *inode,
>> > -		struct readahead_control *rac, struct page *page)
>> > +		struct readahead_control *rac, struct folio *folio)
>> >  {
>> >  	struct bio *bio = NULL;
>> >  	sector_t last_block_in_bio = 0;
>> > @@ -247,16 +247,15 @@ int ext4_mpage_readpages(struct inode *inode,
>> >  		int fully_mapped = 1;
>> >  		unsigned first_hole = blocks_per_page;
>> >
>> > -		if (rac) {
>> > -			page = readahead_page(rac);
>> > -			prefetchw(&page->flags);
>> > -		}
>> > +		if (rac)
>> > +			folio = readahead_folio(rac);
>> > +		prefetchw(&folio->flags);
>>
>> Unlike readahead_page(), readahead_folio() puts the folio immediately.  Is that
>> really safe?
>
> It's safe until we unlock the page.  The page cache holds a refcount,
> and truncation has to lock the page before it can remove it from the
> page cache.
>
> Putting the refcount in readahead_folio() is a transitional step; once
> all filesystems are converted to use readahead_folio(), I'll hoist the
> refcount put to the caller.  Having ->readahead() and ->read_folio()
> with different rules for who puts the folio is a long-standing mistake.
>
>> > @@ -299,11 +298,11 @@ int ext4_mpage_readpages(struct inode *inode,
>> >
>> >  				if (ext4_map_blocks(NULL, inode, &map, 0) < 0) {
>> >  				set_error_page:
>> > -					SetPageError(page);
>> > -					zero_user_segment(page, 0,
>> > -							  PAGE_SIZE);
>> > -					unlock_page(page);
>> > -					goto next_page;
>> > +					folio_set_error(folio);
>> > +					folio_zero_segment(folio, 0,
>> > +							  folio_size(folio));
>> > +					folio_unlock(folio);
>> > +					continue;
>>
>> This is 'continuing' the inner loop, not the outer loop as it should.
>
> Oops.  Will fix.  I didn't get any extra failures from xfstests
> with this bug, although I suspect I wasn't testing with block size <
> page size, which is probably needed to make a difference.
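The lifetime rule quoted above (the page cache's own reference plus the folio lock keep the folio stable, so readahead_folio() can drop its transient reference immediately) can be sketched as a toy model. Everything below is invented stand-in code, not kernel API:

```c
#include <stddef.h>

/*
 * Toy model of why readahead_folio() may put its reference up front:
 * the page cache holds one reference, and truncation must take the
 * folio lock before removing the folio from the cache, so a locked
 * folio cannot disappear under the caller.
 */
struct toy_folio {
	int refcount;	/* page cache ref plus any transient refs */
	int locked;
};

/* Models readahead_page(): returns with an extra caller reference. */
static struct toy_folio *toy_readahead_page(struct toy_folio *f)
{
	f->refcount++;	/* caller must drop this reference later */
	return f;
}

/* Models readahead_folio(): no transient ref is kept. */
static struct toy_folio *toy_readahead_folio(struct toy_folio *f)
{
	/* Only the page cache reference (already counted) pins the
	 * folio; that is safe while the folio remains locked. */
	return f;
}

/* Models truncation: may only remove the folio once it gets the lock. */
static int toy_truncate(struct toy_folio *f)
{
	if (f->locked)
		return 0;	/* must wait; folio is still present */
	f->refcount--;		/* drop the page cache reference */
	return 1;
}
```

In this model the `continue` bug is orthogonal: the folio stays valid either way, but jumping to the wrong loop processes the wrong folio.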

I am still reviewing the rest of the series, but I wanted to paste
this failure from generic/574 with 4k blocksize on an x86 system.

The fix is the same one Eric pointed out.


[  208.818910] fsverity_msg: 3 callbacks suppressed
[  208.818927] fs-verity (loop7, inode 12): FILE CORRUPTED! pos=0, level=0, want_hash=sha256:5d55504690cf24b26f46d577f874d2d4c6
[  208.835984] ------------[ cut here ]------------
[  208.839047] WARNING: CPU: 2 PID: 2370 at fs/verity/verify.c:277 verify_data_blocks+0xc5/0x1b0
[  208.844648] Modules linked in:
[  208.846986] CPU: 2 PID: 2370 Comm: cat Not tainted 6.2.0-xfstests-13498-ga1825ad035c0 #29
[  208.852746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/4
[  208.860155] RIP: 0010:verify_data_blocks+0xc5/0x1b0
[  208.863491] Code: 89 e7 e8 8e 32 e0 ff 4c 89 e2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 bf 00 00 00 49 e
[  208.875434] RSP: 0018:ffff8881b8867688 EFLAGS: 00010246
[  208.878903] RAX: 0110000000000110 RBX: 0000000000001000 RCX: ffffffff81cd8a92
[  208.883539] RDX: 1ffffd4000a69bb0 RSI: 0000000000000008 RDI: ffffea000534dd80
[  208.888246] RBP: 0000000000001000 R08: 0000000000000000 R09: ffffea000534dd87
[  208.892932] R10: fffff94000a69bb0 R11: ffffffff86d90cb3 R12: ffffea000534dd80
[  208.897570] R13: ffff8881444381c8 R14: 0000000000000000 R15: ffff88810dd0cea8
[  208.901848] FS:  00007ffff7fb3740(0000) GS:ffff8883eb800000(0000) knlGS:0000000000000000
[  208.904643] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  208.906858] CR2: 00007ffff7f91000 CR3: 00000001f9202006 CR4: 0000000000170ee0
[  208.909510] Call Trace:
[  208.910545]  <TASK>
[  208.911469]  fsverity_verify_blocks+0xc7/0x140
[  208.913364]  ext4_mpage_readpages+0x545/0xe50
[  208.915211]  ? __pfx_ext4_mpage_readpages+0x10/0x10
[  208.917051]  ? find_held_lock+0x2d/0x120
[  208.918753]  ? kvm_clock_read+0x14/0x30
[  208.920316]  ? kvm_sched_clock_read+0x9/0x20
[  208.922074]  ? local_clock+0xf/0xd0
[  208.923436]  ? __lock_release+0x480/0x940
[  208.925071]  ? __pfx___lock_release+0x10/0x10
[  208.926723]  read_pages+0x190/0xb60
[  208.928134]  ? folio_add_lru+0x334/0x630
[  208.929746]  ? lock_release+0xff/0x2c0
[  208.931190]  ? folio_add_lru+0x355/0x630
[  208.932904]  ? __pfx_read_pages+0x10/0x10
[  208.934450]  page_cache_ra_unbounded+0x2cc/0x510
[  208.936249]  filemap_get_pages+0x233/0x7c0
[  208.937851]  ? __pfx_filemap_get_pages+0x10/0x10
[  208.939674]  ? __lock_acquire+0x7e1/0x1120
[  208.941229]  filemap_read+0x2dd/0xa20
[  208.942763]  ? __pfx_filemap_read+0x10/0x10
[  208.944522]  ? do_anonymous_page+0x58b/0x12e0
[  208.946333]  ? do_raw_spin_unlock+0x14d/0x1f0
[  208.948279]  ? _raw_spin_unlock+0x2d/0x50
[  208.949899]  ? do_anonymous_page+0x58b/0x12e0
[  208.951631]  vfs_read+0x512/0x750
[  208.953018]  ? __pfx_vfs_read+0x10/0x10
[  208.954480]  ? local_clock+0xf/0xd0
[  208.955964]  ? __pfx___lock_release+0x10/0x10
[  208.957744]  ? __fget_light+0x51/0x230
[  208.959408]  ksys_read+0xfd/0x1d0
[  208.960719]  ? __pfx_ksys_read+0x10/0x10
[  208.962327]  ? syscall_enter_from_user_mode+0x21/0x50
[  208.964180]  do_syscall_64+0x3f/0x90
[  208.965732]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  208.967618] RIP: 0033:0x7ffff7d0ccf1
[  208.969181] Code: 31 c0 e9 b2 fe ff ff 50 48 8d 3d b2 0a 0b 00 e8 65 29 02 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d ed 18 0f 00 4
[  208.975395] RSP: 002b:00007fffffffccc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  208.978052] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ffff7d0ccf1
[  208.980657] RDX: 0000000000020000 RSI: 00007ffff7f92000 RDI: 0000000000000003
[  208.983256] RBP: 00007ffff7f92000 R08: 00000000ffffffff R09: 0000000000000000
[  208.985874] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
[  208.988480] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[  208.991004]  </TASK>
[  208.992250] irq event stamp: 6759
[  208.993616] hardirqs last  enabled at (6769): [<ffffffff81528362>] __up_console_sem+0x52/0x60
[  208.996681] hardirqs last disabled at (6780): [<ffffffff81528347>] __up_console_sem+0x37/0x60
[  208.999764] softirqs last  enabled at (6278): [<ffffffff8445c3d6>] __do_softirq+0x546/0x87f
[  209.002772] softirqs last disabled at (6273): [<ffffffff813d0d64>] irq_exit_rcu+0x124/0x1a0
[  209.005992] ---[ end trace 0000000000000000 ]---
[  209.007743] page:ffffea000534dd80 refcount:1 mapcount:0 mapping:ffff88814476db30 index:0x0 pfn:0x14d376
[  209.011119] memcg:ffff8881800f9000
[  209.012564] aops:ext4_da_aops ino:c dentry name:"file.fsv"
[  209.014614] flags: 0x110000000000110(error|lru|node=0|zone=2)
[  209.016839] raw: 0110000000000110 ffffea0005404388 ffffea0005411588 ffff88814476db30
[  209.019657] raw: 0000000000000000 0000000000000000 00000001ffffffff ffff8881800f9000
[  209.022464] page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
[  209.025086] ------------[ cut here ]------------
[  209.026790] kernel BUG at mm/filemap.c:1529!
[  209.028620] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
[  209.030944] CPU: 2 PID: 2370 Comm: cat Tainted: G        W          6.2.0-xfstests-13498-ga1825ad035c0 #29
[  209.034067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/4
[  209.038883] RIP: 0010:folio_unlock+0x6a/0x80
[  209.040434] Code: ec 1c 00 f0 80 65 00 fe 78 06 5d c3 cc cc cc cc 48 89 ef 5d 31 f6 e9 15 f6 ff ff 48 c7 c6 a0 9c 93 84 48 0
[  209.048850] RSP: 0018:ffff8881b8867700 EFLAGS: 00010246
[  209.050702] RAX: 000000000000003f RBX: 0000000000000001 RCX: 0000000000000000
[  209.054327] RDX: 0000000000000000 RSI: ffffffff84cd7440 RDI: 0000000000000001
[  209.056733] RBP: ffffea000534dd80 R08: 0000000000000001 R09: ffff8881b886751f
[  209.060241] R10: ffffed103710cea3 R11: 0000000000000000 R12: 0000000000000000
[  209.062776] R13: 0000000000000001 R14: ffffea000534dd80 R15: dffffc0000000000
[  209.065462] FS:  00007ffff7fb3740(0000) GS:ffff8883eb800000(0000) knlGS:0000000000000000
[  209.068635] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  209.070598] CR2: 00007ffff7f91000 CR3: 00000001f9202006 CR4: 0000000000170ee0
[  209.072954] Call Trace:
[  209.073922]  <TASK>
[  209.074887]  ext4_mpage_readpages+0x731/0xe50
[  209.076506]  ? __pfx_ext4_mpage_readpages+0x10/0x10
[  209.078372]  ? find_held_lock+0x2d/0x120
[  209.079776]  ? kvm_clock_read+0x14/0x30
[  209.081165]  ? kvm_sched_clock_read+0x9/0x20
[  209.082883]  ? local_clock+0xf/0xd0
[  209.084158]  ? __lock_release+0x480/0x940
[  209.085591]  ? __pfx___lock_release+0x10/0x10
[  209.087127]  read_pages+0x190/0xb60
[  209.088433]  ? folio_add_lru+0x334/0x630
[  209.089843]  ? lock_release+0xff/0x2c0
[  209.091188]  ? folio_add_lru+0x355/0x630
[  209.092574]  ? __pfx_read_pages+0x10/0x10
[  209.093991]  page_cache_ra_unbounded+0x2cc/0x510
[  209.095580]  filemap_get_pages+0x233/0x7c0
[  209.097019]  ? __pfx_filemap_get_pages+0x10/0x10
[  209.098607]  ? __lock_acquire+0x7e1/0x1120
[  209.100032]  filemap_read+0x2dd/0xa20
[  209.101357]  ? __pfx_filemap_read+0x10/0x10
[  209.102809]  ? do_anonymous_page+0x58b/0x12e0
[  209.104321]  ? do_raw_spin_unlock+0x14d/0x1f0
[  209.105853]  ? _raw_spin_unlock+0x2d/0x50
[  209.107254]  ? do_anonymous_page+0x58b/0x12e0
[  209.108770]  vfs_read+0x512/0x750
[  209.109997]  ? __pfx_vfs_read+0x10/0x10
[  209.111350]  ? local_clock+0xf/0xd0
[  209.112604]  ? __pfx___lock_release+0x10/0x10
[  209.114138]  ? __fget_light+0x51/0x230
[  209.115469]  ksys_read+0xfd/0x1d0
[  209.116676]  ? __pfx_ksys_read+0x10/0x10
[  209.118189]  ? syscall_enter_from_user_mode+0x21/0x50
[  209.119894]  do_syscall_64+0x3f/0x90
[  209.121201]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  209.122901] RIP: 0033:0x7ffff7d0ccf1
[  209.124159] Code: 31 c0 e9 b2 fe ff ff 50 48 8d 3d b2 0a 0b 00 e8 65 29 02 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d ed 18 0f 00 4
[  209.129836] RSP: 002b:00007fffffffccc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000

-ritesh

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 20/31] ext4: Convert __ext4_block_zero_page_range() to use a folio
  2023-01-26 20:24 ` [PATCH 20/31] ext4: Convert __ext4_block_zero_page_range() " Matthew Wilcox (Oracle)
@ 2023-03-05 12:26   ` Ritesh Harjani
  0 siblings, 0 replies; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-05 12:26 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> Use folio APIs throughout.  Saves many calls to compound_head().

minor comment below.

>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  fs/ext4/inode.c | 28 ++++++++++++++++------------
>  1 file changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index b79e591b7c8e..727aa2e51a9d 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3812,23 +3812,26 @@ static int __ext4_block_zero_page_range(handle_t *handle,
>  	ext4_lblk_t iblock;
>  	struct inode *inode = mapping->host;
>  	struct buffer_head *bh;
> -	struct page *page;
> +	struct folio *folio;
>  	int err = 0;
>
> -	page = find_or_create_page(mapping, from >> PAGE_SHIFT,
> -				   mapping_gfp_constraint(mapping, ~__GFP_FS));
> -	if (!page)
> +	folio = __filemap_get_folio(mapping, from >> PAGE_SHIFT,
> +				    FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
> +				    mapping_gfp_constraint(mapping, ~__GFP_FS));
> +	if (!folio)
>  		return -ENOMEM;
>
>  	blocksize = inode->i_sb->s_blocksize;
>
>  	iblock = index << (PAGE_SHIFT - inode->i_sb->s_blocksize_bits);
>
> -	if (!page_has_buffers(page))
> -		create_empty_buffers(page, blocksize, 0);
> +	bh = folio_buffers(folio);
> +	if (!bh) {
> +		create_empty_buffers(&folio->page, blocksize, 0);
> +		bh = folio_buffers(folio);
> +	}
>
>  	/* Find the buffer that contains "offset" */
> -	bh = page_buffers(page);
>  	pos = blocksize;
>  	while (offset >= pos) {
>  		bh = bh->b_this_page;
> @@ -3850,7 +3853,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
>  	}
>
>  	/* Ok, it's mapped. Make sure it's up-to-date */
> -	if (PageUptodate(page))
> +	if (folio_test_uptodate(folio))
>  		set_buffer_uptodate(bh);
>
>  	if (!buffer_uptodate(bh)) {
> @@ -3860,7 +3863,8 @@ static int __ext4_block_zero_page_range(handle_t *handle,
>  		if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
>  			/* We expect the key to be set. */
>  			BUG_ON(!fscrypt_has_encryption_key(inode));
> -			err = fscrypt_decrypt_pagecache_blocks(page, blocksize,
> +			err = fscrypt_decrypt_pagecache_blocks(&folio->page,
> +							       blocksize,
>  							       bh_offset(bh));

I think after the patch below, which added support for decrypting data
from large folios, fscrypt_decrypt_pagecache_blocks() takes a folio as
its first argument.  Hence this patch will need a small change to pass
the folio instead of &folio->page.

Other than that the change looks good to me.

Please feel free to add -
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

    commit 51e4e3153ebc32d3280d5d17418ae6f1a44f1ec1
    Author:     Eric Biggers <ebiggers@google.com>
    CommitDate: Sat Jan 28 15:10:12 2023 -0800

    fscrypt: support decrypting data from large folios


-ritesh

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio
  2023-01-26 20:24 ` [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio Matthew Wilcox (Oracle)
@ 2023-03-06  6:51   ` Ritesh Harjani
  2023-03-06  8:27     ` Matthew Wilcox
  0 siblings, 1 reply; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-06  6:51 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> All the callers now have a folio, so pass that in and operate on folios.
> Removes four calls to compound_head().

Why do you say four? Isn't it 3 calls of PageUptodate(page) which
removes calls to compound_head()? Which one did I miss?

>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  fs/ext4/inode.c | 41 +++++++++++++++++++++--------------------
>  1 file changed, 21 insertions(+), 20 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index dbfc0670de75..507c7f88d737 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1055,12 +1055,12 @@ int do_journal_get_write_access(handle_t *handle, struct inode *inode,
>  }
>
>  #ifdef CONFIG_FS_ENCRYPTION
> -static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
> +static int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
>  				  get_block_t *get_block)
>  {
>  	unsigned from = pos & (PAGE_SIZE - 1);
>  	unsigned to = from + len;
> -	struct inode *inode = page->mapping->host;
> +	struct inode *inode = folio->mapping->host;
>  	unsigned block_start, block_end;
>  	sector_t block;
>  	int err = 0;
> @@ -1070,22 +1070,24 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
>  	int nr_wait = 0;
>  	int i;
>
> -	BUG_ON(!PageLocked(page));
> +	BUG_ON(!folio_test_locked(folio));
>  	BUG_ON(from > PAGE_SIZE);
>  	BUG_ON(to > PAGE_SIZE);
>  	BUG_ON(from > to);
>
> -	if (!page_has_buffers(page))
> -		create_empty_buffers(page, blocksize, 0);
> -	head = page_buffers(page);
> +	head = folio_buffers(folio);
> +	if (!head) {
> +		create_empty_buffers(&folio->page, blocksize, 0);
> +		head = folio_buffers(folio);
> +	}
>  	bbits = ilog2(blocksize);
> -	block = (sector_t)page->index << (PAGE_SHIFT - bbits);
> +	block = (sector_t)folio->index << (PAGE_SHIFT - bbits);
>
>  	for (bh = head, block_start = 0; bh != head || !block_start;
>  	    block++, block_start = block_end, bh = bh->b_this_page) {
>  		block_end = block_start + blocksize;
>  		if (block_end <= from || block_start >= to) {
> -			if (PageUptodate(page)) {
> +			if (folio_test_uptodate(folio)) {
>  				set_buffer_uptodate(bh);
>  			}
>  			continue;
> @@ -1098,19 +1100,20 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
>  			if (err)
>  				break;
>  			if (buffer_new(bh)) {
> -				if (PageUptodate(page)) {
> +				if (folio_test_uptodate(folio)) {
>  					clear_buffer_new(bh);
>  					set_buffer_uptodate(bh);
>  					mark_buffer_dirty(bh);
>  					continue;
>  				}
>  				if (block_end > to || block_start < from)
> -					zero_user_segments(page, to, block_end,
> -							   block_start, from);
> +					folio_zero_segments(folio, to,
> +							    block_end,
> +							    block_start, from);
>  				continue;
>  			}
>  		}
> -		if (PageUptodate(page)) {
> +		if (folio_test_uptodate(folio)) {
>  			set_buffer_uptodate(bh);
>  			continue;
>  		}
> @@ -1130,13 +1133,13 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
>  			err = -EIO;
>  	}
>  	if (unlikely(err)) {
> -		page_zero_new_buffers(page, from, to);
> +		page_zero_new_buffers(&folio->page, from, to);
>  	} else if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
>  		for (i = 0; i < nr_wait; i++) {
>  			int err2;
>
> -			err2 = fscrypt_decrypt_pagecache_blocks(page, blocksize,
> -								bh_offset(wait[i]));
> +			err2 = fscrypt_decrypt_pagecache_blocks(&folio->page,
> +						blocksize, bh_offset(wait[i]));

fscrypt_decrypt_pagecache_blocks() takes a folio as its first argument now.

Other than that it looks good to me. Please feel free to add -
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio
  2023-03-06  6:51   ` Ritesh Harjani
@ 2023-03-06  8:27     ` Matthew Wilcox
  2023-03-06 15:21       ` Ritesh Harjani
  0 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox @ 2023-03-06  8:27 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Mon, Mar 06, 2023 at 12:21:48PM +0530, Ritesh Harjani wrote:
> "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> 
> > All the callers now have a folio, so pass that in and operate on folios.
> > Removes four calls to compound_head().
> 
> Why do you say four? Isn't it 3 calls of PageUptodate(page) which
> removes calls to compound_head()? Which one did I miss?
>
> > -	BUG_ON(!PageLocked(page));
> > +	BUG_ON(!folio_test_locked(folio));

That one ;-)

> >  	} else if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
> >  		for (i = 0; i < nr_wait; i++) {
> >  			int err2;
> >
> > -			err2 = fscrypt_decrypt_pagecache_blocks(page, blocksize,
> > -								bh_offset(wait[i]));
> > +			err2 = fscrypt_decrypt_pagecache_blocks(&folio->page,
> > +						blocksize, bh_offset(wait[i]));
> 
> fscrypt_decrypt_pagecache_blocks() takes a folio as its first argument now.
> 
> Other than that it looks good to me. Please feel free to add -
> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Thanks.  I'll refresh this patchset next week.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios
  2023-01-26 20:23 ` [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios Matthew Wilcox (Oracle)
@ 2023-03-06  9:10   ` Ritesh Harjani
  2023-03-23  3:26     ` Matthew Wilcox
  2023-03-14 22:08   ` Theodore Ts'o
  1 sibling, 1 reply; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-06  9:10 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> Prepare ext4 to support large folios in the page writeback path.

Sure. I am guessing that completely supporting large folios in ext4
requires more work, e.g. the fscrypt bounce page handling doesn't yet
support folios, right?

Could you please give a little background on what all would be required
to add large folio support in ext4 buffered I/O path?
(I mean ofcourse other than saying move ext4 to iomap ;))

What I was interested in was, what other components in particular for
e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?

And how should one go about in adding this support? So can we move
ext4's read path to have large folio support to get started?
Have you already identified what all is missing from this path to
convert it?

> Also set the actual error in the mapping, not just -EIO.

Right. I looked at the history and I think it always just had EIO.
I think setting the actual err in mapping_set_error() is the right thing
to do here.
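A toy reduction of that behaviour (hypothetical helpers, nothing like the
kernel's real mapping_set_error()/blk_status_to_errno(), just the shape):

```c
#include <assert.h>
#include <errno.h>

/* Toy error tracking: remember the first error reported for a mapping. */
static int mapping_first_error;

static void toy_mapping_set_error(int err)
{
	if (err && !mapping_first_error)
		mapping_first_error = err;
}

/* Toy stand-in for blk_status_to_errno() with made-up status codes. */
static int toy_blk_status_to_errno(int status)
{
	switch (status) {
	case 0:	 return 0;
	case 3:	 return -ENOSPC;
	default: return -EIO;
	}
}
```

With this shape, a writeback failure caused by ENOSPC is reported back to
the application as ENOSPC instead of a blanket EIO.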

>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

w.r.t this patch series. I reviewed the mechanical changes & error paths
which converts ext4 ext4_finish_bio() to use folio.

The changes looks good to me from that perspective. Feel free to add -
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>


> ---
>  fs/ext4/page-io.c | 32 ++++++++++++++++----------------
>  1 file changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 982791050892..fd6c0dca24b9 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -99,30 +99,30 @@ static void buffer_io_error(struct buffer_head *bh)
>
>  static void ext4_finish_bio(struct bio *bio)
>  {
> -	struct bio_vec *bvec;
> -	struct bvec_iter_all iter_all;
> +	struct folio_iter fi;
>
> -	bio_for_each_segment_all(bvec, bio, iter_all) {
> -		struct page *page = bvec->bv_page;
> -		struct page *bounce_page = NULL;
> +	bio_for_each_folio_all(fi, bio) {
> +		struct folio *folio = fi.folio;
> +		struct folio *io_folio = NULL;
>  		struct buffer_head *bh, *head;
> -		unsigned bio_start = bvec->bv_offset;
> -		unsigned bio_end = bio_start + bvec->bv_len;
> +		size_t bio_start = fi.offset;
> +		size_t bio_end = bio_start + fi.length;
>  		unsigned under_io = 0;
>  		unsigned long flags;
>
> -		if (fscrypt_is_bounce_page(page)) {
> -			bounce_page = page;
> -			page = fscrypt_pagecache_page(bounce_page);
> +		if (fscrypt_is_bounce_folio(folio)) {
> +			io_folio = folio;
> +			folio = fscrypt_pagecache_folio(folio);
>  		}
>
>  		if (bio->bi_status) {
> -			SetPageError(page);
> -			mapping_set_error(page->mapping, -EIO);
> +			int err = blk_status_to_errno(bio->bi_status);
> +			folio_set_error(folio);
> +			mapping_set_error(folio->mapping, err);
>  		}
> -		bh = head = page_buffers(page);
> +		bh = head = folio_buffers(folio);
>  		/*
> -		 * We check all buffers in the page under b_uptodate_lock
> +		 * We check all buffers in the folio under b_uptodate_lock
>  		 * to avoid races with other end io clearing async_write flags
>  		 */
>  		spin_lock_irqsave(&head->b_uptodate_lock, flags);
> @@ -141,8 +141,8 @@ static void ext4_finish_bio(struct bio *bio)
>  		} while ((bh = bh->b_this_page) != head);
>  		spin_unlock_irqrestore(&head->b_uptodate_lock, flags);
>  		if (!under_io) {
> -			fscrypt_free_bounce_page(bounce_page);
> -			end_page_writeback(page);
> +			fscrypt_free_bounce_page(&io_folio->page);

Could you please help me understand what it would take to convert the
fscrypt bounce page to a folio?

Today, we allocate 32 bounce pages of order 0 via mempool in
    fscrypt_initialize()
    <...>
        fscrypt_bounce_page_pool =
            mempool_create_page_pool(num_prealloc_crypto_pages, 0);
    <...>

And IIUC, we might need to add support for higher-order pages in the
pool so that one can allocate a folio of the right order
(folio->_folio_order) from this pool for the bounce folio, in order to
support large folios.  Is that understanding correct?  Your thoughts on
this, please?
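For instance, something like a set of per-order pools.  A purely
illustrative userspace sketch (hypothetical names, nothing like the real
mempool API):

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE	4096
#define MAX_ORDER	4	/* pools for orders 0..3 */
#define PREALLOC	2	/* bounce buffers preallocated per order */

/* One tiny preallocated pool per folio order. */
static void *pool[MAX_ORDER][PREALLOC];
static int avail[MAX_ORDER];

static void pools_init(void)
{
	for (int order = 0; order < MAX_ORDER; order++) {
		for (int i = 0; i < PREALLOC; i++)
			pool[order][i] = malloc(TOY_PAGE_SIZE << order);
		avail[order] = PREALLOC;
	}
}

/* Allocate a bounce buffer big enough for a folio of this order. */
static void *bounce_alloc(int order)
{
	if (order >= MAX_ORDER || !avail[order])
		return NULL;	/* caller must fall back somehow */
	return pool[order][--avail[order]];
}

static void bounce_free(int order, void *p)
{
	pool[order][avail[order]++] = p;
}
```

A real version would still need to decide what to do on exhaustion:
fall back to the page allocator, split a higher-order entry, or block,
which is the open question.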

-ritesh

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio
  2023-03-06  8:27     ` Matthew Wilcox
@ 2023-03-06 15:21       ` Ritesh Harjani
  2023-03-15  4:40         ` Matthew Wilcox
  0 siblings, 1 reply; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-06 15:21 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

Matthew Wilcox <willy@infradead.org> writes:

> On Mon, Mar 06, 2023 at 12:21:48PM +0530, Ritesh Harjani wrote:
>> "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
>>
>> > All the callers now have a folio, so pass that in and operate on folios.
>> > Removes four calls to compound_head().
>>
>> Why do you say four? Isn't it 3 calls of PageUptodate(page) which
>> removes calls to compound_head()? Which one did I miss?
>>
>> > -	BUG_ON(!PageLocked(page));
>> > +	BUG_ON(!folio_test_locked(folio));
>
> That one ;-)

__PAGEFLAG(Locked, locked, PF_NO_TAIL)

#define __PAGEFLAG(uname, lname, policy)				\
	TESTPAGEFLAG(uname, lname, policy)				\
	__SETPAGEFLAG(uname, lname, policy)				\
	__CLEARPAGEFLAG(uname, lname, policy)

#define TESTPAGEFLAG(uname, lname, policy)				\
static __always_inline bool folio_test_##lname(struct folio *folio)	\
{ return test_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
static __always_inline int Page##uname(struct page *page)		\
{ return test_bit(PG_##lname, &policy(page, 0)->flags); }

How? PageLocked(page) doesn't make any compound_head() calls, no?

-ritesh

>
>> >  	} else if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
>> >  		for (i = 0; i < nr_wait; i++) {
>> >  			int err2;
>> >
>> > -			err2 = fscrypt_decrypt_pagecache_blocks(page, blocksize,
>> > -								bh_offset(wait[i]));
>> > +			err2 = fscrypt_decrypt_pagecache_blocks(&folio->page,
>> > +						blocksize, bh_offset(wait[i]));
>>
>> folio_decrypt_pagecache_blocks() takes folio as it's argument now.
>>
>> Other than that it looks good to me. Please feel free to add -
>> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>
> Thanks.  I'll refresh this patchset next week.

Sure. Thanks!

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio
  2023-01-26 20:23 ` [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio Matthew Wilcox (Oracle)
@ 2023-03-06 18:45   ` Ritesh Harjani
  2023-03-14 22:26     ` Theodore Ts'o
  0 siblings, 1 reply; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-06 18:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Theodore Tso, Andreas Dilger
  Cc: Matthew Wilcox (Oracle), linux-ext4, linux-fsdevel

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> Prepare for multi-page folios and save some instructions by converting
> to the folio API.

Mostly a straightforward change. The changes look good to me.
Please feel free to add -

Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

In later patches I see ext4_readpage() converted to ext4_read_folio().
I think the reason we have not renamed ext4_writepage() to
ext4_write_folio() is that we would like to get rid of the ->writepage
op eventually anyway, so there is no point.  I think there is even a
patch series from Jan which kills ext4_writepage() completely.

-ritesh

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 01/31] fs: Add FGP_WRITEBEGIN
  2023-01-26 20:23 ` [PATCH 01/31] fs: Add FGP_WRITEBEGIN Matthew Wilcox (Oracle)
  2023-03-05  8:53   ` Ritesh Harjani
@ 2023-03-14 22:00   ` Theodore Ts'o
  1 sibling, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:00 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:45PM +0000, Matthew Wilcox (Oracle) wrote:
> This particular combination of flags is used by most filesystems
> in their ->write_begin method, although it does find use in a
> few other places.  Before folios, it warranted its own function
> (grab_cache_page_write_begin()), but I think that just having specialised
> flags is enough.  It certainly helps the few places that have been
> converted from grab_cache_page_write_begin() to __filemap_get_folio().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-01-27 16:13     ` Matthew Wilcox
  2023-01-27 16:21       ` Eric Biggers
@ 2023-03-14 22:05       ` Theodore Ts'o
  2023-03-14 23:12         ` Eric Biggers
  1 sibling, 1 reply; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:05 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Eric Biggers, Andreas Dilger, linux-ext4, linux-fsdevel

On Fri, Jan 27, 2023 at 04:13:37PM +0000, Matthew Wilcox wrote:
> 
> It's out of scope for _this_ patchset.  I think it's a patchset that
> could come either before or after, and is needed to support large folios
> with ext4.  The biggest problem with doing that conversion is that
> bounce pages are allocated from a mempool which obviously only allocates
> order-0 folios.  I don't know what to do about that.  Have a mempool
> for each order of folio that the filesystem supports?  Try to allocate
> folios without a mempool and then split the folio if allocation fails?
> Have a mempool containing PMD-order pages and split them ourselves if
> we need to allocate from the mempool?
> 
> Nothing's really standing out to me as the perfect answer.  There are
> probably other alternatives.

Hmm.... should we have some kind of check in case a large folio is
passed to these fscrypt functions?  (e.g., some kind of BUG_ON, or
WARN_ON?)

Or do we just rely on people remembering that when we start trying to
support large folios for ext4, it will probably have to be the easy
cases first (e.g., no fscrypt, no fsverity, block size == page size)?

					- Ted

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
  2023-01-26 20:23 ` [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio Matthew Wilcox (Oracle)
                     ` (2 preceding siblings ...)
  2023-03-05 11:18   ` Ritesh Harjani
@ 2023-03-14 22:07   ` Theodore Ts'o
  3 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:07 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:47PM +0000, Matthew Wilcox (Oracle) wrote:
> Remove several calls to compound_head() and the last caller of
> set_page_writeback_keepwrite(), so remove the wrapper too.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios
  2023-01-26 20:23 ` [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios Matthew Wilcox (Oracle)
  2023-03-06  9:10   ` Ritesh Harjani
@ 2023-03-14 22:08   ` Theodore Ts'o
  1 sibling, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:08 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:48PM +0000, Matthew Wilcox (Oracle) wrote:
> Prepare ext4 to support large folios in the page writeback path.
> Also set the actual error in the mapping, not just -EIO.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio
  2023-03-06 18:45   ` Ritesh Harjani
@ 2023-03-14 22:26     ` Theodore Ts'o
  2023-03-23  3:29       ` Matthew Wilcox
  0 siblings, 1 reply; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:26 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Matthew Wilcox (Oracle), Andreas Dilger, linux-ext4, linux-fsdevel

On Tue, Mar 07, 2023 at 12:15:13AM +0530, Ritesh Harjani wrote:
> "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> 
> > Prepare for multi-page folios and save some instructions by converting
> > to the folio API.
> 
> Mostly a straightforward change. The changes look good to me.
> Please feel free to add -
> 
> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> 
> In later patches I see ext4_readpage() converted to ext4_read_folio().
> I think the reason we have not renamed ext4_writepage() to
> ext4_write_folio() is that we would like to get rid of the ->writepage
> op eventually anyway, so there is no point.  I think there is even a
> patch series from Jan which kills ext4_writepage() completely.

Indeed, Jan's patch series[1] is about to land in the ext4 tree, and
that's going to remove ext4_writepage().  The main reason why this
hadn't landed yet was due to some conflicts with some other folio
changes, so you should be able to drop this patch when you rebase this
patch series.

					- Ted

[1] https://lore.kernel.org/all/20230228051319.4085470-1-tytso@mit.edu/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 06/31] ext4: Turn mpage_process_page() into mpage_process_folio()
  2023-01-26 20:23 ` [PATCH 06/31] ext4: Turn mpage_process_page() into mpage_process_folio() Matthew Wilcox (Oracle)
@ 2023-03-14 22:27   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:27 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:50PM +0000, Matthew Wilcox (Oracle) wrote:
> The page/folio is only used to extract the buffers, so this is a
> simple change.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio()
  2023-01-26 20:23 ` [PATCH 07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio() Matthew Wilcox (Oracle)
@ 2023-03-14 22:28   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:28 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:51PM +0000, Matthew Wilcox (Oracle) wrote:
> All callers now have a folio so we can pass one in and use the folio
> APIs to support large folios as well as save instructions by eliminating
> calls to compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 08/31] ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio()
  2023-01-26 20:23 ` [PATCH 08/31] ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio() Matthew Wilcox (Oracle)
@ 2023-03-14 22:31   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:31 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:52PM +0000, Matthew Wilcox (Oracle) wrote:
> Both callers now have a folio so pass it in directly and avoid the call
> to page_folio() at the beginning.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

The ext4_writepage() changes will need to be dropped when you rebase,
but other than that....

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 09/31] ext4: Convert ext4_readpage_inline() to take a folio
  2023-01-26 20:23 ` [PATCH 09/31] ext4: Convert ext4_readpage_inline() to take a folio Matthew Wilcox (Oracle)
@ 2023-03-14 22:31   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:31 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:53PM +0000, Matthew Wilcox (Oracle) wrote:
> Use the folio API in this function, saves a few calls to compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 10/31] ext4: Convert ext4_convert_inline_data_to_extent() to use a folio
  2023-01-26 20:23 ` [PATCH 10/31] ext4: Convert ext4_convert_inline_data_to_extent() to use " Matthew Wilcox (Oracle)
@ 2023-03-14 22:36   ` Theodore Ts'o
  2023-03-23 17:14     ` Matthew Wilcox
  0 siblings, 1 reply; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:36 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:54PM +0000, Matthew Wilcox (Oracle) wrote:
> Saves a number of calls to compound_head().

Is this left over from an earlier version of this patch series?  There
are no changes to calls to compound_head() that I can find in this
patch.

> @@ -565,10 +564,9 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
>  
>  	/* We cannot recurse into the filesystem as the transaction is already
>  	 * started */
> -	flags = memalloc_nofs_save();
> -	page = grab_cache_page_write_begin(mapping, 0);
> -	memalloc_nofs_restore(flags);
> -	if (!page) {
> +	folio = __filemap_get_folio(mapping, 0, FGP_WRITEBEGIN | FGP_NOFS,
> +			mapping_gfp_mask(mapping));
> +	if (!folio) {
>  		ret = -ENOMEM;
>  		goto out;
>  	}

Is there a reason why to use FGP_NOFS as opposed to using
memalloc_nofs_{save,restore}()?

I thought using memalloc_nofs_save() is considered the preferred
approach by mm-folks.

						- Ted

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 11/31] ext4: Convert ext4_try_to_write_inline_data() to use a folio
  2023-01-26 20:23 ` [PATCH 11/31] ext4: Convert ext4_try_to_write_inline_data() " Matthew Wilcox (Oracle)
@ 2023-03-14 22:37   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:37 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:55PM +0000, Matthew Wilcox (Oracle) wrote:
> Saves a number of calls to compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Same comments as patch #10 -- calls to compound_head() and
memalloc_nofs_save() vs. FGP_NOFS?

					- Ted

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 14/31] ext4: Convert ext4_read_inline_page() to ext4_read_inline_folio()
  2023-01-26 20:23 ` [PATCH 14/31] ext4: Convert ext4_read_inline_page() to ext4_read_inline_folio() Matthew Wilcox (Oracle)
@ 2023-03-14 22:38   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:38 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:58PM +0000, Matthew Wilcox (Oracle) wrote:
> All callers now have a folio, so pass it and use it.  The folio may
> be large, although I doubt we'll want to use a large folio for an
> inline file.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 15/31] ext4: Convert ext4_write_inline_data_end() to use a folio
  2023-01-26 20:23 ` [PATCH 15/31] ext4: Convert ext4_write_inline_data_end() to use a folio Matthew Wilcox (Oracle)
@ 2023-03-14 22:39   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:39 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:23:59PM +0000, Matthew Wilcox (Oracle) wrote:
> Convert the incoming page to a folio so that we call compound_head()
> only once instead of seven times.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 16/31] ext4: Convert ext4_write_begin() to use a folio
  2023-01-26 20:24 ` [PATCH 16/31] ext4: Convert ext4_write_begin() " Matthew Wilcox (Oracle)
@ 2023-03-14 22:40   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:40 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:24:00PM +0000, Matthew Wilcox (Oracle) wrote:
> Remove a lot of calls to compound_head().

I'm still puzzled about how this removes a lot of calls to
compound_head().  I must be missing something...


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 17/31] ext4: Convert ext4_write_end() to use a folio
  2023-01-26 20:24 ` [PATCH 17/31] ext4: Convert ext4_write_end() " Matthew Wilcox (Oracle)
@ 2023-03-14 22:41   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:41 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:24:01PM +0000, Matthew Wilcox (Oracle) wrote:
> Convert the incoming struct page to a folio.  Replaces two implicit
> calls to compound_head() with one explicit call.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>


* Re: [PATCH 18/31] ext4: Use a folio in ext4_journalled_write_end()
  2023-01-26 20:24 ` [PATCH 18/31] ext4: Use a folio in ext4_journalled_write_end() Matthew Wilcox (Oracle)
@ 2023-03-14 22:41   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:41 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:24:02PM +0000, Matthew Wilcox (Oracle) wrote:
> Convert the incoming page to a folio to remove a few calls to
> compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>


* Re: [PATCH 19/31] ext4: Convert ext4_journalled_zero_new_buffers() to use a folio
  2023-01-26 20:24 ` [PATCH 19/31] ext4: Convert ext4_journalled_zero_new_buffers() to use a folio Matthew Wilcox (Oracle)
@ 2023-03-14 22:46   ` Theodore Ts'o
  2023-03-24  4:15     ` Matthew Wilcox
  0 siblings, 1 reply; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:46 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:24:03PM +0000, Matthew Wilcox (Oracle) wrote:
> Remove a call to compound_head().

Same question as with the other commits, plus one more: why is it notable
that calls to compound_head() are being reduced?  I've looked at the
implementation, and it doesn't look all _that_ heavyweight....



* Re: [PATCH 21/31] ext4: Convert __ext4_journalled_writepage() to take a folio
  2023-01-26 20:24 ` [PATCH 21/31] ext4: Convert __ext4_journalled_writepage() to take " Matthew Wilcox (Oracle)
@ 2023-03-14 22:47   ` Theodore Ts'o
  2023-03-24  4:55     ` Matthew Wilcox
  0 siblings, 1 reply; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:47 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:24:05PM +0000, Matthew Wilcox (Oracle) wrote:
> Use the folio APIs throughout and remove a PAGE_SIZE assumption.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

This patch should be obviated by Jan's "Cleanup data=journal writeback
path" patch series.




* Re: [PATCH 22/31] ext4: Convert ext4_page_nomap_can_writeout() to take a folio
  2023-01-26 20:24 ` [PATCH 22/31] ext4: Convert ext4_page_nomap_can_writeout() " Matthew Wilcox (Oracle)
@ 2023-03-14 22:50   ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-14 22:50 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Jan 26, 2023 at 08:24:06PM +0000, Matthew Wilcox (Oracle) wrote:
> Its one caller is already using a folio.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>


* Re: [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-03-14 22:05       ` Theodore Ts'o
@ 2023-03-14 23:12         ` Eric Biggers
  2023-03-15  2:53           ` Theodore Ts'o
  0 siblings, 1 reply; 83+ messages in thread
From: Eric Biggers @ 2023-03-14 23:12 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Matthew Wilcox, Andreas Dilger, linux-ext4, linux-fsdevel

On Tue, Mar 14, 2023 at 06:05:51PM -0400, Theodore Ts'o wrote:
> On Fri, Jan 27, 2023 at 04:13:37PM +0000, Matthew Wilcox wrote:
> > 
> > It's out of scope for _this_ patchset.  I think it's a patchset that
> > could come either before or after, and is needed to support large folios
> > with ext4.  The biggest problem with doing that conversion is that
> > bounce pages are allocated from a mempool which obviously only allocates
> > order-0 folios.  I don't know what to do about that.  Have a mempool
> > for each order of folio that the filesystem supports?  Try to allocate
> > folios without a mempool and then split the folio if allocation fails?
> > Have a mempool containing PMD-order pages and split them ourselves if
> > we need to allocate from the mempool?
> > 
> > Nothing's really standing out to me as the perfect answer.  There are
> > probably other alternatives.
> 
> Hmm.... should we have some kind of check in case a large folio is
> passed to these fscrypt functions?  (e.g., some kind of BUG_ON, or
> WARN_ON?)
> 
> Or do we just rely on people remembering that when we start trying to
> support large folios for ext4, it will probably have to be the easy
> cases first (e.g., no fscrypt, no fsverity, block size == page size)?
> 

I think large folio support for fscrypt and fsverity is not that far away.  I
already made the following changes in 6.3:

    51e4e3153ebc ("fscrypt: support decrypting data from large folios")
    5d0f0e57ed90 ("fsverity: support verifying data from large folios")

AFAICT, absent actual testing of course, the only major thing that's still
needed is that fscrypt_encrypt_pagecache_blocks() needs to support large folios.
I'm not sure how it should work, exactly.  Matthew gave a couple options.
Another option is to just continue to use bounce *pages*, and keep track of all
the bounce pages for each folio.
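
Purely to illustrate that last option (this is a hedged userspace sketch, not
the fscrypt implementation; the names and layout are invented), tracking one
order-0 bounce buffer per subpage of a large folio could look something like:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative model only: a folio of order N gets 1 << N order-0
 * bounce buffers, tracked in an array so writeback can find the
 * ciphertext for each subpage.  Not the real fscrypt code.
 */
#define PAGE_SIZE 4096

struct bounce_set {
	unsigned int nr;	/* 1 << folio order */
	void **pages;		/* one order-0 bounce buffer per subpage */
};

static struct bounce_set *bounce_set_alloc(unsigned int order)
{
	struct bounce_set *bs = malloc(sizeof(*bs));
	unsigned int i;

	if (!bs)
		return NULL;
	bs->nr = 1U << order;
	bs->pages = calloc(bs->nr, sizeof(void *));
	if (!bs->pages) {
		free(bs);
		return NULL;
	}
	for (i = 0; i < bs->nr; i++) {
		bs->pages[i] = malloc(PAGE_SIZE);
		if (!bs->pages[i]) {
			while (i--)
				free(bs->pages[i]);
			free(bs->pages);
			free(bs);
			return NULL;
		}
	}
	return bs;
}

/* "Encrypt" one subpage: a plain copy stands in for the cipher here. */
static void bounce_set_fill(struct bounce_set *bs, unsigned int idx,
			    const void *plaintext)
{
	memcpy(bs->pages[idx], plaintext, PAGE_SIZE);
}
```

The real version would presumably allocate from a mempool and stash the set in
the bio or some per-folio context, but the bookkeeping shape would be similar.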

We could certainly make fscrypt_encrypt_pagecache_blocks() WARN when given a
large folio for now, if we aren't going to update it properly anytime soon.

By the way, fscrypt_encrypt_pagecache_blocks() is only used by the fs-layer file
contents encryption, not inline encryption.  Even without changing it, we could
support large folios on encrypted files when inline encryption is being used.

(A smaller thing, which I think I missed in "fsverity: support verifying data
from large folios", is that fsverity_verify_bio() still uses
bio_first_page_all(bio)->mapping->host to get the bio's inode.  Perhaps there
needs to be a page_folio() in there for the ->mapping to be valid?)

- Eric


* Re: [PATCH 02/31] fscrypt: Add some folio helper functions
  2023-03-14 23:12         ` Eric Biggers
@ 2023-03-15  2:53           ` Theodore Ts'o
  0 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-15  2:53 UTC (permalink / raw)
  To: Eric Biggers; +Cc: Matthew Wilcox, Andreas Dilger, linux-ext4, linux-fsdevel

On Tue, Mar 14, 2023 at 04:12:39PM -0700, Eric Biggers wrote:
> 
> I think large folio support for fscrypt and fsverity is not that far away.  I
> already made the following changes in 6.3:
> 
>     51e4e3153ebc ("fscrypt: support decrypting data from large folios")
>     5d0f0e57ed90 ("fsverity: support verifying data from large folios")

Cool!  I was thinking that fscrypt and fsverity might end up lagging
as far as the large folio support was concerned, but I'm glad that
this might not be the case.

> AFAICT, absent actual testing of course, the only major thing that's still
> needed is that fscrypt_encrypt_pagecache_blocks() needs to support large folios.
> I'm not sure how it should work, exactly.  Matthew gave a couple options.
> Another option is to just continue to use bounce *pages*, and keep track of all
> the bounce pages for each folio.

We don't have to solve that right away; it is possible to support
reads of large folios, but not writes.  If someone reads in a 128k
folio, and then modifies a 4k page in the middle of it, we could
just split up the 128k folio and then write out only the single 4k
page that was modified.  (It might very well be that in that case, we
*want* to break up the folio anyway, to avoid the write amplification
problem.)

In any case, I suspect that the first way we would support large folios
for ext4 is to support using iomap for buffered I/O --- but only for
file systems where page size == block size, with no fscrypt, no
fsverity, no data=journal, and only for buffered reads.  And for
buffered writes, we'll break apart the folio and then use the existing
ext4_writepages() code path.

We can then gradually start relying on iomap and using large folios
for additional scenarios, both on the read and eventually, on the
write side.  I suspect we'll want to have a way of enabling and
disabling large folios in a fine-grained manner, as well as
potentially proactively breaking up large folios in page_mkwrite (so
that a 4k random page modification doesn't get amplified into the
entire contents of a large folio needing to be written back).

       		     	   	 	    - Ted


* Re: [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio
  2023-03-06 15:21       ` Ritesh Harjani
@ 2023-03-15  4:40         ` Matthew Wilcox
  2023-03-15 14:57           ` Ritesh Harjani
  0 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox @ 2023-03-15  4:40 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Mon, Mar 06, 2023 at 08:51:45PM +0530, Ritesh Harjani wrote:
> Matthew Wilcox <willy@infradead.org> writes:
> 
> > On Mon, Mar 06, 2023 at 12:21:48PM +0530, Ritesh Harjani wrote:
> >> "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> >>
> >> > All the callers now have a folio, so pass that in and operate on folios.
> >> > Removes four calls to compound_head().
> >>
> >> Why do you say four? Isn't it 3 calls of PageUptodate(page) which
> >> removes calls to compound_head()? Which one did I miss?
> >>
> >> > -	BUG_ON(!PageLocked(page));
> >> > +	BUG_ON(!folio_test_locked(folio));
> >
> > That one ;-)
> 
> __PAGEFLAG(Locked, locked, PF_NO_TAIL)
> 
> #define __PAGEFLAG(uname, lname, policy)				\
> 	TESTPAGEFLAG(uname, lname, policy)				\
> 	__SETPAGEFLAG(uname, lname, policy)				\
> 	__CLEARPAGEFLAG(uname, lname, policy)
> 
> #define TESTPAGEFLAG(uname, lname, policy)				\
> static __always_inline bool folio_test_##lname(struct folio *folio)	\
> { return test_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
> static __always_inline int Page##uname(struct page *page)		\
> { return test_bit(PG_##lname, &policy(page, 0)->flags); }
> 
> How? PageLocked(page) doesn't do any compound_head() calls, no?

You missed one piece of the definition ...

#define PF_NO_TAIL(page, enforce) ({                                    \
                VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
                PF_POISONED_CHECK(compound_head(page)); })
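
To make the hidden call concrete, here is a small userspace model of the
tail-page encoding (a deliberately simplified stand-in, not the kernel's real
struct page): a PF_NO_TAIL-style flag test on a tail page resolves to the
head's flags word, which is exactly the compound_head() call buried in
PageLocked().

```c
/*
 * Simplified model of the compound_head() encoding: bit 0 of the
 * compound_head word marks a tail page, and the remaining bits point
 * at the head.  A PF_NO_TAIL-style flag test always consults the
 * head's flags, so every such test pays for a compound_head() call.
 * This is a sketch; the real struct page is rather more involved.
 */
#define PG_LOCKED	0

struct page {
	unsigned long flags;
	unsigned long compound_head;	/* (head pointer | 1) if tail */
};

static struct page *compound_head(struct page *page)
{
	unsigned long head = page->compound_head;

	if (head & 1)
		return (struct page *)(head - 1);
	return page;
}

/* Model of a PF_NO_TAIL flag test: look at the head's flags word. */
static int page_locked(struct page *page)
{
	return (compound_head(page)->flags >> PG_LOCKED) & 1;
}
```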




* Re: [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio
  2023-03-15  4:40         ` Matthew Wilcox
@ 2023-03-15 14:57           ` Ritesh Harjani
  0 siblings, 0 replies; 83+ messages in thread
From: Ritesh Harjani @ 2023-03-15 14:57 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

Matthew Wilcox <willy@infradead.org> writes:

> On Mon, Mar 06, 2023 at 08:51:45PM +0530, Ritesh Harjani wrote:
>> Matthew Wilcox <willy@infradead.org> writes:
>> 
>> > On Mon, Mar 06, 2023 at 12:21:48PM +0530, Ritesh Harjani wrote:
>> >> "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
>> >>
>> >> > All the callers now have a folio, so pass that in and operate on folios.
>> >> > Removes four calls to compound_head().
>> >>
>> >> Why do you say four? Isn't it 3 calls of PageUptodate(page) which
>> >> removes calls to compound_head()? Which one did I miss?
>> >>
>> >> > -	BUG_ON(!PageLocked(page));
>> >> > +	BUG_ON(!folio_test_locked(folio));
>> >
>> > That one ;-)
>> 
>> __PAGEFLAG(Locked, locked, PF_NO_TAIL)
>> 
>> #define __PAGEFLAG(uname, lname, policy)				\
>> 	TESTPAGEFLAG(uname, lname, policy)				\
>> 	__SETPAGEFLAG(uname, lname, policy)				\
>> 	__CLEARPAGEFLAG(uname, lname, policy)
>> 
>> #define TESTPAGEFLAG(uname, lname, policy)				\
>> static __always_inline bool folio_test_##lname(struct folio *folio)	\
>> { return test_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
>> static __always_inline int Page##uname(struct page *page)		\
>> { return test_bit(PG_##lname, &policy(page, 0)->flags); }
>> 
>> How? PageLocked(page) doesn't do any compound_head() calls, no?
>
> You missed one piece of the definition ...
>
> #define PF_NO_TAIL(page, enforce) ({                                    \
>                 VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
>                 PF_POISONED_CHECK(compound_head(page)); })

aah yes, right. Thanks for pointing it.

-ritesh


* Re: [PATCH 00/31] Convert most of ext4 to folios
  2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
                   ` (30 preceding siblings ...)
  2023-01-26 20:24 ` [PATCH 31/31] ext4: Use a folio in ext4_read_merkle_tree_page Matthew Wilcox (Oracle)
@ 2023-03-15 17:57 ` Theodore Ts'o
  31 siblings, 0 replies; 83+ messages in thread
From: Theodore Ts'o @ 2023-03-15 17:57 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

I've pushed Jan's data=journal writeback cleanup patches, which, among
other things, completely eliminate ext4_writepage(), to the ext4 tree's dev
branch.  So when you rebase these patches for the next version of this
series, please base them on

https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev

Thanks!!

					- Ted


* Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios
  2023-03-06  9:10   ` Ritesh Harjani
@ 2023-03-23  3:26     ` Matthew Wilcox
  2023-03-23 14:51       ` Darrick J. Wong
  0 siblings, 1 reply; 83+ messages in thread
From: Matthew Wilcox @ 2023-03-23  3:26 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> 
> > Prepare ext4 to support large folios in the page writeback path.
> 
> Sure. I am guessing that for ext4 to completely support large folios,
> more work is required, e.g. fscrypt bounce page handling doesn't
> yet support folios, right?
> 
> Could you please give a little background on what all would be required
> to add large folio support in ext4 buffered I/O path?
> (I mean of course other than saying move ext4 to iomap ;))
> 
> What I was interested in was, what other components in particular for
> e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
> 
> And how should one go about in adding this support? So can we move
> ext4's read path to have large folio support to get started?
> Have you already identified what all is missing from this path to
> convert it?

Honestly, I don't know what else needs to be done beyond this patch
series.  I can point at some stuff and say "This doesn't work", but in
general, you have to just enable it and see what breaks.  A lot of the
buffer_head code is not large-folio safe right now, so that's somewhere
to go and look.  Or maybe we "just" convert to iomap, and never bother
fixing the bufferhead code for large folios.

> > Also set the actual error in the mapping, not just -EIO.
> 
> Right. I looked at the history and I think it always just had EIO.
> I think setting the actual err in mapping_set_error() is the right thing
> to do here.
> 
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> W.r.t. this patch series, I reviewed the mechanical changes & error paths
> which convert ext4_finish_bio() to use folios.
> 
> The changes look good to me from that perspective.  Feel free to add -
> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Thanks!


* Re: [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio
  2023-03-14 22:26     ` Theodore Ts'o
@ 2023-03-23  3:29       ` Matthew Wilcox
  0 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox @ 2023-03-23  3:29 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Ritesh Harjani, Andreas Dilger, linux-ext4, linux-fsdevel

On Tue, Mar 14, 2023 at 06:26:46PM -0400, Theodore Ts'o wrote:
> On Tue, Mar 07, 2023 at 12:15:13AM +0530, Ritesh Harjani wrote:
> > "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> > 
> > > Prepare for multi-page folios and save some instructions by converting
> > > to the folio API.
> > 
> > > Mostly a straightforward change.  The changes look good to me.
> > Please feel free to add -
> > 
> > Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > 
> > In a few later patches I see ext4_readpage converted to ext4_read_folio().
> > I think the reason we have not changed ext4_writepage() to
> > ext4_write_folio() is that we would anyway like to get rid of the
> > ->writepage op eventually, so there is no point.
> > I think there is even a patch series from Jan which tries to kill
> > ext4_writepage() completely.
> 
> Indeed, Jan's patch series[1] is about to land in the ext4 tree, and
> that's going to remove ext4_writepage().  The main reason why this
> hadn't landed yet was due to some conflicts with some other folio
> changes, so you should be able to drop this patch when you rebase this
> patch series.

Correct; in the rebase, I ended up just dropping this patch.


* Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios
  2023-03-23  3:26     ` Matthew Wilcox
@ 2023-03-23 14:51       ` Darrick J. Wong
  2023-03-23 15:30         ` Matthew Wilcox
  2023-03-27  0:57         ` Christoph Hellwig
  0 siblings, 2 replies; 83+ messages in thread
From: Darrick J. Wong @ 2023-03-23 14:51 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Ritesh Harjani, Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Mar 23, 2023 at 03:26:43AM +0000, Matthew Wilcox wrote:
> On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> > "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> > 
> > > Prepare ext4 to support large folios in the page writeback path.
> > 
> > Sure. I am guessing that for ext4 to completely support large folios,
> > more work is required, e.g. fscrypt bounce page handling doesn't
> > yet support folios, right?
> > 
> > Could you please give a little background on what all would be required
> > to add large folio support in ext4 buffered I/O path?
> > (I mean of course other than saying move ext4 to iomap ;))
> > 
> > What I was interested in was, what other components in particular for
> > e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
> > 
> > And how should one go about in adding this support? So can we move
> > ext4's read path to have large folio support to get started?
> > Have you already identified what all is missing from this path to
> > convert it?
> 
> Honestly, I don't know what else needs to be done beyond this patch
> series.  I can point at some stuff and say "This doesn't work", but in
> general, you have to just enable it and see what breaks.  A lot of the
> buffer_head code is not large-folio safe right now, so that's somewhere
> to go and look.  Or maybe we "just" convert to iomap, and never bother
> fixing the bufferhead code for large folios.

Yes.  Let's leave bufferheads in the legacy doo-doo-dooooo basement
instead of wasting more time on them.  Ideally we'd someday run all the
filesystems through:

bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
large folios -> retire to somewhere cheaper than Hawaii

--D

> > > Also set the actual error in the mapping, not just -EIO.
> > 
> > Right. I looked at the history and I think it always just had EIO.
> > I think setting the actual err in mapping_set_error() is the right thing
> > to do here.
> > 
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > 
> > W.r.t. this patch series, I reviewed the mechanical changes & error paths
> > which convert ext4_finish_bio() to use folios.
> > 
> > The changes look good to me from that perspective.  Feel free to add -
> > Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> 
> Thanks!


* Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios
  2023-03-23 14:51       ` Darrick J. Wong
@ 2023-03-23 15:30         ` Matthew Wilcox
  2023-03-27  0:58           ` Christoph Hellwig
  2023-03-27  0:57         ` Christoph Hellwig
  1 sibling, 1 reply; 83+ messages in thread
From: Matthew Wilcox @ 2023-03-23 15:30 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Ritesh Harjani, Theodore Tso, Andreas Dilger, linux-ext4, linux-fsdevel

On Thu, Mar 23, 2023 at 07:51:09AM -0700, Darrick J. Wong wrote:
> On Thu, Mar 23, 2023 at 03:26:43AM +0000, Matthew Wilcox wrote:
> > On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> > > "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> > > 
> > > > Prepare ext4 to support large folios in the page writeback path.
> > > 
> > > Sure. I am guessing that for ext4 to completely support large folios,
> > > more work is required, e.g. fscrypt bounce page handling doesn't
> > > yet support folios, right?
> > > 
> > > Could you please give a little background on what all would be required
> > > to add large folio support in ext4 buffered I/O path?
> > > (I mean of course other than saying move ext4 to iomap ;))
> > > 
> > > What I was interested in was, what other components in particular for
> > > e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
> > > 
> > > And how should one go about in adding this support? So can we move
> > > ext4's read path to have large folio support to get started?
> > > Have you already identified what all is missing from this path to
> > > convert it?
> > 
> > Honestly, I don't know what else needs to be done beyond this patch
> > series.  I can point at some stuff and say "This doesn't work", but in
> > general, you have to just enable it and see what breaks.  A lot of the
> > buffer_head code is not large-folio safe right now, so that's somewhere
> > to go and look.  Or maybe we "just" convert to iomap, and never bother
> > fixing the bufferhead code for large folios.
> 
> Yes.  Let's leave bufferheads in the legacy doo-doo-dooooo basement
> instead of wasting more time on them.  Ideally we'd someday run all the
> filesystems through:
> 
> bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
> large folios -> retire to somewhere cheaper than Hawaii

Places cheaper than Hawaii probably aren't as pretty as Hawaii though :-(

XFS is fine because it uses xfs_buf, but if we don't add support for
large folios to bufferheads, we can't support LBA size > PAGE_SIZE even
to read the superblock.  Maybe that's fine ... only filesystems which
don't use sb_bread() get to support LBA size > PAGE_SIZE.

I really want to see a cheaper abstraction for accessing the block device
than BHs.  Or xfs_buf for that matter.


* Re: [PATCH 10/31] ext4: Convert ext4_convert_inline_data_to_extent() to use a folio
  2023-03-14 22:36   ` Theodore Ts'o
@ 2023-03-23 17:14     ` Matthew Wilcox
  0 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox @ 2023-03-23 17:14 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Tue, Mar 14, 2023 at 06:36:21PM -0400, Theodore Ts'o wrote:
> On Thu, Jan 26, 2023 at 08:23:54PM +0000, Matthew Wilcox (Oracle) wrote:
> > Saves a number of calls to compound_head().
> 
> Is this left over from an earlier version of this patch series?  There
> are no changes to calls to compound_head() that I can find in this
> patch.

They're hidden.  Here are the ones from this patch:

-       if (!PageUptodate(page)) {
-               unlock_page(page);
-               put_page(page);
-               unlock_page(page);
-               put_page(page);

That's five.  I may have missed some.

> > @@ -565,10 +564,9 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
> >  
> >  	/* We cannot recurse into the filesystem as the transaction is already
> >  	 * started */
> > -	flags = memalloc_nofs_save();
> > -	page = grab_cache_page_write_begin(mapping, 0);
> > -	memalloc_nofs_restore(flags);
> > -	if (!page) {
> > +	folio = __filemap_get_folio(mapping, 0, FGP_WRITEBEGIN | FGP_NOFS,
> > +			mapping_gfp_mask(mapping));
> > +	if (!folio) {
> >  		ret = -ENOMEM;
> >  		goto out;
> >  	}
> 
> Is there a reason why to use FGP_NOFS as opposed to using
> memalloc_nofs_{save,restore}()?
> 
> I thought using memalloc_nofs_save() is considered the perferred
> approach by mm-folks.

Ideally, yes, we'd use memalloc_nofs_save(), but not like this!  The way
it's supposed to be used is at the point where you do something which
makes the fs non-reentrant, i.e. when you start the transaction, you should
be calling memalloc_nofs_save() and when you finish the transaction,
you should be calling memalloc_nofs_restore().  That way, you don't
need to adorn the entire filesystem with GFP_NOFS/FGP_NOFS/whatever,
you have one place where you mark yourself non-reentrant and you're done.

Once ext4 does this every time it starts a transaction, we can drop
the FGP_NOFS flag usage in ext4, and once every filesystem does it,
we can drop the entire flag, and that will make me happy.  It's a long
road, though.
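
For what it's worth, the pattern can be modeled in a few lines of userspace C
(the names mirror the kernel's, but the bodies and constants here are
illustrative stand-ins, not the real implementation): mark the task once where
the transaction starts, and let the allocator strip __GFP_FS itself.

```c
/*
 * Userspace model of the memalloc_nofs_save()/memalloc_nofs_restore()
 * pattern: mark the task non-reentrant around the transaction and have
 * the allocator honour the task flag, instead of threading a NOFS flag
 * through every call site.  Constants and the "current" task are
 * stand-ins, not the kernel's definitions.
 */
typedef unsigned int gfp_t;

#define __GFP_FS		0x1u
#define GFP_KERNEL		(__GFP_FS | 0x2u)
#define PF_MEMALLOC_NOFS	0x4u

static unsigned int current_flags;	/* models current->flags */

static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = current_flags & PF_MEMALLOC_NOFS;

	current_flags |= PF_MEMALLOC_NOFS;
	return old;	/* cookie so nested sections restore correctly */
}

static void memalloc_nofs_restore(unsigned int old)
{
	current_flags = (current_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* What the allocator would do: consult the task flag, not the caller. */
static gfp_t current_gfp_context(gfp_t gfp)
{
	if (current_flags & PF_MEMALLOC_NOFS)
		gfp &= ~__GFP_FS;
	return gfp;
}
```

With this shape, code between save and restore can allocate with plain
GFP_KERNEL and still never recurse into the filesystem.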


* Re: [PATCH 19/31] ext4: Convert ext4_journalled_zero_new_buffers() to use a folio
  2023-03-14 22:46   ` Theodore Ts'o
@ 2023-03-24  4:15     ` Matthew Wilcox
  0 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox @ 2023-03-24  4:15 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Tue, Mar 14, 2023 at 06:46:19PM -0400, Theodore Ts'o wrote:
> On Thu, Jan 26, 2023 at 08:24:03PM +0000, Matthew Wilcox (Oracle) wrote:
> > Remove a call to compound_head().
> 
> Same question as with the other commits, plus one more: why is it notable
> that calls to compound_head() are being reduced?  I've looked at the
> implementation, and it doesn't look all _that_ heavyweight....

It gets a lot more heavyweight when you turn on
CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP, which more and more distro
kernels are doing, because it's such a win for VM host kernels.
eg SuSE do it here:
https://github.com/SUSE/kernel-source/blob/master/config/x86_64/default
and UEK does it here:
https://github.com/oracle/linux-uek/blob/uek7/ga/uek-rpm/ol9/config-x86_64
Debian also has it enabled.

It didn't use to be so expensive, but now it's something like 50-60
bytes of text per invocation on x86 [1].  And the compiler doesn't get
to remember the result of calling compound_head() because we might have
changed page->compound_head between invocations.  It doesn't even know
that compound_head() is idempotent.

Anyway, each of these patches can be justified as "This patch shrinks
the kernel by 0.0001%".  Of course my real motivation for doing this
is to reduce the number of callers of the page APIs so we can start to
remove them and lessen the cognitive complexity of having both page &
folio APIs that parallel each other.  And I would like ext4 to support
large folios sometime soon, and it's a step towards that goal too.
But that's a lot to write out in each changelog.

[1] For example, the disassembly of unlock_page() with the UEK
config:

  c0:   f3 0f 1e fa             endbr64
  c4:   e8 00 00 00 00          call   c9 <unlock_page+0x9>
                        c5: R_X86_64_PLT32      __fentry__-0x4
  c9:   55                      push   %rbp
  ca:   48 8b 47 08             mov    0x8(%rdi),%rax
  ce:   48 89 e5                mov    %rsp,%rbp
  d1:   a8 01                   test   $0x1,%al
  d3:   75 2f                   jne    104 <unlock_page+0x44>
  d5:   eb 0b                   jmp    e2 <unlock_page+0x22>
  d7:   e8 00 00 00 00          call   dc <unlock_page+0x1c>
                        d8: R_X86_64_PLT32      folio_unlock-0x4
  dc:   5d                      pop    %rbp
  dd:   e9 00 00 00 00          jmp    e2 <unlock_page+0x22>
                        de: R_X86_64_PLT32      __x86_return_thunk-0x4
  e2:   f7 c7 ff 0f 00 00       test   $0xfff,%edi
  e8:   75 ed                   jne    d7 <unlock_page+0x17>
  ea:   48 8b 07                mov    (%rdi),%rax
  ed:   a9 00 00 01 00          test   $0x10000,%eax
  f2:   74 e3                   je     d7 <unlock_page+0x17>
  f4:   48 8b 47 48             mov    0x48(%rdi),%rax
  f8:   48 8d 50 ff             lea    -0x1(%rax),%rdx
  fc:   a8 01                   test   $0x1,%al
  fe:   48 0f 45 fa             cmovne %rdx,%rdi
 102:   eb d3                   jmp    d7 <unlock_page+0x17>
 104:   48 8d 78 ff             lea    -0x1(%rax),%rdi
 108:   e8 00 00 00 00          call   10d <unlock_page+0x4d>
                        109: R_X86_64_PLT32     folio_unlock-0x4
 10d:   5d                      pop    %rbp
 10e:   e9 00 00 00 00          jmp    113 <unlock_page+0x53>
                        10f: R_X86_64_PLT32     __x86_return_thunk-0x4
 113:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
 11a:   00 00 00 00
 11e:   66 90                   xchg   %ax,%ax

Everything between 0xd3 and 0x104 is "maybe it's a fake head".  That's 41
bytes as a minimum per callsite, and typically it's much more because we
also need the test for PageTail and the lea for the actual compound_head.


* Re: [PATCH 21/31] ext4: Convert __ext4_journalled_writepage() to take a folio
  2023-03-14 22:47   ` Theodore Ts'o
@ 2023-03-24  4:55     ` Matthew Wilcox
  0 siblings, 0 replies; 83+ messages in thread
From: Matthew Wilcox @ 2023-03-24  4:55 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Andreas Dilger, linux-ext4, linux-fsdevel

On Tue, Mar 14, 2023 at 06:47:52PM -0400, Theodore Ts'o wrote:
> On Thu, Jan 26, 2023 at 08:24:05PM +0000, Matthew Wilcox (Oracle) wrote:
> > Use the folio APIs throughout and remove a PAGE_SIZE assumption.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> This patch should be obviated by Jan's "Cleanup data=journal writeback
> path" patch series.

Yup, it's gone in the rebase.


* Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios
  2023-03-23 14:51       ` Darrick J. Wong
  2023-03-23 15:30         ` Matthew Wilcox
@ 2023-03-27  0:57         ` Christoph Hellwig
  1 sibling, 0 replies; 83+ messages in thread
From: Christoph Hellwig @ 2023-03-27  0:57 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Matthew Wilcox, Ritesh Harjani, Theodore Tso, Andreas Dilger,
	linux-ext4, linux-fsdevel

On Thu, Mar 23, 2023 at 07:51:09AM -0700, Darrick J. Wong wrote:
> Yes.  Let's leave bufferheads in the legacy doo-doo-dooooo basement
> instead of wasting more time on them.  Ideally we'd someday run all the
> filesystems through:
> 
> bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
> large folios -> retire to somewhere cheaper than Hawaii

For a lot of the legacy stuff (and with that I don't mean ext4) we'd
really need volunteers to do any work other than typo fixing and
cosmetic cleanups.  I suspect just dropping many of them is the only
thing we can do long term.

But even if we do the above for the data path for all file systems
remaining in tree, we still have buffer_heads for metadata.  And
I think buffer_heads really are more or less the right abstraction
there anyway.  For these existing file systems we also do not
care about using large folios for metadata caching.  The
only nasty part is that these buffer_heads use the same mapping
as all block device access, so we'll need to find a way to use
large folios for the block device mapping for the LBA size > page size
case, while supporting buffer heads for those file systems.  Nothing
unsolvable, but a bit tricky.


* Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios
  2023-03-23 15:30         ` Matthew Wilcox
@ 2023-03-27  0:58           ` Christoph Hellwig
  0 siblings, 0 replies; 83+ messages in thread
From: Christoph Hellwig @ 2023-03-27  0:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Darrick J. Wong, Ritesh Harjani, Theodore Tso, Andreas Dilger,
	linux-ext4, linux-fsdevel

On Thu, Mar 23, 2023 at 03:30:38PM +0000, Matthew Wilcox wrote:
> I really want to see a cheaper abstraction for accessing the block device
> than BHs.  Or xfs_buf for that matter.

You literally can just use the bdev page cache using the normal page
cache helpers.  It's not quite what most of these file systems expect,
though, especially for the block size < PAGE_SIZE case.


end of thread, other threads:[~2023-03-27  0:58 UTC | newest]

Thread overview: 83+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-26 20:23 [PATCH 00/31] Convert most of ext4 to folios Matthew Wilcox (Oracle)
2023-01-26 20:23 ` [PATCH 01/31] fs: Add FGP_WRITEBEGIN Matthew Wilcox (Oracle)
2023-03-05  8:53   ` Ritesh Harjani
2023-03-14 22:00   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 02/31] fscrypt: Add some folio helper functions Matthew Wilcox (Oracle)
2023-01-27  3:02   ` Eric Biggers
2023-01-27 16:13     ` Matthew Wilcox
2023-01-27 16:21       ` Eric Biggers
2023-01-27 16:37         ` Matthew Wilcox
2023-03-14 22:05       ` Theodore Ts'o
2023-03-14 23:12         ` Eric Biggers
2023-03-15  2:53           ` Theodore Ts'o
2023-03-05  9:06   ` Ritesh Harjani
2023-01-26 20:23 ` [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio Matthew Wilcox (Oracle)
2023-01-28 16:53   ` kernel test robot
2023-01-28 19:07   ` kernel test robot
2023-03-05 11:18   ` Ritesh Harjani
2023-03-14 22:07   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios Matthew Wilcox (Oracle)
2023-03-06  9:10   ` Ritesh Harjani
2023-03-23  3:26     ` Matthew Wilcox
2023-03-23 14:51       ` Darrick J. Wong
2023-03-23 15:30         ` Matthew Wilcox
2023-03-27  0:58           ` Christoph Hellwig
2023-03-27  0:57         ` Christoph Hellwig
2023-03-14 22:08   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio Matthew Wilcox (Oracle)
2023-03-06 18:45   ` Ritesh Harjani
2023-03-14 22:26     ` Theodore Ts'o
2023-03-23  3:29       ` Matthew Wilcox
2023-01-26 20:23 ` [PATCH 06/31] ext4: Turn mpage_process_page() into mpage_process_folio() Matthew Wilcox (Oracle)
2023-03-14 22:27   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio() Matthew Wilcox (Oracle)
2023-03-14 22:28   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 08/31] ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio() Matthew Wilcox (Oracle)
2023-03-14 22:31   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 09/31] ext4: Convert ext4_readpage_inline() to take a folio Matthew Wilcox (Oracle)
2023-03-14 22:31   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 10/31] ext4: Convert ext4_convert_inline_data_to_extent() to use " Matthew Wilcox (Oracle)
2023-03-14 22:36   ` Theodore Ts'o
2023-03-23 17:14     ` Matthew Wilcox
2023-01-26 20:23 ` [PATCH 11/31] ext4: Convert ext4_try_to_write_inline_data() " Matthew Wilcox (Oracle)
2023-03-14 22:37   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 12/31] ext4: Convert ext4_da_convert_inline_data_to_extent() " Matthew Wilcox (Oracle)
2023-01-26 20:23 ` [PATCH 13/31] ext4: Convert ext4_da_write_inline_data_begin() " Matthew Wilcox (Oracle)
2023-01-26 20:23 ` [PATCH 14/31] ext4: Convert ext4_read_inline_page() to ext4_read_inline_folio() Matthew Wilcox (Oracle)
2023-03-14 22:38   ` Theodore Ts'o
2023-01-26 20:23 ` [PATCH 15/31] ext4: Convert ext4_write_inline_data_end() to use a folio Matthew Wilcox (Oracle)
2023-03-14 22:39   ` Theodore Ts'o
2023-01-26 20:24 ` [PATCH 16/31] ext4: Convert ext4_write_begin() " Matthew Wilcox (Oracle)
2023-03-14 22:40   ` Theodore Ts'o
2023-01-26 20:24 ` [PATCH 17/31] ext4: Convert ext4_write_end() " Matthew Wilcox (Oracle)
2023-03-14 22:41   ` Theodore Ts'o
2023-01-26 20:24 ` [PATCH 18/31] ext4: Use a folio in ext4_journalled_write_end() Matthew Wilcox (Oracle)
2023-03-14 22:41   ` Theodore Ts'o
2023-01-26 20:24 ` [PATCH 19/31] ext4: Convert ext4_journalled_zero_new_buffers() to use a folio Matthew Wilcox (Oracle)
2023-03-14 22:46   ` Theodore Ts'o
2023-03-24  4:15     ` Matthew Wilcox
2023-01-26 20:24 ` [PATCH 20/31] ext4: Convert __ext4_block_zero_page_range() " Matthew Wilcox (Oracle)
2023-03-05 12:26   ` Ritesh Harjani
2023-01-26 20:24 ` [PATCH 21/31] ext4: Convert __ext4_journalled_writepage() to take " Matthew Wilcox (Oracle)
2023-03-14 22:47   ` Theodore Ts'o
2023-03-24  4:55     ` Matthew Wilcox
2023-01-26 20:24 ` [PATCH 22/31] ext4: Convert ext4_page_nomap_can_writeout() " Matthew Wilcox (Oracle)
2023-03-14 22:50   ` Theodore Ts'o
2023-01-26 20:24 ` [PATCH 23/31] ext4: Use a folio in ext4_da_write_begin() Matthew Wilcox (Oracle)
2023-01-26 20:24 ` [PATCH 24/31] ext4: Convert ext4_mpage_readpages() to work on folios Matthew Wilcox (Oracle)
2023-01-27  4:15   ` Eric Biggers
2023-01-27 16:08     ` Matthew Wilcox
2023-03-05 11:26       ` Ritesh Harjani
2023-01-26 20:24 ` [PATCH 25/31] ext4: Convert ext4_block_write_begin() to take a folio Matthew Wilcox (Oracle)
2023-03-06  6:51   ` Ritesh Harjani
2023-03-06  8:27     ` Matthew Wilcox
2023-03-06 15:21       ` Ritesh Harjani
2023-03-15  4:40         ` Matthew Wilcox
2023-03-15 14:57           ` Ritesh Harjani
2023-01-26 20:24 ` [PATCH 26/31] ext4: Convert ext4_writepage() " Matthew Wilcox (Oracle)
2023-01-26 20:24 ` [PATCH 27/31] ext4: Use a folio in ext4_page_mkwrite() Matthew Wilcox (Oracle)
2023-01-26 20:24 ` [PATCH 28/31] ext4: Use a folio iterator in __read_end_io() Matthew Wilcox (Oracle)
2023-01-26 20:24 ` [PATCH 29/31] ext4: Convert mext_page_mkuptodate() to take a folio Matthew Wilcox (Oracle)
2023-01-26 20:24 ` [PATCH 30/31] ext4: Convert pagecache_read() to use " Matthew Wilcox (Oracle)
2023-01-26 20:24 ` [PATCH 31/31] ext4: Use a folio in ext4_read_merkle_tree_page Matthew Wilcox (Oracle)
2023-03-15 17:57 ` [PATCH 00/31] Convert most of ext4 to folios Theodore Ts'o
