* [PATCH v2 00/28] iomap/xfs folio patches
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

This patchset converts XFS & iomap to use folios, and gets them to a
state where they can handle multi-page folios.  Applying these patches
is not yet sufficient to actually start using multi-page folios for
XFS; more page cache changes are needed.  I don't anticipate needing to
touch XFS again until we're at the point where we want to convert the
aops to be type-safe.  With these patches applied, XFS completes an
xfstests run with no unexpected failures.  Most of these patches have
been posted before, and I've retained acks/reviews where I thought them
reasonable.  Some patches are new.

v2:
 - Added review tags from Jens, Darrick & Christoph (thanks!)
 - Added folio_zero_* wrappers around zero_user_*()
 - Added a patch to rename AS_THP_SUPPORT
 - Added a patch to convert __block_write_begin_int() to take a folio
 - Split the iomap_add_to_ioend() patch into three
 - Updated changelog of bio_add_folio() (Jens)
 - Adjusted whitespace of bio patches (Christoph, Jens)
 - Improved changelog of readahead conversion to explain why the put_page()
   disappeared (Christoph)
 - Added a patch to zero an entire folio at a time, instead of limiting
   zeroing to a page
 - Switched pos & end_pos back to being u64 from loff_t
 - Passed the head page of the folio to block_write_end() and
   ->page_done, as that's what those functions expect

I intend to push patch 1 upstream myself (before 5.16), but I've included
it here to avoid nasty messages from the build-bots.  I can probably
persuade Linus to take patches 2-4 as well if Darrick's not comfortable
taking them as part of the iomap changes.

These changes are also available at:
  git://git.infradead.org/users/willy/pagecache.git heads/folio-iomap

I intend to rebase that branch to include any further R-b tags (some of
the patches are new and don't have reviews).

Matthew Wilcox (Oracle) (28):
  csky,sparc: Declare flush_dcache_folio()
  mm: Add functions to zero portions of a folio
  fs: Remove FS_THP_SUPPORT
  fs: Rename AS_THP_SUPPORT and mapping_thp_support
  block: Add bio_add_folio()
  block: Add bio_for_each_folio_all()
  fs/buffer: Convert __block_write_begin_int() to take a folio
  iomap: Convert to_iomap_page to take a folio
  iomap: Convert iomap_page_create to take a folio
  iomap: Convert iomap_page_release to take a folio
  iomap: Convert iomap_releasepage to use a folio
  iomap: Add iomap_invalidate_folio
  iomap: Pass the iomap_page into iomap_set_range_uptodate
  iomap: Convert bio completions to use folios
  iomap: Use folio offsets instead of page offsets
  iomap: Convert iomap_read_inline_data to take a folio
  iomap: Convert readahead and readpage to use a folio
  iomap: Convert iomap_page_mkwrite to use a folio
  iomap: Convert __iomap_zero_iter to use a folio
  iomap: Convert iomap_write_begin() and iomap_write_end() to folios
  iomap: Convert iomap_write_end_inline to take a folio
  iomap,xfs: Convert ->discard_page to ->discard_folio
  iomap: Simplify iomap_writepage_map()
  iomap: Simplify iomap_do_writepage()
  iomap: Convert iomap_add_to_ioend() to take a folio
  iomap: Convert iomap_migrate_page() to use folios
  iomap: Support multi-page folios in invalidatepage
  xfs: Support multi-page folios

 Documentation/core-api/kernel-api.rst  |   1 +
 arch/csky/abiv1/inc/abi/cacheflush.h   |   1 +
 arch/csky/abiv2/inc/abi/cacheflush.h   |   2 +
 arch/sparc/include/asm/cacheflush_32.h |   1 +
 arch/sparc/include/asm/cacheflush_64.h |   1 +
 block/bio.c                            |  22 ++
 fs/buffer.c                            |  22 +-
 fs/inode.c                             |   2 -
 fs/internal.h                          |   2 +-
 fs/iomap/buffered-io.c                 | 506 +++++++++++++------------
 fs/xfs/xfs_aops.c                      |  24 +-
 fs/xfs/xfs_icache.c                    |   2 +
 include/linux/bio.h                    |  56 ++-
 include/linux/fs.h                     |   1 -
 include/linux/highmem.h                |  44 ++-
 include/linux/iomap.h                  |   3 +-
 include/linux/pagemap.h                |  26 +-
 mm/highmem.c                           |   2 -
 mm/shmem.c                             |   3 +-
 19 files changed, 431 insertions(+), 290 deletions(-)

-- 
2.33.0



* [PATCH v2 01/28] csky,sparc: Declare flush_dcache_folio()
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

These architectures do not include asm-generic/cacheflush.h, so they
need to declare flush_dcache_folio() themselves.

Fixes: 08b0b0059bf1 ("mm: Add flush_dcache_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 arch/csky/abiv1/inc/abi/cacheflush.h   | 1 +
 arch/csky/abiv2/inc/abi/cacheflush.h   | 2 ++
 arch/sparc/include/asm/cacheflush_32.h | 1 +
 arch/sparc/include/asm/cacheflush_64.h | 1 +
 4 files changed, 5 insertions(+)

diff --git a/arch/csky/abiv1/inc/abi/cacheflush.h b/arch/csky/abiv1/inc/abi/cacheflush.h
index ed62e2066ba7..432aef1f1dc2 100644
--- a/arch/csky/abiv1/inc/abi/cacheflush.h
+++ b/arch/csky/abiv1/inc/abi/cacheflush.h
@@ -9,6 +9,7 @@
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 extern void flush_dcache_page(struct page *);
+void flush_dcache_folio(struct folio *folio);
 
 #define flush_cache_mm(mm)			dcache_wbinv_all()
 #define flush_cache_page(vma, page, pfn)	cache_wbinv_all()
diff --git a/arch/csky/abiv2/inc/abi/cacheflush.h b/arch/csky/abiv2/inc/abi/cacheflush.h
index a565e00c3f70..7e8bef60958c 100644
--- a/arch/csky/abiv2/inc/abi/cacheflush.h
+++ b/arch/csky/abiv2/inc/abi/cacheflush.h
@@ -25,6 +25,8 @@ static inline void flush_dcache_page(struct page *page)
 		clear_bit(PG_dcache_clean, &page->flags);
 }
 
+void flush_dcache_folio(struct folio *folio);
+
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 #define flush_icache_page(vma, page)		do { } while (0)
diff --git a/arch/sparc/include/asm/cacheflush_32.h b/arch/sparc/include/asm/cacheflush_32.h
index 41c6d734a474..9991c18f4980 100644
--- a/arch/sparc/include/asm/cacheflush_32.h
+++ b/arch/sparc/include/asm/cacheflush_32.h
@@ -37,6 +37,7 @@
 
 void sparc_flush_page_to_ram(struct page *page);
 
+void flush_dcache_folio(struct folio *folio);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 #define flush_dcache_page(page)			sparc_flush_page_to_ram(page)
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
diff --git a/arch/sparc/include/asm/cacheflush_64.h b/arch/sparc/include/asm/cacheflush_64.h
index b9341836597e..9ab59a73c28b 100644
--- a/arch/sparc/include/asm/cacheflush_64.h
+++ b/arch/sparc/include/asm/cacheflush_64.h
@@ -47,6 +47,7 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page);
 void __flush_dcache_range(unsigned long start, unsigned long end);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
+void flush_dcache_folio(struct folio *folio);
 
 #define flush_icache_page(vma, pg)	do { } while(0)
 
-- 
2.33.0



* [PATCH v2 02/28] mm: Add functions to zero portions of a folio
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

These functions are wrappers around zero_user_segments(), which means
that zero_user_segments() must now be available for compound pages even
when CONFIG_TRANSPARENT_HUGEPAGE is disabled, so that condition is
removed from both its declaration and its HIGHMEM implementation.
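
As a sketch of how these get used (the folio and offset here are
assumed to come from the caller), zeroing from 'offset' to the end of
a folio becomes a single call, however large the folio is:

	folio_zero_segment(folio, offset, folio_size(folio));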

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/highmem.h | 44 ++++++++++++++++++++++++++++++++++++++---
 mm/highmem.c            |  2 --
 2 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 25aff0f2ed0b..c343c69bb5b4 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -231,10 +231,10 @@ static inline void tag_clear_highpage(struct page *page)
  * If we pass in a base or tail page, we can zero up to PAGE_SIZE.
  * If we pass in a head page, we can zero up to the size of the compound page.
  */
-#if defined(CONFIG_HIGHMEM) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
+#ifdef CONFIG_HIGHMEM
 void zero_user_segments(struct page *page, unsigned start1, unsigned end1,
 		unsigned start2, unsigned end2);
-#else /* !HIGHMEM || !TRANSPARENT_HUGEPAGE */
+#else
 static inline void zero_user_segments(struct page *page,
 		unsigned start1, unsigned end1,
 		unsigned start2, unsigned end2)
@@ -254,7 +254,7 @@ static inline void zero_user_segments(struct page *page,
 	for (i = 0; i < compound_nr(page); i++)
 		flush_dcache_page(page + i);
 }
-#endif /* !HIGHMEM || !TRANSPARENT_HUGEPAGE */
+#endif
 
 static inline void zero_user_segment(struct page *page,
 	unsigned start, unsigned end)
@@ -364,4 +364,42 @@ static inline void memzero_page(struct page *page, size_t offset, size_t len)
 	kunmap_local(addr);
 }
 
+/**
+ * folio_zero_segments() - Zero two byte ranges in a folio.
+ * @folio: The folio to write to.
+ * @start1: The first byte to zero.
+ * @end1: One more than the last byte in the first range.
+ * @start2: The first byte to zero in the second range.
+ * @end2: One more than the last byte in the second range.
+ */
+static inline void folio_zero_segments(struct folio *folio,
+		size_t start1, size_t end1, size_t start2, size_t end2)
+{
+	zero_user_segments(&folio->page, start1, end1, start2, end2);
+}
+
+/**
+ * folio_zero_segment() - Zero a byte range in a folio.
+ * @folio: The folio to write to.
+ * @start: The first byte to zero.
+ * @end: One more than the last byte in the first range.
+ */
+static inline void folio_zero_segment(struct folio *folio,
+		size_t start, size_t end)
+{
+	zero_user_segments(&folio->page, start, end, 0, 0);
+}
+
+/**
+ * folio_zero_range() - Zero a byte range in a folio.
+ * @folio: The folio to write to.
+ * @start: The first byte to zero.
+ * @length: The number of bytes to zero.
+ */
+static inline void folio_zero_range(struct folio *folio,
+		size_t start, size_t length)
+{
+	zero_user_segments(&folio->page, start, start + length, 0, 0);
+}
+
 #endif /* _LINUX_HIGHMEM_H */
diff --git a/mm/highmem.c b/mm/highmem.c
index 88f65f155845..819d41140e5b 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -359,7 +359,6 @@ void kunmap_high(struct page *page)
 }
 EXPORT_SYMBOL(kunmap_high);
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 void zero_user_segments(struct page *page, unsigned start1, unsigned end1,
 		unsigned start2, unsigned end2)
 {
@@ -416,7 +415,6 @@ void zero_user_segments(struct page *page, unsigned start1, unsigned end1,
 	BUG_ON((start1 | start2 | end1 | end2) != 0);
 }
 EXPORT_SYMBOL(zero_user_segments);
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif /* CONFIG_HIGHMEM */
 
 #ifdef CONFIG_KMAP_LOCAL
-- 
2.33.0



* [PATCH v2 03/28] fs: Remove FS_THP_SUPPORT
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

Instead of setting a bit in the fs_flags that causes a bit to be set
in the address_space at inode initialisation, have the filesystem set
the bit in the address_space directly.
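
With this, a filesystem opts in from its inode constructor, as shmem
does below.  A minimal sketch (myfs_inode_init() is hypothetical):

	static void myfs_inode_init(struct inode *inode)
	{
		/* hypothetical: this file's data may be cached in
		 * multi-page folios */
		mapping_set_large_folios(inode->i_mapping);
	}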

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/inode.c              |  2 --
 include/linux/fs.h      |  1 -
 include/linux/pagemap.h | 16 ++++++++++++++++
 mm/shmem.c              |  3 ++-
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 9abc88d7959c..d6386b6d5a6e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -180,8 +180,6 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	mapping->a_ops = &empty_aops;
 	mapping->host = inode;
 	mapping->flags = 0;
-	if (sb->s_type->fs_flags & FS_THP_SUPPORT)
-		__set_bit(AS_THP_SUPPORT, &mapping->flags);
 	mapping->wb_err = 0;
 	atomic_set(&mapping->i_mmap_writable, 0);
 #ifdef CONFIG_READ_ONLY_THP_FOR_FS
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4137a9bfae7a..3c2fcabf9d12 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2518,7 +2518,6 @@ struct file_system_type {
 #define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
 #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
 #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
-#define FS_THP_SUPPORT		8192	/* Remove once all fs converted */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index db2c3e3eb1cf..471f0c422831 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -126,6 +126,22 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
 	m->gfp_mask = mask;
 }
 
+/**
+ * mapping_set_large_folios() - Indicate the file supports multi-page folios.
+ * @mapping: The file.
+ *
+ * The filesystem should call this function in its inode constructor to
+ * indicate that the VFS can use multi-page folios to cache the contents
+ * of the file.
+ *
+ * Context: This should not be called while the inode is active as it
+ * is non-atomic.
+ */
+static inline void mapping_set_large_folios(struct address_space *mapping)
+{
+	__set_bit(AS_THP_SUPPORT, &mapping->flags);
+}
+
 static inline bool mapping_thp_support(struct address_space *mapping)
 {
 	return test_bit(AS_THP_SUPPORT, &mapping->flags);
diff --git a/mm/shmem.c b/mm/shmem.c
index 23c91a8beb78..54422933fa2d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2303,6 +2303,7 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode
 		INIT_LIST_HEAD(&info->swaplist);
 		simple_xattrs_init(&info->xattrs);
 		cache_no_acl(inode);
+		mapping_set_large_folios(inode->i_mapping);
 
 		switch (mode & S_IFMT) {
 		default:
@@ -3920,7 +3921,7 @@ static struct file_system_type shmem_fs_type = {
 	.parameters	= shmem_fs_parameters,
 #endif
 	.kill_sb	= kill_litter_super,
-	.fs_flags	= FS_USERNS_MOUNT | FS_THP_SUPPORT,
+	.fs_flags	= FS_USERNS_MOUNT,
 };
 
 int __init shmem_init(void)
-- 
2.33.0



* [PATCH v2 04/28] fs: Rename AS_THP_SUPPORT and mapping_thp_support
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

These are now indicators of multi-page folio support, not THP support.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 471f0c422831..2ad10e1fd224 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -34,7 +34,7 @@ enum mapping_flags {
 	AS_EXITING	= 4, 	/* final truncate in progress */
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
-	AS_THP_SUPPORT = 6,	/* THPs supported */
+	AS_LARGE_FOLIO_SUPPORT = 6,
 };
 
 /**
@@ -139,12 +139,12 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 static inline void mapping_set_large_folios(struct address_space *mapping)
 {
-	__set_bit(AS_THP_SUPPORT, &mapping->flags);
+	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
 }
 
-static inline bool mapping_thp_support(struct address_space *mapping)
+static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
-	return test_bit(AS_THP_SUPPORT, &mapping->flags);
+	return test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
 }
 
 static inline int filemap_nr_thps(struct address_space *mapping)
@@ -159,7 +159,7 @@ static inline int filemap_nr_thps(struct address_space *mapping)
 static inline void filemap_nr_thps_inc(struct address_space *mapping)
 {
 #ifdef CONFIG_READ_ONLY_THP_FOR_FS
-	if (!mapping_thp_support(mapping))
+	if (!mapping_large_folio_support(mapping))
 		atomic_inc(&mapping->nr_thps);
 #else
 	WARN_ON_ONCE(1);
@@ -169,7 +169,7 @@ static inline void filemap_nr_thps_inc(struct address_space *mapping)
 static inline void filemap_nr_thps_dec(struct address_space *mapping)
 {
 #ifdef CONFIG_READ_ONLY_THP_FOR_FS
-	if (!mapping_thp_support(mapping))
+	if (!mapping_large_folio_support(mapping))
 		atomic_dec(&mapping->nr_thps);
 #else
 	WARN_ON_ONCE(1);
-- 
2.33.0



* [PATCH v2 05/28] block: Add bio_add_folio()
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

This is a thin wrapper around bio_add_page().  The main advantage here
is the documentation that folios of 4GiB or larger are not supported.
It's not currently possible to allocate folios that large, but if it
ever becomes possible, this function will fail gracefully instead of
doing I/O to the wrong bytes.
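
Usage mirrors bio_add_page(), but with a bool return (a sketch; bio,
folio, len and off are assumed to come from the caller):

	if (!bio_add_folio(bio, folio, len, off)) {
		/* bio was full: submit it and retry with a new bio */
	}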

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 block/bio.c         | 22 ++++++++++++++++++++++
 include/linux/bio.h |  3 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index 15ab0d6d1c06..4b3087e20d51 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1033,6 +1033,28 @@ int bio_add_page(struct bio *bio, struct page *page,
 }
 EXPORT_SYMBOL(bio_add_page);
 
+/**
+ * bio_add_folio - Attempt to add part of a folio to a bio.
+ * @bio: BIO to add to.
+ * @folio: Folio to add.
+ * @len: How many bytes from the folio to add.
+ * @off: First byte in this folio to add.
+ *
+ * Filesystems that use folios can call this function instead of calling
+ * bio_add_page() for each page in the folio.  If @off is bigger than
+ * PAGE_SIZE, this function can create a bio_vec that starts in a page
+ * after the bv_page.  BIOs do not support folios that are 4GiB or larger.
+ *
+ * Return: Whether the addition was successful.
+ */
+bool bio_add_folio(struct bio *bio, struct folio *folio, size_t len,
+		   size_t off)
+{
+	if (len > UINT_MAX || off > UINT_MAX)
+		return 0;
+	return bio_add_page(bio, &folio->page, len, off) > 0;
+}
+
 void __bio_release_pages(struct bio *bio, bool mark_dirty)
 {
 	struct bvec_iter_all iter_all;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index fe6bdfbbef66..a783cac49978 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -409,7 +409,8 @@ extern void bio_uninit(struct bio *);
 extern void bio_reset(struct bio *);
 void bio_chain(struct bio *, struct bio *);
 
-extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
+int bio_add_page(struct bio *, struct page *, unsigned len, unsigned off);
+bool bio_add_folio(struct bio *, struct folio *, size_t len, size_t off);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
 			   unsigned int, unsigned int);
 int bio_add_zone_append_page(struct bio *bio, struct page *page,
-- 
2.33.0



* [PATCH v2 06/28] block: Add bio_for_each_folio_all()
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

Allow callers to iterate over each folio instead of each page.  The
bio need not have been constructed using folios originally.
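
A completion handler can then walk the bio folio by folio (a sketch;
handle_folio() is hypothetical):

	struct folio_iter fi;

	bio_for_each_folio_all(fi, bio)
		handle_folio(fi.folio, fi.offset, fi.length);

A single bio_vec that covers more than one folio is reported as one
iteration per folio, so fi.length never crosses a folio boundary.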

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/core-api/kernel-api.rst |  1 +
 include/linux/bio.h                   | 53 ++++++++++++++++++++++++++-
 2 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst
index 2e7186805148..7f0cb604b6ab 100644
--- a/Documentation/core-api/kernel-api.rst
+++ b/Documentation/core-api/kernel-api.rst
@@ -279,6 +279,7 @@ Accounting Framework
 Block Devices
 =============
 
+.. kernel-doc:: include/linux/bio.h
 .. kernel-doc:: block/blk-core.c
    :export:
 
diff --git a/include/linux/bio.h b/include/linux/bio.h
index a783cac49978..e3c9e8207f12 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -166,7 +166,7 @@ static inline void bio_advance(struct bio *bio, unsigned int nbytes)
  */
 #define bio_for_each_bvec_all(bvl, bio, i)		\
 	for (i = 0, bvl = bio_first_bvec_all(bio);	\
-	     i < (bio)->bi_vcnt; i++, bvl++)		\
+	     i < (bio)->bi_vcnt; i++, bvl++)
 
 #define bio_iter_last(bvec, iter) ((iter).bi_size == (bvec).bv_len)
 
@@ -260,6 +260,57 @@ static inline struct bio_vec *bio_last_bvec_all(struct bio *bio)
 	return &bio->bi_io_vec[bio->bi_vcnt - 1];
 }
 
+/**
+ * struct folio_iter - State for iterating all folios in a bio.
+ * @folio: The current folio we're iterating.  NULL after the last folio.
+ * @offset: The byte offset within the current folio.
+ * @length: The number of bytes in this iteration (will not cross folio
+ *	boundary).
+ */
+struct folio_iter {
+	struct folio *folio;
+	size_t offset;
+	size_t length;
+	/* private: for use by the iterator */
+	size_t _seg_count;
+	int _i;
+};
+
+static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
+				   int i)
+{
+	struct bio_vec *bvec = bio_first_bvec_all(bio) + i;
+
+	fi->folio = page_folio(bvec->bv_page);
+	fi->offset = bvec->bv_offset +
+			PAGE_SIZE * (bvec->bv_page - &fi->folio->page);
+	fi->_seg_count = bvec->bv_len;
+	fi->length = min(folio_size(fi->folio) - fi->offset, fi->_seg_count);
+	fi->_i = i;
+}
+
+static inline void bio_next_folio(struct folio_iter *fi, struct bio *bio)
+{
+	fi->_seg_count -= fi->length;
+	if (fi->_seg_count) {
+		fi->folio = folio_next(fi->folio);
+		fi->offset = 0;
+		fi->length = min(folio_size(fi->folio), fi->_seg_count);
+	} else if (fi->_i + 1 < bio->bi_vcnt) {
+		bio_first_folio(fi, bio, fi->_i + 1);
+	} else {
+		fi->folio = NULL;
+	}
+}
+
+/**
+ * bio_for_each_folio_all - Iterate over each folio in a bio.
+ * @fi: struct folio_iter which is updated for each folio.
+ * @bio: struct bio to iterate over.
+ */
+#define bio_for_each_folio_all(fi, bio)				\
+	for (bio_first_folio(&fi, bio, 0); fi.folio; bio_next_folio(&fi, bio))
+
 enum bip_flags {
 	BIP_BLOCK_INTEGRITY	= 1 << 0, /* block layer owns integrity data */
 	BIP_MAPPED_INTEGRITY	= 1 << 1, /* ref tag has been remapped */
-- 
2.33.0



* [PATCH v2 07/28] fs/buffer: Convert __block_write_begin_int() to take a folio
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

There are no plans to convert the buffer_head infrastructure to use
multi-page folios, but __block_write_begin_int() is called from iomap,
and it's more convenient and less error-prone if we pass in a folio
from iomap.  It also saves almost 200 bytes of code by removing
repeated calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/buffer.c            | 22 +++++++++++-----------
 fs/internal.h          |  2 +-
 fs/iomap/buffered-io.c |  7 +++++--
 3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 46bc589b7a03..b1d722b26fe9 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1969,34 +1969,34 @@ iomap_to_bh(struct inode *inode, sector_t block, struct buffer_head *bh,
 	}
 }
 
-int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
+int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
 		get_block_t *get_block, const struct iomap *iomap)
 {
 	unsigned from = pos & (PAGE_SIZE - 1);
 	unsigned to = from + len;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 	unsigned block_start, block_end;
 	sector_t block;
 	int err = 0;
 	unsigned blocksize, bbits;
 	struct buffer_head *bh, *head, *wait[2], **wait_bh=wait;
 
-	BUG_ON(!PageLocked(page));
+	BUG_ON(!folio_test_locked(folio));
 	BUG_ON(from > PAGE_SIZE);
 	BUG_ON(to > PAGE_SIZE);
 	BUG_ON(from > to);
 
-	head = create_page_buffers(page, inode, 0);
+	head = create_page_buffers(&folio->page, inode, 0);
 	blocksize = head->b_size;
 	bbits = block_size_bits(blocksize);
 
-	block = (sector_t)page->index << (PAGE_SHIFT - bbits);
+	block = (sector_t)folio->index << (PAGE_SHIFT - bbits);
 
 	for(bh = head, block_start = 0; bh != head || !block_start;
 	    block++, block_start=block_end, bh = bh->b_this_page) {
 		block_end = block_start + blocksize;
 		if (block_end <= from || block_start >= to) {
-			if (PageUptodate(page)) {
+			if (folio_test_uptodate(folio)) {
 				if (!buffer_uptodate(bh))
 					set_buffer_uptodate(bh);
 			}
@@ -2016,20 +2016,20 @@ int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
 
 			if (buffer_new(bh)) {
 				clean_bdev_bh_alias(bh);
-				if (PageUptodate(page)) {
+				if (folio_test_uptodate(folio)) {
 					clear_buffer_new(bh);
 					set_buffer_uptodate(bh);
 					mark_buffer_dirty(bh);
 					continue;
 				}
 				if (block_end > to || block_start < from)
-					zero_user_segments(page,
+					folio_zero_segments(folio,
 						to, block_end,
 						block_start, from);
 				continue;
 			}
 		}
-		if (PageUptodate(page)) {
+		if (folio_test_uptodate(folio)) {
 			if (!buffer_uptodate(bh))
 				set_buffer_uptodate(bh);
 			continue; 
@@ -2050,14 +2050,14 @@ int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
 			err = -EIO;
 	}
 	if (unlikely(err))
-		page_zero_new_buffers(page, from, to);
+		page_zero_new_buffers(&folio->page, from, to);
 	return err;
 }
 
 int __block_write_begin(struct page *page, loff_t pos, unsigned len,
 		get_block_t *get_block)
 {
-	return __block_write_begin_int(page, pos, len, get_block, NULL);
+	return __block_write_begin_int(page_folio(page), pos, len, get_block, NULL);
 }
 EXPORT_SYMBOL(__block_write_begin);
 
diff --git a/fs/internal.h b/fs/internal.h
index cdd83d4899bb..afc13443392b 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -37,7 +37,7 @@ static inline int emergency_thaw_bdev(struct super_block *sb)
 /*
  * buffer.c
  */
-int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
+int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
 		get_block_t *get_block, const struct iomap *iomap);
 
 /*
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 1753c26c8e76..4e09ea823148 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -597,6 +597,7 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 	const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
 	struct page *page;
+	struct folio *folio;
 	int status = 0;
 
 	BUG_ON(pos + len > iter->iomap.offset + iter->iomap.length);
@@ -618,11 +619,12 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 		status = -ENOMEM;
 		goto out_no_page;
 	}
+	folio = page_folio(page);
 
 	if (srcmap->type == IOMAP_INLINE)
 		status = iomap_write_begin_inline(iter, page);
 	else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
-		status = __block_write_begin_int(page, pos, len, NULL, srcmap);
+		status = __block_write_begin_int(folio, pos, len, NULL, srcmap);
 	else
 		status = __iomap_write_begin(iter, pos, len, page);
 
@@ -954,11 +956,12 @@ EXPORT_SYMBOL_GPL(iomap_truncate_page);
 static loff_t iomap_page_mkwrite_iter(struct iomap_iter *iter,
 		struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	loff_t length = iomap_length(iter);
 	int ret;
 
 	if (iter->iomap.flags & IOMAP_F_BUFFER_HEAD) {
-		ret = __block_write_begin_int(page, iter->pos, length, NULL,
+		ret = __block_write_begin_int(folio, iter->pos, length, NULL,
 					      &iter->iomap);
 		if (ret)
 			return ret;
-- 
2.33.0



* [PATCH v2 08/28] iomap: Convert to_iomap_page to take a folio
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

The big comment about only passing in head pages can go away now that
to_iomap_page() takes a folio argument.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 4e09ea823148..236beeeaef42 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -22,8 +22,8 @@
 #include "../internal.h"
 
 /*
- * Structure allocated for each page or THP when block size < page size
- * to track sub-page uptodate status and I/O completions.
+ * Structure allocated for each folio when block size < folio size
+ * to track sub-folio uptodate status and I/O completions.
  */
 struct iomap_page {
 	atomic_t		read_bytes_pending;
@@ -32,17 +32,10 @@ struct iomap_page {
 	unsigned long		uptodate[];
 };
 
-static inline struct iomap_page *to_iomap_page(struct page *page)
+static inline struct iomap_page *to_iomap_page(struct folio *folio)
 {
-	/*
-	 * per-block data is stored in the head page.  Callers should
-	 * not be dealing with tail pages, and if they are, they can
-	 * call thp_head() first.
-	 */
-	VM_BUG_ON_PGFLAGS(PageTail(page), page);
-
-	if (page_has_private(page))
-		return (struct iomap_page *)page_private(page);
+	if (folio_test_private(folio))
+		return folio_get_private(folio);
 	return NULL;
 }
 
@@ -51,7 +44,8 @@ static struct bio_set iomap_ioend_bioset;
 static struct iomap_page *
 iomap_page_create(struct inode *inode, struct page *page)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	unsigned int nr_blocks = i_blocks_per_page(inode, page);
 
 	if (iop || nr_blocks <= 1)
@@ -144,7 +138,8 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 static void
 iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	struct inode *inode = page->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
@@ -173,7 +168,8 @@ static void
 iomap_read_page_end_io(struct bio_vec *bvec, int error)
 {
 	struct page *page = bvec->bv_page;
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (unlikely(error)) {
 		ClearPageUptodate(page);
@@ -427,7 +423,8 @@ int
 iomap_is_partially_uptodate(struct page *page, unsigned long from,
 		unsigned long count)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	struct inode *inode = page->mapping->host;
 	unsigned len, first, last;
 	unsigned i;
@@ -1006,7 +1003,8 @@ static void
 iomap_finish_page_writeback(struct inode *inode, struct page *page,
 		int error, unsigned int len)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (error) {
 		SetPageError(page);
-- 
2.33.0



* [PATCH v2 09/28] iomap: Convert iomap_page_create to take a folio
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

This function already assumed it was being passed a head page, so
just formalise that.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 236beeeaef42..6972ac8fda77 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -42,11 +42,10 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio)
 static struct bio_set iomap_ioend_bioset;
 
 static struct iomap_page *
-iomap_page_create(struct inode *inode, struct page *page)
+iomap_page_create(struct inode *inode, struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
-	unsigned int nr_blocks = i_blocks_per_page(inode, page);
+	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
 
 	if (iop || nr_blocks <= 1)
 		return iop;
@@ -54,9 +53,9 @@ iomap_page_create(struct inode *inode, struct page *page)
 	iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
 			GFP_NOFS | __GFP_NOFAIL);
 	spin_lock_init(&iop->uptodate_lock);
-	if (PageUptodate(page))
+	if (folio_test_uptodate(folio))
 		bitmap_fill(iop->uptodate, nr_blocks);
-	attach_page_private(page, iop);
+	folio_attach_private(folio, iop);
 	return iop;
 }
 
@@ -204,6 +203,7 @@ struct iomap_readpage_ctx {
 static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 		struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	const struct iomap *iomap = iomap_iter_srcmap(iter);
 	size_t size = i_size_read(iter->inode) - iomap->offset;
 	size_t poff = offset_in_page(iomap->offset);
@@ -220,7 +220,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 	if (WARN_ON_ONCE(size > iomap->length))
 		return -EIO;
 	if (poff > 0)
-		iomap_page_create(iter->inode, page);
+		iomap_page_create(iter->inode, folio);
 
 	addr = kmap_local_page(page) + poff;
 	memcpy(addr, iomap->inline_data, size);
@@ -247,6 +247,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	loff_t pos = iter->pos + offset;
 	loff_t length = iomap_length(iter) - offset;
 	struct page *page = ctx->cur_page;
+	struct folio *folio = page_folio(page);
 	struct iomap_page *iop;
 	loff_t orig_pos = pos;
 	unsigned poff, plen;
@@ -256,7 +257,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 		return min(iomap_read_inline_data(iter, page), length);
 
 	/* zero post-eof blocks as the page may be mapped */
-	iop = iomap_page_create(iter->inode, page);
+	iop = iomap_page_create(iter->inode, folio);
 	iomap_adjust_read_range(iter->inode, iop, &pos, length, &poff, &plen);
 	if (plen == 0)
 		goto done;
@@ -536,8 +537,9 @@ iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff,
 static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 		unsigned len, struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
-	struct iomap_page *iop = iomap_page_create(iter->inode, page);
+	struct iomap_page *iop = iomap_page_create(iter->inode, folio);
 	loff_t block_size = i_blocksize(iter->inode);
 	loff_t block_start = round_down(pos, block_size);
 	loff_t block_end = round_up(pos + len, block_size);
@@ -1290,7 +1292,8 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
 		struct page *page, u64 end_offset)
 {
-	struct iomap_page *iop = iomap_page_create(inode, page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = iomap_page_create(inode, folio);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
 	u64 file_offset; /* file offset of page */
-- 
2.33.0



* [PATCH v2 10/28] iomap: Convert iomap_page_release to take a folio
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

iomap_page_release() was also assuming that it was being passed a
head page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6972ac8fda77..ad3a16861ddc 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -59,18 +59,18 @@ iomap_page_create(struct inode *inode, struct folio *folio)
 	return iop;
 }
 
-static void
-iomap_page_release(struct page *page)
+static void iomap_page_release(struct folio *folio)
 {
-	struct iomap_page *iop = detach_page_private(page);
-	unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, page);
+	struct iomap_page *iop = folio_detach_private(folio);
+	struct inode *inode = folio->mapping->host;
+	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
 
 	if (!iop)
 		return;
 	WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending));
 	WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
 	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
-			PageUptodate(page));
+			folio_test_uptodate(folio));
 	kfree(iop);
 }
 
@@ -451,6 +451,8 @@ EXPORT_SYMBOL_GPL(iomap_is_partially_uptodate);
 int
 iomap_releasepage(struct page *page, gfp_t gfp_mask)
 {
+	struct folio *folio = page_folio(page);
+
 	trace_iomap_releasepage(page->mapping->host, page_offset(page),
 			PAGE_SIZE);
 
@@ -461,7 +463,7 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
 	 */
 	if (PageDirty(page) || PageWriteback(page))
 		return 0;
-	iomap_page_release(page);
+	iomap_page_release(folio);
 	return 1;
 }
 EXPORT_SYMBOL_GPL(iomap_releasepage);
@@ -469,6 +471,8 @@ EXPORT_SYMBOL_GPL(iomap_releasepage);
 void
 iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
 {
+	struct folio *folio = page_folio(page);
+
 	trace_iomap_invalidatepage(page->mapping->host, offset, len);
 
 	/*
@@ -478,7 +482,7 @@ iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
 	if (offset == 0 && len == PAGE_SIZE) {
 		WARN_ON_ONCE(PageWriteback(page));
 		cancel_dirty_page(page);
-		iomap_page_release(page);
+		iomap_page_release(folio);
 	}
 }
 EXPORT_SYMBOL_GPL(iomap_invalidatepage);
-- 
2.33.0



* [PATCH v2 11/28] iomap: Convert iomap_releasepage to use a folio
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

This is an address_space operation, so its argument must remain as a
struct page, but we can use a folio internally.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index ad3a16861ddc..49f96fdadcb4 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -453,15 +453,15 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
 {
 	struct folio *folio = page_folio(page);
 
-	trace_iomap_releasepage(page->mapping->host, page_offset(page),
-			PAGE_SIZE);
+	trace_iomap_releasepage(folio->mapping->host, folio_pos(folio),
+			folio_size(folio));
 
 	/*
 	 * mm accommodates an old ext3 case where clean pages might not have had
 	 * the dirty bit cleared. Thus, it can send actual dirty pages to
 	 * ->releasepage() via shrink_active_list(); skip those here.
 	 */
-	if (PageDirty(page) || PageWriteback(page))
+	if (folio_test_dirty(folio) || folio_test_writeback(folio))
 		return 0;
 	iomap_page_release(folio);
 	return 1;
-- 
2.33.0



* [PATCH v2 12/28] iomap: Add iomap_invalidate_folio
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

Keep iomap_invalidatepage around as a wrapper for use in address_space
operations.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 20 ++++++++++++--------
 include/linux/iomap.h  |  1 +
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 49f96fdadcb4..b7cbe4d202d8 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -468,23 +468,27 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
 }
 EXPORT_SYMBOL_GPL(iomap_releasepage);
 
-void
-iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
+void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
 {
-	struct folio *folio = page_folio(page);
-
-	trace_iomap_invalidatepage(page->mapping->host, offset, len);
+	trace_iomap_invalidatepage(folio->mapping->host, offset, len);
 
 	/*
 	 * If we're invalidating the entire page, clear the dirty state from it
 	 * and release it to avoid unnecessary buildup of the LRU.
 	 */
-	if (offset == 0 && len == PAGE_SIZE) {
-		WARN_ON_ONCE(PageWriteback(page));
-		cancel_dirty_page(page);
+	if (offset == 0 && len == folio_size(folio)) {
+		WARN_ON_ONCE(folio_test_writeback(folio));
+		folio_cancel_dirty(folio);
 		iomap_page_release(folio);
 	}
 }
+EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
+
+void iomap_invalidatepage(struct page *page, unsigned int offset,
+		unsigned int len)
+{
+	iomap_invalidate_folio(page_folio(page), offset, len);
+}
 EXPORT_SYMBOL_GPL(iomap_invalidatepage);
 
 #ifdef CONFIG_MIGRATION
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 6d1b08d0ae93..29491fb9c5ba 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -225,6 +225,7 @@ void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
 int iomap_is_partially_uptodate(struct page *page, unsigned long from,
 		unsigned long count);
 int iomap_releasepage(struct page *page, gfp_t gfp_mask);
+void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
 void iomap_invalidatepage(struct page *page, unsigned int offset,
 		unsigned int len);
 #ifdef CONFIG_MIGRATION
-- 
2.33.0



* [PATCH v2 13/28] iomap: Pass the iomap_page into iomap_set_range_uptodate
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

All but one caller already has the iomap_page, so we can avoid getting
it again.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index b7cbe4d202d8..03bfbafec3f4 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -134,11 +134,9 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 	*lenp = plen;
 }
 
-static void
-iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+static void iomap_iop_set_range_uptodate(struct page *page,
+		struct iomap_page *iop, unsigned off, unsigned len)
 {
-	struct folio *folio = page_folio(page);
-	struct iomap_page *iop = to_iomap_page(folio);
 	struct inode *inode = page->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
@@ -151,14 +149,14 @@ iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
 }
 
-static void
-iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+static void iomap_set_range_uptodate(struct page *page,
+		struct iomap_page *iop, unsigned off, unsigned len)
 {
 	if (PageError(page))
 		return;
 
-	if (page_has_private(page))
-		iomap_iop_set_range_uptodate(page, off, len);
+	if (iop)
+		iomap_iop_set_range_uptodate(page, iop, off, len);
 	else
 		SetPageUptodate(page);
 }
@@ -174,7 +172,8 @@ iomap_read_page_end_io(struct bio_vec *bvec, int error)
 		ClearPageUptodate(page);
 		SetPageError(page);
 	} else {
-		iomap_set_range_uptodate(page, bvec->bv_offset, bvec->bv_len);
+		iomap_set_range_uptodate(page, iop, bvec->bv_offset,
+						bvec->bv_len);
 	}
 
 	if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
@@ -204,6 +203,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 		struct page *page)
 {
 	struct folio *folio = page_folio(page);
+	struct iomap_page *iop;
 	const struct iomap *iomap = iomap_iter_srcmap(iter);
 	size_t size = i_size_read(iter->inode) - iomap->offset;
 	size_t poff = offset_in_page(iomap->offset);
@@ -220,13 +220,15 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 	if (WARN_ON_ONCE(size > iomap->length))
 		return -EIO;
 	if (poff > 0)
-		iomap_page_create(iter->inode, folio);
+		iop = iomap_page_create(iter->inode, folio);
+	else
+		iop = to_iomap_page(folio);
 
 	addr = kmap_local_page(page) + poff;
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - poff - size);
 	kunmap_local(addr);
-	iomap_set_range_uptodate(page, poff, PAGE_SIZE - poff);
+	iomap_set_range_uptodate(page, iop, poff, PAGE_SIZE - poff);
 	return PAGE_SIZE - poff;
 }
 
@@ -264,7 +266,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 
 	if (iomap_block_needs_zeroing(iter, pos)) {
 		zero_user(page, poff, plen);
-		iomap_set_range_uptodate(page, poff, plen);
+		iomap_set_range_uptodate(page, iop, poff, plen);
 		goto done;
 	}
 
@@ -578,7 +580,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 			if (status)
 				return status;
 		}
-		iomap_set_range_uptodate(page, poff, plen);
+		iomap_set_range_uptodate(page, iop, poff, plen);
 	} while ((block_start += plen) < block_end);
 
 	return 0;
@@ -655,6 +657,8 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct page *page)
 {
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	flush_dcache_page(page);
 
 	/*
@@ -670,7 +674,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 */
 	if (unlikely(copied < len && !PageUptodate(page)))
 		return 0;
-	iomap_set_range_uptodate(page, offset_in_page(pos), len);
+	iomap_set_range_uptodate(page, iop, offset_in_page(pos), len);
 	__set_page_dirty_nobuffers(page);
 	return copied;
 }
-- 
2.33.0



* [PATCH v2 14/28] iomap: Convert bio completions to use folios
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

Use bio_for_each_folio_all() to iterate over each folio in the bio
instead of iterating over each page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 50 ++++++++++++++++++------------------------
 1 file changed, 21 insertions(+), 29 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 03bfbafec3f4..bbccb031815e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -161,34 +161,29 @@ static void iomap_set_range_uptodate(struct page *page,
 		SetPageUptodate(page);
 }
 
-static void
-iomap_read_page_end_io(struct bio_vec *bvec, int error)
+static void iomap_finish_folio_read(struct folio *folio, size_t offset,
+		size_t len, int error)
 {
-	struct page *page = bvec->bv_page;
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (unlikely(error)) {
-		ClearPageUptodate(page);
-		SetPageError(page);
+		folio_clear_uptodate(folio);
+		folio_set_error(folio);
 	} else {
-		iomap_set_range_uptodate(page, iop, bvec->bv_offset,
-						bvec->bv_len);
+		iomap_set_range_uptodate(&folio->page, iop, offset, len);
 	}
 
-	if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
-		unlock_page(page);
+	if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending))
+		folio_unlock(folio);
 }
 
-static void
-iomap_read_end_io(struct bio *bio)
+static void iomap_read_end_io(struct bio *bio)
 {
 	int error = blk_status_to_errno(bio->bi_status);
-	struct bio_vec *bvec;
-	struct bvec_iter_all iter_all;
+	struct folio_iter fi;
 
-	bio_for_each_segment_all(bvec, bio, iter_all)
-		iomap_read_page_end_io(bvec, error);
+	bio_for_each_folio_all(fi, bio)
+		iomap_finish_folio_read(fi.folio, fi.offset, fi.length, error);
 	bio_put(bio);
 }
 
@@ -1013,23 +1008,21 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
 }
 EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
 
-static void
-iomap_finish_page_writeback(struct inode *inode, struct page *page,
-		int error, unsigned int len)
+static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
+		size_t len, int error)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (error) {
-		SetPageError(page);
+		folio_set_error(folio);
 		mapping_set_error(inode->i_mapping, error);
 	}
 
-	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
+	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !iop);
 	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) <= 0);
 
 	if (!iop || atomic_sub_and_test(len, &iop->write_bytes_pending))
-		end_page_writeback(page);
+		folio_end_writeback(folio);
 }
 
 /*
@@ -1048,8 +1041,7 @@ iomap_finish_ioend(struct iomap_ioend *ioend, int error)
 	bool quiet = bio_flagged(bio, BIO_QUIET);
 
 	for (bio = &ioend->io_inline_bio; bio; bio = next) {
-		struct bio_vec *bv;
-		struct bvec_iter_all iter_all;
+		struct folio_iter fi;
 
 		/*
 		 * For the last bio, bi_private points to the ioend, so we
@@ -1060,10 +1052,10 @@ iomap_finish_ioend(struct iomap_ioend *ioend, int error)
 		else
 			next = bio->bi_private;
 
-		/* walk each page on bio, ending page IO on them */
-		bio_for_each_segment_all(bv, bio, iter_all)
-			iomap_finish_page_writeback(inode, bv->bv_page, error,
-					bv->bv_len);
+		/* walk all folios in bio, ending page IO on them */
+		bio_for_each_folio_all(fi, bio)
+			iomap_finish_folio_write(inode, fi.folio, fi.length,
+					error);
 		bio_put(bio);
 	}
 	/* The ioend has been freed by bio_put() */
-- 
2.33.0



* [PATCH v2 15/28] iomap: Use folio offsets instead of page offsets
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (13 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 14/28] iomap: Convert bio completions to use folios Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 16/28] iomap: Convert iomap_read_inline_data to take a folio Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

Pass a folio around instead of the page, and make sure the offset
is relative to the start of the folio instead of the start of a page.
Also use size_t for offset & length to make it clear that these are byte
counts, and to support >2GB folios in the future.
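A quick illustration of the offset change (a sketch, not from the
patch): offset_in_page() wraps at every PAGE_SIZE boundary, while
offset_in_folio() is relative to the start of the whole folio.

#include <linux/pagemap.h>

static size_t bytes_left_in_folio(struct folio *folio, loff_t pos)
{
	/*
	 * With 4KiB pages and a 16KiB folio whose first byte is file
	 * offset 16384, pos == 20480 gives offset_in_page(pos) == 0
	 * but offset_in_folio(folio, pos) == 4096.
	 */
	size_t poff = offset_in_folio(folio, pos);

	return folio_size(folio) - poff;
}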

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 78 ++++++++++++++++++++++--------------------
 1 file changed, 40 insertions(+), 38 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index bbccb031815e..c7c4ae735620 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -75,18 +75,18 @@ static void iomap_page_release(struct folio *folio)
 }
 
 /*
- * Calculate the range inside the page that we actually need to read.
+ * Calculate the range inside the folio that we actually need to read.
  */
-static void
-iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
-		loff_t *pos, loff_t length, unsigned *offp, unsigned *lenp)
+static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
+		loff_t *pos, loff_t length, size_t *offp, size_t *lenp)
 {
+	struct iomap_page *iop = to_iomap_page(folio);
 	loff_t orig_pos = *pos;
 	loff_t isize = i_size_read(inode);
 	unsigned block_bits = inode->i_blkbits;
 	unsigned block_size = (1 << block_bits);
-	unsigned poff = offset_in_page(*pos);
-	unsigned plen = min_t(loff_t, PAGE_SIZE - poff, length);
+	size_t poff = offset_in_folio(folio, *pos);
+	size_t plen = min_t(loff_t, folio_size(folio) - poff, length);
 	unsigned first = poff >> block_bits;
 	unsigned last = (poff + plen - 1) >> block_bits;
 
@@ -124,7 +124,7 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 	 * page cache for blocks that are entirely outside of i_size.
 	 */
 	if (orig_pos <= isize && orig_pos + length > isize) {
-		unsigned end = offset_in_page(isize - 1) >> block_bits;
+		unsigned end = offset_in_folio(folio, isize - 1) >> block_bits;
 
 		if (first <= end && last > end)
 			plen -= (last - end) * block_size;
@@ -134,31 +134,31 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 	*lenp = plen;
 }
 
-static void iomap_iop_set_range_uptodate(struct page *page,
-		struct iomap_page *iop, unsigned off, unsigned len)
+static void iomap_iop_set_range_uptodate(struct folio *folio,
+		struct iomap_page *iop, size_t off, size_t len)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
 	unsigned long flags;
 
 	spin_lock_irqsave(&iop->uptodate_lock, flags);
 	bitmap_set(iop->uptodate, first, last - first + 1);
-	if (bitmap_full(iop->uptodate, i_blocks_per_page(inode, page)))
-		SetPageUptodate(page);
+	if (bitmap_full(iop->uptodate, i_blocks_per_folio(inode, folio)))
+		folio_mark_uptodate(folio);
 	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
 }
 
-static void iomap_set_range_uptodate(struct page *page,
-		struct iomap_page *iop, unsigned off, unsigned len)
+static void iomap_set_range_uptodate(struct folio *folio,
+		struct iomap_page *iop, size_t off, size_t len)
 {
-	if (PageError(page))
+	if (folio_test_error(folio))
 		return;
 
 	if (iop)
-		iomap_iop_set_range_uptodate(page, iop, off, len);
+		iomap_iop_set_range_uptodate(folio, iop, off, len);
 	else
-		SetPageUptodate(page);
+		folio_mark_uptodate(folio);
 }
 
 static void iomap_finish_folio_read(struct folio *folio, size_t offset,
@@ -170,7 +170,7 @@ static void iomap_finish_folio_read(struct folio *folio, size_t offset,
 		folio_clear_uptodate(folio);
 		folio_set_error(folio);
 	} else {
-		iomap_set_range_uptodate(&folio->page, iop, offset, len);
+		iomap_set_range_uptodate(folio, iop, offset, len);
 	}
 
 	if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending))
@@ -202,6 +202,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 	const struct iomap *iomap = iomap_iter_srcmap(iter);
 	size_t size = i_size_read(iter->inode) - iomap->offset;
 	size_t poff = offset_in_page(iomap->offset);
+	size_t offset = offset_in_folio(folio, iomap->offset);
 	void *addr;
 
 	if (PageUptodate(page))
@@ -214,7 +215,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 		return -EIO;
 	if (WARN_ON_ONCE(size > iomap->length))
 		return -EIO;
-	if (poff > 0)
+	if (offset > 0)
 		iop = iomap_page_create(iter->inode, folio);
 	else
 		iop = to_iomap_page(folio);
@@ -223,7 +224,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - poff - size);
 	kunmap_local(addr);
-	iomap_set_range_uptodate(page, iop, poff, PAGE_SIZE - poff);
+	iomap_set_range_uptodate(folio, iop, offset, PAGE_SIZE - poff);
 	return PAGE_SIZE - poff;
 }
 
@@ -247,7 +248,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	struct folio *folio = page_folio(page);
 	struct iomap_page *iop;
 	loff_t orig_pos = pos;
-	unsigned poff, plen;
+	size_t poff, plen;
 	sector_t sector;
 
 	if (iomap->type == IOMAP_INLINE)
@@ -255,13 +256,13 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 
 	/* zero post-eof blocks as the page may be mapped */
 	iop = iomap_page_create(iter->inode, folio);
-	iomap_adjust_read_range(iter->inode, iop, &pos, length, &poff, &plen);
+	iomap_adjust_read_range(iter->inode, folio, &pos, length, &poff, &plen);
 	if (plen == 0)
 		goto done;
 
 	if (iomap_block_needs_zeroing(iter, pos)) {
-		zero_user(page, poff, plen);
-		iomap_set_range_uptodate(page, iop, poff, plen);
+		folio_zero_range(folio, poff, plen);
+		iomap_set_range_uptodate(folio, iop, poff, plen);
 		goto done;
 	}
 
@@ -272,7 +273,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	sector = iomap_sector(iomap, pos);
 	if (!ctx->bio ||
 	    bio_end_sector(ctx->bio) != sector ||
-	    bio_add_page(ctx->bio, page, plen, poff) != plen) {
+	    !bio_add_folio(ctx->bio, folio, plen, poff)) {
 		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
 		gfp_t orig_gfp = gfp;
 		unsigned int nr_vecs = DIV_ROUND_UP(length, PAGE_SIZE);
@@ -296,8 +297,9 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 		ctx->bio->bi_iter.bi_sector = sector;
 		bio_set_dev(ctx->bio, iomap->bdev);
 		ctx->bio->bi_end_io = iomap_read_end_io;
-		__bio_add_page(ctx->bio, page, plen, poff);
+		bio_add_folio(ctx->bio, folio, plen, poff);
 	}
+
 done:
 	/*
 	 * Move the caller beyond our range so that it keeps making progress.
@@ -524,9 +526,8 @@ iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 		truncate_pagecache_range(inode, max(pos, i_size), pos + len);
 }
 
-static int
-iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff,
-		unsigned plen, const struct iomap *iomap)
+static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
+		size_t poff, size_t plen, const struct iomap *iomap)
 {
 	struct bio_vec bvec;
 	struct bio bio;
@@ -535,7 +536,7 @@ iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff,
 	bio.bi_opf = REQ_OP_READ;
 	bio.bi_iter.bi_sector = iomap_sector(iomap, block_start);
 	bio_set_dev(&bio, iomap->bdev);
-	__bio_add_page(&bio, page, plen, poff);
+	bio_add_folio(&bio, folio, plen, poff);
 	return submit_bio_wait(&bio);
 }
 
@@ -548,14 +549,15 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 	loff_t block_size = i_blocksize(iter->inode);
 	loff_t block_start = round_down(pos, block_size);
 	loff_t block_end = round_up(pos + len, block_size);
-	unsigned from = offset_in_page(pos), to = from + len, poff, plen;
+	size_t from = offset_in_folio(folio, pos), to = from + len;
+	size_t poff, plen;
 
-	if (PageUptodate(page))
+	if (folio_test_uptodate(folio))
 		return 0;
-	ClearPageError(page);
+	folio_clear_error(folio);
 
 	do {
-		iomap_adjust_read_range(iter->inode, iop, &block_start,
+		iomap_adjust_read_range(iter->inode, folio, &block_start,
 				block_end - block_start, &poff, &plen);
 		if (plen == 0)
 			break;
@@ -568,14 +570,14 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 		if (iomap_block_needs_zeroing(iter, block_start)) {
 			if (WARN_ON_ONCE(iter->flags & IOMAP_UNSHARE))
 				return -EIO;
-			zero_user_segments(page, poff, from, to, poff + plen);
+			folio_zero_segments(folio, poff, from, to, poff + plen);
 		} else {
-			int status = iomap_read_page_sync(block_start, page,
+			int status = iomap_read_folio_sync(block_start, folio,
 					poff, plen, srcmap);
 			if (status)
 				return status;
 		}
-		iomap_set_range_uptodate(page, iop, poff, plen);
+		iomap_set_range_uptodate(folio, iop, poff, plen);
 	} while ((block_start += plen) < block_end);
 
 	return 0;
@@ -669,7 +671,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 */
 	if (unlikely(copied < len && !PageUptodate(page)))
 		return 0;
-	iomap_set_range_uptodate(page, iop, offset_in_page(pos), len);
+	iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
 	__set_page_dirty_nobuffers(page);
 	return copied;
 }
-- 
2.33.0



* [PATCH v2 16/28] iomap: Convert iomap_read_inline_data to take a folio
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (14 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 15/28] iomap: Use folio offsets instead of page offsets Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 17/28] iomap: Convert readahead and readpage to use " Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

We still only support up to a single page of inline data (at least,
per call to iomap_read_inline_data()), but it can now be written into
the middle of a folio in case we decide to allocate a 16KiB folio for
a file that's 8.1KiB in size.
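The mapping pattern, as a minimal sketch (the helper is illustrative,
not part of the patch): kmap_local_folio() takes the folio-relative
byte offset directly, so the data can land beyond the first page of
the folio, as long as the copy itself stays within one page, which
the single-page bound above guarantees.

#include <linux/highmem.h>
#include <linux/string.h>

/* illustrative helper, not part of the patch */
static void copy_inline_to_folio(struct folio *folio, size_t offset,
		const void *src, size_t size)
{
	/* offset is relative to the folio, not to any single page */
	void *addr = kmap_local_folio(folio, offset);

	memcpy(addr, src, size);
	kunmap_local(addr);
}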

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c7c4ae735620..96a404f11a3b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -195,9 +195,8 @@ struct iomap_readpage_ctx {
 };
 
 static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
-		struct page *page)
+		struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop;
 	const struct iomap *iomap = iomap_iter_srcmap(iter);
 	size_t size = i_size_read(iter->inode) - iomap->offset;
@@ -205,7 +204,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 	size_t offset = offset_in_folio(folio, iomap->offset);
 	void *addr;
 
-	if (PageUptodate(page))
+	if (folio_test_uptodate(folio))
 		return PAGE_SIZE - poff;
 
 	if (WARN_ON_ONCE(size > PAGE_SIZE - poff))
@@ -220,7 +219,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
 	else
 		iop = to_iomap_page(folio);
 
-	addr = kmap_local_page(page) + poff;
+	addr = kmap_local_folio(folio, offset);
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - poff - size);
 	kunmap_local(addr);
@@ -252,7 +251,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	sector_t sector;
 
 	if (iomap->type == IOMAP_INLINE)
-		return min(iomap_read_inline_data(iter, page), length);
+		return min(iomap_read_inline_data(iter, folio), length);
 
 	/* zero post-eof blocks as the page may be mapped */
 	iop = iomap_page_create(iter->inode, folio);
@@ -586,12 +585,13 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 static int iomap_write_begin_inline(const struct iomap_iter *iter,
 		struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	int ret;
 
 	/* needs more work for the tailpacking case; disable for now */
 	if (WARN_ON_ONCE(iomap_iter_srcmap(iter)->offset != 0))
 		return -EIO;
-	ret = iomap_read_inline_data(iter, page);
+	ret = iomap_read_inline_data(iter, folio);
 	if (ret < 0)
 		return ret;
 	return 0;
-- 
2.33.0



* [PATCH v2 17/28] iomap: Convert readahead and readpage to use a folio
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 16/28] iomap: Convert iomap_read_inline_data to take a folio Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-09  8:43   ` Christoph Hellwig
  2021-11-08  4:05 ` [PATCH v2 18/28] iomap: Convert iomap_page_mkwrite " Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  27 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

Handle folios of arbitrary size instead of working in PAGE_SIZE units.
readahead_folio() decreases the page refcount for you, so this is not
quite a mechanical change.
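The non-mechanical part, sketched (the consumer function is
hypothetical): readahead_folio() both returns the next folio and
drops the reference that readahead held, so there is no matching put.

#include <linux/pagemap.h>

static void example_consume_readahead(struct readahead_control *rac)
{
	struct folio *folio;

	while ((folio = readahead_folio(rac)) != NULL) {
		/*
		 * No folio_put() here: readahead_folio() already
		 * dropped the readahead reference for us.
		 */
		start_read(folio);	/* hypothetical per-folio I/O */
	}
}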

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 53 +++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 27 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 96a404f11a3b..b0b402e1779e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -188,8 +188,8 @@ static void iomap_read_end_io(struct bio *bio)
 }
 
 struct iomap_readpage_ctx {
-	struct page		*cur_page;
-	bool			cur_page_in_bio;
+	struct folio		*cur_folio;
+	bool			cur_folio_in_bio;
 	struct bio		*bio;
 	struct readahead_control *rac;
 };
@@ -243,8 +243,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	const struct iomap *iomap = &iter->iomap;
 	loff_t pos = iter->pos + offset;
 	loff_t length = iomap_length(iter) - offset;
-	struct page *page = ctx->cur_page;
-	struct folio *folio = page_folio(page);
+	struct folio *folio = ctx->cur_folio;
 	struct iomap_page *iop;
 	loff_t orig_pos = pos;
 	size_t poff, plen;
@@ -265,7 +264,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 		goto done;
 	}
 
-	ctx->cur_page_in_bio = true;
+	ctx->cur_folio_in_bio = true;
 	if (iop)
 		atomic_add(plen, &iop->read_bytes_pending);
 
@@ -273,7 +272,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	if (!ctx->bio ||
 	    bio_end_sector(ctx->bio) != sector ||
 	    !bio_add_folio(ctx->bio, folio, plen, poff)) {
-		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
+		gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL);
 		gfp_t orig_gfp = gfp;
 		unsigned int nr_vecs = DIV_ROUND_UP(length, PAGE_SIZE);
 
@@ -312,30 +311,31 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 int
 iomap_readpage(struct page *page, const struct iomap_ops *ops)
 {
+	struct folio *folio = page_folio(page);
 	struct iomap_iter iter = {
-		.inode		= page->mapping->host,
-		.pos		= page_offset(page),
-		.len		= PAGE_SIZE,
+		.inode		= folio->mapping->host,
+		.pos		= folio_pos(folio),
+		.len		= folio_size(folio),
 	};
 	struct iomap_readpage_ctx ctx = {
-		.cur_page	= page,
+		.cur_folio	= folio,
 	};
 	int ret;
 
-	trace_iomap_readpage(page->mapping->host, 1);
+	trace_iomap_readpage(iter.inode, 1);
 
 	while ((ret = iomap_iter(&iter, ops)) > 0)
 		iter.processed = iomap_readpage_iter(&iter, &ctx, 0);
 
 	if (ret < 0)
-		SetPageError(page);
+		folio_set_error(folio);
 
 	if (ctx.bio) {
 		submit_bio(ctx.bio);
-		WARN_ON_ONCE(!ctx.cur_page_in_bio);
+		WARN_ON_ONCE(!ctx.cur_folio_in_bio);
 	} else {
-		WARN_ON_ONCE(ctx.cur_page_in_bio);
-		unlock_page(page);
+		WARN_ON_ONCE(ctx.cur_folio_in_bio);
+		folio_unlock(folio);
 	}
 
 	/*
@@ -354,15 +354,15 @@ static loff_t iomap_readahead_iter(const struct iomap_iter *iter,
 	loff_t done, ret;
 
 	for (done = 0; done < length; done += ret) {
-		if (ctx->cur_page && offset_in_page(iter->pos + done) == 0) {
-			if (!ctx->cur_page_in_bio)
-				unlock_page(ctx->cur_page);
-			put_page(ctx->cur_page);
-			ctx->cur_page = NULL;
+		if (ctx->cur_folio &&
+		    offset_in_folio(ctx->cur_folio, iter->pos + done) == 0) {
+			if (!ctx->cur_folio_in_bio)
+				folio_unlock(ctx->cur_folio);
+			ctx->cur_folio = NULL;
 		}
-		if (!ctx->cur_page) {
-			ctx->cur_page = readahead_page(ctx->rac);
-			ctx->cur_page_in_bio = false;
+		if (!ctx->cur_folio) {
+			ctx->cur_folio = readahead_folio(ctx->rac);
+			ctx->cur_folio_in_bio = false;
 		}
 		ret = iomap_readpage_iter(iter, ctx, done);
 	}
@@ -403,10 +403,9 @@ void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops)
 
 	if (ctx.bio)
 		submit_bio(ctx.bio);
-	if (ctx.cur_page) {
-		if (!ctx.cur_page_in_bio)
-			unlock_page(ctx.cur_page);
-		put_page(ctx.cur_page);
+	if (ctx.cur_folio) {
+		if (!ctx.cur_folio_in_bio)
+			folio_unlock(ctx.cur_folio);
 	}
 }
 EXPORT_SYMBOL_GPL(iomap_readahead);
-- 
2.33.0



* [PATCH v2 18/28] iomap: Convert iomap_page_mkwrite to use a folio
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (16 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 17/28] iomap: Convert readahead and readpage to use " Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 19/28] iomap: Convert __iomap_zero_iter " Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

If we write to any page in a folio, we have to mark the entire
folio as dirty, and potentially COW the entire folio, because it'll
all get written back as one unit.
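Condensed into a sketch (error handling trimmed; not a drop-in
replacement for the diff below): the fault path now locks, checks and
dirties at folio granularity, so one faulting page dirties its whole
folio.

#include <linux/pagemap.h>
#include <linux/buffer_head.h>

static vm_fault_t example_mkwrite(struct vm_fault *vmf, struct inode *inode)
{
	struct folio *folio = page_folio(vmf->page);
	ssize_t ret;

	folio_lock(folio);
	ret = folio_mkwrite_check_truncate(folio, inode);
	if (ret < 0) {
		folio_unlock(folio);
		return block_page_mkwrite_return(ret);
	}
	/* The whole folio is written back as one unit, so dirty it all. */
	folio_mark_dirty(folio);
	folio_wait_stable(folio);
	return VM_FAULT_LOCKED;	/* folio stays locked on success */
}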

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index b0b402e1779e..64e54981b651 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -960,10 +960,9 @@ iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
 }
 EXPORT_SYMBOL_GPL(iomap_truncate_page);
 
-static loff_t iomap_page_mkwrite_iter(struct iomap_iter *iter,
-		struct page *page)
+static loff_t iomap_folio_mkwrite_iter(struct iomap_iter *iter,
+		struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	loff_t length = iomap_length(iter);
 	int ret;
 
@@ -972,10 +971,10 @@ static loff_t iomap_page_mkwrite_iter(struct iomap_iter *iter,
 					      &iter->iomap);
 		if (ret)
 			return ret;
-		block_commit_write(page, 0, length);
+		block_commit_write(&folio->page, 0, length);
 	} else {
-		WARN_ON_ONCE(!PageUptodate(page));
-		set_page_dirty(page);
+		WARN_ON_ONCE(!folio_test_uptodate(folio));
+		folio_mark_dirty(folio);
 	}
 
 	return length;
@@ -987,24 +986,24 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
 		.inode		= file_inode(vmf->vma->vm_file),
 		.flags		= IOMAP_WRITE | IOMAP_FAULT,
 	};
-	struct page *page = vmf->page;
+	struct folio *folio = page_folio(vmf->page);
 	ssize_t ret;
 
-	lock_page(page);
-	ret = page_mkwrite_check_truncate(page, iter.inode);
+	folio_lock(folio);
+	ret = folio_mkwrite_check_truncate(folio, iter.inode);
 	if (ret < 0)
 		goto out_unlock;
-	iter.pos = page_offset(page);
+	iter.pos = folio_pos(folio);
 	iter.len = ret;
 	while ((ret = iomap_iter(&iter, ops)) > 0)
-		iter.processed = iomap_page_mkwrite_iter(&iter, page);
+		iter.processed = iomap_folio_mkwrite_iter(&iter, folio);
 
 	if (ret < 0)
 		goto out_unlock;
-	wait_for_stable_page(page);
+	folio_wait_stable(folio);
 	return VM_FAULT_LOCKED;
 out_unlock:
-	unlock_page(page);
+	folio_unlock(folio);
 	return block_page_mkwrite_return(ret);
 }
 EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
-- 
2.33.0



* [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (17 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 18/28] iomap: Convert iomap_page_mkwrite " Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-09  8:47   ` Christoph Hellwig
                     ` (2 more replies)
  2021-11-08  4:05 ` [PATCH v2 20/28] iomap: Convert iomap_write_begin() and iomap_write_end() to folios Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  27 siblings, 3 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

The zero iterator can work in folio-sized chunks instead of page-sized
chunks.  This will save a lot of page cache lookups if the file is cached
in multi-page folios.
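The chunking logic in isolation (a sketch; the surrounding
iomap_write_begin()/iomap_write_end() calls are elided): clamp the
range to what remains of the folio, then zero it with a single call.

#include <linux/pagemap.h>
#include <linux/highmem.h>
#include <linux/swap.h>

static size_t zero_folio_chunk(struct folio *folio, loff_t pos, u64 length)
{
	size_t offset = offset_in_folio(folio, pos);
	size_t bytes = min_t(u64, folio_size(folio) - offset, length);

	/* One call can zero a whole folio, not just PAGE_SIZE bytes. */
	folio_zero_range(folio, offset, bytes);
	folio_mark_accessed(folio);
	return bytes;
}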

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 64e54981b651..9c61d12028ca 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -881,17 +881,20 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
 
 static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
 {
+	struct folio *folio;
 	struct page *page;
 	int status;
-	unsigned offset = offset_in_page(pos);
-	unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
+	size_t offset, bytes;
 
-	status = iomap_write_begin(iter, pos, bytes, &page);
+	status = iomap_write_begin(iter, pos, length, &page);
 	if (status)
 		return status;
+	folio = page_folio(page);
 
-	zero_user(page, offset, bytes);
-	mark_page_accessed(page);
+	offset = offset_in_folio(folio, pos);
+	bytes = min_t(u64, folio_size(folio) - offset, length);
+	folio_zero_range(folio, offset, bytes);
+	folio_mark_accessed(folio);
 
 	return iomap_write_end(iter, pos, bytes, bytes, page);
 }
-- 
2.33.0



* [PATCH v2 20/28] iomap: Convert iomap_write_begin() and iomap_write_end() to folios
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (18 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 19/28] iomap: Convert __iomap_zero_iter " Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-17  4:31   ` Darrick J. Wong
  2021-11-08  4:05 ` [PATCH v2 21/28] iomap: Convert iomap_write_end_inline to take a folio Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  27 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

These functions still only work in PAGE_SIZE chunks, but there are
fewer conversions from tail to head pages as a result of this patch.
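The resulting calling convention from a caller's point of view (an
illustrative sketch modelled on iomap_unshare_iter() in the diff
below; not a new API):

#include <linux/iomap.h>

static ssize_t example_touch_range(struct iomap_iter *iter, loff_t pos,
		size_t bytes)
{
	struct folio *folio;
	int status;

	status = iomap_write_begin(iter, pos, bytes, &folio);
	if (status)
		return status;

	/* modify the folio contents between begin and end */

	/* iomap_write_end() unlocks and releases the folio for us. */
	return iomap_write_end(iter, pos, bytes, bytes, folio);
}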

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 66 ++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 35 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 9c61d12028ca..f4ae200adc4c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -539,9 +539,8 @@ static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
 }
 
 static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
-		unsigned len, struct page *page)
+		size_t len, struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
 	struct iomap_page *iop = iomap_page_create(iter->inode, folio);
 	loff_t block_size = i_blocksize(iter->inode);
@@ -582,9 +581,8 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 }
 
 static int iomap_write_begin_inline(const struct iomap_iter *iter,
-		struct page *page)
+		struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	int ret;
 
 	/* needs more work for the tailpacking case; disable for now */
@@ -597,12 +595,12 @@ static int iomap_write_begin_inline(const struct iomap_iter *iter,
 }
 
 static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
-		unsigned len, struct page **pagep)
+		size_t len, struct folio **foliop)
 {
 	const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
-	struct page *page;
 	struct folio *folio;
+	unsigned fgp = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE | FGP_NOFS;
 	int status = 0;
 
 	BUG_ON(pos + len > iter->iomap.offset + iter->iomap.length);
@@ -618,30 +616,29 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 			return status;
 	}
 
-	page = grab_cache_page_write_begin(iter->inode->i_mapping,
-				pos >> PAGE_SHIFT, AOP_FLAG_NOFS);
-	if (!page) {
+	folio = __filemap_get_folio(iter->inode->i_mapping, pos >> PAGE_SHIFT,
+			fgp, mapping_gfp_mask(iter->inode->i_mapping));
+	if (!folio) {
 		status = -ENOMEM;
 		goto out_no_page;
 	}
-	folio = page_folio(page);
 
 	if (srcmap->type == IOMAP_INLINE)
-		status = iomap_write_begin_inline(iter, page);
+		status = iomap_write_begin_inline(iter, folio);
 	else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
 		status = __block_write_begin_int(folio, pos, len, NULL, srcmap);
 	else
-		status = __iomap_write_begin(iter, pos, len, page);
+		status = __iomap_write_begin(iter, pos, len, folio);
 
 	if (unlikely(status))
 		goto out_unlock;
 
-	*pagep = page;
+	*foliop = folio;
 	return 0;
 
 out_unlock:
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 	iomap_write_failed(iter->inode, pos, len);
 
 out_no_page:
@@ -651,11 +648,10 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 }
 
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
-		size_t copied, struct page *page)
+		size_t copied, struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
-	flush_dcache_page(page);
+	flush_dcache_folio(folio);
 
 	/*
 	 * The blocks that were entirely written will now be uptodate, so we
@@ -668,10 +664,10 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 * non-uptodate page as a zero-length write, and force the caller to
 	 * redo the whole thing.
 	 */
-	if (unlikely(copied < len && !PageUptodate(page)))
+	if (unlikely(copied < len && !folio_test_uptodate(folio)))
 		return 0;
 	iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
-	__set_page_dirty_nobuffers(page);
+	filemap_dirty_folio(inode->i_mapping, folio);
 	return copied;
 }
 
@@ -695,7 +691,7 @@ static size_t iomap_write_end_inline(const struct iomap_iter *iter,
 
 /* Returns the number of bytes copied.  May be 0.  Cannot be an errno. */
 static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
-		size_t copied, struct page *page)
+		size_t copied, struct folio *folio)
 {
 	const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
@@ -706,9 +702,9 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
 		ret = iomap_write_end_inline(iter, page, pos, copied);
 	} else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
 		ret = block_write_end(NULL, iter->inode->i_mapping, pos, len,
-				copied, page, NULL);
+				copied, &folio->page, NULL);
 	} else {
-		ret = __iomap_write_end(iter->inode, pos, len, copied, page);
+		ret = __iomap_write_end(iter->inode, pos, len, copied, folio);
 	}
 
 	/*
@@ -720,13 +716,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
 		i_size_write(iter->inode, pos + ret);
 		iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
 	}
-	unlock_page(page);
+	folio_unlock(folio);
 
 	if (old_size < pos)
 		pagecache_isize_extended(iter->inode, old_size, pos);
 	if (page_ops && page_ops->page_done)
-		page_ops->page_done(iter->inode, pos, ret, page);
-	put_page(page);
+		page_ops->page_done(iter->inode, pos, ret, &folio->page);
+	folio_put(folio);
 
 	if (ret < len)
 		iomap_write_failed(iter->inode, pos, len);
@@ -741,6 +737,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 	long status = 0;
 
 	do {
+		struct folio *folio;
 		struct page *page;
 		unsigned long offset;	/* Offset into pagecache page */
 		unsigned long bytes;	/* Bytes to write to page */
@@ -764,16 +761,17 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 			break;
 		}
 
-		status = iomap_write_begin(iter, pos, bytes, &page);
+		status = iomap_write_begin(iter, pos, bytes, &folio);
 		if (unlikely(status))
 			break;
 
+		page = folio_file_page(folio, pos >> PAGE_SHIFT);
 		if (mapping_writably_mapped(iter->inode->i_mapping))
 			flush_dcache_page(page);
 
 		copied = copy_page_from_iter_atomic(page, offset, bytes, i);
 
-		status = iomap_write_end(iter, pos, bytes, copied, page);
+		status = iomap_write_end(iter, pos, bytes, copied, folio);
 
 		if (unlikely(copied != status))
 			iov_iter_revert(i, copied - status);
@@ -839,13 +837,13 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
 	do {
 		unsigned long offset = offset_in_page(pos);
 		unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length);
-		struct page *page;
+		struct folio *folio;
 
-		status = iomap_write_begin(iter, pos, bytes, &page);
+		status = iomap_write_begin(iter, pos, bytes, &folio);
 		if (unlikely(status))
 			return status;
 
-		status = iomap_write_end(iter, pos, bytes, bytes, page);
+		status = iomap_write_end(iter, pos, bytes, bytes, folio);
 		if (WARN_ON_ONCE(status == 0))
 			return -EIO;
 
@@ -882,21 +880,19 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
 static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
 {
 	struct folio *folio;
-	struct page *page;
 	int status;
 	size_t offset, bytes;
 
-	status = iomap_write_begin(iter, pos, length, &page);
+	status = iomap_write_begin(iter, pos, length, &folio);
 	if (status)
 		return status;
-	folio = page_folio(page);
 
 	offset = offset_in_folio(folio, pos);
 	bytes = min_t(u64, folio_size(folio) - offset, length);
 	folio_zero_range(folio, offset, bytes);
 	folio_mark_accessed(folio);
 
-	return iomap_write_end(iter, pos, bytes, bytes, page);
+	return iomap_write_end(iter, pos, bytes, bytes, folio);
 }
 
 static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
-- 
2.33.0



* [PATCH v2 21/28] iomap: Convert iomap_write_end_inline to take a folio
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (19 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 20/28] iomap: Convert iomap_write_begin() and iomap_write_end() to folios Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 22/28] iomap,xfs: Convert ->discard_page to ->discard_folio Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

This conversion is only safe because iomap only supports writes to inline
data which starts at the beginning of the file.
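The subtlety in sketch form (illustrative helper, not part of the
patch): kmap_local_folio() takes a byte offset into the folio, and
because iomap inline data must start at file offset 0, pos doubles as
that folio offset and stays within the folio's first page.

#include <linux/highmem.h>

/* illustrative helper, not part of the patch */
static void *map_inline_source(struct folio *folio, loff_t pos)
{
	/*
	 * Only valid because inline data starts at file offset 0:
	 * folio_pos(folio) == 0, so pos == offset_in_folio(folio, pos).
	 */
	return kmap_local_folio(folio, pos);
}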

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index f4ae200adc4c..6b73d070e3a1 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -672,16 +672,16 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 }
 
 static size_t iomap_write_end_inline(const struct iomap_iter *iter,
-		struct page *page, loff_t pos, size_t copied)
+		struct folio *folio, loff_t pos, size_t copied)
 {
 	const struct iomap *iomap = &iter->iomap;
 	void *addr;
 
-	WARN_ON_ONCE(!PageUptodate(page));
+	WARN_ON_ONCE(!folio_test_uptodate(folio));
 	BUG_ON(!iomap_inline_data_valid(iomap));
 
-	flush_dcache_page(page);
-	addr = kmap_local_page(page) + pos;
+	flush_dcache_folio(folio);
+	addr = kmap_local_folio(folio, pos);
 	memcpy(iomap_inline_data(iomap, pos), addr, copied);
 	kunmap_local(addr);
 
@@ -699,7 +699,7 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
 	size_t ret;
 
 	if (srcmap->type == IOMAP_INLINE) {
-		ret = iomap_write_end_inline(iter, page, pos, copied);
+		ret = iomap_write_end_inline(iter, folio, pos, copied);
 	} else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
 		ret = block_write_end(NULL, iter->inode->i_mapping, pos, len,
 				copied, &folio->page, NULL);
-- 
2.33.0



* [PATCH v2 22/28] iomap,xfs: Convert ->discard_page to ->discard_folio
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (20 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 21/28] iomap: Convert iomap_write_end_inline to take a folio Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 23/28] iomap: Simplify iomap_writepage_map() Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

XFS has the only implementation of ->discard_page today, so convert it
to use folios in the same patch as converting the API.
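With the new hook, a minimal implementation shape looks like this (a
sketch only, not XFS's version below): drop any filesystem-private
state for the range, then invalidate from pos to the end of the folio.

#include <linux/iomap.h>
#include <linux/pagemap.h>

static void example_discard_folio(struct folio *folio, loff_t pos)
{
	size_t offset = offset_in_folio(folio, pos);

	/* fs-private cleanup for [pos, end of folio) would go here */
	iomap_invalidate_folio(folio, offset, folio_size(folio) - offset);
}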

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c |  4 ++--
 fs/xfs/xfs_aops.c      | 24 ++++++++++++------------
 include/linux/iomap.h  |  2 +-
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6b73d070e3a1..20610b1364d6 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1346,8 +1346,8 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		 * won't be affected by I/O completion and we must unlock it
 		 * now.
 		 */
-		if (wpc->ops->discard_page)
-			wpc->ops->discard_page(page, file_offset);
+		if (wpc->ops->discard_folio)
+			wpc->ops->discard_folio(folio, file_offset);
 		if (!count) {
 			ClearPageUptodate(page);
 			unlock_page(page);
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index c8c15c3c3147..4098a9875c5b 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -437,37 +437,37 @@ xfs_prepare_ioend(
  * see a ENOSPC in writeback).
  */
 static void
-xfs_discard_page(
-	struct page		*page,
-	loff_t			fileoff)
+xfs_discard_folio(
+	struct folio		*folio,
+	loff_t			pos)
 {
-	struct inode		*inode = page->mapping->host;
+	struct inode		*inode = folio->mapping->host;
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
-	unsigned int		pageoff = offset_in_page(fileoff);
-	xfs_fileoff_t		start_fsb = XFS_B_TO_FSBT(mp, fileoff);
-	xfs_fileoff_t		pageoff_fsb = XFS_B_TO_FSBT(mp, pageoff);
+	size_t			offset = offset_in_folio(folio, pos);
+	xfs_fileoff_t		start_fsb = XFS_B_TO_FSBT(mp, pos);
+	xfs_fileoff_t		pageoff_fsb = XFS_B_TO_FSBT(mp, offset);
 	int			error;
 
 	if (xfs_is_shutdown(mp))
 		goto out_invalidate;
 
 	xfs_alert_ratelimited(mp,
-		"page discard on page "PTR_FMT", inode 0x%llx, offset %llu.",
-			page, ip->i_ino, fileoff);
+		"page discard on page "PTR_FMT", inode 0x%llx, pos %llu.",
+			folio, ip->i_ino, pos);
 
 	error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
-			i_blocks_per_page(inode, page) - pageoff_fsb);
+			i_blocks_per_folio(inode, folio) - pageoff_fsb);
 	if (error && !xfs_is_shutdown(mp))
 		xfs_alert(mp, "page discard unable to remove delalloc mapping.");
 out_invalidate:
-	iomap_invalidatepage(page, pageoff, PAGE_SIZE - pageoff);
+	iomap_invalidate_folio(folio, offset, folio_size(folio) - offset);
 }
 
 static const struct iomap_writeback_ops xfs_writeback_ops = {
 	.map_blocks		= xfs_map_blocks,
 	.prepare_ioend		= xfs_prepare_ioend,
-	.discard_page		= xfs_discard_page,
+	.discard_folio		= xfs_discard_folio,
 };
 
 STATIC int
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 29491fb9c5ba..5ef5088dbbd8 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -285,7 +285,7 @@ struct iomap_writeback_ops {
 	 * Optional, allows the file system to discard state on a page where
 	 * we failed to submit any I/O.
 	 */
-	void (*discard_page)(struct page *page, loff_t fileoff);
+	void (*discard_folio)(struct folio *folio, loff_t pos);
 };
 
 struct iomap_writepage_ctx {
-- 
2.33.0



* [PATCH v2 23/28] iomap: Simplify iomap_writepage_map()
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (21 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 22/28] iomap,xfs: Convert ->discard_page to ->discard_folio Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 24/28] iomap: Simplify iomap_do_writepage() Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

Rename end_offset to end_pos and file_offset to pos to match the rest
of the file.  Simplify the loop by calculating nblocks up front instead
of each time around the loop.
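The shape of the simplified loop, condensed from the diff below (the
loop body is elided):

#include <linux/fs.h>
#include <linux/pagemap.h>

static void example_walk_blocks(struct inode *inode, struct folio *folio,
		u64 end_pos)
{
	unsigned len = i_blocksize(inode);
	unsigned nblocks = i_blocks_per_folio(inode, folio);
	u64 pos = folio_pos(folio);
	int i;

	/* nblocks is computed once instead of on every iteration */
	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
		/* map this block and add it to the ioend */
	}
}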

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 20610b1364d6..87190b86ef1f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1293,37 +1293,36 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
 static int
 iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
-		struct page *page, u64 end_offset)
+		struct page *page, u64 end_pos)
 {
 	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = iomap_page_create(inode, folio);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
-	u64 file_offset; /* file offset of page */
+	unsigned nblocks = i_blocks_per_folio(inode, folio);
+	u64 pos = folio_pos(folio);
 	int error = 0, count = 0, i;
 	LIST_HEAD(submit_list);
 
 	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
 
 	/*
-	 * Walk through the page to find areas to write back. If we run off the
-	 * end of the current map or find the current map invalid, grab a new
-	 * one.
+	 * Walk through the folio to find areas to write back. If we
+	 * run off the end of the current map or find the current map
+	 * invalid, grab a new one.
 	 */
-	for (i = 0, file_offset = page_offset(page);
-	     i < (PAGE_SIZE >> inode->i_blkbits) && file_offset < end_offset;
-	     i++, file_offset += len) {
+	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
 		if (iop && !test_bit(i, iop->uptodate))
 			continue;
 
-		error = wpc->ops->map_blocks(wpc, inode, file_offset);
+		error = wpc->ops->map_blocks(wpc, inode, pos);
 		if (error)
 			break;
 		if (WARN_ON_ONCE(wpc->iomap.type == IOMAP_INLINE))
 			continue;
 		if (wpc->iomap.type == IOMAP_HOLE)
 			continue;
-		iomap_add_to_ioend(inode, file_offset, page, iop, wpc, wbc,
+		iomap_add_to_ioend(inode, pos, page, iop, wpc, wbc,
 				 &submit_list);
 		count++;
 	}
@@ -1347,7 +1346,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		 * now.
 		 */
 		if (wpc->ops->discard_folio)
-			wpc->ops->discard_folio(folio, file_offset);
+			wpc->ops->discard_folio(folio, pos);
 		if (!count) {
 			ClearPageUptodate(page);
 			unlock_page(page);
-- 
2.33.0



* [PATCH v2 24/28] iomap: Simplify iomap_do_writepage()
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (22 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 23/28] iomap: Simplify iomap_writepage_map() Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 25/28] iomap: Convert iomap_add_to_ioend() to take a folio Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

Rename end_offset to end_pos and offset_into_page to poff to match the
rest of the file.  Simplify the handling of the last page straddling
i_size by doing the EOF check based on the byte-granularity i_size
instead of converting to a pgoff prematurely.
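The core of the simplification, sketched (illustrative helper):
compare byte positions directly against i_size instead of converting
both sides to page indices first.

#include <linux/pagemap.h>

/* illustrative helper: does this page end beyond i_size? */
static bool example_past_eof(struct page *page, loff_t isize)
{
	loff_t end_pos = page_offset(page) + PAGE_SIZE;

	return end_pos > isize;	/* byte granularity, no pgoff needed */
}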

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 87190b86ef1f..b168cc0fe8be 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1394,9 +1394,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 {
 	struct iomap_writepage_ctx *wpc = data;
 	struct inode *inode = page->mapping->host;
-	pgoff_t end_index;
-	u64 end_offset;
-	loff_t offset;
+	u64 end_pos, isize;
 
 	trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
 
@@ -1427,11 +1425,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 	 * |     desired writeback range    |      see else    |
 	 * ---------------------------------^------------------|
 	 */
-	offset = i_size_read(inode);
-	end_index = offset >> PAGE_SHIFT;
-	if (page->index < end_index)
-		end_offset = (loff_t)(page->index + 1) << PAGE_SHIFT;
-	else {
+	isize = i_size_read(inode);
+	end_pos = page_offset(page) + PAGE_SIZE;
+	if (end_pos > isize) {
 		/*
 		 * Check whether the page to write out is beyond or straddles
 		 * i_size or not.
@@ -1443,7 +1439,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * |				    |      Straddles     |
 		 * ---------------------------------^-----------|--------|
 		 */
-		unsigned offset_into_page = offset & (PAGE_SIZE - 1);
+		size_t poff = offset_in_page(isize);
+		pgoff_t end_index = isize >> PAGE_SHIFT;
 
 		/*
 		 * Skip the page if it's fully outside i_size, e.g. due to a
@@ -1463,7 +1460,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * offset is just equal to the EOF.
 		 */
 		if (page->index > end_index ||
-		    (page->index == end_index && offset_into_page == 0))
+		    (page->index == end_index && poff == 0))
 			goto redirty;
 
 		/*
@@ -1474,13 +1471,13 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * memory is zeroed when mapped, and writes to that region are
 		 * not written out to the file."
 		 */
-		zero_user_segment(page, offset_into_page, PAGE_SIZE);
+		zero_user_segment(page, poff, PAGE_SIZE);
 
 		/* Adjust the end_offset to the end of file */
-		end_offset = offset;
+		end_pos = isize;
 	}
 
-	return iomap_writepage_map(wpc, wbc, inode, page, end_offset);
+	return iomap_writepage_map(wpc, wbc, inode, page, end_pos);
 
 redirty:
 	redirty_page_for_writepage(wbc, page);
-- 
2.33.0



* [PATCH v2 25/28] iomap: Convert iomap_add_to_ioend() to take a folio
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (23 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 24/28] iomap: Simplify iomap_do_writepage() Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-17  4:34   ` Darrick J. Wong
  2021-11-08  4:05 ` [PATCH v2 26/28] iomap: Convert iomap_migrate_page() to use folios Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  27 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

We still iterate one block at a time, but now we call compound_head()
less often.
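The append-with-fallback pattern in isolation (condensed from the
diff; iomap_chain_bio() is the file's existing helper):
bio_add_folio() returns false when the bio is full, in which case a
fresh chained bio is guaranteed to take the block.

#include <linux/bio.h>
#include <linux/iomap.h>

static void example_append_block(struct iomap_writepage_ctx *wpc,
		struct folio *folio, size_t poff, unsigned len)
{
	if (!bio_add_folio(wpc->ioend->io_bio, folio, len, poff)) {
		/* The bio was full; chain a new one and retry the add. */
		wpc->ioend->io_bio = iomap_chain_bio(wpc->ioend->io_bio);
		bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
	}
}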

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 70 ++++++++++++++++++++----------------------
 1 file changed, 34 insertions(+), 36 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index b168cc0fe8be..90f9f33ffe41 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1249,29 +1249,29 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
  * first; otherwise finish off the current ioend and start another.
  */
 static void
-iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
+iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
 		struct iomap_page *iop, struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct list_head *iolist)
 {
-	sector_t sector = iomap_sector(&wpc->iomap, offset);
+	sector_t sector = iomap_sector(&wpc->iomap, pos);
 	unsigned len = i_blocksize(inode);
-	unsigned poff = offset & (PAGE_SIZE - 1);
+	size_t poff = offset_in_folio(folio, pos);
 
-	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, offset, sector)) {
+	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, sector)) {
 		if (wpc->ioend)
 			list_add(&wpc->ioend->io_list, iolist);
-		wpc->ioend = iomap_alloc_ioend(inode, wpc, offset, sector, wbc);
+		wpc->ioend = iomap_alloc_ioend(inode, wpc, pos, sector, wbc);
 	}
 
-	if (bio_add_page(wpc->ioend->io_bio, page, len, poff) != len) {
+	if (!bio_add_folio(wpc->ioend->io_bio, folio, len, poff)) {
 		wpc->ioend->io_bio = iomap_chain_bio(wpc->ioend->io_bio);
-		__bio_add_page(wpc->ioend->io_bio, page, len, poff);
+		bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
 	}
 
 	if (iop)
 		atomic_add(len, &iop->write_bytes_pending);
 	wpc->ioend->io_size += len;
-	wbc_account_cgroup_owner(wbc, page, len);
+	wbc_account_cgroup_owner(wbc, &folio->page, len);
 }
 
 /*
@@ -1293,9 +1293,8 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
 static int
 iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
-		struct page *page, u64 end_pos)
+		struct folio *folio, u64 end_pos)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = iomap_page_create(inode, folio);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
@@ -1322,15 +1321,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 			continue;
 		if (wpc->iomap.type == IOMAP_HOLE)
 			continue;
-		iomap_add_to_ioend(inode, pos, page, iop, wpc, wbc,
+		iomap_add_to_ioend(inode, pos, folio, iop, wpc, wbc,
 				 &submit_list);
 		count++;
 	}
 
 	WARN_ON_ONCE(!wpc->ioend && !list_empty(&submit_list));
-	WARN_ON_ONCE(!PageLocked(page));
-	WARN_ON_ONCE(PageWriteback(page));
-	WARN_ON_ONCE(PageDirty(page));
+	WARN_ON_ONCE(!folio_test_locked(folio));
+	WARN_ON_ONCE(folio_test_writeback(folio));
+	WARN_ON_ONCE(folio_test_dirty(folio));
 
 	/*
 	 * We cannot cancel the ioend directly here on error.  We may have
@@ -1348,14 +1347,14 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		if (wpc->ops->discard_folio)
 			wpc->ops->discard_folio(folio, pos);
 		if (!count) {
-			ClearPageUptodate(page);
-			unlock_page(page);
+			folio_clear_uptodate(folio);
+			folio_unlock(folio);
 			goto done;
 		}
 	}
 
-	set_page_writeback(page);
-	unlock_page(page);
+	folio_start_writeback(folio);
+	folio_unlock(folio);
 
 	/*
 	 * Preserve the original error if there was one; catch
@@ -1376,9 +1375,9 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 * with a partial page truncate on a sub-page block sized filesystem.
 	 */
 	if (!count)
-		end_page_writeback(page);
+		folio_end_writeback(folio);
 done:
-	mapping_set_error(page->mapping, error);
+	mapping_set_error(folio->mapping, error);
 	return error;
 }
 
@@ -1392,14 +1391,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 static int
 iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 {
+	struct folio *folio = page_folio(page);
 	struct iomap_writepage_ctx *wpc = data;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 	u64 end_pos, isize;
 
-	trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
+	trace_iomap_writepage(inode, folio_pos(folio), folio_size(folio));
 
 	/*
-	 * Refuse to write the page out if we're called from reclaim context.
+	 * Refuse to write the folio out if we're called from reclaim context.
 	 *
 	 * This avoids stack overflows when called from deeply used stacks in
 	 * random callers for direct reclaim or memcg reclaim.  We explicitly
@@ -1413,10 +1413,10 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		goto redirty;
 
 	/*
-	 * Is this page beyond the end of the file?
+	 * Is this folio beyond the end of the file?
 	 *
-	 * The page index is less than the end_index, adjust the end_offset
-	 * to the highest offset that this page should represent.
+	 * The folio index is less than the end_index, adjust the end_pos
+	 * to the highest offset that this folio should represent.
 	 * -----------------------------------------------------
 	 * |			file mapping	       | <EOF> |
 	 * -----------------------------------------------------
@@ -1426,7 +1426,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 	 * ---------------------------------^------------------|
 	 */
 	isize = i_size_read(inode);
-	end_pos = page_offset(page) + PAGE_SIZE;
+	end_pos = folio_pos(folio) + folio_size(folio);
 	if (end_pos > isize) {
 		/*
 		 * Check whether the page to write out is beyond or straddles
@@ -1439,7 +1439,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * |				    |      Straddles     |
 		 * ---------------------------------^-----------|--------|
 		 */
-		size_t poff = offset_in_page(isize);
+		size_t poff = offset_in_folio(folio, isize);
 		pgoff_t end_index = isize >> PAGE_SHIFT;
 
 		/*
@@ -1459,8 +1459,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * checking if the page is totally beyond i_size or if its
 		 * offset is just equal to the EOF.
 		 */
-		if (page->index > end_index ||
-		    (page->index == end_index && poff == 0))
+		if (folio->index > end_index ||
+		    (folio->index == end_index && poff == 0))
 			goto redirty;
 
 		/*
@@ -1471,17 +1471,15 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * memory is zeroed when mapped, and writes to that region are
 		 * not written out to the file."
 		 */
-		zero_user_segment(page, poff, PAGE_SIZE);
-
-		/* Adjust the end_offset to the end of file */
+		folio_zero_segment(folio, poff, folio_size(folio));
 		end_pos = isize;
 	}
 
-	return iomap_writepage_map(wpc, wbc, inode, page, end_pos);
+	return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
 
 redirty:
-	redirty_page_for_writepage(wbc, page);
-	unlock_page(page);
+	folio_redirty_for_writepage(wbc, folio);
+	folio_unlock(folio);
 	return 0;
 }
 
-- 
2.33.0



* [PATCH v2 26/28] iomap: Convert iomap_migrate_page() to use folios
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (24 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 25/28] iomap: Convert iomap_add_to_ioend() to take a folio Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 27/28] iomap: Support multi-page folios in invalidatepage Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 28/28] xfs: Support multi-page folios Matthew Wilcox (Oracle)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

The arguments are still pages for now, but we can use folios internally
and cut out a lot of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 90f9f33ffe41..6830e4c15c61 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -493,19 +493,21 @@ int
 iomap_migrate_page(struct address_space *mapping, struct page *newpage,
 		struct page *page, enum migrate_mode mode)
 {
+	struct folio *folio = page_folio(page);
+	struct folio *newfolio = page_folio(newpage);
 	int ret;
 
-	ret = migrate_page_move_mapping(mapping, newpage, page, 0);
+	ret = folio_migrate_mapping(mapping, newfolio, folio, 0);
 	if (ret != MIGRATEPAGE_SUCCESS)
 		return ret;
 
-	if (page_has_private(page))
-		attach_page_private(newpage, detach_page_private(page));
+	if (folio_test_private(folio))
+		folio_attach_private(newfolio, folio_detach_private(folio));
 
 	if (mode != MIGRATE_SYNC_NO_COPY)
-		migrate_page_copy(newpage, page);
+		folio_migrate_copy(newfolio, folio);
 	else
-		migrate_page_states(newpage, page);
+		folio_migrate_flags(newfolio, folio);
 	return MIGRATEPAGE_SUCCESS;
 }
 EXPORT_SYMBOL_GPL(iomap_migrate_page);
-- 
2.33.0



* [PATCH v2 27/28] iomap: Support multi-page folios in invalidatepage
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (25 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 26/28] iomap: Convert iomap_migrate_page() to use folios Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  2021-11-08  4:05 ` [PATCH v2 28/28] xfs: Support multi-page folios Matthew Wilcox (Oracle)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

If we're punching a hole in a multi-page folio, we need to remove the
per-folio iomap data as the folio is about to be split and each page will
need its own.  If a dirty folio is only partially-uptodate, the iomap
data contains the information about which blocks cannot be written back,
so assert that a dirty folio is fully uptodate.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6830e4c15c61..265c7f8e7134 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -470,13 +470,18 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
 	trace_iomap_invalidatepage(folio->mapping->host, offset, len);
 
 	/*
-	 * If we're invalidating the entire page, clear the dirty state from it
-	 * and release it to avoid unnecessary buildup of the LRU.
+	 * If we're invalidating the entire folio, clear the dirty state
+	 * from it and release it to avoid unnecessary buildup of the LRU.
 	 */
 	if (offset == 0 && len == folio_size(folio)) {
 		WARN_ON_ONCE(folio_test_writeback(folio));
 		folio_cancel_dirty(folio);
 		iomap_page_release(folio);
+	} else if (folio_test_multi(folio)) {
+		/* Must release the iop so the page can be split */
+		WARN_ON_ONCE(!folio_test_uptodate(folio) &&
+			     folio_test_dirty(folio));
+		iomap_page_release(folio);
 	}
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
-- 
2.33.0



* [PATCH v2 28/28] xfs: Support multi-page folios
  2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
                   ` (26 preceding siblings ...)
  2021-11-08  4:05 ` [PATCH v2 27/28] iomap: Support multi-page folios in invalidatepage Matthew Wilcox (Oracle)
@ 2021-11-08  4:05 ` Matthew Wilcox (Oracle)
  27 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-08  4:05 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: Matthew Wilcox (Oracle),
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

Now that iomap has been converted, XFS is multi-page folio safe.
Indicate to the VFS that it can now create multi-page folios for XFS.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_icache.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index e1472004170e..5380a3f001e9 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -87,6 +87,7 @@ xfs_inode_alloc(
 	/* VFS doesn't initialise i_mode or i_state! */
 	VFS_I(ip)->i_mode = 0;
 	VFS_I(ip)->i_state = 0;
+	mapping_set_large_folios(VFS_I(ip)->i_mapping);
 
 	XFS_STATS_INC(mp, vn_active);
 	ASSERT(atomic_read(&ip->i_pincount) == 0);
@@ -336,6 +337,7 @@ xfs_reinit_inode(
 	inode->i_rdev = dev;
 	inode->i_uid = uid;
 	inode->i_gid = gid;
+	mapping_set_large_folios(inode->i_mapping);
 	return error;
 }
 
-- 
2.33.0



* Re: [PATCH v2 01/28] csky,sparc: Declare flush_dcache_folio()
  2021-11-08  4:05 ` [PATCH v2 01/28] csky,sparc: Declare flush_dcache_folio() Matthew Wilcox (Oracle)
@ 2021-11-09  8:36   ` Christoph Hellwig
  2021-11-15 15:54     ` Matthew Wilcox
  0 siblings, 1 reply; 64+ messages in thread
From: Christoph Hellwig @ 2021-11-09  8:36 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:24AM +0000, Matthew Wilcox (Oracle) wrote:
> These architectures do not include asm-generic/cacheflush.h, so they
> need to declare it themselves.

In mainline, mm/util.c implements flush_dcache_folio unless
ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO is set.  So I think you need to
define that for csky and sparc.


* Re: [PATCH v2 02/28] mm: Add functions to zero portions of a folio
  2021-11-08  4:05 ` [PATCH v2 02/28] mm: Add functions to zero portions of a folio Matthew Wilcox (Oracle)
@ 2021-11-09  8:40   ` Christoph Hellwig
  2021-11-17  4:45   ` Darrick J. Wong
  1 sibling, 0 replies; 64+ messages in thread
From: Christoph Hellwig @ 2021-11-09  8:40 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:25AM +0000, Matthew Wilcox (Oracle) wrote:
> These functions are wrappers around zero_user_segments(), which means
> that zero_user_segments() can now be called for compound pages even when
> CONFIG_TRANSPARENT_HUGEPAGE is disabled.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

Related note: the inline !HIGHMEM version should switch to page_address
instead of kmap_local_page to make the code more obvious.
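
i.e. something like this (just a sketch of the suggestion, not an
actual patch; the bounds checks and dcache flushing from the current
version are elided):

static inline void zero_user_segments(struct page *page,
		unsigned start1, unsigned end1,
		unsigned start2, unsigned end2)
{
	/* !HIGHMEM: lowmem pages are always mapped, no kmap needed */
	void *kaddr = page_address(page);

	if (end1 > start1)
		memset(kaddr + start1, 0, end1 - start1);
	if (end2 > start2)
		memset(kaddr + start2, 0, end2 - start2);
}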


* Re: [PATCH v2 04/28] fs: Rename AS_THP_SUPPORT and mapping_thp_support
  2021-11-08  4:05 ` [PATCH v2 04/28] fs: Rename AS_THP_SUPPORT and mapping_thp_support Matthew Wilcox (Oracle)
@ 2021-11-09  8:41   ` Christoph Hellwig
  2021-11-15 16:03     ` Matthew Wilcox
  0 siblings, 1 reply; 64+ messages in thread
From: Christoph Hellwig @ 2021-11-09  8:41 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:27AM +0000, Matthew Wilcox (Oracle) wrote:
> These are now indicators of multi-page folio support, not THP support.

Given that we don't use the large folio term anywhere else, this really
needs to grow a comment explaining what the flag means.


* Re: [PATCH v2 07/28] fs/buffer: Convert __block_write_begin_int() to take a folio
  2021-11-08  4:05 ` [PATCH v2 07/28] fs/buffer: Convert __block_write_begin_int() to take a folio Matthew Wilcox (Oracle)
@ 2021-11-09  8:42   ` Christoph Hellwig
  2021-11-17  4:35   ` Darrick J. Wong
  1 sibling, 0 replies; 64+ messages in thread
From: Christoph Hellwig @ 2021-11-09  8:42 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

>  		get_block_t *get_block)
>  {
> -	return __block_write_begin_int(page, pos, len, get_block, NULL);
> +	return __block_write_begin_int(page_folio(page), pos, len, get_block, NULL);

Overly long line here.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v2 17/28] iomap: Convert readahead and readpage to use a folio
  2021-11-08  4:05 ` [PATCH v2 17/28] iomap: Convert readahead and readpage to use " Matthew Wilcox (Oracle)
@ 2021-11-09  8:43   ` Christoph Hellwig
  0 siblings, 0 replies; 64+ messages in thread
From: Christoph Hellwig @ 2021-11-09  8:43 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:40AM +0000, Matthew Wilcox (Oracle) wrote:
> Handle folios of arbitrary size instead of working in PAGE_SIZE units.
> readahead_folio() decreases the page refcount for you, so this is not
> quite a mechanical change.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-11-08  4:05 ` [PATCH v2 19/28] iomap: Convert __iomap_zero_iter " Matthew Wilcox (Oracle)
@ 2021-11-09  8:47   ` Christoph Hellwig
  2021-11-17  2:24   ` Darrick J. Wong
  2021-12-09 21:38   ` Matthew Wilcox
  2 siblings, 0 replies; 64+ messages in thread
From: Christoph Hellwig @ 2021-11-09  8:47 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:42AM +0000, Matthew Wilcox (Oracle) wrote:
> The zero iterator can work in folio-sized chunks instead of page-sized
> chunks.  This will save a lot of page cache lookups if the file is cached
> in multi-page folios.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

but it will clash with my just sent series that decouples DAX
zeroing from buffered I/O zeroing and folds __iomap_zero_iter into
the caller.


* Re: [PATCH v2 01/28] csky,sparc: Declare flush_dcache_folio()
  2021-11-09  8:36   ` Christoph Hellwig
@ 2021-11-15 15:54     ` Matthew Wilcox
  2021-11-16  6:33       ` Christoph Hellwig
  0 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox @ 2021-11-15 15:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe

On Tue, Nov 09, 2021 at 12:36:57AM -0800, Christoph Hellwig wrote:
> On Mon, Nov 08, 2021 at 04:05:24AM +0000, Matthew Wilcox (Oracle) wrote:
> > These architectures do not include asm-generic/cacheflush.h, so they
> > need to declare it themselves.
> 
> In mainline mm/util.c implements flush_dcache_folio unless
> ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO is set.  So I think you need to
> define that for csky and sparc.

There are three ways to implement flush_dcache_folio().  The first is
as a noop (this is what xtensa does, which is the only architecture
to define ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO; it's also done
automatically by asm-generic if the architecture doesn't define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE).  The second is as a loop which calls
flush_dcache_page() for each page in the folio.  That's the default
implementation which you found in mm/util.c.  The third way, which I
hope architecture maintainers actually implement, is to just set the
needs-flush bit on the head page.  But that requires knowledge of each
architecture; they need to check the needs-flush bit on the head page
instead of the precise page.  So I've done the safe, slow thing for
all architectures.  The only reason that csky and sparc are "special"
is that they don't include asm-generic/cacheflush.h and the buildbots
didn't catch that before the merge window.

I'm doing the exact same thing for csky and sparc that I did for
arc/arm/m68k/mips/nds32/nios2/parisc/sh.  Nothing more, nothing less.
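
For csky, the third way would look something like this (a sketch only,
untested; it assumes the architecture keeps tracking cache state with
PG_dcache_clean, as its flush_dcache_page() does):

void flush_dcache_folio(struct folio *folio)
{
	/* Defer the flush: record it once, against the head page only */
	clear_bit(PG_dcache_clean, &folio->flags);
}

plus changing every place that tests the bit to look at the head page
instead of the precise page.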


* Re: [PATCH v2 04/28] fs: Rename AS_THP_SUPPORT and mapping_thp_support
  2021-11-09  8:41   ` Christoph Hellwig
@ 2021-11-15 16:03     ` Matthew Wilcox
  2021-11-16  6:33       ` Christoph Hellwig
  0 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox @ 2021-11-15 16:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe

On Tue, Nov 09, 2021 at 12:41:19AM -0800, Christoph Hellwig wrote:
> On Mon, Nov 08, 2021 at 04:05:27AM +0000, Matthew Wilcox (Oracle) wrote:
> > These are now indicators of multi-page folio support, not THP support.
> 
> Given that we don't use the large foltio term anywhere else this really
> needs to grow a comment explaining what the flag means.

I think I prefer the term 'large' to 'multi'.  What would you think to
this patch (not on top of any particular branch; just to show the scope
of it ...)

+++ b/include/linux/page-flags.h
@@ -692,7 +692,7 @@ static inline bool folio_test_single(struct folio *folio)
        return !folio_test_head(folio);
 }

-static inline bool folio_test_multi(struct folio *folio)
+static inline bool folio_test_large(struct folio *folio)
 {
        return folio_test_head(folio);
 }
+++ b/mm/filemap.c
@@ -192,9 +192,9 @@ static void filemap_unaccount_folio(struct address_space *mapping,
        __lruvec_stat_mod_folio(folio, NR_FILE_PAGES, -nr);
        if (folio_test_swapbacked(folio)) {
                __lruvec_stat_mod_folio(folio, NR_SHMEM, -nr);
-               if (folio_test_multi(folio))
+               if (folio_test_large(folio))
                        __lruvec_stat_mod_folio(folio, NR_SHMEM_THPS, -nr);
-       } else if (folio_test_multi(folio)) {
+       } else if (folio_test_large(folio)) {
                __lruvec_stat_mod_folio(folio, NR_FILE_THPS, -nr);
                filemap_nr_thps_dec(mapping);
        }
@@ -236,7 +236,7 @@ void filemap_free_folio(struct address_space *mapping, struct folio *folio)
        if (freepage)
                freepage(&folio->page);

-       if (folio_test_multi(folio) && !folio_test_hugetlb(folio)) {
+       if (folio_test_large(folio) && !folio_test_hugetlb(folio)) {
                folio_ref_sub(folio, folio_nr_pages(folio));
                VM_BUG_ON_FOLIO(folio_ref_count(folio) <= 0, folio);
        } else {
+++ b/mm/memcontrol.c
@@ -5558,7 +5558,7 @@ static int mem_cgroup_move_account(struct page *page,

        VM_BUG_ON(from == to);
        VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
-       VM_BUG_ON(compound && !folio_test_multi(folio));
+       VM_BUG_ON(compound && !folio_test_large(folio));

        /*
         * Prevent mem_cgroup_migrate() from looking at



* Re: [PATCH v2 01/28] csky,sparc: Declare flush_dcache_folio()
  2021-11-15 15:54     ` Matthew Wilcox
@ 2021-11-16  6:33       ` Christoph Hellwig
  2021-11-16 21:49         ` Matthew Wilcox
  0 siblings, 1 reply; 64+ messages in thread
From: Christoph Hellwig @ 2021-11-16  6:33 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe

On Mon, Nov 15, 2021 at 03:54:47PM +0000, Matthew Wilcox wrote:
> There are three ways to implement flush_dcache_folio().  The first is
> as a noop (this is what xtensa does, which is the only architecture
> to define ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO; it's also done
> automatically by asm-generic if the architecture doesn't define
> ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE).  The second is as a loop which calls
> flush_dcache_page() for each page in the folio.  That's the default
> implementation which you found in mm/util.c.  The third way, which I
> hope architecture maintainers actually implement, is to just set the
> needs-flush bit on the head page.  But that requires knowledge of each
> architecture; they need to check the needs-flush bit on the head page
> instead of the precise page.  So I've done the safe, slow thing for
> all architectures.  The only reason that csky and sparc are "special"
> is that they don't include asm-generic/cacheflush.h and the buildbots
> didn't catch that before the merge window.
> 
> I'm doing the exact same thing for csky and sparc that I did for
> arc/arm/m68k/mips/nds32/nios2/parisc/sh.  Nothing more, nothing less.

I see how this works now, but it is pretty horrible.  Why not something
simple like the patch below?  If/when an architecture actually
wants to override flush_dcache_folio we can find out how to best do
it:

diff --git a/arch/arc/include/asm/cacheflush.h b/arch/arc/include/asm/cacheflush.h
index e8c2c7469e107..e201b4b1655af 100644
--- a/arch/arc/include/asm/cacheflush.h
+++ b/arch/arc/include/asm/cacheflush.h
@@ -36,7 +36,6 @@ void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 
 void dma_cache_wback_inv(phys_addr_t start, unsigned long sz);
 void dma_cache_inv(phys_addr_t start, unsigned long sz);
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index e68fb879e4f9d..5e56288e343bb 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -290,7 +290,6 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr
  */
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 extern void flush_dcache_page(struct page *);
-void flush_dcache_folio(struct folio *folio);
 
 #define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 static inline void flush_kernel_vmap_range(void *addr, int size)
diff --git a/arch/csky/abiv1/inc/abi/cacheflush.h b/arch/csky/abiv1/inc/abi/cacheflush.h
index 432aef1f1dc23..ed62e2066ba76 100644
--- a/arch/csky/abiv1/inc/abi/cacheflush.h
+++ b/arch/csky/abiv1/inc/abi/cacheflush.h
@@ -9,7 +9,6 @@
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 extern void flush_dcache_page(struct page *);
-void flush_dcache_folio(struct folio *folio);
 
 #define flush_cache_mm(mm)			dcache_wbinv_all()
 #define flush_cache_page(vma, page, pfn)	cache_wbinv_all()
diff --git a/arch/csky/abiv2/inc/abi/cacheflush.h b/arch/csky/abiv2/inc/abi/cacheflush.h
index 7e8bef60958c6..a565e00c3f70b 100644
--- a/arch/csky/abiv2/inc/abi/cacheflush.h
+++ b/arch/csky/abiv2/inc/abi/cacheflush.h
@@ -25,8 +25,6 @@ static inline void flush_dcache_page(struct page *page)
 		clear_bit(PG_dcache_clean, &page->flags);
 }
 
-void flush_dcache_folio(struct folio *folio);
-
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 #define flush_icache_page(vma, page)		do { } while (0)
diff --git a/arch/m68k/include/asm/cacheflush_mm.h b/arch/m68k/include/asm/cacheflush_mm.h
index 8ab46625ddd32..1ac55e7b47f01 100644
--- a/arch/m68k/include/asm/cacheflush_mm.h
+++ b/arch/m68k/include/asm/cacheflush_mm.h
@@ -250,7 +250,6 @@ static inline void __flush_page_to_ram(void *vaddr)
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 #define flush_dcache_page(page)		__flush_page_to_ram(page_address(page))
-void flush_dcache_folio(struct folio *folio);
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 #define flush_icache_page(vma, page)	__flush_page_to_ram(page_address(page))
diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
index f207388541d50..b3dc9c589442a 100644
--- a/arch/mips/include/asm/cacheflush.h
+++ b/arch/mips/include/asm/cacheflush.h
@@ -61,8 +61,6 @@ static inline void flush_dcache_page(struct page *page)
 		SetPageDcacheDirty(page);
 }
 
-void flush_dcache_folio(struct folio *folio);
-
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 
diff --git a/arch/nds32/include/asm/cacheflush.h b/arch/nds32/include/asm/cacheflush.h
index 3fc0bb7d6487c..c2a222ebfa2af 100644
--- a/arch/nds32/include/asm/cacheflush.h
+++ b/arch/nds32/include/asm/cacheflush.h
@@ -27,7 +27,6 @@ void flush_cache_vunmap(unsigned long start, unsigned long end);
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
 		       unsigned long vaddr, void *dst, void *src, int len);
 void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
diff --git a/arch/nios2/include/asm/cacheflush.h b/arch/nios2/include/asm/cacheflush.h
index 1999561b22aa5..d0b71dd712872 100644
--- a/arch/nios2/include/asm/cacheflush.h
+++ b/arch/nios2/include/asm/cacheflush.h
@@ -29,7 +29,6 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr,
 	unsigned long pfn);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 
 extern void flush_icache_range(unsigned long start, unsigned long end);
 extern void flush_icache_page(struct vm_area_struct *vma, struct page *page);
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index da0cd4b3a28f2..859b8a34adcfb 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -50,7 +50,6 @@ void invalidate_kernel_vmap_range(void *vaddr, int size);
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 
 #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
 #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
diff --git a/arch/sh/include/asm/cacheflush.h b/arch/sh/include/asm/cacheflush.h
index c7a97f32432fb..481a664287e2e 100644
--- a/arch/sh/include/asm/cacheflush.h
+++ b/arch/sh/include/asm/cacheflush.h
@@ -43,7 +43,6 @@ extern void flush_cache_range(struct vm_area_struct *vma,
 				 unsigned long start, unsigned long end);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 extern void flush_icache_range(unsigned long start, unsigned long end);
 #define flush_icache_user_range flush_icache_range
 extern void flush_icache_page(struct vm_area_struct *vma,
diff --git a/arch/sparc/include/asm/cacheflush_32.h b/arch/sparc/include/asm/cacheflush_32.h
index 9991c18f4980c..41c6d734a4741 100644
--- a/arch/sparc/include/asm/cacheflush_32.h
+++ b/arch/sparc/include/asm/cacheflush_32.h
@@ -37,7 +37,6 @@
 
 void sparc_flush_page_to_ram(struct page *page);
 
-void flush_dcache_folio(struct folio *folio);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 #define flush_dcache_page(page)			sparc_flush_page_to_ram(page)
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
diff --git a/arch/sparc/include/asm/cacheflush_64.h b/arch/sparc/include/asm/cacheflush_64.h
index 9ab59a73c28b1..b9341836597ec 100644
--- a/arch/sparc/include/asm/cacheflush_64.h
+++ b/arch/sparc/include/asm/cacheflush_64.h
@@ -47,7 +47,6 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page);
 void __flush_dcache_range(unsigned long start, unsigned long end);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 
 #define flush_icache_page(vma, pg)	do { } while(0)
 
diff --git a/arch/xtensa/include/asm/cacheflush.h b/arch/xtensa/include/asm/cacheflush.h
index a8a041609c5d0..7b4359312c257 100644
--- a/arch/xtensa/include/asm/cacheflush.h
+++ b/arch/xtensa/include/asm/cacheflush.h
@@ -121,7 +121,6 @@ void flush_cache_page(struct vm_area_struct*,
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *);
-void flush_dcache_folio(struct folio *);
 
 void local_flush_cache_range(struct vm_area_struct *vma,
 		unsigned long start, unsigned long end);
@@ -138,9 +137,7 @@ void local_flush_cache_page(struct vm_area_struct *vma,
 #define flush_cache_vunmap(start,end)			do { } while (0)
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
-#define ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
 #define flush_dcache_page(page)				do { } while (0)
-static inline void flush_dcache_folio(struct folio *folio) { }
 
 #define flush_icache_range local_flush_icache_range
 #define flush_cache_page(vma, addr, pfn)		do { } while (0)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 265c7f8e71342..218df77641802 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -4,6 +4,7 @@
  * Copyright (C) 2016-2019 Christoph Hellwig.
  */
 #include <linux/module.h>
+#include <linux/cacheflush.h>
 #include <linux/compiler.h>
 #include <linux/fs.h>
 #include <linux/iomap.h>
@@ -658,6 +659,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
 	struct iomap_page *iop = to_iomap_page(folio);
+
 	flush_dcache_folio(folio);
 
 	/*
diff --git a/include/asm-generic/cacheflush.h b/include/asm-generic/cacheflush.h
index fedc0dfa4877c..eeaea7bd97bbf 100644
--- a/include/asm-generic/cacheflush.h
+++ b/include/asm-generic/cacheflush.h
@@ -49,14 +49,7 @@ static inline void flush_cache_page(struct vm_area_struct *vma,
 static inline void flush_dcache_page(struct page *page)
 {
 }
-
-static inline void flush_dcache_folio(struct folio *folio) { }
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
-#define ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
-#endif
-
-#ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
-void flush_dcache_folio(struct folio *folio);
 #endif
 
 #ifndef flush_dcache_mmap_lock
diff --git a/include/linux/cacheflush.h b/include/linux/cacheflush.h
new file mode 100644
index 0000000000000..c28359bac8aa5
--- /dev/null
+++ b/include/linux/cacheflush.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_CACHEFLUSH_H
+#define _LINUX_CACHEFLUSH_H
+
+#include <asm/cacheflush.h>
+
+#if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
+void flush_dcache_folio(struct folio *folio);
+#else
+static inline void flush_dcache_folio(struct folio *folio)
+{
+}
+#endif /* ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE */
+
+#endif /* _LINUX_CACHEFLUSH_H */
diff --git a/mm/util.c b/mm/util.c
index e58151a612555..61ffa71adb644 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1090,7 +1090,7 @@ void page_offline_end(void)
 }
 EXPORT_SYMBOL(page_offline_end);
 
-#ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
+#if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
 void flush_dcache_folio(struct folio *folio)
 {
 	long i, nr = folio_nr_pages(folio);
@@ -1098,5 +1098,4 @@ void flush_dcache_folio(struct folio *folio)
 	for (i = 0; i < nr; i++)
 		flush_dcache_page(folio_page(folio, i));
 }
-EXPORT_SYMBOL(flush_dcache_folio);
 #endif


* Re: [PATCH v2 04/28] fs: Rename AS_THP_SUPPORT and mapping_thp_support
  2021-11-15 16:03     ` Matthew Wilcox
@ 2021-11-16  6:33       ` Christoph Hellwig
  0 siblings, 0 replies; 64+ messages in thread
From: Christoph Hellwig @ 2021-11-16  6:33 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe

On Mon, Nov 15, 2021 at 04:03:22PM +0000, Matthew Wilcox wrote:
> I think I prefer the term 'large' to 'multi'.  What would you think to
> this patch (not on top of any particular branch; just to show the scope
> of it ...)

I don't really care either way. Just be consistent and maybe add a
comment here and there.


* Re: [PATCH v2 01/28] csky,sparc: Declare flush_dcache_folio()
  2021-11-16  6:33       ` Christoph Hellwig
@ 2021-11-16 21:49         ` Matthew Wilcox
  2021-11-17  9:52           ` Geert Uytterhoeven
  0 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox @ 2021-11-16 21:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe

On Mon, Nov 15, 2021 at 10:33:01PM -0800, Christoph Hellwig wrote:
> I see how this works now, but it is pretty horrible.  Why not something
> simple like the patch below?  If/when an architecture actually
> wants to override flush_dcache_folio we can find out how to best do
> it:

I'll stick this one into -next and see if anything blows up:

From 14f55de74c68a3eb058cfdbf81414148b9bdaac7 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Date: Sat, 6 Nov 2021 17:13:35 -0400
Subject: [PATCH] Add linux/cacheflush.h

Many architectures do not include asm-generic/cacheflush.h, so turn
the includes on their head and add linux/cacheflush.h which includes
asm/cacheflush.h.

Move the flush_dcache_folio() declaration from asm-generic/cacheflush.h
to linux/cacheflush.h and change linux/highmem.h to include
linux/cacheflush.h instead of asm/cacheflush.h so that all necessary
places will see flush_dcache_folio().

More functions should have their default implementations moved in the
future, but those are for follow-on patches.  This fixes csky, sparc and
sparc64 which were missed in the commit which added flush_dcache_folio().

Fixes: 08b0b0059bf1 ("mm: Add flush_dcache_folio()")
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 arch/arc/include/asm/cacheflush.h     |  1 -
 arch/arm/include/asm/cacheflush.h     |  1 -
 arch/m68k/include/asm/cacheflush_mm.h |  1 -
 arch/mips/include/asm/cacheflush.h    |  2 --
 arch/nds32/include/asm/cacheflush.h   |  1 -
 arch/nios2/include/asm/cacheflush.h   |  1 -
 arch/parisc/include/asm/cacheflush.h  |  1 -
 arch/sh/include/asm/cacheflush.h      |  1 -
 arch/xtensa/include/asm/cacheflush.h  |  3 ---
 include/asm-generic/cacheflush.h      |  6 ------
 include/linux/cacheflush.h            | 18 ++++++++++++++++++
 include/linux/highmem.h               |  3 +--
 12 files changed, 19 insertions(+), 20 deletions(-)
 create mode 100644 include/linux/cacheflush.h

diff --git a/arch/arc/include/asm/cacheflush.h b/arch/arc/include/asm/cacheflush.h
index e8c2c7469e10..e201b4b1655a 100644
--- a/arch/arc/include/asm/cacheflush.h
+++ b/arch/arc/include/asm/cacheflush.h
@@ -36,7 +36,6 @@ void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 
 void dma_cache_wback_inv(phys_addr_t start, unsigned long sz);
 void dma_cache_inv(phys_addr_t start, unsigned long sz);
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index e68fb879e4f9..5e56288e343b 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -290,7 +290,6 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr
  */
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 extern void flush_dcache_page(struct page *);
-void flush_dcache_folio(struct folio *folio);
 
 #define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 static inline void flush_kernel_vmap_range(void *addr, int size)
diff --git a/arch/m68k/include/asm/cacheflush_mm.h b/arch/m68k/include/asm/cacheflush_mm.h
index 8ab46625ddd3..1ac55e7b47f0 100644
--- a/arch/m68k/include/asm/cacheflush_mm.h
+++ b/arch/m68k/include/asm/cacheflush_mm.h
@@ -250,7 +250,6 @@ static inline void __flush_page_to_ram(void *vaddr)
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 #define flush_dcache_page(page)		__flush_page_to_ram(page_address(page))
-void flush_dcache_folio(struct folio *folio);
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 #define flush_icache_page(vma, page)	__flush_page_to_ram(page_address(page))
diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
index f207388541d5..b3dc9c589442 100644
--- a/arch/mips/include/asm/cacheflush.h
+++ b/arch/mips/include/asm/cacheflush.h
@@ -61,8 +61,6 @@ static inline void flush_dcache_page(struct page *page)
 		SetPageDcacheDirty(page);
 }
 
-void flush_dcache_folio(struct folio *folio);
-
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 
diff --git a/arch/nds32/include/asm/cacheflush.h b/arch/nds32/include/asm/cacheflush.h
index 3fc0bb7d6487..c2a222ebfa2a 100644
--- a/arch/nds32/include/asm/cacheflush.h
+++ b/arch/nds32/include/asm/cacheflush.h
@@ -27,7 +27,6 @@ void flush_cache_vunmap(unsigned long start, unsigned long end);
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
 		       unsigned long vaddr, void *dst, void *src, int len);
 void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
diff --git a/arch/nios2/include/asm/cacheflush.h b/arch/nios2/include/asm/cacheflush.h
index 1999561b22aa..d0b71dd71287 100644
--- a/arch/nios2/include/asm/cacheflush.h
+++ b/arch/nios2/include/asm/cacheflush.h
@@ -29,7 +29,6 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr,
 	unsigned long pfn);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 
 extern void flush_icache_range(unsigned long start, unsigned long end);
 extern void flush_icache_page(struct vm_area_struct *vma, struct page *page);
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index da0cd4b3a28f..859b8a34adcf 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -50,7 +50,6 @@ void invalidate_kernel_vmap_range(void *vaddr, int size);
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 
 #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
 #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
diff --git a/arch/sh/include/asm/cacheflush.h b/arch/sh/include/asm/cacheflush.h
index c7a97f32432f..481a664287e2 100644
--- a/arch/sh/include/asm/cacheflush.h
+++ b/arch/sh/include/asm/cacheflush.h
@@ -43,7 +43,6 @@ extern void flush_cache_range(struct vm_area_struct *vma,
 				 unsigned long start, unsigned long end);
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *page);
-void flush_dcache_folio(struct folio *folio);
 extern void flush_icache_range(unsigned long start, unsigned long end);
 #define flush_icache_user_range flush_icache_range
 extern void flush_icache_page(struct vm_area_struct *vma,
diff --git a/arch/xtensa/include/asm/cacheflush.h b/arch/xtensa/include/asm/cacheflush.h
index a8a041609c5d..7b4359312c25 100644
--- a/arch/xtensa/include/asm/cacheflush.h
+++ b/arch/xtensa/include/asm/cacheflush.h
@@ -121,7 +121,6 @@ void flush_cache_page(struct vm_area_struct*,
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 void flush_dcache_page(struct page *);
-void flush_dcache_folio(struct folio *);
 
 void local_flush_cache_range(struct vm_area_struct *vma,
 		unsigned long start, unsigned long end);
@@ -138,9 +137,7 @@ void local_flush_cache_page(struct vm_area_struct *vma,
 #define flush_cache_vunmap(start,end)			do { } while (0)
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
-#define ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
 #define flush_dcache_page(page)				do { } while (0)
-static inline void flush_dcache_folio(struct folio *folio) { }
 
 #define flush_icache_range local_flush_icache_range
 #define flush_cache_page(vma, addr, pfn)		do { } while (0)
diff --git a/include/asm-generic/cacheflush.h b/include/asm-generic/cacheflush.h
index fedc0dfa4877..4f07afacbc23 100644
--- a/include/asm-generic/cacheflush.h
+++ b/include/asm-generic/cacheflush.h
@@ -50,13 +50,7 @@ static inline void flush_dcache_page(struct page *page)
 {
 }
 
-static inline void flush_dcache_folio(struct folio *folio) { }
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
-#define ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
-#endif
-
-#ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
-void flush_dcache_folio(struct folio *folio);
 #endif
 
 #ifndef flush_dcache_mmap_lock
diff --git a/include/linux/cacheflush.h b/include/linux/cacheflush.h
new file mode 100644
index 000000000000..fef8b607f97e
--- /dev/null
+++ b/include/linux/cacheflush.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_CACHEFLUSH_H
+#define _LINUX_CACHEFLUSH_H
+
+#include <asm/cacheflush.h>
+
+#if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
+#ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
+void flush_dcache_folio(struct folio *folio);
+#endif
+#else
+static inline void flush_dcache_folio(struct folio *folio)
+{
+}
+#define ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO 0
+#endif /* ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE */
+
+#endif /* _LINUX_CACHEFLUSH_H */
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 25aff0f2ed0b..c944b3b70ee7 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -5,12 +5,11 @@
 #include <linux/fs.h>
 #include <linux/kernel.h>
 #include <linux/bug.h>
+#include <linux/cacheflush.h>
 #include <linux/mm.h>
 #include <linux/uaccess.h>
 #include <linux/hardirq.h>
 
-#include <asm/cacheflush.h>
-
 #include "highmem-internal.h"
 
 /**
-- 
2.33.0



* Re: [PATCH v2 12/28] iomap: Add iomap_invalidate_folio
  2021-11-08  4:05 ` [PATCH v2 12/28] iomap: Add iomap_invalidate_folio Matthew Wilcox (Oracle)
@ 2021-11-17  2:20   ` Darrick J. Wong
  0 siblings, 0 replies; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  2:20 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:35AM +0000, Matthew Wilcox (Oracle) wrote:
> Keep iomap_invalidatepage around as a wrapper for use in address_space
> operations.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Looks good to me,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/iomap/buffered-io.c | 20 ++++++++++++--------
>  include/linux/iomap.h  |  1 +
>  2 files changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 49f96fdadcb4..b7cbe4d202d8 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -468,23 +468,27 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
>  }
>  EXPORT_SYMBOL_GPL(iomap_releasepage);
>  
> -void
> -iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
> +void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
>  {
> -	struct folio *folio = page_folio(page);
> -
> -	trace_iomap_invalidatepage(page->mapping->host, offset, len);
> +	trace_iomap_invalidatepage(folio->mapping->host, offset, len);
>  
>  	/*
>  	 * If we're invalidating the entire page, clear the dirty state from it
>  	 * and release it to avoid unnecessary buildup of the LRU.
>  	 */
> -	if (offset == 0 && len == PAGE_SIZE) {
> -		WARN_ON_ONCE(PageWriteback(page));
> -		cancel_dirty_page(page);
> +	if (offset == 0 && len == folio_size(folio)) {
> +		WARN_ON_ONCE(folio_test_writeback(folio));
> +		folio_cancel_dirty(folio);
>  		iomap_page_release(folio);
>  	}
>  }
> +EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
> +
> +void iomap_invalidatepage(struct page *page, unsigned int offset,
> +		unsigned int len)
> +{
> +	iomap_invalidate_folio(page_folio(page), offset, len);
> +}
>  EXPORT_SYMBOL_GPL(iomap_invalidatepage);
>  
>  #ifdef CONFIG_MIGRATION
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 6d1b08d0ae93..29491fb9c5ba 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -225,6 +225,7 @@ void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
>  int iomap_is_partially_uptodate(struct page *page, unsigned long from,
>  		unsigned long count);
>  int iomap_releasepage(struct page *page, gfp_t gfp_mask);
> +void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
>  void iomap_invalidatepage(struct page *page, unsigned int offset,
>  		unsigned int len);
>  #ifdef CONFIG_MIGRATION
> -- 
> 2.33.0
> 


* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-11-08  4:05 ` [PATCH v2 19/28] iomap: Convert __iomap_zero_iter " Matthew Wilcox (Oracle)
  2021-11-09  8:47   ` Christoph Hellwig
@ 2021-11-17  2:24   ` Darrick J. Wong
  2021-11-17 14:20     ` Matthew Wilcox
  2021-12-09 21:38   ` Matthew Wilcox
  2 siblings, 1 reply; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  2:24 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:42AM +0000, Matthew Wilcox (Oracle) wrote:
> The zero iterator can work in folio-sized chunks instead of page-sized
> chunks.  This will save a lot of page cache lookups if the file is cached
> in multi-page folios.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

hch's dax decoupling series notwithstanding,

Though TBH I am kinda wondering how the two of you plan to resolve those
kinds of differences -- I haven't looked at that series, though I think
this one's been waiting in the wings for longer?

Heck, I wonder how Matthew plans to merge all this given that it touches
mm, fs, block, and iomap...?

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/iomap/buffered-io.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 64e54981b651..9c61d12028ca 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -881,17 +881,20 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
>  
>  static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
>  {
> +	struct folio *folio;
>  	struct page *page;
>  	int status;
> -	unsigned offset = offset_in_page(pos);
> -	unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
> +	size_t offset, bytes;
>  
> -	status = iomap_write_begin(iter, pos, bytes, &page);
> +	status = iomap_write_begin(iter, pos, length, &page);
>  	if (status)
>  		return status;
> +	folio = page_folio(page);
>  
> -	zero_user(page, offset, bytes);
> -	mark_page_accessed(page);
> +	offset = offset_in_folio(folio, pos);
> +	bytes = min_t(u64, folio_size(folio) - offset, length);
> +	folio_zero_range(folio, offset, bytes);
> +	folio_mark_accessed(folio);
>  
>  	return iomap_write_end(iter, pos, bytes, bytes, page);
>  }
> -- 
> 2.33.0
> 


* Re: [PATCH v2 20/28] iomap: Convert iomap_write_begin() and iomap_write_end() to folios
  2021-11-08  4:05 ` [PATCH v2 20/28] iomap: Convert iomap_write_begin() and iomap_write_end() to folios Matthew Wilcox (Oracle)
@ 2021-11-17  4:31   ` Darrick J. Wong
  2021-11-17 14:31     ` Matthew Wilcox
  0 siblings, 1 reply; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  4:31 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:43AM +0000, Matthew Wilcox (Oracle) wrote:
> These functions still only work in PAGE_SIZE chunks, but there are
> fewer conversions from tail to head pages as a result of this patch.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/iomap/buffered-io.c | 66 ++++++++++++++++++++----------------------
>  1 file changed, 31 insertions(+), 35 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 9c61d12028ca..f4ae200adc4c 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c

<snip>

> @@ -741,6 +737,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>  	long status = 0;
>  
>  	do {
> +		struct folio *folio;
>  		struct page *page;
>  		unsigned long offset;	/* Offset into pagecache page */
>  		unsigned long bytes;	/* Bytes to write to page */
> @@ -764,16 +761,17 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>  			break;
>  		}
>  
> -		status = iomap_write_begin(iter, pos, bytes, &page);
> +		status = iomap_write_begin(iter, pos, bytes, &folio);
>  		if (unlikely(status))
>  			break;
>  
> +		page = folio_file_page(folio, pos >> PAGE_SHIFT);
>  		if (mapping_writably_mapped(iter->inode->i_mapping))
>  			flush_dcache_page(page);
>  
>  		copied = copy_page_from_iter_atomic(page, offset, bytes, i);

Hrmm.  In principle (or I guess even a subsequent patch), if we had
multi-page folios, could we simply loop the pages in the folio instead
of doing a single page and then calling back into iomap_write_begin to
get (probably) the same folio?

This looks like a fairly straightforward conversion, but I was wondering
about that one little point...
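
Something like this, I mean -- completely untested, just to sketch the
shape of it:

		size_t done = 0, n, ret;

		offset = offset_in_folio(folio, pos);
		while (done < bytes) {
			page = folio_page(folio, offset >> PAGE_SHIFT);
			n = min_t(size_t, bytes - done,
					PAGE_SIZE - offset_in_page(offset));
			if (mapping_writably_mapped(iter->inode->i_mapping))
				flush_dcache_page(page);
			ret = copy_page_from_iter_atomic(page,
					offset_in_page(offset), n, i);
			done += ret;
			offset += ret;
			if (ret < n)
				break;	/* faulted; let the caller retry */
		}
		copied = done;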

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

>  
> -		status = iomap_write_end(iter, pos, bytes, copied, page);
> +		status = iomap_write_end(iter, pos, bytes, copied, folio);
>  
>  		if (unlikely(copied != status))
>  			iov_iter_revert(i, copied - status);
> @@ -839,13 +837,13 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
>  	do {
>  		unsigned long offset = offset_in_page(pos);
>  		unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length);
> -		struct page *page;
> +		struct folio *folio;
>  
> -		status = iomap_write_begin(iter, pos, bytes, &page);
> +		status = iomap_write_begin(iter, pos, bytes, &folio);
>  		if (unlikely(status))
>  			return status;
>  
> -		status = iomap_write_end(iter, pos, bytes, bytes, page);
> +		status = iomap_write_end(iter, pos, bytes, bytes, folio);
>  		if (WARN_ON_ONCE(status == 0))
>  			return -EIO;
>  
> @@ -882,21 +880,19 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
>  static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
>  {
>  	struct folio *folio;
> -	struct page *page;
>  	int status;
>  	size_t offset, bytes;
>  
> -	status = iomap_write_begin(iter, pos, length, &page);
> +	status = iomap_write_begin(iter, pos, length, &folio);
>  	if (status)
>  		return status;
> -	folio = page_folio(page);
>  
>  	offset = offset_in_folio(folio, pos);
>  	bytes = min_t(u64, folio_size(folio) - offset, length);
>  	folio_zero_range(folio, offset, bytes);
>  	folio_mark_accessed(folio);
>  
> -	return iomap_write_end(iter, pos, bytes, bytes, page);
> +	return iomap_write_end(iter, pos, bytes, bytes, folio);
>  }
>  
>  static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> -- 
> 2.33.0
> 


* Re: [PATCH v2 25/28] iomap: Convert iomap_add_to_ioend() to take a folio
  2021-11-08  4:05 ` [PATCH v2 25/28] iomap: Convert iomap_add_to_ioend() to take a folio Matthew Wilcox (Oracle)
@ 2021-11-17  4:34   ` Darrick J. Wong
  0 siblings, 0 replies; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  4:34 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:48AM +0000, Matthew Wilcox (Oracle) wrote:
> We still iterate one block at a time, but now we call compound_head()
> less often.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Looks good!
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/iomap/buffered-io.c | 70 ++++++++++++++++++++----------------------
>  1 file changed, 34 insertions(+), 36 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index b168cc0fe8be..90f9f33ffe41 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1249,29 +1249,29 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
>   * first; otherwise finish off the current ioend and start another.
>   */
>  static void
> -iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
> +iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
>  		struct iomap_page *iop, struct iomap_writepage_ctx *wpc,
>  		struct writeback_control *wbc, struct list_head *iolist)
>  {
> -	sector_t sector = iomap_sector(&wpc->iomap, offset);
> +	sector_t sector = iomap_sector(&wpc->iomap, pos);
>  	unsigned len = i_blocksize(inode);
> -	unsigned poff = offset & (PAGE_SIZE - 1);
> +	size_t poff = offset_in_folio(folio, pos);
>  
> -	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, offset, sector)) {
> +	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, sector)) {
>  		if (wpc->ioend)
>  			list_add(&wpc->ioend->io_list, iolist);
> -		wpc->ioend = iomap_alloc_ioend(inode, wpc, offset, sector, wbc);
> +		wpc->ioend = iomap_alloc_ioend(inode, wpc, pos, sector, wbc);
>  	}
>  
> -	if (bio_add_page(wpc->ioend->io_bio, page, len, poff) != len) {
> +	if (!bio_add_folio(wpc->ioend->io_bio, folio, len, poff)) {
>  		wpc->ioend->io_bio = iomap_chain_bio(wpc->ioend->io_bio);
> -		__bio_add_page(wpc->ioend->io_bio, page, len, poff);
> +		bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
>  	}
>  
>  	if (iop)
>  		atomic_add(len, &iop->write_bytes_pending);
>  	wpc->ioend->io_size += len;
> -	wbc_account_cgroup_owner(wbc, page, len);
> +	wbc_account_cgroup_owner(wbc, &folio->page, len);
>  }
>  
>  /*
> @@ -1293,9 +1293,8 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
>  static int
>  iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  		struct writeback_control *wbc, struct inode *inode,
> -		struct page *page, u64 end_pos)
> +		struct folio *folio, u64 end_pos)
>  {
> -	struct folio *folio = page_folio(page);
>  	struct iomap_page *iop = iomap_page_create(inode, folio);
>  	struct iomap_ioend *ioend, *next;
>  	unsigned len = i_blocksize(inode);
> @@ -1322,15 +1321,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  			continue;
>  		if (wpc->iomap.type == IOMAP_HOLE)
>  			continue;
> -		iomap_add_to_ioend(inode, pos, page, iop, wpc, wbc,
> +		iomap_add_to_ioend(inode, pos, folio, iop, wpc, wbc,
>  				 &submit_list);
>  		count++;
>  	}
>  
>  	WARN_ON_ONCE(!wpc->ioend && !list_empty(&submit_list));
> -	WARN_ON_ONCE(!PageLocked(page));
> -	WARN_ON_ONCE(PageWriteback(page));
> -	WARN_ON_ONCE(PageDirty(page));
> +	WARN_ON_ONCE(!folio_test_locked(folio));
> +	WARN_ON_ONCE(folio_test_writeback(folio));
> +	WARN_ON_ONCE(folio_test_dirty(folio));
>  
>  	/*
>  	 * We cannot cancel the ioend directly here on error.  We may have
> @@ -1348,14 +1347,14 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  		if (wpc->ops->discard_folio)
>  			wpc->ops->discard_folio(folio, pos);
>  		if (!count) {
> -			ClearPageUptodate(page);
> -			unlock_page(page);
> +			folio_clear_uptodate(folio);
> +			folio_unlock(folio);
>  			goto done;
>  		}
>  	}
>  
> -	set_page_writeback(page);
> -	unlock_page(page);
> +	folio_start_writeback(folio);
> +	folio_unlock(folio);
>  
>  	/*
>  	 * Preserve the original error if there was one; catch
> @@ -1376,9 +1375,9 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  	 * with a partial page truncate on a sub-page block sized filesystem.
>  	 */
>  	if (!count)
> -		end_page_writeback(page);
> +		folio_end_writeback(folio);
>  done:
> -	mapping_set_error(page->mapping, error);
> +	mapping_set_error(folio->mapping, error);
>  	return error;
>  }
>  
> @@ -1392,14 +1391,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  static int
>  iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  {
> +	struct folio *folio = page_folio(page);
>  	struct iomap_writepage_ctx *wpc = data;
> -	struct inode *inode = page->mapping->host;
> +	struct inode *inode = folio->mapping->host;
>  	u64 end_pos, isize;
>  
> -	trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
> +	trace_iomap_writepage(inode, folio_pos(folio), folio_size(folio));
>  
>  	/*
> -	 * Refuse to write the page out if we're called from reclaim context.
> +	 * Refuse to write the folio out if we're called from reclaim context.
>  	 *
>  	 * This avoids stack overflows when called from deeply used stacks in
>  	 * random callers for direct reclaim or memcg reclaim.  We explicitly
> @@ -1413,10 +1413,10 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  		goto redirty;
>  
>  	/*
> -	 * Is this page beyond the end of the file?
> +	 * Is this folio beyond the end of the file?
>  	 *
> -	 * The page index is less than the end_index, adjust the end_offset
> -	 * to the highest offset that this page should represent.
> +	 * The folio index is less than the end_index, adjust the end_pos
> +	 * to the highest offset that this folio should represent.
>  	 * -----------------------------------------------------
>  	 * |			file mapping	       | <EOF> |
>  	 * -----------------------------------------------------
> @@ -1426,7 +1426,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  	 * ---------------------------------^------------------|
>  	 */
>  	isize = i_size_read(inode);
> -	end_pos = page_offset(page) + PAGE_SIZE;
> +	end_pos = folio_pos(folio) + folio_size(folio);
>  	if (end_pos > isize) {
>  		/*
>  		 * Check whether the page to write out is beyond or straddles
> @@ -1439,7 +1439,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  		 * |				    |      Straddles     |
>  		 * ---------------------------------^-----------|--------|
>  		 */
> -		size_t poff = offset_in_page(isize);
> +		size_t poff = offset_in_folio(folio, isize);
>  		pgoff_t end_index = isize >> PAGE_SHIFT;
>  
>  		/*
> @@ -1459,8 +1459,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  		 * checking if the page is totally beyond i_size or if its
>  		 * offset is just equal to the EOF.
>  		 */
> -		if (page->index > end_index ||
> -		    (page->index == end_index && poff == 0))
> +		if (folio->index > end_index ||
> +		    (folio->index == end_index && poff == 0))
>  			goto redirty;
>  
>  		/*
> @@ -1471,17 +1471,15 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  		 * memory is zeroed when mapped, and writes to that region are
>  		 * not written out to the file."
>  		 */
> -		zero_user_segment(page, poff, PAGE_SIZE);
> -
> -		/* Adjust the end_offset to the end of file */
> +		folio_zero_segment(folio, poff, folio_size(folio));
>  		end_pos = isize;
>  	}
>  
> -	return iomap_writepage_map(wpc, wbc, inode, page, end_pos);
> +	return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
>  
>  redirty:
> -	redirty_page_for_writepage(wbc, page);
> -	unlock_page(page);
> +	folio_redirty_for_writepage(wbc, folio);
> +	folio_unlock(folio);
>  	return 0;
>  }
>  
> -- 
> 2.33.0
> 
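
As an aside, once folio_pos() and folio_size() replace the page macros, the EOF
handling above reduces to plain byte arithmetic.  Below is a minimal userspace
model of that logic; the values are illustrative, and it deliberately collapses
the end_index/poff comparisons into direct byte checks, glossing over the
multi-page-folio subtlety the patch comments describe:

#include <stdio.h>
#include <stdint.h>

#define FOLIO_SIZE 16384UL	/* a four-page folio, purely illustrative */

int main(void)
{
	uint64_t isize = 20000;			/* i_size_read(inode) */
	uint64_t fpos = 16384;			/* folio_pos(folio) */
	uint64_t end_pos = fpos + FOLIO_SIZE;	/* as in the patch above */

	if (fpos >= isize) {
		puts("folio entirely beyond EOF: redirty and skip");
		return 0;
	}
	if (end_pos > isize) {
		/* Folio straddles EOF: zero the post-EOF tail. */
		size_t poff = isize - fpos;	/* offset_in_folio(folio, isize) */

		printf("zero folio bytes [%zu, %lu)\n", poff, FOLIO_SIZE);
		end_pos = isize;		/* clamp the writeback range */
	}
	printf("write back file bytes [%llu, %llu)\n",
	       (unsigned long long)fpos, (unsigned long long)end_pos);
	return 0;
}

With these numbers the model zeroes folio bytes [3616, 16384) and writes back
file bytes [16384, 20000), matching the "straddles EOF" diagram in the patch.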


* Re: [PATCH v2 07/28] fs/buffer: Convert __block_write_begin_int() to take a folio
  2021-11-08  4:05 ` [PATCH v2 07/28] fs/buffer: Convert __block_write_begin_int() to take a folio Matthew Wilcox (Oracle)
  2021-11-09  8:42   ` Christoph Hellwig
@ 2021-11-17  4:35   ` Darrick J. Wong
  1 sibling, 0 replies; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  4:35 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:30AM +0000, Matthew Wilcox (Oracle) wrote:
> There are no plans to convert buffer_head infrastructure to use multi-page
> folios, but __block_write_begin_int() is called from iomap, and it's
> more convenient and less error-prone if we pass in a folio from iomap.
> It also has a nice saving of almost 200 bytes of code from removing
> repeated calls to compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Pretty straightforward,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/buffer.c            | 22 +++++++++++-----------
>  fs/internal.h          |  2 +-
>  fs/iomap/buffered-io.c |  7 +++++--
>  3 files changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 46bc589b7a03..b1d722b26fe9 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1969,34 +1969,34 @@ iomap_to_bh(struct inode *inode, sector_t block, struct buffer_head *bh,
>  	}
>  }
>  
> -int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
> +int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
>  		get_block_t *get_block, const struct iomap *iomap)
>  {
>  	unsigned from = pos & (PAGE_SIZE - 1);
>  	unsigned to = from + len;
> -	struct inode *inode = page->mapping->host;
> +	struct inode *inode = folio->mapping->host;
>  	unsigned block_start, block_end;
>  	sector_t block;
>  	int err = 0;
>  	unsigned blocksize, bbits;
>  	struct buffer_head *bh, *head, *wait[2], **wait_bh=wait;
>  
> -	BUG_ON(!PageLocked(page));
> +	BUG_ON(!folio_test_locked(folio));
>  	BUG_ON(from > PAGE_SIZE);
>  	BUG_ON(to > PAGE_SIZE);
>  	BUG_ON(from > to);
>  
> -	head = create_page_buffers(page, inode, 0);
> +	head = create_page_buffers(&folio->page, inode, 0);
>  	blocksize = head->b_size;
>  	bbits = block_size_bits(blocksize);
>  
> -	block = (sector_t)page->index << (PAGE_SHIFT - bbits);
> +	block = (sector_t)folio->index << (PAGE_SHIFT - bbits);
>  
>  	for(bh = head, block_start = 0; bh != head || !block_start;
>  	    block++, block_start=block_end, bh = bh->b_this_page) {
>  		block_end = block_start + blocksize;
>  		if (block_end <= from || block_start >= to) {
> -			if (PageUptodate(page)) {
> +			if (folio_test_uptodate(folio)) {
>  				if (!buffer_uptodate(bh))
>  					set_buffer_uptodate(bh);
>  			}
> @@ -2016,20 +2016,20 @@ int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
>  
>  			if (buffer_new(bh)) {
>  				clean_bdev_bh_alias(bh);
> -				if (PageUptodate(page)) {
> +				if (folio_test_uptodate(folio)) {
>  					clear_buffer_new(bh);
>  					set_buffer_uptodate(bh);
>  					mark_buffer_dirty(bh);
>  					continue;
>  				}
>  				if (block_end > to || block_start < from)
> -					zero_user_segments(page,
> +					folio_zero_segments(folio,
>  						to, block_end,
>  						block_start, from);
>  				continue;
>  			}
>  		}
> -		if (PageUptodate(page)) {
> +		if (folio_test_uptodate(folio)) {
>  			if (!buffer_uptodate(bh))
>  				set_buffer_uptodate(bh);
>  			continue; 
> @@ -2050,14 +2050,14 @@ int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
>  			err = -EIO;
>  	}
>  	if (unlikely(err))
> -		page_zero_new_buffers(page, from, to);
> +		page_zero_new_buffers(&folio->page, from, to);
>  	return err;
>  }
>  
>  int __block_write_begin(struct page *page, loff_t pos, unsigned len,
>  		get_block_t *get_block)
>  {
> -	return __block_write_begin_int(page, pos, len, get_block, NULL);
> +	return __block_write_begin_int(page_folio(page), pos, len, get_block, NULL);
>  }
>  EXPORT_SYMBOL(__block_write_begin);
>  
> diff --git a/fs/internal.h b/fs/internal.h
> index cdd83d4899bb..afc13443392b 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -37,7 +37,7 @@ static inline int emergency_thaw_bdev(struct super_block *sb)
>  /*
>   * buffer.c
>   */
> -int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
> +int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
>  		get_block_t *get_block, const struct iomap *iomap);
>  
>  /*
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 1753c26c8e76..4e09ea823148 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -597,6 +597,7 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
>  	const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
>  	const struct iomap *srcmap = iomap_iter_srcmap(iter);
>  	struct page *page;
> +	struct folio *folio;
>  	int status = 0;
>  
>  	BUG_ON(pos + len > iter->iomap.offset + iter->iomap.length);
> @@ -618,11 +619,12 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
>  		status = -ENOMEM;
>  		goto out_no_page;
>  	}
> +	folio = page_folio(page);
>  
>  	if (srcmap->type == IOMAP_INLINE)
>  		status = iomap_write_begin_inline(iter, page);
>  	else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
> -		status = __block_write_begin_int(page, pos, len, NULL, srcmap);
> +		status = __block_write_begin_int(folio, pos, len, NULL, srcmap);
>  	else
>  		status = __iomap_write_begin(iter, pos, len, page);
>  
> @@ -954,11 +956,12 @@ EXPORT_SYMBOL_GPL(iomap_truncate_page);
>  static loff_t iomap_page_mkwrite_iter(struct iomap_iter *iter,
>  		struct page *page)
>  {
> +	struct folio *folio = page_folio(page);
>  	loff_t length = iomap_length(iter);
>  	int ret;
>  
>  	if (iter->iomap.flags & IOMAP_F_BUFFER_HEAD) {
> -		ret = __block_write_begin_int(page, iter->pos, length, NULL,
> +		ret = __block_write_begin_int(folio, iter->pos, length, NULL,
>  					      &iter->iomap);
>  		if (ret)
>  			return ret;
> -- 
> 2.33.0
> 
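
For readers unfamiliar with the buffer walk being converted here, this
standalone model shows how __block_write_begin_int() classifies each block of
a (single-page) folio against the byte range being written.  The numbers are
made up; only the from/to arithmetic mirrors the real function:

#include <stdio.h>

#define PAGE_SIZE 4096UL

int main(void)
{
	unsigned long pos = 5000, len = 2000, blocksize = 1024;
	unsigned long from = pos & (PAGE_SIZE - 1);	/* 904 */
	unsigned long to = from + len;			/* 2904 */
	unsigned long block_start, block_end;

	for (block_start = 0; block_start < PAGE_SIZE;
	     block_start = block_end) {
		block_end = block_start + blocksize;
		if (block_end <= from || block_start >= to)
			printf("block [%lu, %lu): untouched by the write\n",
			       block_start, block_end);
		else
			printf("block [%lu, %lu): must be mapped/read\n",
			       block_start, block_end);
	}
	return 0;
}

Only the last block falls wholly outside [from, to); the other three overlap
the write and take the mapping/read path in the real function.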


* Re: [PATCH v2 03/28] fs: Remove FS_THP_SUPPORT
  2021-11-08  4:05 ` [PATCH v2 03/28] fs: Remove FS_THP_SUPPORT Matthew Wilcox (Oracle)
@ 2021-11-17  4:36   ` Darrick J. Wong
  0 siblings, 0 replies; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  4:36 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:26AM +0000, Matthew Wilcox (Oracle) wrote:
> Instead of setting a bit in the fs_flags to set a bit in the
> address_space, set the bit in the address_space directly.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Makes sense,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/inode.c              |  2 --
>  include/linux/fs.h      |  1 -
>  include/linux/pagemap.h | 16 ++++++++++++++++
>  mm/shmem.c              |  3 ++-
>  4 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index 9abc88d7959c..d6386b6d5a6e 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -180,8 +180,6 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
>  	mapping->a_ops = &empty_aops;
>  	mapping->host = inode;
>  	mapping->flags = 0;
> -	if (sb->s_type->fs_flags & FS_THP_SUPPORT)
> -		__set_bit(AS_THP_SUPPORT, &mapping->flags);
>  	mapping->wb_err = 0;
>  	atomic_set(&mapping->i_mmap_writable, 0);
>  #ifdef CONFIG_READ_ONLY_THP_FOR_FS
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 4137a9bfae7a..3c2fcabf9d12 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2518,7 +2518,6 @@ struct file_system_type {
>  #define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
>  #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
>  #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
> -#define FS_THP_SUPPORT		8192	/* Remove once all fs converted */
>  #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
>  	int (*init_fs_context)(struct fs_context *);
>  	const struct fs_parameter_spec *parameters;
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index db2c3e3eb1cf..471f0c422831 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -126,6 +126,22 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
>  	m->gfp_mask = mask;
>  }
>  
> +/**
> + * mapping_set_large_folios() - Indicate the file supports multi-page folios.
> + * @mapping: The file.
> + *
> + * The filesystem should call this function in its inode constructor to
> + * indicate that the VFS can use multi-page folios to cache the contents
> + * of the file.
> + *
> + * Context: This should not be called while the inode is active as it
> + * is non-atomic.
> + */
> +static inline void mapping_set_large_folios(struct address_space *mapping)
> +{
> +	__set_bit(AS_THP_SUPPORT, &mapping->flags);
> +}
> +
>  static inline bool mapping_thp_support(struct address_space *mapping)
>  {
>  	return test_bit(AS_THP_SUPPORT, &mapping->flags);
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 23c91a8beb78..54422933fa2d 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2303,6 +2303,7 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode
>  		INIT_LIST_HEAD(&info->swaplist);
>  		simple_xattrs_init(&info->xattrs);
>  		cache_no_acl(inode);
> +		mapping_set_large_folios(inode->i_mapping);
>  
>  		switch (mode & S_IFMT) {
>  		default:
> @@ -3920,7 +3921,7 @@ static struct file_system_type shmem_fs_type = {
>  	.parameters	= shmem_fs_parameters,
>  #endif
>  	.kill_sb	= kill_litter_super,
> -	.fs_flags	= FS_USERNS_MOUNT | FS_THP_SUPPORT,
> +	.fs_flags	= FS_USERNS_MOUNT,
>  };
>  
>  int __init shmem_init(void)
> -- 
> 2.33.0
> 
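
The mechanics of the patch are easy to model in userspace: the fs_flags
indirection goes away and the filesystem sets the address_space bit directly
in its inode constructor.  A sketch, where the bit position is illustrative
rather than the kernel's actual value:

#include <stdbool.h>
#include <stdio.h>

struct address_space { unsigned long flags; };

#define AS_THP_SUPPORT 6	/* illustrative bit position */

static void mapping_set_large_folios(struct address_space *mapping)
{
	mapping->flags |= 1UL << AS_THP_SUPPORT;
}

static bool mapping_thp_support(struct address_space *mapping)
{
	return mapping->flags & (1UL << AS_THP_SUPPORT);
}

int main(void)
{
	struct address_space mapping = { .flags = 0 };

	/* what shmem_get_inode() now does in its constructor */
	mapping_set_large_folios(&mapping);
	printf("large folios supported: %d\n", mapping_thp_support(&mapping));
	return 0;
}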


* Re: [PATCH v2 02/28] mm: Add functions to zero portions of a folio
  2021-11-08  4:05 ` [PATCH v2 02/28] mm: Add functions to zero portions of a folio Matthew Wilcox (Oracle)
  2021-11-09  8:40   ` Christoph Hellwig
@ 2021-11-17  4:45   ` Darrick J. Wong
  2021-11-17 14:07     ` Matthew Wilcox
  1 sibling, 1 reply; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  4:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:25AM +0000, Matthew Wilcox (Oracle) wrote:
> These functions are wrappers around zero_user_segments(), which means
> that zero_user_segments() can now be called for compound pages even when
> CONFIG_TRANSPARENT_HUGEPAGE is disabled.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/highmem.h | 44 ++++++++++++++++++++++++++++++++++++++---
>  mm/highmem.c            |  2 --
>  2 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 25aff0f2ed0b..c343c69bb5b4 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -231,10 +231,10 @@ static inline void tag_clear_highpage(struct page *page)
>   * If we pass in a base or tail page, we can zero up to PAGE_SIZE.
>   * If we pass in a head page, we can zero up to the size of the compound page.
>   */
> -#if defined(CONFIG_HIGHMEM) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
> +#ifdef CONFIG_HIGHMEM
>  void zero_user_segments(struct page *page, unsigned start1, unsigned end1,
>  		unsigned start2, unsigned end2);
> -#else /* !HIGHMEM || !TRANSPARENT_HUGEPAGE */
> +#else
>  static inline void zero_user_segments(struct page *page,
>  		unsigned start1, unsigned end1,
>  		unsigned start2, unsigned end2)
> @@ -254,7 +254,7 @@ static inline void zero_user_segments(struct page *page,
>  	for (i = 0; i < compound_nr(page); i++)
>  		flush_dcache_page(page + i);
>  }
> -#endif /* !HIGHMEM || !TRANSPARENT_HUGEPAGE */
> +#endif
>  
>  static inline void zero_user_segment(struct page *page,
>  	unsigned start, unsigned end)
> @@ -364,4 +364,42 @@ static inline void memzero_page(struct page *page, size_t offset, size_t len)
>  	kunmap_local(addr);
>  }
>  
> +/**
> + * folio_zero_segments() - Zero two byte ranges in a folio.
> + * @folio: The folio to write to.
> + * @start1: The first byte to zero.
> + * @end1: One more than the last byte in the first range.
> + * @start2: The first byte to zero in the second range.
> + * @end2: One more than the last byte in the second range.
> + */
> +static inline void folio_zero_segments(struct folio *folio,
> +		size_t start1, size_t end1, size_t start2, size_t end2)
> +{
> +	zero_user_segments(&folio->page, start1, end1, start2, end2);
> +}
> +
> +/**
> + * folio_zero_segment() - Zero a byte range in a folio.
> + * @folio: The folio to write to.
> + * @start: The first byte to zero.
> + * @end: One more than the last byte in the first range.
> + */
> +static inline void folio_zero_segment(struct folio *folio,
> +		size_t start, size_t end)
> +{
> +	zero_user_segments(&folio->page, start, end, 0, 0);
> +}
> +
> +/**
> + * folio_zero_range() - Zero a byte range in a folio.
> + * @folio: The folio to write to.
> + * @start: The first byte to zero.
> + * @length: The number of bytes to zero.
> + */
> +static inline void folio_zero_range(struct folio *folio,
> +		size_t start, size_t length)
> +{
> +	zero_user_segments(&folio->page, start, start + length, 0, 0);

At first I thought "Gee, this is wrong, end should be start+length-1!"

Then I looked at zero_user_segments and realized that despite the
parameter name "end1", it really wants you to tell it the next byte.
Not the end byte of the range you want to zero.

Then I looked at the other two new functions and saw that you documented
this, and now I get why Linus ranted about this some time ago.

The code looks right, but the "end" names rankle me.  Can we please
change them all?  Or at least in the new functions, if you all already
fought a flamewar over this that I'm not aware of?

Almost-Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> +}
> +
>  #endif /* _LINUX_HIGHMEM_H */
> diff --git a/mm/highmem.c b/mm/highmem.c
> index 88f65f155845..819d41140e5b 100644
> --- a/mm/highmem.c
> +++ b/mm/highmem.c
> @@ -359,7 +359,6 @@ void kunmap_high(struct page *page)
>  }
>  EXPORT_SYMBOL(kunmap_high);
>  
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  void zero_user_segments(struct page *page, unsigned start1, unsigned end1,
>  		unsigned start2, unsigned end2)
>  {
> @@ -416,7 +415,6 @@ void zero_user_segments(struct page *page, unsigned start1, unsigned end1,
>  	BUG_ON((start1 | start2 | end1 | end2) != 0);
>  }
>  EXPORT_SYMBOL(zero_user_segments);
> -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  #endif /* CONFIG_HIGHMEM */
>  
>  #ifdef CONFIG_KMAP_LOCAL
> -- 
> 2.33.0
> 
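
The exclusive-'end' convention that the review comments above (and the thread
below) turn on is easy to demonstrate with a userspace model of these
wrappers, using the 'xend' spelling the thread eventually settles on:

#include <stdio.h>
#include <string.h>

static char folio[16];

/* 'xend' is exclusive: zero bytes [start, xend), not [start, xend]. */
static void folio_zero_segment(size_t start, size_t xend)
{
	memset(folio + start, 0, xend - start);
}

static void folio_zero_range(size_t start, size_t length)
{
	folio_zero_segment(start, start + length);
}

int main(void)
{
	memset(folio, 'x', sizeof(folio));
	folio_zero_range(4, 8);			/* zeroes bytes 4..11 only */
	for (size_t i = 0; i < sizeof(folio); i++)
		putchar(folio[i] ? folio[i] : '.');
	putchar('\n');				/* prints xxxx........xxxx */
	return 0;
}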


* Re: [PATCH v2 05/28] block: Add bio_add_folio()
  2021-11-08  4:05 ` [PATCH v2 05/28] block: Add bio_add_folio() Matthew Wilcox (Oracle)
@ 2021-11-17  4:48   ` Darrick J. Wong
  0 siblings, 0 replies; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  4:48 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:28AM +0000, Matthew Wilcox (Oracle) wrote:
> This is a thin wrapper around bio_add_page().  The main advantage here
> is the documentation that folios of 4GiB or larger are not supported.
> It's not currently possible to allocate folios that large, but if it
> ever becomes possible, this function will fail gracefully instead of
> doing I/O to the wrong bytes.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Jens Axboe <axboe@kernel.dk>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  block/bio.c         | 22 ++++++++++++++++++++++
>  include/linux/bio.h |  3 ++-
>  2 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 15ab0d6d1c06..4b3087e20d51 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1033,6 +1033,28 @@ int bio_add_page(struct bio *bio, struct page *page,
>  }
>  EXPORT_SYMBOL(bio_add_page);
>  
> +/**
> + * bio_add_folio - Attempt to add part of a folio to a bio.
> + * @bio: BIO to add to.
> + * @folio: Folio to add.
> + * @len: How many bytes from the folio to add.
> + * @off: First byte in this folio to add.
> + *
> + * Filesystems that use folios can call this function instead of calling
> + * bio_add_page() for each page in the folio.  If @off is bigger than
> + * PAGE_SIZE, this function can create a bio_vec that starts in a page
> + * after the bv_page.  BIOs do not support folios that are 4GiB or larger.
> + *
> + * Return: Whether the addition was successful.
> + */
> +bool bio_add_folio(struct bio *bio, struct folio *folio, size_t len,
> +		   size_t off)
> +{
> +	if (len > UINT_MAX || off > UINT_MAX)
> +		return 0;
> +	return bio_add_page(bio, &folio->page, len, off) > 0;
> +}
> +
>  void __bio_release_pages(struct bio *bio, bool mark_dirty)
>  {
>  	struct bvec_iter_all iter_all;
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index fe6bdfbbef66..a783cac49978 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -409,7 +409,8 @@ extern void bio_uninit(struct bio *);
>  extern void bio_reset(struct bio *);
>  void bio_chain(struct bio *, struct bio *);
>  
> -extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
> +int bio_add_page(struct bio *, struct page *, unsigned len, unsigned off);
> +bool bio_add_folio(struct bio *, struct folio *, size_t len, size_t off);
>  extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
>  			   unsigned int, unsigned int);
>  int bio_add_zone_append_page(struct bio *bio, struct page *page,
> -- 
> 2.33.0
> 
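
The len/off guard is the only behavioural addition over bio_add_page().  A
userspace model of just that check follows; folio_fits_bio() is a made-up
name, and UINT32_MAX stands in for the kernel's UINT_MAX on the assumption
that unsigned int is 32 bits:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* bio_vec lengths/offsets are 32-bit; fail instead of truncating. */
static bool folio_fits_bio(size_t len, size_t off)
{
	return len <= UINT32_MAX && off <= UINT32_MAX;
}

int main(void)
{
	printf("%d\n", folio_fits_bio(4096, 8192));		/* 1: fine */
	printf("%d\n", folio_fits_bio((size_t)5 << 30, 0));	/* 0 on 64-bit */
	return 0;
}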


* Re: [PATCH v2 06/28] block: Add bio_for_each_folio_all()
  2021-11-08  4:05 ` [PATCH v2 06/28] block: Add bio_for_each_folio_all() Matthew Wilcox (Oracle)
@ 2021-11-17  4:48   ` Darrick J. Wong
  0 siblings, 0 replies; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17  4:48 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:29AM +0000, Matthew Wilcox (Oracle) wrote:
> Allow callers to iterate over each folio instead of each page.  The
> bio need not have been constructed using folios originally.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Jens Axboe <axboe@kernel.dk>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  Documentation/core-api/kernel-api.rst |  1 +
>  include/linux/bio.h                   | 53 ++++++++++++++++++++++++++-
>  2 files changed, 53 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst
> index 2e7186805148..7f0cb604b6ab 100644
> --- a/Documentation/core-api/kernel-api.rst
> +++ b/Documentation/core-api/kernel-api.rst
> @@ -279,6 +279,7 @@ Accounting Framework
>  Block Devices
>  =============
>  
> +.. kernel-doc:: include/linux/bio.h
>  .. kernel-doc:: block/blk-core.c
>     :export:
>  
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index a783cac49978..e3c9e8207f12 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -166,7 +166,7 @@ static inline void bio_advance(struct bio *bio, unsigned int nbytes)
>   */
>  #define bio_for_each_bvec_all(bvl, bio, i)		\
>  	for (i = 0, bvl = bio_first_bvec_all(bio);	\
> -	     i < (bio)->bi_vcnt; i++, bvl++)		\
> +	     i < (bio)->bi_vcnt; i++, bvl++)
>  
>  #define bio_iter_last(bvec, iter) ((iter).bi_size == (bvec).bv_len)
>  
> @@ -260,6 +260,57 @@ static inline struct bio_vec *bio_last_bvec_all(struct bio *bio)
>  	return &bio->bi_io_vec[bio->bi_vcnt - 1];
>  }
>  
> +/**
> + * struct folio_iter - State for iterating all folios in a bio.
> + * @folio: The current folio we're iterating.  NULL after the last folio.
> + * @offset: The byte offset within the current folio.
> + * @length: The number of bytes in this iteration (will not cross folio
> + *	boundary).
> + */
> +struct folio_iter {
> +	struct folio *folio;
> +	size_t offset;
> +	size_t length;
> +	/* private: for use by the iterator */
> +	size_t _seg_count;
> +	int _i;
> +};
> +
> +static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
> +				   int i)
> +{
> +	struct bio_vec *bvec = bio_first_bvec_all(bio) + i;
> +
> +	fi->folio = page_folio(bvec->bv_page);
> +	fi->offset = bvec->bv_offset +
> +			PAGE_SIZE * (bvec->bv_page - &fi->folio->page);
> +	fi->_seg_count = bvec->bv_len;
> +	fi->length = min(folio_size(fi->folio) - fi->offset, fi->_seg_count);
> +	fi->_i = i;
> +}
> +
> +static inline void bio_next_folio(struct folio_iter *fi, struct bio *bio)
> +{
> +	fi->_seg_count -= fi->length;
> +	if (fi->_seg_count) {
> +		fi->folio = folio_next(fi->folio);
> +		fi->offset = 0;
> +		fi->length = min(folio_size(fi->folio), fi->_seg_count);
> +	} else if (fi->_i + 1 < bio->bi_vcnt) {
> +		bio_first_folio(fi, bio, fi->_i + 1);
> +	} else {
> +		fi->folio = NULL;
> +	}
> +}
> +
> +/**
> + * bio_for_each_folio_all - Iterate over each folio in a bio.
> + * @fi: struct folio_iter which is updated for each folio.
> + * @bio: struct bio to iterate over.
> + */
> +#define bio_for_each_folio_all(fi, bio)				\
> +	for (bio_first_folio(&fi, bio, 0); fi.folio; bio_next_folio(&fi, bio))
> +
>  enum bip_flags {
>  	BIP_BLOCK_INTEGRITY	= 1 << 0, /* block layer owns integrity data */
>  	BIP_MAPPED_INTEGRITY	= 1 << 1, /* ref tag has been remapped */
> -- 
> 2.33.0
> 
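
The iterator's splitting behaviour is easier to see in a standalone model: a
single bio_vec byte range is chopped at folio boundaries exactly as
bio_first_folio()/bio_next_folio() do, here assuming fixed-size, naturally
aligned folios:

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define FOLIO_PAGES	4
#define FOLIO_SIZE	(FOLIO_PAGES * PAGE_SIZE)

int main(void)
{
	/* one bio_vec covering bytes [12288, 32768): spans two folios */
	size_t pos = 12288, seg_count = 20480;

	while (seg_count) {
		size_t folio = pos / FOLIO_SIZE;	/* which folio */
		size_t offset = pos % FOLIO_SIZE;	/* fi->offset */
		size_t length = FOLIO_SIZE - offset;	/* fi->length */

		if (length > seg_count)
			length = seg_count;
		printf("folio %zu: offset %zu, length %zu\n",
		       folio, offset, length);
		pos += length;
		seg_count -= length;
	}
	return 0;
}

The single bio_vec yields two iterations: (folio 0, offset 12288, length 4096)
and (folio 1, offset 0, length 16384), which is what a caller of
bio_for_each_folio_all() would observe.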


* Re: [PATCH v2 01/28] csky,sparc: Declare flush_dcache_folio()
  2021-11-16 21:49         ` Matthew Wilcox
@ 2021-11-17  9:52           ` Geert Uytterhoeven
  0 siblings, 0 replies; 64+ messages in thread
From: Geert Uytterhoeven @ 2021-11-17  9:52 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Darrick J . Wong, linux-xfs, Linux FS Devel,
	Linux Kernel Mailing List, linux-block, Jens Axboe

On Wed, Nov 17, 2021 at 2:22 AM Matthew Wilcox <willy@infradead.org> wrote:
> On Mon, Nov 15, 2021 at 10:33:01PM -0800, Christoph Hellwig wrote:
> > I see how this works no, but it is pretty horrible.  Why not something
> > simple like the patch below?  If/when an architecture actually
> > wants to override flush_dcache_folio we can find out how to best do
> > it:
>
> I'll stick this one into -next and see if anything blows up:
>
> From 14f55de74c68a3eb058cfdbf81414148b9bdaac7 Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Date: Sat, 6 Nov 2021 17:13:35 -0400
> Subject: [PATCH] Add linux/cacheflush.h
>
> Many architectures do not include asm-generic/cacheflush.h, so turn
> the includes on their head and add linux/cacheflush.h which includes
> asm/cacheflush.h.
>
> Move the flush_dcache_folio() declaration from asm-generic/cacheflush.h
> to linux/cacheflush.h and change linux/highmem.h to include
> linux/cacheflush.h instead of asm/cacheflush.h so that all necessary
> places will see flush_dcache_folio().
>
> More functions should have their default implementations moved in the
> future, but those are for follow-on patches.  This fixes csky, sparc and
> sparc64 which were missed in the commit which added flush_dcache_folio().
>
> Fixes: 08b0b0059bf1 ("mm: Add flush_dcache_folio()")
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

>  arch/m68k/include/asm/cacheflush_mm.h |  1 -

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
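
The layering the changelog describes ends up looking roughly like the sketch
below.  This is not the verbatim header, and the
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE guard name follows asm-generic conventions,
so treat the details as assumptions:

/* Sketch of the new include/linux/cacheflush.h; guard name assumed. */
#ifndef _LINUX_CACHEFLUSH_H
#define _LINUX_CACHEFLUSH_H

#include <asm/cacheflush.h>

struct folio;

#if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
/* The architecture flushes the dcache; folios need flushing too. */
void flush_dcache_folio(struct folio *folio);
#else
static inline void flush_dcache_folio(struct folio *folio)
{
}
#endif

#endif /* _LINUX_CACHEFLUSH_H */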


* Re: [PATCH v2 02/28] mm: Add functions to zero portions of a folio
  2021-11-17  4:45   ` Darrick J. Wong
@ 2021-11-17 14:07     ` Matthew Wilcox
  2021-11-17 17:07       ` Darrick J. Wong
  0 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox @ 2021-11-17 14:07 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Tue, Nov 16, 2021 at 08:45:27PM -0800, Darrick J. Wong wrote:
> > +/**
> > + * folio_zero_segment() - Zero a byte range in a folio.
> > + * @folio: The folio to write to.
> > + * @start: The first byte to zero.
> > + * @end: One more than the last byte in the first range.
> > + */
> > +static inline void folio_zero_segment(struct folio *folio,
> > +		size_t start, size_t end)
> > +{
> > +	zero_user_segments(&folio->page, start, end, 0, 0);
> > +}
> > +
> > +/**
> > + * folio_zero_range() - Zero a byte range in a folio.
> > + * @folio: The folio to write to.
> > + * @start: The first byte to zero.
> > + * @length: The number of bytes to zero.
> > + */
> > +static inline void folio_zero_range(struct folio *folio,
> > +		size_t start, size_t length)
> > +{
> > +	zero_user_segments(&folio->page, start, start + length, 0, 0);
> 
> At first I thought "Gee, this is wrong, end should be start+length-1!"
> 
> Then I looked at zero_user_segments and realized that despite the
> > parameter name "end1", it really wants you to tell it the next byte.
> Not the end byte of the range you want to zero.
> 
> Then I looked at the other two new functions and saw that you documented
> this, and now I get why Linus ranted about this some time ago.
> 
> The code looks right, but the "end" names rankle me.  Can we please
> change them all?  Or at least in the new functions, if you all already
> fought a flamewar over this that I'm not aware of?

Change them to what?  I tend to use 'end' to mean 'excluded end' and
'max' to mean 'included end'.  What would you call the excluded end?


* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-11-17  2:24   ` Darrick J. Wong
@ 2021-11-17 14:20     ` Matthew Wilcox
  0 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox @ 2021-11-17 14:20 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Tue, Nov 16, 2021 at 06:24:24PM -0800, Darrick J. Wong wrote:
> On Mon, Nov 08, 2021 at 04:05:42AM +0000, Matthew Wilcox (Oracle) wrote:
> > The zero iterator can work in folio-sized chunks instead of page-sized
> > chunks.  This will save a lot of page cache lookups if the file is cached
> > in multi-page folios.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> hch's dax decoupling series notwithstanding,
> 
> Though TBH I am kinda wondering how the two of you plan to resolve those
> kinds of differences -- I haven't looked at that series, though I think
> this one's been waiting in the wings for longer?

I haven't looked at that series either.

> Heck, I wonder how Matthew plans to merge all this given that it touches
> mm, fs, block, and iomap...?

I'm planning on sending a pull request to Linus on Monday for the first
few patches in this series:
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/for-next

Then I was hoping you'd take the block + fs/buffer + iomap pieces for
the next merge window.

> Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Thanks!  Going through and collecting all these now ...


* Re: [PATCH v2 20/28] iomap: Convert iomap_write_begin() and iomap_write_end() to folios
  2021-11-17  4:31   ` Darrick J. Wong
@ 2021-11-17 14:31     ` Matthew Wilcox
  2021-11-17 17:10       ` Darrick J. Wong
  0 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox @ 2021-11-17 14:31 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

On Tue, Nov 16, 2021 at 08:31:27PM -0800, Darrick J. Wong wrote:
> > @@ -764,16 +761,17 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
> >  			break;
> >  		}
> >  
> > -		status = iomap_write_begin(iter, pos, bytes, &page);
> > +		status = iomap_write_begin(iter, pos, bytes, &folio);
> >  		if (unlikely(status))
> >  			break;
> >  
> > +		page = folio_file_page(folio, pos >> PAGE_SHIFT);
> >  		if (mapping_writably_mapped(iter->inode->i_mapping))
> >  			flush_dcache_page(page);
> >  
> >  		copied = copy_page_from_iter_atomic(page, offset, bytes, i);
> 
> Hrmm.  In principle (or I guess even a subsequent patch), if we had
> multi-page folios, could we simply loop the pages in the folio instead
> of doing a single page and then calling back into iomap_write_begin to
> get (probably) the same folio?
> 
> This looks like a fairly straightforward conversion, but I was wondering
> about that one little point...

Theoretically, yes, we should be able to do that.  But all of this code
is pretty subtle ("What if we hit a page fault?  What if we're writing
to part of this folio from an mmap of a different part of this folio?
What if it's !Uptodate?  What if we hit this weird ARM super-mprotect
memory tag thing?  What if ...") and, frankly, I got scared.  So I've
left that as future work; someone else can try to wrap their brain around
all of this.
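
For reference, the folio_file_page() call quoted above picks the page of the
folio that backs a given file position.  A userspace model of the arithmetic,
assuming naturally aligned four-page folios:

#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define FOLIO_PAGES	4	/* illustrative: a 16KiB folio */

/* Which page of the folio backs file position 'pos'?  Folios are
 * naturally aligned, so it is the low bits of the page index. */
static unsigned page_within_folio(uint64_t pos)
{
	return (pos >> PAGE_SHIFT) & (FOLIO_PAGES - 1);
}

int main(void)
{
	printf("%u\n", page_within_folio(20480));	/* 20KiB -> page 1 */
	printf("%u\n", page_within_folio(65536));	/* 64KiB -> page 0 */
	return 0;
}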


* Re: [PATCH v2 02/28] mm: Add functions to zero portions of a folio
  2021-11-17 14:07     ` Matthew Wilcox
@ 2021-11-17 17:07       ` Darrick J. Wong
  2021-11-18 15:55         ` Matthew Wilcox
  0 siblings, 1 reply; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17 17:07 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Wed, Nov 17, 2021 at 02:07:00PM +0000, Matthew Wilcox wrote:
> On Tue, Nov 16, 2021 at 08:45:27PM -0800, Darrick J. Wong wrote:
> > > +/**
> > > + * folio_zero_segment() - Zero a byte range in a folio.
> > > + * @folio: The folio to write to.
> > > + * @start: The first byte to zero.
> > > + * @end: One more than the last byte in the first range.
> > > + */
> > > +static inline void folio_zero_segment(struct folio *folio,
> > > +		size_t start, size_t end)
> > > +{
> > > +	zero_user_segments(&folio->page, start, end, 0, 0);
> > > +}
> > > +
> > > +/**
> > > + * folio_zero_range() - Zero a byte range in a folio.
> > > + * @folio: The folio to write to.
> > > + * @start: The first byte to zero.
> > > + * @length: The number of bytes to zero.
> > > + */
> > > +static inline void folio_zero_range(struct folio *folio,
> > > +		size_t start, size_t length)
> > > +{
> > > +	zero_user_segments(&folio->page, start, start + length, 0, 0);
> > 
> > At first I thought "Gee, this is wrong, end should be start+length-1!"
> > 
> > Then I looked at zero_user_segments and realized that despite the
> > parameter name "endi1", it really wants you to tell it the next byte.
> > Not the end byte of the range you want to zero.
> > 
> > Then I looked at the other two new functions and saw that you documented
> > this, and now I get why Linus ranted about this some time ago.
> > 
> > The code looks right, but the "end" names rankle me.  Can we please
> > change them all?  Or at least in the new functions, if you all already
> > fought a flamewar over this that I'm not aware of?
> 
> Change them to what?  I tend to use 'end' to mean 'excluded end' and
> 'max' to mean 'included end'.  What would you call the excluded end?

I've started using 'next', or changing the code to make 'end' be the
last element in the range the caller wants to act upon.  The thing is,
those are all iterators, so 'next' fits, whereas it doesn't fit so well
for range zeroing where that might have been all the zeroing we wanted
to do.

Though.  'xend' (shorthand for 'excluded end') is different enough to
signal that the reader should pay attention.  Ok, how about xend then?

--D


* Re: [PATCH v2 20/28] iomap: Convert iomap_write_begin() and iomap_write_end() to folios
  2021-11-17 14:31     ` Matthew Wilcox
@ 2021-11-17 17:10       ` Darrick J. Wong
  0 siblings, 0 replies; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-17 17:10 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig, Christoph Hellwig

On Wed, Nov 17, 2021 at 02:31:26PM +0000, Matthew Wilcox wrote:
> On Tue, Nov 16, 2021 at 08:31:27PM -0800, Darrick J. Wong wrote:
> > > @@ -764,16 +761,17 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
> > >  			break;
> > >  		}
> > >  
> > > -		status = iomap_write_begin(iter, pos, bytes, &page);
> > > +		status = iomap_write_begin(iter, pos, bytes, &folio);
> > >  		if (unlikely(status))
> > >  			break;
> > >  
> > > +		page = folio_file_page(folio, pos >> PAGE_SHIFT);
> > >  		if (mapping_writably_mapped(iter->inode->i_mapping))
> > >  			flush_dcache_page(page);
> > >  
> > >  		copied = copy_page_from_iter_atomic(page, offset, bytes, i);
> > 
> > Hrmm.  In principle (or I guess even a subsequent patch), if we had
> > multi-page folios, could we simply loop the pages in the folio instead
> > of doing a single page and then calling back into iomap_write_begin to
> > get (probably) the same folio?
> > 
> > This looks like a fairly straightforward conversion, but I was wondering
> > about that one little point...
> 
> Theoretically, yes, we should be able to do that.  But all of this code
> is pretty subtle ("What if we hit a page fault?  What if we're writing
> to part of this folio from an mmap of a different part of this folio?
> What if it's !Uptodate?  What if we hit this weird ARM super-mprotect
> memory tag thing?  What if ...") and, frankly, I got scared.  So I've
> left that as future work; someone else can try to wrap their brain around
> all of this.

<nod> That's roughly the same conclusion I came to -- conceptually we
could keep walking pages until we hit /any/ problem or other difference
with the first page that we don't feel like dealing with, and pass that
count to iomap_end... but no need to try that right this second.

Just checking that I grokked what's going on in this series. :)

--D


* Re: [PATCH v2 02/28] mm: Add functions to zero portions of a folio
  2021-11-17 17:07       ` Darrick J. Wong
@ 2021-11-18 15:55         ` Matthew Wilcox
  2021-11-18 17:26           ` Darrick J. Wong
  0 siblings, 1 reply; 64+ messages in thread
From: Matthew Wilcox @ 2021-11-18 15:55 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Wed, Nov 17, 2021 at 09:07:07AM -0800, Darrick J. Wong wrote:
> I've started using 'next', or changing the code to make 'end' be the
> last element in the range the caller wants to act upon.  The thing is,
> those are all iterators, so 'next' fits, whereas it doesn't fit so well
> for range zeroing where that might have been all the zeroing we wanted
> to do.

Yeah, it doesn't really work so well for one of the patches in this
series:

                        if (buffer_new(bh)) {
...
                                        folio_zero_segments(folio,
                                                to, block_end,
                                                block_start, from);

("zero between block_start and block_end, except for the region
specified by 'from' and 'to'").  Except that for some reason the
ranges are specified backwards, so it's not obvious what's going on.
Converting that to folio_zero_ranges() would be a possibility, at the
expense of complexity in the caller, or using 'max' instead of 'end'
would also add complexity to the callers.

> Though.  'xend' (shorthand for 'excluded end') is different enough to
> signal that the reader should pay attention.  Ok, how about xend then?

Done!

@@ -367,26 +367,26 @@ static inline void memzero_page(struct page *page, size_t offset, size_t len)
  * folio_zero_segments() - Zero two byte ranges in a folio.
  * @folio: The folio to write to.
  * @start1: The first byte to zero.
- * @end1: One more than the last byte in the first range.
+ * @xend1: One more than the last byte in the first range.
  * @start2: The first byte to zero in the second range.
- * @end2: One more than the last byte in the second range.
+ * @xend2: One more than the last byte in the second range.
  */
 static inline void folio_zero_segments(struct folio *folio,
-               size_t start1, size_t end1, size_t start2, size_t end2)
+               size_t start1, size_t xend1, size_t start2, size_t xend2)
 {
-       zero_user_segments(&folio->page, start1, end1, start2, end2);
+       zero_user_segments(&folio->page, start1, xend1, start2, xend2);
 }

 /**
  * folio_zero_segment() - Zero a byte range in a folio.
  * @folio: The folio to write to.
  * @start: The first byte to zero.
- * @end: One more than the last byte in the first range.
+ * @xend: One more than the last byte to zero.
  */
 static inline void folio_zero_segment(struct folio *folio,
-               size_t start, size_t end)
+               size_t start, size_t xend)
 {
-       zero_user_segments(&folio->page, start, end, 0, 0);
+       zero_user_segments(&folio->page, start, xend, 0, 0);
 }

 /**



* Re: [PATCH v2 02/28] mm: Add functions to zero portions of a folio
  2021-11-18 15:55         ` Matthew Wilcox
@ 2021-11-18 17:26           ` Darrick J. Wong
  2021-11-18 20:08             ` Matthew Wilcox
  0 siblings, 1 reply; 64+ messages in thread
From: Darrick J. Wong @ 2021-11-18 17:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Thu, Nov 18, 2021 at 03:55:12PM +0000, Matthew Wilcox wrote:
> On Wed, Nov 17, 2021 at 09:07:07AM -0800, Darrick J. Wong wrote:
> > I've started using 'next', or changing the code to make 'end' be the
> > last element in the range the caller wants to act upon.  The thing is,
> > those are all iterators, so 'next' fits, whereas it doesn't fit so well
> > for range zeroing where that might have been all the zeroing we wanted
> > to do.
> 
> Yeah, it doesn't really work so well for one of the patches in this
> series:
> 
>                         if (buffer_new(bh)) {
> ...
>                                         folio_zero_segments(folio,
>                                                 to, block_end,
>                                                 block_start, from);
> 
> ("zero between block_start and block_end, except for the region
> specified by 'from' and 'to'").  Except that for some reason the
> ranges are specified backwards, so it's not obvious what's going on.
> Converting that to folio_zero_ranges() would be a possibility, at the
> expense of complexity in the caller, or using 'max' instead of 'end'
> would also add complexity to the callers.

The call above looks like it is preparing to copy some data into the
middle of a buffer by zero-initializing the bytes before and the bytes
after that middle region.

Admittedly my fs-addled brain actually finds this hot mess easier to
understand:

folio_zero_segments(folio, to, block_end - 1, block_start, from - 1);

but I suppose the xend method involves less subtraction everywhere.

> 
> > Though.  'xend' (shorthand for 'excluded end') is different enough to
> > signal that the reader should pay attention.  Ok, how about xend then?
> 
> Done!
> 
> @@ -367,26 +367,26 @@ static inline void memzero_page(struct page *page, size_t
> offset, size_t len)
>   * folio_zero_segments() - Zero two byte ranges in a folio.
>   * @folio: The folio to write to.
>   * @start1: The first byte to zero.
> - * @end1: One more than the last byte in the first range.
> + * @xend1: One more than the last byte in the first range.
>   * @start2: The first byte to zero in the second range.
> - * @end2: One more than the last byte in the second range.
> + * @xend2: One more than the last byte in the second range.
>   */
>  static inline void folio_zero_segments(struct folio *folio,
> -               size_t start1, size_t end1, size_t start2, size_t end2)
> +               size_t start1, size_t xend1, size_t start2, size_t xend2)
>  {
> -       zero_user_segments(&folio->page, start1, end1, start2, end2);
> +       zero_user_segments(&folio->page, start1, xend1, start2, xend2);
>  }
> 
>  /**
>   * folio_zero_segment() - Zero a byte range in a folio.
>   * @folio: The folio to write to.
>   * @start: The first byte to zero.
> - * @end: One more than the last byte in the first range.
> + * @xend: One more than the last byte to zero.
>   */
>  static inline void folio_zero_segment(struct folio *folio,
> -               size_t start, size_t end)
> +               size_t start, size_t xend)
>  {
> -       zero_user_segments(&folio->page, start, end, 0, 0);
> +       zero_user_segments(&folio->page, start, xend, 0, 0);

Works for me,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

>  }
> 
>  /**
> 


* Re: [PATCH v2 02/28] mm: Add functions to zero portions of a folio
  2021-11-18 17:26           ` Darrick J. Wong
@ 2021-11-18 20:08             ` Matthew Wilcox
  0 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox @ 2021-11-18 20:08 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Thu, Nov 18, 2021 at 09:26:15AM -0800, Darrick J. Wong wrote:
> On Thu, Nov 18, 2021 at 03:55:12PM +0000, Matthew Wilcox wrote:
> >                         if (buffer_new(bh)) {
> > ...
> >                                         folio_zero_segments(folio,
> >                                                 to, block_end,
> >                                                 block_start, from);
> > 
> > ("zero between block_start and block_end, except for the region
> > specified by 'from' and 'to'").  Except that for some reason the
> > ranges are specified backwards, so it's not obvious what's going on.
> > Converting that to folio_zero_ranges() would be a possibility, at the
> > expense of complexity in the caller, or using 'max' instead of 'end'
> > would also add complexity to the callers.
> 
> The call above looks like it is preparing to copy some data into the
> middle of a buffer by zero-initializing the bytes before and the bytes
> after that middle region.
> 
> Admittedly my fs-addled brain actually finds this hot mess easier to
> understand:
> 
> folio_zero_segments(folio, to, blocksize - 1, block_start, from - 1);
> 
> but I suppose the xend method involves less subtraction everywhere.

That's exactly what it's doing.  It's kind of funny because it's an
abstraction that permits a micro-optimisation (removing potentially one
kmap() call), but removes the opportunity for a larger optimisation
(removing several, and also removing calls to flush_dcache_folio).
That is, we could rewrite __block_write_begin_int() as:

static void *kremap_folio(void *kaddr, struct folio *folio)
{
	if (kaddr)
		return kaddr;
	/* buffer heads only support single page folios */
	return kmap_local_folio(folio, 0);
}

+       void *kaddr = NULL;
...
-                               if (block_end > to || block_start < from)
-                                       folio_zero_segments(folio,
-                                               to, block_end,
-                                               block_start, from);
+                               if (from > block_start) {
+                                       kaddr = kremap_folio(kaddr, folio);
+                                       memset(kaddr + block_start, 0,
+                                               from - block_start);
+                               }
+                               if (block_end > to) {
+                                       kaddr = kremap_folio(kaddr, folio);
+                                       memset(kaddr + to, 0, block_end - to);
+                               }
...
        }
+       if (kaddr) {
+               kunmap_local(kaddr);
+               flush_dcache_folio(folio);
+       }

That way if there are multiple unmapped+new buffers, we only kmap/kunmap
once per page.  I don't care to submit this as a patch though ... buffer
heads just need to go away.  iomap can't use an optimisation like this;
it already reports all the contiguous unmapped blocks as a single extent,
and if you have multiple unmapped extents per page, well ... I'm sorry
for you, but the overhead of kmap/kunmap is the least of your problems.

> Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Thanks.  Pushed to
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/for-next

I'll give that until Monday to soak and send a pull request.


* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-11-08  4:05 ` [PATCH v2 19/28] iomap: Convert __iomap_zero_iter " Matthew Wilcox (Oracle)
  2021-11-09  8:47   ` Christoph Hellwig
  2021-11-17  2:24   ` Darrick J. Wong
@ 2021-12-09 21:38   ` Matthew Wilcox
  2021-12-10 16:19     ` Matthew Wilcox
  2021-12-16 19:36     ` Darrick J. Wong
  2 siblings, 2 replies; 64+ messages in thread
From: Matthew Wilcox @ 2021-12-09 21:38 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Mon, Nov 08, 2021 at 04:05:42AM +0000, Matthew Wilcox (Oracle) wrote:
> +++ b/fs/iomap/buffered-io.c
> @@ -881,17 +881,20 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
>  
>  static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
>  {
> +	struct folio *folio;
>  	struct page *page;
>  	int status;
> -	unsigned offset = offset_in_page(pos);
> -	unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
> +	size_t offset, bytes;
>  
> -	status = iomap_write_begin(iter, pos, bytes, &page);
> +	status = iomap_write_begin(iter, pos, length, &page);

This turned out to be buggy.  Darrick and I figured out why his tests
were failing and mine weren't; this only shows up with a 4kB block
size filesystem and I was only testing with 1kB block size filesystems.
(at least on x86; I haven't figured out why it passes with 1kB block size
filesystems, so I'm not sure what would be true on other filesystems).
iomap_write_begin() is not prepared to deal with a length that spans a
page boundary.  So I'm replacing this patch with the following patches
(whitespace damaged; pick them up from
https://git.infradead.org/users/willy/linux.git/tag/refs/tags/iomap-folio-5.17c
if you want to compile them):

commit 412212960b72
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Thu Dec 9 15:47:44 2021 -0500

    iomap: Allow iomap_write_begin() to be called with the full length

    In the future, we want write_begin to know the entire length of the
    write so that it can choose to allocate large folios.  Pass the full
    length in from __iomap_zero_iter() and limit it where necessary.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index d67108489148..9270db17c435 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -968,6 +968,9 @@ static int gfs2_iomap_page_prepare(struct inode *inode, loff_t pos,
        struct gfs2_sbd *sdp = GFS2_SB(inode);
        unsigned int blocks;

+       /* gfs2 does not support large folios yet */
+       if (len > PAGE_SIZE)
+               len = PAGE_SIZE;
        blocks = ((pos & blockmask) + len + blockmask) >> inode->i_blkbits;
        return gfs2_trans_begin(sdp, RES_DINODE + blocks, 0);
 }
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 8d7a67655b60..67fcd3b9928d 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -632,6 +632,8 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
                goto out_no_page;
        }
        folio = page_folio(page);
+       if (pos + len > folio_pos(folio) + folio_size(folio))
+               len = folio_pos(folio) + folio_size(folio) - pos;

        if (srcmap->type == IOMAP_INLINE)
                status = iomap_write_begin_inline(iter, page);
@@ -891,16 +893,19 @@ static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
        struct page *page;
        int status;
        unsigned offset = offset_in_page(pos);
-       unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);

-       status = iomap_write_begin(iter, pos, bytes, &page);
+       if (length > UINT_MAX)
+               length = UINT_MAX;
+       status = iomap_write_begin(iter, pos, length, &page);
        if (status)
                return status;
+       if (length > PAGE_SIZE - offset)
+               length = PAGE_SIZE - offset;

-       zero_user(page, offset, bytes);
+       zero_user(page, offset, length);
        mark_page_accessed(page);

-       return iomap_write_end(iter, pos, bytes, bytes, page);
+       return iomap_write_end(iter, pos, length, length, page);
 }

 static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)


commit 78c747a1b3a1
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Fri Nov 5 14:24:09 2021 -0400

    iomap: Convert __iomap_zero_iter to use a folio
    
    The zero iterator can work in folio-sized chunks instead of page-sized
    chunks.  This will save a lot of page cache lookups if the file is cached
    in large folios.
    
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 67fcd3b9928d..bbde6d4f27cd 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -890,20 +890,23 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
 
 static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
 {
+       struct folio *folio;
        struct page *page;
        int status;
-       unsigned offset = offset_in_page(pos);
+       size_t offset;
 
        if (length > UINT_MAX)
                length = UINT_MAX;
        status = iomap_write_begin(iter, pos, length, &page);
        if (status)
                return status;
-       if (length > PAGE_SIZE - offset)
-               length = PAGE_SIZE - offset;
+       folio = page_folio(page);
 
-       zero_user(page, offset, length);
-       mark_page_accessed(page);
+       offset = offset_in_folio(folio, pos);
+       if (length > folio_size(folio) - offset)
+               length = folio_size(folio) - offset;
+       folio_zero_range(folio, offset, length);
+       folio_mark_accessed(folio);
 
        return iomap_write_end(iter, pos, length, length, page);
 }


The xfstests that Darrick identified as failing all passed.  Running a
full sweep now; then I'll re-run with a 1kB filesystem to be sure that
still passes.  Then I'll send another pull request.
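
The clamp added to iomap_write_begin() is the crux of the fix and is simple to
model in userspace: the caller may now pass the full remaining length, and the
function trims it to the folio it actually locked.  Illustrative numbers:

#include <stdio.h>
#include <stdint.h>

static uint64_t clamp_to_folio(uint64_t pos, uint64_t len,
			       uint64_t folio_pos, uint64_t folio_size)
{
	if (pos + len > folio_pos + folio_size)
		len = folio_pos + folio_size - pos;
	return len;
}

int main(void)
{
	/* 100KiB zeroing request hitting a 16KiB folio starting at 16KiB */
	uint64_t len = clamp_to_folio(20480, 102400, 16384, 16384);

	printf("trimmed to %llu bytes\n", (unsigned long long)len);	/* 12288 */
	return 0;
}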


* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-12-09 21:38   ` Matthew Wilcox
@ 2021-12-10 16:19     ` Matthew Wilcox
  2021-12-13  7:34       ` Christoph Hellwig
  2021-12-16 19:36     ` Darrick J. Wong
  1 sibling, 1 reply; 64+ messages in thread
From: Matthew Wilcox @ 2021-12-10 16:19 UTC (permalink / raw)
  To: Darrick J . Wong 
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Thu, Dec 09, 2021 at 09:38:03PM +0000, Matthew Wilcox wrote:
> @@ -891,16 +893,19 @@ static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
>         struct page *page;
>         int status;
>         unsigned offset = offset_in_page(pos);
> -       unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
> 
> -       status = iomap_write_begin(iter, pos, bytes, &page);
> +       if (length > UINT_MAX)
> +               length = UINT_MAX;
> +       status = iomap_write_begin(iter, pos, length, &page);
>         if (status)
>                 return status;
> +       if (length > PAGE_SIZE - offset)
> +               length = PAGE_SIZE - offset;
> 
> -       zero_user(page, offset, bytes);
> +       zero_user(page, offset, length);
>         mark_page_accessed(page);
> 
> -       return iomap_write_end(iter, pos, bytes, bytes, page);
> +       return iomap_write_end(iter, pos, length, length, page);
>  }

After attempting the merge with Christoph's ill-timed refactoring,
I decided that eliding the use of 'bytes' here was the wrong approach,
because it very much needs to be put back in for the merge.

Here's the merge as I have it:

diff --cc fs/iomap/buffered-io.c
index f3176cf90351,d1aa0f0e7fd5..40356db3e856
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@@ -888,19 -926,12 +904,23 @@@ static loff_t iomap_zero_iter(struct io
                return length;

        do {
-               unsigned offset = offset_in_page(pos);
-               size_t bytes = min_t(u64, PAGE_SIZE - offset, length);
-               struct page *page;
 -              s64 bytes;
++              struct folio *folio;
 +              int status;
++              size_t offset;
++              size_t bytes = min_t(u64, SIZE_MAX, length);
 +
-               status = iomap_write_begin(iter, pos, bytes, &page);
++              status = iomap_write_begin(iter, pos, bytes, &folio);
 +              if (status)
 +                      return status;
 +
-               zero_user(page, offset, bytes);
-               mark_page_accessed(page);
++              offset = offset_in_folio(folio, pos);
++              if (bytes > folio_size(folio) - offset)
++                      bytes = folio_size(folio) - offset;
++
++              folio_zero_range(folio, offset, bytes);
++              folio_mark_accessed(folio);

-               bytes = iomap_write_end(iter, pos, bytes, bytes, page);
 -              if (IS_DAX(iter->inode))
 -                      bytes = dax_iomap_zero(pos, length, iomap);
 -              else
 -                      bytes = __iomap_zero_iter(iter, pos, length);
++              bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
                if (bytes < 0)
                        return bytes;

I've pushed out a new tag:

https://git.infradead.org/users/willy/linux.git/shortlog/refs/tags/iomap-folio-5.17d



* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-12-10 16:19     ` Matthew Wilcox
@ 2021-12-13  7:34       ` Christoph Hellwig
  2021-12-13 18:08         ` Matthew Wilcox
  0 siblings, 1 reply; 64+ messages in thread
From: Christoph Hellwig @ 2021-12-13  7:34 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Fri, Dec 10, 2021 at 04:19:54PM +0000, Matthew Wilcox wrote:
> After attempting the merge with Christoph's ill-timed refactoring,

I did give you a heads-up before...

> I decided that eliding the use of 'bytes' here was the wrong approach,
> because it very much needs to be put back in for the merge.

Is there any good reason to not just delay the iomap_zero_iter folio
conversion for now?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-12-13  7:34       ` Christoph Hellwig
@ 2021-12-13 18:08         ` Matthew Wilcox
  0 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox @ 2021-12-13 18:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J . Wong ,
	linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe

On Sun, Dec 12, 2021 at 11:34:54PM -0800, Christoph Hellwig wrote:
> On Fri, Dec 10, 2021 at 04:19:54PM +0000, Matthew Wilcox wrote:
> > After attempting the merge with Christoph's ill-timed refactoring,
> 
> I did give you a heads-up before...

I thought that was going in via Darrick's tree.  I had no idea Dan was
going to take it.

> > I decided that eliding the use of 'bytes' here was the wrong approach,
> > because it very much needs to be put back in for the merge.
> 
> Is there any good reason to not just delay the iomap_zero_iter folio
> conversion for now?

It would hold up about half of the iomap folio conversion (~10 patches).
I don't understand what the benefit of your patch series is.  Moving
filesystems away from being bdev-based just doesn't seem interesting
to me.  Having DAX as an optional feature that some bdevs have seems
like a far superior option.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-12-09 21:38   ` Matthew Wilcox
  2021-12-10 16:19     ` Matthew Wilcox
@ 2021-12-16 19:36     ` Darrick J. Wong
  2021-12-16 20:43       ` Matthew Wilcox
  1 sibling, 1 reply; 64+ messages in thread
From: Darrick J. Wong @ 2021-12-16 19:36 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Thu, Dec 09, 2021 at 09:38:03PM +0000, Matthew Wilcox wrote:
> On Mon, Nov 08, 2021 at 04:05:42AM +0000, Matthew Wilcox (Oracle) wrote:
> > +++ b/fs/iomap/buffered-io.c
> > @@ -881,17 +881,20 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
> >  
> >  static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
> >  {
> > +	struct folio *folio;
> >  	struct page *page;
> >  	int status;
> > -	unsigned offset = offset_in_page(pos);
> > -	unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
> > +	size_t offset, bytes;
> >  
> > -	status = iomap_write_begin(iter, pos, bytes, &page);
> > +	status = iomap_write_begin(iter, pos, length, &page);
> 
> This turned out to be buggy.  Darrick and I figured out why his tests
> were failing and mine weren't: the bug only shows up with a 4kB block
> size filesystem, and I was only testing with 1kB block size filesystems
> (at least on x86; I haven't figured out why it passes with 1kB block
> sizes, so I'm not sure what would be true elsewhere).
> iomap_write_begin() is not prepared to deal with a length that spans a
> page boundary.  So I'm replacing this patch with the following patches
> (whitespace damaged; pick them up from
> https://git.infradead.org/users/willy/linux.git/tag/refs/tags/iomap-folio-5.17c
> if you want to compile them):
> 
> commit 412212960b72
> Author: Matthew Wilcox (Oracle) <willy@infradead.org>
> Date:   Thu Dec 9 15:47:44 2021 -0500
> 
>     iomap: Allow iomap_write_begin() to be called with the full length
> 
>     In the future, we want write_begin to know the entire length of the
>     write so that it can choose to allocate large folios.  Pass the full
>     length in from __iomap_zero_iter() and limit it where necessary.
> 
>     Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index d67108489148..9270db17c435 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -968,6 +968,9 @@ static int gfs2_iomap_page_prepare(struct inode *inode, loff_t pos,
>         struct gfs2_sbd *sdp = GFS2_SB(inode);
>         unsigned int blocks;
> 
> +       /* gfs2 does not support large folios yet */
> +       if (len > PAGE_SIZE)
> +               len = PAGE_SIZE;

This is awkward -- gfs2 doesn't set the mapping flag to indicate that it
supports large folios, so it should never be asked to deal with more
than a page at a time.  Shouldn't iomap_write_begin clamp its len
argument to PAGE_SIZE at the start if the mapping doesn't have the large
folios flag set?

--D

>         blocks = ((pos & blockmask) + len + blockmask) >> inode->i_blkbits;
>         return gfs2_trans_begin(sdp, RES_DINODE + blocks, 0);
>  }
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 8d7a67655b60..67fcd3b9928d 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -632,6 +632,8 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
>                 goto out_no_page;
>         }
>         folio = page_folio(page);
> +       if (pos + len > folio_pos(folio) + folio_size(folio))
> +               len = folio_pos(folio) + folio_size(folio) - pos;
> 
>         if (srcmap->type == IOMAP_INLINE)
>                 status = iomap_write_begin_inline(iter, page);
> @@ -891,16 +893,19 @@ static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
>         struct page *page;
>         int status;
>         unsigned offset = offset_in_page(pos);
> -       unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
> 
> -       status = iomap_write_begin(iter, pos, bytes, &page);
> +       if (length > UINT_MAX)
> +               length = UINT_MAX;
> +       status = iomap_write_begin(iter, pos, length, &page);
>         if (status)
>                 return status;
> +       if (length > PAGE_SIZE - offset)
> +               length = PAGE_SIZE - offset;
> 
> -       zero_user(page, offset, bytes);
> +       zero_user(page, offset, length);
>         mark_page_accessed(page);
> 
> -       return iomap_write_end(iter, pos, bytes, bytes, page);
> +       return iomap_write_end(iter, pos, length, length, page);
>  }
> 
>  static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> 
> 
> commit 78c747a1b3a1
> Author: Matthew Wilcox (Oracle) <willy@infradead.org>
> Date:   Fri Nov 5 14:24:09 2021 -0400
> 
>     iomap: Convert __iomap_zero_iter to use a folio
>     
>     The zero iterator can work in folio-sized chunks instead of page-sized
>     chunks.  This will save a lot of page cache lookups if the file is cached
>     in large folios.
>     
>     Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>     Reviewed-by: Christoph Hellwig <hch@lst.de>
>     Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 67fcd3b9928d..bbde6d4f27cd 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -890,20 +890,23 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
>  
>  static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
>  {
> +       struct folio *folio;
>         struct page *page;
>         int status;
> -       unsigned offset = offset_in_page(pos);
> +       size_t offset;
>  
>         if (length > UINT_MAX)
>                 length = UINT_MAX;
>         status = iomap_write_begin(iter, pos, length, &page);
>         if (status)
>                 return status;
> -       if (length > PAGE_SIZE - offset)
> -               length = PAGE_SIZE - offset;
> +       folio = page_folio(page);
>  
> -       zero_user(page, offset, length);
> -       mark_page_accessed(page);
> +       offset = offset_in_folio(folio, pos);
> +       if (length > folio_size(folio) - offset)
> +               length = folio_size(folio) - offset;
> +       folio_zero_range(folio, offset, length);
> +       folio_mark_accessed(folio);
>  
>         return iomap_write_end(iter, pos, length, length, page);
>  }
> 
> 
> The xfstests that Darrick identified as failing all passed.  Running a
> full sweep now; then I'll re-run with a 1kB filesystem to be sure that
> still passes.  Then I'll send another pull request.
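
For reference, the folio_zero_range() used in the second patch above
was added by patch 02/28 ("mm: Add functions to zero portions of a
folio").  A sketch of the 5.17-era definition from
include/linux/highmem.h; the in-tree version may differ in detail:

	static inline void folio_zero_range(struct folio *folio,
			size_t start, size_t length)
	{
		/*
		 * Zero the byte range [start, start + length) of the
		 * folio, mapping highmem pages as needed; this works
		 * for both single-page and multi-page folios, which is
		 * why the zero iterator no longer has to split work at
		 * page boundaries itself.
		 */
		zero_user_segments(&folio->page, start, start + length, 0, 0);
	}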

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v2 19/28] iomap: Convert __iomap_zero_iter to use a folio
  2021-12-16 19:36     ` Darrick J. Wong
@ 2021-12-16 20:43       ` Matthew Wilcox
  0 siblings, 0 replies; 64+ messages in thread
From: Matthew Wilcox @ 2021-12-16 20:43 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-xfs, linux-fsdevel, linux-kernel, linux-block, Jens Axboe,
	Christoph Hellwig

On Thu, Dec 16, 2021 at 11:36:14AM -0800, Darrick J. Wong wrote:
> > 
> > +       /* gfs2 does not support large folios yet */
> > +       if (len > PAGE_SIZE)
> > +               len = PAGE_SIZE;
> 
> This is awkward -- gfs2 doesn't set the mapping flag to indicate that it
> supports large folios, so it should never be asked to deal with more
> than a page at a time.  Shouldn't iomap_write_begin clamp its len
> argument to PAGE_SIZE at the start if the mapping doesn't have the large
> folios flag set?

You're right, this is awkward.  And it's a bit of a bear trap for
another filesystem that wants to implement ->page_prepare in the
future.

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 9270db17c435..d67108489148 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -968,9 +968,6 @@ static int gfs2_iomap_page_prepare(struct inode *inode, loff_t pos,
        struct gfs2_sbd *sdp = GFS2_SB(inode);
        unsigned int blocks;

-       /* gfs2 does not support large folios yet */
-       if (len > PAGE_SIZE)
-               len = PAGE_SIZE;
        blocks = ((pos & blockmask) + len + blockmask) >> inode->i_blkbits;
        return gfs2_trans_begin(sdp, RES_DINODE + blocks, 0);
 }
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 1a9e897ee25a..b1ded5204d1c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -619,6 +619,9 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
        if (fatal_signal_pending(current))
                return -EINTR;
 
+       if (!mapping_large_folio_support(iter->inode->i_mapping))
+               len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos));
+
        if (page_ops && page_ops->page_prepare) {
                status = page_ops->page_prepare(iter->inode, pos, len);
                if (status)


^ permalink raw reply related	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2021-12-16 20:44 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-08  4:05 [PATCH v2 00/28] iomap/xfs folio patches Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 01/28] csky,sparc: Declare flush_dcache_folio() Matthew Wilcox (Oracle)
2021-11-09  8:36   ` Christoph Hellwig
2021-11-15 15:54     ` Matthew Wilcox
2021-11-16  6:33       ` Christoph Hellwig
2021-11-16 21:49         ` Matthew Wilcox
2021-11-17  9:52           ` Geert Uytterhoeven
2021-11-08  4:05 ` [PATCH v2 02/28] mm: Add functions to zero portions of a folio Matthew Wilcox (Oracle)
2021-11-09  8:40   ` Christoph Hellwig
2021-11-17  4:45   ` Darrick J. Wong
2021-11-17 14:07     ` Matthew Wilcox
2021-11-17 17:07       ` Darrick J. Wong
2021-11-18 15:55         ` Matthew Wilcox
2021-11-18 17:26           ` Darrick J. Wong
2021-11-18 20:08             ` Matthew Wilcox
2021-11-08  4:05 ` [PATCH v2 03/28] fs: Remove FS_THP_SUPPORT Matthew Wilcox (Oracle)
2021-11-17  4:36   ` Darrick J. Wong
2021-11-08  4:05 ` [PATCH v2 04/28] fs: Rename AS_THP_SUPPORT and mapping_thp_support Matthew Wilcox (Oracle)
2021-11-09  8:41   ` Christoph Hellwig
2021-11-15 16:03     ` Matthew Wilcox
2021-11-16  6:33       ` Christoph Hellwig
2021-11-08  4:05 ` [PATCH v2 05/28] block: Add bio_add_folio() Matthew Wilcox (Oracle)
2021-11-17  4:48   ` Darrick J. Wong
2021-11-08  4:05 ` [PATCH v2 06/28] block: Add bio_for_each_folio_all() Matthew Wilcox (Oracle)
2021-11-17  4:48   ` Darrick J. Wong
2021-11-08  4:05 ` [PATCH v2 07/28] fs/buffer: Convert __block_write_begin_int() to take a folio Matthew Wilcox (Oracle)
2021-11-09  8:42   ` Christoph Hellwig
2021-11-17  4:35   ` Darrick J. Wong
2021-11-08  4:05 ` [PATCH v2 08/28] iomap: Convert to_iomap_page " Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 09/28] iomap: Convert iomap_page_create " Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 10/28] iomap: Convert iomap_page_release " Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 11/28] iomap: Convert iomap_releasepage to use " Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 12/28] iomap: Add iomap_invalidate_folio Matthew Wilcox (Oracle)
2021-11-17  2:20   ` Darrick J. Wong
2021-11-08  4:05 ` [PATCH v2 13/28] iomap: Pass the iomap_page into iomap_set_range_uptodate Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 14/28] iomap: Convert bio completions to use folios Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 15/28] iomap: Use folio offsets instead of page offsets Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 16/28] iomap: Convert iomap_read_inline_data to take a folio Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 17/28] iomap: Convert readahead and readpage to use " Matthew Wilcox (Oracle)
2021-11-09  8:43   ` Christoph Hellwig
2021-11-08  4:05 ` [PATCH v2 18/28] iomap: Convert iomap_page_mkwrite " Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 19/28] iomap: Convert __iomap_zero_iter " Matthew Wilcox (Oracle)
2021-11-09  8:47   ` Christoph Hellwig
2021-11-17  2:24   ` Darrick J. Wong
2021-11-17 14:20     ` Matthew Wilcox
2021-12-09 21:38   ` Matthew Wilcox
2021-12-10 16:19     ` Matthew Wilcox
2021-12-13  7:34       ` Christoph Hellwig
2021-12-13 18:08         ` Matthew Wilcox
2021-12-16 19:36     ` Darrick J. Wong
2021-12-16 20:43       ` Matthew Wilcox
2021-11-08  4:05 ` [PATCH v2 20/28] iomap: Convert iomap_write_begin() and iomap_write_end() to folios Matthew Wilcox (Oracle)
2021-11-17  4:31   ` Darrick J. Wong
2021-11-17 14:31     ` Matthew Wilcox
2021-11-17 17:10       ` Darrick J. Wong
2021-11-08  4:05 ` [PATCH v2 21/28] iomap: Convert iomap_write_end_inline to take a folio Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 22/28] iomap,xfs: Convert ->discard_page to ->discard_folio Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 23/28] iomap: Simplify iomap_writepage_map() Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 24/28] iomap: Simplify iomap_do_writepage() Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 25/28] iomap: Convert iomap_add_to_ioend() to take a folio Matthew Wilcox (Oracle)
2021-11-17  4:34   ` Darrick J. Wong
2021-11-08  4:05 ` [PATCH v2 26/28] iomap: Convert iomap_migrate_page() to use folios Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 27/28] iomap: Support multi-page folios in invalidatepage Matthew Wilcox (Oracle)
2021-11-08  4:05 ` [PATCH v2 28/28] xfs: Support multi-page folios Matthew Wilcox (Oracle)
