linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] iomap: small block problems
@ 2021-06-28 17:27 Andreas Gruenbacher
  2021-06-28 17:27 ` [PATCH 1/2] iomap: Don't create iomap_page objects for inline files Andreas Gruenbacher
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-06-28 17:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J . Wong, linux-xfs, linux-fsdevel, cluster-devel,
	Andreas Gruenbacher

Hi Christoph and all,

the recent gfs2 switch from buffer heads to iomap for managing the
block-to-page mappings in commit 2164f9b91869 ("gfs2: use iomap for
buffered I/O in ordered and writeback mode") broke filesystems with a
block size smaller than the page size.  This haüüens when iomap_page
objects aren't attached to pages in all the right circumstances, or the
iop->upodate bitmap isn't kept in sync with the PageUptodata flag.
There are three instances of this problem:

(1) In iomap_readpage_actor, an iomap_page is attached to the page even
for inline inodes.  This is unnecessary because inline inodes don't need
iomap_page objects.  That alone wouldn't cause any real issues, but when
iomap_read_inline_data copies the inline data into the page, it sets the
PageUptodate flag without setting iop->uptodate, causing an
inconsistency between the two.  This will trigger a WARN_ON in
iomap_page_release.  The fix should be not to allocate iomap_page
objects when reading from inline inodes (patch 1).

(2) When un-inlining an inode, we must allocate a page with an attached
iomap_page object (iomap_page_create) and initialize the iop->uptodate
bitmap (iomap_set_range_uptodate).  We can't currently do that because
iomap_page_create and iomap_set_range_uptodate are not exported.  That
could be fixed by exporting those functions, or by implementing an
additional helper as in patch 2.  Which of the two would you prefer?

(3) We're not yet using iomap_page_mkwrite, so iomap_page objects don't
get created on .page_mkwrite, either.  Part of the reason is that
iomap_page_mkwrite locks the page and then calls into the filesystem for
uninlining and for allocating backing blocks.  This conflicts with the
gfs2 locking order: on gfs2, transactions must be started before locking
any pages.  We can fix that by calling iomap_page_create from
gfs2_page_mkwrite, or by doing the uninlining and allocations before
calling iomap_page_mkwrite.  I've implemented option 2 for now; see
here:

https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/log/?h=for-next.iomap-page-mkwrite

Any thoughts?

Thanks,
Andreas

Andreas Gruenbacher (1):
  iomap: Don't create iomap_page objects for inline files

Bob Peterson (1):
  iomap: Add helper for un-inlining an inline inode

 fs/iomap/buffered-io.c | 28 +++++++++++++++++++++++++++-
 include/linux/iomap.h  |  2 ++
 2 files changed, 29 insertions(+), 1 deletion(-)

-- 
2.26.3


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/2] iomap: Don't create iomap_page objects for inline files
  2021-06-28 17:27 [PATCH 0/2] iomap: small block problems Andreas Gruenbacher
@ 2021-06-28 17:27 ` Andreas Gruenbacher
  2021-06-30 13:35   ` Matthew Wilcox
  2021-06-28 17:27 ` [PATCH 2/2] iomap: Add helper for un-inlining an inline inode Andreas Gruenbacher
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-06-28 17:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J . Wong, linux-xfs, linux-fsdevel, cluster-devel,
	Andreas Gruenbacher

In iomap_readpage_actor, don't create iop objects for inline inodes.
Otherwise, iomap_read_inline_data will set PageUptodate without setting
iop->uptodate, and iomap_page_release will eventually complain.

To prevent this kind of bug from occurring in the future, make sure the
page doesn't have private data attached in iomap_read_inline_data.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/iomap/buffered-io.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 9023717c5188..03537ecb2a94 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -215,6 +215,7 @@ iomap_read_inline_data(struct inode *inode, struct page *page,
 	if (PageUptodate(page))
 		return;
 
+	BUG_ON(page_has_private(page));
 	BUG_ON(page->index);
 	BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
 
@@ -239,7 +240,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 {
 	struct iomap_readpage_ctx *ctx = data;
 	struct page *page = ctx->cur_page;
-	struct iomap_page *iop = iomap_page_create(inode, page);
+	struct iomap_page *iop;
 	bool same_page = false, is_contig = false;
 	loff_t orig_pos = pos;
 	unsigned poff, plen;
@@ -252,6 +253,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	}
 
 	/* zero post-eof blocks as the page may be mapped */
+	iop = iomap_page_create(inode, page);
 	iomap_adjust_read_range(inode, iop, &pos, length, &poff, &plen);
 	if (plen == 0)
 		goto done;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/2] iomap: Add helper for un-inlining an inline inode
  2021-06-28 17:27 [PATCH 0/2] iomap: small block problems Andreas Gruenbacher
  2021-06-28 17:27 ` [PATCH 1/2] iomap: Don't create iomap_page objects for inline files Andreas Gruenbacher
@ 2021-06-28 17:27 ` Andreas Gruenbacher
  2021-06-28 17:39 ` [PATCH 0/2] iomap: small block problems Matthew Wilcox
  2021-06-29  9:12 ` Andreas Gruenbacher
  3 siblings, 0 replies; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-06-28 17:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J . Wong, linux-xfs, linux-fsdevel, cluster-devel,
	Bob Peterson, Andreas Gruenbacher

From: Bob Peterson <rpeterso@redhat.com>

Add function iomap_uninline_inode for converting an inline inode into a
non-inline inode.  This takes care of attaching a new iomap_page object
to page->private if the block size is smaller than the page size.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/iomap/buffered-io.c | 24 ++++++++++++++++++++++++
 include/linux/iomap.h  |  2 ++
 2 files changed, 26 insertions(+)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 03537ecb2a94..44acb59191b2 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -226,6 +226,30 @@ iomap_read_inline_data(struct inode *inode, struct page *page,
 	SetPageUptodate(page);
 }
 
+int iomap_uninline_inode(struct inode *inode,
+		int (*uninline)(struct inode *, struct page *))
+{
+	struct page *page = NULL;
+	int ret;
+
+	if (i_size_read(inode)) {
+		page = find_or_create_page(inode->i_mapping, 0, GFP_NOFS);
+		if (!page)
+			return -ENOMEM;
+	}
+	ret = uninline(inode, page);
+	if (page) {
+		if (PageUptodate(page)) {
+			iomap_page_create(inode, page);
+			iomap_set_range_uptodate(page, 0, PAGE_SIZE);
+		}
+		unlock_page(page);
+		put_page(page);
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iomap_uninline_inode);
+
 static inline bool iomap_block_needs_zeroing(struct inode *inode,
 		struct iomap *iomap, loff_t pos)
 {
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index c87d0cb0de6d..90c924eec09b 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -157,6 +157,8 @@ loff_t iomap_apply(struct inode *inode, loff_t pos, loff_t length,
 
 ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
 		const struct iomap_ops *ops);
+int iomap_uninline_inode(struct inode *inode,
+                int (*uninline)(struct inode *, struct page *));
 int iomap_readpage(struct page *page, const struct iomap_ops *ops);
 void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
 int iomap_set_page_dirty(struct page *page);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-28 17:27 [PATCH 0/2] iomap: small block problems Andreas Gruenbacher
  2021-06-28 17:27 ` [PATCH 1/2] iomap: Don't create iomap_page objects for inline files Andreas Gruenbacher
  2021-06-28 17:27 ` [PATCH 2/2] iomap: Add helper for un-inlining an inline inode Andreas Gruenbacher
@ 2021-06-28 17:39 ` Matthew Wilcox
  2021-06-28 17:47   ` Christoph Hellwig
  2021-06-28 17:55   ` Andreas Gruenbacher
  2021-06-29  9:12 ` Andreas Gruenbacher
  3 siblings, 2 replies; 17+ messages in thread
From: Matthew Wilcox @ 2021-06-28 17:39 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Christoph Hellwig, Darrick J . Wong, linux-xfs, linux-fsdevel,
	cluster-devel

On Mon, Jun 28, 2021 at 07:27:25PM +0200, Andreas Gruenbacher wrote:
> (1) In iomap_readpage_actor, an iomap_page is attached to the page even
> for inline inodes.  This is unnecessary because inline inodes don't need
> iomap_page objects.  That alone wouldn't cause any real issues, but when
> iomap_read_inline_data copies the inline data into the page, it sets the
> PageUptodate flag without setting iop->uptodate, causing an
> inconsistency between the two.  This will trigger a WARN_ON in
> iomap_page_release.  The fix should be not to allocate iomap_page
> objects when reading from inline inodes (patch 1).

I don't have a problem with this patch.

> (2) When un-inlining an inode, we must allocate a page with an attached
> iomap_page object (iomap_page_create) and initialize the iop->uptodate
> bitmap (iomap_set_range_uptodate).  We can't currently do that because
> iomap_page_create and iomap_set_range_uptodate are not exported.  That
> could be fixed by exporting those functions, or by implementing an
> additional helper as in patch 2.  Which of the two would you prefer?

Not hugely happy with either of these options, tbh.  I'd rather we apply
a patch akin to this one (plucked from the folio tree), so won't apply:

@@ -1305,7 +1311,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
                struct writeback_control *wbc, struct inode *inode,
                struct folio *folio, loff_t end_pos)
 {
-       struct iomap_page *iop = to_iomap_page(folio);
+       struct iomap_page *iop = iomap_page_create(inode, folio);
        struct iomap_ioend *ioend, *next;
        unsigned len = i_blocksize(inode);
        unsigned nblocks = i_blocks_per_folio(inode, folio);
@@ -1313,7 +1319,6 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
        int error = 0, count = 0, i;
        LIST_HEAD(submit_list);

-       WARN_ON_ONCE(nblocks > 1 && !iop);
        WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);

        /*

so permit pages without an iop to enter writeback and create an iop
*then*.  Would that solve your problem?

> (3) We're not yet using iomap_page_mkwrite, so iomap_page objects don't
> get created on .page_mkwrite, either.  Part of the reason is that
> iomap_page_mkwrite locks the page and then calls into the filesystem for
> uninlining and for allocating backing blocks.  This conflicts with the
> gfs2 locking order: on gfs2, transactions must be started before locking
> any pages.  We can fix that by calling iomap_page_create from
> gfs2_page_mkwrite, or by doing the uninlining and allocations before
> calling iomap_page_mkwrite.  I've implemented option 2 for now; see
> here:

I think this might also solve this problem?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-28 17:39 ` [PATCH 0/2] iomap: small block problems Matthew Wilcox
@ 2021-06-28 17:47   ` Christoph Hellwig
  2021-06-28 21:28     ` Andreas Gruenbacher
  2021-06-28 21:59     ` Matthew Wilcox
  2021-06-28 17:55   ` Andreas Gruenbacher
  1 sibling, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2021-06-28 17:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andreas Gruenbacher, Christoph Hellwig, Darrick J . Wong,
	linux-xfs, linux-fsdevel, cluster-devel

On Mon, Jun 28, 2021 at 06:39:09PM +0100, Matthew Wilcox wrote:
> Not hugely happy with either of these options, tbh.  I'd rather we apply
> a patch akin to this one (plucked from the folio tree), so won't apply:

> so permit pages without an iop to enter writeback and create an iop
> *then*.  Would that solve your problem?

It is the right thing to do, especially when combined with a feature
patch to not bother to create the iomap_page structure on small
block size file systems when the extent covers the whole page.

> 
> > (3) We're not yet using iomap_page_mkwrite, so iomap_page objects don't
> > get created on .page_mkwrite, either.  Part of the reason is that
> > iomap_page_mkwrite locks the page and then calls into the filesystem for
> > uninlining and for allocating backing blocks.  This conflicts with the
> > gfs2 locking order: on gfs2, transactions must be started before locking
> > any pages.  We can fix that by calling iomap_page_create from
> > gfs2_page_mkwrite, or by doing the uninlining and allocations before
> > calling iomap_page_mkwrite.  I've implemented option 2 for now; see
> > here:
> 
> I think this might also solve this problem?

We'll still need to create the iomap_page structure for page_mkwrite
if there is an extent boundary inside the page.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-28 17:39 ` [PATCH 0/2] iomap: small block problems Matthew Wilcox
  2021-06-28 17:47   ` Christoph Hellwig
@ 2021-06-28 17:55   ` Andreas Gruenbacher
  1 sibling, 0 replies; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-06-28 17:55 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Darrick J . Wong, linux-xfs, linux-fsdevel,
	cluster-devel

On Mon, Jun 28, 2021 at 7:40 PM Matthew Wilcox <willy@infradead.org> wrote:
> On Mon, Jun 28, 2021 at 07:27:25PM +0200, Andreas Gruenbacher wrote:
> > (1) In iomap_readpage_actor, an iomap_page is attached to the page even
> > for inline inodes.  This is unnecessary because inline inodes don't need
> > iomap_page objects.  That alone wouldn't cause any real issues, but when
> > iomap_read_inline_data copies the inline data into the page, it sets the
> > PageUptodate flag without setting iop->uptodate, causing an
> > inconsistency between the two.  This will trigger a WARN_ON in
> > iomap_page_release.  The fix should be not to allocate iomap_page
> > objects when reading from inline inodes (patch 1).
>
> I don't have a problem with this patch.
>
> > (2) When un-inlining an inode, we must allocate a page with an attached
> > iomap_page object (iomap_page_create) and initialize the iop->uptodate
> > bitmap (iomap_set_range_uptodate).  We can't currently do that because
> > iomap_page_create and iomap_set_range_uptodate are not exported.  That
> > could be fixed by exporting those functions, or by implementing an
> > additional helper as in patch 2.  Which of the two would you prefer?
>
> Not hugely happy with either of these options, tbh.  I'd rather we apply
> a patch akin to this one (plucked from the folio tree), so won't apply:
>
> @@ -1305,7 +1311,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>                 struct writeback_control *wbc, struct inode *inode,
>                 struct folio *folio, loff_t end_pos)
>  {
> -       struct iomap_page *iop = to_iomap_page(folio);
> +       struct iomap_page *iop = iomap_page_create(inode, folio);
>         struct iomap_ioend *ioend, *next;
>         unsigned len = i_blocksize(inode);
>         unsigned nblocks = i_blocks_per_folio(inode, folio);
> @@ -1313,7 +1319,6 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>         int error = 0, count = 0, i;
>         LIST_HEAD(submit_list);
>
> -       WARN_ON_ONCE(nblocks > 1 && !iop);
>         WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
>
>         /*
>
> so permit pages without an iop to enter writeback and create an iop
> *then*.  Would that solve your problem?

It probably would. Let me do some testing based on that.

> > (3) We're not yet using iomap_page_mkwrite, so iomap_page objects don't
> > get created on .page_mkwrite, either.  Part of the reason is that
> > iomap_page_mkwrite locks the page and then calls into the filesystem for
> > uninlining and for allocating backing blocks.  This conflicts with the
> > gfs2 locking order: on gfs2, transactions must be started before locking
> > any pages.  We can fix that by calling iomap_page_create from
> > gfs2_page_mkwrite, or by doing the uninlining and allocations before
> > calling iomap_page_mkwrite.  I've implemented option 2 for now; see
> > here:
>
> I think this might also solve this problem?

Probably yes.

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-28 17:47   ` Christoph Hellwig
@ 2021-06-28 21:28     ` Andreas Gruenbacher
  2021-06-28 21:59     ` Matthew Wilcox
  1 sibling, 0 replies; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-06-28 21:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Matthew Wilcox, Darrick J . Wong, linux-xfs, linux-fsdevel,
	cluster-devel

On Mon, Jun 28, 2021 at 8:07 PM Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Jun 28, 2021 at 06:39:09PM +0100, Matthew Wilcox wrote:
> > Not hugely happy with either of these options, tbh.  I'd rather we apply
> > a patch akin to this one (plucked from the folio tree), so won't apply:
>
> > so permit pages without an iop to enter writeback and create an iop
> > *then*.  Would that solve your problem?
>
> It is the right thing to do, especially when combined with a feature
> patch to not bother to create the iomap_page structure on small
> block size file systems when the extent covers the whole page.
>
> >
> > > (3) We're not yet using iomap_page_mkwrite, so iomap_page objects don't
> > > get created on .page_mkwrite, either.  Part of the reason is that
> > > iomap_page_mkwrite locks the page and then calls into the filesystem for
> > > uninlining and for allocating backing blocks.  This conflicts with the
> > > gfs2 locking order: on gfs2, transactions must be started before locking
> > > any pages.  We can fix that by calling iomap_page_create from
> > > gfs2_page_mkwrite, or by doing the uninlining and allocations before
> > > calling iomap_page_mkwrite.  I've implemented option 2 for now; see
> > > here:
> >
> > I think this might also solve this problem?
>
> We'll still need to create the iomap_page structure for page_mkwrite
> if there is an extent boundary inside the page.

Yes, but the iop wouldn't need to be allocated in page_mkwrite; that
would be taken care of by iomap_writepage / iomap_writepages as in the
patch suggested by Matthew, right?

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-28 17:47   ` Christoph Hellwig
  2021-06-28 21:28     ` Andreas Gruenbacher
@ 2021-06-28 21:59     ` Matthew Wilcox
  2021-06-29  5:29       ` Christoph Hellwig
  1 sibling, 1 reply; 17+ messages in thread
From: Matthew Wilcox @ 2021-06-28 21:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andreas Gruenbacher, Darrick J . Wong, linux-xfs, linux-fsdevel,
	cluster-devel

On Mon, Jun 28, 2021 at 06:47:58PM +0100, Christoph Hellwig wrote:
> On Mon, Jun 28, 2021 at 06:39:09PM +0100, Matthew Wilcox wrote:
> > Not hugely happy with either of these options, tbh.  I'd rather we apply
> > a patch akin to this one (plucked from the folio tree), so won't apply:
> 
> > so permit pages without an iop to enter writeback and create an iop
> > *then*.  Would that solve your problem?
> 
> It is the right thing to do, especially when combined with a feature
> patch to not bother to create the iomap_page structure on small
> block size file systems when the extent covers the whole page.

We don't know the extent layout at the point where *this* patch creates
iomap_pages during writeback.  I imagine we can delay creating one until
we find out what our destination layout will be?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-28 21:59     ` Matthew Wilcox
@ 2021-06-29  5:29       ` Christoph Hellwig
  2021-06-29  5:42         ` Christoph Hellwig
  2021-06-30 12:29         ` Andreas Gruenbacher
  0 siblings, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2021-06-29  5:29 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Andreas Gruenbacher, Darrick J . Wong,
	linux-xfs, linux-fsdevel, cluster-devel

On Mon, Jun 28, 2021 at 10:59:55PM +0100, Matthew Wilcox wrote:
> > > so permit pages without an iop to enter writeback and create an iop
> > > *then*.  Would that solve your problem?
> > 
> > It is the right thing to do, especially when combined with a feature
> > patch to not bother to create the iomap_page structure on small
> > block size file systems when the extent covers the whole page.
> 
> We don't know the extent layout at the point where *this* patch creates
> iomap_pages during writeback.  I imagine we can delay creating one until
> we find out what our destination layout will be?

Hmm.  Actually ->page_mkwrite is always is always called on an uptodate
page and we even assert that.  I should have remembered the whole page
fault path better.

So yeah, I think we should take patch 1 from Andreas, then a non-folio
version of your patch as a start.  The next steps then would be in
approximate order:

 1. remove the iomap_page_create in iomap_page_mkwrite_actor as it
    clearly is not needed at that point
 2. don't bother to create an iomap_page in iomap_readpage_actor when
    the iomap spans the whole page
 3. don't create the iomap_page in __iomap_write_begin when the
    page is marked uptodate or the write covers the whole page 

delaying the creation further in iomap_writepage_map will be harder
as the loop around iomap_add_to_ioend is still fundamentally block
based.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-29  5:29       ` Christoph Hellwig
@ 2021-06-29  5:42         ` Christoph Hellwig
  2021-06-30 12:29         ` Andreas Gruenbacher
  1 sibling, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2021-06-29  5:42 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Andreas Gruenbacher, Darrick J . Wong,
	linux-xfs, linux-fsdevel, cluster-devel

On Tue, Jun 29, 2021 at 06:29:48AM +0100, Christoph Hellwig wrote:
> Hmm.  Actually ->page_mkwrite is always is always called on an uptodate
> page and we even assert that.  I should have remembered the whole page
> fault path better.
> 
> So yeah, I think we should take patch 1 from Andreas, then a non-folio
> version of your patch as a start.  The next steps then would be in
> approximate order:
> 
>  1. remove the iomap_page_create in iomap_page_mkwrite_actor as it
>     clearly is not needed at that point
>  2. don't bother to create an iomap_page in iomap_readpage_actor when
>     the iomap spans the whole page
>  3. don't create the iomap_page in __iomap_write_begin when the
>     page is marked uptodate or the write covers the whole page 

Further thoughts for a better series:

 1. create iomap_page if needed in iomap_writepage_map
 2. do not create the iomap_page at all in iomap_readpage_actor.
    ->readahead is always called on newly allocated pages, and
    ->readpage either on a clean !uptodate page or on one that
    has seen a write leading to a partial uptodate state.  That
    is for the case that cares about the iomap_page it is present
    already
 3. don't create the iomap_page in iomap_page_mkwrite_actor

I think this is the simple initial series that should solve Andreas'
problem.  Then we can look into optimizing __iomap_write_begin
and iomap_writepage_map further as needed.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-28 17:27 [PATCH 0/2] iomap: small block problems Andreas Gruenbacher
                   ` (2 preceding siblings ...)
  2021-06-28 17:39 ` [PATCH 0/2] iomap: small block problems Matthew Wilcox
@ 2021-06-29  9:12 ` Andreas Gruenbacher
  2021-06-30 14:08   ` Matthew Wilcox
  3 siblings, 1 reply; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-06-29  9:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andreas Gruenbacher, Christoph Hellwig, Darrick J . Wong,
	linux-xfs, linux-fsdevel, cluster-devel

Below is a version of your patch on top of v5.13 which has passed some
local testing here.

Thanks,
Andreas

--

iomap: Permit pages without an iop to enter writeback

Permit pages without an iop to enter writeback and create an iop *then*.  This
allows filesystems to mark pages dirty without having to worry about how the
iop block tracking is implemented.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/iomap/buffered-io.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 03537ecb2a94..6330dabc451e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1336,14 +1336,13 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
 		struct page *page, u64 end_offset)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct iomap_page *iop = iomap_page_create(inode, page);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
 	u64 file_offset; /* file offset of page */
 	int error = 0, count = 0, i;
 	LIST_HEAD(submit_list);
 
-	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
 	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
 
 	/*
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-29  5:29       ` Christoph Hellwig
  2021-06-29  5:42         ` Christoph Hellwig
@ 2021-06-30 12:29         ` Andreas Gruenbacher
  2021-07-05 15:51           ` Andreas Gruenbacher
  1 sibling, 1 reply; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-06-30 12:29 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Christoph Hellwig, Matthew Wilcox, linux-xfs, linux-fsdevel,
	cluster-devel

Darrick,

On Tue, Jun 29, 2021 at 7:47 AM Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Jun 28, 2021 at 10:59:55PM +0100, Matthew Wilcox wrote:
> > > > so permit pages without an iop to enter writeback and create an iop
> > > > *then*.  Would that solve your problem?
> > >
> > > It is the right thing to do, especially when combined with a feature
> > > patch to not bother to create the iomap_page structure on small
> > > block size file systems when the extent covers the whole page.
> >
> > We don't know the extent layout at the point where *this* patch creates
> > iomap_pages during writeback.  I imagine we can delay creating one until
> > we find out what our destination layout will be?
>
> Hmm.  Actually ->page_mkwrite is always is always called on an uptodate
> page and we even assert that.  I should have remembered the whole page
> fault path better.
>
> So yeah, I think we should take patch 1 from Andreas, then a non-folio
> version of your patch as a start.

will you pick up those two patches and push them to Linus? They both
seem pretty safe.

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] iomap: Don't create iomap_page objects for inline files
  2021-06-28 17:27 ` [PATCH 1/2] iomap: Don't create iomap_page objects for inline files Andreas Gruenbacher
@ 2021-06-30 13:35   ` Matthew Wilcox
  0 siblings, 0 replies; 17+ messages in thread
From: Matthew Wilcox @ 2021-06-30 13:35 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Christoph Hellwig, Darrick J . Wong, linux-xfs, linux-fsdevel,
	cluster-devel

On Mon, Jun 28, 2021 at 07:27:26PM +0200, Andreas Gruenbacher wrote:
> In iomap_readpage_actor, don't create iop objects for inline inodes.
> Otherwise, iomap_read_inline_data will set PageUptodate without setting
> iop->uptodate, and iomap_page_release will eventually complain.
> 
> To prevent this kind of bug from occurring in the future, make sure the
> page doesn't have private data attached in iomap_read_inline_data.
> 
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-29  9:12 ` Andreas Gruenbacher
@ 2021-06-30 14:08   ` Matthew Wilcox
  2021-06-30 14:45     ` Andreas Gruenbacher
  0 siblings, 1 reply; 17+ messages in thread
From: Matthew Wilcox @ 2021-06-30 14:08 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Christoph Hellwig, Darrick J . Wong, linux-xfs, linux-fsdevel,
	cluster-devel

On Tue, Jun 29, 2021 at 11:12:39AM +0200, Andreas Gruenbacher wrote:
> Below is a version of your patch on top of v5.13 which has passed some
> local testing here.
> 
> Thanks,
> Andreas
> 
> --
> 
> iomap: Permit pages without an iop to enter writeback
> 
> Permit pages without an iop to enter writeback and create an iop *then*.  This
> allows filesystems to mark pages dirty without having to worry about how the
> iop block tracking is implemented.

How about ...

Create an iop in the writeback path if one doesn't exist.  This allows
us to avoid creating the iop in some cases.  The only current case we
do that for is pages with inline data, but it can be extended to pages
which are entirely within an extent.  It also allows for an iop to be
removed from pages in the future (eg page split).

> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

Co-developed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-30 14:08   ` Matthew Wilcox
@ 2021-06-30 14:45     ` Andreas Gruenbacher
  0 siblings, 0 replies; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-06-30 14:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Darrick J . Wong, linux-xfs, linux-fsdevel,
	cluster-devel

On Wed, Jun 30, 2021 at 4:09 PM Matthew Wilcox <willy@infradead.org> wrote:
> On Tue, Jun 29, 2021 at 11:12:39AM +0200, Andreas Gruenbacher wrote:
> > Below is a version of your patch on top of v5.13 which has passed some
> > local testing here.
> >
> > Thanks,
> > Andreas
> >
> > --
> >
> > iomap: Permit pages without an iop to enter writeback
> >
> > Permit pages without an iop to enter writeback and create an iop *then*.  This
> > allows filesystems to mark pages dirty without having to worry about how the
> > iop block tracking is implemented.
>
> How about ...
>
> Create an iop in the writeback path if one doesn't exist.  This allows
> us to avoid creating the iop in some cases.  The only current case we
> do that for is pages with inline data, but it can be extended to pages
> which are entirely within an extent.  It also allows for an iop to be
> removed from pages in the future (eg page split).
>
> > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
>
> Co-developed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Sure, that works for me.

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-06-30 12:29         ` Andreas Gruenbacher
@ 2021-07-05 15:51           ` Andreas Gruenbacher
  2021-07-05 16:55             ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Andreas Gruenbacher @ 2021-07-05 15:51 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Christoph Hellwig, Matthew Wilcox, linux-xfs, linux-fsdevel,
	cluster-devel

On Wed, Jun 30, 2021 at 2:29 PM Andreas Gruenbacher <agruenba@redhat.com> wrote:
> Darrick,
>
> will you pick up those two patches and push them to Linus? They both
> seem pretty safe.

Hello, is there anybody out there?

I've put the two patches here with the sign-offs they've received:

https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/log/?h=for-next.iomap

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] iomap: small block problems
  2021-07-05 15:51           ` Andreas Gruenbacher
@ 2021-07-05 16:55             ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2021-07-05 16:55 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Darrick J . Wong, Christoph Hellwig, Matthew Wilcox, linux-xfs,
	linux-fsdevel, cluster-devel

On Mon, Jul 05, 2021 at 05:51:22PM +0200, Andreas Gruenbacher wrote:
> On Wed, Jun 30, 2021 at 2:29 PM Andreas Gruenbacher <agruenba@redhat.com> wrote:
> > Darrick,
> >
> > will you pick up those two patches and push them to Linus? They both
> > seem pretty safe.
> 
> Hello, is there anybody out there?
> 
> I've put the two patches here with the sign-offs they've received:

Please send out an updated series likey everyone else.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-07-05 16:56 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-28 17:27 [PATCH 0/2] iomap: small block problems Andreas Gruenbacher
2021-06-28 17:27 ` [PATCH 1/2] iomap: Don't create iomap_page objects for inline files Andreas Gruenbacher
2021-06-30 13:35   ` Matthew Wilcox
2021-06-28 17:27 ` [PATCH 2/2] iomap: Add helper for un-inlining an inline inode Andreas Gruenbacher
2021-06-28 17:39 ` [PATCH 0/2] iomap: small block problems Matthew Wilcox
2021-06-28 17:47   ` Christoph Hellwig
2021-06-28 21:28     ` Andreas Gruenbacher
2021-06-28 21:59     ` Matthew Wilcox
2021-06-29  5:29       ` Christoph Hellwig
2021-06-29  5:42         ` Christoph Hellwig
2021-06-30 12:29         ` Andreas Gruenbacher
2021-07-05 15:51           ` Andreas Gruenbacher
2021-07-05 16:55             ` Christoph Hellwig
2021-06-28 17:55   ` Andreas Gruenbacher
2021-06-29  9:12 ` Andreas Gruenbacher
2021-06-30 14:08   ` Matthew Wilcox
2021-06-30 14:45     ` Andreas Gruenbacher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).