linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] afs: Fix dangling folio ref counts in writeback
@ 2023-06-07 20:41 Vishal Moola (Oracle)
  2023-06-07 20:41 ` [PATCH] afs: Fix waiting for writeback then skipping folio Vishal Moola (Oracle)
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Vishal Moola (Oracle) @ 2023-06-07 20:41 UTC (permalink / raw)
  To: linux-afs; +Cc: linux-kernel, dhowells, akpm, Vishal Moola (Oracle)

Commit acc8d8588cb7 converted afs_writepages_region() to write back a
folio batch. If writeback needs rescheduling, the function exits without
dropping the references to the folios in fbatch. This patch fixes that.

This has only been compile tested.

Fixes: acc8d8588cb7 ("afs: convert afs_writepages_region() to use filemap_get_folios_tag()")
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 fs/afs/write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index c822d6006033..a724228e4d94 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -764,6 +764,7 @@ static int afs_writepages_region(struct address_space *mapping,
 					if (skips >= 5 || need_resched()) {
 						*_next = start;
 						_leave(" = 0 [%llx]", *_next);
+						folio_batch_release(&fbatch);
 						return 0;
 					}
 					skips++;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH] afs: Fix waiting for writeback then skipping folio
  2023-06-07 20:41 [PATCH] afs: Fix dangling folio ref counts in writeback Vishal Moola (Oracle)
@ 2023-06-07 20:41 ` Vishal Moola (Oracle)
  2023-06-09  0:50   ` Andrew Morton
  2023-06-16 22:43   ` David Howells
  2023-06-16 22:30 ` [PATCH] afs: Fix dangling folio ref counts in writeback David Howells
  2023-06-16 22:43 ` [PATCH] afs: Fix waiting for writeback then skipping folio David Howells
  2 siblings, 2 replies; 8+ messages in thread
From: Vishal Moola (Oracle) @ 2023-06-07 20:41 UTC (permalink / raw)
  To: linux-afs; +Cc: linux-kernel, dhowells, akpm, Vishal Moola (Oracle)

Commit acc8d8588cb7 converted afs_writepages_region() to write back a
folio batch. The function waits for writeback to a folio, but then
proceeds to the rest of the batch without trying to write that folio
again. This patch fixes has it attempt to write the folio again.

This has only been compile tested.

Fixes: acc8d8588cb7 ("afs: convert afs_writepages_region() to use filemap_get_folios_tag()")
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 fs/afs/write.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index a724228e4d94..18ccb613dff8 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -731,6 +731,7 @@ static int afs_writepages_region(struct address_space *mapping,
 			 * (changing page->mapping to NULL), or even swizzled
 			 * back from swapper_space to tmpfs file mapping
 			 */
+try_again:
 			if (wbc->sync_mode != WB_SYNC_NONE) {
 				ret = folio_lock_killable(folio);
 				if (ret < 0) {
@@ -757,6 +758,7 @@ static int afs_writepages_region(struct address_space *mapping,
 #ifdef CONFIG_AFS_FSCACHE
 					folio_wait_fscache(folio);
 #endif
+					goto try_again;
 				} else {
 					start += folio_size(folio);
 				}
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] afs: Fix waiting for writeback then skipping folio
  2023-06-07 20:41 ` [PATCH] afs: Fix waiting for writeback then skipping folio Vishal Moola (Oracle)
@ 2023-06-09  0:50   ` Andrew Morton
  2023-06-16 22:43   ` David Howells
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2023-06-09  0:50 UTC (permalink / raw)
  To: Vishal Moola (Oracle); +Cc: linux-afs, linux-kernel, dhowells

On Wed,  7 Jun 2023 13:41:20 -0700 "Vishal Moola (Oracle)" <vishal.moola@gmail.com> wrote:

> Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> folio batch. The function waits for writeback to a folio, but then
> proceeds to the rest of the batch without trying to write that folio
> again. This patch fixes has it attempt to write the folio again.
> 
> This has only been compile tested.

This seems fairly serious?

> --- a/fs/afs/write.c
> +++ b/fs/afs/write.c
> @@ -731,6 +731,7 @@ static int afs_writepages_region(struct address_space *mapping,
>  			 * (changing page->mapping to NULL), or even swizzled
>  			 * back from swapper_space to tmpfs file mapping
>  			 */
> +try_again:
>  			if (wbc->sync_mode != WB_SYNC_NONE) {
>  				ret = folio_lock_killable(folio);
>  				if (ret < 0) {
> @@ -757,6 +758,7 @@ static int afs_writepages_region(struct address_space *mapping,
>  #ifdef CONFIG_AFS_FSCACHE
>  					folio_wait_fscache(folio);
>  #endif
> +					goto try_again;
>  				} else {
>  					start += folio_size(folio);
>  				}

From my reading, we'll fail to write out the dirty data.  Presumably
not easily observable, as it will get written out again later on.  But
we're also calling afs_write_back_from_locked_folio() with an unlocked
folio, which might cause mayhem.

So I'm suspecting that a cc:stable is needed.  David, could you please
take a look and perhaps retest?

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] afs: Fix dangling folio ref counts in writeback
  2023-06-07 20:41 [PATCH] afs: Fix dangling folio ref counts in writeback Vishal Moola (Oracle)
  2023-06-07 20:41 ` [PATCH] afs: Fix waiting for writeback then skipping folio Vishal Moola (Oracle)
@ 2023-06-16 22:30 ` David Howells
  2023-06-16 22:43 ` [PATCH] afs: Fix waiting for writeback then skipping folio David Howells
  2 siblings, 0 replies; 8+ messages in thread
From: David Howells @ 2023-06-16 22:30 UTC (permalink / raw)
  To: Vishal Moola (Oracle); +Cc: dhowells, linux-afs, linux-kernel, akpm

Vishal Moola (Oracle) <vishal.moola@gmail.com> wrote:

>  					if (skips >= 5 || need_resched()) {
>  						*_next = start;
>  						_leave(" = 0 [%llx]", *_next);
> +						folio_batch_release(&fbatch);

This should go before the _leave().

>  						return 0;
>  					}

Looks okay otherwise.

David


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] afs: Fix waiting for writeback then skipping folio
  2023-06-07 20:41 ` [PATCH] afs: Fix waiting for writeback then skipping folio Vishal Moola (Oracle)
  2023-06-09  0:50   ` Andrew Morton
@ 2023-06-16 22:43   ` David Howells
  2023-06-16 23:22     ` Andrew Morton
  2023-06-16 23:26     ` David Howells
  1 sibling, 2 replies; 8+ messages in thread
From: David Howells @ 2023-06-16 22:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: dhowells, Vishal Moola (Oracle), linux-afs, linux-kernel

Andrew Morton <akpm@linux-foundation.org> wrote:

> > Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> > folio batch. The function waits for writeback to a folio, but then
> > proceeds to the rest of the batch without trying to write that folio
> > again. This patch fixes has it attempt to write the folio again.
> > 
> > This has only been compile tested.
> 
> This seems fairly serious?

We will try to write the again later, but sync()/fsync() might now have
skipped it.

> From my reading, we'll fail to write out the dirty data.  Presumably
> not easily observable, as it will get written out again later on.

As it's a network filesystem, interactions with third parties could cause
apparent corruption.  Closing a file will flush it - but if there's a
simultaneous op of some other kind, a bit of a flush or a sync may get missed
and the copy visible to another user be temporarily missing that bit.

> But we're also calling afs_write_back_from_locked_folio() with an unlocked
> folio, which might cause mayhem.

Without this patch, you mean?  There's a "continue" statement that should send
us back to the top of the loop before we get as far as
afs_write_back_from_locked_folio() - and then the folio_unlock() there would
go bang.

David


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] afs: Fix waiting for writeback then skipping folio
  2023-06-07 20:41 [PATCH] afs: Fix dangling folio ref counts in writeback Vishal Moola (Oracle)
  2023-06-07 20:41 ` [PATCH] afs: Fix waiting for writeback then skipping folio Vishal Moola (Oracle)
  2023-06-16 22:30 ` [PATCH] afs: Fix dangling folio ref counts in writeback David Howells
@ 2023-06-16 22:43 ` David Howells
  2 siblings, 0 replies; 8+ messages in thread
From: David Howells @ 2023-06-16 22:43 UTC (permalink / raw)
  To: Vishal Moola (Oracle); +Cc: dhowells, linux-afs, linux-kernel, akpm

Vishal Moola (Oracle) <vishal.moola@gmail.com> wrote:

> +					goto try_again;
>  				} else {
>  					start += folio_size(folio);

The "else" is then redundant.

David


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] afs: Fix waiting for writeback then skipping folio
  2023-06-16 22:43   ` David Howells
@ 2023-06-16 23:22     ` Andrew Morton
  2023-06-16 23:26     ` David Howells
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2023-06-16 23:22 UTC (permalink / raw)
  To: David Howells; +Cc: Vishal Moola (Oracle), linux-afs, linux-kernel

On Fri, 16 Jun 2023 23:43:02 +0100 David Howells <dhowells@redhat.com> wrote:

> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > > Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> > > folio batch. The function waits for writeback to a folio, but then
> > > proceeds to the rest of the batch without trying to write that folio
> > > again. This patch fixes has it attempt to write the folio again.
> > > 
> > > This has only been compile tested.
> > 
> > This seems fairly serious?
> 
> We will try to write the again later, but sync()/fsync() might now have
> skipped it.
> 
> > From my reading, we'll fail to write out the dirty data.  Presumably
> > not easily observable, as it will get written out again later on.
> 
> As it's a network filesystem, interactions with third parties could cause
> apparent corruption.  Closing a file will flush it - but if there's a
> simultaneous op of some other kind, a bit of a flush or a sync may get missed
> and the copy visible to another user be temporarily missing that bit.
> 
> > But we're also calling afs_write_back_from_locked_folio() with an unlocked
> > folio, which might cause mayhem.
> 
> Without this patch, you mean?  There's a "continue" statement that should send
> us back to the top of the loop before we get as far as
> afs_write_back_from_locked_folio() - and then the folio_unlock() there would
> go bang.
> 

Well, what I'm really asking is the thing I ask seven times a day:

- what are the end-user visible effects of the bug

- should be fix be backported into earlier kernels

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] afs: Fix waiting for writeback then skipping folio
  2023-06-16 22:43   ` David Howells
  2023-06-16 23:22     ` Andrew Morton
@ 2023-06-16 23:26     ` David Howells
  1 sibling, 0 replies; 8+ messages in thread
From: David Howells @ 2023-06-16 23:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: dhowells, Vishal Moola (Oracle), linux-afs, linux-kernel

Andrew Morton <akpm@linux-foundation.org> wrote:

> Well, what I'm really asking is the thing I ask seven times a day:
> 
> - what are the end-user visible effects of the bug

A third party might see an incomplete flush after they've done a sync - which
amounts to temporary file corruption.

> - should be fix be backported into earlier kernels

Yes.

David


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-06-16 23:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-07 20:41 [PATCH] afs: Fix dangling folio ref counts in writeback Vishal Moola (Oracle)
2023-06-07 20:41 ` [PATCH] afs: Fix waiting for writeback then skipping folio Vishal Moola (Oracle)
2023-06-09  0:50   ` Andrew Morton
2023-06-16 22:43   ` David Howells
2023-06-16 23:22     ` Andrew Morton
2023-06-16 23:26     ` David Howells
2023-06-16 22:30 ` [PATCH] afs: Fix dangling folio ref counts in writeback David Howells
2023-06-16 22:43 ` [PATCH] afs: Fix waiting for writeback then skipping folio David Howells

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).