* [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork
@ 2020-06-23  3:50 Darrick J. Wong
  2020-06-23  4:26 ` Dave Chinner
  2020-06-23 12:10 ` Brian Foster
  0 siblings, 2 replies; 7+ messages in thread
From: Darrick J. Wong @ 2020-06-23  3:50 UTC (permalink / raw)
  To: xfs; +Cc: Dave Chinner

From: Darrick J. Wong <darrick.wong@oracle.com>

The data fork scrubber calls filemap_write_and_wait to flush dirty pages
and delalloc reservations out to disk prior to checking the data fork's
extent mappings.  Unfortunately, this means that scrub can consume the
EIO/ENOSPC errors that would otherwise have stayed around in the address
space until (we hope) the writer application calls fsync to persist data
and collect errors.  The end result is that programs that wrote to a
file might never see the error code and proceed as if nothing were
wrong.

xfs_scrub is not in a position to notify file writers about the
writeback failure, and it's only here to check metadata, not file
contents.  Therefore, if writeback fails, we should stuff the error code
back into the address space so that an fsync by the writer application
can pick that up.

Fixes: 99d9d8d05da2 ("xfs: scrub inode block mappings")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: explain why it's ok to keep going even if writeback fails
---
 fs/xfs/scrub/bmap.c |   19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
index 7badd6dfe544..0d7062b7068b 100644
--- a/fs/xfs/scrub/bmap.c
+++ b/fs/xfs/scrub/bmap.c
@@ -47,7 +47,24 @@ xchk_setup_inode_bmap(
 	    sc->sm->sm_type == XFS_SCRUB_TYPE_BMBTD) {
 		inode_dio_wait(VFS_I(sc->ip));
 		error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping);
-		if (error)
+		if (error == -ENOSPC || error == -EIO) {
+			/*
+			 * If writeback hits EIO or ENOSPC, reflect it back
+			 * into the address space mapping so that a writer
+			 * program calling fsync to look for errors will still
+			 * capture the error.
+			 *
+			 * However, we continue into the extent mapping checks
+			 * because write failures do not necessarily imply
+			 * anything about the correctness of the file metadata.
+			 * The metadata and the file data could be on
+			 * completely separate devices; a media failure might
+			 * only affect a subset of the disk, etc.  We properly
+			 * account for delalloc extents, so leaving them in
+			 * memory is fine.
+			 */
+			mapping_set_error(VFS_I(sc->ip)->i_mapping, error);
+		} else if (error)
 			goto out;
 	}
 

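The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_mapping` and every `toy_*` function are hypothetical stand-ins invented for illustration, not kernel code, and the model reduces the address space's error state to two sticky flags. It shows a check that reports and clears a pending error (the "consuming" behaviour attributed to filemap_write_and_wait above), and the patch's approach of putting an EIO/ENOSPC result back so that a later check by the writer still finds it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the writeback error state of an address space. */
struct toy_mapping {
	bool eio;	/* an I/O error from writeback is pending */
	bool enospc;	/* an out-of-space error from writeback is pending */
};

/* Record a writeback error, loosely modelled on mapping_set_error(). */
static void toy_mapping_set_error(struct toy_mapping *m, int error)
{
	assert(m != NULL);
	if (error == -ENOSPC)
		m->enospc = true;
	else if (error)
		m->eio = true;
}

/* Report a pending error and clear it, so the next caller sees nothing. */
static int toy_check_errors(struct toy_mapping *m)
{
	int ret = 0;

	assert(m != NULL);
	if (m->enospc) {
		m->enospc = false;
		ret = -ENOSPC;
	}
	if (m->eio) {
		m->eio = false;
		ret = -EIO;
	}
	return ret;
}

/*
 * Model of the patched scrub setup: the flush consumes the error
 * (toy_check_errors() stands in for filemap_write_and_wait() here),
 * then EIO/ENOSPC is reflected back into the mapping.
 */
static int toy_scrub_flush(struct toy_mapping *m)
{
	int error = toy_check_errors(m);

	if (error == -ENOSPC || error == -EIO)
		toy_mapping_set_error(m, error);
	return error;
}
```

In this model, calling `toy_check_errors()` twice on a mapping with a pending error returns the error once and then 0, which is the "writer never sees it" case. Calling `toy_scrub_flush()` first and `toy_check_errors()` afterwards returns the error both times, which is the behaviour the patch aims for.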

* Re: [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork
  2020-06-23  3:50 [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork Darrick J. Wong
@ 2020-06-23  4:26 ` Dave Chinner
  2020-06-23 12:10 ` Brian Foster
  1 sibling, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2020-06-23  4:26 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: xfs

On Mon, Jun 22, 2020 at 08:50:10PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The data fork scrubber calls filemap_write_and_wait to flush dirty pages
> and delalloc reservations out to disk prior to checking the data fork's
> extent mappings.  Unfortunately, this means that scrub can consume the
> EIO/ENOSPC errors that would otherwise have stayed around in the address
> space until (we hope) the writer application calls fsync to persist data
> and collect errors.  The end result is that programs that wrote to a
> file might never see the error code and proceed as if nothing were
> wrong.
> 
> xfs_scrub is not in a position to notify file writers about the
> writeback failure, and it's only here to check metadata, not file
> contents.  Therefore, if writeback fails, we should stuff the error code
> back into the address space so that an fsync by the writer application
> can pick that up.
> 
> Fixes: 99d9d8d05da2 ("xfs: scrub inode block mappings")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> v2: explain why it's ok to keep going even if writeback fails

Looks good.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork
  2020-06-23  3:50 [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork Darrick J. Wong
  2020-06-23  4:26 ` Dave Chinner
@ 2020-06-23 12:10 ` Brian Foster
  2020-06-23 15:23   ` Darrick J. Wong
  1 sibling, 1 reply; 7+ messages in thread
From: Brian Foster @ 2020-06-23 12:10 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: xfs, Dave Chinner

On Mon, Jun 22, 2020 at 08:50:10PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The data fork scrubber calls filemap_write_and_wait to flush dirty pages
> and delalloc reservations out to disk prior to checking the data fork's
> extent mappings.  Unfortunately, this means that scrub can consume the
> EIO/ENOSPC errors that would otherwise have stayed around in the address
> space until (we hope) the writer application calls fsync to persist data
> and collect errors.  The end result is that programs that wrote to a
> file might never see the error code and proceed as if nothing were
> wrong.
> 
> xfs_scrub is not in a position to notify file writers about the
> writeback failure, and it's only here to check metadata, not file
> contents.  Therefore, if writeback fails, we should stuff the error code
> back into the address space so that an fsync by the writer application
> can pick that up.
> 
> Fixes: 99d9d8d05da2 ("xfs: scrub inode block mappings")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> v2: explain why it's ok to keep going even if writeback fails
> ---
>  fs/xfs/scrub/bmap.c |   19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> index 7badd6dfe544..0d7062b7068b 100644
> --- a/fs/xfs/scrub/bmap.c
> +++ b/fs/xfs/scrub/bmap.c
> @@ -47,7 +47,24 @@ xchk_setup_inode_bmap(
>  	    sc->sm->sm_type == XFS_SCRUB_TYPE_BMBTD) {
>  		inode_dio_wait(VFS_I(sc->ip));
>  		error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping);
> -		if (error)
> +		if (error == -ENOSPC || error == -EIO) {
> +			/*
> +			 * If writeback hits EIO or ENOSPC, reflect it back
> +			 * into the address space mapping so that a writer
> +			 * program calling fsync to look for errors will still
> +			 * capture the error.
> +			 *
> +			 * However, we continue into the extent mapping checks
> +			 * because write failures do not necessarily imply
> +			 * anything about the correctness of the file metadata.
> +			 * The metadata and the file data could be on
> +			 * completely separate devices; a media failure might
> +			 * only affect a subset of the disk, etc.  We properly
> +			 * account for delalloc extents, so leaving them in
> +			 * memory is fine.
> +			 */
> +			mapping_set_error(VFS_I(sc->ip)->i_mapping, error);

I think the more appropriate thing to do is open code the data write and
wait and use the variants of the latter that don't consume address space
errors in the first place (i.e. filemap_fdatawait_keep_errors()). Then
we wouldn't need the special error handling branch or perhaps the first
part of the comment. Hm?

Brian

> +		} else if (error)
>  			goto out;
>  	}
>  
> 

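The alternative suggested in this reply (write, then wait with a variant that leaves errors in place) can be sketched with a similar userspace model. Everything here is an assumption made for illustration: `struct toy_mapping` and the `toy_*` functions are hypothetical, the error state is reduced to two flags, and `toy_writeback()` merely simulates a writeback pass with a caller-supplied result. The point is only that a non-consuming check needs no special branch to restore the error afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the writeback error state of an address space. */
struct toy_mapping {
	bool eio;	/* an I/O error from writeback is pending */
	bool enospc;	/* an out-of-space error from writeback is pending */
};

/* Simulated writeback pass: a nonzero result is recorded in the mapping. */
static void toy_writeback(struct toy_mapping *m, int result)
{
	assert(m != NULL);
	if (result == -ENOSPC)
		m->enospc = true;
	else if (result)
		m->eio = true;
}

/* Non-consuming check, in the spirit of filemap_fdatawait_keep_errors(). */
static int toy_wait_keep_errors(const struct toy_mapping *m)
{
	assert(m != NULL);
	if (m->eio)
		return -EIO;
	if (m->enospc)
		return -ENOSPC;
	return 0;
}

/* Consuming check, standing in for the writer's later fsync. */
static int toy_fsync(struct toy_mapping *m)
{
	int ret = toy_wait_keep_errors(m);

	m->eio = false;
	m->enospc = false;
	return ret;
}

/* Model of an open-coded scrub flush: write, then wait without clearing. */
static int toy_scrub_flush_keep(struct toy_mapping *m, int writeback_result)
{
	toy_writeback(m, writeback_result);
	return toy_wait_keep_errors(m);
}
```

Here `toy_scrub_flush_keep(&m, -EIO)` reports -EIO to scrub and a subsequent `toy_fsync(&m)` still reports -EIO to the writer, without any code that pushes the error back into the mapping.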


* Re: [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork
  2020-06-23 12:10 ` Brian Foster
@ 2020-06-23 15:23   ` Darrick J. Wong
  2020-06-23 16:49     ` Brian Foster
  0 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2020-06-23 15:23 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs, Dave Chinner

On Tue, Jun 23, 2020 at 08:10:31AM -0400, Brian Foster wrote:
> On Mon, Jun 22, 2020 at 08:50:10PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > The data fork scrubber calls filemap_write_and_wait to flush dirty pages
> > and delalloc reservations out to disk prior to checking the data fork's
> > extent mappings.  Unfortunately, this means that scrub can consume the
> > EIO/ENOSPC errors that would otherwise have stayed around in the address
> > space until (we hope) the writer application calls fsync to persist data
> > and collect errors.  The end result is that programs that wrote to a
> > file might never see the error code and proceed as if nothing were
> > wrong.
> > 
> > xfs_scrub is not in a position to notify file writers about the
> > writeback failure, and it's only here to check metadata, not file
> > contents.  Therefore, if writeback fails, we should stuff the error code
> > back into the address space so that an fsync by the writer application
> > can pick that up.
> > 
> > Fixes: 99d9d8d05da2 ("xfs: scrub inode block mappings")
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > v2: explain why it's ok to keep going even if writeback fails
> > ---
> >  fs/xfs/scrub/bmap.c |   19 ++++++++++++++++++-
> >  1 file changed, 18 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> > index 7badd6dfe544..0d7062b7068b 100644
> > --- a/fs/xfs/scrub/bmap.c
> > +++ b/fs/xfs/scrub/bmap.c
> > @@ -47,7 +47,24 @@ xchk_setup_inode_bmap(
> >  	    sc->sm->sm_type == XFS_SCRUB_TYPE_BMBTD) {
> >  		inode_dio_wait(VFS_I(sc->ip));
> >  		error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping);
> > -		if (error)
> > +		if (error == -ENOSPC || error == -EIO) {
> > +			/*
> > +			 * If writeback hits EIO or ENOSPC, reflect it back
> > +			 * into the address space mapping so that a writer
> > +			 * program calling fsync to look for errors will still
> > +			 * capture the error.
> > +			 *
> > +			 * However, we continue into the extent mapping checks
> > +			 * because write failures do not necessarily imply
> > +			 * anything about the correctness of the file metadata.
> > +			 * The metadata and the file data could be on
> > +			 * completely separate devices; a media failure might
> > +			 * only affect a subset of the disk, etc.  We properly
> > +			 * account for delalloc extents, so leaving them in
> > +			 * memory is fine.
> > +			 */
> > +			mapping_set_error(VFS_I(sc->ip)->i_mapping, error);
> 
> I think the more appropriate thing to do is open code the data write and
> wait and use the variants of the latter that don't consume address space
> errors in the first place (i.e. filemap_fdatawait_keep_errors()). Then
> we wouldn't need the special error handling branch or perhaps the first
> part of the comment. Hm?

Yes, it's certainly possible.  I don't want to go opencoding more vfs
methods (like some e4 filesystems do) so I'll propose that as a second
patch for 5.9.

On second thought, I wonder if I should just drop the flush entirely?
It's not a huge burden to skip past the delalloc reservations.

Hmmm.  Any preferences?

--D

> Brian
> 
> > +		} else if (error)
> >  			goto out;
> >  	}
> >  
> > 
> 


* Re: [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork
  2020-06-23 15:23   ` Darrick J. Wong
@ 2020-06-23 16:49     ` Brian Foster
  2020-06-23 17:00       ` Darrick J. Wong
  0 siblings, 1 reply; 7+ messages in thread
From: Brian Foster @ 2020-06-23 16:49 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: xfs, Dave Chinner

On Tue, Jun 23, 2020 at 08:23:50AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 23, 2020 at 08:10:31AM -0400, Brian Foster wrote:
> > On Mon, Jun 22, 2020 at 08:50:10PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > The data fork scrubber calls filemap_write_and_wait to flush dirty pages
> > > and delalloc reservations out to disk prior to checking the data fork's
> > > extent mappings.  Unfortunately, this means that scrub can consume the
> > > EIO/ENOSPC errors that would otherwise have stayed around in the address
> > > space until (we hope) the writer application calls fsync to persist data
> > > and collect errors.  The end result is that programs that wrote to a
> > > file might never see the error code and proceed as if nothing were
> > > wrong.
> > > 
> > > xfs_scrub is not in a position to notify file writers about the
> > > writeback failure, and it's only here to check metadata, not file
> > > contents.  Therefore, if writeback fails, we should stuff the error code
> > > back into the address space so that an fsync by the writer application
> > > can pick that up.
> > > 
> > > Fixes: 99d9d8d05da2 ("xfs: scrub inode block mappings")
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > > v2: explain why it's ok to keep going even if writeback fails
> > > ---
> > >  fs/xfs/scrub/bmap.c |   19 ++++++++++++++++++-
> > >  1 file changed, 18 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> > > index 7badd6dfe544..0d7062b7068b 100644
> > > --- a/fs/xfs/scrub/bmap.c
> > > +++ b/fs/xfs/scrub/bmap.c
> > > @@ -47,7 +47,24 @@ xchk_setup_inode_bmap(
> > >  	    sc->sm->sm_type == XFS_SCRUB_TYPE_BMBTD) {
> > >  		inode_dio_wait(VFS_I(sc->ip));
> > >  		error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping);
> > > -		if (error)
> > > +		if (error == -ENOSPC || error == -EIO) {
> > > +			/*
> > > +			 * If writeback hits EIO or ENOSPC, reflect it back
> > > +			 * into the address space mapping so that a writer
> > > +			 * program calling fsync to look for errors will still
> > > +			 * capture the error.
> > > +			 *
> > > +			 * However, we continue into the extent mapping checks
> > > +			 * because write failures do not necessarily imply
> > > +			 * anything about the correctness of the file metadata.
> > > +			 * The metadata and the file data could be on
> > > +			 * completely separate devices; a media failure might
> > > +			 * only affect a subset of the disk, etc.  We properly
> > > +			 * account for delalloc extents, so leaving them in
> > > +			 * memory is fine.
> > > +			 */
> > > +			mapping_set_error(VFS_I(sc->ip)->i_mapping, error);
> > 
> > I think the more appropriate thing to do is open code the data write and
> > wait and use the variants of the latter that don't consume address space
> > errors in the first place (i.e. filemap_fdatawait_keep_errors()). Then
> > we wouldn't need the special error handling branch or perhaps the first
> > part of the comment. Hm?
> 
> Yes, it's certainly possible.  I don't want to go opencoding more vfs
> methods (like some e4 filesystems do) so I'll propose that as a second
> patch for 5.9.
> 

What's the point of fixing it twice when the generic code already
exports the appropriate helpers? filemap_fdatawrite() and
filemap_fdatawait_keep_errors() are used fairly commonly afaict. That
seems much more straightforward to me than misusing a convenience helper
and trying to undo the undesirable effects after the fact.

> On second thought, I wonder if I should just drop the flush entirely?
> It's not a huge burden to skip past the delalloc reservations.
> 
> Hmmm.  Any preferences?
> 

The context for the above is not clear to me. If the purpose is to check
on-disk metadata, shouldn't we flush the in-core content first? It would seem
a little strange to me for one file check to behave differently from
another if the only difference between the two is that some or more of a
file had been written back, but maybe I'm missing details..

Brian

> --D
> 
> > Brian
> > 
> > > +		} else if (error)
> > >  			goto out;
> > >  	}
> > >  
> > > 
> > 
> 



* Re: [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork
  2020-06-23 16:49     ` Brian Foster
@ 2020-06-23 17:00       ` Darrick J. Wong
  2020-06-23 17:15         ` Brian Foster
  0 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2020-06-23 17:00 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs, Dave Chinner

On Tue, Jun 23, 2020 at 12:49:34PM -0400, Brian Foster wrote:
> On Tue, Jun 23, 2020 at 08:23:50AM -0700, Darrick J. Wong wrote:
> > On Tue, Jun 23, 2020 at 08:10:31AM -0400, Brian Foster wrote:
> > > On Mon, Jun 22, 2020 at 08:50:10PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > The data fork scrubber calls filemap_write_and_wait to flush dirty pages
> > > > and delalloc reservations out to disk prior to checking the data fork's
> > > > extent mappings.  Unfortunately, this means that scrub can consume the
> > > > EIO/ENOSPC errors that would otherwise have stayed around in the address
> > > > space until (we hope) the writer application calls fsync to persist data
> > > > and collect errors.  The end result is that programs that wrote to a
> > > > file might never see the error code and proceed as if nothing were
> > > > wrong.
> > > > 
> > > > xfs_scrub is not in a position to notify file writers about the
> > > > writeback failure, and it's only here to check metadata, not file
> > > > contents.  Therefore, if writeback fails, we should stuff the error code
> > > > back into the address space so that an fsync by the writer application
> > > > can pick that up.
> > > > 
> > > > Fixes: 99d9d8d05da2 ("xfs: scrub inode block mappings")
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > > v2: explain why it's ok to keep going even if writeback fails
> > > > ---
> > > >  fs/xfs/scrub/bmap.c |   19 ++++++++++++++++++-
> > > >  1 file changed, 18 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> > > > index 7badd6dfe544..0d7062b7068b 100644
> > > > --- a/fs/xfs/scrub/bmap.c
> > > > +++ b/fs/xfs/scrub/bmap.c
> > > > @@ -47,7 +47,24 @@ xchk_setup_inode_bmap(
> > > >  	    sc->sm->sm_type == XFS_SCRUB_TYPE_BMBTD) {
> > > >  		inode_dio_wait(VFS_I(sc->ip));
> > > >  		error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping);
> > > > -		if (error)
> > > > +		if (error == -ENOSPC || error == -EIO) {
> > > > +			/*
> > > > +			 * If writeback hits EIO or ENOSPC, reflect it back
> > > > +			 * into the address space mapping so that a writer
> > > > +			 * program calling fsync to look for errors will still
> > > > +			 * capture the error.
> > > > +			 *
> > > > +			 * However, we continue into the extent mapping checks
> > > > +			 * because write failures do not necessarily imply
> > > > +			 * anything about the correctness of the file metadata.
> > > > +			 * The metadata and the file data could be on
> > > > +			 * completely separate devices; a media failure might
> > > > +			 * only affect a subset of the disk, etc.  We properly
> > > > +			 * account for delalloc extents, so leaving them in
> > > > +			 * memory is fine.
> > > > +			 */
> > > > +			mapping_set_error(VFS_I(sc->ip)->i_mapping, error);
> > > 
> > > I think the more appropriate thing to do is open code the data write and
> > > wait and use the variants of the latter that don't consume address space
> > > errors in the first place (i.e. filemap_fdatawait_keep_errors()). Then
> > > we wouldn't need the special error handling branch or perhaps the first
> > > part of the comment. Hm?
> > 
> > Yes, it's certainly possible.  I don't want to go opencoding more vfs
> > methods (like some e4 filesystems do) so I'll propose that as a second
> > patch for 5.9.
> > 
> 
> What's the point of fixing it twice when the generic code already
> exports the appropriate helpers? filemap_fdatawrite() and
> filemap_fdatawait_keep_errors() are used fairly commonly afaict. That
> seems much more straightforward to me than misusing a convenience helper
> and trying to undo the undesirable effects after the fact.

Blergh.  Apparently my eyes suck at telling fdatawait from fdatawrite
and I got all twisted around.  Now I realize that I think you were
asking why I didn't simply call:

filemap_flush()
filemap_fdatawait_keep_errors()

one after the other?  And yes, that's way better than throwing error
codes back into the mapping.  I'll do that, thanks.

> > On second thought, I wonder if I should just drop the flush entirely?
> > It's not a huge burden to skip past the delalloc reservations.
> > 
> > Hmmm.  Any preferences?
> > 
> 
> The context for the above is not clear to me. If the purpose is to check
> on-disk metadata, shouldn't we flush the in-core content first? It would seem
> a little strange to me for one file check to behave differently from
> another if the only difference between the two is that some or more of a
> file had been written back, but maybe I'm missing details..

Originally it was because the bmap scrubber didn't handle delalloc
extents, but that was changed long ago.  Nowadays it only exists as a
precautionary "try to push everything to disk" tactic.

--D

> Brian
> 
> > --D
> > 
> > > Brian
> > > 
> > > > +		} else if (error)
> > > >  			goto out;
> > > >  	}
> > > >  
> > > > 
> > > 
> > 
> 


* Re: [PATCH v2] xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork
  2020-06-23 17:00       ` Darrick J. Wong
@ 2020-06-23 17:15         ` Brian Foster
  0 siblings, 0 replies; 7+ messages in thread
From: Brian Foster @ 2020-06-23 17:15 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: xfs, Dave Chinner

On Tue, Jun 23, 2020 at 10:00:54AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 23, 2020 at 12:49:34PM -0400, Brian Foster wrote:
> > On Tue, Jun 23, 2020 at 08:23:50AM -0700, Darrick J. Wong wrote:
> > > On Tue, Jun 23, 2020 at 08:10:31AM -0400, Brian Foster wrote:
> > > > On Mon, Jun 22, 2020 at 08:50:10PM -0700, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > The data fork scrubber calls filemap_write_and_wait to flush dirty pages
> > > > > and delalloc reservations out to disk prior to checking the data fork's
> > > > > extent mappings.  Unfortunately, this means that scrub can consume the
> > > > > EIO/ENOSPC errors that would otherwise have stayed around in the address
> > > > > space until (we hope) the writer application calls fsync to persist data
> > > > > and collect errors.  The end result is that programs that wrote to a
> > > > > file might never see the error code and proceed as if nothing were
> > > > > wrong.
> > > > > 
> > > > > xfs_scrub is not in a position to notify file writers about the
> > > > > writeback failure, and it's only here to check metadata, not file
> > > > > contents.  Therefore, if writeback fails, we should stuff the error code
> > > > > back into the address space so that an fsync by the writer application
> > > > > can pick that up.
> > > > > 
> > > > > Fixes: 99d9d8d05da2 ("xfs: scrub inode block mappings")
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > ---
> > > > > v2: explain why it's ok to keep going even if writeback fails
> > > > > ---
> > > > >  fs/xfs/scrub/bmap.c |   19 ++++++++++++++++++-
> > > > >  1 file changed, 18 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> > > > > index 7badd6dfe544..0d7062b7068b 100644
> > > > > --- a/fs/xfs/scrub/bmap.c
> > > > > +++ b/fs/xfs/scrub/bmap.c
> > > > > @@ -47,7 +47,24 @@ xchk_setup_inode_bmap(
> > > > >  	    sc->sm->sm_type == XFS_SCRUB_TYPE_BMBTD) {
> > > > >  		inode_dio_wait(VFS_I(sc->ip));
> > > > >  		error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping);
> > > > > -		if (error)
> > > > > +		if (error == -ENOSPC || error == -EIO) {
> > > > > +			/*
> > > > > +			 * If writeback hits EIO or ENOSPC, reflect it back
> > > > > +			 * into the address space mapping so that a writer
> > > > > +			 * program calling fsync to look for errors will still
> > > > > +			 * capture the error.
> > > > > +			 *
> > > > > +			 * However, we continue into the extent mapping checks
> > > > > +			 * because write failures do not necessarily imply
> > > > > +			 * anything about the correctness of the file metadata.
> > > > > +			 * The metadata and the file data could be on
> > > > > +			 * completely separate devices; a media failure might
> > > > > +			 * only affect a subset of the disk, etc.  We properly
> > > > > +			 * account for delalloc extents, so leaving them in
> > > > > +			 * memory is fine.
> > > > > +			 */
> > > > > +			mapping_set_error(VFS_I(sc->ip)->i_mapping, error);
> > > > 
> > > > I think the more appropriate thing to do is open code the data write and
> > > > wait and use the variants of the latter that don't consume address space
> > > > errors in the first place (i.e. filemap_fdatawait_keep_errors()). Then
> > > > we wouldn't need the special error handling branch or perhaps the first
> > > > part of the comment. Hm?
> > > 
> > > Yes, it's certainly possible.  I don't want to go opencoding more vfs
> > > methods (like some e4 filesystems do) so I'll propose that as a second
> > > patch for 5.9.
> > > 
> > 
> > What's the point of fixing it twice when the generic code already
> > exports the appropriate helpers? filemap_fdatawrite() and
> > filemap_fdatawait_keep_errors() are used fairly commonly afaict. That
> > seems much more straightforward to me than misusing a convenience helper
> > and trying to undo the undesirable effects after the fact.
> 
> Blergh.  Apparently my eyes suck at telling fdatawait from fdatawrite
> and I got all twisted around.  Now I realize that I think you were
> asking why I didn't simply call:
> 
> filemap_flush()
> filemap_fdatawait_keep_errors()
> 
> one after the other?  And yes, that's way better than throwing error
> codes back into the mapping.  I'll do that, thanks.
> 

Yeah basically, though I was looking more at filemap_fdatawrite() simply
because it's analogous to the write component of
filemap_write_and_wait(). It looks like the only difference with
filemap_flush() is it uses WB_SYNC_NONE instead of WB_SYNC_ALL. Perhaps
either one is fine from here..

> > > On second thought, I wonder if I should just drop the flush entirely?
> > > It's not a huge burden to skip past the delalloc reservations.
> > > 
> > > Hmmm.  Any preferences?
> > > 
> > 
> > The context for the above is not clear to me. If the purpose is to check
> > on-disk metadata, shouldn't we flush the in-core content first? It would seem
> > a little strange to me for one file check to behave differently from
> > another if the only difference between the two is that some or more of a
> > file had been written back, but maybe I'm missing details..
> 
> Originally it was because the bmap scrubber didn't handle delalloc
> extents, but that was changed long ago.  Nowadays it only exists as a
> precautionary "try to push everything to disk" tactic.
> 

Ok. When you mention "skip past the delalloc reservations" above that
implies to me we'd skip some processing/validation bits. If that's not
the case then perhaps it doesn't matter as much...

Brian

> --D
> 
> > Brian
> > 
> > > --D
> > > 
> > > > Brian
> > > > 
> > > > > +		} else if (error)
> > > > >  			goto out;
> > > > >  	}
> > > > >  
> > > > > 
> > > > 
> > > 
> > 
> 


