linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] xfs_scrub: fix IO error reporting
@ 2019-10-22 18:47 Darrick J. Wong
  2019-10-22 18:47 ` [PATCH 1/9] libfrog/xfs_scrub: improve iteration function documentation Darrick J. Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

Hi all,

The scrub media error reporting could use some improvements -- first,
scrub can calculate the exact offset of media errors in file mappings,
so we should report more precise offsets.  Second, we only need to scan
the rmap once after assembling the io error bitmap to look for destroyed
metadata (instead of once per error!).  Third, we can filter out
unwritten and attr/cow fork extents from what we report since sector
remapping takes care of unwritten/cow extents and attr media errors
should be detected by phase 3.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-media-error-reporting

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=scrub-media-error-reporting

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/9] libfrog/xfs_scrub: improve iteration function documentation
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
@ 2019-10-22 18:47 ` Darrick J. Wong
  2019-11-01 19:57   ` Eric Sandeen
  2019-10-22 18:47 ` [PATCH 2/9] xfs_scrub: separate media error reporting for attribute forks Darrick J. Wong
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Between libfrog and xfs_scrub, we have several item collection iteration
functions that take a pointer to a function that will be called for
every item in that collection.  They're not well documented, so improve
the description of when they'll be called and what kinds of return
values they expect.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libfrog/ptvar.h  |    8 ++++++--
 scrub/filemap.h  |    4 ++++
 scrub/inodes.h   |    6 ++++++
 scrub/spacemap.h |    4 ++++
 scrub/vfs.h      |   10 ++++++++++
 5 files changed, 30 insertions(+), 2 deletions(-)


diff --git a/libfrog/ptvar.h b/libfrog/ptvar.h
index 42865c0b..b7d02d62 100644
--- a/libfrog/ptvar.h
+++ b/libfrog/ptvar.h
@@ -8,11 +8,15 @@
 
 struct ptvar;
 
-typedef int (*ptvar_iter_fn)(struct ptvar *ptv, void *data, void *foreach_arg);
-
 int ptvar_alloc(size_t nr, size_t size, struct ptvar **pptv);
 void ptvar_free(struct ptvar *ptv);
 void *ptvar_get(struct ptvar *ptv, int *ret);
+
+/*
+ * Visit each per-thread value.  Return 0 to continue iteration or nonzero to
+ * stop iterating and return to the caller.
+ */
+typedef int (*ptvar_iter_fn)(struct ptvar *ptv, void *data, void *foreach_arg);
 int ptvar_foreach(struct ptvar *ptv, ptvar_iter_fn fn, void *foreach_arg);
 
 #endif /* __LIBFROG_PTVAR_H__ */
diff --git a/scrub/filemap.h b/scrub/filemap.h
index cb331729..219be575 100644
--- a/scrub/filemap.h
+++ b/scrub/filemap.h
@@ -14,6 +14,10 @@ struct xfs_bmap {
 	uint32_t	bm_flags;	/* output flags */
 };
 
+/*
+ * Visit each inode fork mapping.  Return true to continue iteration or false
+ * to stop iterating and return to the caller.
+ */
 typedef bool (*xfs_bmap_iter_fn)(struct scrub_ctx *ctx, const char *descr,
 		int fd, int whichfork, struct fsxattr *fsx,
 		struct xfs_bmap *bmap, void *arg);
diff --git a/scrub/inodes.h b/scrub/inodes.h
index 3341c6d9..e58795e7 100644
--- a/scrub/inodes.h
+++ b/scrub/inodes.h
@@ -6,6 +6,12 @@
 #ifndef XFS_SCRUB_INODES_H_
 #define XFS_SCRUB_INODES_H_
 
+/*
+ * Visit each space mapping of an inode fork.  Return 0 to continue iteration
+ * or a positive error code to interrupt iteraton.  If ESTALE is returned,
+ * iteration will be restarted from the beginning of the inode allocation
+ * group.  Any other non zero value will stop iteration.
+ */
 typedef int (*xfs_inode_iter_fn)(struct scrub_ctx *ctx,
 		struct xfs_handle *handle, struct xfs_bulkstat *bs, void *arg);
 
diff --git a/scrub/spacemap.h b/scrub/spacemap.h
index 8f2f0a01..c29c43a5 100644
--- a/scrub/spacemap.h
+++ b/scrub/spacemap.h
@@ -6,6 +6,10 @@
 #ifndef XFS_SCRUB_SPACEMAP_H_
 #define XFS_SCRUB_SPACEMAP_H_
 
+/*
+ * Visit each space mapping in the filesystem.  Return true to continue
+ * iteration or false to stop iterating and return to the caller.
+ */
 typedef bool (*xfs_fsmap_iter_fn)(struct scrub_ctx *ctx, const char *descr,
 		struct fsmap *fsr, void *arg);
 
diff --git a/scrub/vfs.h b/scrub/vfs.h
index caef8969..af23674a 100644
--- a/scrub/vfs.h
+++ b/scrub/vfs.h
@@ -6,8 +6,18 @@
 #ifndef XFS_SCRUB_VFS_H_
 #define XFS_SCRUB_VFS_H_
 
+/*
+ * Visit a subdirectory prior to iterating entries in that subdirectory.
+ * Return true to continue iteration or false to stop iterating and return to
+ * the caller.
+ */
 typedef bool (*scan_fs_tree_dir_fn)(struct scrub_ctx *, const char *,
 		int, void *);
+
+/*
+ * Visit each directory entry in a directory.  Return true to continue
+ * iteration or false to stop iterating and return to the caller.
+ */
 typedef bool (*scan_fs_tree_dirent_fn)(struct scrub_ctx *, const char *,
 		int, struct dirent *, struct stat *, void *);
 


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/9] xfs_scrub: separate media error reporting for attribute forks
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
  2019-10-22 18:47 ` [PATCH 1/9] libfrog/xfs_scrub: improve iteration function documentation Darrick J. Wong
@ 2019-10-22 18:47 ` Darrick J. Wong
  2019-11-01 20:00   ` Eric Sandeen
  2019-10-22 18:47 ` [PATCH 3/9] xfs_scrub: improve reporting of file data media errors Darrick J. Wong
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Use different functions to warn about media errors that were detected in
underlying xattr data because logical offsets for attribute fork extents
have no meaning to users.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/phase6.c |   45 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 38 insertions(+), 7 deletions(-)


diff --git a/scrub/phase6.c b/scrub/phase6.c
index 2ce2a19e..f2a78a60 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -113,7 +113,7 @@ xfs_decode_special_owner(
 
 /* Report if this extent overlaps a bad region. */
 static bool
-xfs_report_verify_inode_bmap(
+report_data_loss(
 	struct scrub_ctx		*ctx,
 	const char			*descr,
 	int				fd,
@@ -142,6 +142,40 @@ _("offset %llu failed read verification."), bmap->bm_offset);
 	return true;
 }
 
+/* Report if the extended attribute data overlaps a bad region. */
+static bool
+report_attr_loss(
+	struct scrub_ctx		*ctx,
+	const char			*descr,
+	int				fd,
+	int				whichfork,
+	struct fsxattr			*fsx,
+	struct xfs_bmap			*bmap,
+	void				*arg)
+{
+	struct media_verify_state	*vs = arg;
+	struct bitmap			*bmp = vs->d_bad;
+
+	/* Complain about attr fork extents that don't look right. */
+	if (bmap->bm_flags & (BMV_OF_PREALLOC | BMV_OF_DELALLOC)) {
+		str_info(ctx, descr,
+_("found unexpected unwritten/delalloc attr fork extent."));
+		return true;
+	}
+
+	if (fsx->fsx_xflags & FS_XFLAG_REALTIME) {
+		str_info(ctx, descr,
+_("found unexpected realtime attr fork extent."));
+		return true;
+	}
+
+	if (bitmap_test(bmp, bmap->bm_physical, bmap->bm_length))
+		str_error(ctx, descr,
+_("media error in extended attribute data."));
+
+	return true;
+}
+
 /* Iterate the extent mappings of a file to report errors. */
 static bool
 xfs_report_verify_fd(
@@ -155,16 +189,13 @@ xfs_report_verify_fd(
 
 	/* data fork */
 	moveon = xfs_iterate_filemaps(ctx, descr, fd, XFS_DATA_FORK, &key,
-			xfs_report_verify_inode_bmap, arg);
+			report_data_loss, arg);
 	if (!moveon)
 		return false;
 
 	/* attr fork */
-	moveon = xfs_iterate_filemaps(ctx, descr, fd, XFS_ATTR_FORK, &key,
-			xfs_report_verify_inode_bmap, arg);
-	if (!moveon)
-		return false;
-	return true;
+	return xfs_iterate_filemaps(ctx, descr, fd, XFS_ATTR_FORK, &key,
+			report_attr_loss, arg);
 }
 
 /* Report read verify errors in unlinked (but still open) files. */


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/9] xfs_scrub: improve reporting of file data media errors
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
  2019-10-22 18:47 ` [PATCH 1/9] libfrog/xfs_scrub: improve iteration function documentation Darrick J. Wong
  2019-10-22 18:47 ` [PATCH 2/9] xfs_scrub: separate media error reporting for attribute forks Darrick J. Wong
@ 2019-10-22 18:47 ` Darrick J. Wong
  2019-11-01 20:04   ` Eric Sandeen
  2019-10-22 18:47 ` [PATCH 4/9] xfs_scrub: better reporting of metadata " Darrick J. Wong
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

When we report media errors, we should tell the administrator the file
offset and length of the bad region, not just the offset of the entire
file extent record that overlaps a bad region.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libfrog/bitmap.c |   37 +++++++++++++++++++++++++++++++++++++
 libfrog/bitmap.h |    2 ++
 scrub/phase6.c   |   52 +++++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 86 insertions(+), 5 deletions(-)


diff --git a/libfrog/bitmap.c b/libfrog/bitmap.c
index a75d085a..6a88ef48 100644
--- a/libfrog/bitmap.c
+++ b/libfrog/bitmap.c
@@ -339,6 +339,43 @@ bitmap_iterate(
 }
 #endif
 
+/* Iterate the set regions of part of this bitmap. */
+int
+bitmap_iterate_range(
+	struct bitmap		*bmap,
+	uint64_t		start,
+	uint64_t		length,
+	int			(*fn)(uint64_t, uint64_t, void *),
+	void			*arg)
+{
+	struct avl64node	*firstn;
+	struct avl64node	*lastn;
+	struct avl64node	*pos;
+	struct avl64node	*n;
+	struct avl64node	*l;
+	struct bitmap_node	*ext;
+	int			ret = 0;
+
+	pthread_mutex_lock(&bmap->bt_lock);
+
+	avl64_findranges(bmap->bt_tree, start, start + length, &firstn,
+			&lastn);
+
+	if (firstn == NULL && lastn == NULL)
+		goto out;
+
+	avl_for_each_range_safe(pos, n, l, firstn, lastn) {
+		ext = container_of(pos, struct bitmap_node, btn_node);
+		ret = fn(ext->btn_start, ext->btn_length, arg);
+		if (ret)
+			break;
+	}
+
+out:
+	pthread_mutex_unlock(&bmap->bt_lock);
+	return ret;
+}
+
 /* Do any bitmap extents overlap the given one?  (locked) */
 static bool
 __bitmap_test(
diff --git a/libfrog/bitmap.h b/libfrog/bitmap.h
index 759386a8..043b77ee 100644
--- a/libfrog/bitmap.h
+++ b/libfrog/bitmap.h
@@ -16,6 +16,8 @@ void bitmap_free(struct bitmap **bmap);
 int bitmap_set(struct bitmap *bmap, uint64_t start, uint64_t length);
 int bitmap_iterate(struct bitmap *bmap, int (*fn)(uint64_t, uint64_t, void *),
 		void *arg);
+int bitmap_iterate_range(struct bitmap *bmap, uint64_t start, uint64_t length,
+		int (*fn)(uint64_t, uint64_t, void *), void *arg);
 bool bitmap_test(struct bitmap *bmap, uint64_t start,
 		uint64_t len);
 bool bitmap_empty(struct bitmap *bmap);
diff --git a/scrub/phase6.c b/scrub/phase6.c
index f2a78a60..5aeef1cb 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -111,6 +111,41 @@ xfs_decode_special_owner(
 
 /* Routines to translate bad physical extents into file paths and offsets. */
 
+struct badfile_report {
+	struct scrub_ctx	*ctx;
+	const char		*descr;
+	struct xfs_bmap		*bmap;
+};
+
+/* Report on bad extents found during a media scan. */
+static int
+report_badfile(
+	uint64_t		start,
+	uint64_t		length,
+	void			*arg)
+{
+	struct badfile_report	*br = arg;
+	unsigned long long	bad_offset;
+	unsigned long long	bad_length;
+
+	/* Clamp the bad region to the file mapping. */
+	if (start < br->bmap->bm_physical) {
+		length -= br->bmap->bm_physical - start;
+		start = br->bmap->bm_physical;
+	}
+	length = min(length, br->bmap->bm_length);
+
+	/* Figure out how far into the bmap is the bad mapping and report it. */
+	bad_offset = start - br->bmap->bm_physical;
+	bad_length = min(start + length,
+			 br->bmap->bm_physical + br->bmap->bm_length) - start;
+
+	str_error(br->ctx, br->descr,
+_("media error at data offset %llu length %llu."),
+			br->bmap->bm_offset + bad_offset, bad_length);
+	return 0;
+}
+
 /* Report if this extent overlaps a bad region. */
 static bool
 report_data_loss(
@@ -122,8 +157,14 @@ report_data_loss(
 	struct xfs_bmap			*bmap,
 	void				*arg)
 {
+	struct badfile_report		br = {
+		.ctx			= ctx,
+		.descr			= descr,
+		.bmap			= bmap,
+	};
 	struct media_verify_state	*vs = arg;
 	struct bitmap			*bmp;
+	int				ret;
 
 	/* Only report errors for real extents. */
 	if (bmap->bm_flags & (BMV_OF_PREALLOC | BMV_OF_DELALLOC))
@@ -134,11 +175,12 @@ report_data_loss(
 	else
 		bmp = vs->d_bad;
 
-	if (!bitmap_test(bmp, bmap->bm_physical, bmap->bm_length))
-		return true;
-
-	str_error(ctx, descr,
-_("offset %llu failed read verification."), bmap->bm_offset);
+	ret = bitmap_iterate_range(bmp, bmap->bm_physical, bmap->bm_length,
+			report_badfile, &br);
+	if (ret) {
+		str_liberror(ctx, ret, descr);
+		return false;
+	}
 	return true;
 }
 


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/9] xfs_scrub: better reporting of metadata media errors
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
                   ` (2 preceding siblings ...)
  2019-10-22 18:47 ` [PATCH 3/9] xfs_scrub: improve reporting of file data media errors Darrick J. Wong
@ 2019-10-22 18:47 ` Darrick J. Wong
  2019-11-01 20:08   ` Eric Sandeen
  2019-10-22 18:47 ` [PATCH 5/9] xfs_scrub: improve reporting of file " Darrick J. Wong
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

When we report bad metadata, we inexplicably report the physical address
in units of sectors, whereas for file data we report file offsets in
units of bytes.  Fix the metadata reporting units to match the file data
units (i.e. bytes) and skip the printf for all other cases.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 man/man8/xfs_scrub.8 |    4 ++++
 scrub/phase6.c       |   12 +++++-------
 2 files changed, 9 insertions(+), 7 deletions(-)


diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
index 18948a4e..e881ae76 100644
--- a/man/man8/xfs_scrub.8
+++ b/man/man8/xfs_scrub.8
@@ -101,6 +101,10 @@ Read all file data extents to look for disk errors.
 will issue O_DIRECT reads to the block device directly.
 If the block device is a SCSI disk, it will instead issue READ VERIFY commands
 directly to the disk.
+If media errors are found, the error report will include the disk offset, in
+bytes.
+If the media errors affect a file, the report will also include the inode
+number and file offset, in bytes.
 These actions will confirm that all file data blocks can be read from storage.
 .SH OPTIMIZATIONS
 Optimizations supported by this program include, but are not limited to:
diff --git a/scrub/phase6.c b/scrub/phase6.c
index 5aeef1cb..1c4a2107 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -368,7 +368,7 @@ xfs_check_rmap_error_report(
 	void			*arg)
 {
 	const char		*type;
-	char			buf[32];
+	char			buf[DESCR_BUFSZ];
 	uint64_t		err_physical = *(uint64_t *)arg;
 	uint64_t		err_off;
 
@@ -377,14 +377,12 @@ xfs_check_rmap_error_report(
 	else
 		err_off = 0;
 
-	snprintf(buf, 32, _("disk offset %"PRIu64),
-			(uint64_t)BTOBB(map->fmr_physical + err_off));
-
+	/* Report special owners */
 	if (map->fmr_flags & FMR_OF_SPECIAL_OWNER) {
+		snprintf(buf, DESCR_BUFSZ, _("disk offset %"PRIu64),
+				(uint64_t)map->fmr_physical + err_off);
 		type = xfs_decode_special_owner(map->fmr_owner);
-		str_error(ctx, buf,
-_("%s failed read verification."),
-				type);
+		str_error(ctx, buf, _("media error in %s."), type);
 	}
 
 	/*


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/9] xfs_scrub: improve reporting of file metadata media errors
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
                   ` (3 preceding siblings ...)
  2019-10-22 18:47 ` [PATCH 4/9] xfs_scrub: better reporting of metadata " Darrick J. Wong
@ 2019-10-22 18:47 ` Darrick J. Wong
  2019-10-22 18:47 ` [PATCH 6/9] xfs_scrub: don't report media errors on unwritten extents Darrick J. Wong
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, Eric Sandeen

From: Darrick J. Wong <darrick.wong@oracle.com>

Report media errors that map to data and attr fork extent maps.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
---
 scrub/phase6.c |   11 +++++++++++
 1 file changed, 11 insertions(+)


diff --git a/scrub/phase6.c b/scrub/phase6.c
index 1c4a2107..3125bfd5 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -385,6 +385,17 @@ xfs_check_rmap_error_report(
 		str_error(ctx, buf, _("media error in %s."), type);
 	}
 
+	/* Report extent maps */
+	if (map->fmr_flags & FMR_OF_EXTENT_MAP) {
+		bool		attr = (map->fmr_flags & FMR_OF_ATTR_FORK);
+
+		scrub_render_ino_descr(ctx, buf, DESCR_BUFSZ,
+				map->fmr_owner, 0, " %s",
+				attr ? _("extended attribute") :
+				       _("file data"));
+		str_error(ctx, buf, _("media error in extent map"));
+	}
+
 	/*
 	 * XXX: If we had a getparent() call we could report IO errors
 	 * efficiently.  Until then, we'll have to scan the dir tree


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 6/9] xfs_scrub: don't report media errors on unwritten extents
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
                   ` (4 preceding siblings ...)
  2019-10-22 18:47 ` [PATCH 5/9] xfs_scrub: improve reporting of file " Darrick J. Wong
@ 2019-10-22 18:47 ` Darrick J. Wong
  2019-10-22 18:47 ` [PATCH 7/9] xfs_scrub: reduce fsmap activity for media errors Darrick J. Wong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, Eric Sandeen

From: Darrick J. Wong <darrick.wong@oracle.com>

Don't report media errors for unwritten extents since no data has been
lost.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
---
 scrub/phase6.c |    4 ++++
 1 file changed, 4 insertions(+)


diff --git a/scrub/phase6.c b/scrub/phase6.c
index 3125bfd5..71a1922d 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -372,6 +372,10 @@ xfs_check_rmap_error_report(
 	uint64_t		err_physical = *(uint64_t *)arg;
 	uint64_t		err_off;
 
+	/* Don't care about unwritten extents. */
+	if (map->fmr_flags & FMR_OF_PREALLOC)
+		return true;
+
 	if (err_physical > map->fmr_physical)
 		err_off = err_physical - map->fmr_physical;
 	else


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 7/9] xfs_scrub: reduce fsmap activity for media errors
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
                   ` (5 preceding siblings ...)
  2019-10-22 18:47 ` [PATCH 6/9] xfs_scrub: don't report media errors on unwritten extents Darrick J. Wong
@ 2019-10-22 18:47 ` Darrick J. Wong
  2019-11-01 20:29   ` Eric Sandeen
  2019-10-22 18:47 ` [PATCH 8/9] xfs_scrub: request fewer bmaps when we can Darrick J. Wong
  2019-10-22 18:48 ` [PATCH 9/9] xfs_scrub: fix media verification thread pool size calculations Darrick J. Wong
  8 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Right now we rather foolishly query the fsmap data for every single
media error that we find.  This is a silly waste of time since we
have yet to combine adjacent bad blocks into bad extents, so move the
rmap query until after we've constructed the bad block bitmap data.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libfrog/bitmap.c |    2 -
 scrub/phase6.c   |  167 +++++++++++++++++++++++++++++++++++++-----------------
 2 files changed, 114 insertions(+), 55 deletions(-)


diff --git a/libfrog/bitmap.c b/libfrog/bitmap.c
index 6a88ef48..c928d26f 100644
--- a/libfrog/bitmap.c
+++ b/libfrog/bitmap.c
@@ -314,7 +314,6 @@ bitmap_clear(
 }
 #endif
 
-#ifdef DEBUG
 /* Iterate the set regions of this bitmap. */
 int
 bitmap_iterate(
@@ -337,7 +336,6 @@ bitmap_iterate(
 
 	return error;
 }
-#endif
 
 /* Iterate the set regions of part of this bitmap. */
 int
diff --git a/scrub/phase6.c b/scrub/phase6.c
index 71a1922d..fccd18e9 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -74,6 +74,27 @@ xfs_disk_to_dev(
 	abort();
 }
 
+/* Find the incore bad blocks bitmap for a given disk. */
+static struct bitmap *
+bitmap_for_disk(
+	struct scrub_ctx		*ctx,
+	struct disk			*disk,
+	struct media_verify_state	*vs)
+{
+	dev_t				dev = xfs_disk_to_dev(ctx, disk);
+
+	if (dev == ctx->fsinfo.fs_datadev)
+		return vs->d_bad;
+	else if (dev == ctx->fsinfo.fs_rtdev)
+		return vs->r_bad;
+	return NULL;
+}
+
+struct disk_ioerr_report {
+	struct scrub_ctx	*ctx;
+	struct disk		*disk;
+};
+
 struct owner_decode {
 	uint64_t		owner;
 	const char		*descr;
@@ -341,27 +362,9 @@ xfs_report_verify_dirent(
 	return moveon;
 }
 
-/* Given bad extent lists for the data & rtdev, find bad files. */
+/* Use a fsmap to report metadata lost to a media error. */
 static bool
-xfs_report_verify_errors(
-	struct scrub_ctx		*ctx,
-	struct media_verify_state	*vs)
-{
-	bool				moveon;
-
-	/* Scan the directory tree to get file paths. */
-	moveon = scan_fs_tree(ctx, xfs_report_verify_dir,
-			xfs_report_verify_dirent, vs);
-	if (!moveon)
-		return false;
-
-	/* Scan for unlinked files. */
-	return xfs_scan_all_inodes(ctx, xfs_report_verify_inode, vs);
-}
-
-/* Report an IO error resulting from read-verify based off getfsmap. */
-static bool
-xfs_check_rmap_error_report(
+report_ioerr_fsmap(
 	struct scrub_ctx	*ctx,
 	const char		*descr,
 	struct fsmap		*map,
@@ -410,44 +413,24 @@ xfs_check_rmap_error_report(
 }
 
 /*
- * Remember a read error for later, and see if rmap will tell us about the
- * owner ahead of time.
+ * For a range of bad blocks, visit each space mapping that overlaps the bad
+ * range so that we can report lost metadata.
  */
-static void
-xfs_check_rmap_ioerr(
-	struct scrub_ctx		*ctx,
-	struct disk			*disk,
+static int
+report_ioerr(
 	uint64_t			start,
 	uint64_t			length,
-	int				error,
 	void				*arg)
 {
 	struct fsmap			keys[2];
 	char				descr[DESCR_BUFSZ];
-	struct media_verify_state	*vs = arg;
-	struct bitmap			*tree;
+	struct disk_ioerr_report	*dioerr = arg;
 	dev_t				dev;
-	int				ret;
 
-	dev = xfs_disk_to_dev(ctx, disk);
+	dev = xfs_disk_to_dev(dioerr->ctx, dioerr->disk);
 
-	/*
-	 * If we don't have parent pointers, save the bad extent for
-	 * later rescanning.
-	 */
-	if (dev == ctx->fsinfo.fs_datadev)
-		tree = vs->d_bad;
-	else if (dev == ctx->fsinfo.fs_rtdev)
-		tree = vs->r_bad;
-	else
-		tree = NULL;
-	if (tree) {
-		ret = bitmap_set(tree, start, length);
-		if (ret)
-			str_liberror(ctx, ret, _("setting bad block bitmap"));
-	}
-
-	snprintf(descr, DESCR_BUFSZ, _("dev %d:%d ioerr @ %"PRIu64":%"PRIu64" "),
+	snprintf(descr, DESCR_BUFSZ,
+_("dev %d:%d ioerr @ %"PRIu64":%"PRIu64" "),
 			major(dev), minor(dev), start, length);
 
 	/* Go figure out which blocks are bad from the fsmap. */
@@ -459,8 +442,61 @@ xfs_check_rmap_ioerr(
 	(keys + 1)->fmr_owner = ULLONG_MAX;
 	(keys + 1)->fmr_offset = ULLONG_MAX;
 	(keys + 1)->fmr_flags = UINT_MAX;
-	xfs_iterate_fsmap(ctx, descr, keys, xfs_check_rmap_error_report,
+	xfs_iterate_fsmap(dioerr->ctx, descr, keys, report_ioerr_fsmap,
 			&start);
+	return 0;
+}
+
+/* Report all the media errors found on a disk. */
+static int
+report_disk_ioerrs(
+	struct scrub_ctx		*ctx,
+	struct disk			*disk,
+	struct media_verify_state	*vs)
+{
+	struct disk_ioerr_report	dioerr = {
+		.ctx			= ctx,
+		.disk			= disk,
+	};
+	struct bitmap			*tree;
+
+	if (!disk)
+		return 0;
+	tree = bitmap_for_disk(ctx, disk, vs);
+	if (!tree)
+		return 0;
+	return bitmap_iterate(tree, report_ioerr, &dioerr);
+}
+
+/* Given bad extent lists for the data & rtdev, find bad files. */
+static bool
+report_all_media_errors(
+	struct scrub_ctx		*ctx,
+	struct media_verify_state	*vs)
+{
+	bool				moveon;
+	int				ret;
+
+	ret = report_disk_ioerrs(ctx, ctx->datadev, vs);
+	if (ret) {
+		str_liberror(ctx, ret, _("walking datadev io errors"));
+		return false;
+	}
+
+	ret = report_disk_ioerrs(ctx, ctx->rtdev, vs);
+	if (ret) {
+		str_liberror(ctx, ret, _("walking rtdev io errors"));
+		return false;
+	}
+
+	/* Scan the directory tree to get file paths. */
+	moveon = scan_fs_tree(ctx, xfs_report_verify_dir,
+			xfs_report_verify_dirent, vs);
+	if (!moveon)
+		return false;
+
+	/* Scan for unlinked files. */
+	return xfs_scan_all_inodes(ctx, xfs_report_verify_inode, vs);
 }
 
 /* Schedule a read-verify of a (data block) extent. */
@@ -542,6 +578,31 @@ clean_pool(
 	return ret;
 }
 
+/* Remember a media error for later. */
+static void
+remember_ioerr(
+	struct scrub_ctx		*ctx,
+	struct disk			*disk,
+	uint64_t			start,
+	uint64_t			length,
+	int				error,
+	void				*arg)
+{
+	struct media_verify_state	*vs = arg;
+	struct bitmap			*tree;
+	int				ret;
+
+	tree = bitmap_for_disk(ctx, disk, vs);
+	if (!tree) {
+		str_liberror(ctx, ENOENT, _("finding bad block bitmap"));
+		return;
+	}
+
+	ret = bitmap_set(tree, start, length);
+	if (ret)
+		str_liberror(ctx, ret, _("setting bad block bitmap"));
+}
+
 /*
  * Read verify all the file data blocks in a filesystem.  Since XFS doesn't
  * do data checksums, we trust that the underlying storage will pass back
@@ -571,7 +632,7 @@ xfs_scan_blocks(
 	}
 
 	ret = read_verify_pool_alloc(ctx, ctx->datadev,
-			ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr,
+			ctx->mnt.fsgeom.blocksize, remember_ioerr,
 			scrub_nproc(ctx), &vs.rvp_data);
 	if (ret) {
 		str_liberror(ctx, ret, _("creating datadev media verifier"));
@@ -579,7 +640,7 @@ xfs_scan_blocks(
 	}
 	if (ctx->logdev) {
 		ret = read_verify_pool_alloc(ctx, ctx->logdev,
-				ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr,
+				ctx->mnt.fsgeom.blocksize, remember_ioerr,
 				scrub_nproc(ctx), &vs.rvp_log);
 		if (ret) {
 			str_liberror(ctx, ret,
@@ -589,7 +650,7 @@ xfs_scan_blocks(
 	}
 	if (ctx->rtdev) {
 		ret = read_verify_pool_alloc(ctx, ctx->rtdev,
-				ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr,
+				ctx->mnt.fsgeom.blocksize, remember_ioerr,
 				scrub_nproc(ctx), &vs.rvp_realtime);
 		if (ret) {
 			str_liberror(ctx, ret,
@@ -621,7 +682,7 @@ xfs_scan_blocks(
 
 	/* Scan the whole dir tree to see what matches the bad extents. */
 	if (moveon && (!bitmap_empty(vs.d_bad) || !bitmap_empty(vs.r_bad)))
-		moveon = xfs_report_verify_errors(ctx, &vs);
+		moveon = report_all_media_errors(ctx, &vs);
 
 	bitmap_free(&vs.r_bad);
 	bitmap_free(&vs.d_bad);


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 8/9] xfs_scrub: request fewer bmaps when we can
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
                   ` (6 preceding siblings ...)
  2019-10-22 18:47 ` [PATCH 7/9] xfs_scrub: reduce fsmap activity for media errors Darrick J. Wong
@ 2019-10-22 18:47 ` Darrick J. Wong
  2019-11-01 20:31   ` Eric Sandeen
  2019-10-22 18:48 ` [PATCH 9/9] xfs_scrub: fix media verification thread pool size calculations Darrick J. Wong
  8 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:47 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

In xfs_iterate_filemaps, we query the number of bmaps for a given file
that we're going to iterate, so feed that information to bmap so that
the kernel won't waste time allocating in-kernel memory unnecessarily.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/filemap.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)


diff --git a/scrub/filemap.c b/scrub/filemap.c
index bdc6d8f9..f92e9620 100644
--- a/scrub/filemap.c
+++ b/scrub/filemap.c
@@ -71,7 +71,6 @@ xfs_iterate_filemaps(
 		map->bmv_length = ULLONG_MAX;
 	else
 		map->bmv_length = BTOBB(key->bm_length);
-	map->bmv_count = BMAP_NR;
 	map->bmv_iflags = BMV_IF_NO_DMAPI_READ | BMV_IF_PREALLOC |
 			  BMV_IF_NO_HOLES;
 	switch (whichfork) {
@@ -97,6 +96,13 @@ xfs_iterate_filemaps(
 		goto out;
 	}
 
+	if (fsx.fsx_nextents == 0) {
+		moveon = true;
+		goto out;
+	}
+
+	map->bmv_count = min(fsx.fsx_nextents + 1, BMAP_NR);
+
 	while ((error = ioctl(fd, XFS_IOC_GETBMAPX, map)) == 0) {
 		for (i = 0, p = &map[i + 1]; i < map->bmv_entries; i++, p++) {
 			bmap.bm_offset = BBTOB(p->bmv_offset);


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 9/9] xfs_scrub: fix media verification thread pool size calculations
  2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
                   ` (7 preceding siblings ...)
  2019-10-22 18:47 ` [PATCH 8/9] xfs_scrub: request fewer bmaps when we can Darrick J. Wong
@ 2019-10-22 18:48 ` Darrick J. Wong
  8 siblings, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2019-10-22 18:48 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, Eric Sandeen

From: Darrick J. Wong <darrick.wong@oracle.com>

The read verifier pool deals with two different thread counts -- there's
the submitter thread count that enables us to perform per-thread verify
request aggregation, and then there's the io thread pool count which is
the maximum number of IO requests we want to send to the disk at any
given time.

The io thread pool count should be derived from disk_heads() but instead
we bungle it by measuring and modifying(!) the nproc global variable.
Fix the derivation to use global variables correctly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
---
 scrub/read_verify.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)


diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index cba1b2d4..a963a3fd 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -80,6 +80,7 @@ read_verify_pool_alloc(
 	struct read_verify_pool		**prvp)
 {
 	struct read_verify_pool		*rvp;
+	unsigned int			verifier_threads = disk_heads(disk);
 	int				ret;
 
 	/*
@@ -99,7 +100,7 @@ read_verify_pool_alloc(
 			RVP_IO_MAX_SIZE);
 	if (ret)
 		goto out_free;
-	ret = ptcounter_alloc(nproc, &rvp->verified_bytes);
+	ret = ptcounter_alloc(verifier_threads, &rvp->verified_bytes);
 	if (ret)
 		goto out_buf;
 	rvp->miniosz = miniosz;
@@ -110,11 +111,8 @@ read_verify_pool_alloc(
 			&rvp->rvstate);
 	if (ret)
 		goto out_counter;
-	/* Run in the main thread if we only want one thread. */
-	if (nproc == 1)
-		nproc = 0;
 	ret = workqueue_create(&rvp->wq, (struct xfs_mount *)rvp,
-			disk_heads(disk));
+			verifier_threads == 1 ? 0 : verifier_threads);
 	if (ret)
 		goto out_rvstate;
 	*prvp = rvp;


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/9] libfrog/xfs_scrub: improve iteration function documentation
  2019-10-22 18:47 ` [PATCH 1/9] libfrog/xfs_scrub: improve iteration function documentation Darrick J. Wong
@ 2019-11-01 19:57   ` Eric Sandeen
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2019-11-01 19:57 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 10/22/19 1:47 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Between libfrog and xfs_scrub, we have several item collection iteration
> functions that take a pointer to a function that will be called for
> every item in that collection.  They're not well documented, so improve
> the description of when they'll be called and what kinds of return
> values they expect.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Yay for helpful comments!

Reviewed-by: Eric Sandeen <sandeen@redhat.com>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/9] xfs_scrub: separate media error reporting for attribute forks
  2019-10-22 18:47 ` [PATCH 2/9] xfs_scrub: separate media error reporting for attribute forks Darrick J. Wong
@ 2019-11-01 20:00   ` Eric Sandeen
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2019-11-01 20:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 10/22/19 1:47 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Use different functions to warn about media errors that were detected in
> underlying xattr data because logical offsets for attribute fork extents
> have no meaning to users.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/9] xfs_scrub: improve reporting of file data media errors
  2019-10-22 18:47 ` [PATCH 3/9] xfs_scrub: improve reporting of file data media errors Darrick J. Wong
@ 2019-11-01 20:04   ` Eric Sandeen
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2019-11-01 20:04 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 10/22/19 1:47 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> When we report media errors, we should tell the administrator the file
> offset and length of the bad region, not just the offset of the entire
> file extent record that overlaps a bad region.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/9] xfs_scrub: better reporting of metadata media errors
  2019-10-22 18:47 ` [PATCH 4/9] xfs_scrub: better reporting of metadata " Darrick J. Wong
@ 2019-11-01 20:08   ` Eric Sandeen
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2019-11-01 20:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 10/22/19 1:47 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> When we report bad metadata, we inexplicably report the physical address
> in units of sectors, whereas for file data we report file offsets in
> units of bytes.  Fix the metadata reporting units to match the file data
> units (i.e. bytes) and skip the printf for all other cases.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/9] xfs_scrub: reduce fsmap activity for media errors
  2019-10-22 18:47 ` [PATCH 7/9] xfs_scrub: reduce fsmap activity for media errors Darrick J. Wong
@ 2019-11-01 20:29   ` Eric Sandeen
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2019-11-01 20:29 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 10/22/19 1:47 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Right now we rather foolishly query the fsmap data for every single
> media error that we find.  This is a silly waste of time since we
> have yet to combine adjacent bad blocks into bad extents, so move the
> rmap query until after we've constructed the bad block bitmap data.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 8/9] xfs_scrub: request fewer bmaps when we can
  2019-10-22 18:47 ` [PATCH 8/9] xfs_scrub: request fewer bmaps when we can Darrick J. Wong
@ 2019-11-01 20:31   ` Eric Sandeen
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2019-11-01 20:31 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 10/22/19 1:47 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> In xfs_iterate_filemaps, we query the number of bmaps for a given file
> that we're going to iterate, so feed that information to bmap so that
> the kernel won't waste time allocating in-kernel memory unnecessarily.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-11-01 20:31 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-22 18:47 [PATCH 0/9] xfs_scrub: fix IO error reporting Darrick J. Wong
2019-10-22 18:47 ` [PATCH 1/9] libfrog/xfs_scrub: improve iteration function documentation Darrick J. Wong
2019-11-01 19:57   ` Eric Sandeen
2019-10-22 18:47 ` [PATCH 2/9] xfs_scrub: separate media error reporting for attribute forks Darrick J. Wong
2019-11-01 20:00   ` Eric Sandeen
2019-10-22 18:47 ` [PATCH 3/9] xfs_scrub: improve reporting of file data media errors Darrick J. Wong
2019-11-01 20:04   ` Eric Sandeen
2019-10-22 18:47 ` [PATCH 4/9] xfs_scrub: better reporting of metadata " Darrick J. Wong
2019-11-01 20:08   ` Eric Sandeen
2019-10-22 18:47 ` [PATCH 5/9] xfs_scrub: improve reporting of file " Darrick J. Wong
2019-10-22 18:47 ` [PATCH 6/9] xfs_scrub: don't report media errors on unwritten extents Darrick J. Wong
2019-10-22 18:47 ` [PATCH 7/9] xfs_scrub: reduce fsmap activity for media errors Darrick J. Wong
2019-11-01 20:29   ` Eric Sandeen
2019-10-22 18:47 ` [PATCH 8/9] xfs_scrub: request fewer bmaps when we can Darrick J. Wong
2019-11-01 20:31   ` Eric Sandeen
2019-10-22 18:48 ` [PATCH 9/9] xfs_scrub: fix media verification thread pool size calculations Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).