All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/12] xfs-4.18: various fixes
@ 2018-05-03 18:05 Darrick J. Wong
  2018-05-03 18:05 ` [PATCH 01/12] xfs: bmap debugging should never panic the system Darrick J. Wong
                   ` (12 more replies)
  0 siblings, 13 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:05 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

Hi all,

Here's a revised series cleaning up some problems and bugs found by
stressing the online scrub code.  Most of the problems fixed here were
uncovered through more thorough testing of the online repair series,
which was posted previously.

The first patch replaces a panic() call in the bmap debugging verifiers
with logging code that doesn't take the whole system down.  This helps
us to avoid panic reboots while running generic/388.  The second patch
fixes an uncaught error return in the rmap code.

Then, we amend scrub to skip cross-referencing once the first error is
found; this patch hasn't changed since its posting during the 4.17 cycle.
Second is a patch adding more "finish scrub as soon as we've decided
CORRUPT" checks.

The next two patches refactor the quota scrub to drop all the weird code
that attaches to the quota ip in favor of using the well known mechanism
that we use for all other inodes.  This enables us to call the data fork
scrubber on the quota inodes; and the dquot iterator that we created
earlier.  We also adapt the rtbitmap/rtsum scrubbers to check the data
forks with the same helper.

The next two patches fix some broken interaction between scrub and the
rest of xfs.

Following that are three fixes to the scrubbers themselves -- the first
checks that a btree block never contains fewer than minrecs records
unless it's the top block in a tree.  The second refactors the
transaction allocation helper; and the last one fixes a potential ABBA
deadlock in the parent scrubber.

If you're going to start using this mess, you probably ought to just
pull from my git trees.  The kernel patches[1] should apply against
4.17-rc3.  xfsprogs[2] and xfstests[3] can be found in their usual
places.  The git trees contain all four series' worth of changes.

This is an extraordinary way to eat your data.  Enjoy!
Comments and questions are, as always, welcome.

--D

[1] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=djwong-devel
[2] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=djwong-devel
[3] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=djwong-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 01/12] xfs: bmap debugging should never panic the system
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
@ 2018-05-03 18:05 ` Darrick J. Wong
  2018-05-07 15:06   ` Christoph Hellwig
  2018-05-10 13:53   ` Brian Foster
  2018-05-03 18:06 ` [PATCH 02/12] xfs: add missing rmap error return Darrick J. Wong
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:05 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Don't panic() the system if the bmap records are garbage, just call
ASSERT which gives us the same backtrace but enables developers to
control if the system goes down or not.  This makes debugging with
generic/388 much easier because it won't reboot the machine midway
through a run just because btree_read_bufl returns EIO when the fs has
already shut down.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_bmap.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 6a7c2f03ea11..393b4e2c63ad 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -312,8 +312,10 @@ xfs_check_block(
 				xfs_warn(mp, "%s: thispa(%d) == pp(%d) %Ld",
 					__func__, j, i,
 					(unsigned long long)be64_to_cpu(*thispa));
-				panic("%s: ptrs are equal in node\n",
+				xfs_err(mp, "%s: ptrs are equal in node\n",
 					__func__);
+				ASSERT(XFS_FORCED_SHUTDOWN(mp));
+				xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
 			}
 		}
 	}
@@ -483,7 +485,9 @@ xfs_bmap_check_leaf_extents(
 error_norelse:
 	xfs_warn(mp, "%s: BAD after btree leaves for %d extents",
 		__func__, i);
-	panic("%s: CORRUPTED BTREE OR SOMETHING", __func__);
+	xfs_err(mp, "%s: CORRUPTED BTREE OR SOMETHING", __func__);
+	ASSERT(XFS_FORCED_SHUTDOWN(mp));
+	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
 	return;
 }
 


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 02/12] xfs: add missing rmap error return
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
  2018-05-03 18:05 ` [PATCH 01/12] xfs: bmap debugging should never panic the system Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-07 15:06   ` Christoph Hellwig
  2018-05-10 13:53   ` Brian Foster
  2018-05-03 18:06 ` [PATCH 03/12] xfs: skip scrub xref if corruption already noted Darrick J. Wong
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

xfs_rmap_lookup_le_range can return errors, so we need to check for
them and bail out.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_rmap.c |    2 ++
 1 file changed, 2 insertions(+)


diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
index fba8d2718017..f7769edf5b68 100644
--- a/fs/xfs/libxfs/xfs_rmap.c
+++ b/fs/xfs/libxfs/xfs_rmap.c
@@ -1374,6 +1374,8 @@ xfs_rmap_convert_shared(
 	 */
 	error = xfs_rmap_lookup_le_range(cur, bno, owner, offset, flags,
 			&PREV, &i);
+	if (error)
+		goto done;
 	XFS_WANT_CORRUPTED_GOTO(mp, i == 1, done);
 
 	ASSERT(PREV.rm_offset <= offset);


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 03/12] xfs: skip scrub xref if corruption already noted
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
  2018-05-03 18:05 ` [PATCH 01/12] xfs: bmap debugging should never panic the system Darrick J. Wong
  2018-05-03 18:06 ` [PATCH 02/12] xfs: add missing rmap error return Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-10 13:53   ` Brian Foster
  2018-05-03 18:06 ` [PATCH 04/12] xfs: don't continue scrub if already corrupt Darrick J. Wong
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Don't bother looking for cross-referencing problems if the metadata is
already corrupt or we've already found a cross-referencing problem.
Since we added a helper function for flags testing, convert existing
users to use it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/alloc.c    |    4 ++--
 fs/xfs/scrub/bmap.c     |    2 +-
 fs/xfs/scrub/common.c   |    4 ++++
 fs/xfs/scrub/common.h   |   10 ++++++++++
 fs/xfs/scrub/ialloc.c   |    7 ++++---
 fs/xfs/scrub/inode.c    |    5 ++++-
 fs/xfs/scrub/refcount.c |    8 ++++----
 fs/xfs/scrub/rmap.c     |    6 +++---
 fs/xfs/scrub/rtbitmap.c |    3 +++
 fs/xfs/scrub/scrub.c    |    4 ++--
 10 files changed, 37 insertions(+), 16 deletions(-)


diff --git a/fs/xfs/scrub/alloc.c b/fs/xfs/scrub/alloc.c
index 517c079d3f68..941a0a55224e 100644
--- a/fs/xfs/scrub/alloc.c
+++ b/fs/xfs/scrub/alloc.c
@@ -70,7 +70,7 @@ xfs_scrub_allocbt_xref_other(
 		pcur = &sc->sa.cnt_cur;
 	else
 		pcur = &sc->sa.bno_cur;
-	if (!*pcur)
+	if (!*pcur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	error = xfs_alloc_lookup_le(*pcur, agbno, len, &has_otherrec);
@@ -172,7 +172,7 @@ xfs_scrub_xref_is_used_space(
 	bool				is_freesp;
 	int				error;
 
-	if (!sc->sa.bno_cur)
+	if (!sc->sa.bno_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	error = xfs_alloc_has_record(sc->sa.bno_cur, agbno, len, &is_freesp);
diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
index 639d14b51e90..fb91caf17652 100644
--- a/fs/xfs/scrub/bmap.c
+++ b/fs/xfs/scrub/bmap.c
@@ -175,7 +175,7 @@ xfs_scrub_bmap_xref_rmap(
 	unsigned long long		rmap_end;
 	uint64_t			owner;
 
-	if (!info->sc->sa.rmap_cur)
+	if (!info->sc->sa.rmap_cur || xfs_scrub_skip_xref(info->sc->sm))
 		return;
 
 	if (info->whichfork == XFS_COW_FORK)
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 8ed91d5c868d..e9ca5f0802d5 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -727,6 +727,10 @@ xfs_scrub_should_check_xref(
 	int				*error,
 	struct xfs_btree_cur		**curpp)
 {
+	/* No point in xref if we already know we're corrupt. */
+	if (xfs_scrub_skip_xref(sc->sm))
+		return false;
+
 	if (*error == 0)
 		return true;
 
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index deaf60400981..80fa5a67c265 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -157,4 +157,14 @@ int xfs_scrub_setup_inode_contents(struct xfs_scrub_context *sc,
 				   struct xfs_inode *ip, unsigned int resblks);
 void xfs_scrub_buffer_recheck(struct xfs_scrub_context *sc, struct xfs_buf *bp);
 
+/*
+ * Don't bother cross-referencing if we already found corruption or cross
+ * referencing discrepancies.
+ */
+static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
+{
+	return sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
+			       XFS_SCRUB_OFLAG_XCORRUPT);
+}
+
 #endif	/* __XFS_SCRUB_COMMON_H__ */
diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index 106ca4bd753f..00a834d3b56d 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -387,7 +387,8 @@ xfs_scrub_iallocbt_xref_rmap_btreeblks(
 	int				error;
 
 	if (!sc->sa.ino_cur || !sc->sa.rmap_cur ||
-	    (xfs_sb_version_hasfinobt(&sc->mp->m_sb) && !sc->sa.fino_cur))
+	    (xfs_sb_version_hasfinobt(&sc->mp->m_sb) && !sc->sa.fino_cur) ||
+	    xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	/* Check that we saw as many inobt blocks as the rmap says. */
@@ -424,7 +425,7 @@ xfs_scrub_iallocbt_xref_rmap_inodes(
 	xfs_filblks_t			blocks;
 	int				error;
 
-	if (!sc->sa.rmap_cur)
+	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	/* Check that we saw as many inode blocks as the rmap knows about. */
@@ -496,7 +497,7 @@ xfs_scrub_xref_inode_check(
 	bool				has_inodes;
 	int				error;
 
-	if (!(*icur))
+	if (!(*icur) || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	error = xfs_ialloc_has_inodes_at_extent(*icur, agbno, len, &has_inodes);
diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c
index df14930e4fc5..8532fc7be838 100644
--- a/fs/xfs/scrub/inode.c
+++ b/fs/xfs/scrub/inode.c
@@ -449,7 +449,7 @@ xfs_scrub_inode_xref_finobt(
 	int				has_record;
 	int				error;
 
-	if (!sc->sa.fino_cur)
+	if (!sc->sa.fino_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	agino = XFS_INO_TO_AGINO(sc->mp, ino);
@@ -492,6 +492,9 @@ xfs_scrub_inode_xref_bmap(
 	xfs_filblks_t			acount;
 	int				error;
 
+	if (xfs_scrub_skip_xref(sc->sm))
+		return;
+
 	/* Walk all the extents to check nextents/naextents/nblocks. */
 	error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_DATA_FORK,
 			&nextents, &count);
diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c
index 400f1561cd3d..913b0a190e68 100644
--- a/fs/xfs/scrub/refcount.c
+++ b/fs/xfs/scrub/refcount.c
@@ -310,7 +310,7 @@ xfs_scrub_refcountbt_xref_rmap(
 	struct xfs_scrub_refcnt_frag	*n;
 	int				error;
 
-	if (!sc->sa.rmap_cur)
+	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	/* Cross-reference with the rmapbt to confirm the refcount. */
@@ -404,7 +404,7 @@ xfs_scrub_refcount_xref_rmap(
 	xfs_filblks_t			blocks;
 	int				error;
 
-	if (!sc->sa.rmap_cur)
+	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	/* Check that we saw as many refcbt blocks as the rmap knows about. */
@@ -460,7 +460,7 @@ xfs_scrub_xref_is_cow_staging(
 	int				has_refcount;
 	int				error;
 
-	if (!sc->sa.refc_cur)
+	if (!sc->sa.refc_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	/* Find the CoW staging extent. */
@@ -504,7 +504,7 @@ xfs_scrub_xref_is_not_shared(
 	bool				shared;
 	int				error;
 
-	if (!sc->sa.refc_cur)
+	if (!sc->sa.refc_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	error = xfs_refcount_has_record(sc->sa.refc_cur, agbno, len, &shared);
diff --git a/fs/xfs/scrub/rmap.c b/fs/xfs/scrub/rmap.c
index 8f2a7c3ff455..b376a9a77c04 100644
--- a/fs/xfs/scrub/rmap.c
+++ b/fs/xfs/scrub/rmap.c
@@ -66,7 +66,7 @@ xfs_scrub_rmapbt_xref_refc(
 	bool				is_unwritten;
 	int				error;
 
-	if (!sc->sa.refc_cur)
+	if (!sc->sa.refc_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	non_inode = XFS_RMAP_NON_INODE_OWNER(irec->rm_owner);
@@ -207,7 +207,7 @@ xfs_scrub_xref_check_owner(
 	bool				has_rmap;
 	int				error;
 
-	if (!sc->sa.rmap_cur)
+	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	error = xfs_rmap_record_exists(sc->sa.rmap_cur, bno, len, oinfo,
@@ -250,7 +250,7 @@ xfs_scrub_xref_has_no_owner(
 	bool				has_rmap;
 	int				error;
 
-	if (!sc->sa.rmap_cur)
+	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
 		return;
 
 	error = xfs_rmap_has_record(sc->sa.rmap_cur, bno, len, &has_rmap);
diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c
index 39c41dfe08ee..8b048f107af2 100644
--- a/fs/xfs/scrub/rtbitmap.c
+++ b/fs/xfs/scrub/rtbitmap.c
@@ -110,6 +110,9 @@ xfs_scrub_xref_is_used_rt_space(
 	bool				is_free;
 	int				error;
 
+	if (xfs_scrub_skip_xref(sc->sm))
+		return;
+
 	xfs_ilock(sc->mp->m_rbmip, XFS_ILOCK_SHARED | XFS_ILOCK_RTBITMAP);
 	error = xfs_rtalloc_extent_is_free(sc->mp, sc->tp, fsbno, len,
 			&is_free);
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 26c75967a072..d9ecbb6ad4a3 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -446,8 +446,8 @@ xfs_scrub_metadata(
 	} else if (error)
 		goto out_teardown;
 
-	if (sc.sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
-			       XFS_SCRUB_OFLAG_XCORRUPT))
+	if (sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
+			    XFS_SCRUB_OFLAG_XCORRUPT))
 		xfs_alert_ratelimited(mp, "Corruption detected during scrub.");
 
 out_teardown:


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 04/12] xfs: don't continue scrub if already corrupt
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (2 preceding siblings ...)
  2018-05-03 18:06 ` [PATCH 03/12] xfs: skip scrub xref if corruption already noted Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-10 13:53   ` Brian Foster
  2018-05-03 18:06 ` [PATCH 05/12] xfs: avoid ilock games in the quota scrubber Darrick J. Wong
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

If we've already decided that something is corrupt, we might as well
abort all the loops and exit as quickly as possible.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/attr.c   |    3 ++-
 fs/xfs/scrub/bmap.c   |    3 ++-
 fs/xfs/scrub/dir.c    |   34 +++++++++++++++++++++++++---------
 fs/xfs/scrub/parent.c |    3 +++
 4 files changed, 32 insertions(+), 11 deletions(-)


diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
index 127575f0abfb..84b6d6b66578 100644
--- a/fs/xfs/scrub/attr.c
+++ b/fs/xfs/scrub/attr.c
@@ -126,8 +126,9 @@ xfs_scrub_xattr_listent(
 	if (args.valuelen != valuelen)
 		xfs_scrub_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK,
 					     args.blkno);
-
 fail_xref:
+	if (sx->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		context->seen_enough = 1;
 	return;
 }
 
diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
index fb91caf17652..c28ff2ec201a 100644
--- a/fs/xfs/scrub/bmap.c
+++ b/fs/xfs/scrub/bmap.c
@@ -684,7 +684,8 @@ xfs_scrub_bmap(
 	info.lastoff = 0;
 	ifp = XFS_IFORK_PTR(ip, whichfork);
 	for_each_xfs_iext(ifp, &icur, &irec) {
-		if (xfs_scrub_should_terminate(sc, &error))
+		if (xfs_scrub_should_terminate(sc, &error) ||
+		    (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
 			break;
 		if (isnullstartblock(irec.br_startblock))
 			continue;
diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c
index 38f29806eb54..fb56a885f443 100644
--- a/fs/xfs/scrub/dir.c
+++ b/fs/xfs/scrub/dir.c
@@ -172,7 +172,7 @@ xfs_scrub_dir_actor(
 	error = xfs_dir_lookup(sdc->sc->tp, ip, &xname, &lookup_ino, NULL);
 	if (!xfs_scrub_fblock_process_error(sdc->sc, XFS_DATA_FORK, offset,
 			&error))
-		goto fail_xref;
+		goto out;
 	if (lookup_ino != ino) {
 		xfs_scrub_fblock_set_corrupt(sdc->sc, XFS_DATA_FORK, offset);
 		goto out;
@@ -183,8 +183,13 @@ xfs_scrub_dir_actor(
 	if (error)
 		goto out;
 out:
-	return error;
-fail_xref:
+	/*
+	 * A negative error code returned here is supposed to cause the
+	 * dir_emit caller (xfs_readdir) to abort the directory iteration
+	 * and return zero to xfs_scrub_dir.
+	 */
+	if (error == 0 && sdc->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		return -EFSCORRUPTED;
 	return error;
 }
 
@@ -240,6 +245,9 @@ xfs_scrub_dir_rec(
 	}
 	xfs_scrub_buffer_recheck(ds->sc, bp);
 
+	if (ds->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		goto out_relse;
+
 	dent = (struct xfs_dir2_data_entry *)(((char *)bp->b_addr) + off);
 
 	/* Make sure we got a real directory entry. */
@@ -357,6 +365,9 @@ xfs_scrub_directory_data_bestfree(
 
 	/* XXX: Check xfs_dir3_data_hdr.pad is zero once we start setting it. */
 
+	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		goto out_buf;
+
 	/* Do the bestfrees correspond to actual free space? */
 	bf = d_ops->data_bestfree_p(bp->b_addr);
 	smallest_bestfree = UINT_MAX;
@@ -394,7 +405,7 @@ xfs_scrub_directory_data_bestfree(
 	endptr = xfs_dir3_data_endp(mp->m_dir_geo, bp->b_addr);
 
 	/* Iterate the entries, stopping when we hit or go past the end. */
-	while (ptr < endptr) {
+	while (ptr < endptr && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {
 		dup = (struct xfs_dir2_data_unused *)ptr;
 		/* Skip real entries */
 		if (dup->freetag != cpu_to_be16(XFS_DIR2_DATA_FREE_TAG)) {
@@ -413,8 +424,10 @@ xfs_scrub_directory_data_bestfree(
 
 		/* Spot check this free entry */
 		tag = be16_to_cpu(*xfs_dir2_data_unused_tag_p(dup));
-		if (tag != ((char *)dup - (char *)bp->b_addr))
+		if (tag != ((char *)dup - (char *)bp->b_addr)) {
 			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk);
+			goto out_buf;
+		}
 
 		/*
 		 * Either this entry is a bestfree or it's smaller than
@@ -549,6 +562,8 @@ xfs_scrub_directory_leaf1_bestfree(
 
 	/* Check all the bestfree entries. */
 	for (i = 0; i < bestcount; i++, bestp++) {
+		if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+			break;
 		best = be16_to_cpu(*bestp);
 		if (best == NULLDATAOFF)
 			continue;
@@ -556,9 +571,10 @@ xfs_scrub_directory_leaf1_bestfree(
 				i * args->geo->fsbcount, -1, &dbp);
 		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, lblk,
 				&error))
-			continue;
+			break;
 		xfs_scrub_directory_check_freesp(sc, lblk, dbp, best);
 		xfs_trans_brelse(sc->tp, dbp);
+
 	}
 out:
 	return error;
@@ -607,7 +623,7 @@ xfs_scrub_directory_free_bestfree(
 				-1, &dbp);
 		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, lblk,
 				&error))
-			continue;
+			break;
 		xfs_scrub_directory_check_freesp(sc, lblk, dbp, best);
 		xfs_trans_brelse(sc->tp, dbp);
 	}
@@ -656,7 +672,7 @@ xfs_scrub_directory_blocks(
 
 	/* Iterate all the data extents in the directory... */
 	found = xfs_iext_lookup_extent(sc->ip, ifp, lblk, &icur, &got);
-	while (found) {
+	while (found && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {
 		/* Block directories only have a single block at offset 0. */
 		if (is_block &&
 		    (got.br_startoff > 0 ||
@@ -719,7 +735,7 @@ xfs_scrub_directory_blocks(
 	/* Scan for free blocks */
 	lblk = free_lblk;
 	found = xfs_iext_lookup_extent(sc->ip, ifp, lblk, &icur, &got);
-	while (found) {
+	while (found && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {
 		/*
 		 * Dirs can't have blocks mapped above 2^32.
 		 * Single-block dirs shouldn't even be here.
diff --git a/fs/xfs/scrub/parent.c b/fs/xfs/scrub/parent.c
index 1fb88c18d455..95fdd0d9e65d 100644
--- a/fs/xfs/scrub/parent.c
+++ b/fs/xfs/scrub/parent.c
@@ -147,6 +147,9 @@ xfs_scrub_parent_validate(
 
 	*try_again = false;
 
+	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		return 0;
+
 	/* '..' must not point to ourselves. */
 	if (sc->ip->i_ino == dnum) {
 		xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, 0);


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 05/12] xfs: avoid ilock games in the quota scrubber
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (3 preceding siblings ...)
  2018-05-03 18:06 ` [PATCH 04/12] xfs: don't continue scrub if already corrupt Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-03 18:06 ` [PATCH 06/12] xfs: quota scrub should use bmapbtd scrubber Darrick J. Wong
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, Brian Foster

From: Darrick J. Wong <darrick.wong@oracle.com>

Refactor the quota scrubber to take the quotaofflock and grab the quota
inode in the setup function so that we can treat quota in the same
"scrub in the context of this inode" (i.e. sc->ip) manner as we treat
any other inode.  We do have to drop the quota inode's ILOCK_EXCL to use
dqiterate, but since dquots have their own individual locks the ILOCK
wasn't helping us anyway.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/scrub/quota.c |   53 ++++++++++++++++++++++++--------------------------
 fs/xfs/scrub/scrub.c |    4 ++++
 fs/xfs/scrub/scrub.h |    1 +
 3 files changed, 30 insertions(+), 28 deletions(-)


diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
index ba87c3aaa8b7..d3d08978f53a 100644
--- a/fs/xfs/scrub/quota.c
+++ b/fs/xfs/scrub/quota.c
@@ -66,12 +66,24 @@ xfs_scrub_setup_quota(
 	struct xfs_inode		*ip)
 {
 	uint				dqtype;
+	int				error;
+
+	if (!XFS_IS_QUOTA_RUNNING(sc->mp) || !XFS_IS_QUOTA_ON(sc->mp))
+		return -ENOENT;
 
 	dqtype = xfs_scrub_quota_to_dqtype(sc);
 	if (dqtype == 0)
 		return -EINVAL;
+	sc->has_quotaofflock = true;
+	mutex_lock(&sc->mp->m_quotainfo->qi_quotaofflock);
 	if (!xfs_this_quota_on(sc->mp, dqtype))
 		return -ENOENT;
+	error = xfs_scrub_setup_fs(sc, ip);
+	if (error)
+		return error;
+	sc->ip = xfs_quota_inode(sc->mp, dqtype);
+	xfs_ilock(sc->ip, XFS_ILOCK_EXCL);
+	sc->ilock_flags = XFS_ILOCK_EXCL;
 	return 0;
 }
 
@@ -202,7 +214,6 @@ xfs_scrub_quota(
 	struct xfs_bmbt_irec		irec = { 0 };
 	struct xfs_scrub_quota_info	sqi;
 	struct xfs_mount		*mp = sc->mp;
-	struct xfs_inode		*ip;
 	struct xfs_quotainfo		*qi = mp->m_quotainfo;
 	xfs_fileoff_t			max_dqid_off;
 	xfs_fileoff_t			off = 0;
@@ -210,25 +221,12 @@ xfs_scrub_quota(
 	int				nimaps;
 	int				error = 0;
 
-	if (!XFS_IS_QUOTA_RUNNING(mp) || !XFS_IS_QUOTA_ON(mp))
-		return -ENOENT;
-
-	mutex_lock(&qi->qi_quotaofflock);
 	dqtype = xfs_scrub_quota_to_dqtype(sc);
-	if (!xfs_this_quota_on(sc->mp, dqtype)) {
-		error = -ENOENT;
-		goto out_unlock_quota;
-	}
-
-	/* Attach to the quota inode and set sc->ip so that reporting works. */
-	ip = xfs_quota_inode(sc->mp, dqtype);
-	sc->ip = ip;
 
 	/* Look for problem extents. */
-	xfs_ilock(ip, XFS_ILOCK_EXCL);
-	if (ip->i_d.di_flags & XFS_DIFLAG_REALTIME) {
+	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME) {
 		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
-		goto out_unlock_inode;
+		goto out;
 	}
 	max_dqid_off = ((xfs_dqid_t)-1) / qi->qi_dqperchunk;
 	while (1) {
@@ -237,11 +235,11 @@ xfs_scrub_quota(
 
 		off = irec.br_startoff + irec.br_blockcount;
 		nimaps = 1;
-		error = xfs_bmapi_read(ip, off, -1, &irec, &nimaps,
+		error = xfs_bmapi_read(sc->ip, off, -1, &irec, &nimaps,
 				XFS_BMAPI_ENTIRE);
 		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, off,
 				&error))
-			goto out_unlock_inode;
+			goto out;
 		if (!nimaps)
 			break;
 		if (irec.br_startblock == HOLESTARTBLOCK)
@@ -267,26 +265,25 @@ xfs_scrub_quota(
 		    irec.br_startoff + irec.br_blockcount > max_dqid_off + 1)
 			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, off);
 	}
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
 		goto out;
 
-	/* Check all the quota items. */
+	/*
+	 * Check all the quota items.  Now that we've checked the quota inode
+	 * data fork we have to drop ILOCK_EXCL to use the regular dquot
+	 * functions.
+	 */
+	xfs_iunlock(sc->ip, sc->ilock_flags);
+	sc->ilock_flags = 0;
 	sqi.sc = sc;
 	sqi.last_id = 0;
 	error = xfs_qm_dqiterate(mp, dqtype, xfs_scrub_quota_item, &sqi);
+	sc->ilock_flags = XFS_ILOCK_EXCL;
+	xfs_ilock(sc->ip, sc->ilock_flags);
 	if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK,
 			sqi.last_id * qi->qi_dqperchunk, &error))
 		goto out;
 
 out:
-	/* We set sc->ip earlier, so make sure we clear it now. */
-	sc->ip = NULL;
-out_unlock_quota:
-	mutex_unlock(&qi->qi_quotaofflock);
 	return error;
-
-out_unlock_inode:
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
-	goto out;
 }
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index d9ecbb6ad4a3..ad6eda4fa123 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -42,6 +42,8 @@
 #include "xfs_refcount_btree.h"
 #include "xfs_rmap.h"
 #include "xfs_rmap_btree.h"
+#include "xfs_quota.h"
+#include "xfs_qm.h"
 #include "scrub/xfs_scrub.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
@@ -166,6 +168,8 @@ xfs_scrub_teardown(
 			iput(VFS_I(sc->ip));
 		sc->ip = NULL;
 	}
+	if (sc->has_quotaofflock)
+		mutex_unlock(&sc->mp->m_quotainfo->qi_quotaofflock);
 	if (sc->buf) {
 		kmem_free(sc->buf);
 		sc->buf = NULL;
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 0d92af86f67a..5d797319fc9a 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -73,6 +73,7 @@ struct xfs_scrub_context {
 	void				*buf;
 	uint				ilock_flags;
 	bool				try_harder;
+	bool				has_quotaofflock;
 
 	/* State tracking for single-AG operations. */
 	struct xfs_scrub_ag		sa;


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 06/12] xfs: quota scrub should use bmapbtd scrubber
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (4 preceding siblings ...)
  2018-05-03 18:06 ` [PATCH 05/12] xfs: avoid ilock games in the quota scrubber Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-10 13:54   ` Brian Foster
  2018-05-03 18:06 ` [PATCH 07/12] xfs: scrub the data fork of the realtime inodes Darrick J. Wong
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Replace the quota scrubber's open-coded data fork scrubber with a
redirected call to the bmapbtd scrubber.  This strengthens the quota
scrub to include all the cross-referencing that it does.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/common.c |   57 ++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/common.h |    2 +
 fs/xfs/scrub/quota.c  |   83 ++++++++++++++++++++++++-------------------------
 3 files changed, 99 insertions(+), 43 deletions(-)


diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index e9ca5f0802d5..335c7a305e15 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -44,6 +44,8 @@
 #include "xfs_rmap_btree.h"
 #include "xfs_log.h"
 #include "xfs_trans_priv.h"
+#include "xfs_attr.h"
+#include "xfs_reflink.h"
 #include "scrub/xfs_scrub.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
@@ -777,3 +779,58 @@ xfs_scrub_buffer_recheck(
 	sc->sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT;
 	trace_xfs_scrub_block_error(sc, bp->b_bn, fa);
 }
+
+/*
+ * Scrub the attr/data forks of a metadata inode.  The metadata inode must be
+ * pointed to by sc->ip and the ILOCK must be held.
+ */
+int
+xfs_scrub_metadata_inode_forks(
+	struct xfs_scrub_context	*sc)
+{
+	__u32				smtype;
+	bool				shared;
+	int				error;
+
+	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		return 0;
+
+	/* Metadata inodes don't live on the rt device. */
+	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME) {
+		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
+		return 0;
+	}
+
+	/* They should never participate in reflink. */
+	if (xfs_is_reflink_inode(sc->ip)) {
+		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
+		return 0;
+	}
+
+	/* They also should never have extended attributes. */
+	if (xfs_inode_hasattr(sc->ip)) {
+		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
+		return 0;
+	}
+
+	/* Invoke the data fork scrubber. */
+	smtype = sc->sm->sm_type;
+	sc->sm->sm_type = XFS_SCRUB_TYPE_BMBTD;
+	error = xfs_scrub_bmap_data(sc);
+	sc->sm->sm_type = smtype;
+	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
+		return error;
+
+	/* Look for incorrect shared blocks. */
+	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
+		error = xfs_reflink_inode_has_shared_extents(sc->tp, sc->ip,
+				&shared);
+		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, 0,
+				&error))
+			return error;
+		if (shared)
+			xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
+	}
+
+	return error;
+}
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index 80fa5a67c265..ad736c411287 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -167,4 +167,6 @@ static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
 			       XFS_SCRUB_OFLAG_XCORRUPT);
 }
 
+int xfs_scrub_metadata_inode_forks(struct xfs_scrub_context *sc);
+
 #endif	/* __XFS_SCRUB_COMMON_H__ */
diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
index d3d08978f53a..15ae4d23d6ac 100644
--- a/fs/xfs/scrub/quota.c
+++ b/fs/xfs/scrub/quota.c
@@ -206,65 +206,62 @@ xfs_scrub_quota_item(
 	return 0;
 }
 
-/* Scrub all of a quota type's items. */
-int
-xfs_scrub_quota(
+/* Check the quota's data fork. */
+STATIC int
+xfs_scrub_quota_data_fork(
 	struct xfs_scrub_context	*sc)
 {
 	struct xfs_bmbt_irec		irec = { 0 };
-	struct xfs_scrub_quota_info	sqi;
-	struct xfs_mount		*mp = sc->mp;
-	struct xfs_quotainfo		*qi = mp->m_quotainfo;
+	struct xfs_iext_cursor		icur;
+	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
+	struct xfs_ifork		*ifp;
 	xfs_fileoff_t			max_dqid_off;
-	xfs_fileoff_t			off = 0;
-	uint				dqtype;
-	int				nimaps;
 	int				error = 0;
 
-	dqtype = xfs_scrub_quota_to_dqtype(sc);
+	/* Invoke the fork scrubber. */
+	error = xfs_scrub_metadata_inode_forks(sc);
+	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
+		return error;
 
-	/* Look for problem extents. */
-	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME) {
-		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
-		goto out;
-	}
+	/* Check for data fork problems that apply only to quota files. */
 	max_dqid_off = ((xfs_dqid_t)-1) / qi->qi_dqperchunk;
-	while (1) {
+	ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
+	for_each_xfs_iext(ifp, &icur, &irec) {
 		if (xfs_scrub_should_terminate(sc, &error))
 			break;
-
-		off = irec.br_startoff + irec.br_blockcount;
-		nimaps = 1;
-		error = xfs_bmapi_read(sc->ip, off, -1, &irec, &nimaps,
-				XFS_BMAPI_ENTIRE);
-		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, off,
-				&error))
-			goto out;
-		if (!nimaps)
-			break;
-		if (irec.br_startblock == HOLESTARTBLOCK)
-			continue;
-
-		/* Check the extent record doesn't point to crap. */
-		if (irec.br_startblock + irec.br_blockcount <=
-		    irec.br_startblock)
-			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
-					irec.br_startoff);
-		if (!xfs_verify_fsbno(mp, irec.br_startblock) ||
-		    !xfs_verify_fsbno(mp, irec.br_startblock +
-					irec.br_blockcount - 1))
-			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
-					irec.br_startoff);
-
 		/*
-		 * Unwritten extents or blocks mapped above the highest
+		 * delalloc extents or blocks mapped above the highest
 		 * quota id shouldn't happen.
 		 */
 		if (isnullstartblock(irec.br_startblock) ||
 		    irec.br_startoff > max_dqid_off ||
-		    irec.br_startoff + irec.br_blockcount > max_dqid_off + 1)
-			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, off);
+		    irec.br_startoff + irec.br_blockcount - 1 > max_dqid_off) {
+			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
+					irec.br_startoff);
+			break;
+		}
 	}
+
+	return error;
+}
+
+/* Scrub all of a quota type's items. */
+int
+xfs_scrub_quota(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_scrub_quota_info	sqi;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_quotainfo		*qi = mp->m_quotainfo;
+	uint				dqtype;
+	int				error = 0;
+
+	dqtype = xfs_scrub_quota_to_dqtype(sc);
+
+	/* Look for problem extents. */
+	error = xfs_scrub_quota_data_fork(sc);
+	if (error)
+		goto out;
 	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
 		goto out;
 


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 07/12] xfs: scrub the data fork of the realtime inodes
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (5 preceding siblings ...)
  2018-05-03 18:06 ` [PATCH 06/12] xfs: quota scrub should use bmapbtd scrubber Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-10 13:54   ` Brian Foster
  2018-05-03 18:06 ` [PATCH 08/12] xfs: superblock scrub should use short-lived buffers Darrick J. Wong
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

The realtime bitmap and summary inodes live on the metadata device, so
we can scrub their data forks with the regular scrubbers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/rtbitmap.c |   26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c
index 8b048f107af2..b2b4dff2660d 100644
--- a/fs/xfs/scrub/rtbitmap.c
+++ b/fs/xfs/scrub/rtbitmap.c
@@ -82,6 +82,11 @@ xfs_scrub_rtbitmap(
 {
 	int				error;
 
+	/* Invoke the fork scrubber. */
+	error = xfs_scrub_metadata_inode_forks(sc);
+	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
+		return error;
+
 	error = xfs_rtalloc_query_all(sc->tp, xfs_scrub_rtbitmap_rec, sc);
 	if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, 0, &error))
 		goto out;
@@ -95,8 +100,27 @@ int
 xfs_scrub_rtsummary(
 	struct xfs_scrub_context	*sc)
 {
+	struct xfs_inode		*rsumip = sc->mp->m_rsumip;
+	int				error = 0;
+
+	/*
+	 * We ILOCK'd the rt bitmap ip in the setup routine, now lock the
+	 * rt summary ip in compliance with the rt inode locking rules.
+	 */
+	xfs_ilock(rsumip, XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM);
+
+	/* Invoke the fork scrubber. */
+	sc->ip = rsumip;
+	error = xfs_scrub_metadata_inode_forks(sc);
+	sc->ip = sc->mp->m_rbmip;
+	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
+		goto out;
+
 	/* XXX: implement this some day */
-	return -ENOENT;
+	xfs_scrub_set_incomplete(sc);
+out:
+	xfs_iunlock(rsumip, XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM);
+	return error;
 }
 
 


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 08/12] xfs: superblock scrub should use short-lived buffers
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (6 preceding siblings ...)
  2018-05-03 18:06 ` [PATCH 07/12] xfs: scrub the data fork of the realtime inodes Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-10 13:55   ` Brian Foster
  2018-05-03 18:06 ` [PATCH 09/12] xfs: clean up scrub usage of KM_NOFS Darrick J. Wong
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Secondary superblocks are rarely used, so create a helper to read a
given non-primary AG's superblock and ensure that it won't stick around
hogging memory.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_sb.c     |   22 ++++++++++++++++++++++
 fs/xfs/libxfs/xfs_sb.h     |    3 +++
 fs/xfs/libxfs/xfs_shared.h |    1 +
 fs/xfs/scrub/agheader.c    |    4 +---
 4 files changed, 27 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index d9b94bd5f689..d9bef41a3f26 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -972,3 +972,25 @@ xfs_fs_geometry(
 
 	return 0;
 }
+
+/* Read a secondary superblock. */
+int
+xfs_sb_read_secondary(
+	struct xfs_mount	*mp,
+	struct xfs_trans	*tp,
+	xfs_agnumber_t		agno,
+	struct xfs_buf		**bpp)
+{
+	struct xfs_buf		*bp;
+	int			error;
+
+	ASSERT(agno != 0 && agno != NULLAGNUMBER);
+	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
+			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
+			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
+	if (error)
+		return error;
+	xfs_buf_set_ref(bp, XFS_SSB_REF);
+	*bpp = bp;
+	return 0;
+}
diff --git a/fs/xfs/libxfs/xfs_sb.h b/fs/xfs/libxfs/xfs_sb.h
index 63dcd2a1a657..5166d78b2c34 100644
--- a/fs/xfs/libxfs/xfs_sb.h
+++ b/fs/xfs/libxfs/xfs_sb.h
@@ -37,5 +37,8 @@ extern void	xfs_sb_quota_from_disk(struct xfs_sb *sbp);
 #define XFS_FS_GEOM_MAX_STRUCT_VER	(4)
 extern int	xfs_fs_geometry(struct xfs_sb *sbp, struct xfs_fsop_geom *geo,
 				int struct_version);
+extern int	xfs_sb_read_secondary(struct xfs_mount *mp,
+				struct xfs_trans *tp, xfs_agnumber_t agno,
+				struct xfs_buf **bpp);
 
 #endif	/* __XFS_SB_H__ */
diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index d0b84da0cb1e..d23f33c02f64 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -127,6 +127,7 @@ void	xfs_log_get_max_trans_res(struct xfs_mount *mp,
 #define	XFS_ATTR_BTREE_REF	1
 #define	XFS_DQUOT_REF		1
 #define	XFS_REFC_BTREE_REF	1
+#define	XFS_SSB_REF		0
 
 /*
  * Flags for xfs_trans_ichgtime().
diff --git a/fs/xfs/scrub/agheader.c b/fs/xfs/scrub/agheader.c
index 018aabbd9394..cd24ac56c21e 100644
--- a/fs/xfs/scrub/agheader.c
+++ b/fs/xfs/scrub/agheader.c
@@ -157,9 +157,7 @@ xfs_scrub_superblock(
 	if (agno == 0)
 		return 0;
 
-	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
-		  XFS_AGB_TO_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
-		  XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
+	error = xfs_sb_read_secondary(mp, sc->tp, agno, &bp);
 	/*
 	 * The superblock verifier can return several different error codes
 	 * if it thinks the superblock doesn't look right.  For a mount these


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 09/12] xfs: clean up scrub usage of KM_NOFS
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (7 preceding siblings ...)
  2018-05-03 18:06 ` [PATCH 08/12] xfs: superblock scrub should use short-lived buffers Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-03 18:06 ` [PATCH 10/12] xfs: btree scrub should check minrecs Darrick J. Wong
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, Brian Foster

From: Darrick J. Wong <darrick.wong@oracle.com>

All scrub code runs in transaction context, which means that memory
allocations are automatically run in PF_MEMALLOC_NOFS context.  It's
therefore unnecessary to pass in KM_NOFS to allocation routines, so
clean them all out.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/scrub/agheader.c |    3 ++-
 fs/xfs/scrub/btree.c    |    2 +-
 fs/xfs/scrub/refcount.c |    2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/scrub/agheader.c b/fs/xfs/scrub/agheader.c
index cd24ac56c21e..831acc0a328f 100644
--- a/fs/xfs/scrub/agheader.c
+++ b/fs/xfs/scrub/agheader.c
@@ -795,7 +795,8 @@ xfs_scrub_agfl(
 	}
 	memset(&sai, 0, sizeof(sai));
 	sai.sz_entries = agflcount;
-	sai.entries = kmem_zalloc(sizeof(xfs_agblock_t) * agflcount, KM_NOFS);
+	sai.entries = kmem_zalloc(sizeof(xfs_agblock_t) * agflcount,
+			KM_MAYFAIL);
 	if (!sai.entries) {
 		error = -ENOMEM;
 		goto out;
diff --git a/fs/xfs/scrub/btree.c b/fs/xfs/scrub/btree.c
index 54218168c8f9..ea972dab1485 100644
--- a/fs/xfs/scrub/btree.c
+++ b/fs/xfs/scrub/btree.c
@@ -442,7 +442,7 @@ xfs_scrub_btree_check_owner(
 	 */
 	if (cur->bc_btnum == XFS_BTNUM_BNO || cur->bc_btnum == XFS_BTNUM_RMAP) {
 		co = kmem_alloc(sizeof(struct check_owner),
-				KM_MAYFAIL | KM_NOFS);
+				KM_MAYFAIL);
 		if (!co)
 			return -ENOMEM;
 		co->level = level;
diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c
index 913b0a190e68..324a5f159145 100644
--- a/fs/xfs/scrub/refcount.c
+++ b/fs/xfs/scrub/refcount.c
@@ -150,7 +150,7 @@ xfs_scrub_refcountbt_rmap_check(
 		 * so we don't need insertion sort here.
 		 */
 		frag = kmem_alloc(sizeof(struct xfs_scrub_refcnt_frag),
-				KM_MAYFAIL | KM_NOFS);
+				KM_MAYFAIL);
 		if (!frag)
 			return -ENOMEM;
 		memcpy(&frag->rm, rec, sizeof(frag->rm));


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 10/12] xfs: btree scrub should check minrecs
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (8 preceding siblings ...)
  2018-05-03 18:06 ` [PATCH 09/12] xfs: clean up scrub usage of KM_NOFS Darrick J. Wong
@ 2018-05-03 18:06 ` Darrick J. Wong
  2018-05-03 18:07 ` [PATCH 11/12] xfs: refactor scrub transaction allocation function Darrick J. Wong
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:06 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, Brian Foster

From: Darrick J. Wong <darrick.wong@oracle.com>

Strengthen the btree block header checks to detect the number of records
being less than the btree type's minimum record count.  Certain blocks
are allowed to violate this constraint -- specifically any btree block
at the top of the tree can have fewer than minrecs records.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/scrub/btree.c |   40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)


diff --git a/fs/xfs/scrub/btree.c b/fs/xfs/scrub/btree.c
index ea972dab1485..2d29dceaa00e 100644
--- a/fs/xfs/scrub/btree.c
+++ b/fs/xfs/scrub/btree.c
@@ -454,6 +454,44 @@ xfs_scrub_btree_check_owner(
 	return xfs_scrub_btree_check_block_owner(bs, level, XFS_BUF_ADDR(bp));
 }
 
+/*
+ * Check that this btree block has at least minrecs records or is one of the
+ * special blocks that don't require that.
+ */
+STATIC void
+xfs_scrub_btree_check_minrecs(
+	struct xfs_scrub_btree	*bs,
+	int			level,
+	struct xfs_btree_block	*block)
+{
+	unsigned int		numrecs;
+	int			ok_level;
+
+	numrecs = be16_to_cpu(block->bb_numrecs);
+
+	/* More records than minrecs means the block is ok. */
+	if (numrecs >= bs->cur->bc_ops->get_minrecs(bs->cur, level))
+		return;
+
+	/*
+	 * Certain btree blocks /can/ have fewer than minrecs records.  Any
+	 * level greater than or equal to the level of the highest dedicated
+	 * btree block are allowed to violate this constraint.
+	 *
+	 * For a btree rooted in a block, the btree root can have fewer than
+	 * minrecs records.  If the btree is rooted in an inode and does not
+	 * store records in the root, the direct children of the root and the
+	 * root itself can have fewer than minrecs records.
+	 */
+	ok_level = bs->cur->bc_nlevels - 1;
+	if (bs->cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
+		ok_level--;
+	if (level >= ok_level)
+		return;
+
+	xfs_scrub_btree_set_corrupt(bs->sc, bs->cur, level);
+}
+
 /*
  * Grab and scrub a btree block given a btree pointer.  Returns block
  * and buffer pointers (if applicable) if they're ok to use.
@@ -491,6 +529,8 @@ xfs_scrub_btree_get_block(
 	if (*pbp)
 		xfs_scrub_buffer_recheck(bs->sc, *pbp);
 
+	xfs_scrub_btree_check_minrecs(bs, level, *pblock);
+
 	/*
 	 * Check the block's owner; this function absorbs error codes
 	 * for us.


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 11/12] xfs: refactor scrub transaction allocation function
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (9 preceding siblings ...)
  2018-05-03 18:06 ` [PATCH 10/12] xfs: btree scrub should check minrecs Darrick J. Wong
@ 2018-05-03 18:07 ` Darrick J. Wong
  2018-05-03 18:07 ` [PATCH 12/12] xfs: avoid ABBA deadlock when scrubbing parent pointers Darrick J. Wong
  2018-05-08  0:40 ` [PATCH 13/12] xfs: refactor quota limits initialization Darrick J. Wong
  12 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:07 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, Brian Foster

From: Darrick J. Wong <darrick.wong@oracle.com>

Since the transaction allocation helper is about to become more complex,
move it to common.c and remove the redundant parameters.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/scrub/bmap.c   |    3 +--
 fs/xfs/scrub/common.c |   16 +++++++++++++---
 fs/xfs/scrub/common.h |   14 +-------------
 fs/xfs/scrub/inode.c  |    5 ++---
 4 files changed, 17 insertions(+), 21 deletions(-)


diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
index c28ff2ec201a..42a115e83739 100644
--- a/fs/xfs/scrub/bmap.c
+++ b/fs/xfs/scrub/bmap.c
@@ -51,7 +51,6 @@ xfs_scrub_setup_inode_bmap(
 	struct xfs_scrub_context	*sc,
 	struct xfs_inode		*ip)
 {
-	struct xfs_mount		*mp = sc->mp;
 	int				error;
 
 	error = xfs_scrub_get_inode(sc, ip);
@@ -75,7 +74,7 @@ xfs_scrub_setup_inode_bmap(
 	}
 
 	/* Got the inode, lock it and we're ready to go. */
-	error = xfs_scrub_trans_alloc(sc->sm, mp, &sc->tp);
+	error = xfs_scrub_trans_alloc(sc);
 	if (error)
 		goto out;
 	sc->ilock_flags |= XFS_ILOCK_EXCL;
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 335c7a305e15..62b33c99efe4 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -570,13 +570,24 @@ xfs_scrub_ag_init(
 
 /* Per-scrubber setup functions */
 
+/*
+ * Grab an empty transaction so that we can re-grab locked buffers if
+ * one of our btrees turns out to be cyclic.
+ */
+int
+xfs_scrub_trans_alloc(
+	struct xfs_scrub_context	*sc)
+{
+	return xfs_trans_alloc_empty(sc->mp, &sc->tp);
+}
+
 /* Set us up with a transaction and an empty context. */
 int
 xfs_scrub_setup_fs(
 	struct xfs_scrub_context	*sc,
 	struct xfs_inode		*ip)
 {
-	return xfs_scrub_trans_alloc(sc->sm, sc->mp, &sc->tp);
+	return xfs_scrub_trans_alloc(sc);
 }
 
 /* Set us up with AG headers and btree cursors. */
@@ -697,7 +708,6 @@ xfs_scrub_setup_inode_contents(
 	struct xfs_inode		*ip,
 	unsigned int			resblks)
 {
-	struct xfs_mount		*mp = sc->mp;
 	int				error;
 
 	error = xfs_scrub_get_inode(sc, ip);
@@ -707,7 +717,7 @@ xfs_scrub_setup_inode_contents(
 	/* Got the inode, lock it and we're ready to go. */
 	sc->ilock_flags = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
 	xfs_ilock(sc->ip, sc->ilock_flags);
-	error = xfs_scrub_trans_alloc(sc->sm, mp, &sc->tp);
+	error = xfs_scrub_trans_alloc(sc);
 	if (error)
 		goto out;
 	sc->ilock_flags |= XFS_ILOCK_EXCL;
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index ad736c411287..5d78bb9602ab 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -38,19 +38,7 @@ xfs_scrub_should_terminate(
 	return false;
 }
 
-/*
- * Grab an empty transaction so that we can re-grab locked buffers if
- * one of our btrees turns out to be cyclic.
- */
-static inline int
-xfs_scrub_trans_alloc(
-	struct xfs_scrub_metadata	*sm,
-	struct xfs_mount		*mp,
-	struct xfs_trans		**tpp)
-{
-	return xfs_trans_alloc_empty(mp, tpp);
-}
-
+int xfs_scrub_trans_alloc(struct xfs_scrub_context *sc);
 bool xfs_scrub_process_error(struct xfs_scrub_context *sc, xfs_agnumber_t agno,
 		xfs_agblock_t bno, int *error);
 bool xfs_scrub_fblock_process_error(struct xfs_scrub_context *sc, int whichfork,
diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c
index 8532fc7be838..550c0cf70a92 100644
--- a/fs/xfs/scrub/inode.c
+++ b/fs/xfs/scrub/inode.c
@@ -55,7 +55,6 @@ xfs_scrub_setup_inode(
 	struct xfs_scrub_context	*sc,
 	struct xfs_inode		*ip)
 {
-	struct xfs_mount		*mp = sc->mp;
 	int				error;
 
 	/*
@@ -68,7 +67,7 @@ xfs_scrub_setup_inode(
 		break;
 	case -EFSCORRUPTED:
 	case -EFSBADCRC:
-		return xfs_scrub_trans_alloc(sc->sm, mp, &sc->tp);
+		return xfs_scrub_trans_alloc(sc);
 	default:
 		return error;
 	}
@@ -76,7 +75,7 @@ xfs_scrub_setup_inode(
 	/* Got the inode, lock it and we're ready to go. */
 	sc->ilock_flags = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
 	xfs_ilock(sc->ip, sc->ilock_flags);
-	error = xfs_scrub_trans_alloc(sc->sm, mp, &sc->tp);
+	error = xfs_scrub_trans_alloc(sc);
 	if (error)
 		goto out;
 	sc->ilock_flags |= XFS_ILOCK_EXCL;


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 12/12] xfs: avoid ABBA deadlock when scrubbing parent pointers
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (10 preceding siblings ...)
  2018-05-03 18:07 ` [PATCH 11/12] xfs: refactor scrub transaction allocation function Darrick J. Wong
@ 2018-05-03 18:07 ` Darrick J. Wong
  2018-05-08  0:40 ` [PATCH 13/12] xfs: refactor quota limits initialization Darrick J. Wong
  12 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-03 18:07 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

In normal operation, the XFS convention is to take an inode's iolock
and then allocate a transaction.  However, when scrubbing parent inodes
this is inverted -- we allocated the transaction to do the scrub, and
now we're trying to grab the parent's iolock.  This can lead to ABBA
deadlocks: some thread grabbed the parent's iolock and is waiting for
space for a transaction while our parent scrubber is sitting on a
transaction trying to get the parent's iolock.

Therefore, convert all iolock attempts to use trylock; if that fails,
they can use the existing mechanisms to back off and try again.

The ABBA deadlock didn't happen with a non-repair scrub because the
transactions don't reserve any space, but repair scrubs require
reservation in order to update metadata.  However, any other concurrent
metadata update (e.g. directory create in the parent) could also induce
this deadlock with the parent scrubber.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/common.c |   22 ++++++++++++++++++++++
 fs/xfs/scrub/common.h |    1 +
 fs/xfs/scrub/parent.c |   16 ++++++++++++++--
 3 files changed, 37 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 62b33c99efe4..518bff2be0c9 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -844,3 +844,25 @@ xfs_scrub_metadata_inode_forks(
 
 	return error;
 }
+
+/*
+ * Try to lock an inode in violation of the usual locking order rules.  For
+ * example, trying to get the IOLOCK while in transaction context, or just
+ * plain breaking AG-order or inode-order inode locking rules.  Either way,
+ * the only way to avoid an ABBA deadlock is to use trylock and back off if
+ * we can't.
+ */
+int
+xfs_scrub_ilock_inverted(
+	struct xfs_inode	*ip,
+	uint			lock_mode)
+{
+	int			i;
+
+	for (i = 0; i < 20; i++) {
+		if (xfs_ilock_nowait(ip, lock_mode))
+			return 0;
+		delay(1);
+	}
+	return -EDEADLOCK;
+}
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index 5d78bb9602ab..119d9b6db887 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -156,5 +156,6 @@ static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
 }
 
 int xfs_scrub_metadata_inode_forks(struct xfs_scrub_context *sc);
+int xfs_scrub_ilock_inverted(struct xfs_inode *ip, uint lock_mode);
 
 #endif	/* __XFS_SCRUB_COMMON_H__ */
diff --git a/fs/xfs/scrub/parent.c b/fs/xfs/scrub/parent.c
index 95fdd0d9e65d..85241e680b37 100644
--- a/fs/xfs/scrub/parent.c
+++ b/fs/xfs/scrub/parent.c
@@ -214,7 +214,9 @@ xfs_scrub_parent_validate(
 	 */
 	xfs_iunlock(sc->ip, sc->ilock_flags);
 	sc->ilock_flags = 0;
-	xfs_ilock(dp, XFS_IOLOCK_SHARED);
+	error = xfs_scrub_ilock_inverted(dp, XFS_IOLOCK_SHARED);
+	if (error)
+		goto out_rele;
 
 	/* Go looking for our dentry. */
 	error = xfs_scrub_parent_count_parent_dentries(sc, dp, &nlink);
@@ -223,8 +225,10 @@ xfs_scrub_parent_validate(
 
 	/* Drop the parent lock, relock this inode. */
 	xfs_iunlock(dp, XFS_IOLOCK_SHARED);
+	error = xfs_scrub_ilock_inverted(sc->ip, XFS_IOLOCK_EXCL);
+	if (error)
+		goto out_rele;
 	sc->ilock_flags = XFS_IOLOCK_EXCL;
-	xfs_ilock(sc->ip, sc->ilock_flags);
 
 	/*
 	 * If we're an unlinked directory, the parent /won't/ have a link
@@ -326,5 +330,13 @@ xfs_scrub_parent(
 	if (try_again && tries == 20)
 		xfs_scrub_set_incomplete(sc);
 out:
+	/*
+	 * If we failed to lock the parent inode even after a retry, just mark
+	 * this scrub incomplete and return.
+	 */
+	if (sc->try_harder && error == -EDEADLOCK) {
+		error = 0;
+		xfs_scrub_set_incomplete(sc);
+	}
 	return error;
 }


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/12] xfs: bmap debugging should never panic the system
  2018-05-03 18:05 ` [PATCH 01/12] xfs: bmap debugging should never panic the system Darrick J. Wong
@ 2018-05-07 15:06   ` Christoph Hellwig
  2018-05-10 13:53   ` Brian Foster
  1 sibling, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2018-05-07 15:06 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:05:59AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Don't panic() the system if the bmap records are garbage, just call
> ASSERT which gives us the same backtrace but enables developers to
> control if the system goes down or not.  This makes debugging with
> generic/388 much easier because it won't reboot the machine midway
> through a run just because btree_read_bufl returns EIO when the fs has
> already shut down.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/12] xfs: add missing rmap error return
  2018-05-03 18:06 ` [PATCH 02/12] xfs: add missing rmap error return Darrick J. Wong
@ 2018-05-07 15:06   ` Christoph Hellwig
  2018-05-10 13:53   ` Brian Foster
  1 sibling, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2018-05-07 15:06 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:06:06AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> xfs_rmap_lookup_le_range can return errors, so we need to check for
> them and bail out.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 13/12] xfs: refactor quota limits initialization
  2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
                   ` (11 preceding siblings ...)
  2018-05-03 18:07 ` [PATCH 12/12] xfs: avoid ABBA deadlock when scrubbing parent pointers Darrick J. Wong
@ 2018-05-08  0:40 ` Darrick J. Wong
  12 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-08  0:40 UTC (permalink / raw)
  To: linux-xfs, Dave Chinner

From: Darrick J. Wong <darrick.wong@oracle.com>

Replace all the if (!error) weirdness with helper functions that follow
our regular coding practices.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_qm.c |  141 ++++++++++++++++++++++++++++++-------------------------
 1 file changed, 77 insertions(+), 64 deletions(-)

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 90de07c6b7bc..7c9055a57521 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -561,26 +561,87 @@ xfs_qm_set_defquota(
 {
 	xfs_dquot_t		*dqp;
 	struct xfs_def_quota    *defq;
+	struct xfs_disk_dquot	*ddqp;
 	int			error;
 
 	error = xfs_qm_dqget_uncached(mp, 0, type, &dqp);
-	if (!error) {
-		xfs_disk_dquot_t        *ddqp = &dqp->q_core;
+	if (error)
+		return;
 
-		defq = xfs_get_defquota(dqp, qinf);
+	ddqp = &dqp->q_core;
+	defq = xfs_get_defquota(dqp, qinf);
 
-		/*
-		 * Timers and warnings have been already set, let's just set the
-		 * default limits for this quota type
-		 */
-		defq->bhardlimit = be64_to_cpu(ddqp->d_blk_hardlimit);
-		defq->bsoftlimit = be64_to_cpu(ddqp->d_blk_softlimit);
-		defq->ihardlimit = be64_to_cpu(ddqp->d_ino_hardlimit);
-		defq->isoftlimit = be64_to_cpu(ddqp->d_ino_softlimit);
-		defq->rtbhardlimit = be64_to_cpu(ddqp->d_rtb_hardlimit);
-		defq->rtbsoftlimit = be64_to_cpu(ddqp->d_rtb_softlimit);
-		xfs_qm_dqdestroy(dqp);
-	}
+	/*
+	 * Timers and warnings have been already set, let's just set the
+	 * default limits for this quota type
+	 */
+	defq->bhardlimit = be64_to_cpu(ddqp->d_blk_hardlimit);
+	defq->bsoftlimit = be64_to_cpu(ddqp->d_blk_softlimit);
+	defq->ihardlimit = be64_to_cpu(ddqp->d_ino_hardlimit);
+	defq->isoftlimit = be64_to_cpu(ddqp->d_ino_softlimit);
+	defq->rtbhardlimit = be64_to_cpu(ddqp->d_rtb_hardlimit);
+	defq->rtbsoftlimit = be64_to_cpu(ddqp->d_rtb_softlimit);
+	xfs_qm_dqdestroy(dqp);
+}
+
+/* Initialize quota time limits from the root dquot. */
+static void
+xfs_qm_init_timelimits(
+	struct xfs_mount	*mp,
+	struct xfs_quotainfo	*qinf)
+{
+	struct xfs_disk_dquot	*ddqp;
+	struct xfs_dquot	*dqp;
+	uint			type;
+	int			error;
+
+	qinf->qi_btimelimit = XFS_QM_BTIMELIMIT;
+	qinf->qi_itimelimit = XFS_QM_ITIMELIMIT;
+	qinf->qi_rtbtimelimit = XFS_QM_RTBTIMELIMIT;
+	qinf->qi_bwarnlimit = XFS_QM_BWARNLIMIT;
+	qinf->qi_iwarnlimit = XFS_QM_IWARNLIMIT;
+	qinf->qi_rtbwarnlimit = XFS_QM_RTBWARNLIMIT;
+
+	/*
+	 * We try to get the limits from the superuser's limits fields.
+	 * This is quite hacky, but it is standard quota practice.
+	 *
+	 * Since we may not have done a quotacheck by this point, just read
+	 * the dquot without attaching it to any hashtables or lists.
+	 *
+	 * Timers and warnings are globally set by the first timer found in
+	 * user/group/proj quota types, otherwise a default value is used.
+	 * This should be split into different fields per quota type.
+	 */
+	if (XFS_IS_UQUOTA_RUNNING(mp))
+		type = XFS_DQ_USER;
+	else if (XFS_IS_GQUOTA_RUNNING(mp))
+		type = XFS_DQ_GROUP;
+	else
+		type = XFS_DQ_PROJ;
+	error = xfs_qm_dqget_uncached(mp, 0, type, &dqp);
+	if (error)
+		return;
+
+	ddqp = &dqp->q_core;
+	/*
+	 * The warnings and timers set the grace period given to
+	 * a user or group before he or she can not perform any
+	 * more writing. If it is zero, a default is used.
+	 */
+	qinf->qi_btimelimit = ddqp->d_btimer ?
+		be32_to_cpu(ddqp->d_btimer) : XFS_QM_BTIMELIMIT;
+	qinf->qi_itimelimit = ddqp->d_itimer ?
+		be32_to_cpu(ddqp->d_itimer) : XFS_QM_ITIMELIMIT;
+	qinf->qi_rtbtimelimit = ddqp->d_rtbtimer ?
+		be32_to_cpu(ddqp->d_rtbtimer) : XFS_QM_RTBTIMELIMIT;
+	qinf->qi_bwarnlimit = ddqp->d_bwarns ?
+		be16_to_cpu(ddqp->d_bwarns) : XFS_QM_BWARNLIMIT;
+	qinf->qi_iwarnlimit = ddqp->d_iwarns ?
+		be16_to_cpu(ddqp->d_iwarns) : XFS_QM_IWARNLIMIT;
+	qinf->qi_rtbwarnlimit = ddqp->d_rtbwarns ?
+		be16_to_cpu(ddqp->d_rtbwarns) : XFS_QM_RTBWARNLIMIT;
+	xfs_qm_dqdestroy(dqp);
 }
 
 /*
@@ -592,8 +653,6 @@ xfs_qm_init_quotainfo(
 	struct xfs_mount	*mp)
 {
 	struct xfs_quotainfo	*qinf;
-	struct xfs_dquot	*dqp;
-	uint			type;
 	int			error;
 
 	ASSERT(XFS_IS_QUOTA_RUNNING(mp));
@@ -626,53 +685,7 @@ xfs_qm_init_quotainfo(
 
 	mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_CHKD);
 
-	/*
-	 * We try to get the limits from the superuser's limits fields.
-	 * This is quite hacky, but it is standard quota practice.
-	 *
-	 * Since we may not have done a quotacheck by this point, just read
-	 * the dquot without attaching it to any hashtables or lists.
-	 *
-	 * Timers and warnings are globally set by the first timer found in
-	 * user/group/proj quota types, otherwise a default value is used.
-	 * This should be split into different fields per quota type.
-	 */
-	if (XFS_IS_UQUOTA_RUNNING(mp))
-		type = XFS_DQ_USER;
-	else if (XFS_IS_GQUOTA_RUNNING(mp))
-		type = XFS_DQ_GROUP;
-	else
-		type = XFS_DQ_PROJ;
-	error = xfs_qm_dqget_uncached(mp, 0, type, &dqp);
-	if (!error) {
-		xfs_disk_dquot_t	*ddqp = &dqp->q_core;
-
-		/*
-		 * The warnings and timers set the grace period given to
-		 * a user or group before he or she can not perform any
-		 * more writing. If it is zero, a default is used.
-		 */
-		qinf->qi_btimelimit = ddqp->d_btimer ?
-			be32_to_cpu(ddqp->d_btimer) : XFS_QM_BTIMELIMIT;
-		qinf->qi_itimelimit = ddqp->d_itimer ?
-			be32_to_cpu(ddqp->d_itimer) : XFS_QM_ITIMELIMIT;
-		qinf->qi_rtbtimelimit = ddqp->d_rtbtimer ?
-			be32_to_cpu(ddqp->d_rtbtimer) : XFS_QM_RTBTIMELIMIT;
-		qinf->qi_bwarnlimit = ddqp->d_bwarns ?
-			be16_to_cpu(ddqp->d_bwarns) : XFS_QM_BWARNLIMIT;
-		qinf->qi_iwarnlimit = ddqp->d_iwarns ?
-			be16_to_cpu(ddqp->d_iwarns) : XFS_QM_IWARNLIMIT;
-		qinf->qi_rtbwarnlimit = ddqp->d_rtbwarns ?
-			be16_to_cpu(ddqp->d_rtbwarns) : XFS_QM_RTBWARNLIMIT;
-		xfs_qm_dqdestroy(dqp);
-	} else {
-		qinf->qi_btimelimit = XFS_QM_BTIMELIMIT;
-		qinf->qi_itimelimit = XFS_QM_ITIMELIMIT;
-		qinf->qi_rtbtimelimit = XFS_QM_RTBTIMELIMIT;
-		qinf->qi_bwarnlimit = XFS_QM_BWARNLIMIT;
-		qinf->qi_iwarnlimit = XFS_QM_IWARNLIMIT;
-		qinf->qi_rtbwarnlimit = XFS_QM_RTBWARNLIMIT;
-	}
+	xfs_qm_init_timelimits(mp, qinf);
 
 	if (XFS_IS_UQUOTA_RUNNING(mp))
 		xfs_qm_set_defquota(mp, XFS_DQ_USER, qinf);

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/12] xfs: bmap debugging should never panic the system
  2018-05-03 18:05 ` [PATCH 01/12] xfs: bmap debugging should never panic the system Darrick J. Wong
  2018-05-07 15:06   ` Christoph Hellwig
@ 2018-05-10 13:53   ` Brian Foster
  2018-05-10 15:45     ` Darrick J. Wong
  1 sibling, 1 reply; 27+ messages in thread
From: Brian Foster @ 2018-05-10 13:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:05:59AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Don't panic() the system if the bmap records are garbage, just call
> ASSERT which gives us the same backtrace but enables developers to
> control if the system goes down or not.  This makes debugging with
> generic/388 much easier because it won't reboot the machine midway
> through a run just because btree_read_bufl returns EIO when the fs has
> already shut down.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_bmap.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 6a7c2f03ea11..393b4e2c63ad 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -312,8 +312,10 @@ xfs_check_block(
>  				xfs_warn(mp, "%s: thispa(%d) == pp(%d) %Ld",
>  					__func__, j, i,
>  					(unsigned long long)be64_to_cpu(*thispa));
> -				panic("%s: ptrs are equal in node\n",
> +				xfs_err(mp, "%s: ptrs are equal in node\n",
>  					__func__);
> +				ASSERT(XFS_FORCED_SHUTDOWN(mp));
> +				xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);

Why the assert if we're just going to shutdown? At minimum, it looks a
bit funny to assert that the fs is shutdown just prior to issuing a
shutdown. Can we just shutdown and let the error level dictate whether
we want a backtrace?

Brian

>  			}
>  		}
>  	}
> @@ -483,7 +485,9 @@ xfs_bmap_check_leaf_extents(
>  error_norelse:
>  	xfs_warn(mp, "%s: BAD after btree leaves for %d extents",
>  		__func__, i);
> -	panic("%s: CORRUPTED BTREE OR SOMETHING", __func__);
> +	xfs_err(mp, "%s: CORRUPTED BTREE OR SOMETHING", __func__);
> +	ASSERT(XFS_FORCED_SHUTDOWN(mp));
> +	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
>  	return;
>  }
>  
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/12] xfs: add missing rmap error return
  2018-05-03 18:06 ` [PATCH 02/12] xfs: add missing rmap error return Darrick J. Wong
  2018-05-07 15:06   ` Christoph Hellwig
@ 2018-05-10 13:53   ` Brian Foster
  1 sibling, 0 replies; 27+ messages in thread
From: Brian Foster @ 2018-05-10 13:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:06:06AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> xfs_rmap_lookup_le_range can return errors, so we need to check for
> them and bail out.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_rmap.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> index fba8d2718017..f7769edf5b68 100644
> --- a/fs/xfs/libxfs/xfs_rmap.c
> +++ b/fs/xfs/libxfs/xfs_rmap.c
> @@ -1374,6 +1374,8 @@ xfs_rmap_convert_shared(
>  	 */
>  	error = xfs_rmap_lookup_le_range(cur, bno, owner, offset, flags,
>  			&PREV, &i);
> +	if (error)
> +		goto done;
>  	XFS_WANT_CORRUPTED_GOTO(mp, i == 1, done);
>  
>  	ASSERT(PREV.rm_offset <= offset);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 03/12] xfs: skip scrub xref if corruption already noted
  2018-05-03 18:06 ` [PATCH 03/12] xfs: skip scrub xref if corruption already noted Darrick J. Wong
@ 2018-05-10 13:53   ` Brian Foster
  0 siblings, 0 replies; 27+ messages in thread
From: Brian Foster @ 2018-05-10 13:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:06:13AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Don't bother looking for cross-referencing problems if the metadata is
> already corrupt or we've already found a cross-referencing problem.
> Since we added a helper function for flags testing, convert existing
> users to use it.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/scrub/alloc.c    |    4 ++--
>  fs/xfs/scrub/bmap.c     |    2 +-
>  fs/xfs/scrub/common.c   |    4 ++++
>  fs/xfs/scrub/common.h   |   10 ++++++++++
>  fs/xfs/scrub/ialloc.c   |    7 ++++---
>  fs/xfs/scrub/inode.c    |    5 ++++-
>  fs/xfs/scrub/refcount.c |    8 ++++----
>  fs/xfs/scrub/rmap.c     |    6 +++---
>  fs/xfs/scrub/rtbitmap.c |    3 +++
>  fs/xfs/scrub/scrub.c    |    4 ++--
>  10 files changed, 37 insertions(+), 16 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/alloc.c b/fs/xfs/scrub/alloc.c
> index 517c079d3f68..941a0a55224e 100644
> --- a/fs/xfs/scrub/alloc.c
> +++ b/fs/xfs/scrub/alloc.c
> @@ -70,7 +70,7 @@ xfs_scrub_allocbt_xref_other(
>  		pcur = &sc->sa.cnt_cur;
>  	else
>  		pcur = &sc->sa.bno_cur;
> -	if (!*pcur)
> +	if (!*pcur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	error = xfs_alloc_lookup_le(*pcur, agbno, len, &has_otherrec);
> @@ -172,7 +172,7 @@ xfs_scrub_xref_is_used_space(
>  	bool				is_freesp;
>  	int				error;
>  
> -	if (!sc->sa.bno_cur)
> +	if (!sc->sa.bno_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	error = xfs_alloc_has_record(sc->sa.bno_cur, agbno, len, &is_freesp);
> diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> index 639d14b51e90..fb91caf17652 100644
> --- a/fs/xfs/scrub/bmap.c
> +++ b/fs/xfs/scrub/bmap.c
> @@ -175,7 +175,7 @@ xfs_scrub_bmap_xref_rmap(
>  	unsigned long long		rmap_end;
>  	uint64_t			owner;
>  
> -	if (!info->sc->sa.rmap_cur)
> +	if (!info->sc->sa.rmap_cur || xfs_scrub_skip_xref(info->sc->sm))
>  		return;
>  
>  	if (info->whichfork == XFS_COW_FORK)
> diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> index 8ed91d5c868d..e9ca5f0802d5 100644
> --- a/fs/xfs/scrub/common.c
> +++ b/fs/xfs/scrub/common.c
> @@ -727,6 +727,10 @@ xfs_scrub_should_check_xref(
>  	int				*error,
>  	struct xfs_btree_cur		**curpp)
>  {
> +	/* No point in xref if we already know we're corrupt. */
> +	if (xfs_scrub_skip_xref(sc->sm))
> +		return false;
> +
>  	if (*error == 0)
>  		return true;
>  
> diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
> index deaf60400981..80fa5a67c265 100644
> --- a/fs/xfs/scrub/common.h
> +++ b/fs/xfs/scrub/common.h
> @@ -157,4 +157,14 @@ int xfs_scrub_setup_inode_contents(struct xfs_scrub_context *sc,
>  				   struct xfs_inode *ip, unsigned int resblks);
>  void xfs_scrub_buffer_recheck(struct xfs_scrub_context *sc, struct xfs_buf *bp);
>  
> +/*
> + * Don't bother cross-referencing if we already found corruption or cross
> + * referencing discrepancies.
> + */
> +static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
> +{
> +	return sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
> +			       XFS_SCRUB_OFLAG_XCORRUPT);
> +}
> +
>  #endif	/* __XFS_SCRUB_COMMON_H__ */
> diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> index 106ca4bd753f..00a834d3b56d 100644
> --- a/fs/xfs/scrub/ialloc.c
> +++ b/fs/xfs/scrub/ialloc.c
> @@ -387,7 +387,8 @@ xfs_scrub_iallocbt_xref_rmap_btreeblks(
>  	int				error;
>  
>  	if (!sc->sa.ino_cur || !sc->sa.rmap_cur ||
> -	    (xfs_sb_version_hasfinobt(&sc->mp->m_sb) && !sc->sa.fino_cur))
> +	    (xfs_sb_version_hasfinobt(&sc->mp->m_sb) && !sc->sa.fino_cur) ||
> +	    xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	/* Check that we saw as many inobt blocks as the rmap says. */
> @@ -424,7 +425,7 @@ xfs_scrub_iallocbt_xref_rmap_inodes(
>  	xfs_filblks_t			blocks;
>  	int				error;
>  
> -	if (!sc->sa.rmap_cur)
> +	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	/* Check that we saw as many inode blocks as the rmap knows about. */
> @@ -496,7 +497,7 @@ xfs_scrub_xref_inode_check(
>  	bool				has_inodes;
>  	int				error;
>  
> -	if (!(*icur))
> +	if (!(*icur) || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	error = xfs_ialloc_has_inodes_at_extent(*icur, agbno, len, &has_inodes);
> diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c
> index df14930e4fc5..8532fc7be838 100644
> --- a/fs/xfs/scrub/inode.c
> +++ b/fs/xfs/scrub/inode.c
> @@ -449,7 +449,7 @@ xfs_scrub_inode_xref_finobt(
>  	int				has_record;
>  	int				error;
>  
> -	if (!sc->sa.fino_cur)
> +	if (!sc->sa.fino_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	agino = XFS_INO_TO_AGINO(sc->mp, ino);
> @@ -492,6 +492,9 @@ xfs_scrub_inode_xref_bmap(
>  	xfs_filblks_t			acount;
>  	int				error;
>  
> +	if (xfs_scrub_skip_xref(sc->sm))
> +		return;
> +
>  	/* Walk all the extents to check nextents/naextents/nblocks. */
>  	error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_DATA_FORK,
>  			&nextents, &count);
> diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c
> index 400f1561cd3d..913b0a190e68 100644
> --- a/fs/xfs/scrub/refcount.c
> +++ b/fs/xfs/scrub/refcount.c
> @@ -310,7 +310,7 @@ xfs_scrub_refcountbt_xref_rmap(
>  	struct xfs_scrub_refcnt_frag	*n;
>  	int				error;
>  
> -	if (!sc->sa.rmap_cur)
> +	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	/* Cross-reference with the rmapbt to confirm the refcount. */
> @@ -404,7 +404,7 @@ xfs_scrub_refcount_xref_rmap(
>  	xfs_filblks_t			blocks;
>  	int				error;
>  
> -	if (!sc->sa.rmap_cur)
> +	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	/* Check that we saw as many refcbt blocks as the rmap knows about. */
> @@ -460,7 +460,7 @@ xfs_scrub_xref_is_cow_staging(
>  	int				has_refcount;
>  	int				error;
>  
> -	if (!sc->sa.refc_cur)
> +	if (!sc->sa.refc_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	/* Find the CoW staging extent. */
> @@ -504,7 +504,7 @@ xfs_scrub_xref_is_not_shared(
>  	bool				shared;
>  	int				error;
>  
> -	if (!sc->sa.refc_cur)
> +	if (!sc->sa.refc_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	error = xfs_refcount_has_record(sc->sa.refc_cur, agbno, len, &shared);
> diff --git a/fs/xfs/scrub/rmap.c b/fs/xfs/scrub/rmap.c
> index 8f2a7c3ff455..b376a9a77c04 100644
> --- a/fs/xfs/scrub/rmap.c
> +++ b/fs/xfs/scrub/rmap.c
> @@ -66,7 +66,7 @@ xfs_scrub_rmapbt_xref_refc(
>  	bool				is_unwritten;
>  	int				error;
>  
> -	if (!sc->sa.refc_cur)
> +	if (!sc->sa.refc_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	non_inode = XFS_RMAP_NON_INODE_OWNER(irec->rm_owner);
> @@ -207,7 +207,7 @@ xfs_scrub_xref_check_owner(
>  	bool				has_rmap;
>  	int				error;
>  
> -	if (!sc->sa.rmap_cur)
> +	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	error = xfs_rmap_record_exists(sc->sa.rmap_cur, bno, len, oinfo,
> @@ -250,7 +250,7 @@ xfs_scrub_xref_has_no_owner(
>  	bool				has_rmap;
>  	int				error;
>  
> -	if (!sc->sa.rmap_cur)
> +	if (!sc->sa.rmap_cur || xfs_scrub_skip_xref(sc->sm))
>  		return;
>  
>  	error = xfs_rmap_has_record(sc->sa.rmap_cur, bno, len, &has_rmap);
> diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c
> index 39c41dfe08ee..8b048f107af2 100644
> --- a/fs/xfs/scrub/rtbitmap.c
> +++ b/fs/xfs/scrub/rtbitmap.c
> @@ -110,6 +110,9 @@ xfs_scrub_xref_is_used_rt_space(
>  	bool				is_free;
>  	int				error;
>  
> +	if (xfs_scrub_skip_xref(sc->sm))
> +		return;
> +
>  	xfs_ilock(sc->mp->m_rbmip, XFS_ILOCK_SHARED | XFS_ILOCK_RTBITMAP);
>  	error = xfs_rtalloc_extent_is_free(sc->mp, sc->tp, fsbno, len,
>  			&is_free);
> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> index 26c75967a072..d9ecbb6ad4a3 100644
> --- a/fs/xfs/scrub/scrub.c
> +++ b/fs/xfs/scrub/scrub.c
> @@ -446,8 +446,8 @@ xfs_scrub_metadata(
>  	} else if (error)
>  		goto out_teardown;
>  
> -	if (sc.sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
> -			       XFS_SCRUB_OFLAG_XCORRUPT))
> +	if (sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
> +			    XFS_SCRUB_OFLAG_XCORRUPT))
>  		xfs_alert_ratelimited(mp, "Corruption detected during scrub.");
>  
>  out_teardown:
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 04/12] xfs: don't continue scrub if already corrupt
  2018-05-03 18:06 ` [PATCH 04/12] xfs: don't continue scrub if already corrupt Darrick J. Wong
@ 2018-05-10 13:53   ` Brian Foster
  2018-05-10 15:54     ` Darrick J. Wong
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2018-05-10 13:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:06:20AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If we've already decided that something is corrupt, we might as well
> abort all the loops and exit as quickly as possible.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Just a few nits..

>  fs/xfs/scrub/attr.c   |    3 ++-
>  fs/xfs/scrub/bmap.c   |    3 ++-
>  fs/xfs/scrub/dir.c    |   34 +++++++++++++++++++++++++---------
>  fs/xfs/scrub/parent.c |    3 +++
>  4 files changed, 32 insertions(+), 11 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
> index 127575f0abfb..84b6d6b66578 100644
> --- a/fs/xfs/scrub/attr.c
> +++ b/fs/xfs/scrub/attr.c
> @@ -126,8 +126,9 @@ xfs_scrub_xattr_listent(
>  	if (args.valuelen != valuelen)
>  		xfs_scrub_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK,
>  					     args.blkno);
> -
>  fail_xref:
> +	if (sx->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> +		context->seen_enough = 1;
>  	return;
>  }
>  
> diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> index fb91caf17652..c28ff2ec201a 100644
> --- a/fs/xfs/scrub/bmap.c
> +++ b/fs/xfs/scrub/bmap.c
> @@ -684,7 +684,8 @@ xfs_scrub_bmap(
>  	info.lastoff = 0;
>  	ifp = XFS_IFORK_PTR(ip, whichfork);
>  	for_each_xfs_iext(ifp, &icur, &irec) {
> -		if (xfs_scrub_should_terminate(sc, &error))
> +		if (xfs_scrub_should_terminate(sc, &error) ||
> +		    (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
>  			break;
>  		if (isnullstartblock(irec.br_startblock))
>  			continue;
> diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c
> index 38f29806eb54..fb56a885f443 100644
> --- a/fs/xfs/scrub/dir.c
> +++ b/fs/xfs/scrub/dir.c
> @@ -172,7 +172,7 @@ xfs_scrub_dir_actor(
>  	error = xfs_dir_lookup(sdc->sc->tp, ip, &xname, &lookup_ino, NULL);
>  	if (!xfs_scrub_fblock_process_error(sdc->sc, XFS_DATA_FORK, offset,
>  			&error))
> -		goto fail_xref;
> +		goto out;
>  	if (lookup_ino != ino) {
>  		xfs_scrub_fblock_set_corrupt(sdc->sc, XFS_DATA_FORK, offset);
>  		goto out;
> @@ -183,8 +183,13 @@ xfs_scrub_dir_actor(
>  	if (error)
>  		goto out;
>  out:
> -	return error;
> -fail_xref:
> +	/*
> +	 * A negative error code returned here is supposed to cause the
> +	 * dir_emit caller (xfs_readdir) to abort the directory iteration
> +	 * and return zero to xfs_scrub_dir.

s/xfs_scrub_dir/xfs_scrub_directory/ ?

> +	 */
> +	if (error == 0 && sdc->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> +		return -EFSCORRUPTED;
>  	return error;
>  }
>  
> @@ -240,6 +245,9 @@ xfs_scrub_dir_rec(
>  	}
>  	xfs_scrub_buffer_recheck(ds->sc, bp);
>  
> +	if (ds->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> +		goto out_relse;
> +
>  	dent = (struct xfs_dir2_data_entry *)(((char *)bp->b_addr) + off);
>  
>  	/* Make sure we got a real directory entry. */
> @@ -357,6 +365,9 @@ xfs_scrub_directory_data_bestfree(
>  
>  	/* XXX: Check xfs_dir3_data_hdr.pad is zero once we start setting it. */
>  
> +	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> +		goto out_buf;
> +
>  	/* Do the bestfrees correspond to actual free space? */
>  	bf = d_ops->data_bestfree_p(bp->b_addr);
>  	smallest_bestfree = UINT_MAX;
> @@ -394,7 +405,7 @@ xfs_scrub_directory_data_bestfree(
>  	endptr = xfs_dir3_data_endp(mp->m_dir_geo, bp->b_addr);
>  
>  	/* Iterate the entries, stopping when we hit or go past the end. */
> -	while (ptr < endptr) {
> +	while (ptr < endptr && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {

This seems like slightly different behavior than all of the other
corruption checks in the loop. Perhaps this should go after the
xfs_scrub_directory_check_free_entry() call?

>  		dup = (struct xfs_dir2_data_unused *)ptr;
>  		/* Skip real entries */
>  		if (dup->freetag != cpu_to_be16(XFS_DIR2_DATA_FREE_TAG)) {
> @@ -413,8 +424,10 @@ xfs_scrub_directory_data_bestfree(
>  
>  		/* Spot check this free entry */
>  		tag = be16_to_cpu(*xfs_dir2_data_unused_tag_p(dup));
> -		if (tag != ((char *)dup - (char *)bp->b_addr))
> +		if (tag != ((char *)dup - (char *)bp->b_addr)) {
>  			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk);
> +			goto out_buf;
> +		}
>  
>  		/*
>  		 * Either this entry is a bestfree or it's smaller than
> @@ -549,6 +562,8 @@ xfs_scrub_directory_leaf1_bestfree(
>  
>  	/* Check all the bestfree entries. */
>  	for (i = 0; i < bestcount; i++, bestp++) {
> +		if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> +			break;

This seems similarly misplaced. If somebody later adds some additional
checks before the loop, this all of a sudden doesn't exactly do the same
thing. Perhaps check after the preceding hash checks and then again in
the loop after the brelse?

>  		best = be16_to_cpu(*bestp);
>  		if (best == NULLDATAOFF)
>  			continue;
> @@ -556,9 +571,10 @@ xfs_scrub_directory_leaf1_bestfree(
>  				i * args->geo->fsbcount, -1, &dbp);
>  		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, lblk,
>  				&error))
> -			continue;
> +			break;
>  		xfs_scrub_directory_check_freesp(sc, lblk, dbp, best);
>  		xfs_trans_brelse(sc->tp, dbp);
> +
>  	}
>  out:
>  	return error;
> @@ -607,7 +623,7 @@ xfs_scrub_directory_free_bestfree(
>  				-1, &dbp);
>  		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, lblk,
>  				&error))
> -			continue;
> +			break;
>  		xfs_scrub_directory_check_freesp(sc, lblk, dbp, best);
>  		xfs_trans_brelse(sc->tp, dbp);
>  	}
> @@ -656,7 +672,7 @@ xfs_scrub_directory_blocks(
>  
>  	/* Iterate all the data extents in the directory... */
>  	found = xfs_iext_lookup_extent(sc->ip, ifp, lblk, &icur, &got);
> -	while (found) {
> +	while (found && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {
>  		/* Block directories only have a single block at offset 0. */
>  		if (is_block &&
>  		    (got.br_startoff > 0 ||
> @@ -719,7 +735,7 @@ xfs_scrub_directory_blocks(
>  	/* Scan for free blocks */
>  	lblk = free_lblk;
>  	found = xfs_iext_lookup_extent(sc->ip, ifp, lblk, &icur, &got);
> -	while (found) {
> +	while (found && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {
>  		/*
>  		 * Dirs can't have blocks mapped above 2^32.
>  		 * Single-block dirs shouldn't even be here.
> diff --git a/fs/xfs/scrub/parent.c b/fs/xfs/scrub/parent.c
> index 1fb88c18d455..95fdd0d9e65d 100644
> --- a/fs/xfs/scrub/parent.c
> +++ b/fs/xfs/scrub/parent.c
> @@ -147,6 +147,9 @@ xfs_scrub_parent_validate(
>  
>  	*try_again = false;
>  
> +	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> +		return 0;
> +

Technically the same resulting logic, but the return 0 here vs. the goto
out (that also returns 0) in the subsequent check stand out a bit as
inconsistent.

Brian

>  	/* '..' must not point to ourselves. */
>  	if (sc->ip->i_ino == dnum) {
>  		xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, 0);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 06/12] xfs: quota scrub should use bmapbtd scrubber
  2018-05-03 18:06 ` [PATCH 06/12] xfs: quota scrub should use bmapbtd scrubber Darrick J. Wong
@ 2018-05-10 13:54   ` Brian Foster
  2018-05-10 15:34     ` Darrick J. Wong
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2018-05-10 13:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:06:32AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Replace the quota scrubber's open-coded data fork scrubber with a
> redirected call to the bmapbtd scrubber.  This strengthens the quota
> scrub to include all the cross-referencing that it does.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

FYI, I start to get patch application errors on the previous patch, this
patch, and couple or so of the subsequent... (against current for-next).

Brian

>  fs/xfs/scrub/common.c |   57 ++++++++++++++++++++++++++++++++++
>  fs/xfs/scrub/common.h |    2 +
>  fs/xfs/scrub/quota.c  |   83 ++++++++++++++++++++++++-------------------------
>  3 files changed, 99 insertions(+), 43 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> index e9ca5f0802d5..335c7a305e15 100644
> --- a/fs/xfs/scrub/common.c
> +++ b/fs/xfs/scrub/common.c
> @@ -44,6 +44,8 @@
>  #include "xfs_rmap_btree.h"
>  #include "xfs_log.h"
>  #include "xfs_trans_priv.h"
> +#include "xfs_attr.h"
> +#include "xfs_reflink.h"
>  #include "scrub/xfs_scrub.h"
>  #include "scrub/scrub.h"
>  #include "scrub/common.h"
> @@ -777,3 +779,58 @@ xfs_scrub_buffer_recheck(
>  	sc->sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT;
>  	trace_xfs_scrub_block_error(sc, bp->b_bn, fa);
>  }
> +
> +/*
> + * Scrub the attr/data forks of a metadata inode.  The metadata inode must be
> + * pointed to by sc->ip and the ILOCK must be held.
> + */
> +int
> +xfs_scrub_metadata_inode_forks(
> +	struct xfs_scrub_context	*sc)
> +{
> +	__u32				smtype;
> +	bool				shared;
> +	int				error;
> +
> +	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> +		return 0;
> +
> +	/* Metadata inodes don't live on the rt device. */
> +	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME) {
> +		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> +		return 0;
> +	}
> +
> +	/* They should never participate in reflink. */
> +	if (xfs_is_reflink_inode(sc->ip)) {
> +		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> +		return 0;
> +	}
> +
> +	/* They also should never have extended attributes. */
> +	if (xfs_inode_hasattr(sc->ip)) {
> +		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> +		return 0;
> +	}
> +
> +	/* Invoke the data fork scrubber. */
> +	smtype = sc->sm->sm_type;
> +	sc->sm->sm_type = XFS_SCRUB_TYPE_BMBTD;
> +	error = xfs_scrub_bmap_data(sc);
> +	sc->sm->sm_type = smtype;
> +	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> +		return error;
> +
> +	/* Look for incorrect shared blocks. */
> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
> +		error = xfs_reflink_inode_has_shared_extents(sc->tp, sc->ip,
> +				&shared);
> +		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, 0,
> +				&error))
> +			return error;
> +		if (shared)
> +			xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> +	}
> +
> +	return error;
> +}
> diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
> index 80fa5a67c265..ad736c411287 100644
> --- a/fs/xfs/scrub/common.h
> +++ b/fs/xfs/scrub/common.h
> @@ -167,4 +167,6 @@ static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
>  			       XFS_SCRUB_OFLAG_XCORRUPT);
>  }
>  
> +int xfs_scrub_metadata_inode_forks(struct xfs_scrub_context *sc);
> +
>  #endif	/* __XFS_SCRUB_COMMON_H__ */
> diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
> index d3d08978f53a..15ae4d23d6ac 100644
> --- a/fs/xfs/scrub/quota.c
> +++ b/fs/xfs/scrub/quota.c
> @@ -206,65 +206,62 @@ xfs_scrub_quota_item(
>  	return 0;
>  }
>  
> -/* Scrub all of a quota type's items. */
> -int
> -xfs_scrub_quota(
> +/* Check the quota's data fork. */
> +STATIC int
> +xfs_scrub_quota_data_fork(
>  	struct xfs_scrub_context	*sc)
>  {
>  	struct xfs_bmbt_irec		irec = { 0 };
> -	struct xfs_scrub_quota_info	sqi;
> -	struct xfs_mount		*mp = sc->mp;
> -	struct xfs_quotainfo		*qi = mp->m_quotainfo;
> +	struct xfs_iext_cursor		icur;
> +	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
> +	struct xfs_ifork		*ifp;
>  	xfs_fileoff_t			max_dqid_off;
> -	xfs_fileoff_t			off = 0;
> -	uint				dqtype;
> -	int				nimaps;
>  	int				error = 0;
>  
> -	dqtype = xfs_scrub_quota_to_dqtype(sc);
> +	/* Invoke the fork scrubber. */
> +	error = xfs_scrub_metadata_inode_forks(sc);
> +	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> +		return error;
>  
> -	/* Look for problem extents. */
> -	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME) {
> -		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> -		goto out;
> -	}
> +	/* Check for data fork problems that apply only to quota files. */
>  	max_dqid_off = ((xfs_dqid_t)-1) / qi->qi_dqperchunk;
> -	while (1) {
> +	ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
> +	for_each_xfs_iext(ifp, &icur, &irec) {
>  		if (xfs_scrub_should_terminate(sc, &error))
>  			break;
> -
> -		off = irec.br_startoff + irec.br_blockcount;
> -		nimaps = 1;
> -		error = xfs_bmapi_read(sc->ip, off, -1, &irec, &nimaps,
> -				XFS_BMAPI_ENTIRE);
> -		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, off,
> -				&error))
> -			goto out;
> -		if (!nimaps)
> -			break;
> -		if (irec.br_startblock == HOLESTARTBLOCK)
> -			continue;
> -
> -		/* Check the extent record doesn't point to crap. */
> -		if (irec.br_startblock + irec.br_blockcount <=
> -		    irec.br_startblock)
> -			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
> -					irec.br_startoff);
> -		if (!xfs_verify_fsbno(mp, irec.br_startblock) ||
> -		    !xfs_verify_fsbno(mp, irec.br_startblock +
> -					irec.br_blockcount - 1))
> -			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
> -					irec.br_startoff);
> -
>  		/*
> -		 * Unwritten extents or blocks mapped above the highest
> +		 * delalloc extents or blocks mapped above the highest
>  		 * quota id shouldn't happen.
>  		 */
>  		if (isnullstartblock(irec.br_startblock) ||
>  		    irec.br_startoff > max_dqid_off ||
> -		    irec.br_startoff + irec.br_blockcount > max_dqid_off + 1)
> -			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, off);
> +		    irec.br_startoff + irec.br_blockcount - 1 > max_dqid_off) {
> +			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
> +					irec.br_startoff);
> +			break;
> +		}
>  	}
> +
> +	return error;
> +}
> +
> +/* Scrub all of a quota type's items. */
> +int
> +xfs_scrub_quota(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_scrub_quota_info	sqi;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_quotainfo		*qi = mp->m_quotainfo;
> +	uint				dqtype;
> +	int				error = 0;
> +
> +	dqtype = xfs_scrub_quota_to_dqtype(sc);
> +
> +	/* Look for problem extents. */
> +	error = xfs_scrub_quota_data_fork(sc);
> +	if (error)
> +		goto out;
>  	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
>  		goto out;
>  
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 07/12] xfs: scrub the data fork of the realtime inodes
  2018-05-03 18:06 ` [PATCH 07/12] xfs: scrub the data fork of the realtime inodes Darrick J. Wong
@ 2018-05-10 13:54   ` Brian Foster
  2018-05-10 16:07     ` Darrick J. Wong
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2018-05-10 13:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:06:38AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The realtime bitmap and summary inodes live on the metadata device, so
> we can scrub their data forks with the regular scrubbers.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/rtbitmap.c |   26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c
> index 8b048f107af2..b2b4dff2660d 100644
> --- a/fs/xfs/scrub/rtbitmap.c
> +++ b/fs/xfs/scrub/rtbitmap.c
> @@ -82,6 +82,11 @@ xfs_scrub_rtbitmap(
>  {
>  	int				error;
>  
> +	/* Invoke the fork scrubber. */
> +	error = xfs_scrub_metadata_inode_forks(sc);
> +	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> +		return error;
> +
>  	error = xfs_rtalloc_query_all(sc->tp, xfs_scrub_rtbitmap_rec, sc);
>  	if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, 0, &error))
>  		goto out;
> @@ -95,8 +100,27 @@ int
>  xfs_scrub_rtsummary(
>  	struct xfs_scrub_context	*sc)
>  {
> +	struct xfs_inode		*rsumip = sc->mp->m_rsumip;
> +	int				error = 0;
> +
> +	/*
> +	 * We ILOCK'd the rt bitmap ip in the setup routine, now lock the
> +	 * rt summary ip in compliance with the rt inode locking rules.
> +	 */
> +	xfs_ilock(rsumip, XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM);
> +
> +	/* Invoke the fork scrubber. */
> +	sc->ip = rsumip;

I guess the ->ilock_flags are technically inconsistent at this point.
Does that matter? If not, couldn't hurt to document that in the comment.
;P

I also think this would be a bit easier to follow if we just stored and
restored an old_ip pointer on the stack (and perhaps assert that it
matches ->m_rbmip).

Brian

> +	error = xfs_scrub_metadata_inode_forks(sc);
> +	sc->ip = sc->mp->m_rbmip;
> +	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> +		goto out;
> +
>  	/* XXX: implement this some day */
> -	return -ENOENT;
> +	xfs_scrub_set_incomplete(sc);
> +out:
> +	xfs_iunlock(rsumip, XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM);
> +	return error;
>  }
>  
>  
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 08/12] xfs: superblock scrub should use short-lived buffers
  2018-05-03 18:06 ` [PATCH 08/12] xfs: superblock scrub should use short-lived buffers Darrick J. Wong
@ 2018-05-10 13:55   ` Brian Foster
  0 siblings, 0 replies; 27+ messages in thread
From: Brian Foster @ 2018-05-10 13:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, May 03, 2018 at 11:06:44AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Secondary superblocks are rarely used, so create a helper to read a
> given non-primary AG's superblock and ensure that it won't stick around
> hogging memory.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_sb.c     |   22 ++++++++++++++++++++++
>  fs/xfs/libxfs/xfs_sb.h     |    3 +++
>  fs/xfs/libxfs/xfs_shared.h |    1 +
>  fs/xfs/scrub/agheader.c    |    4 +---
>  4 files changed, 27 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index d9b94bd5f689..d9bef41a3f26 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -972,3 +972,25 @@ xfs_fs_geometry(
>  
>  	return 0;
>  }
> +
> +/* Read a secondary superblock. */
> +int
> +xfs_sb_read_secondary(
> +	struct xfs_mount	*mp,
> +	struct xfs_trans	*tp,
> +	xfs_agnumber_t		agno,
> +	struct xfs_buf		**bpp)
> +{
> +	struct xfs_buf		*bp;
> +	int			error;
> +
> +	ASSERT(agno != 0 && agno != NULLAGNUMBER);
> +	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> +			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
> +			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
> +	if (error)
> +		return error;
> +	xfs_buf_set_ref(bp, XFS_SSB_REF);
> +	*bpp = bp;
> +	return 0;
> +}
> diff --git a/fs/xfs/libxfs/xfs_sb.h b/fs/xfs/libxfs/xfs_sb.h
> index 63dcd2a1a657..5166d78b2c34 100644
> --- a/fs/xfs/libxfs/xfs_sb.h
> +++ b/fs/xfs/libxfs/xfs_sb.h
> @@ -37,5 +37,8 @@ extern void	xfs_sb_quota_from_disk(struct xfs_sb *sbp);
>  #define XFS_FS_GEOM_MAX_STRUCT_VER	(4)
>  extern int	xfs_fs_geometry(struct xfs_sb *sbp, struct xfs_fsop_geom *geo,
>  				int struct_version);
> +extern int	xfs_sb_read_secondary(struct xfs_mount *mp,
> +				struct xfs_trans *tp, xfs_agnumber_t agno,
> +				struct xfs_buf **bpp);
>  
>  #endif	/* __XFS_SB_H__ */
> diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
> index d0b84da0cb1e..d23f33c02f64 100644
> --- a/fs/xfs/libxfs/xfs_shared.h
> +++ b/fs/xfs/libxfs/xfs_shared.h
> @@ -127,6 +127,7 @@ void	xfs_log_get_max_trans_res(struct xfs_mount *mp,
>  #define	XFS_ATTR_BTREE_REF	1
>  #define	XFS_DQUOT_REF		1
>  #define	XFS_REFC_BTREE_REF	1
> +#define	XFS_SSB_REF		0
>  
>  /*
>   * Flags for xfs_trans_ichgtime().
> diff --git a/fs/xfs/scrub/agheader.c b/fs/xfs/scrub/agheader.c
> index 018aabbd9394..cd24ac56c21e 100644
> --- a/fs/xfs/scrub/agheader.c
> +++ b/fs/xfs/scrub/agheader.c
> @@ -157,9 +157,7 @@ xfs_scrub_superblock(
>  	if (agno == 0)
>  		return 0;
>  
> -	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> -		  XFS_AGB_TO_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
> -		  XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
> +	error = xfs_sb_read_secondary(mp, sc->tp, agno, &bp);
>  	/*
>  	 * The superblock verifier can return several different error codes
>  	 * if it thinks the superblock doesn't look right.  For a mount these
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 06/12] xfs: quota scrub should use bmapbtd scrubber
  2018-05-10 13:54   ` Brian Foster
@ 2018-05-10 15:34     ` Darrick J. Wong
  0 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-10 15:34 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, May 10, 2018 at 09:54:38AM -0400, Brian Foster wrote:
> On Thu, May 03, 2018 at 11:06:32AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Replace the quota scrubber's open-coded data fork scrubber with a
> > redirected call to the bmapbtd scrubber.  This strengthens the quota
> > scrub to include all the cross-referencing that it does.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> 
> FYI, I start to get patch application errors on the previous patch, this
> patch, and couple or so of the subsequent... (against current for-next).

Yeah, this series hasn't aged well. :/

I was actually about to repost the whole thing later today but I'll
incorporate your suggestions/rvb before I do that.

Also in general you're better off pulling the branch in the cover letter
because I get less aggressive about rebasing my scrub tree off for-next
particularly once I start collecting things for the next merge window.

--D

> Brian
> 
> >  fs/xfs/scrub/common.c |   57 ++++++++++++++++++++++++++++++++++
> >  fs/xfs/scrub/common.h |    2 +
> >  fs/xfs/scrub/quota.c  |   83 ++++++++++++++++++++++++-------------------------
> >  3 files changed, 99 insertions(+), 43 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> > index e9ca5f0802d5..335c7a305e15 100644
> > --- a/fs/xfs/scrub/common.c
> > +++ b/fs/xfs/scrub/common.c
> > @@ -44,6 +44,8 @@
> >  #include "xfs_rmap_btree.h"
> >  #include "xfs_log.h"
> >  #include "xfs_trans_priv.h"
> > +#include "xfs_attr.h"
> > +#include "xfs_reflink.h"
> >  #include "scrub/xfs_scrub.h"
> >  #include "scrub/scrub.h"
> >  #include "scrub/common.h"
> > @@ -777,3 +779,58 @@ xfs_scrub_buffer_recheck(
> >  	sc->sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT;
> >  	trace_xfs_scrub_block_error(sc, bp->b_bn, fa);
> >  }
> > +
> > +/*
> > + * Scrub the attr/data forks of a metadata inode.  The metadata inode must be
> > + * pointed to by sc->ip and the ILOCK must be held.
> > + */
> > +int
> > +xfs_scrub_metadata_inode_forks(
> > +	struct xfs_scrub_context	*sc)
> > +{
> > +	__u32				smtype;
> > +	bool				shared;
> > +	int				error;
> > +
> > +	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> > +		return 0;
> > +
> > +	/* Metadata inodes don't live on the rt device. */
> > +	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME) {
> > +		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> > +		return 0;
> > +	}
> > +
> > +	/* They should never participate in reflink. */
> > +	if (xfs_is_reflink_inode(sc->ip)) {
> > +		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> > +		return 0;
> > +	}
> > +
> > +	/* They also should never have extended attributes. */
> > +	if (xfs_inode_hasattr(sc->ip)) {
> > +		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> > +		return 0;
> > +	}
> > +
> > +	/* Invoke the data fork scrubber. */
> > +	smtype = sc->sm->sm_type;
> > +	sc->sm->sm_type = XFS_SCRUB_TYPE_BMBTD;
> > +	error = xfs_scrub_bmap_data(sc);
> > +	sc->sm->sm_type = smtype;
> > +	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> > +		return error;
> > +
> > +	/* Look for incorrect shared blocks. */
> > +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
> > +		error = xfs_reflink_inode_has_shared_extents(sc->tp, sc->ip,
> > +				&shared);
> > +		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, 0,
> > +				&error))
> > +			return error;
> > +		if (shared)
> > +			xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> > +	}
> > +
> > +	return error;
> > +}
> > diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
> > index 80fa5a67c265..ad736c411287 100644
> > --- a/fs/xfs/scrub/common.h
> > +++ b/fs/xfs/scrub/common.h
> > @@ -167,4 +167,6 @@ static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
> >  			       XFS_SCRUB_OFLAG_XCORRUPT);
> >  }
> >  
> > +int xfs_scrub_metadata_inode_forks(struct xfs_scrub_context *sc);
> > +
> >  #endif	/* __XFS_SCRUB_COMMON_H__ */
> > diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
> > index d3d08978f53a..15ae4d23d6ac 100644
> > --- a/fs/xfs/scrub/quota.c
> > +++ b/fs/xfs/scrub/quota.c
> > @@ -206,65 +206,62 @@ xfs_scrub_quota_item(
> >  	return 0;
> >  }
> >  
> > -/* Scrub all of a quota type's items. */
> > -int
> > -xfs_scrub_quota(
> > +/* Check the quota's data fork. */
> > +STATIC int
> > +xfs_scrub_quota_data_fork(
> >  	struct xfs_scrub_context	*sc)
> >  {
> >  	struct xfs_bmbt_irec		irec = { 0 };
> > -	struct xfs_scrub_quota_info	sqi;
> > -	struct xfs_mount		*mp = sc->mp;
> > -	struct xfs_quotainfo		*qi = mp->m_quotainfo;
> > +	struct xfs_iext_cursor		icur;
> > +	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
> > +	struct xfs_ifork		*ifp;
> >  	xfs_fileoff_t			max_dqid_off;
> > -	xfs_fileoff_t			off = 0;
> > -	uint				dqtype;
> > -	int				nimaps;
> >  	int				error = 0;
> >  
> > -	dqtype = xfs_scrub_quota_to_dqtype(sc);
> > +	/* Invoke the fork scrubber. */
> > +	error = xfs_scrub_metadata_inode_forks(sc);
> > +	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> > +		return error;
> >  
> > -	/* Look for problem extents. */
> > -	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME) {
> > -		xfs_scrub_ino_set_corrupt(sc, sc->ip->i_ino);
> > -		goto out;
> > -	}
> > +	/* Check for data fork problems that apply only to quota files. */
> >  	max_dqid_off = ((xfs_dqid_t)-1) / qi->qi_dqperchunk;
> > -	while (1) {
> > +	ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
> > +	for_each_xfs_iext(ifp, &icur, &irec) {
> >  		if (xfs_scrub_should_terminate(sc, &error))
> >  			break;
> > -
> > -		off = irec.br_startoff + irec.br_blockcount;
> > -		nimaps = 1;
> > -		error = xfs_bmapi_read(sc->ip, off, -1, &irec, &nimaps,
> > -				XFS_BMAPI_ENTIRE);
> > -		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, off,
> > -				&error))
> > -			goto out;
> > -		if (!nimaps)
> > -			break;
> > -		if (irec.br_startblock == HOLESTARTBLOCK)
> > -			continue;
> > -
> > -		/* Check the extent record doesn't point to crap. */
> > -		if (irec.br_startblock + irec.br_blockcount <=
> > -		    irec.br_startblock)
> > -			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
> > -					irec.br_startoff);
> > -		if (!xfs_verify_fsbno(mp, irec.br_startblock) ||
> > -		    !xfs_verify_fsbno(mp, irec.br_startblock +
> > -					irec.br_blockcount - 1))
> > -			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
> > -					irec.br_startoff);
> > -
> >  		/*
> > -		 * Unwritten extents or blocks mapped above the highest
> > +		 * delalloc extents or blocks mapped above the highest
> >  		 * quota id shouldn't happen.
> >  		 */
> >  		if (isnullstartblock(irec.br_startblock) ||
> >  		    irec.br_startoff > max_dqid_off ||
> > -		    irec.br_startoff + irec.br_blockcount > max_dqid_off + 1)
> > -			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, off);
> > +		    irec.br_startoff + irec.br_blockcount - 1 > max_dqid_off) {
> > +			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK,
> > +					irec.br_startoff);
> > +			break;
> > +		}
> >  	}
> > +
> > +	return error;
> > +}
> > +
> > +/* Scrub all of a quota type's items. */
> > +int
> > +xfs_scrub_quota(
> > +	struct xfs_scrub_context	*sc)
> > +{
> > +	struct xfs_scrub_quota_info	sqi;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	struct xfs_quotainfo		*qi = mp->m_quotainfo;
> > +	uint				dqtype;
> > +	int				error = 0;
> > +
> > +	dqtype = xfs_scrub_quota_to_dqtype(sc);
> > +
> > +	/* Look for problem extents. */
> > +	error = xfs_scrub_quota_data_fork(sc);
> > +	if (error)
> > +		goto out;
> >  	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> >  		goto out;
> >  
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/12] xfs: bmap debugging should never panic the system
  2018-05-10 13:53   ` Brian Foster
@ 2018-05-10 15:45     ` Darrick J. Wong
  0 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-10 15:45 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, May 10, 2018 at 09:53:19AM -0400, Brian Foster wrote:
> On Thu, May 03, 2018 at 11:05:59AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Don't panic() the system if the bmap records are garbage, just call
> > ASSERT which gives us the same backtrace but enables developers to
> > control if the system goes down or not.  This makes debugging with
> > generic/388 much easier because it won't reboot the machine midway
> > through a run just because btree_read_bufl returns EIO when the fs has
> > already shut down.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_bmap.c |    8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index 6a7c2f03ea11..393b4e2c63ad 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -312,8 +312,10 @@ xfs_check_block(
> >  				xfs_warn(mp, "%s: thispa(%d) == pp(%d) %Ld",
> >  					__func__, j, i,
> >  					(unsigned long long)be64_to_cpu(*thispa));
> > -				panic("%s: ptrs are equal in node\n",
> > +				xfs_err(mp, "%s: ptrs are equal in node\n",
> >  					__func__);
> > +				ASSERT(XFS_FORCED_SHUTDOWN(mp));
> > +				xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> 
> Why the assert if we're just going to shutdown? At minimum, it looks a
> bit funny to assert that the fs is shutdown just prior to issuing a
> shutdown. Can we just shutdown and let the error level dictate whether
> we want a backtrace?

Ok.

--D

> Brian
> 
> >  			}
> >  		}
> >  	}
> > @@ -483,7 +485,9 @@ xfs_bmap_check_leaf_extents(
> >  error_norelse:
> >  	xfs_warn(mp, "%s: BAD after btree leaves for %d extents",
> >  		__func__, i);
> > -	panic("%s: CORRUPTED BTREE OR SOMETHING", __func__);
> > +	xfs_err(mp, "%s: CORRUPTED BTREE OR SOMETHING", __func__);
> > +	ASSERT(XFS_FORCED_SHUTDOWN(mp));
> > +	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> >  	return;
> >  }
> >  
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 04/12] xfs: don't continue scrub if already corrupt
  2018-05-10 13:53   ` Brian Foster
@ 2018-05-10 15:54     ` Darrick J. Wong
  0 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-10 15:54 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, May 10, 2018 at 09:53:43AM -0400, Brian Foster wrote:
> On Thu, May 03, 2018 at 11:06:20AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > If we've already decided that something is corrupt, we might as well
> > abort all the loops and exit as quickly as possible.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> 
> Just a few nits..
> 
> >  fs/xfs/scrub/attr.c   |    3 ++-
> >  fs/xfs/scrub/bmap.c   |    3 ++-
> >  fs/xfs/scrub/dir.c    |   34 +++++++++++++++++++++++++---------
> >  fs/xfs/scrub/parent.c |    3 +++
> >  4 files changed, 32 insertions(+), 11 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
> > index 127575f0abfb..84b6d6b66578 100644
> > --- a/fs/xfs/scrub/attr.c
> > +++ b/fs/xfs/scrub/attr.c
> > @@ -126,8 +126,9 @@ xfs_scrub_xattr_listent(
> >  	if (args.valuelen != valuelen)
> >  		xfs_scrub_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK,
> >  					     args.blkno);
> > -
> >  fail_xref:
> > +	if (sx->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> > +		context->seen_enough = 1;
> >  	return;
> >  }
> >  
> > diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> > index fb91caf17652..c28ff2ec201a 100644
> > --- a/fs/xfs/scrub/bmap.c
> > +++ b/fs/xfs/scrub/bmap.c
> > @@ -684,7 +684,8 @@ xfs_scrub_bmap(
> >  	info.lastoff = 0;
> >  	ifp = XFS_IFORK_PTR(ip, whichfork);
> >  	for_each_xfs_iext(ifp, &icur, &irec) {
> > -		if (xfs_scrub_should_terminate(sc, &error))
> > +		if (xfs_scrub_should_terminate(sc, &error) ||
> > +		    (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> >  			break;
> >  		if (isnullstartblock(irec.br_startblock))
> >  			continue;
> > diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c
> > index 38f29806eb54..fb56a885f443 100644
> > --- a/fs/xfs/scrub/dir.c
> > +++ b/fs/xfs/scrub/dir.c
> > @@ -172,7 +172,7 @@ xfs_scrub_dir_actor(
> >  	error = xfs_dir_lookup(sdc->sc->tp, ip, &xname, &lookup_ino, NULL);
> >  	if (!xfs_scrub_fblock_process_error(sdc->sc, XFS_DATA_FORK, offset,
> >  			&error))
> > -		goto fail_xref;
> > +		goto out;
> >  	if (lookup_ino != ino) {
> >  		xfs_scrub_fblock_set_corrupt(sdc->sc, XFS_DATA_FORK, offset);
> >  		goto out;
> > @@ -183,8 +183,13 @@ xfs_scrub_dir_actor(
> >  	if (error)
> >  		goto out;
> >  out:
> > -	return error;
> > -fail_xref:
> > +	/*
> > +	 * A negative error code returned here is supposed to cause the
> > +	 * dir_emit caller (xfs_readdir) to abort the directory iteration
> > +	 * and return zero to xfs_scrub_dir.
> 
> s/xfs_scrub_dir/xfs_scrub_directory/ ?

Fixed.

> > +	 */
> > +	if (error == 0 && sdc->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> > +		return -EFSCORRUPTED;
> >  	return error;
> >  }
> >  
> > @@ -240,6 +245,9 @@ xfs_scrub_dir_rec(
> >  	}
> >  	xfs_scrub_buffer_recheck(ds->sc, bp);
> >  
> > +	if (ds->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> > +		goto out_relse;
> > +
> >  	dent = (struct xfs_dir2_data_entry *)(((char *)bp->b_addr) + off);
> >  
> >  	/* Make sure we got a real directory entry. */
> > @@ -357,6 +365,9 @@ xfs_scrub_directory_data_bestfree(
> >  
> >  	/* XXX: Check xfs_dir3_data_hdr.pad is zero once we start setting it. */
> >  
> > +	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> > +		goto out_buf;
> > +
> >  	/* Do the bestfrees correspond to actual free space? */
> >  	bf = d_ops->data_bestfree_p(bp->b_addr);
> >  	smallest_bestfree = UINT_MAX;
> > @@ -394,7 +405,7 @@ xfs_scrub_directory_data_bestfree(
> >  	endptr = xfs_dir3_data_endp(mp->m_dir_geo, bp->b_addr);
> >  
> >  	/* Iterate the entries, stopping when we hit or go past the end. */
> > -	while (ptr < endptr) {
> > +	while (ptr < endptr && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {
> 
> This seems like slightly different behavior than all of the other
> corruption checks in the loop. Perhaps this should go after the
> xfs_scrub_directory_check_free_entry() call?

Yep, fixed.

> >  		dup = (struct xfs_dir2_data_unused *)ptr;
> >  		/* Skip real entries */
> >  		if (dup->freetag != cpu_to_be16(XFS_DIR2_DATA_FREE_TAG)) {
> > @@ -413,8 +424,10 @@ xfs_scrub_directory_data_bestfree(
> >  
> >  		/* Spot check this free entry */
> >  		tag = be16_to_cpu(*xfs_dir2_data_unused_tag_p(dup));
> > -		if (tag != ((char *)dup - (char *)bp->b_addr))
> > +		if (tag != ((char *)dup - (char *)bp->b_addr)) {
> >  			xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk);
> > +			goto out_buf;
> > +		}
> >  
> >  		/*
> >  		 * Either this entry is a bestfree or it's smaller than
> > @@ -549,6 +562,8 @@ xfs_scrub_directory_leaf1_bestfree(
> >  
> >  	/* Check all the bestfree entries. */
> >  	for (i = 0; i < bestcount; i++, bestp++) {
> > +		if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> > +			break;
> 
> This seems similarly misplaced. If somebody later adds some additional
> checks before the loop, this all of a sudden doesn't exactly do the same
> thing. Perhaps check after the preceding hash checks and then again in
> the loop after the brelse?

Ok.

> >  		best = be16_to_cpu(*bestp);
> >  		if (best == NULLDATAOFF)
> >  			continue;
> > @@ -556,9 +571,10 @@ xfs_scrub_directory_leaf1_bestfree(
> >  				i * args->geo->fsbcount, -1, &dbp);
> >  		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, lblk,
> >  				&error))
> > -			continue;
> > +			break;
> >  		xfs_scrub_directory_check_freesp(sc, lblk, dbp, best);
> >  		xfs_trans_brelse(sc->tp, dbp);
> > +
> >  	}
> >  out:
> >  	return error;
> > @@ -607,7 +623,7 @@ xfs_scrub_directory_free_bestfree(
> >  				-1, &dbp);
> >  		if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, lblk,
> >  				&error))
> > -			continue;
> > +			break;
> >  		xfs_scrub_directory_check_freesp(sc, lblk, dbp, best);
> >  		xfs_trans_brelse(sc->tp, dbp);
> >  	}
> > @@ -656,7 +672,7 @@ xfs_scrub_directory_blocks(
> >  
> >  	/* Iterate all the data extents in the directory... */
> >  	found = xfs_iext_lookup_extent(sc->ip, ifp, lblk, &icur, &got);
> > -	while (found) {
> > +	while (found && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {
> >  		/* Block directories only have a single block at offset 0. */
> >  		if (is_block &&
> >  		    (got.br_startoff > 0 ||
> > @@ -719,7 +735,7 @@ xfs_scrub_directory_blocks(
> >  	/* Scan for free blocks */
> >  	lblk = free_lblk;
> >  	found = xfs_iext_lookup_extent(sc->ip, ifp, lblk, &icur, &got);
> > -	while (found) {
> > +	while (found && !(sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) {
> >  		/*
> >  		 * Dirs can't have blocks mapped above 2^32.
> >  		 * Single-block dirs shouldn't even be here.
> > diff --git a/fs/xfs/scrub/parent.c b/fs/xfs/scrub/parent.c
> > index 1fb88c18d455..95fdd0d9e65d 100644
> > --- a/fs/xfs/scrub/parent.c
> > +++ b/fs/xfs/scrub/parent.c
> > @@ -147,6 +147,9 @@ xfs_scrub_parent_validate(
> >  
> >  	*try_again = false;
> >  
> > +	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
> > +		return 0;
> > +
> 
> Technically the same resulting logic, but the return 0 here vs. the goto
> out (that also returns 0) in the subsequent check stand out a bit as
> inconsistent.

Fixed.  Thanks for the review!

--D

> Brian
> 
> >  	/* '..' must not point to ourselves. */
> >  	if (sc->ip->i_ino == dnum) {
> >  		xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, 0);
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 07/12] xfs: scrub the data fork of the realtime inodes
  2018-05-10 13:54   ` Brian Foster
@ 2018-05-10 16:07     ` Darrick J. Wong
  0 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2018-05-10 16:07 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, May 10, 2018 at 09:54:59AM -0400, Brian Foster wrote:
> On Thu, May 03, 2018 at 11:06:38AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > The realtime bitmap and summary inodes live on the metadata device, so
> > we can scrub their data forks with the regular scrubbers.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/scrub/rtbitmap.c |   26 +++++++++++++++++++++++++-
> >  1 file changed, 25 insertions(+), 1 deletion(-)
> > 
> > 
> > diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c
> > index 8b048f107af2..b2b4dff2660d 100644
> > --- a/fs/xfs/scrub/rtbitmap.c
> > +++ b/fs/xfs/scrub/rtbitmap.c
> > @@ -82,6 +82,11 @@ xfs_scrub_rtbitmap(
> >  {
> >  	int				error;
> >  
> > +	/* Invoke the fork scrubber. */
> > +	error = xfs_scrub_metadata_inode_forks(sc);
> > +	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> > +		return error;
> > +
> >  	error = xfs_rtalloc_query_all(sc->tp, xfs_scrub_rtbitmap_rec, sc);
> >  	if (!xfs_scrub_fblock_process_error(sc, XFS_DATA_FORK, 0, &error))
> >  		goto out;
> > @@ -95,8 +100,27 @@ int
> >  xfs_scrub_rtsummary(
> >  	struct xfs_scrub_context	*sc)
> >  {
> > +	struct xfs_inode		*rsumip = sc->mp->m_rsumip;
> > +	int				error = 0;
> > +
> > +	/*
> > +	 * We ILOCK'd the rt bitmap ip in the setup routine, now lock the
> > +	 * rt summary ip in compliance with the rt inode locking rules.
> > +	 */
> > +	xfs_ilock(rsumip, XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM);
> > +
> > +	/* Invoke the fork scrubber. */
> > +	sc->ip = rsumip;
> 
> I guess the ->ilock_flags are technically inconsistent at this point.
> Does that matter? If not, couldn't hurt to document that in the comment.
> ;P
> 
> I also think this would be a bit easier to follow if we just stored and
> restored an old_ip pointer on the stack (and perhaps assert that it
> matches ->m_rbmip).

Ok.  I'll update the documentation to make it clearer that we're
switching state, and restructure the code to keep sc->{ip,ilock_flags}
consistent.

--D

> Brian
> 
> > +	error = xfs_scrub_metadata_inode_forks(sc);
> > +	sc->ip = sc->mp->m_rbmip;
> > +	if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
> > +		goto out;
> > +
> >  	/* XXX: implement this some day */
> > -	return -ENOENT;
> > +	xfs_scrub_set_incomplete(sc);
> > +out:
> > +	xfs_iunlock(rsumip, XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM);
> > +	return error;
> >  }
> >  
> >  
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2018-05-10 16:08 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-03 18:05 [PATCH v4 00/12] xfs-4.18: various fixes Darrick J. Wong
2018-05-03 18:05 ` [PATCH 01/12] xfs: bmap debugging should never panic the system Darrick J. Wong
2018-05-07 15:06   ` Christoph Hellwig
2018-05-10 13:53   ` Brian Foster
2018-05-10 15:45     ` Darrick J. Wong
2018-05-03 18:06 ` [PATCH 02/12] xfs: add missing rmap error return Darrick J. Wong
2018-05-07 15:06   ` Christoph Hellwig
2018-05-10 13:53   ` Brian Foster
2018-05-03 18:06 ` [PATCH 03/12] xfs: skip scrub xref if corruption already noted Darrick J. Wong
2018-05-10 13:53   ` Brian Foster
2018-05-03 18:06 ` [PATCH 04/12] xfs: don't continue scrub if already corrupt Darrick J. Wong
2018-05-10 13:53   ` Brian Foster
2018-05-10 15:54     ` Darrick J. Wong
2018-05-03 18:06 ` [PATCH 05/12] xfs: avoid ilock games in the quota scrubber Darrick J. Wong
2018-05-03 18:06 ` [PATCH 06/12] xfs: quota scrub should use bmapbtd scrubber Darrick J. Wong
2018-05-10 13:54   ` Brian Foster
2018-05-10 15:34     ` Darrick J. Wong
2018-05-03 18:06 ` [PATCH 07/12] xfs: scrub the data fork of the realtime inodes Darrick J. Wong
2018-05-10 13:54   ` Brian Foster
2018-05-10 16:07     ` Darrick J. Wong
2018-05-03 18:06 ` [PATCH 08/12] xfs: superblock scrub should use short-lived buffers Darrick J. Wong
2018-05-10 13:55   ` Brian Foster
2018-05-03 18:06 ` [PATCH 09/12] xfs: clean up scrub usage of KM_NOFS Darrick J. Wong
2018-05-03 18:06 ` [PATCH 10/12] xfs: btree scrub should check minrecs Darrick J. Wong
2018-05-03 18:07 ` [PATCH 11/12] xfs: refactor scrub transaction allocation function Darrick J. Wong
2018-05-03 18:07 ` [PATCH 12/12] xfs: avoid ABBA deadlock when scrubbing parent pointers Darrick J. Wong
2018-05-08  0:40 ` [PATCH 13/12] xfs: refactor quota limits initialization Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.