[PATCH 0/4 V2] xfs: missing verifier fixes

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/4 V2] xfs: missing verifier fixes
@ 2014-07-31  1:01 Dave Chinner
  2014-07-31  1:01 ` [PATCH 1/4] xfs: catch buffers written without verifiers attached Dave Chinner
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Dave Chinner @ 2014-07-31  1:01 UTC (permalink / raw)
  To: xfs

Hi folks,

Updated version of the patchset posted yesterday here:

http://oss.sgi.com/archives/xfs/2014-07/msg00430.html

Version 2:
- use XFS_BUF_DADDR_NULL
- print buffer addr/length pair, not two addrs
- remove extraneous xfs_sb_version_hascrc() checks
- fix m_qflags check issue.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] xfs: catch buffers written without verifiers attached
  2014-07-31  1:01 [PATCH 0/4 V2] xfs: missing verifier fixes Dave Chinner
@ 2014-07-31  1:01 ` Dave Chinner
  2014-07-31 12:27   ` Brian Foster
  2014-08-01 14:26   ` Christoph Hellwig
  2014-07-31  1:01 ` [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers Dave Chinner
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Dave Chinner @ 2014-07-31  1:01 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

We recently had a bug where buffers were slipping through log
recovery without any verifier attached to them. This was resulting
in on-disk CRC mismatches for valid data. Add some warning code to
catch this occurrence so that we catch such bugs during development
rather than not being aware they exist.

Note that we cannot do this verification unconditionally as non-CRC
filesystems don't always attach verifiers to the buffers being
written. e.g. during log recovery we cannot identify all the
different types of buffers correctly on non-CRC filesystems, so we
can't attach the correct verifiers in all cases and so we don't
attach any. Hence we don't want on non-CRC filesystems to avoid
spamming the logs with false indications.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_buf.c | 14 ++++++++++++++
 fs/xfs/xfs_log.c |  8 +++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index a6dc83e..cd7b8ca 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1330,6 +1330,20 @@ _xfs_buf_ioapply(
 						   SHUTDOWN_CORRUPT_INCORE);
 				return;
 			}
+		} else if (bp->b_bn != XFS_BUF_DADDR_NULL) {
+			struct xfs_mount *mp = bp->b_target->bt_mount;
+
+			/*
+			 * non-crc filesystems don't attach verifiers during
+			 * log recovery, so don't warn for such filesystems.
+			 */
+			if (xfs_sb_version_hascrc(&mp->m_sb)) {
+				xfs_warn(mp,
+					"%s: no ops on block 0x%llx/0x%x",
+					__func__, bp->b_bn, bp->b_length);
+				xfs_hex_dump(bp->b_addr, 64);
+				dump_stack();
+			}
 		}
 	} else if (bp->b_flags & XBF_READ_AHEAD) {
 		rw = READA;
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 149a4a5..ca4fd5b 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1378,8 +1378,14 @@ xlog_alloc_log(
 
 	xlog_get_iclog_buffer_size(mp, log);
 
+	/*
+	 * Use a NULL block for the extra log buffer used during splits so that
+	 * it will trigger errors if we ever try to do IO on it without first
+	 * having set it up properly.
+	 */
 	error = -ENOMEM;
-	bp = xfs_buf_alloc(mp->m_logdev_targp, 0, BTOBB(log->l_iclog_size), 0);
+	bp = xfs_buf_alloc(mp->m_logdev_targp, XFS_BUF_DADDR_NULL,
+			   BTOBB(log->l_iclog_size), 0);
 	if (!bp)
 		goto out_free_log;
 
-- 
2.0.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers
  2014-07-31  1:01 [PATCH 0/4 V2] xfs: missing verifier fixes Dave Chinner
  2014-07-31  1:01 ` [PATCH 1/4] xfs: catch buffers written without verifiers attached Dave Chinner
@ 2014-07-31  1:01 ` Dave Chinner
  2014-08-01 14:27   ` Christoph Hellwig
  2014-07-31  1:01 ` [PATCH 3/4] xfs: quotacheck leaves dquot buffers without verifiers Dave Chinner
  2014-07-31  1:01 ` [PATCH 4/4] xfs: dquot recovery needs verifiers Dave Chinner
  3 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2014-07-31  1:01 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

Crash testing of CRC enabled filesystems has resulted in a number of
reports of bad CRCs being detected after the filesystem was mounted.
Errors such as the following were being seen:

XFS (sdb3): Mounting V5 Filesystem
XFS (sdb3): Starting recovery (logdev: internal)
XFS (sdb3): Metadata CRC error detected at xfs_agf_read_verify+0x5a/0x100 [xfs], block 0x1
XFS (sdb3): Unmount and run xfs_repair
XFS (sdb3): First 64 bytes of corrupted metadata buffer:
ffff880136ffd600: 58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40  XAGF...........@
ffff880136ffd610: 00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01  ..mS..w.........
ffff880136ffd620: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03  ................
ffff880136ffd630: 00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00  ................
XFS (sdb3): metadata I/O error: block 0x1 ("xfs_trans_read_buf_map") error 74 numblks 1

The errors were typically being seen in AGF, AGI and their related
btree block buffers some time after log recovery had run. Often it
wasn't until later subsequent mounts that the problem was
discovered. The common symptom was a buffer with the correct
contents, but a CRC and an LSN that matched an older version of the
contents.

Some debug added to _xfs_buf_ioapply() indicated that buffers were
being written without verifiers attached to them from log recovery,
and Jan Kara isolated the cause to log recovery readahead an dit's
interactions with buffers that had a more recent LSN on disk than
the transaction being recovered. In this case, the buffer did not
get a verifier attached, and os when the second phase of log
recovery ran and recovered EFIs and unlinked inodes, the buffers
were modified and written without the verifier running. Hence they
had up to date contents, but stale LSNs and CRCs.

Fix it by attaching verifiers to buffers we skip due to future LSN
values so they don't escape into the buffer cache without the
correct verifier attached.

This patch is based on analysis and a patch from Jan Kara.

cc: <stable@vger.kernel.org>
Reported-by: Jan Kara <jack@suse.cz>
Reported-by: Fanael Linithien <fanael4@gmail.com>
Reported-by: Grozdan <neutrino8@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log_recover.c | 51 +++++++++++++++++++++++++++++-------------------
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index fbc2362..8a7d8a7 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2126,6 +2126,17 @@ xlog_recover_validate_buf_type(
 	__uint16_t		magic16;
 	__uint16_t		magicda;
 
+	/*
+	 * We can only do post recovery validation on items on CRC enabled
+	 * fielsystems as we need to know when the buffer was written to be able
+	 * to determine if we should have replayed the item. If we replay old
+	 * metadata over a newer buffer, then it will enter a temporarily
+	 * inconsistent state resulting in verification failures. Hence for now
+	 * just avoid the verification stage for non-crc filesystems
+	 */
+	if (!xfs_sb_version_hascrc(&mp->m_sb))
+		return;
+
 	magic32 = be32_to_cpu(*(__be32 *)bp->b_addr);
 	magic16 = be16_to_cpu(*(__be16*)bp->b_addr);
 	magicda = be16_to_cpu(info->magic);
@@ -2163,8 +2174,6 @@ xlog_recover_validate_buf_type(
 		bp->b_ops = &xfs_agf_buf_ops;
 		break;
 	case XFS_BLFT_AGFL_BUF:
-		if (!xfs_sb_version_hascrc(&mp->m_sb))
-			break;
 		if (magic32 != XFS_AGFL_MAGIC) {
 			xfs_warn(mp, "Bad AGFL block magic!");
 			ASSERT(0);
@@ -2197,10 +2206,6 @@ xlog_recover_validate_buf_type(
 #endif
 		break;
 	case XFS_BLFT_DINO_BUF:
-		/*
-		 * we get here with inode allocation buffers, not buffers that
-		 * track unlinked list changes.
-		 */
 		if (magic16 != XFS_DINODE_MAGIC) {
 			xfs_warn(mp, "Bad INODE block magic!");
 			ASSERT(0);
@@ -2280,8 +2285,6 @@ xlog_recover_validate_buf_type(
 		bp->b_ops = &xfs_attr3_leaf_buf_ops;
 		break;
 	case XFS_BLFT_ATTR_RMT_BUF:
-		if (!xfs_sb_version_hascrc(&mp->m_sb))
-			break;
 		if (magic32 != XFS_ATTR3_RMT_MAGIC) {
 			xfs_warn(mp, "Bad attr remote magic!");
 			ASSERT(0);
@@ -2388,16 +2391,7 @@ xlog_recover_do_reg_buffer(
 	/* Shouldn't be any more regions */
 	ASSERT(i == item->ri_total);
 
-	/*
-	 * We can only do post recovery validation on items on CRC enabled
-	 * fielsystems as we need to know when the buffer was written to be able
-	 * to determine if we should have replayed the item. If we replay old
-	 * metadata over a newer buffer, then it will enter a temporarily
-	 * inconsistent state resulting in verification failures. Hence for now
-	 * just avoid the verification stage for non-crc filesystems
-	 */
-	if (xfs_sb_version_hascrc(&mp->m_sb))
-		xlog_recover_validate_buf_type(mp, bp, buf_f);
+	xlog_recover_validate_buf_type(mp, bp, buf_f);
 }
 
 /*
@@ -2505,12 +2499,29 @@ xlog_recover_buffer_pass2(
 	}
 
 	/*
-	 * recover the buffer only if we get an LSN from it and it's less than
+	 * Recover the buffer only if we get an LSN from it and it's less than
 	 * the lsn of the transaction we are replaying.
+	 *
+	 * Note that we have to be extremely careful of readahead here.
+	 * Readahead does not attach verfiers to the buffers so if we don't
+	 * actually do any replay after readahead because of the LSN we found
+	 * in the buffer if more recent than that current transaction then we
+	 * need to attach the verifier directly. Failure to do so can lead to
+	 * future recovery actions (e.g. EFI and unlinked list recovery) can
+	 * operate on the buffers and they won't get the verifier attached. This
+	 * can lead to blocks on disk having the correct content but a stale
+	 * CRC.
+	 *
+	 * It is safe to assume these clean buffers are currently up to date.
+	 * If the buffer is dirtied by a later transaction being replayed, then
+	 * the verifier will be reset to match whatever recover turns that
+	 * buffer into.
 	 */
 	lsn = xlog_recover_get_buf_lsn(mp, bp);
-	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0)
+	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
+		xlog_recover_validate_buf_type(mp, bp, buf_f);
 		goto out_release;
+	}
 
 	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
 		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
-- 
2.0.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] xfs: quotacheck leaves dquot buffers without verifiers
  2014-07-31  1:01 [PATCH 0/4 V2] xfs: missing verifier fixes Dave Chinner
  2014-07-31  1:01 ` [PATCH 1/4] xfs: catch buffers written without verifiers attached Dave Chinner
  2014-07-31  1:01 ` [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers Dave Chinner
@ 2014-07-31  1:01 ` Dave Chinner
  2014-08-01 14:28   ` Christoph Hellwig
  2014-07-31  1:01 ` [PATCH 4/4] xfs: dquot recovery needs verifiers Dave Chinner
  3 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2014-07-31  1:01 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

When running xfs/305, I noticed that quotacheck was flushing dquot
buffers that did not have the xfs_dquot_buf_ops verifiers attached:

XFS (vdb): _xfs_buf_ioapply: no ops on block 0x1dc8/0x1dc8
ffff880052489000: 44 51 01 04 00 00 65 b8 00 00 00 00 00 00 00 00  DQ....e.........
ffff880052489010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
ffff880052489020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
ffff880052489030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
CPU: 1 PID: 2376 Comm: mount Not tainted 3.16.0-rc2-dgc+ #306
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffff88006fe38000 ffff88004a0ffae8 ffffffff81cf1cca 0000000000000001
 ffff88004a0ffb88 ffffffff814d50ca 000010004a0ffc70 0000000000000000
 ffff88006be56dc4 0000000000000021 0000000000001dc8 ffff88007c773d80
Call Trace:
 [<ffffffff81cf1cca>] dump_stack+0x45/0x56
 [<ffffffff814d50ca>] _xfs_buf_ioapply+0x3ca/0x3d0
 [<ffffffff810db520>] ? wake_up_state+0x20/0x20
 [<ffffffff814d51f5>] ? xfs_bdstrat_cb+0x55/0xb0
 [<ffffffff814d513b>] xfs_buf_iorequest+0x6b/0xd0
 [<ffffffff814d51f5>] xfs_bdstrat_cb+0x55/0xb0
 [<ffffffff814d53ab>] __xfs_buf_delwri_submit+0x15b/0x220
 [<ffffffff814d6040>] ? xfs_buf_delwri_submit+0x30/0x90
 [<ffffffff814d6040>] xfs_buf_delwri_submit+0x30/0x90
 [<ffffffff8150f89d>] xfs_qm_quotacheck+0x17d/0x3c0
 [<ffffffff81510591>] xfs_qm_mount_quotas+0x151/0x1e0
 [<ffffffff814ed01c>] xfs_mountfs+0x56c/0x7d0
 [<ffffffff814f0f12>] xfs_fs_fill_super+0x2c2/0x340
 [<ffffffff811c9fe4>] mount_bdev+0x194/0x1d0
 [<ffffffff814f0c50>] ? xfs_finish_flags+0x170/0x170
 [<ffffffff814ef0f5>] xfs_fs_mount+0x15/0x20
 [<ffffffff811ca8c9>] mount_fs+0x39/0x1b0
 [<ffffffff811e4d67>] vfs_kern_mount+0x67/0x120
 [<ffffffff811e757e>] do_mount+0x23e/0xad0
 [<ffffffff8117abde>] ? __get_free_pages+0xe/0x50
 [<ffffffff811e71e6>] ? copy_mount_options+0x36/0x150
 [<ffffffff811e8103>] SyS_mount+0x83/0xc0
 [<ffffffff81cfd40b>] tracesys+0xdd/0xe2

This was caused by dquot buffer readahead not attaching a verifier
structure to the buffer when readahead was issued, resulting in the
followup read of the buffer finding a valid buffer and so not
attaching new verifiers to the buffer as part of the read.

Also, when a verifier failure occurs, we then read the buffer
without verifiers. Attach the verifiers manually after this read so
that if the buffer is then written it will be verified that the
corruption has been repaired.

Further, when flushing a dquot we don't ask for a verifier when
reading in the dquot buffer the dquot belongs to. Most of the time
this isn't an issue because the buffer is still cached, but when it
is not cached it will result in writing the dquot buffer without
having the verfier attached.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_dquot.c | 3 ++-
 fs/xfs/xfs_qm.c    | 8 +++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 8a44a79..63c2de4 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -974,7 +974,8 @@ xfs_qm_dqflush(
 	 * Get the buffer containing the on-disk dquot
 	 */
 	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
-				   mp->m_quotainfo->qi_dqchunklen, 0, &bp, NULL);
+				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
+				   &xfs_dquot_buf_ops);
 	if (error)
 		goto out_unlock;
 
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 7e1a80b..1023210 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -911,6 +911,12 @@ xfs_qm_dqiter_bufs(
 		if (error)
 			break;
 
+		/*
+		 * A corrupt buffer might not have a verifier attached, so
+		 * make sure we have the correct one attached before writeback
+		 * occurs.
+		 */
+		bp->b_ops = &xfs_dquot_buf_ops;
 		xfs_qm_reset_dqcounts(mp, bp, firstid, type);
 		xfs_buf_delwri_queue(bp, buffer_list);
 		xfs_buf_relse(bp);
@@ -996,7 +1002,7 @@ xfs_qm_dqiterate(
 					xfs_buf_readahead(mp->m_ddev_targp,
 					       XFS_FSB_TO_DADDR(mp, rablkno),
 					       mp->m_quotainfo->qi_dqchunklen,
-					       NULL);
+					       &xfs_dquot_buf_ops);
 					rablkno++;
 				}
 			}
-- 
2.0.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] xfs: dquot recovery needs verifiers
  2014-07-31  1:01 [PATCH 0/4 V2] xfs: missing verifier fixes Dave Chinner
                   ` (2 preceding siblings ...)
  2014-07-31  1:01 ` [PATCH 3/4] xfs: quotacheck leaves dquot buffers without verifiers Dave Chinner
@ 2014-07-31  1:01 ` Dave Chinner
  2014-07-31 12:27   ` Brian Foster
  2014-08-01 14:30   ` Christoph Hellwig
  3 siblings, 2 replies; 14+ messages in thread
From: Dave Chinner @ 2014-07-31  1:01 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

dquot recovery should add verifiers to the dquot buffers that it
recovers changes into. Unfortunately, it doesn't attached the
verifiers to the buffers in a consistent manner. For example,
xlog_recover_dquot_pass2() reads dquot buffers without a verifier
and then writes it without ever having attached a verifier to the
buffer.

Further, dquot buffer recovery may write a dquot buffer that has not
been modified, or indeed, shoul dbe written because quotas are not
enabled and hence changes to the buffer were not replayed. In this
case, we again write buffers without verifiers attached because that
doesn't happen until after the buffer changes have been replayed.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_log_recover.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 8a7d8a7..1fd5787 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2399,8 +2399,11 @@ xlog_recover_do_reg_buffer(
  * Simple algorithm: if we have found a QUOTAOFF log item of the same type
  * (ie. USR or GRP), then just toss this buffer away; don't recover it.
  * Else, treat it as a regular buffer and do recovery.
+ *
+ * Return false if the buffer was tossed and true if we recovered the buffer to
+ * indicate to the caller if the buffer needs writing.
  */
-STATIC void
+STATIC bool
 xlog_recover_do_dquot_buffer(
 	struct xfs_mount		*mp,
 	struct xlog			*log,
@@ -2415,9 +2418,8 @@ xlog_recover_do_dquot_buffer(
 	/*
 	 * Filesystems are required to send in quota flags at mount time.
 	 */
-	if (mp->m_qflags == 0) {
-		return;
-	}
+	if (!mp->m_qflags)
+		return false;
 
 	type = 0;
 	if (buf_f->blf_flags & XFS_BLF_UDQUOT_BUF)
@@ -2430,9 +2432,10 @@ xlog_recover_do_dquot_buffer(
 	 * This type of quotas was turned off, so ignore this buffer
 	 */
 	if (log->l_quotaoffs_flag & type)
-		return;
+		return false;
 
 	xlog_recover_do_reg_buffer(mp, item, bp, buf_f);
+	return true;
 }
 
 /*
@@ -2525,14 +2528,18 @@ xlog_recover_buffer_pass2(
 
 	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
 		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
+		if (error)
+			goto out_release;
 	} else if (buf_f->blf_flags &
 		  (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
-		xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
+		bool	dirty;
+
+		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
+		if (!dirty)
+			goto out_release;
 	} else {
 		xlog_recover_do_reg_buffer(mp, item, bp, buf_f);
 	}
-	if (error)
-		goto out_release;
 
 	/*
 	 * Perform delayed write on the buffer.  Asynchronous writes will be
@@ -3022,9 +3029,16 @@ xlog_recover_dquot_pass2(
 		return -EIO;
 	ASSERT(dq_f->qlf_len == 1);
 
+	/*
+	 * At this point we are assuming that the dquots have been allocated
+	 * and hence the buffer has valid dquots stamped in it. It should,
+	 * therefore, pass verifier validation. If the dquot is bad, then the
+	 * we'll return an error here, so we don't need to specifically check
+	 * the dquot in the buffer after the verifier has run.
+	 */
 	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dq_f->qlf_blkno,
 				   XFS_FSB_TO_BB(mp, dq_f->qlf_len), 0, &bp,
-				   NULL);
+				   &xfs_dquot_buf_ops);
 	if (error)
 		return error;
 
@@ -3032,18 +3046,6 @@ xlog_recover_dquot_pass2(
 	ddq = (xfs_disk_dquot_t *)xfs_buf_offset(bp, dq_f->qlf_boffset);
 
 	/*
-	 * At least the magic num portion should be on disk because this
-	 * was among a chunk of dquots created earlier, and we did some
-	 * minimal initialization then.
-	 */
-	error = xfs_dqcheck(mp, ddq, dq_f->qlf_id, 0, XFS_QMOPT_DOWARN,
-			   "xlog_recover_dquot_pass2");
-	if (error) {
-		xfs_buf_relse(bp);
-		return -EIO;
-	}
-
-	/*
 	 * If the dquot has an LSN in it, recover the dquot only if it's less
 	 * than the lsn of the transaction we are replaying.
 	 */
-- 
2.0.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] xfs: catch buffers written without verifiers attached
  2014-07-31  1:01 ` [PATCH 1/4] xfs: catch buffers written without verifiers attached Dave Chinner
@ 2014-07-31 12:27   ` Brian Foster
  2014-08-01 14:26   ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: Brian Foster @ 2014-07-31 12:27 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Thu, Jul 31, 2014 at 11:01:46AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> We recently had a bug where buffers were slipping through log
> recovery without any verifier attached to them. This was resulting
> in on-disk CRC mismatches for valid data. Add some warning code to
> catch this occurrence so that we catch such bugs during development
> rather than not being aware they exist.
> 
> Note that we cannot do this verification unconditionally as non-CRC
> filesystems don't always attach verifiers to the buffers being
> written. e.g. during log recovery we cannot identify all the
> different types of buffers correctly on non-CRC filesystems, so we
> can't attach the correct verifiers in all cases and so we don't
> attach any. Hence we don't want on non-CRC filesystems to avoid
> spamming the logs with false indications.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_buf.c | 14 ++++++++++++++
>  fs/xfs/xfs_log.c |  8 +++++++-
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index a6dc83e..cd7b8ca 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1330,6 +1330,20 @@ _xfs_buf_ioapply(
>  						   SHUTDOWN_CORRUPT_INCORE);
>  				return;
>  			}
> +		} else if (bp->b_bn != XFS_BUF_DADDR_NULL) {
> +			struct xfs_mount *mp = bp->b_target->bt_mount;
> +
> +			/*
> +			 * non-crc filesystems don't attach verifiers during
> +			 * log recovery, so don't warn for such filesystems.
> +			 */
> +			if (xfs_sb_version_hascrc(&mp->m_sb)) {
> +				xfs_warn(mp,
> +					"%s: no ops on block 0x%llx/0x%x",
> +					__func__, bp->b_bn, bp->b_length);
> +				xfs_hex_dump(bp->b_addr, 64);
> +				dump_stack();
> +			}
>  		}
>  	} else if (bp->b_flags & XBF_READ_AHEAD) {
>  		rw = READA;
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 149a4a5..ca4fd5b 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1378,8 +1378,14 @@ xlog_alloc_log(
>  
>  	xlog_get_iclog_buffer_size(mp, log);
>  
> +	/*
> +	 * Use a NULL block for the extra log buffer used during splits so that
> +	 * it will trigger errors if we ever try to do IO on it without first
> +	 * having set it up properly.
> +	 */
>  	error = -ENOMEM;
> -	bp = xfs_buf_alloc(mp->m_logdev_targp, 0, BTOBB(log->l_iclog_size), 0);
> +	bp = xfs_buf_alloc(mp->m_logdev_targp, XFS_BUF_DADDR_NULL,
> +			   BTOBB(log->l_iclog_size), 0);
>  	if (!bp)
>  		goto out_free_log;
>  
> -- 
> 2.0.0
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] xfs: dquot recovery needs verifiers
  2014-07-31  1:01 ` [PATCH 4/4] xfs: dquot recovery needs verifiers Dave Chinner
@ 2014-07-31 12:27   ` Brian Foster
  2014-08-01 14:30   ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: Brian Foster @ 2014-07-31 12:27 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Thu, Jul 31, 2014 at 11:01:49AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> dquot recovery should add verifiers to the dquot buffers that it
> recovers changes into. Unfortunately, it doesn't attached the
> verifiers to the buffers in a consistent manner. For example,
> xlog_recover_dquot_pass2() reads dquot buffers without a verifier
> and then writes it without ever having attached a verifier to the
> buffer.
> 
> Further, dquot buffer recovery may write a dquot buffer that has not
> been modified, or indeed, shoul dbe written because quotas are not
> enabled and hence changes to the buffer were not replayed. In this
> case, we again write buffers without verifiers attached because that
> doesn't happen until after the buffer changes have been replayed.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_log_recover.c | 44 +++++++++++++++++++++++---------------------
>  1 file changed, 23 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 8a7d8a7..1fd5787 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2399,8 +2399,11 @@ xlog_recover_do_reg_buffer(
>   * Simple algorithm: if we have found a QUOTAOFF log item of the same type
>   * (ie. USR or GRP), then just toss this buffer away; don't recover it.
>   * Else, treat it as a regular buffer and do recovery.
> + *
> + * Return false if the buffer was tossed and true if we recovered the buffer to
> + * indicate to the caller if the buffer needs writing.
>   */
> -STATIC void
> +STATIC bool
>  xlog_recover_do_dquot_buffer(
>  	struct xfs_mount		*mp,
>  	struct xlog			*log,
> @@ -2415,9 +2418,8 @@ xlog_recover_do_dquot_buffer(
>  	/*
>  	 * Filesystems are required to send in quota flags at mount time.
>  	 */
> -	if (mp->m_qflags == 0) {
> -		return;
> -	}
> +	if (!mp->m_qflags)
> +		return false;
>  
>  	type = 0;
>  	if (buf_f->blf_flags & XFS_BLF_UDQUOT_BUF)
> @@ -2430,9 +2432,10 @@ xlog_recover_do_dquot_buffer(
>  	 * This type of quotas was turned off, so ignore this buffer
>  	 */
>  	if (log->l_quotaoffs_flag & type)
> -		return;
> +		return false;
>  
>  	xlog_recover_do_reg_buffer(mp, item, bp, buf_f);
> +	return true;
>  }
>  
>  /*
> @@ -2525,14 +2528,18 @@ xlog_recover_buffer_pass2(
>  
>  	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
>  		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
> +		if (error)
> +			goto out_release;
>  	} else if (buf_f->blf_flags &
>  		  (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
> -		xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
> +		bool	dirty;
> +
> +		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
> +		if (!dirty)
> +			goto out_release;
>  	} else {
>  		xlog_recover_do_reg_buffer(mp, item, bp, buf_f);
>  	}
> -	if (error)
> -		goto out_release;
>  
>  	/*
>  	 * Perform delayed write on the buffer.  Asynchronous writes will be
> @@ -3022,9 +3029,16 @@ xlog_recover_dquot_pass2(
>  		return -EIO;
>  	ASSERT(dq_f->qlf_len == 1);
>  
> +	/*
> +	 * At this point we are assuming that the dquots have been allocated
> +	 * and hence the buffer has valid dquots stamped in it. It should,
> +	 * therefore, pass verifier validation. If the dquot is bad, then the
> +	 * we'll return an error here, so we don't need to specifically check
> +	 * the dquot in the buffer after the verifier has run.
> +	 */
>  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dq_f->qlf_blkno,
>  				   XFS_FSB_TO_BB(mp, dq_f->qlf_len), 0, &bp,
> -				   NULL);
> +				   &xfs_dquot_buf_ops);
>  	if (error)
>  		return error;
>  
> @@ -3032,18 +3046,6 @@ xlog_recover_dquot_pass2(
>  	ddq = (xfs_disk_dquot_t *)xfs_buf_offset(bp, dq_f->qlf_boffset);
>  
>  	/*
> -	 * At least the magic num portion should be on disk because this
> -	 * was among a chunk of dquots created earlier, and we did some
> -	 * minimal initialization then.
> -	 */
> -	error = xfs_dqcheck(mp, ddq, dq_f->qlf_id, 0, XFS_QMOPT_DOWARN,
> -			   "xlog_recover_dquot_pass2");
> -	if (error) {
> -		xfs_buf_relse(bp);
> -		return -EIO;
> -	}
> -
> -	/*
>  	 * If the dquot has an LSN in it, recover the dquot only if it's less
>  	 * than the lsn of the transaction we are replaying.
>  	 */
> -- 
> 2.0.0
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] xfs: catch buffers written without verifiers attached
  2014-07-31  1:01 ` [PATCH 1/4] xfs: catch buffers written without verifiers attached Dave Chinner
  2014-07-31 12:27   ` Brian Foster
@ 2014-08-01 14:26   ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2014-08-01 14:26 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers
  2014-07-31  1:01 ` [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers Dave Chinner
@ 2014-08-01 14:27   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2014-08-01 14:27 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] xfs: quotacheck leaves dquot buffers without verifiers
  2014-07-31  1:01 ` [PATCH 3/4] xfs: quotacheck leaves dquot buffers without verifiers Dave Chinner
@ 2014-08-01 14:28   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2014-08-01 14:28 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] xfs: dquot recovery needs verifiers
  2014-07-31  1:01 ` [PATCH 4/4] xfs: dquot recovery needs verifiers Dave Chinner
  2014-07-31 12:27   ` Brian Foster
@ 2014-08-01 14:30   ` Christoph Hellwig
  2014-08-02  3:20     ` Dave Chinner
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2014-08-01 14:30 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Thu, Jul 31, 2014 at 11:01:49AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> dquot recovery should add verifiers to the dquot buffers that it
> recovers changes into. Unfortunately, it doesn't attached the
> verifiers to the buffers in a consistent manner. For example,
> xlog_recover_dquot_pass2() reads dquot buffers without a verifier
> and then writes it without ever having attached a verifier to the
> buffer.
> 
> Further, dquot buffer recovery may write a dquot buffer that has not
> been modified, or indeed, shoul dbe written because quotas are not
> enabled and hence changes to the buffer were not replayed. In this
> case, we again write buffers without verifiers attached because that
> doesn't happen until after the buffer changes have been replayed.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

The xlog_recover_do_reg_buffer look fine to me, but what's the rationale
for removing the xfs_dqcheck call?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] xfs: dquot recovery needs verifiers
  2014-08-01 14:30   ` Christoph Hellwig
@ 2014-08-02  3:20     ` Dave Chinner
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2014-08-02  3:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Fri, Aug 01, 2014 at 07:30:23AM -0700, Christoph Hellwig wrote:
> On Thu, Jul 31, 2014 at 11:01:49AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > dquot recovery should add verifiers to the dquot buffers that it
> > recovers changes into. Unfortunately, it doesn't attached the
> > verifiers to the buffers in a consistent manner. For example,
> > xlog_recover_dquot_pass2() reads dquot buffers without a verifier
> > and then writes it without ever having attached a verifier to the
> > buffer.
> > 
> > Further, dquot buffer recovery may write a dquot buffer that has not
> > been modified, or indeed, shoul dbe written because quotas are not
> > enabled and hence changes to the buffer were not replayed. In this
> > case, we again write buffers without verifiers attached because that
> > doesn't happen until after the buffer changes have been replayed.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> 
> The xlog_recover_do_reg_buffer look fine to me, but what's the rationale
> for removing the xfs_dqcheck call?

It's done by the verifier.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers
  2014-07-30  1:48 ` [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers Dave Chinner
@ 2014-07-30 16:30   ` Brian Foster
  0 siblings, 0 replies; 14+ messages in thread
From: Brian Foster @ 2014-07-30 16:30 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Wed, Jul 30, 2014 at 11:48:47AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Crash testing of CRC enabled filesystems has resulted in a number of
> reports of bad CRCs being detected after the filesystem was mounted.
> Errors such as the following were being seen:
> 
> XFS (sdb3): Mounting V5 Filesystem
> XFS (sdb3): Starting recovery (logdev: internal)
> XFS (sdb3): Metadata CRC error detected at xfs_agf_read_verify+0x5a/0x100 [xfs], block 0x1
> XFS (sdb3): Unmount and run xfs_repair
> XFS (sdb3): First 64 bytes of corrupted metadata buffer:
> ffff880136ffd600: 58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40  XAGF...........@
> ffff880136ffd610: 00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01  ..mS..w.........
> ffff880136ffd620: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03  ................
> ffff880136ffd630: 00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00  ................
> XFS (sdb3): metadata I/O error: block 0x1 ("xfs_trans_read_buf_map") error 74 numblks 1
> 
> The errors were typically being seen in AGF, AGI and their related
> btree block buffers some time after log recovery had run. Often it
> wasn't until later subsequent mounts that the problem was
> discovered. The common symptom was a buffer with the correct
> contents, but a CRC and an LSN that matched an older version of the
> contents.
> 
> Some debug added to _xfs_buf_ioapply() indicated that buffers were
> being written without verifiers attached to them from log recovery,
> and Jan Kara isolated the cause to log recovery readahead an dit's
> interactions with buffers that had a more recent LSN on disk than
> the transaction being recovered. In this case, the buffer did not
> get a verifier attached, and os when the second phase of log
> recovery ran and recovered EFIs and unlinked inodes, the buffers
> were modified and written without the verifier running. Hence they
> had up to date contents, but stale LSNs and CRCs.
> 
> Fix it by attaching verifiers to buffers we skip due to future LSN
> values so they don't escape into the buffer cache without the
> correct verifier attached.
> 
> This patch is based on analysis and a patch from Jan Kara.
> 
> cc: <stable@vger.kernel.org>
> Reported-by: Jan Kara <jack@suse.cz>
> Reported-by: Fanael Linithien <fanael4@gmail.com>
> Reported-by: Grozdan <neutrino8@gmail.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_log_recover.c | 47 +++++++++++++++++++++++++++++++----------------
>  1 file changed, 31 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index fbc2362..0a015cc 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2126,6 +2126,17 @@ xlog_recover_validate_buf_type(
>  	__uint16_t		magic16;
>  	__uint16_t		magicda;
>  
> +	/*
> +	 * We can only do post recovery validation on items on CRC enabled
> +	 * fielsystems as we need to know when the buffer was written to be able
> +	 * to determine if we should have replayed the item. If we replay old
> +	 * metadata over a newer buffer, then it will enter a temporarily
> +	 * inconsistent state resulting in verification failures. Hence for now
> +	 * just avoid the verification stage for non-crc filesystems
> +	 */
> +	if (!xfs_sb_version_hascrc(&mp->m_sb))
> +		return;
> +

This function has a couple other similar *_hascrc() checks further down
that we should probably kill now. Otherwise, looks Ok...

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  	magic32 = be32_to_cpu(*(__be32 *)bp->b_addr);
>  	magic16 = be16_to_cpu(*(__be16*)bp->b_addr);
>  	magicda = be16_to_cpu(info->magic);
> @@ -2197,10 +2208,6 @@ xlog_recover_validate_buf_type(
>  #endif
>  		break;
>  	case XFS_BLFT_DINO_BUF:
> -		/*
> -		 * we get here with inode allocation buffers, not buffers that
> -		 * track unlinked list changes.
> -		 */
>  		if (magic16 != XFS_DINODE_MAGIC) {
>  			xfs_warn(mp, "Bad INODE block magic!");
>  			ASSERT(0);
> @@ -2388,16 +2395,7 @@ xlog_recover_do_reg_buffer(
>  	/* Shouldn't be any more regions */
>  	ASSERT(i == item->ri_total);
>  
> -	/*
> -	 * We can only do post recovery validation on items on CRC enabled
> -	 * fielsystems as we need to know when the buffer was written to be able
> -	 * to determine if we should have replayed the item. If we replay old
> -	 * metadata over a newer buffer, then it will enter a temporarily
> -	 * inconsistent state resulting in verification failures. Hence for now
> -	 * just avoid the verification stage for non-crc filesystems
> -	 */
> -	if (xfs_sb_version_hascrc(&mp->m_sb))
> -		xlog_recover_validate_buf_type(mp, bp, buf_f);
> +	xlog_recover_validate_buf_type(mp, bp, buf_f);
>  }
>  
>  /*
> @@ -2505,12 +2503,29 @@ xlog_recover_buffer_pass2(
>  	}
>  
>  	/*
> -	 * recover the buffer only if we get an LSN from it and it's less than
> +	 * Recover the buffer only if we get an LSN from it and it's less than
>  	 * the lsn of the transaction we are replaying.
> +	 *
> +	 * Note that we have to be extremely careful of readahead here.
> +	 * Readahead does not attach verfiers to the buffers so if we don't
> +	 * actually do any replay after readahead because of the LSN we found
> +	 * in the buffer if more recent than that current transaction then we
> +	 * need to attach the verifier directly. Failure to do so can lead to
> +	 * future recovery actions (e.g. EFI and unlinked list recovery) can
> +	 * operate on the buffers and they won't get the verifier attached. This
> +	 * can lead to blocks on disk having the correct content but a stale
> +	 * CRC.
> +	 *
> +	 * It is safe to assume these clean buffers are currently up to date.
> +	 * If the buffer is dirtied by a later transaction being replayed, then
> +	 * the verifier will be reset to match whatever recover turns that
> +	 * buffer into.
>  	 */
>  	lsn = xlog_recover_get_buf_lsn(mp, bp);
> -	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0)
> +	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
> +		xlog_recover_validate_buf_type(mp, bp, buf_f);
>  		goto out_release;
> +	}
>  
>  	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
>  		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
> -- 
> 2.0.0
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers
  2014-07-30  1:48 [PATCH 0/4] xfs: missing verifer fixes Dave Chinner
@ 2014-07-30  1:48 ` Dave Chinner
  2014-07-30 16:30   ` Brian Foster
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2014-07-30  1:48 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

Crash testing of CRC enabled filesystems has resulted in a number of
reports of bad CRCs being detected after the filesystem was mounted.
Errors such as the following were being seen:

XFS (sdb3): Mounting V5 Filesystem
XFS (sdb3): Starting recovery (logdev: internal)
XFS (sdb3): Metadata CRC error detected at xfs_agf_read_verify+0x5a/0x100 [xfs], block 0x1
XFS (sdb3): Unmount and run xfs_repair
XFS (sdb3): First 64 bytes of corrupted metadata buffer:
ffff880136ffd600: 58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40  XAGF...........@
ffff880136ffd610: 00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01  ..mS..w.........
ffff880136ffd620: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03  ................
ffff880136ffd630: 00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00  ................
XFS (sdb3): metadata I/O error: block 0x1 ("xfs_trans_read_buf_map") error 74 numblks 1

The errors were typically being seen in AGF, AGI and their related
btree block buffers some time after log recovery had run. Often it
wasn't until later subsequent mounts that the problem was
discovered. The common symptom was a buffer with the correct
contents, but a CRC and an LSN that matched an older version of the
contents.

Some debug added to _xfs_buf_ioapply() indicated that buffers were
being written without verifiers attached to them from log recovery,
and Jan Kara isolated the cause to log recovery readahead an dit's
interactions with buffers that had a more recent LSN on disk than
the transaction being recovered. In this case, the buffer did not
get a verifier attached, and os when the second phase of log
recovery ran and recovered EFIs and unlinked inodes, the buffers
were modified and written without the verifier running. Hence they
had up to date contents, but stale LSNs and CRCs.

Fix it by attaching verifiers to buffers we skip due to future LSN
values so they don't escape into the buffer cache without the
correct verifier attached.

This patch is based on analysis and a patch from Jan Kara.

cc: <stable@vger.kernel.org>
Reported-by: Jan Kara <jack@suse.cz>
Reported-by: Fanael Linithien <fanael4@gmail.com>
Reported-by: Grozdan <neutrino8@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_log_recover.c | 47 +++++++++++++++++++++++++++++++----------------
 1 file changed, 31 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index fbc2362..0a015cc 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2126,6 +2126,17 @@ xlog_recover_validate_buf_type(
 	__uint16_t		magic16;
 	__uint16_t		magicda;
 
+	/*
+	 * We can only do post recovery validation on items on CRC enabled
+	 * fielsystems as we need to know when the buffer was written to be able
+	 * to determine if we should have replayed the item. If we replay old
+	 * metadata over a newer buffer, then it will enter a temporarily
+	 * inconsistent state resulting in verification failures. Hence for now
+	 * just avoid the verification stage for non-crc filesystems
+	 */
+	if (!xfs_sb_version_hascrc(&mp->m_sb))
+		return;
+
 	magic32 = be32_to_cpu(*(__be32 *)bp->b_addr);
 	magic16 = be16_to_cpu(*(__be16*)bp->b_addr);
 	magicda = be16_to_cpu(info->magic);
@@ -2197,10 +2208,6 @@ xlog_recover_validate_buf_type(
 #endif
 		break;
 	case XFS_BLFT_DINO_BUF:
-		/*
-		 * we get here with inode allocation buffers, not buffers that
-		 * track unlinked list changes.
-		 */
 		if (magic16 != XFS_DINODE_MAGIC) {
 			xfs_warn(mp, "Bad INODE block magic!");
 			ASSERT(0);
@@ -2388,16 +2395,7 @@ xlog_recover_do_reg_buffer(
 	/* Shouldn't be any more regions */
 	ASSERT(i == item->ri_total);
 
-	/*
-	 * We can only do post recovery validation on items on CRC enabled
-	 * fielsystems as we need to know when the buffer was written to be able
-	 * to determine if we should have replayed the item. If we replay old
-	 * metadata over a newer buffer, then it will enter a temporarily
-	 * inconsistent state resulting in verification failures. Hence for now
-	 * just avoid the verification stage for non-crc filesystems
-	 */
-	if (xfs_sb_version_hascrc(&mp->m_sb))
-		xlog_recover_validate_buf_type(mp, bp, buf_f);
+	xlog_recover_validate_buf_type(mp, bp, buf_f);
 }
 
 /*
@@ -2505,12 +2503,29 @@ xlog_recover_buffer_pass2(
 	}
 
 	/*
-	 * recover the buffer only if we get an LSN from it and it's less than
+	 * Recover the buffer only if we get an LSN from it and it's less than
 	 * the lsn of the transaction we are replaying.
+	 *
+	 * Note that we have to be extremely careful of readahead here.
+	 * Readahead does not attach verfiers to the buffers so if we don't
+	 * actually do any replay after readahead because of the LSN we found
+	 * in the buffer if more recent than that current transaction then we
+	 * need to attach the verifier directly. Failure to do so can lead to
+	 * future recovery actions (e.g. EFI and unlinked list recovery) can
+	 * operate on the buffers and they won't get the verifier attached. This
+	 * can lead to blocks on disk having the correct content but a stale
+	 * CRC.
+	 *
+	 * It is safe to assume these clean buffers are currently up to date.
+	 * If the buffer is dirtied by a later transaction being replayed, then
+	 * the verifier will be reset to match whatever recover turns that
+	 * buffer into.
 	 */
 	lsn = xlog_recover_get_buf_lsn(mp, bp);
-	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0)
+	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
+		xlog_recover_validate_buf_type(mp, bp, buf_f);
 		goto out_release;
+	}
 
 	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
 		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
-- 
2.0.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-08-02  3:20 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-31  1:01 [PATCH 0/4 V2] xfs: missing verifier fixes Dave Chinner
2014-07-31  1:01 ` [PATCH 1/4] xfs: catch buffers written without verifiers attached Dave Chinner
2014-07-31 12:27   ` Brian Foster
2014-08-01 14:26   ` Christoph Hellwig
2014-07-31  1:01 ` [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers Dave Chinner
2014-08-01 14:27   ` Christoph Hellwig
2014-07-31  1:01 ` [PATCH 3/4] xfs: quotacheck leaves dquot buffers without verifiers Dave Chinner
2014-08-01 14:28   ` Christoph Hellwig
2014-07-31  1:01 ` [PATCH 4/4] xfs: dquot recovery needs verifiers Dave Chinner
2014-07-31 12:27   ` Brian Foster
2014-08-01 14:30   ` Christoph Hellwig
2014-08-02  3:20     ` Dave Chinner
  -- strict thread matches above, loose matches on Subject: below --
2014-07-30  1:48 [PATCH 0/4] xfs: missing verifer fixes Dave Chinner
2014-07-30  1:48 ` [PATCH 2/4] xfs: ensure verifiers are attached to recovered buffers Dave Chinner
2014-07-30 16:30   ` Brian Foster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.