linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
	Brian Foster <bfoster@redhat.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>
Subject: [PATCH 4.16 12/47] xfs: detect agfl count corruption and reset agfl
Date: Mon,  4 Jun 2018 08:58:24 +0200	[thread overview]
Message-ID: <20180604065550.004615380@linuxfoundation.org> (raw)
In-Reply-To: <20180604065549.468488465@linuxfoundation.org>

4.16-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Brian Foster <bfoster@redhat.com>

commit a27ba2607e60312554cbcd43fc660b2c7f29dc9c upstream.

The struct xfs_agfl v5 header was originally introduced with
unexpected padding that caused the AGFL to operate with one less
slot than intended. The header has since been packed, but the fix
left an incompatibility for users who upgrade from an old kernel
with the unpacked header to a newer kernel with the packed header
while the AGFL happens to wrap around the end. The newer kernel
recognizes one extra slot at the physical end of the AGFL that the
previous kernel did not. The new kernel will eventually attempt to
allocate a block from that slot, which contains invalid data, and
cause a crash.

This condition can be detected by comparing the active range of the
AGFL to the count. While this detects a padding mismatch, it can
also trigger false positives for unrelated flcount corruption. Since
we cannot distinguish a size mismatch due to padding from unrelated
corruption, we can't trust the AGFL enough to simply repopulate the
empty slot.

Instead, avoid unnecessarily complex detection logic and and use a
solution that can handle any form of flcount corruption that slips
through read verifiers: distrust the entire AGFL and reset it to an
empty state. Any valid blocks within the AGFL are intentionally
leaked. This requires xfs_repair to rectify (which was already
necessary based on the state the AGFL was found in). The reset
mitigates the side effect of the padding mismatch problem from a
filesystem crash to a free space accounting inconsistency. The
generic approach also means that this patch can be safely backported
to kernels with or without a packed struct xfs_agfl.

Check the AGF for an invalid freelist count on initial read from
disk. If detected, set a flag on the xfs_perag to indicate that a
reset is required before the AGFL can be used. In the first
transaction that attempts to use a flagged AGFL, reset it to empty,
warn the user about the inconsistency and allow the freelist fixup
code to repopulate the AGFL with new blocks. The xfs_perag flag is
cleared to eliminate the need for repeated checks on each block
allocation operation.

This allows kernels that include the packing fix commit 96f859d52bcb
("libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct")
to handle older unpacked AGFL formats without a filesystem crash.

Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by Dave Chiluk <chiluk+linuxxfs@indeed.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/xfs/libxfs/xfs_alloc.c |   94 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_mount.h        |    1 
 fs/xfs/xfs_trace.h        |    9 +++-
 3 files changed, 103 insertions(+), 1 deletion(-)

--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2071,6 +2071,93 @@ xfs_alloc_space_available(
 }
 
 /*
+ * Check the agfl fields of the agf for inconsistency or corruption. The purpose
+ * is to detect an agfl header padding mismatch between current and early v5
+ * kernels. This problem manifests as a 1-slot size difference between the
+ * on-disk flcount and the active [first, last] range of a wrapped agfl. This
+ * may also catch variants of agfl count corruption unrelated to padding. Either
+ * way, we'll reset the agfl and warn the user.
+ *
+ * Return true if a reset is required before the agfl can be used, false
+ * otherwise.
+ */
+static bool
+xfs_agfl_needs_reset(
+	struct xfs_mount	*mp,
+	struct xfs_agf		*agf)
+{
+	uint32_t		f = be32_to_cpu(agf->agf_flfirst);
+	uint32_t		l = be32_to_cpu(agf->agf_fllast);
+	uint32_t		c = be32_to_cpu(agf->agf_flcount);
+	int			agfl_size = xfs_agfl_size(mp);
+	int			active;
+
+	/* no agfl header on v4 supers */
+	if (!xfs_sb_version_hascrc(&mp->m_sb))
+		return false;
+
+	/*
+	 * The agf read verifier catches severe corruption of these fields.
+	 * Repeat some sanity checks to cover a packed -> unpacked mismatch if
+	 * the verifier allows it.
+	 */
+	if (f >= agfl_size || l >= agfl_size)
+		return true;
+	if (c > agfl_size)
+		return true;
+
+	/*
+	 * Check consistency between the on-disk count and the active range. An
+	 * agfl padding mismatch manifests as an inconsistent flcount.
+	 */
+	if (c && l >= f)
+		active = l - f + 1;
+	else if (c)
+		active = agfl_size - f + l + 1;
+	else
+		active = 0;
+
+	return active != c;
+}
+
+/*
+ * Reset the agfl to an empty state. Ignore/drop any existing blocks since the
+ * agfl content cannot be trusted. Warn the user that a repair is required to
+ * recover leaked blocks.
+ *
+ * The purpose of this mechanism is to handle filesystems affected by the agfl
+ * header padding mismatch problem. A reset keeps the filesystem online with a
+ * relatively minor free space accounting inconsistency rather than suffer the
+ * inevitable crash from use of an invalid agfl block.
+ */
+static void
+xfs_agfl_reset(
+	struct xfs_trans	*tp,
+	struct xfs_buf		*agbp,
+	struct xfs_perag	*pag)
+{
+	struct xfs_mount	*mp = tp->t_mountp;
+	struct xfs_agf		*agf = XFS_BUF_TO_AGF(agbp);
+
+	ASSERT(pag->pagf_agflreset);
+	trace_xfs_agfl_reset(mp, agf, 0, _RET_IP_);
+
+	xfs_warn(mp,
+	       "WARNING: Reset corrupted AGFL on AG %u. %d blocks leaked. "
+	       "Please unmount and run xfs_repair.",
+	         pag->pag_agno, pag->pagf_flcount);
+
+	agf->agf_flfirst = 0;
+	agf->agf_fllast = cpu_to_be32(xfs_agfl_size(mp) - 1);
+	agf->agf_flcount = 0;
+	xfs_alloc_log_agf(tp, agbp, XFS_AGF_FLFIRST | XFS_AGF_FLLAST |
+				    XFS_AGF_FLCOUNT);
+
+	pag->pagf_flcount = 0;
+	pag->pagf_agflreset = false;
+}
+
+/*
  * Decide whether to use this allocation group for this allocation.
  * If so, fix up the btree freelist's size.
  */
@@ -2131,6 +2218,10 @@ xfs_alloc_fix_freelist(
 		}
 	}
 
+	/* reset a padding mismatched agfl before final free space check */
+	if (pag->pagf_agflreset)
+		xfs_agfl_reset(tp, agbp, pag);
+
 	/* If there isn't enough total space or single-extent, reject it. */
 	need = xfs_alloc_min_freelist(mp, pag);
 	if (!xfs_alloc_space_available(args, need, flags))
@@ -2287,6 +2378,7 @@ xfs_alloc_get_freelist(
 		agf->agf_flfirst = 0;
 
 	pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno));
+	ASSERT(!pag->pagf_agflreset);
 	be32_add_cpu(&agf->agf_flcount, -1);
 	xfs_trans_agflist_delta(tp, -1);
 	pag->pagf_flcount--;
@@ -2398,6 +2490,7 @@ xfs_alloc_put_freelist(
 		agf->agf_fllast = 0;
 
 	pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno));
+	ASSERT(!pag->pagf_agflreset);
 	be32_add_cpu(&agf->agf_flcount, 1);
 	xfs_trans_agflist_delta(tp, 1);
 	pag->pagf_flcount++;
@@ -2605,6 +2698,7 @@ xfs_alloc_read_agf(
 		pag->pagb_count = 0;
 		pag->pagb_tree = RB_ROOT;
 		pag->pagf_init = 1;
+		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
 	}
 #ifdef DEBUG
 	else if (!XFS_FORCED_SHUTDOWN(mp)) {
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -353,6 +353,7 @@ typedef struct xfs_perag {
 	char		pagi_inodeok;	/* The agi is ok for inodes */
 	uint8_t		pagf_levels[XFS_BTNUM_AGF];
 					/* # of levels in bno & cnt btree */
+	bool		pagf_agflreset; /* agfl requires reset before use */
 	uint32_t	pagf_flcount;	/* count of blocks in freelist */
 	xfs_extlen_t	pagf_freeblks;	/* total free blocks */
 	xfs_extlen_t	pagf_longest;	/* longest free space */
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1477,7 +1477,7 @@ TRACE_EVENT(xfs_extent_busy_trim,
 		  __entry->tlen)
 );
 
-TRACE_EVENT(xfs_agf,
+DECLARE_EVENT_CLASS(xfs_agf_class,
 	TP_PROTO(struct xfs_mount *mp, struct xfs_agf *agf, int flags,
 		 unsigned long caller_ip),
 	TP_ARGS(mp, agf, flags, caller_ip),
@@ -1533,6 +1533,13 @@ TRACE_EVENT(xfs_agf,
 		  __entry->longest,
 		  (void *)__entry->caller_ip)
 );
+#define DEFINE_AGF_EVENT(name) \
+DEFINE_EVENT(xfs_agf_class, name, \
+	TP_PROTO(struct xfs_mount *mp, struct xfs_agf *agf, int flags, \
+		 unsigned long caller_ip), \
+	TP_ARGS(mp, agf, flags, caller_ip))
+DEFINE_AGF_EVENT(xfs_agf);
+DEFINE_AGF_EVENT(xfs_agfl_reset);
 
 TRACE_EVENT(xfs_free_extent,
 	TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, xfs_agblock_t agbno,

  parent reply	other threads:[~2018-06-04  7:03 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-04  6:58 [PATCH 4.16 00/47] 4.16.14-stable review Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 01/47] objtool: Support GCC 8s cold subfunctions Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 02/47] objtool: Support GCC 8 switch tables Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 03/47] objtool: Detect RIP-relative switch table references Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 04/47] objtool: Detect RIP-relative switch table references, part 2 Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 05/47] objtool: Fix "noreturn" detection for recursive sibling calls Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 06/47] x86/mce/AMD: Carve out SMCA get_block_address() code Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 07/47] x86/MCE/AMD: Cache SMCA MISC block addresses Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 08/47] drm/vmwgfx: Use kasprintf Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 09/47] drm/vmwgfx: Fix host logging / guestinfo reading error paths Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 10/47] Revert "pinctrl: msm: Use dynamic GPIO numbering" Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 11/47] xfs: convert XFS_AGFL_SIZE to a helper function Greg Kroah-Hartman
2018-06-04  6:58 ` Greg Kroah-Hartman [this message]
2018-06-04  6:58 ` [PATCH 4.16 13/47] Input: synaptics - Lenovo Carbon X1 Gen5 (2017) devices should use RMI Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 14/47] Input: synaptics - Lenovo Thinkpad X1 Carbon G5 (2017) with Elantech trackpoints " Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 15/47] Input: synaptics - add Intertouch support on X1 Carbon 6th and X280 Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 16/47] Input: synaptics - add Lenovo 80 series ids to SMBus Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 17/47] Input: elan_i2c_smbus - fix corrupted stack Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 18/47] tracing: Fix crash when freeing instances with event triggers Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 19/47] tracing: Make the snapshot trigger work with instances Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 20/47] nvme: fix extended data LBA supported setting Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 21/47] selinux: KASAN: slab-out-of-bounds in xattr_getsecurity Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 22/47] cfg80211: further limit wiphy names to 64 bytes Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 23/47] drm/amd/powerplay: Fix enum mismatch Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 24/47] rtlwifi: rtl8192cu: Remove variable self-assignment in rf.c Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 25/47] iio: ad7793: implement IIO_CHAN_INFO_SAMP_FREQ Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 26/47] iio: hid-sensor-trigger: Fix sometimes not powering up the sensor after resume Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 27/47] iio:buffer: make length types match kfifo types Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 28/47] iio:kfifo_buf: check for uint overflow Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 29/47] iio: adc: stm32-dfsdm: fix successive oversampling settings Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 30/47] iio: adc: stm32-dfsdm: fix sample rate for div2 spi clock Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 31/47] iio: adc: at91-sama5d2_adc: fix channel configuration for differential channels Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 32/47] iio: adc: select buffer for at91-sama5d2_adc Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 33/47] MIPS: lantiq: gphy: Drop reboot/remove reset asserts Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 34/47] MIPS: ptrace: Fix PTRACE_PEEKUSR requests for 64-bit FGRs Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 35/47] MIPS: prctl: Disallow FRE without FR with PR_SET_FP_MODE requests Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 36/47] scsi: scsi_transport_srp: Fix shost to rport translation Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 37/47] stm class: Use vmalloc for the master map Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 38/47] hwtracing: stm: fix build error on some arches Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 39/47] IB/core: Fix error code for invalid GID entry Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 40/47] mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty() Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 41/47] Revert "rt2800: use TXOP_BACKOFF for probe frames" Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 42/47] intel_th: Use correct device when freeing buffers Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 46/47] fix io_destroy()/aio_complete() race Greg Kroah-Hartman
2018-06-04  6:58 ` [PATCH 4.16 47/47] mm: fix the NULL mapping case in __isolate_lru_page() Greg Kroah-Hartman
2018-06-04 13:20 ` [PATCH 4.16 00/47] 4.16.14-stable review Guenter Roeck
2018-06-05  8:15   ` Greg Kroah-Hartman
2018-06-04 19:47 ` Shuah Khan
2018-06-05  8:15   ` Greg Kroah-Hartman
2018-06-05  6:47 ` Naresh Kamboju
2018-06-05  8:28   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180604065550.004615380@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).