Hi all,

This patchset hoists the code that validates recovered log intent records
into separate functions, and reworks it to use the standard field
validation predicates instead of open-coding the checks. This strengthens
log recovery against (some) fuzzed log items.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything. Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fix-recovered-log-intent-validation-5.11
---
 fs/xfs/xfs_bmap_item.c     |   75 ++++++++++++++++++++++++++++----------------
 fs/xfs/xfs_extfree_item.c  |   31 ++++++++++++------
 fs/xfs/xfs_log_recover.c   |    5 ++-
 fs/xfs/xfs_refcount_item.c |   61 ++++++++++++++++++++++--------------
 fs/xfs/xfs_rmap_item.c     |   75 ++++++++++++++++++++++++++++----------------
 fs/xfs/xfs_trace.h         |   19 +++++++++++
 6 files changed, 178 insertions(+), 88 deletions(-)
From: Darrick J. Wong <darrick.wong@oracle.com> When we recover a bmap intent from the log, we need to validate its contents before we try to replay them. Hoist the checking code into a separate function in preparation to refactor this code to use validation helpers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_bmap_item.c | 71 ++++++++++++++++++++++++++++++------------------ 1 file changed, 44 insertions(+), 27 deletions(-) diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index 9e16a4d0f97c..c90d018fbc2e 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -417,6 +417,49 @@ const struct xfs_defer_op_type xfs_bmap_update_defer_type = { .cancel_item = xfs_bmap_update_cancel_item, }; +/* Is this recovered BUI ok? */ +static inline bool +xfs_bui_validate( + struct xfs_mount *mp, + struct xfs_bui_log_item *buip) +{ + struct xfs_map_extent *bmap; + xfs_fsblock_t startblock_fsb; + xfs_fsblock_t inode_fsb; + + /* Only one mapping operation per BUI... */ + if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) + return false; + + bmap = &buip->bui_format.bui_extents[0]; + startblock_fsb = XFS_BB_TO_FSB(mp, + XFS_FSB_TO_DADDR(mp, bmap->me_startblock)); + inode_fsb = XFS_BB_TO_FSB(mp, XFS_FSB_TO_DADDR(mp, + XFS_INO_TO_FSB(mp, bmap->me_owner))); + + if (bmap->me_flags & ~XFS_BMAP_EXTENT_FLAGS) + return false; + + switch (bmap->me_flags & XFS_BMAP_EXTENT_TYPE_MASK) { + case XFS_BMAP_MAP: + case XFS_BMAP_UNMAP: + break; + default: + return false; + } + + if (startblock_fsb == 0 || + bmap->me_len == 0 || + inode_fsb == 0 || + startblock_fsb >= mp->m_sb.sb_dblocks || + bmap->me_len >= mp->m_sb.sb_agblocks || + inode_fsb >= mp->m_sb.sb_dblocks || + (bmap->me_flags & ~XFS_BMAP_EXTENT_FLAGS)) + return false; + + return true; +} + /* * Process a bmap update intent item that was recovered from the log. * We need to update some inode's bmbt. 
@@ -433,47 +476,21 @@ xfs_bui_item_recover( struct xfs_mount *mp = lip->li_mountp; struct xfs_map_extent *bmap; struct xfs_bud_log_item *budp; - xfs_fsblock_t startblock_fsb; - xfs_fsblock_t inode_fsb; xfs_filblks_t count; xfs_exntst_t state; unsigned int bui_type; int whichfork; int error = 0; - /* Only one mapping operation per BUI... */ - if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) + if (!xfs_bui_validate(mp, buip)) return -EFSCORRUPTED; - /* - * First check the validity of the extent described by the - * BUI. If anything is bad, then toss the BUI. - */ bmap = &buip->bui_format.bui_extents[0]; - startblock_fsb = XFS_BB_TO_FSB(mp, - XFS_FSB_TO_DADDR(mp, bmap->me_startblock)); - inode_fsb = XFS_BB_TO_FSB(mp, XFS_FSB_TO_DADDR(mp, - XFS_INO_TO_FSB(mp, bmap->me_owner))); state = (bmap->me_flags & XFS_BMAP_EXTENT_UNWRITTEN) ? XFS_EXT_UNWRITTEN : XFS_EXT_NORM; whichfork = (bmap->me_flags & XFS_BMAP_EXTENT_ATTR_FORK) ? XFS_ATTR_FORK : XFS_DATA_FORK; bui_type = bmap->me_flags & XFS_BMAP_EXTENT_TYPE_MASK; - switch (bui_type) { - case XFS_BMAP_MAP: - case XFS_BMAP_UNMAP: - break; - default: - return -EFSCORRUPTED; - } - if (startblock_fsb == 0 || - bmap->me_len == 0 || - inode_fsb == 0 || - startblock_fsb >= mp->m_sb.sb_dblocks || - bmap->me_len >= mp->m_sb.sb_agblocks || - inode_fsb >= mp->m_sb.sb_dblocks || - (bmap->me_flags & ~XFS_BMAP_EXTENT_FLAGS)) - return -EFSCORRUPTED; /* Grab the inode. */ error = xfs_iget(mp, NULL, bmap->me_owner, 0, 0, &ip);
From: Darrick J. Wong <darrick.wong@oracle.com> The code that validates recovered bmap intent items is kind of a mess -- it doesn't use the standard xfs type validators, and it doesn't check for things that it should. Fix the validator function to use the standard validation helpers and look for more types of obvious errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_bmap_item.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index c90d018fbc2e..19f89a6b65a1 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -424,18 +424,13 @@ xfs_bui_validate( struct xfs_bui_log_item *buip) { struct xfs_map_extent *bmap; - xfs_fsblock_t startblock_fsb; - xfs_fsblock_t inode_fsb; + xfs_fsblock_t end; /* Only one mapping operation per BUI... */ if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) return false; bmap = &buip->bui_format.bui_extents[0]; - startblock_fsb = XFS_BB_TO_FSB(mp, - XFS_FSB_TO_DADDR(mp, bmap->me_startblock)); - inode_fsb = XFS_BB_TO_FSB(mp, XFS_FSB_TO_DADDR(mp, - XFS_INO_TO_FSB(mp, bmap->me_owner))); if (bmap->me_flags & ~XFS_BMAP_EXTENT_FLAGS) return false; @@ -448,13 +443,18 @@ xfs_bui_validate( return false; } - if (startblock_fsb == 0 || - bmap->me_len == 0 || - inode_fsb == 0 || - startblock_fsb >= mp->m_sb.sb_dblocks || - bmap->me_len >= mp->m_sb.sb_agblocks || - inode_fsb >= mp->m_sb.sb_dblocks || - (bmap->me_flags & ~XFS_BMAP_EXTENT_FLAGS)) + if (!xfs_verify_ino(mp, bmap->me_owner)) + return false; + + if (bmap->me_startoff + bmap->me_len <= bmap->me_startoff) + return false; + + if (bmap->me_startblock + bmap->me_len <= bmap->me_startblock) + return false; + + end = bmap->me_startblock + bmap->me_len - 1; + if (!xfs_verify_fsbno(mp, bmap->me_startblock) || + !xfs_verify_fsbno(mp, end)) return false; return true;
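The overflow and range checks that this patch adds to xfs_bui_validate() can be sketched outside the kernel as follows. This is only a minimal sketch under stated assumptions: struct demo_fs, demo_verify_block() and demo_extent_ok() are hypothetical names, and demo_verify_block() is a much simplified stand-in for xfs_verify_fsbno(), which checks more than a plain upper bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the filesystem geometry; not struct xfs_mount. */
struct demo_fs {
	uint64_t	nr_blocks;	/* total blocks in the filesystem */
};

/* Simplified stand-in for xfs_verify_fsbno(): nonzero and below the size. */
static bool
demo_verify_block(
	const struct demo_fs	*fs,
	uint64_t		bno)
{
	return bno > 0 && bno < fs->nr_blocks;
}

/* Is the extent [start, start + len) sane for this filesystem? */
static bool
demo_extent_ok(
	const struct demo_fs	*fs,
	uint64_t		start,
	uint32_t		len)
{
	/* Catches len == 0 and any start + len that wraps around. */
	if (start + len <= start)
		return false;

	if (!demo_verify_block(fs, start))
		return false;

	/* The last block of the extent must be in range too. */
	return demo_verify_block(fs, start + len - 1);
}
```

The single comparison start + len <= start covers both the zero-length case and unsigned wraparound, which is why the patch no longer needs a separate me_len == 0 test.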
From: Darrick J. Wong <darrick.wong@oracle.com> When we recover a rmap intent from the log, we need to validate its contents before we try to replay them. Hoist the checking code into a separate function in preparation to refactor this code to use validation helpers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_rmap_item.c | 65 ++++++++++++++++++++++++++++-------------------- 1 file changed, 38 insertions(+), 27 deletions(-) diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index 7adc996ca6e3..871ed7fc43ee 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -460,6 +460,42 @@ const struct xfs_defer_op_type xfs_rmap_update_defer_type = { .cancel_item = xfs_rmap_update_cancel_item, }; +/* Is this recovered RUI ok? */ +static inline bool +xfs_rui_validate_map( + struct xfs_mount *mp, + struct xfs_map_extent *rmap) +{ + xfs_fsblock_t startblock_fsb; + bool op_ok; + + startblock_fsb = XFS_BB_TO_FSB(mp, + XFS_FSB_TO_DADDR(mp, rmap->me_startblock)); + switch (rmap->me_flags & XFS_RMAP_EXTENT_TYPE_MASK) { + case XFS_RMAP_EXTENT_MAP: + case XFS_RMAP_EXTENT_MAP_SHARED: + case XFS_RMAP_EXTENT_UNMAP: + case XFS_RMAP_EXTENT_UNMAP_SHARED: + case XFS_RMAP_EXTENT_CONVERT: + case XFS_RMAP_EXTENT_CONVERT_SHARED: + case XFS_RMAP_EXTENT_ALLOC: + case XFS_RMAP_EXTENT_FREE: + op_ok = true; + break; + default: + op_ok = false; + break; + } + if (!op_ok || startblock_fsb == 0 || + rmap->me_len == 0 || + startblock_fsb >= mp->m_sb.sb_dblocks || + rmap->me_len >= mp->m_sb.sb_agblocks || + (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS)) + return false; + + return true; +} + /* * Process an rmap update intent item that was recovered from the log. * We need to update the rmapbt. 
@@ -475,10 +511,8 @@ xfs_rui_item_recover( struct xfs_trans *tp; struct xfs_btree_cur *rcur = NULL; struct xfs_mount *mp = lip->li_mountp; - xfs_fsblock_t startblock_fsb; enum xfs_rmap_intent_type type; xfs_exntst_t state; - bool op_ok; int i; int whichfork; int error = 0; @@ -488,32 +522,9 @@ xfs_rui_item_recover( * RUI. If any are bad, then assume that all are bad and * just toss the RUI. */ - for (i = 0; i < ruip->rui_format.rui_nextents; i++) { - rmap = &ruip->rui_format.rui_extents[i]; - startblock_fsb = XFS_BB_TO_FSB(mp, - XFS_FSB_TO_DADDR(mp, rmap->me_startblock)); - switch (rmap->me_flags & XFS_RMAP_EXTENT_TYPE_MASK) { - case XFS_RMAP_EXTENT_MAP: - case XFS_RMAP_EXTENT_MAP_SHARED: - case XFS_RMAP_EXTENT_UNMAP: - case XFS_RMAP_EXTENT_UNMAP_SHARED: - case XFS_RMAP_EXTENT_CONVERT: - case XFS_RMAP_EXTENT_CONVERT_SHARED: - case XFS_RMAP_EXTENT_ALLOC: - case XFS_RMAP_EXTENT_FREE: - op_ok = true; - break; - default: - op_ok = false; - break; - } - if (!op_ok || startblock_fsb == 0 || - rmap->me_len == 0 || - startblock_fsb >= mp->m_sb.sb_dblocks || - rmap->me_len >= mp->m_sb.sb_agblocks || - (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS)) + for (i = 0; i < ruip->rui_format.rui_nextents; i++) + if (!xfs_rui_validate_map(mp, &ruip->rui_format.rui_extents[i])) return -EFSCORRUPTED; - } error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, mp->m_rmap_maxlevels, 0, XFS_TRANS_RESERVE, &tp);
From: Darrick J. Wong <darrick.wong@oracle.com> The code that validates recovered rmap intent items is kind of a mess -- it doesn't use the standard xfs type validators, and it doesn't check for things that it should. Fix the validator function to use the standard validation helpers and look for more types of obvious errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_rmap_item.c | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index 871ed7fc43ee..2779cbee8fa8 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -466,11 +466,11 @@ xfs_rui_validate_map( struct xfs_mount *mp, struct xfs_map_extent *rmap) { - xfs_fsblock_t startblock_fsb; - bool op_ok; + xfs_fsblock_t end; + + if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS) + return false; - startblock_fsb = XFS_BB_TO_FSB(mp, - XFS_FSB_TO_DADDR(mp, rmap->me_startblock)); switch (rmap->me_flags & XFS_RMAP_EXTENT_TYPE_MASK) { case XFS_RMAP_EXTENT_MAP: case XFS_RMAP_EXTENT_MAP_SHARED: @@ -480,17 +480,24 @@ xfs_rui_validate_map( case XFS_RMAP_EXTENT_CONVERT_SHARED: case XFS_RMAP_EXTENT_ALLOC: case XFS_RMAP_EXTENT_FREE: - op_ok = true; break; default: - op_ok = false; - break; + return false; } - if (!op_ok || startblock_fsb == 0 || - rmap->me_len == 0 || - startblock_fsb >= mp->m_sb.sb_dblocks || - rmap->me_len >= mp->m_sb.sb_agblocks || - (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS)) + + if (!xfs_verify_ino(mp, rmap->me_owner) && + !XFS_RMAP_NON_INODE_OWNER(rmap->me_owner)) + return false; + + if (rmap->me_startoff + rmap->me_len <= rmap->me_startoff) + return false; + + if (rmap->me_startblock + rmap->me_len <= rmap->me_startblock) + return false; + + end = rmap->me_startblock + rmap->me_len - 1; + if (!xfs_verify_fsbno(mp, rmap->me_startblock) || + !xfs_verify_fsbno(mp, end)) return false; return true;
From: Darrick J. Wong <darrick.wong@oracle.com> When we recover a refcount intent from the log, we need to validate its contents before we try to replay them. Hoist the checking code into a separate function in preparation to refactor this code to use validation helpers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_refcount_item.c | 58 +++++++++++++++++++++++++++----------------- 1 file changed, 35 insertions(+), 23 deletions(-) diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index 7529eb63ce94..de344bd7e73c 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -417,6 +417,38 @@ const struct xfs_defer_op_type xfs_refcount_update_defer_type = { .cancel_item = xfs_refcount_update_cancel_item, }; +/* Is this recovered CUI ok? */ +static inline bool +xfs_cui_validate_phys( + struct xfs_mount *mp, + struct xfs_phys_extent *refc) +{ + xfs_fsblock_t startblock_fsb; + bool op_ok; + + startblock_fsb = XFS_BB_TO_FSB(mp, + XFS_FSB_TO_DADDR(mp, refc->pe_startblock)); + switch (refc->pe_flags & XFS_REFCOUNT_EXTENT_TYPE_MASK) { + case XFS_REFCOUNT_INCREASE: + case XFS_REFCOUNT_DECREASE: + case XFS_REFCOUNT_ALLOC_COW: + case XFS_REFCOUNT_FREE_COW: + op_ok = true; + break; + default: + op_ok = false; + break; + } + if (!op_ok || startblock_fsb == 0 || + refc->pe_len == 0 || + startblock_fsb >= mp->m_sb.sb_dblocks || + refc->pe_len >= mp->m_sb.sb_agblocks || + (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS)) + return false; + + return true; +} + /* * Process a refcount update intent item that was recovered from the log. * We need to update the refcountbt. 
@@ -433,11 +465,9 @@ xfs_cui_item_recover( struct xfs_trans *tp; struct xfs_btree_cur *rcur = NULL; struct xfs_mount *mp = lip->li_mountp; - xfs_fsblock_t startblock_fsb; xfs_fsblock_t new_fsb; xfs_extlen_t new_len; unsigned int refc_type; - bool op_ok; bool requeue_only = false; enum xfs_refcount_intent_type type; int i; @@ -448,28 +478,10 @@ xfs_cui_item_recover( * CUI. If any are bad, then assume that all are bad and * just toss the CUI. */ - for (i = 0; i < cuip->cui_format.cui_nextents; i++) { - refc = &cuip->cui_format.cui_extents[i]; - startblock_fsb = XFS_BB_TO_FSB(mp, - XFS_FSB_TO_DADDR(mp, refc->pe_startblock)); - switch (refc->pe_flags & XFS_REFCOUNT_EXTENT_TYPE_MASK) { - case XFS_REFCOUNT_INCREASE: - case XFS_REFCOUNT_DECREASE: - case XFS_REFCOUNT_ALLOC_COW: - case XFS_REFCOUNT_FREE_COW: - op_ok = true; - break; - default: - op_ok = false; - break; - } - if (!op_ok || startblock_fsb == 0 || - refc->pe_len == 0 || - startblock_fsb >= mp->m_sb.sb_dblocks || - refc->pe_len >= mp->m_sb.sb_agblocks || - (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS)) + for (i = 0; i < cuip->cui_format.cui_nextents; i++) + if (!xfs_cui_validate_phys(mp, + &cuip->cui_format.cui_extents[i])) return -EFSCORRUPTED; - } /* * Under normal operation, refcount updates are deferred, so we
From: Darrick J. Wong <darrick.wong@oracle.com> The code that validates recovered refcount intent items is kind of a mess -- it doesn't use the standard xfs type validators, and it doesn't check for things that it should. Fix the validator function to use the standard validation helpers and look for more types of obvious errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_refcount_item.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index de344bd7e73c..20e5c22bb754 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -423,27 +423,27 @@ xfs_cui_validate_phys( struct xfs_mount *mp, struct xfs_phys_extent *refc) { - xfs_fsblock_t startblock_fsb; - bool op_ok; + xfs_fsblock_t end; + + if (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS) + return false; - startblock_fsb = XFS_BB_TO_FSB(mp, - XFS_FSB_TO_DADDR(mp, refc->pe_startblock)); switch (refc->pe_flags & XFS_REFCOUNT_EXTENT_TYPE_MASK) { case XFS_REFCOUNT_INCREASE: case XFS_REFCOUNT_DECREASE: case XFS_REFCOUNT_ALLOC_COW: case XFS_REFCOUNT_FREE_COW: - op_ok = true; break; default: - op_ok = false; - break; + return false; } - if (!op_ok || startblock_fsb == 0 || - refc->pe_len == 0 || - startblock_fsb >= mp->m_sb.sb_dblocks || - refc->pe_len >= mp->m_sb.sb_agblocks || - (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS)) + + if (refc->pe_startblock + refc->pe_len <= refc->pe_startblock) + return false; + + end = refc->pe_startblock + refc->pe_len - 1; + if (!xfs_verify_fsbno(mp, refc->pe_startblock) || + !xfs_verify_fsbno(mp, end)) return false; return true;
From: Darrick J. Wong <darrick.wong@oracle.com>

When we recover an extent-free intent from the log, we need to validate
its contents before we try to replay them. Hoist the checking code into
a separate function in preparation to refactor this code to use
validation helpers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_extfree_item.c |   31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 6c11bfc3d452..b5710b7fc263 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -578,6 +578,25 @@ const struct xfs_defer_op_type xfs_agfl_free_defer_type = {
 	.cancel_item	= xfs_extent_free_cancel_item,
 };
 
+/* Is this recovered EFI ok? */
+static inline bool
+xfs_efi_validate_ext(
+	struct xfs_mount		*mp,
+	struct xfs_extent		*extp)
+{
+	xfs_fsblock_t			startblock_fsb;
+
+	startblock_fsb = XFS_BB_TO_FSB(mp,
+			   XFS_FSB_TO_DADDR(mp, extp->ext_start));
+	if (startblock_fsb == 0 ||
+	    extp->ext_len == 0 ||
+	    startblock_fsb >= mp->m_sb.sb_dblocks ||
+	    extp->ext_len >= mp->m_sb.sb_agblocks)
+		return false;
+
+	return true;
+}
+
 /*
  * Process an extent free intent item that was recovered from
  * the log. We need to free the extents that it describes.
@@ -592,7 +611,6 @@ xfs_efi_item_recover(
 	struct xfs_efd_log_item		*efdp;
 	struct xfs_trans		*tp;
 	struct xfs_extent		*extp;
-	xfs_fsblock_t			startblock_fsb;
 	int				i;
 	int				error = 0;
 
@@ -601,16 +619,9 @@ xfs_efi_item_recover(
 	 * EFI. If any are bad, then assume that all are bad and
 	 * just toss the EFI.
 	 */
-	for (i = 0; i < efip->efi_format.efi_nextents; i++) {
-		extp = &efip->efi_format.efi_extents[i];
-		startblock_fsb = XFS_BB_TO_FSB(mp,
-				   XFS_FSB_TO_DADDR(mp, extp->ext_start));
-		if (startblock_fsb == 0 ||
-		    extp->ext_len == 0 ||
-		    startblock_fsb >= mp->m_sb.sb_dblocks ||
-		    extp->ext_len >= mp->m_sb.sb_agblocks)
+	for (i = 0; i < efip->efi_format.efi_nextents; i++)
+		if (!xfs_efi_validate_ext(mp, &efip->efi_format.efi_extents[i]))
 			return -EFSCORRUPTED;
-	}
 
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
 	if (error)
From: Darrick J. Wong <darrick.wong@oracle.com> The code that validates recovered extent-free intent items is kind of a mess -- it doesn't use the standard xfs type validators, and it doesn't check for things that it should. Fix the validator function to use the standard validation helpers and look for more types of obvious errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_extfree_item.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c index b5710b7fc263..b1c004016709 100644 --- a/fs/xfs/xfs_extfree_item.c +++ b/fs/xfs/xfs_extfree_item.c @@ -584,14 +584,14 @@ xfs_efi_validate_ext( struct xfs_mount *mp, struct xfs_extent *extp) { - xfs_fsblock_t startblock_fsb; + xfs_fsblock_t end; - startblock_fsb = XFS_BB_TO_FSB(mp, - XFS_FSB_TO_DADDR(mp, extp->ext_start)); - if (startblock_fsb == 0 || - extp->ext_len == 0 || - startblock_fsb >= mp->m_sb.sb_dblocks || - extp->ext_len >= mp->m_sb.sb_agblocks) + if (extp->ext_start + extp->ext_len <= extp->ext_start) + return false; + + end = extp->ext_start + extp->ext_len - 1; + if (!xfs_verify_fsbno(mp, extp->ext_start) || + !xfs_verify_fsbno(mp, end)) return false; return true;
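The recovery functions in this series all follow the same shape: check every extent in the intent first, and refuse to replay anything if even one is bad. A rough userspace sketch of that pattern is below. Everything in it is hypothetical (struct demo_extent, demo_extent_valid(), demo_recover(), and the DEMO_ECORRUPT code standing in for EFSCORRUPTED), and the "replay" step is just a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ECORRUPT	1	/* hypothetical stand-in for EFSCORRUPTED */

/* Hypothetical extent record; loosely modelled on struct xfs_extent. */
struct demo_extent {
	uint64_t	start;
	uint32_t	len;
};

/* Simplified check: no zero length, no wraparound, both ends in range. */
static bool
demo_extent_valid(
	const struct demo_extent	*ext,
	uint64_t			nr_blocks)
{
	uint64_t			end;

	if (ext->start + ext->len <= ext->start)
		return false;

	end = ext->start + ext->len - 1;
	return ext->start > 0 && end < nr_blocks;
}

/*
 * Validate every extent before touching any of them. If one is bad,
 * assume they all are and replay nothing.
 */
static int
demo_recover(
	const struct demo_extent	*exts,
	size_t				nr,
	uint64_t			nr_blocks,
	size_t				*replayed)
{
	size_t				i;

	*replayed = 0;
	for (i = 0; i < nr; i++)
		if (!demo_extent_valid(&exts[i], nr_blocks))
			return -DEMO_ECORRUPT;

	for (i = 0; i < nr; i++)
		(*replayed)++;	/* placeholder for the real replay work */

	return 0;
}
```

Doing the validation as a separate pass means a corrupt item is rejected before any state has been modified, so there is nothing to unwind.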
From: Darrick J. Wong <darrick.wong@oracle.com>

The bmap, rmap, and refcount log intent items were added to support the
rmap and reflink features. Because these features come with changes to
the ondisk format, the log items aren't tied to a log incompat flag.

However, the log recovery routines don't actually check for those
feature flags. The kernel has no business replaying an intent item for a
feature that isn't enabled, so check that as part of recovered log item
validation. (Note that kernels pre-dating rmap and reflink will fail
the mount on the unknown log item type code.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_item.c     |    4 ++++
 fs/xfs/xfs_refcount_item.c |    3 +++
 fs/xfs/xfs_rmap_item.c     |    3 +++
 3 files changed, 10 insertions(+)

diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 19f89a6b65a1..f36005c999b2 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -426,6 +426,10 @@ xfs_bui_validate(
 	struct xfs_map_extent		*bmap;
 	xfs_fsblock_t			end;
 
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb) &&
+	    !xfs_sb_version_hasreflink(&mp->m_sb))
+		return false;
+
 	/* Only one mapping operation per BUI... */
 	if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS)
 		return false;
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 20e5c22bb754..2017108b37a1 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -425,6 +425,9 @@ xfs_cui_validate_phys(
 {
 	xfs_fsblock_t			end;
 
+	if (!xfs_sb_version_hasreflink(&mp->m_sb))
+		return false;
+
 	if (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS)
 		return false;
 
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 2779cbee8fa8..13871882ffb6 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -468,6 +468,9 @@ xfs_rui_validate_map(
 {
 	xfs_fsblock_t			end;
 
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return false;
+
 	if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS)
 		return false;
 
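The feature gating in this patch boils down to a small table: which intent types may appear in the log for which feature bits. A hedged sketch of that mapping follows. The DEMO_FEAT_* bits, enum demo_intent and demo_intent_allowed() are all made up for illustration; the real code tests the superblock with xfs_sb_version_hasrmapbt() and xfs_sb_version_hasreflink() inside each validator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; not the real superblock flags. */
#define DEMO_FEAT_RMAPBT	(1u << 0)
#define DEMO_FEAT_REFLINK	(1u << 1)

enum demo_intent {
	DEMO_INTENT_EXTFREE,
	DEMO_INTENT_BMAP,
	DEMO_INTENT_RMAP,
	DEMO_INTENT_REFCOUNT,
};

/* May a recovered intent of this type be replayed on this filesystem? */
static bool
demo_intent_allowed(
	uint32_t		features,
	enum demo_intent	type)
{
	switch (type) {
	case DEMO_INTENT_BMAP:
		/* Used by both features, so either one is enough. */
		return (features & (DEMO_FEAT_RMAPBT | DEMO_FEAT_REFLINK)) != 0;
	case DEMO_INTENT_RMAP:
		return (features & DEMO_FEAT_RMAPBT) != 0;
	case DEMO_INTENT_REFCOUNT:
		return (features & DEMO_FEAT_REFLINK) != 0;
	case DEMO_INTENT_EXTFREE:
		/* Not gated on either feature in this patch. */
		return true;
	}
	return false;
}
```

Note the bmap case: the patch rejects a BUI only when neither rmap nor reflink is enabled, which is the point raised later in the review.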
From: Darrick J. Wong <darrick.wong@oracle.com> Add a trace point so that we can capture when a recovered log intent item fails to recover. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_log_recover.c | 5 ++++- fs/xfs/xfs_trace.h | 19 +++++++++++++++++++ 2 files changed, 23 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 87886b7f77da..ed92c72976c9 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2559,8 +2559,11 @@ xlog_recover_process_intents( spin_unlock(&ailp->ail_lock); error = lip->li_ops->iop_recover(lip, &capture_list); spin_lock(&ailp->ail_lock); - if (error) + if (error) { + trace_xfs_error_return(log->l_mp, error, + lip->li_ops->iop_recover); break; + } } xfs_trans_ail_cursor_done(&cur); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 86951652d3ed..99383b1acd49 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -103,6 +103,25 @@ DEFINE_ATTR_LIST_EVENT(xfs_attr_list_notfound); DEFINE_ATTR_LIST_EVENT(xfs_attr_leaf_list); DEFINE_ATTR_LIST_EVENT(xfs_attr_node_list); +TRACE_EVENT(xfs_error_return, + TP_PROTO(struct xfs_mount *mp, int error, void *caller_ip), + TP_ARGS(mp, error, caller_ip), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(int, error) + __field(void *, caller_ip) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->error = error; + __entry->caller_ip = caller_ip; + ), + TP_printk("dev %d:%d error %d caller %pS", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->error, __entry->caller_ip) + +); + DECLARE_EVENT_CLASS(xfs_perag_class, TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, int refcount, unsigned long caller_ip),
On Mon, Nov 30, 2020 at 07:37:46PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> When we recover a bmap intent from the log, we need to validate its
> contents before we try to replay them. Hoist the checking code into a
> separate function in preparation to refactor this code to use validation
> helpers.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Mon, Nov 30, 2020 at 07:37:52PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> The code that validates recovered bmap intent items is kind of a mess --
> it doesn't use the standard xfs type validators, and it doesn't check
> for things that it should. Fix the validator function to use the
> standard validation helpers and look for more types of obvious errors.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Mon, Nov 30, 2020 at 07:37:58PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> When we recover a rmap intent from the log, we need to validate its
> contents before we try to replay them. Hoist the checking code into a
> separate function in preparation to refactor this code to use validation
> helpers.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Mon, Nov 30, 2020 at 07:38:04PM -0800, Darrick J. Wong wrote:
> +	if (!xfs_verify_ino(mp, rmap->me_owner) &&
> +	    !XFS_RMAP_NON_INODE_OWNER(rmap->me_owner))
> +		return false;

Wouldn't it make sense to reverse the order of the checks here?

> +	end = rmap->me_startblock + rmap->me_len - 1;
> +	if (!xfs_verify_fsbno(mp, rmap->me_startblock) ||
> +	    !xfs_verify_fsbno(mp, end))
>  		return false;

Nit: why not simply:

	if (!xfs_verify_fsbno(mp, rmap->me_startblock))
		return false;
	if (!xfs_verify_fsbno(mp, rmap->me_startblock + rmap->me_len - 1))
		return false;

?
On Mon, Nov 30, 2020 at 07:38:10PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> When we recover a refcount intent from the log, we need to validate its
> contents before we try to replay them. Hoist the checking code into a
> separate function in preparation to refactor this code to use validation
> helpers.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
Looks good (minus the same nitpick as for the other one):

Reviewed-by: Christoph Hellwig <hch@lst.de>
On Mon, Nov 30, 2020 at 07:38:22PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> When we recover an extent-free intent from the log, we need to validate
> its contents before we try to replay them. Hoist the checking code into
> a separate function in preparation to refactor this code to use
> validation helpers.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
Looks good modulo the end nitpick:

Reviewed-by: Christoph Hellwig <hch@lst.de>
On Mon, Nov 30, 2020 at 07:38:34PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> The bmap, rmap, and refcount log intent items were added to support the
> rmap and reflink features. Because these features come with changes to
> the ondisk format, the log items aren't tied to a log incompat flag.
>
> However, the log recovery routines don't actually check for those
> feature flags. The kernel has no business replaying an intent item for a
> feature that isn't enabled, so check that as part of recovered log item
> validation. (Note that kernels pre-dating rmap and reflink will fail
> the mount on the unknown log item type code.)
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Mon, Nov 30, 2020 at 07:38:41PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Add a trace point so that we can capture when a recovered log intent
> item fails to recover.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Dec 01, 2020 at 10:05:35AM +0000, Christoph Hellwig wrote:
> On Mon, Nov 30, 2020 at 07:38:04PM -0800, Darrick J. Wong wrote:
> > +	if (!xfs_verify_ino(mp, rmap->me_owner) &&
> > +	    !XFS_RMAP_NON_INODE_OWNER(rmap->me_owner))
> > +		return false;
>
> Wouldn't it make sense to reverse the order of the checks here?

Yep. Fixed.

> > +	end = rmap->me_startblock + rmap->me_len - 1;
> > +	if (!xfs_verify_fsbno(mp, rmap->me_startblock) ||
> > +	    !xfs_verify_fsbno(mp, end))
> >  		return false;
>
> Nit: why not simply:
>
> 	if (!xfs_verify_fsbno(mp, rmap->me_startblock))
> 		return false;
> 	if (!xfs_verify_fsbno(mp, rmap->me_startblock + rmap->me_len - 1))
> 		return false;
>
> ?

Yeah.

--D
From: Darrick J. Wong <darrick.wong@oracle.com> The code that validates recovered rmap intent items is kind of a mess -- it doesn't use the standard xfs type validators, and it doesn't check for things that it should. Fix the validator function to use the standard validation helpers and look for more types of obvious errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- v2: reverse the owner tests, simplify the startblock checks --- fs/xfs/xfs_rmap_item.c | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index 871ed7fc43ee..36fe368531d2 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -466,11 +466,9 @@ xfs_rui_validate_map( struct xfs_mount *mp, struct xfs_map_extent *rmap) { - xfs_fsblock_t startblock_fsb; - bool op_ok; + if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS) + return false; - startblock_fsb = XFS_BB_TO_FSB(mp, - XFS_FSB_TO_DADDR(mp, rmap->me_startblock)); switch (rmap->me_flags & XFS_RMAP_EXTENT_TYPE_MASK) { case XFS_RMAP_EXTENT_MAP: case XFS_RMAP_EXTENT_MAP_SHARED: @@ -480,17 +478,25 @@ xfs_rui_validate_map( case XFS_RMAP_EXTENT_CONVERT_SHARED: case XFS_RMAP_EXTENT_ALLOC: case XFS_RMAP_EXTENT_FREE: - op_ok = true; break; default: - op_ok = false; - break; + return false; } - if (!op_ok || startblock_fsb == 0 || - rmap->me_len == 0 || - startblock_fsb >= mp->m_sb.sb_dblocks || - rmap->me_len >= mp->m_sb.sb_agblocks || - (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS)) + + if (!XFS_RMAP_NON_INODE_OWNER(rmap->me_owner) && + !xfs_verify_ino(mp, rmap->me_owner)) + return false; + + if (rmap->me_startoff + rmap->me_len <= rmap->me_startoff) + return false; + + if (rmap->me_startblock + rmap->me_len <= rmap->me_startblock) + return false; + + if (!xfs_verify_fsbno(mp, rmap->me_startblock)) + return false; + + if (!xfs_verify_fsbno(mp, rmap->me_startblock + rmap->me_len - 1)) return false; return true;
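The reordered owner test in v2 puts the cheap check first, so special (non-inode) owners never reach the inode verifier. Here is a small sketch of that short-circuit. It assumes, as an illustration only, that special owners are marked by the top bit of the owner value; DEMO_NON_INODE_BIT, demo_non_inode_owner(), demo_verify_ino() and the call counter are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: special owners have the top bit set. */
#define DEMO_NON_INODE_BIT	(1ULL << 63)

/* Counts calls to the (pretend) expensive inode check. */
static unsigned int		demo_ino_checks;

static bool
demo_non_inode_owner(
	uint64_t	owner)
{
	return (owner & DEMO_NON_INODE_BIT) != 0;
}

/* Simplified stand-in for xfs_verify_ino(). */
static bool
demo_verify_ino(
	uint64_t	ino,
	uint64_t	max_ino)
{
	demo_ino_checks++;
	return ino > 0 && ino <= max_ino;
}

/* Owner is fine if it is a special owner or a valid inode number. */
static bool
demo_owner_ok(
	uint64_t	owner,
	uint64_t	max_ino)
{
	/* Cheap bit test first; && skips the inode check when it fails. */
	if (!demo_non_inode_owner(owner) &&
	    !demo_verify_ino(owner, max_ino))
		return false;

	return true;
}
```

Both orderings accept and reject the same owners; the reversal only changes which test runs first.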
From: Darrick J. Wong <darrick.wong@oracle.com>

The bmap, rmap, and refcount log intent items were added to support the
rmap and reflink features. Because these features come with changes to
the ondisk format, the log items aren't tied to a log incompat flag.

However, the log recovery routines don't actually check for those
feature flags. The kernel has no business replaying an intent item for a
feature that isn't enabled, so check that as part of recovered log item
validation. (Note that kernels pre-dating rmap and reflink will fail
the mount on the unknown log item type code.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_bmap_item.c     |    4 ++++
 fs/xfs/xfs_refcount_item.c |    3 +++
 fs/xfs/xfs_rmap_item.c     |    3 +++
 3 files changed, 10 insertions(+)

diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 78346d47564b..4ea9132716c6 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -425,6 +425,10 @@ xfs_bui_validate(
 {
 	struct xfs_map_extent		*bmap;
 
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb) &&
+	    !xfs_sb_version_hasreflink(&mp->m_sb))
+		return false;
+
 	/* Only one mapping operation per BUI... */
 	if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS)
 		return false;
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 8ad6c81f6d8f..2b28f5643c0b 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -423,6 +423,9 @@ xfs_cui_validate_phys(
 	struct xfs_mount		*mp,
 	struct xfs_phys_extent		*refc)
 {
+	if (!xfs_sb_version_hasreflink(&mp->m_sb))
+		return false;
+
 	if (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS)
 		return false;
 
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index f296ec349936..2628bc0080fe 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -466,6 +466,9 @@ xfs_rui_validate_map(
 	struct xfs_mount		*mp,
 	struct xfs_map_extent		*rmap)
 {
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return false;
+
 	if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS)
 		return false;
 
On Thu, Dec 03, 2020 at 05:12:30PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > The bmap, rmap, and refcount log intent items were added to support the > rmap and reflink features. Because these features come with changes to > the ondisk format, the log items aren't tied to a log incompat flag. > > However, the log recovery routines don't actually check for those > feature flags. The kernel has no business replayng an intent item for a > feature that isn't enabled, so check that as part of recovered log item > validation. (Note that kernels pre-dating rmap and reflink will fail > the mount on the unknown log item type code.) > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_bmap_item.c | 4 ++++ > fs/xfs/xfs_refcount_item.c | 3 +++ > fs/xfs/xfs_rmap_item.c | 3 +++ > 3 files changed, 10 insertions(+) > > > diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c > index 78346d47564b..4ea9132716c6 100644 > --- a/fs/xfs/xfs_bmap_item.c > +++ b/fs/xfs/xfs_bmap_item.c > @@ -425,6 +425,10 @@ xfs_bui_validate( > { > struct xfs_map_extent *bmap; > > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb) && > + !xfs_sb_version_hasreflink(&mp->m_sb)) > + return false; > + Took me a minute to realize we use the map/unmap for extent swap if rmap is enabled. That does make me wonder a bit.. had we made this kind of recovery feature validation change before that came around (such that we probably would have only checked _hasreflink() here), would we have created an unnecessary backwards incompatibility? Brian > /* Only one mapping operation per BUI... 
*/ > if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) > return false; > diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c > index 8ad6c81f6d8f..2b28f5643c0b 100644 > --- a/fs/xfs/xfs_refcount_item.c > +++ b/fs/xfs/xfs_refcount_item.c > @@ -423,6 +423,9 @@ xfs_cui_validate_phys( > struct xfs_mount *mp, > struct xfs_phys_extent *refc) > { > + if (!xfs_sb_version_hasreflink(&mp->m_sb)) > + return false; > + > if (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS) > return false; > > diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c > index f296ec349936..2628bc0080fe 100644 > --- a/fs/xfs/xfs_rmap_item.c > +++ b/fs/xfs/xfs_rmap_item.c > @@ -466,6 +466,9 @@ xfs_rui_validate_map( > struct xfs_mount *mp, > struct xfs_map_extent *rmap) > { > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb)) > + return false; > + > if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS) > return false; > >
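To make the compatibility question above concrete, here is a small standalone sketch. It is not kernel code: `struct demo_sb`, `bui_gate_posted()` and `bui_gate_reflink_only()` are hypothetical names invented for illustration, and the two booleans merely stand in for what `xfs_sb_version_hasrmapbt()` and `xfs_sb_version_hasreflink()` report in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two superblock feature bits discussed. */
struct demo_sb {
	bool has_rmapbt;
	bool has_reflink;
};

/* Models the gate in the posted patch: accept a BUI if either feature is on. */
static bool bui_gate_posted(const struct demo_sb *sb)
{
	assert(sb != NULL);
	return sb->has_rmapbt || sb->has_reflink;
}

/* Models the narrower gate one might have written earlier: reflink only. */
static bool bui_gate_reflink_only(const struct demo_sb *sb)
{
	assert(sb != NULL);
	return sb->has_reflink;
}
```

For a filesystem with rmap enabled but reflink disabled, the first gate accepts a recovered BUI and the second rejects it. That rejected case is the extent-swap BUI that the question is about.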
On Fri, Dec 04, 2020 at 09:00:36AM -0500, Brian Foster wrote: > On Thu, Dec 03, 2020 at 05:12:30PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > The bmap, rmap, and refcount log intent items were added to support the > > rmap and reflink features. Because these features come with changes to > > the ondisk format, the log items aren't tied to a log incompat flag. > > > > However, the log recovery routines don't actually check for those > > feature flags. The kernel has no business replaying an intent item for a > > feature that isn't enabled, so check that as part of recovered log item > > validation. (Note that kernels pre-dating rmap and reflink will fail > > the mount on the unknown log item type code.) > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > --- > > fs/xfs/xfs_bmap_item.c | 4 ++++ > > fs/xfs/xfs_refcount_item.c | 3 +++ > > fs/xfs/xfs_rmap_item.c | 3 +++ > > 3 files changed, 10 insertions(+) > > > > > > diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c > > index 78346d47564b..4ea9132716c6 100644 > > --- a/fs/xfs/xfs_bmap_item.c > > +++ b/fs/xfs/xfs_bmap_item.c > > @@ -425,6 +425,10 @@ xfs_bui_validate( > > { > > struct xfs_map_extent *bmap; > > > > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb) && > > + !xfs_sb_version_hasreflink(&mp->m_sb)) > > + return false; > > + > > Took me a minute to realize we use the map/unmap for extent swap if rmap > is enabled. That does make me wonder a bit.. had we made this kind of > recovery feature validation change before that came around (such that we > probably would have only checked _hasreflink() here), would we have > created an unnecessary backwards incompatibility? Yes. I confess to cheating a little here -- technically the bmap intents were introduced with reflink in 4.9, whereas rmap was introduced in 4.8. 
The proper solution is probably to introduce a new log incompat bit for bmap intents when reflink isn't enabled, but TBH there were enough other rmap bugs in 4.8 (not to mention the EXPERIMENTAL warning) that nobody should be running that old of a kernel on a production system. (Also we don't enable rmap by default yet whereas reflink has been enabled by default since 4.18, so the number of people affected probably isn't very high...) Secondary question: should we patch 4.9 and 4.14 to disable rmap and reflink support, since they both still have EXPERIMENTAL warnings? --D > Brian > > > /* Only one mapping operation per BUI... */ > > if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) > > return false; > > diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c > > index 8ad6c81f6d8f..2b28f5643c0b 100644 > > --- a/fs/xfs/xfs_refcount_item.c > > +++ b/fs/xfs/xfs_refcount_item.c > > @@ -423,6 +423,9 @@ xfs_cui_validate_phys( > > struct xfs_mount *mp, > > struct xfs_phys_extent *refc) > > { > > + if (!xfs_sb_version_hasreflink(&mp->m_sb)) > > + return false; > > + > > if (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS) > > return false; > > > > diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c > > index f296ec349936..2628bc0080fe 100644 > > --- a/fs/xfs/xfs_rmap_item.c > > +++ b/fs/xfs/xfs_rmap_item.c > > @@ -466,6 +466,9 @@ xfs_rui_validate_map( > > struct xfs_mount *mp, > > struct xfs_map_extent *rmap) > > { > > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb)) > > + return false; > > + > > if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS) > > return false; > > > > >
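The "new log incompat bit" idea mentioned above could look roughly like the following. This is only a sketch under stated assumptions: the bit name, the mask, and both helper functions are hypothetical and do not correspond to any existing XFS definition. The point is just the mechanism, where a writer sets a bit before logging a bmap intent on a non-reflink filesystem, and a kernel that does not know the bit declines to recover a dirty log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit: "this log may hold bmap intents without reflink". */
#define DEMO_LOG_INCOMPAT_BMAP_INTENT	(1u << 0)

/* Bits that this (imaginary) kernel knows how to recover. */
#define DEMO_LOG_INCOMPAT_KNOWN		(DEMO_LOG_INCOMPAT_BMAP_INTENT)

/* Refuse to recover a dirty log that carries bits we don't understand. */
static bool demo_can_recover_log(uint32_t log_incompat, bool log_dirty)
{
	if (!log_dirty)
		return true;	/* a clean log needs no recovery */
	return (log_incompat & ~DEMO_LOG_INCOMPAT_KNOWN) == 0;
}

/* A writer sets the bit before logging a BUI on a non-reflink filesystem. */
static uint32_t demo_mark_bmap_intent(uint32_t log_incompat, bool has_reflink)
{
	if (!has_reflink)
		log_incompat |= DEMO_LOG_INCOMPAT_BMAP_INTENT;
	return log_incompat;
}
```

Compared with inferring validity from the rmap and reflink feature flags, a per-format bit like this ties the recovery decision to what was actually written to the log.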
From: Darrick J. Wong <darrick.wong@oracle.com> The bmap, rmap, and refcount log intent items were added to support the rmap and reflink features. Because these features come with changes to the ondisk format, the log items aren't tied to a log incompat flag. However, the log recovery routines don't actually check for those feature flags. The kernel has no business replaying an intent item for a feature that isn't enabled, so check that as part of recovered log item validation. (Note that kernels pre-dating rmap and reflink will fail the mount on the unknown log item type code.) Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_bmap_item.c | 4 ++++ fs/xfs/xfs_refcount_item.c | 3 +++ fs/xfs/xfs_rmap_item.c | 3 +++ 3 files changed, 10 insertions(+) diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index a21a9f71c0c0..8d3ed07800f6 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -425,6 +425,10 @@ xfs_bui_validate( { struct xfs_map_extent *bmap; + if (!xfs_sb_version_hasrmapbt(&mp->m_sb) && + !xfs_sb_version_hasreflink(&mp->m_sb)) + return false; + /* Only one mapping operation per BUI... 
*/ if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) return false; diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index c24f2da0f795..937d482c9be4 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -423,6 +423,9 @@ xfs_cui_validate_phys( struct xfs_mount *mp, struct xfs_phys_extent *refc) { + if (!xfs_sb_version_hasreflink(&mp->m_sb)) + return false; + if (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS) return false; diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index 6f3250a22093..9b84017184d9 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -466,6 +466,9 @@ xfs_rui_validate_map( struct xfs_mount *mp, struct xfs_map_extent *rmap) { + if (!xfs_sb_version_hasrmapbt(&mp->m_sb)) + return false; + if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS) return false;
On Sun, Dec 06, 2020 at 03:08:42PM -0800, Darrick J. Wong wrote: > On Fri, Dec 04, 2020 at 09:00:36AM -0500, Brian Foster wrote: > > On Thu, Dec 03, 2020 at 05:12:30PM -0800, Darrick J. Wong wrote: > > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > > > The bmap, rmap, and refcount log intent items were added to support the > > > rmap and reflink features. Because these features come with changes to > > > the ondisk format, the log items aren't tied to a log incompat flag. > > > > > > However, the log recovery routines don't actually check for those > > > feature flags. The kernel has no business replaying an intent item for a > > > feature that isn't enabled, so check that as part of recovered log item > > > validation. (Note that kernels pre-dating rmap and reflink will fail > > > the mount on the unknown log item type code.) > > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > > --- > > > fs/xfs/xfs_bmap_item.c | 4 ++++ > > > fs/xfs/xfs_refcount_item.c | 3 +++ > > > fs/xfs/xfs_rmap_item.c | 3 +++ > > > 3 files changed, 10 insertions(+) > > > > > > > > > diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c > > > index 78346d47564b..4ea9132716c6 100644 > > > --- a/fs/xfs/xfs_bmap_item.c > > > +++ b/fs/xfs/xfs_bmap_item.c > > > @@ -425,6 +425,10 @@ xfs_bui_validate( > > > { > > > struct xfs_map_extent *bmap; > > > > > > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb) && > > > + !xfs_sb_version_hasreflink(&mp->m_sb)) > > > + return false; > > > + > > > > Took me a minute to realize we use the map/unmap for extent swap if rmap > > is enabled. That does make me wonder a bit.. had we made this kind of > > recovery feature validation change before that came around (such that we > > probably would have only checked _hasreflink() here), would we have > > created an unnecessary backwards incompatibility? > > Yes. 
> > I confess to cheating a little here -- technically the bmap intents were > introduced with reflink in 4.9, whereas rmap was introduced in 4.8. The > proper solution is probably to introduce a new log incompat bit for bmap > intents when reflink isn't enabled, but TBH there were enough other rmap > bugs in 4.8 (not to mention the EXPERIMENTAL warning) that nobody should > be running that old of a kernel on a production system. > > (Also we don't enable rmap by default yet whereas reflink has been > enabled by default since 4.18, so the number of people affected probably > isn't very high...) > Hmm, so this all has me a bit concerned over the value proposition for these particular feature checks. The current reflink/rmap feature situation may work out OK in practice, but it sounds like that is partly due to timing and a little bit of luck around when the implementations and interdependencies landed. This code will ultimately introduce a verification pattern that will likely be followed for new features, associated log item types, etc. and it's not totally clear to me that we'd always get it right (as opposed to something more granular like incompat bits for intent formats). Is this addressing a real problem we've seen in the wild or more of a fuzzing thing? > Secondary question: should we patch 4.9 and 4.14 to disable rmap and > reflink support, since they both still have EXPERIMENTAL warnings? > That sounds like an odd thing to do to a stable kernel, but that's just my .02. Brian > --D > > > Brian > > > > > /* Only one mapping operation per BUI... 
*/ > > > if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) > > > return false; > > > diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c > > > index 8ad6c81f6d8f..2b28f5643c0b 100644 > > > --- a/fs/xfs/xfs_refcount_item.c > > > +++ b/fs/xfs/xfs_refcount_item.c > > > @@ -423,6 +423,9 @@ xfs_cui_validate_phys( > > > struct xfs_mount *mp, > > > struct xfs_phys_extent *refc) > > > { > > > + if (!xfs_sb_version_hasreflink(&mp->m_sb)) > > > + return false; > > > + > > > if (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS) > > > return false; > > > > > > diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c > > > index f296ec349936..2628bc0080fe 100644 > > > --- a/fs/xfs/xfs_rmap_item.c > > > +++ b/fs/xfs/xfs_rmap_item.c > > > @@ -466,6 +466,9 @@ xfs_rui_validate_map( > > > struct xfs_mount *mp, > > > struct xfs_map_extent *rmap) > > > { > > > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb)) > > > + return false; > > > + > > > if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS) > > > return false; > > > > > > > > >
On Mon, Dec 07, 2020 at 09:02:12AM -0500, Brian Foster wrote: > On Sun, Dec 06, 2020 at 03:08:42PM -0800, Darrick J. Wong wrote: > > On Fri, Dec 04, 2020 at 09:00:36AM -0500, Brian Foster wrote: > > > On Thu, Dec 03, 2020 at 05:12:30PM -0800, Darrick J. Wong wrote: > > > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > > > > > The bmap, rmap, and refcount log intent items were added to support the > > > > rmap and reflink features. Because these features come with changes to > > > > the ondisk format, the log items aren't tied to a log incompat flag. > > > > > > > > However, the log recovery routines don't actually check for those > > > > feature flags. The kernel has no business replaying an intent item for a > > > > feature that isn't enabled, so check that as part of recovered log item > > > > validation. (Note that kernels pre-dating rmap and reflink will fail > > > > the mount on the unknown log item type code.) > > > > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > > > --- > > > > fs/xfs/xfs_bmap_item.c | 4 ++++ > > > > fs/xfs/xfs_refcount_item.c | 3 +++ > > > > fs/xfs/xfs_rmap_item.c | 3 +++ > > > > 3 files changed, 10 insertions(+) > > > > > > > > > > > > diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c > > > > index 78346d47564b..4ea9132716c6 100644 > > > > --- a/fs/xfs/xfs_bmap_item.c > > > > +++ b/fs/xfs/xfs_bmap_item.c > > > > @@ -425,6 +425,10 @@ xfs_bui_validate( > > > > { > > > > struct xfs_map_extent *bmap; > > > > > > > > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb) && > > > > + !xfs_sb_version_hasreflink(&mp->m_sb)) > > > > + return false; > > > > + > > > > > > Took me a minute to realize we use the map/unmap for extent swap if rmap > > > is enabled. That does make me wonder a bit.. 
had we made this kind of > > > recovery feature validation change before that came around (such that we > > > probably would have only checked _hasreflink() here), would we have > > > created an unnecessary backwards incompatibility? > > > > Yes. > > > > I confess to cheating a little here -- technically the bmap intents were > > introduced with reflink in 4.9, whereas rmap was introduced in 4.8. The > > proper solution is probably to introduce a new log incompat bit for bmap > > intents when reflink isn't enabled, but TBH there were enough other rmap > > bugs in 4.8 (not to mention the EXPERIMENTAL warning) that nobody should > > be running that old of a kernel on a production system. > > > > (Also we don't enable rmap by default yet whereas reflink has been > > enabled by default since 4.18, so the number of people affected probably > > isn't very high...) > > > > Hmm, so this all has me a bit concerned over the value proposition for > these particular feature checks. The current reflink/rmap feature > situation may work out OK in practice, but it sounds like that is partly > due to timing and a little bit of luck around when the implementations > and interdependencies landed. This code will ultimately introduce a > verification pattern that will likely be followed for new features, > associated log item types, etc. and it's not totally clear to me that > we'd always get it right (as opposed to something more granular like > incompat bits for intent formats). Is this addressing a real problem > we've seen in the wild or more of a fuzzing thing? Neither, it was just me doing some code review over Thanksgiving. It also occurred to me to (re)consider this in terms of "What are we protecting against?" Adding feature checks to the CUI/RUI recovery functions makes sense since we can't replay something into a feature that isn't enabled. For BUI items, however, the bmap has existed forever so we're really not guarding much. 
If someone out there has (for example) a V4 filesystem with a dirty BUI to replay, why not replay it? So I guess I could just drop the feature check from the BUI recovery function. --D > > Secondary question: should we patch 4.9 and 4.14 to disable rmap and > > reflink support, since they both still have EXPERIMENTAL warnings? > > > > That sounds like an odd thing to do to a stable kernel, but that's just > my .02. > > Brian > > > --D > > > > > Brian > > > > > > > /* Only one mapping operation per BUI... */ > > > > if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) > > > > return false; > > > > diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c > > > > index 8ad6c81f6d8f..2b28f5643c0b 100644 > > > > --- a/fs/xfs/xfs_refcount_item.c > > > > +++ b/fs/xfs/xfs_refcount_item.c > > > > @@ -423,6 +423,9 @@ xfs_cui_validate_phys( > > > > struct xfs_mount *mp, > > > > struct xfs_phys_extent *refc) > > > > { > > > > + if (!xfs_sb_version_hasreflink(&mp->m_sb)) > > > > + return false; > > > > + > > > > if (refc->pe_flags & ~XFS_REFCOUNT_EXTENT_FLAGS) > > > > return false; > > > > > > > > diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c > > > > index f296ec349936..2628bc0080fe 100644 > > > > --- a/fs/xfs/xfs_rmap_item.c > > > > +++ b/fs/xfs/xfs_rmap_item.c > > > > @@ -466,6 +466,9 @@ xfs_rui_validate_map( > > > > struct xfs_mount *mp, > > > > struct xfs_map_extent *rmap) > > > > { > > > > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb)) > > > > + return false; > > > > + > > > > if (rmap->me_flags & ~XFS_RMAP_EXTENT_FLAGS) > > > > return false; > > > > > > > > > > > > > >
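The outcome proposed in the last reply (keep the feature checks for CUI and RUI recovery, drop the one for BUI) can be summarized as a small standalone sketch. It is not the kernel implementation: `enum demo_intent`, `struct demo_features` and `demo_intent_feature_ok()` are hypothetical names used only to show the resulting policy.

```c
#include <stdbool.h>

/* Hypothetical labels for the three recovered intent item types. */
enum demo_intent { DEMO_BUI, DEMO_CUI, DEMO_RUI };

/* Hypothetical stand-in for the superblock feature flags. */
struct demo_features {
	bool has_rmapbt;
	bool has_reflink;
};

/* Is the feature that this intent replays into enabled on this fs? */
static bool demo_intent_feature_ok(enum demo_intent type,
				   const struct demo_features *f)
{
	switch (type) {
	case DEMO_CUI:
		return f->has_reflink;	/* refcount updates need reflink */
	case DEMO_RUI:
		return f->has_rmapbt;	/* rmap updates need the rmap btree */
	case DEMO_BUI:
		return true;		/* bmap always exists: no feature gate */
	}
	return false;
}
```

Under this policy a filesystem with neither feature still replays a dirty BUI, while a recovered CUI or RUI on the same filesystem fails validation.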