* [PATCH v3 0/3] xfs-4.20: scrub fixes @ 2018-10-05 0:47 Darrick J. Wong 2018-10-05 0:47 ` [PATCH 1/3] xfs: xrep_findroot_block should reject root blocks with siblings Darrick J. Wong ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Darrick J. Wong @ 2018-10-05 0:47 UTC (permalink / raw) To: david, darrick.wong; +Cc: linux-xfs Hi all, Here are a few online fsck fixes for 4.20. The first patch fixes the online repair "find AG btree root" function to ignore btree blocks that have siblings and to ignore a btree level if multiple sibling-less blocks are found. The second patch strengthens the buffer read functions to apply buffer ops (and verify the buffer contents) any time a caller tries to read a buffer with a given set of ops and the buffer doesn't have ops yet. The third patch fixes some buffer state management bugs so that we don't accidentally clobber b_ops on buffers that were already in-core when we try to find an AG header's btree root blocks. Comments and questions are, as always, welcome. --D ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/3] xfs: xrep_findroot_block should reject root blocks with siblings 2018-10-05 0:47 [PATCH v3 0/3] xfs-4.20: scrub fixes Darrick J. Wong @ 2018-10-05 0:47 ` Darrick J. Wong 2018-10-05 0:47 ` [PATCH 2/3] xfs: always assign buffer verifiers when one is provided Darrick J. Wong 2018-10-05 0:47 ` [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block Darrick J. Wong 2 siblings, 0 replies; 13+ messages in thread From: Darrick J. Wong @ 2018-10-05 0:47 UTC (permalink / raw) To: david, darrick.wong; +Cc: linux-xfs, Brian Foster From: Darrick J. Wong <darrick.wong@oracle.com> In xrep_findroot_block, if we find a candidate root block with sibling pointers or sibling blocks on the same tree level, we should not return that block as a tree root because root blocks cannot have siblings. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> --- fs/xfs/scrub/repair.c | 61 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 48 insertions(+), 13 deletions(-) diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index 9f08dd9bf1d5..63786341ac2a 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -692,12 +692,13 @@ xrep_findroot_block( struct xrep_find_ag_btree *fab, uint64_t owner, xfs_agblock_t agbno, - bool *found_it) + bool *done_with_block) { struct xfs_mount *mp = ri->sc->mp; struct xfs_buf *bp; struct xfs_btree_block *btblock; xfs_daddr_t daddr; + int block_level; int error; daddr = XFS_AGB_TO_DADDR(mp, ri->sc->sa.agno, agbno); @@ -735,18 +736,52 @@ xrep_findroot_block( goto out; bp->b_ops = fab->buf_ops; - /* Ignore this block if it's lower in the tree than we've seen. */ - if (fab->root != NULLAGBLOCK && - xfs_btree_get_level(btblock) < fab->height) - goto out; - /* Make sure we pass the verifiers. */ bp->b_ops->verify_read(bp); if (bp->b_error) goto out; - fab->root = agbno; - fab->height = xfs_btree_get_level(btblock) + 1; - *found_it = true; + + /* + * This block passes the magic/uuid and verifier tests for this btree + * type. We don't need the caller to try the other tree types. + */ + *done_with_block = true; + + /* + * Compare this btree block's level to the height of the current + * candidate root block. + * + * If the level matches the root we found previously, throw away both + * blocks because there can't be two candidate roots. + * + * If level is lower in the tree than the root we found previously, + * ignore this block. + */ + block_level = xfs_btree_get_level(btblock); + if (block_level + 1 == fab->height) { + fab->root = NULLAGBLOCK; + goto out; + } else if (block_level < fab->height) { + goto out; + } + + /* + * This is the highest block in the tree that we've found so far. + * Update the btree height to reflect what we've learned from this + * block. + */ + fab->height = block_level + 1; + + /* + * If this block doesn't have sibling pointers, then it's the new root + * block candidate. Otherwise, the root will be found farther up the + * tree. + */ + if (btblock->bb_u.s.bb_leftsib == cpu_to_be32(NULLAGBLOCK) && + btblock->bb_u.s.bb_rightsib == cpu_to_be32(NULLAGBLOCK)) + fab->root = agbno; + else + fab->root = NULLAGBLOCK; trace_xrep_findroot_block(mp, ri->sc->sa.agno, agbno, be32_to_cpu(btblock->bb_magic), fab->height - 1); @@ -768,7 +803,7 @@ xrep_findroot_rmap( struct xrep_findroot *ri = priv; struct xrep_find_ag_btree *fab; xfs_agblock_t b; - bool found_it; + bool done; int error = 0; /* Ignore anything that isn't AG metadata. */ @@ -777,16 +812,16 @@ xrep_findroot_rmap( /* Otherwise scan each block + btree type. */ for (b = 0; b < rec->rm_blockcount; b++) { - found_it = false; + done = false; for (fab = ri->btree_info; fab->buf_ops; fab++) { if (rec->rm_owner != fab->rmap_owner) continue; error = xrep_findroot_block(ri, fab, rec->rm_owner, rec->rm_startblock + b, - &found_it); + &done); if (error) return error; - if (found_it) + if (done) break; } } ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/3] xfs: always assign buffer verifiers when one is provided 2018-10-05 0:47 [PATCH v3 0/3] xfs-4.20: scrub fixes Darrick J. Wong 2018-10-05 0:47 ` [PATCH 1/3] xfs: xrep_findroot_block should reject root blocks with siblings Darrick J. Wong @ 2018-10-05 0:47 ` Darrick J. Wong 2018-10-05 11:57 ` Brian Foster 2018-10-06 10:25 ` Christoph Hellwig 2018-10-05 0:47 ` [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block Darrick J. Wong 2 siblings, 2 replies; 13+ messages in thread From: Darrick J. Wong @ 2018-10-05 0:47 UTC (permalink / raw) To: david, darrick.wong; +Cc: linux-xfs From: Darrick J. Wong <darrick.wong@oracle.com> If a caller supplies buffer ops when trying to read a buffer and the buffer doesn't already have buf ops assigned, ensure that the ops are assigned to the buffer and the verifier is run on that buffer. Note that current XFS code is careful to assign buffer ops after a xfs_{trans_,}buf_read call in which ops were not supplied. However, we should apply ops defensively in case there is ever a coding mistake; and an upcoming repair patch will need to be able to read a buffer without assigning buf ops. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_buf.c | 64 +++++++++++++++++++++++++++++++++++------------- fs/xfs/xfs_buf.h | 3 ++ fs/xfs/xfs_trans_buf.c | 28 +++++++++++++++++++++ 3 files changed, 78 insertions(+), 17 deletions(-) diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index e839907e8492..3adfa139dcfe 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -749,6 +749,29 @@ _xfs_buf_read( return xfs_buf_submit(bp); } +/* + * If the caller passed in an ops structure and the buffer doesn't have ops + * assigned, set the ops and use them to verify the contents. If the contents + * cannot be verified, we'll clear XBF_DONE. + */ +int +xfs_buf_ensure_ops( + struct xfs_buf *bp, + const struct xfs_buf_ops *ops) +{ + ASSERT(bp->b_flags & XBF_DONE); + + if (!ops || bp->b_ops) + return 0; + + bp->b_error = 0; + bp->b_ops = ops; + bp->b_ops->verify_read(bp); + if (bp->b_error) + bp->b_flags &= ~XBF_DONE; + return bp->b_error; +} + xfs_buf_t * xfs_buf_read_map( struct xfs_buftarg *target, @@ -762,26 +785,33 @@ xfs_buf_read_map( flags |= XBF_READ; bp = xfs_buf_get_map(target, map, nmaps, flags); - if (bp) { - trace_xfs_buf_read(bp, flags, _RET_IP_); + if (!bp) + return NULL; - if (!(bp->b_flags & XBF_DONE)) { - XFS_STATS_INC(target->bt_mount, xb_get_read); - bp->b_ops = ops; - _xfs_buf_read(bp, flags); - } else if (flags & XBF_ASYNC) { - /* - * Read ahead call which is already satisfied, - * drop the buffer - */ - xfs_buf_relse(bp); - return NULL; - } else { - /* We do not want read in the flags */ - bp->b_flags &= ~XBF_READ; - } + trace_xfs_buf_read(bp, flags, _RET_IP_); + + if (!(bp->b_flags & XBF_DONE)) { + XFS_STATS_INC(target->bt_mount, xb_get_read); + bp->b_ops = ops; + _xfs_buf_read(bp, flags); + ASSERT(bp->b_ops != NULL || ops == NULL); + return bp; + } + + xfs_buf_ensure_ops(bp, ops); + + if (flags & XBF_ASYNC) { + /* + * Read ahead call which is already satisfied, + * drop the buffer + */ + xfs_buf_relse(bp); + return NULL; } + /* We do not want read in the flags */ + bp->b_flags &= ~XBF_READ; + ASSERT(bp->b_ops != NULL || ops == NULL); return bp; } diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h index 4e3171acd0f8..526bc7e9e7fc 100644 --- a/fs/xfs/xfs_buf.h +++ b/fs/xfs/xfs_buf.h @@ -385,4 +385,7 @@ extern int xfs_setsize_buftarg(xfs_buftarg_t *, unsigned int); #define xfs_getsize_buftarg(buftarg) block_size((buftarg)->bt_bdev) #define xfs_readonly_buftarg(buftarg) bdev_read_only((buftarg)->bt_bdev) +extern int xfs_buf_ensure_ops(struct xfs_buf *bp, + const struct xfs_buf_ops *ops); + #endif /* __XFS_BUF_H__ */ diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c index 15919f67a88f..b0ba2ca9cca3 100644 --- a/fs/xfs/xfs_trans_buf.c +++ b/fs/xfs/xfs_trans_buf.c @@ -264,11 +264,38 @@ xfs_trans_read_buf_map( return -EIO; } + /* + * The caller is trying to read a buffer that is already + * attached to the transaction yet has no buffer ops assigned. + * Ops are usually attached when the buffer is attached to the + * transaction, or by the read caller in special circumstances. + * That didn't happen, which is not how this is supposed to go. + * + * If the buffer passes verification we'll let this go, but if + * not we have to shut down. Let the transaction cleanup code + * release this buffer when it kills the tranaction. + */ + ASSERT(bp->b_ops != NULL); + error = xfs_buf_ensure_ops(bp, ops); + if (error) { + xfs_buf_ioerror_alert(bp, __func__); + + if (tp->t_flags & XFS_TRANS_DIRTY) + xfs_force_shutdown(tp->t_mountp, + SHUTDOWN_META_IO_ERROR); + + /* bad CRC means corrupted metadata */ + if (error == -EFSBADCRC) + error = -EFSCORRUPTED; + return error; + } + bip = bp->b_log_item; bip->bli_recur++; ASSERT(atomic_read(&bip->bli_refcount) > 0); trace_xfs_trans_read_buf_recur(bip); + ASSERT(bp->b_ops != NULL || ops == NULL); *bpp = bp; return 0; } @@ -316,6 +343,7 @@ xfs_trans_read_buf_map( _xfs_trans_bjoin(tp, bp, 1); trace_xfs_trans_read_buf(bp->b_log_item); } + ASSERT(bp->b_ops != NULL || ops == NULL); *bpp = bp; return 0; ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] xfs: always assign buffer verifiers when one is provided 2018-10-05 0:47 ` [PATCH 2/3] xfs: always assign buffer verifiers when one is provided Darrick J. Wong @ 2018-10-05 11:57 ` Brian Foster 2018-10-05 17:02 ` Darrick J. Wong 2018-10-06 10:25 ` Christoph Hellwig 1 sibling, 1 reply; 13+ messages in thread From: Brian Foster @ 2018-10-05 11:57 UTC (permalink / raw) To: Darrick J. Wong; +Cc: david, linux-xfs On Thu, Oct 04, 2018 at 05:47:51PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > If a caller supplies buffer ops when trying to read a buffer and the > buffer doesn't already have buf ops assigned, ensure that the ops are > assigned to the buffer and the verifier is run on that buffer. > > Note that current XFS code is careful to assign buffer ops after a > xfs_{trans_,}buf_read call in which ops were not supplied. However, we > should apply ops defensively in case there is ever a coding mistake; and > an upcoming repair patch will need to be able to read a buffer without > assigning buf ops. > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > --- Just a few nits.. > fs/xfs/xfs_buf.c | 64 +++++++++++++++++++++++++++++++++++------------- > fs/xfs/xfs_buf.h | 3 ++ > fs/xfs/xfs_trans_buf.c | 28 +++++++++++++++++++++ > 3 files changed, 78 insertions(+), 17 deletions(-) > > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > index e839907e8492..3adfa139dcfe 100644 > --- a/fs/xfs/xfs_buf.c > +++ b/fs/xfs/xfs_buf.c > @@ -749,6 +749,29 @@ _xfs_buf_read( > return xfs_buf_submit(bp); > } > > +/* > + * If the caller passed in an ops structure and the buffer doesn't have ops > + * assigned, set the ops and use them to verify the contents. If the contents > + * cannot be verified, we'll clear XBF_DONE. > + */ > +int > +xfs_buf_ensure_ops( > + struct xfs_buf *bp, > + const struct xfs_buf_ops *ops) > +{ > + ASSERT(bp->b_flags & XBF_DONE); > + > + if (!ops || bp->b_ops) > + return 0; > + > + bp->b_error = 0; If we only call this for XBF_DONE buffers, does that mean that ->b_error should also be zero already? Is it worth adding that to the assert above instead of resetting it? > + bp->b_ops = ops; > + bp->b_ops->verify_read(bp); > + if (bp->b_error) > + bp->b_flags &= ~XBF_DONE; > + return bp->b_error; > +} > + > xfs_buf_t * > xfs_buf_read_map( > struct xfs_buftarg *target, > @@ -762,26 +785,33 @@ xfs_buf_read_map( > flags |= XBF_READ; > > bp = xfs_buf_get_map(target, map, nmaps, flags); > - if (bp) { > - trace_xfs_buf_read(bp, flags, _RET_IP_); > + if (!bp) > + return NULL; > > - if (!(bp->b_flags & XBF_DONE)) { > - XFS_STATS_INC(target->bt_mount, xb_get_read); > - bp->b_ops = ops; > - _xfs_buf_read(bp, flags); > - } else if (flags & XBF_ASYNC) { > - /* > - * Read ahead call which is already satisfied, > - * drop the buffer > - */ > - xfs_buf_relse(bp); > - return NULL; > - } else { > - /* We do not want read in the flags */ > - bp->b_flags &= ~XBF_READ; > - } > + trace_xfs_buf_read(bp, flags, _RET_IP_); > + > + if (!(bp->b_flags & XBF_DONE)) { > + XFS_STATS_INC(target->bt_mount, xb_get_read); > + bp->b_ops = ops; > + _xfs_buf_read(bp, flags); > + ASSERT(bp->b_ops != NULL || ops == NULL); I like having this assert sprinkled around as well, but I'm wondering if we can (safely) make it a bit stronger: ASSERT(bp->b_ops == ops || !ops); I think the !ops check is necessary to cover the case of reading a verified buffer from scrub context (where we don't know the appropriate verifier), but with the current approach we should never pass in the wrong ops for a verified buffer, right? > + return bp; > + } > + > + xfs_buf_ensure_ops(bp, ops); > + > + if (flags & XBF_ASYNC) { > + /* > + * Read ahead call which is already satisfied, > + * drop the buffer > + */ > + xfs_buf_relse(bp); > + return NULL; > } > > + /* We do not want read in the flags */ > + bp->b_flags &= ~XBF_READ; > + ASSERT(bp->b_ops != NULL || ops == NULL); > return bp; > } > > diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h > index 4e3171acd0f8..526bc7e9e7fc 100644 > --- a/fs/xfs/xfs_buf.h > +++ b/fs/xfs/xfs_buf.h > @@ -385,4 +385,7 @@ extern int xfs_setsize_buftarg(xfs_buftarg_t *, unsigned int); > #define xfs_getsize_buftarg(buftarg) block_size((buftarg)->bt_bdev) > #define xfs_readonly_buftarg(buftarg) bdev_read_only((buftarg)->bt_bdev) > > +extern int xfs_buf_ensure_ops(struct xfs_buf *bp, > + const struct xfs_buf_ops *ops); > + > #endif /* __XFS_BUF_H__ */ > diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c > index 15919f67a88f..b0ba2ca9cca3 100644 > --- a/fs/xfs/xfs_trans_buf.c > +++ b/fs/xfs/xfs_trans_buf.c > @@ -264,11 +264,38 @@ xfs_trans_read_buf_map( > return -EIO; > } > > + /* > + * The caller is trying to read a buffer that is already "Check if the caller is trying ..." ? Nits aside: Reviewed-by: Brian Foster <bfoster@redhat.com> > + * attached to the transaction yet has no buffer ops assigned. > + * Ops are usually attached when the buffer is attached to the > + * transaction, or by the read caller in special circumstances. > + * That didn't happen, which is not how this is supposed to go. > + * > + * If the buffer passes verification we'll let this go, but if > + * not we have to shut down. Let the transaction cleanup code > + * release this buffer when it kills the tranaction. > + */ > + ASSERT(bp->b_ops != NULL); > + error = xfs_buf_ensure_ops(bp, ops); > + if (error) { > + xfs_buf_ioerror_alert(bp, __func__); > + > + if (tp->t_flags & XFS_TRANS_DIRTY) > + xfs_force_shutdown(tp->t_mountp, > + SHUTDOWN_META_IO_ERROR); > + > + /* bad CRC means corrupted metadata */ > + if (error == -EFSBADCRC) > + error = -EFSCORRUPTED; > + return error; > + } > + > bip = bp->b_log_item; > bip->bli_recur++; > > ASSERT(atomic_read(&bip->bli_refcount) > 0); > trace_xfs_trans_read_buf_recur(bip); > + ASSERT(bp->b_ops != NULL || ops == NULL); > *bpp = bp; > return 0; > } > @@ -316,6 +343,7 @@ xfs_trans_read_buf_map( > _xfs_trans_bjoin(tp, bp, 1); > trace_xfs_trans_read_buf(bp->b_log_item); > } > + ASSERT(bp->b_ops != NULL || ops == NULL); > *bpp = bp; > return 0; > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] xfs: always assign buffer verifiers when one is provided 2018-10-05 11:57 ` Brian Foster @ 2018-10-05 17:02 ` Darrick J. Wong 2018-10-06 3:15 ` Darrick J. Wong 0 siblings, 1 reply; 13+ messages in thread From: Darrick J. Wong @ 2018-10-05 17:02 UTC (permalink / raw) To: Brian Foster; +Cc: david, linux-xfs On Fri, Oct 05, 2018 at 07:57:13AM -0400, Brian Foster wrote: > On Thu, Oct 04, 2018 at 05:47:51PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > If a caller supplies buffer ops when trying to read a buffer and the > > buffer doesn't already have buf ops assigned, ensure that the ops are > > assigned to the buffer and the verifier is run on that buffer. > > > > Note that current XFS code is careful to assign buffer ops after a > > xfs_{trans_,}buf_read call in which ops were not supplied. However, we > > should apply ops defensively in case there is ever a coding mistake; and > > an upcoming repair patch will need to be able to read a buffer without > > assigning buf ops. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > --- > > Just a few nits.. > > > fs/xfs/xfs_buf.c | 64 +++++++++++++++++++++++++++++++++++------------- > > fs/xfs/xfs_buf.h | 3 ++ > > fs/xfs/xfs_trans_buf.c | 28 +++++++++++++++++++++ > > 3 files changed, 78 insertions(+), 17 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > > index e839907e8492..3adfa139dcfe 100644 > > --- a/fs/xfs/xfs_buf.c > > +++ b/fs/xfs/xfs_buf.c > > @@ -749,6 +749,29 @@ _xfs_buf_read( > > return xfs_buf_submit(bp); > > } > > > > +/* > > + * If the caller passed in an ops structure and the buffer doesn't have ops > > + * assigned, set the ops and use them to verify the contents. If the contents > > + * cannot be verified, we'll clear XBF_DONE. > > + */ > > +int > > +xfs_buf_ensure_ops( > > + struct xfs_buf *bp, > > + const struct xfs_buf_ops *ops) > > +{ > > + ASSERT(bp->b_flags & XBF_DONE); > > + > > + if (!ops || bp->b_ops) > > + return 0; > > + > > + bp->b_error = 0; > > If we only call this for XBF_DONE buffers, does that mean that ->b_error > should also be zero already? Is it worth adding that to the assert above > instead of resetting it? Hmm, yes, I think b_error ought to be zero on the way into this function. > > + bp->b_ops = ops; > > + bp->b_ops->verify_read(bp); > > + if (bp->b_error) > > + bp->b_flags &= ~XBF_DONE; > > + return bp->b_error; > > +} > > + > > xfs_buf_t * > > xfs_buf_read_map( > > struct xfs_buftarg *target, > > @@ -762,26 +785,33 @@ xfs_buf_read_map( > > flags |= XBF_READ; > > > > bp = xfs_buf_get_map(target, map, nmaps, flags); > > - if (bp) { > > - trace_xfs_buf_read(bp, flags, _RET_IP_); > > + if (!bp) > > + return NULL; > > > > - if (!(bp->b_flags & XBF_DONE)) { > > - XFS_STATS_INC(target->bt_mount, xb_get_read); > > - bp->b_ops = ops; > > - _xfs_buf_read(bp, flags); > > - } else if (flags & XBF_ASYNC) { > > - /* > > - * Read ahead call which is already satisfied, > > - * drop the buffer > > - */ > > - xfs_buf_relse(bp); > > - return NULL; > > - } else { > > - /* We do not want read in the flags */ > > - bp->b_flags &= ~XBF_READ; > > - } > > + trace_xfs_buf_read(bp, flags, _RET_IP_); > > + > > + if (!(bp->b_flags & XBF_DONE)) { > > + XFS_STATS_INC(target->bt_mount, xb_get_read); > > + bp->b_ops = ops; > > + _xfs_buf_read(bp, flags); > > + ASSERT(bp->b_ops != NULL || ops == NULL); > > I like having this assert sprinkled around as well, but I'm wondering if > we can (safely) make it a bit stronger: > > ASSERT(bp->b_ops == ops || !ops); > > I think the !ops check is necessary to cover the case of reading a > verified buffer from scrub context (where we don't know the appropriate > verifier), but with the current approach we should never pass in the > wrong ops for a verified buffer, right? That *particular* ASSERT I think is the pointless result of overeagerness on my part. :) But yes, the rest of them ought to be as you say. We never want the situation where the read caller passes in bufops A but it really has bufops B. TBH since ASSERTs disappear in CONFIG_XFS_DEBUG=n mode, maybe we should complain a little louder about this sort of misprogramming? I'll look into doing something like... void xfs_buf_confirm_ops(bp, ops) { bool wrong_ops = ops && bp->b_ops != ops; if (!wrong_ops) return; WARN_ON(wrong_ops, "Metadata verifier mismatch at %pS", __return_address; xfs_force_shutdown(...); } > > > + return bp; > > + } > > + > > + xfs_buf_ensure_ops(bp, ops); > > + > > + if (flags & XBF_ASYNC) { > > + /* > > + * Read ahead call which is already satisfied, > > + * drop the buffer > > + */ > > + xfs_buf_relse(bp); > > + return NULL; > > } > > > > + /* We do not want read in the flags */ > > + bp->b_flags &= ~XBF_READ; > > + ASSERT(bp->b_ops != NULL || ops == NULL); > > return bp; > > } > > > > diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h > > index 4e3171acd0f8..526bc7e9e7fc 100644 > > --- a/fs/xfs/xfs_buf.h > > +++ b/fs/xfs/xfs_buf.h > > @@ -385,4 +385,7 @@ extern int xfs_setsize_buftarg(xfs_buftarg_t *, unsigned int); > > #define xfs_getsize_buftarg(buftarg) block_size((buftarg)->bt_bdev) > > #define xfs_readonly_buftarg(buftarg) bdev_read_only((buftarg)->bt_bdev) > > > > +extern int xfs_buf_ensure_ops(struct xfs_buf *bp, > > + const struct xfs_buf_ops *ops); > > + > > #endif /* __XFS_BUF_H__ */ > > diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c > > index 15919f67a88f..b0ba2ca9cca3 100644 > > --- a/fs/xfs/xfs_trans_buf.c > > +++ b/fs/xfs/xfs_trans_buf.c > > @@ -264,11 +264,38 @@ xfs_trans_read_buf_map( > > return -EIO; > > } > > > > + /* > > + * The caller is trying to read a buffer that is already > > "Check if the caller is trying ..." ? > > Nits aside: > > Reviewed-by: Brian Foster <bfoster@redhat.com> Will fix, thanks for the review! --D > > + * attached to the transaction yet has no buffer ops assigned. > > + * Ops are usually attached when the buffer is attached to the > > + * transaction, or by the read caller in special circumstances. > > + * That didn't happen, which is not how this is supposed to go. > > + * > > + * If the buffer passes verification we'll let this go, but if > > + * not we have to shut down. Let the transaction cleanup code > > + * release this buffer when it kills the tranaction. > > + */ > > + ASSERT(bp->b_ops != NULL); > > + error = xfs_buf_ensure_ops(bp, ops); > > + if (error) { > > + xfs_buf_ioerror_alert(bp, __func__); > > + > > + if (tp->t_flags & XFS_TRANS_DIRTY) > > + xfs_force_shutdown(tp->t_mountp, > > + SHUTDOWN_META_IO_ERROR); > > + > > + /* bad CRC means corrupted metadata */ > > + if (error == -EFSBADCRC) > > + error = -EFSCORRUPTED; > > + return error; > > + } > > + > > bip = bp->b_log_item; > > bip->bli_recur++; > > > > ASSERT(atomic_read(&bip->bli_refcount) > 0); > > trace_xfs_trans_read_buf_recur(bip); > > + ASSERT(bp->b_ops != NULL || ops == NULL); > > *bpp = bp; > > return 0; > > } > > @@ -316,6 +343,7 @@ xfs_trans_read_buf_map( > > _xfs_trans_bjoin(tp, bp, 1); > > trace_xfs_trans_read_buf(bp->b_log_item); > > } > > + ASSERT(bp->b_ops != NULL || ops == NULL); > > *bpp = bp; > > return 0; > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] xfs: always assign buffer verifiers when one is provided 2018-10-05 17:02 ` Darrick J. Wong @ 2018-10-06 3:15 ` Darrick J. Wong 0 siblings, 0 replies; 13+ messages in thread From: Darrick J. Wong @ 2018-10-06 3:15 UTC (permalink / raw) To: Brian Foster; +Cc: david, linux-xfs On Fri, Oct 05, 2018 at 10:02:51AM -0700, Darrick J. Wong wrote: > On Fri, Oct 05, 2018 at 07:57:13AM -0400, Brian Foster wrote: > > On Thu, Oct 04, 2018 at 05:47:51PM -0700, Darrick J. Wong wrote: > > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > > > If a caller supplies buffer ops when trying to read a buffer and the > > > buffer doesn't already have buf ops assigned, ensure that the ops are > > > assigned to the buffer and the verifier is run on that buffer. > > > > > > Note that current XFS code is careful to assign buffer ops after a > > > xfs_{trans_,}buf_read call in which ops were not supplied. However, we > > > should apply ops defensively in case there is ever a coding mistake; and > > > an upcoming repair patch will need to be able to read a buffer without > > > assigning buf ops. > > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > > --- > > > > Just a few nits.. > > > > > fs/xfs/xfs_buf.c | 64 +++++++++++++++++++++++++++++++++++------------- > > > fs/xfs/xfs_buf.h | 3 ++ > > > fs/xfs/xfs_trans_buf.c | 28 +++++++++++++++++++++ > > > 3 files changed, 78 insertions(+), 17 deletions(-) > > > > > > > > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > > > index e839907e8492..3adfa139dcfe 100644 > > > --- a/fs/xfs/xfs_buf.c > > > +++ b/fs/xfs/xfs_buf.c > > > @@ -749,6 +749,29 @@ _xfs_buf_read( > > > return xfs_buf_submit(bp); > > > } > > > > > > +/* > > > + * If the caller passed in an ops structure and the buffer doesn't have ops > > > + * assigned, set the ops and use them to verify the contents. If the contents > > > + * cannot be verified, we'll clear XBF_DONE. > > > + */ > > > +int > > > +xfs_buf_ensure_ops( > > > + struct xfs_buf *bp, > > > + const struct xfs_buf_ops *ops) > > > +{ > > > + ASSERT(bp->b_flags & XBF_DONE); > > > + > > > + if (!ops || bp->b_ops) > > > + return 0; > > > + > > > + bp->b_error = 0; > > > > If we only call this for XBF_DONE buffers, does that mean that ->b_error > > should also be zero already? Is it worth adding that to the assert above > > instead of resetting it? > > Hmm, yes, I think b_error ought to be zero on the way into this > function. > > > > + bp->b_ops = ops; > > > + bp->b_ops->verify_read(bp); > > > + if (bp->b_error) > > > + bp->b_flags &= ~XBF_DONE; > > > + return bp->b_error; > > > +} > > > + > > > xfs_buf_t * > > > xfs_buf_read_map( > > > struct xfs_buftarg *target, > > > @@ -762,26 +785,33 @@ xfs_buf_read_map( > > > flags |= XBF_READ; > > > > > > bp = xfs_buf_get_map(target, map, nmaps, flags); > > > - if (bp) { > > > - trace_xfs_buf_read(bp, flags, _RET_IP_); > > > + if (!bp) > > > + return NULL; > > > > > > - if (!(bp->b_flags & XBF_DONE)) { > > > - XFS_STATS_INC(target->bt_mount, xb_get_read); > > > - bp->b_ops = ops; > > > - _xfs_buf_read(bp, flags); > > > - } else if (flags & XBF_ASYNC) { > > > - /* > > > - * Read ahead call which is already satisfied, > > > - * drop the buffer > > > - */ > > > - xfs_buf_relse(bp); > > > - return NULL; > > > - } else { > > > - /* We do not want read in the flags */ > > > - bp->b_flags &= ~XBF_READ; > > > - } > > > + trace_xfs_buf_read(bp, flags, _RET_IP_); > > > + > > > + if (!(bp->b_flags & XBF_DONE)) { > > > + XFS_STATS_INC(target->bt_mount, xb_get_read); > > > + bp->b_ops = ops; > > > + _xfs_buf_read(bp, flags); > > > + ASSERT(bp->b_ops != NULL || ops == NULL); > > > > I like having this assert sprinkled around as well, but I'm wondering if > > we can (safely) make it a bit stronger: > > > > ASSERT(bp->b_ops == ops || !ops); This doesn't work because xfs_da3_node_buf_ops can change b_ops to the leaf1 or leafn buffer ops, which means that subsequent re-reads of the same buffer will blow the assert even though everything's fine. I'm going to leave it as it was. (I tried xfstests and it kablooie with assertion failures everywhere.) --D > > > > I think the !ops check is necessary to cover the case of reading a > > verified buffer from scrub context (where we don't know the appropriate > > verifier), but with the current approach we should never pass in the > > wrong ops for a verified buffer, right? > > That *particular* ASSERT I think is the pointless result of > overeagerness on my part. :) > > But yes, the rest of them ought to be as you say. We never want the > situation where the read caller passes in bufops A but it really has > bufops B. > > TBH since ASSERTs disappear in CONFIG_XFS_DEBUG=n mode, maybe we should > complain a little louder about this sort of misprogramming? I'll look > into doing something like... > > void > xfs_buf_confirm_ops(bp, ops) > { > bool wrong_ops = ops && bp->b_ops != ops; > > if (!wrong_ops) > return; > WARN_ON(wrong_ops, "Metadata verifier mismatch at %pS", > __return_address; > xfs_force_shutdown(...); > } > > > > > + return bp; > > > + } > > > + > > > + xfs_buf_ensure_ops(bp, ops); > > > + > > > + if (flags & XBF_ASYNC) { > > > + /* > > > + * Read ahead call which is already satisfied, > > > + * drop the buffer > > > + */ > > > + xfs_buf_relse(bp); > > > + return NULL; > > > } > > > > > > + /* We do not want read in the flags */ > > > + bp->b_flags &= ~XBF_READ; > > > + ASSERT(bp->b_ops != NULL || ops == NULL); > > > return bp; > > > } > > > > > > diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h > > > index 4e3171acd0f8..526bc7e9e7fc 100644 > > > --- a/fs/xfs/xfs_buf.h > > > +++ b/fs/xfs/xfs_buf.h > > > @@ -385,4 +385,7 @@ extern int xfs_setsize_buftarg(xfs_buftarg_t *, unsigned int); > > > #define xfs_getsize_buftarg(buftarg) block_size((buftarg)->bt_bdev) > > > #define xfs_readonly_buftarg(buftarg) bdev_read_only((buftarg)->bt_bdev) > > > > > > +extern int xfs_buf_ensure_ops(struct xfs_buf *bp, > > > + const struct xfs_buf_ops *ops); > > > + > > > #endif /* __XFS_BUF_H__ */ > > > diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c > > > index 15919f67a88f..b0ba2ca9cca3 100644 > > > --- a/fs/xfs/xfs_trans_buf.c > > > +++ b/fs/xfs/xfs_trans_buf.c > > > @@ -264,11 +264,38 @@ xfs_trans_read_buf_map( > > > return -EIO; > > > } > > > > > > + /* > > > + * The caller is trying to read a buffer that is already > > > > "Check if the caller is trying ..." ? > > > > Nits aside: > > > > Reviewed-by: Brian Foster <bfoster@redhat.com> > > Will fix, thanks for the review! > > --D > > > > + * attached to the transaction yet has no buffer ops assigned. > > > + * Ops are usually attached when the buffer is attached to the > > > + * transaction, or by the read caller in special circumstances. > > > + * That didn't happen, which is not how this is supposed to go. > > > + * > > > + * If the buffer passes verification we'll let this go, but if > > > + * not we have to shut down. Let the transaction cleanup code > > > + * release this buffer when it kills the tranaction. > > > + */ > > > + ASSERT(bp->b_ops != NULL); > > > + error = xfs_buf_ensure_ops(bp, ops); > > > + if (error) { > > > + xfs_buf_ioerror_alert(bp, __func__); > > > + > > > + if (tp->t_flags & XFS_TRANS_DIRTY) > > > + xfs_force_shutdown(tp->t_mountp, > > > + SHUTDOWN_META_IO_ERROR); > > > + > > > + /* bad CRC means corrupted metadata */ > > > + if (error == -EFSBADCRC) > > > + error = -EFSCORRUPTED; > > > + return error; > > > + } > > > + > > > bip = bp->b_log_item; > > > bip->bli_recur++; > > > > > > ASSERT(atomic_read(&bip->bli_refcount) > 0); > > > trace_xfs_trans_read_buf_recur(bip); > > > + ASSERT(bp->b_ops != NULL || ops == NULL); > > > *bpp = bp; > > > return 0; > > > } > > > @@ -316,6 +343,7 @@ xfs_trans_read_buf_map( > > > _xfs_trans_bjoin(tp, bp, 1); > > > trace_xfs_trans_read_buf(bp->b_log_item); > > > } > > > + ASSERT(bp->b_ops != NULL || ops == NULL); > > > *bpp = bp; > > > return 0; > > > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] xfs: always assign buffer verifiers when one is provided 2018-10-05 0:47 ` [PATCH 2/3] xfs: always assign buffer verifiers when one is provided Darrick J. Wong 2018-10-05 11:57 ` Brian Foster @ 2018-10-06 10:25 ` Christoph Hellwig 1 sibling, 0 replies; 13+ messages in thread From: Christoph Hellwig @ 2018-10-06 10:25 UTC (permalink / raw) To: Darrick J. Wong; +Cc: david, linux-xfs > @@ -385,4 +385,7 @@ extern int xfs_setsize_buftarg(xfs_buftarg_t *, unsigned int); > #define xfs_getsize_buftarg(buftarg) block_size((buftarg)->bt_bdev) > #define xfs_readonly_buftarg(buftarg) bdev_read_only((buftarg)->bt_bdev) > > +extern int xfs_buf_ensure_ops(struct xfs_buf *bp, > + const struct xfs_buf_ops *ops); While we are nitpicking: no need to have an extern in function prototypes ever. Modulo that and the nitpicks from Brian this looks fine to me: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block 2018-10-05 0:47 [PATCH v3 0/3] xfs-4.20: scrub fixes Darrick J. Wong 2018-10-05 0:47 ` [PATCH 1/3] xfs: xrep_findroot_block should reject root blocks with siblings Darrick J. Wong 2018-10-05 0:47 ` [PATCH 2/3] xfs: always assign buffer verifiers when one is provided Darrick J. Wong @ 2018-10-05 0:47 ` Darrick J. Wong 2018-10-05 11:59 ` Brian Foster 2 siblings, 1 reply; 13+ messages in thread From: Darrick J. Wong @ 2018-10-05 0:47 UTC (permalink / raw) To: david, darrick.wong; +Cc: linux-xfs From: Darrick J. Wong <darrick.wong@oracle.com> We don't handle buffer state properly in online repair's findroot routine. If a buffer already has b_ops set, we don't ever want to touch that, and we don't want to call the read verifiers on a buffer that could be dirty (CRCs are only recomputed during log checkpoints). Therefore, be more careful about what we do with a buffer -- if someone else already attached ops that are not the ones for this btree type, just ignore the buffer. We only attach our btree type's buf ops if it matches the magic/uuid and structure checks. We also modify xfs_buf_read_map to allow callers to set buffer ops on a DONE buffer with NULL ops so that repair doesn't leave behind buffers which won't have buffers attached to them. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/scrub/repair.c | 63 +++++++++++++++++++++++++++++++++++++----------- fs/xfs/xfs_trans.h | 1 + fs/xfs/xfs_trans_buf.c | 13 ++++++++++ 3 files changed, 63 insertions(+), 14 deletions(-) diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index 63786341ac2a..cebaebb26566 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -29,6 +29,8 @@ #include "xfs_ag_resv.h" #include "xfs_trans_space.h" #include "xfs_quota.h" +#include "xfs_attr.h" +#include "xfs_reflink.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -699,7 +701,7 @@ xrep_findroot_block( struct xfs_btree_block *btblock; xfs_daddr_t daddr; int block_level; - int error; + int error = 0; daddr = XFS_AGB_TO_DADDR(mp, ri->sc->sa.agno, agbno); @@ -718,28 +720,61 @@ xrep_findroot_block( return error; } + /* + * Read the buffer into memory so that we can see if it's a match for + * our btree type. We have no clue if it is beforehand, and we want to + * avoid xfs_trans_read_buf's behavior of dumping the DONE state (which + * will cause needless disk reads in subsequent calls to this function) + * and logging metadata verifier failures. + * + * Therefore, pass in NULL buffer ops. If the buffer was already in + * memory from some other caller it will already have b_ops assigned. + * If it was in memory from a previous unsuccessful findroot_block + * call, the buffer won't have b_ops but it should be clean and ready + * for us to try to verify if the read call succeeds. The same applies + * if the buffer wasn't in memory at all. + * + * Note: If we never match a btree type with this buffer, it will be + * left in memory with NULL b_ops. This shouldn't be a problem unless + * the buffer gets written. + */ error = xfs_trans_read_buf(mp, ri->sc->tp, mp->m_ddev_targp, daddr, mp->m_bsize, 0, &bp, NULL); if (error) return error; - /* - * Does this look like a block matching our fs and higher than any - * other block we've found so far? If so, reattach buffer verifiers - * so the AIL won't complain if the buffer is also dirty. - */ + /* Ensure the block magic matches the btree type we're looking for. */ btblock = XFS_BUF_TO_BLOCK(bp); if (be32_to_cpu(btblock->bb_magic) != fab->magic) goto out; - if (xfs_sb_version_hascrc(&mp->m_sb) && - !uuid_equal(&btblock->bb_u.s.bb_uuid, &mp->m_sb.sb_meta_uuid)) - goto out; - bp->b_ops = fab->buf_ops; - /* Make sure we pass the verifiers. */ - bp->b_ops->verify_read(bp); - if (bp->b_error) - goto out; + /* + * If the buffer already has ops applied and they're not the ones for + * this btree type, we know this block doesn't match the btree and we + * can bail out. + * + * If the buffer ops match ours, someone else has already validated + * the block for us, so we can move on to checking if this is a root + * block candidate. + * + * If the buffer does not have ops, nobody has successfully validated + * the contents and the buffer cannot be dirty. If the magic, uuid, + * and structure match this btree type then we'll move on to checking + * if it's a root block candidate. If there is no match, bail out. + */ + if (bp->b_ops) { + if (bp->b_ops != fab->buf_ops) + goto out; + } else { + ASSERT(!xfs_trans_buf_is_dirty(bp)); + if (!uuid_equal(&btblock->bb_u.s.bb_uuid, + &mp->m_sb.sb_meta_uuid)) + goto out; + fab->buf_ops->verify_read(bp); + if (bp->b_error) + goto out; + bp->b_ops = fab->buf_ops; + } /* * This block passes the magic/uuid and verifier tests for this btree diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h index c3d278e96ad1..a0c5dbda18aa 100644 --- a/fs/xfs/xfs_trans.h +++ b/fs/xfs/xfs_trans.h @@ -220,6 +220,7 @@ void xfs_trans_ijoin(struct xfs_trans *, struct xfs_inode *, uint); void xfs_trans_log_buf(struct xfs_trans *, struct xfs_buf *, uint, uint); void xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *); +bool xfs_trans_buf_is_dirty(struct xfs_buf *bp); void xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint); void xfs_extent_free_init_defer_op(void); diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c index b0ba2ca9cca3..93a053c700dd 100644 --- a/fs/xfs/xfs_trans_buf.c +++ b/fs/xfs/xfs_trans_buf.c @@ -349,6 +349,19 @@ xfs_trans_read_buf_map( } +/* Has this buffer been dirtied by anyone? */ +bool +xfs_trans_buf_is_dirty( + struct xfs_buf *bp) +{ + struct xfs_buf_log_item *bip = bp->b_log_item; + + if (!bip) + return false; + ASSERT(bip->bli_item.li_type == XFS_LI_BUF); + return test_bit(XFS_LI_DIRTY, &bip->bli_item.li_flags); +} + /* * Release the buffer bp which was previously acquired with one of the * xfs_trans_... buffer allocation routines if the buffer has not ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block 2018-10-05 0:47 ` [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block Darrick J. Wong @ 2018-10-05 11:59 ` Brian Foster 2018-10-05 15:11 ` Darrick J. Wong 0 siblings, 1 reply; 13+ messages in thread From: Brian Foster @ 2018-10-05 11:59 UTC (permalink / raw) To: Darrick J. Wong; +Cc: david, linux-xfs On Thu, Oct 04, 2018 at 05:47:57PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > We don't handle buffer state properly in online repair's findroot > routine. If a buffer already has b_ops set, we don't ever want to touch > that, and we don't want to call the read verifiers on a buffer that > could be dirty (CRCs are only recomputed during log checkpoints). > > Therefore, be more careful about what we do with a buffer -- if someone > else already attached ops that are not the ones for this btree type, > just ignore the buffer. We only attach our btree type's buf ops if it > matches the magic/uuid and structure checks. > > We also modify xfs_buf_read_map to allow callers to set buffer ops on a > DONE buffer with NULL ops so that repair doesn't leave behind buffers > which won't have buffers attached to them. > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/scrub/repair.c | 63 +++++++++++++++++++++++++++++++++++++----------- > fs/xfs/xfs_trans.h | 1 + > fs/xfs/xfs_trans_buf.c | 13 ++++++++++ > 3 files changed, 63 insertions(+), 14 deletions(-) > > > diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c > index 63786341ac2a..cebaebb26566 100644 > --- a/fs/xfs/scrub/repair.c > +++ b/fs/xfs/scrub/repair.c ... > @@ -718,28 +720,61 @@ xrep_findroot_block( > return error; > } > ... > + /* > + * If the buffer already has ops applied and they're not the ones for > + * this btree type, we know this block doesn't match the btree and we > + * can bail out. > + * > + * If the buffer ops match ours, someone else has already validated > + * the block for us, so we can move on to checking if this is a root > + * block candidate. > + * > + * If the buffer does not have ops, nobody has successfully validated > + * the contents and the buffer cannot be dirty. If the magic, uuid, > + * and structure match this btree type then we'll move on to checking > + * if it's a root block candidate. If there is no match, bail out. > + */ > + if (bp->b_ops) { > + if (bp->b_ops != fab->buf_ops) > + goto out; > + } else { > + ASSERT(!xfs_trans_buf_is_dirty(bp)); > + if (!uuid_equal(&btblock->bb_u.s.bb_uuid, > + &mp->m_sb.sb_meta_uuid)) > + goto out; > + fab->buf_ops->verify_read(bp); > + if (bp->b_error) > + goto out; I guess this is related to my question on the previous patch. If the verifier fails, we leave the XBF_DONE buffer around with ->b_ops == NULL and ->b_error != 0. I suppose somebody should eventually attach a verifier before this buffer is ever really used, but I think I'd feel a little better about this if we immediately cleaned up the side effects of using the wrong verifier rather than potentially leaking an error to other contexts where it has no relevance. That aside, this all looks fine to me. Brian > + bp->b_ops = fab->buf_ops; > + } > > /* > * This block passes the magic/uuid and verifier tests for this btree > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h > index c3d278e96ad1..a0c5dbda18aa 100644 > --- a/fs/xfs/xfs_trans.h > +++ b/fs/xfs/xfs_trans.h > @@ -220,6 +220,7 @@ void xfs_trans_ijoin(struct xfs_trans *, struct xfs_inode *, uint); > void xfs_trans_log_buf(struct xfs_trans *, struct xfs_buf *, uint, > uint); > void xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *); > +bool xfs_trans_buf_is_dirty(struct xfs_buf *bp); > void xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint); > > void xfs_extent_free_init_defer_op(void); > diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c > index b0ba2ca9cca3..93a053c700dd 100644 > --- a/fs/xfs/xfs_trans_buf.c > +++ b/fs/xfs/xfs_trans_buf.c > @@ -349,6 +349,19 @@ xfs_trans_read_buf_map( > > } > > +/* Has this buffer been dirtied by anyone? */ > +bool > +xfs_trans_buf_is_dirty( > + struct xfs_buf *bp) > +{ > + struct xfs_buf_log_item *bip = bp->b_log_item; > + > + if (!bip) > + return false; > + ASSERT(bip->bli_item.li_type == XFS_LI_BUF); > + return test_bit(XFS_LI_DIRTY, &bip->bli_item.li_flags); > +} > + > /* > * Release the buffer bp which was previously acquired with one of the > * xfs_trans_... buffer allocation routines if the buffer has not > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block 2018-10-05 11:59 ` Brian Foster @ 2018-10-05 15:11 ` Darrick J. Wong 0 siblings, 0 replies; 13+ messages in thread From: Darrick J. Wong @ 2018-10-05 15:11 UTC (permalink / raw) To: Brian Foster; +Cc: david, linux-xfs On Fri, Oct 05, 2018 at 07:59:51AM -0400, Brian Foster wrote: > On Thu, Oct 04, 2018 at 05:47:57PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > We don't handle buffer state properly in online repair's findroot > > routine. If a buffer already has b_ops set, we don't ever want to touch > > that, and we don't want to call the read verifiers on a buffer that > > could be dirty (CRCs are only recomputed during log checkpoints). > > > > Therefore, be more careful about what we do with a buffer -- if someone > > else already attached ops that are not the ones for this btree type, > > just ignore the buffer. We only attach our btree type's buf ops if it > > matches the magic/uuid and structure checks. > > > > We also modify xfs_buf_read_map to allow callers to set buffer ops on a > > DONE buffer with NULL ops so that repair doesn't leave behind buffers > > which won't have buffers attached to them. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > --- > > fs/xfs/scrub/repair.c | 63 +++++++++++++++++++++++++++++++++++++----------- > > fs/xfs/xfs_trans.h | 1 + > > fs/xfs/xfs_trans_buf.c | 13 ++++++++++ > > 3 files changed, 63 insertions(+), 14 deletions(-) > > > > > > diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c > > index 63786341ac2a..cebaebb26566 100644 > > --- a/fs/xfs/scrub/repair.c > > +++ b/fs/xfs/scrub/repair.c > ... > > @@ -718,28 +720,61 @@ xrep_findroot_block( > > return error; > > } > > > ... > > + /* > > + * If the buffer already has ops applied and they're not the ones for > > + * this btree type, we know this block doesn't match the btree and we > > + * can bail out. > > + * > > + * If the buffer ops match ours, someone else has already validated > > + * the block for us, so we can move on to checking if this is a root > > + * block candidate. > > + * > > + * If the buffer does not have ops, nobody has successfully validated > > + * the contents and the buffer cannot be dirty. If the magic, uuid, > > + * and structure match this btree type then we'll move on to checking > > + * if it's a root block candidate. If there is no match, bail out. > > + */ > > + if (bp->b_ops) { > > + if (bp->b_ops != fab->buf_ops) > > + goto out; > > + } else { > > + ASSERT(!xfs_trans_buf_is_dirty(bp)); > > + if (!uuid_equal(&btblock->bb_u.s.bb_uuid, > > + &mp->m_sb.sb_meta_uuid)) > > + goto out; > > + fab->buf_ops->verify_read(bp); > > + if (bp->b_error) > > + goto out; > > I guess this is related to my question on the previous patch. If the > verifier fails, we leave the XBF_DONE buffer around with ->b_ops == NULL > and ->b_error != 0. > > I suppose somebody should eventually attach a verifier before this > buffer is ever really used, but I think I'd feel a little better about > this if we immediately cleaned up the side effects of using the wrong > verifier rather than potentially leaking an error to other contexts > where it has no relevance. That aside, this all looks fine to me. Ok, I'll make it clean up the error state before dumping the buffer. --D > Brian > > > + bp->b_ops = fab->buf_ops; > > + } > > > > /* > > * This block passes the magic/uuid and verifier tests for this btree > > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h > > index c3d278e96ad1..a0c5dbda18aa 100644 > > --- a/fs/xfs/xfs_trans.h > > +++ b/fs/xfs/xfs_trans.h > > @@ -220,6 +220,7 @@ void xfs_trans_ijoin(struct xfs_trans *, struct xfs_inode *, uint); > > void xfs_trans_log_buf(struct xfs_trans *, struct xfs_buf *, uint, > > uint); > > void xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *); > > +bool xfs_trans_buf_is_dirty(struct xfs_buf *bp); > > void xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint); > > > > void xfs_extent_free_init_defer_op(void); > > diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c > > index b0ba2ca9cca3..93a053c700dd 100644 > > --- a/fs/xfs/xfs_trans_buf.c > > +++ b/fs/xfs/xfs_trans_buf.c > > @@ -349,6 +349,19 @@ xfs_trans_read_buf_map( > > > > } > > > > +/* Has this buffer been dirtied by anyone? */ > > +bool > > +xfs_trans_buf_is_dirty( > > + struct xfs_buf *bp) > > +{ > > + struct xfs_buf_log_item *bip = bp->b_log_item; > > + > > + if (!bip) > > + return false; > > + ASSERT(bip->bli_item.li_type == XFS_LI_BUF); > > + return test_bit(XFS_LI_DIRTY, &bip->bli_item.li_flags); > > +} > > + > > /* > > * Release the buffer bp which was previously acquired with one of the > > * xfs_trans_... buffer allocation routines if the buffer has not > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v4 0/3] xfs-4.20: scrub fixes @ 2018-10-09 4:19 Darrick J. Wong 2018-10-09 4:19 ` [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block Darrick J. Wong 0 siblings, 1 reply; 13+ messages in thread From: Darrick J. Wong @ 2018-10-09 4:19 UTC (permalink / raw) To: david, darrick.wong; +Cc: linux-xfs, bfoster Hi all, Here are a few online fsck fixes for 4.20. The first patch fixes the online repair "find AG btree root" function to ignore btree blocks that have siblings and to ignore a btree level if multiple sibling-less blocks are found. The second patch strengthens the buffer read functions to apply buffer ops (and verify the buffer contents) any time a caller tries to read a buffer with a given set of ops and the buffer doesn't have ops yet. The third patch fixes some buffer state management bugs so that we don't accidentally clobber b_ops on buffers that were already in-core when we try to find an AG header's btree root blocks. Comments and questions are, as always, welcome. --D ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block 2018-10-09 4:19 [PATCH v4 0/3] xfs-4.20: scrub fixes Darrick J. Wong @ 2018-10-09 4:19 ` Darrick J. Wong 2018-10-09 12:16 ` Brian Foster 0 siblings, 1 reply; 13+ messages in thread From: Darrick J. Wong @ 2018-10-09 4:19 UTC (permalink / raw) To: david, darrick.wong; +Cc: linux-xfs, bfoster From: Darrick J. Wong <darrick.wong@oracle.com> We don't handle buffer state properly in online repair's findroot routine. If a buffer already has b_ops set, we don't ever want to touch that, and we don't want to call the read verifiers on a buffer that could be dirty (CRCs are only recomputed during log checkpoints). Therefore, be more careful about what we do with a buffer -- if someone else already attached ops that are not the ones for this btree type, just ignore the buffer. We only attach our btree type's buf ops if it matches the magic/uuid and structure checks. We also modify xfs_buf_read_map to allow callers to set buffer ops on a DONE buffer with NULL ops so that repair doesn't leave behind buffers which won't have buffers attached to them. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/scrub/repair.c | 65 ++++++++++++++++++++++++++++++++++++++---------- fs/xfs/xfs_trans.h | 1 + fs/xfs/xfs_trans_buf.c | 13 ++++++++++ 3 files changed, 65 insertions(+), 14 deletions(-) diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index 63786341ac2a..a07b9364c9de 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -29,6 +29,8 @@ #include "xfs_ag_resv.h" #include "xfs_trans_space.h" #include "xfs_quota.h" +#include "xfs_attr.h" +#include "xfs_reflink.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -699,7 +701,7 @@ xrep_findroot_block( struct xfs_btree_block *btblock; xfs_daddr_t daddr; int block_level; - int error; + int error = 0; daddr = XFS_AGB_TO_DADDR(mp, ri->sc->sa.agno, agbno); @@ -718,28 +720,63 @@ xrep_findroot_block( return error; } + /* + * Read the buffer into memory so that we can see if it's a match for + * our btree type. We have no clue if it is beforehand, and we want to + * avoid xfs_trans_read_buf's behavior of dumping the DONE state (which + * will cause needless disk reads in subsequent calls to this function) + * and logging metadata verifier failures. + * + * Therefore, pass in NULL buffer ops. If the buffer was already in + * memory from some other caller it will already have b_ops assigned. + * If it was in memory from a previous unsuccessful findroot_block + * call, the buffer won't have b_ops but it should be clean and ready + * for us to try to verify if the read call succeeds. The same applies + * if the buffer wasn't in memory at all. + * + * Note: If we never match a btree type with this buffer, it will be + * left in memory with NULL b_ops. This shouldn't be a problem unless + * the buffer gets written. + */ error = xfs_trans_read_buf(mp, ri->sc->tp, mp->m_ddev_targp, daddr, mp->m_bsize, 0, &bp, NULL); if (error) return error; - /* - * Does this look like a block matching our fs and higher than any - * other block we've found so far? If so, reattach buffer verifiers - * so the AIL won't complain if the buffer is also dirty. - */ + /* Ensure the block magic matches the btree type we're looking for. */ btblock = XFS_BUF_TO_BLOCK(bp); if (be32_to_cpu(btblock->bb_magic) != fab->magic) goto out; - if (xfs_sb_version_hascrc(&mp->m_sb) && - !uuid_equal(&btblock->bb_u.s.bb_uuid, &mp->m_sb.sb_meta_uuid)) - goto out; - bp->b_ops = fab->buf_ops; - /* Make sure we pass the verifiers. */ - bp->b_ops->verify_read(bp); - if (bp->b_error) - goto out; + /* + * If the buffer already has ops applied and they're not the ones for + * this btree type, we know this block doesn't match the btree and we + * can bail out. + * + * If the buffer ops match ours, someone else has already validated + * the block for us, so we can move on to checking if this is a root + * block candidate. + * + * If the buffer does not have ops, nobody has successfully validated + * the contents and the buffer cannot be dirty. If the magic, uuid, + * and structure match this btree type then we'll move on to checking + * if it's a root block candidate. If there is no match, bail out. + */ + if (bp->b_ops) { + if (bp->b_ops != fab->buf_ops) + goto out; + } else { + ASSERT(!xfs_trans_buf_is_dirty(bp)); + if (!uuid_equal(&btblock->bb_u.s.bb_uuid, + &mp->m_sb.sb_meta_uuid)) + goto out; + fab->buf_ops->verify_read(bp); + if (bp->b_error) { + bp->b_error = 0; + goto out; + } + bp->b_ops = fab->buf_ops; + } /* * This block passes the magic/uuid and verifier tests for this btree diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h index c3d278e96ad1..a0c5dbda18aa 100644 --- a/fs/xfs/xfs_trans.h +++ b/fs/xfs/xfs_trans.h @@ -220,6 +220,7 @@ void xfs_trans_ijoin(struct xfs_trans *, struct xfs_inode *, uint); void xfs_trans_log_buf(struct xfs_trans *, struct xfs_buf *, uint, uint); void xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *); +bool xfs_trans_buf_is_dirty(struct xfs_buf *bp); void xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint); void xfs_extent_free_init_defer_op(void); diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c index fc40160c1773..629f1479c9d2 100644 --- a/fs/xfs/xfs_trans_buf.c +++ b/fs/xfs/xfs_trans_buf.c @@ -350,6 +350,19 @@ xfs_trans_read_buf_map( } +/* Has this buffer been dirtied by anyone? */ +bool +xfs_trans_buf_is_dirty( + struct xfs_buf *bp) +{ + struct xfs_buf_log_item *bip = bp->b_log_item; + + if (!bip) + return false; + ASSERT(bip->bli_item.li_type == XFS_LI_BUF); + return test_bit(XFS_LI_DIRTY, &bip->bli_item.li_flags); +} + /* * Release a buffer previously joined to the transaction. If the buffer is * modified within this transaction, decrement the recursion count but do not ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block 2018-10-09 4:19 ` [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block Darrick J. Wong @ 2018-10-09 12:16 ` Brian Foster 2018-10-09 16:19 ` Darrick J. Wong 0 siblings, 1 reply; 13+ messages in thread From: Brian Foster @ 2018-10-09 12:16 UTC (permalink / raw) To: Darrick J. Wong; +Cc: david, linux-xfs On Mon, Oct 08, 2018 at 09:19:47PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > We don't handle buffer state properly in online repair's findroot > routine. If a buffer already has b_ops set, we don't ever want to touch > that, and we don't want to call the read verifiers on a buffer that > could be dirty (CRCs are only recomputed during log checkpoints). > > Therefore, be more careful about what we do with a buffer -- if someone > else already attached ops that are not the ones for this btree type, > just ignore the buffer. We only attach our btree type's buf ops if it > matches the magic/uuid and structure checks. > > We also modify xfs_buf_read_map to allow callers to set buffer ops on a > DONE buffer with NULL ops so that repair doesn't leave behind buffers > which won't have buffers attached to them. > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/scrub/repair.c | 65 ++++++++++++++++++++++++++++++++++++++---------- > fs/xfs/xfs_trans.h | 1 + > fs/xfs/xfs_trans_buf.c | 13 ++++++++++ > 3 files changed, 65 insertions(+), 14 deletions(-) > > > diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c > index 63786341ac2a..a07b9364c9de 100644 > --- a/fs/xfs/scrub/repair.c > +++ b/fs/xfs/scrub/repair.c > @@ -29,6 +29,8 @@ > #include "xfs_ag_resv.h" > #include "xfs_trans_space.h" > #include "xfs_quota.h" > +#include "xfs_attr.h" > +#include "xfs_reflink.h" > #include "scrub/xfs_scrub.h" > #include "scrub/scrub.h" > #include "scrub/common.h" > @@ -699,7 +701,7 @@ xrep_findroot_block( > struct xfs_btree_block *btblock; > xfs_daddr_t daddr; > int block_level; > - int error; > + int error = 0; > > daddr = XFS_AGB_TO_DADDR(mp, ri->sc->sa.agno, agbno); > > @@ -718,28 +720,63 @@ xrep_findroot_block( > return error; > } > > + /* > + * Read the buffer into memory so that we can see if it's a match for > + * our btree type. We have no clue if it is beforehand, and we want to > + * avoid xfs_trans_read_buf's behavior of dumping the DONE state (which > + * will cause needless disk reads in subsequent calls to this function) > + * and logging metadata verifier failures. > + * > + * Therefore, pass in NULL buffer ops. If the buffer was already in > + * memory from some other caller it will already have b_ops assigned. > + * If it was in memory from a previous unsuccessful findroot_block > + * call, the buffer won't have b_ops but it should be clean and ready > + * for us to try to verify if the read call succeeds. The same applies > + * if the buffer wasn't in memory at all. > + * > + * Note: If we never match a btree type with this buffer, it will be > + * left in memory with NULL b_ops. This shouldn't be a problem unless > + * the buffer gets written. > + */ > error = xfs_trans_read_buf(mp, ri->sc->tp, mp->m_ddev_targp, daddr, > mp->m_bsize, 0, &bp, NULL); > if (error) > return error; > > - /* > - * Does this look like a block matching our fs and higher than any > - * other block we've found so far? If so, reattach buffer verifiers > - * so the AIL won't complain if the buffer is also dirty. > - */ > + /* Ensure the block magic matches the btree type we're looking for. */ > btblock = XFS_BUF_TO_BLOCK(bp); > if (be32_to_cpu(btblock->bb_magic) != fab->magic) > goto out; > - if (xfs_sb_version_hascrc(&mp->m_sb) && > - !uuid_equal(&btblock->bb_u.s.bb_uuid, &mp->m_sb.sb_meta_uuid)) > - goto out; > - bp->b_ops = fab->buf_ops; > > - /* Make sure we pass the verifiers. */ > - bp->b_ops->verify_read(bp); > - if (bp->b_error) > - goto out; > + /* > + * If the buffer already has ops applied and they're not the ones for > + * this btree type, we know this block doesn't match the btree and we > + * can bail out. > + * > + * If the buffer ops match ours, someone else has already validated > + * the block for us, so we can move on to checking if this is a root > + * block candidate. > + * > + * If the buffer does not have ops, nobody has successfully validated > + * the contents and the buffer cannot be dirty. If the magic, uuid, > + * and structure match this btree type then we'll move on to checking > + * if it's a root block candidate. If there is no match, bail out. > + */ > + if (bp->b_ops) { > + if (bp->b_ops != fab->buf_ops) > + goto out; > + } else { > + ASSERT(!xfs_trans_buf_is_dirty(bp)); > + if (!uuid_equal(&btblock->bb_u.s.bb_uuid, > + &mp->m_sb.sb_meta_uuid)) > + goto out; > + fab->buf_ops->verify_read(bp); > + if (bp->b_error) { > + bp->b_error = 0; > + goto out; > + } > + bp->b_ops = fab->buf_ops; In light of the assert issues you hit on the previous patch related to verifiers reassigning ->b_ops, perhaps we should think about clearing ->b_ops on error and making the line above something like: if (!bp->b_ops) bp->b_ops = fab->buf_ops; I guess this mechanism is only for per-ag btrees atm, but that could be a fun landmine to deal with if this rmap searching/detecting code is ever repurposed to deal with directories/attrs or other verifiers are updated to do a similar kind of reassignment. Not an immediate issue (and I don't want to nit this patch to death :P), so: Reviewed-by: Brian Foster <bfoster@redhat.com> > + } > > /* > * This block passes the magic/uuid and verifier tests for this btree > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h > index c3d278e96ad1..a0c5dbda18aa 100644 > --- a/fs/xfs/xfs_trans.h > +++ b/fs/xfs/xfs_trans.h > @@ -220,6 +220,7 @@ void xfs_trans_ijoin(struct xfs_trans *, struct xfs_inode *, uint); > void xfs_trans_log_buf(struct xfs_trans *, struct xfs_buf *, uint, > uint); > void xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *); > +bool xfs_trans_buf_is_dirty(struct xfs_buf *bp); > void xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint); > > void xfs_extent_free_init_defer_op(void); > diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c > index fc40160c1773..629f1479c9d2 100644 > --- a/fs/xfs/xfs_trans_buf.c > +++ b/fs/xfs/xfs_trans_buf.c > @@ -350,6 +350,19 @@ xfs_trans_read_buf_map( > > } > > +/* Has this buffer been dirtied by anyone? */ > +bool > +xfs_trans_buf_is_dirty( > + struct xfs_buf *bp) > +{ > + struct xfs_buf_log_item *bip = bp->b_log_item; > + > + if (!bip) > + return false; > + ASSERT(bip->bli_item.li_type == XFS_LI_BUF); > + return test_bit(XFS_LI_DIRTY, &bip->bli_item.li_flags); > +} > + > /* > * Release a buffer previously joined to the transaction. If the buffer is > * modified within this transaction, decrement the recursion count but do not > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block 2018-10-09 12:16 ` Brian Foster @ 2018-10-09 16:19 ` Darrick J. Wong 0 siblings, 0 replies; 13+ messages in thread From: Darrick J. Wong @ 2018-10-09 16:19 UTC (permalink / raw) To: Brian Foster; +Cc: david, linux-xfs On Tue, Oct 09, 2018 at 08:16:39AM -0400, Brian Foster wrote: > On Mon, Oct 08, 2018 at 09:19:47PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > We don't handle buffer state properly in online repair's findroot > > routine. If a buffer already has b_ops set, we don't ever want to touch > > that, and we don't want to call the read verifiers on a buffer that > > could be dirty (CRCs are only recomputed during log checkpoints). > > > > Therefore, be more careful about what we do with a buffer -- if someone > > else already attached ops that are not the ones for this btree type, > > just ignore the buffer. We only attach our btree type's buf ops if it > > matches the magic/uuid and structure checks. > > > > We also modify xfs_buf_read_map to allow callers to set buffer ops on a > > DONE buffer with NULL ops so that repair doesn't leave behind buffers > > which won't have buffers attached to them. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > --- > > fs/xfs/scrub/repair.c | 65 ++++++++++++++++++++++++++++++++++++++---------- > > fs/xfs/xfs_trans.h | 1 + > > fs/xfs/xfs_trans_buf.c | 13 ++++++++++ > > 3 files changed, 65 insertions(+), 14 deletions(-) > > > > > > diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c > > index 63786341ac2a..a07b9364c9de 100644 > > --- a/fs/xfs/scrub/repair.c > > +++ b/fs/xfs/scrub/repair.c > > @@ -29,6 +29,8 @@ > > #include "xfs_ag_resv.h" > > #include "xfs_trans_space.h" > > #include "xfs_quota.h" > > +#include "xfs_attr.h" > > +#include "xfs_reflink.h" > > #include "scrub/xfs_scrub.h" > > #include "scrub/scrub.h" > > #include "scrub/common.h" > > @@ -699,7 +701,7 @@ xrep_findroot_block( > > struct xfs_btree_block *btblock; > > xfs_daddr_t daddr; > > int block_level; > > - int error; > > + int error = 0; > > > > daddr = XFS_AGB_TO_DADDR(mp, ri->sc->sa.agno, agbno); > > > > @@ -718,28 +720,63 @@ xrep_findroot_block( > > return error; > > } > > > > + /* > > + * Read the buffer into memory so that we can see if it's a match for > > + * our btree type. We have no clue if it is beforehand, and we want to > > + * avoid xfs_trans_read_buf's behavior of dumping the DONE state (which > > + * will cause needless disk reads in subsequent calls to this function) > > + * and logging metadata verifier failures. > > + * > > + * Therefore, pass in NULL buffer ops. If the buffer was already in > > + * memory from some other caller it will already have b_ops assigned. > > + * If it was in memory from a previous unsuccessful findroot_block > > + * call, the buffer won't have b_ops but it should be clean and ready > > + * for us to try to verify if the read call succeeds. The same applies > > + * if the buffer wasn't in memory at all. > > + * > > + * Note: If we never match a btree type with this buffer, it will be > > + * left in memory with NULL b_ops. This shouldn't be a problem unless > > + * the buffer gets written. > > + */ > > error = xfs_trans_read_buf(mp, ri->sc->tp, mp->m_ddev_targp, daddr, > > mp->m_bsize, 0, &bp, NULL); > > if (error) > > return error; > > > > - /* > > - * Does this look like a block matching our fs and higher than any > > - * other block we've found so far? If so, reattach buffer verifiers > > - * so the AIL won't complain if the buffer is also dirty. > > - */ > > + /* Ensure the block magic matches the btree type we're looking for. */ > > btblock = XFS_BUF_TO_BLOCK(bp); > > if (be32_to_cpu(btblock->bb_magic) != fab->magic) > > goto out; > > - if (xfs_sb_version_hascrc(&mp->m_sb) && > > - !uuid_equal(&btblock->bb_u.s.bb_uuid, &mp->m_sb.sb_meta_uuid)) > > - goto out; > > - bp->b_ops = fab->buf_ops; > > > > - /* Make sure we pass the verifiers. */ > > - bp->b_ops->verify_read(bp); > > - if (bp->b_error) > > - goto out; > > + /* > > + * If the buffer already has ops applied and they're not the ones for > > + * this btree type, we know this block doesn't match the btree and we > > + * can bail out. > > + * > > + * If the buffer ops match ours, someone else has already validated > > + * the block for us, so we can move on to checking if this is a root > > + * block candidate. > > + * > > + * If the buffer does not have ops, nobody has successfully validated > > + * the contents and the buffer cannot be dirty. If the magic, uuid, > > + * and structure match this btree type then we'll move on to checking > > + * if it's a root block candidate. If there is no match, bail out. > > + */ > > + if (bp->b_ops) { > > + if (bp->b_ops != fab->buf_ops) > > + goto out; > > + } else { > > + ASSERT(!xfs_trans_buf_is_dirty(bp)); > > + if (!uuid_equal(&btblock->bb_u.s.bb_uuid, > > + &mp->m_sb.sb_meta_uuid)) > > + goto out; > > + fab->buf_ops->verify_read(bp); > > + if (bp->b_error) { > > + bp->b_error = 0; > > + goto out; > > + } > > + bp->b_ops = fab->buf_ops; > > In light of the assert issues you hit on the previous patch related to > verifiers reassigning ->b_ops, perhaps we should think about clearing > ->b_ops on error and making the line above something like: > > if (!bp->b_ops) > bp->b_ops = fab->buf_ops; > > I guess this mechanism is only for per-ag btrees atm, but that could be > a fun landmine to deal with if this rmap searching/detecting code is > ever repurposed to deal with directories/attrs or other verifiers are > updated to do a similar kind of reassignment. Not an immediate issue > (and I don't want to nit this patch to death :P), so: This function was only ever meant to find AG btree roots (and the dir/attr code uses a different strategy to recover its marbles), but there's still time to revise the patch to avoid that landmine, so I'll go ahead and add that in: /* * Some read verifiers will (re)set b_ops, so we must be * careful not to blow away any such assignment. */ if (!bp->b_ops) bp->b_ops = fab->buf_ops; Thanks for the review! --D > Reviewed-by: Brian Foster <bfoster@redhat.com> > > > + } > > > > /* > > * This block passes the magic/uuid and verifier tests for this btree > > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h > > index c3d278e96ad1..a0c5dbda18aa 100644 > > --- a/fs/xfs/xfs_trans.h > > +++ b/fs/xfs/xfs_trans.h > > @@ -220,6 +220,7 @@ void xfs_trans_ijoin(struct xfs_trans *, struct xfs_inode *, uint); > > void xfs_trans_log_buf(struct xfs_trans *, struct xfs_buf *, uint, > > uint); > > void xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *); > > +bool xfs_trans_buf_is_dirty(struct xfs_buf *bp); > > void xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint); > > > > void xfs_extent_free_init_defer_op(void); > > diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c > > index fc40160c1773..629f1479c9d2 100644 > > --- a/fs/xfs/xfs_trans_buf.c > > +++ b/fs/xfs/xfs_trans_buf.c > > @@ -350,6 +350,19 @@ xfs_trans_read_buf_map( > > > > } > > > > +/* Has this buffer been dirtied by anyone? */ > > +bool > > +xfs_trans_buf_is_dirty( > > + struct xfs_buf *bp) > > +{ > > + struct xfs_buf_log_item *bip = bp->b_log_item; > > + > > + if (!bip) > > + return false; > > + ASSERT(bip->bli_item.li_type == XFS_LI_BUF); > > + return test_bit(XFS_LI_DIRTY, &bip->bli_item.li_flags); > > +} > > + > > /* > > * Release a buffer previously joined to the transaction. If the buffer is > > * modified within this transaction, decrement the recursion count but do not > > ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-10-09 23:37 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-10-05 0:47 [PATCH v3 0/3] xfs-4.20: scrub fixes Darrick J. Wong 2018-10-05 0:47 ` [PATCH 1/3] xfs: xrep_findroot_block should reject root blocks with siblings Darrick J. Wong 2018-10-05 0:47 ` [PATCH 2/3] xfs: always assign buffer verifiers when one is provided Darrick J. Wong 2018-10-05 11:57 ` Brian Foster 2018-10-05 17:02 ` Darrick J. Wong 2018-10-06 3:15 ` Darrick J. Wong 2018-10-06 10:25 ` Christoph Hellwig 2018-10-05 0:47 ` [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block Darrick J. Wong 2018-10-05 11:59 ` Brian Foster 2018-10-05 15:11 ` Darrick J. Wong 2018-10-09 4:19 [PATCH v4 0/3] xfs-4.20: scrub fixes Darrick J. Wong 2018-10-09 4:19 ` [PATCH 3/3] xfs: fix buffer state management in xrep_findroot_block Darrick J. Wong 2018-10-09 12:16 ` Brian Foster 2018-10-09 16:19 ` Darrick J. Wong
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.