* [PATCH 0/3] xfs: various fixes @ 2019-11-11 1:18 Darrick J. Wong 2019-11-11 1:18 ` [PATCH 1/3] xfs: actually check xfs_btree_check_block return in xfs_btree_islastblock Darrick J. Wong ` (2 more replies) 0 siblings, 3 replies; 9+ messages in thread From: Darrick J. Wong @ 2019-11-11 1:18 UTC (permalink / raw) To: darrick.wong; +Cc: linux-xfs, hch Hi all, This series contians various unrelated fixes: the first one shuts up a Coverity complaint in the btree islastblock predicate by turning it into an assert. The second patch fixes a 17yo bug where the iomap unwritten extent conversion code forgets to ensure that dquots are attached to the inode, which means we fail to account for bmbt splits and thus the quota accounting is incorrect. The third fixes the same problem but with the swap extents ioctl. If you're going to start using this mess, you probably ought to just pull from my git trees, which are linked below. This has been lightly tested with fstests. Enjoy! Comments and questions are, as always, welcome. --D xfsprogs git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=random-fixes fstests git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=random-fixes ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/3] xfs: actually check xfs_btree_check_block return in xfs_btree_islastblock 2019-11-11 1:18 [PATCH 0/3] xfs: various fixes Darrick J. Wong @ 2019-11-11 1:18 ` Darrick J. Wong 2019-11-11 7:57 ` Christoph Hellwig 2019-11-11 1:18 ` [PATCH 2/3] xfs: attach dquots and reserve quota blocks during unwritten conversion Darrick J. Wong 2019-11-11 1:18 ` [PATCH 3/3] xfs: attach dquots before performing xfs_swap_extents Darrick J. Wong 2 siblings, 1 reply; 9+ messages in thread From: Darrick J. Wong @ 2019-11-11 1:18 UTC (permalink / raw) To: darrick.wong; +Cc: linux-xfs, hch From: Darrick J. Wong <darrick.wong@oracle.com> Coverity points out that xfs_btree_islastblock doesn't check the return value of xfs_btree_check_block. Since the question "Does the cursor point to the last block in this level?" only makes sense if the caller previously performed a lookup or seek operation, the block should already have been checked. Therefore, check the return value in an ASSERT and turn the whole thing into a static inline predicate. Coverity-id: 114069 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/libxfs/xfs_btree.c | 19 ------------------- fs/xfs/libxfs/xfs_btree.h | 25 +++++++++++++++++-------- 2 files changed, 17 insertions(+), 27 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index feab5b307c45..3d48878ab298 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -714,25 +714,6 @@ xfs_btree_get_bufs( return xfs_trans_get_buf(tp, mp->m_ddev_targp, d, mp->m_bsize, 0); } -/* - * Check for the cursor referring to the last block at the given level. - */ -int /* 1=is last block, 0=not last block */ -xfs_btree_islastblock( - xfs_btree_cur_t *cur, /* btree cursor */ - int level) /* level to check */ -{ - struct xfs_btree_block *block; /* generic btree block pointer */ - xfs_buf_t *bp; /* buffer containing block */ - - block = xfs_btree_get_block(cur, level, &bp); - xfs_btree_check_block(cur, block, level, bp); - if (cur->bc_flags & XFS_BTREE_LONG_PTRS) - return block->bb_u.l.bb_rightsib == cpu_to_be64(NULLFSBLOCK); - else - return block->bb_u.s.bb_rightsib == cpu_to_be32(NULLAGBLOCK); -} - /* * Change the cursor to point to the first record at the given level. * Other levels are unaffected. diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 6670120cd690..fb9b2121c628 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -317,14 +317,6 @@ xfs_btree_get_bufs( xfs_agnumber_t agno, /* allocation group number */ xfs_agblock_t agbno); /* allocation group block number */ -/* - * Check for the cursor referring to the last block at the given level. - */ -int /* 1=is last block, 0=not last block */ -xfs_btree_islastblock( - xfs_btree_cur_t *cur, /* btree cursor */ - int level); /* level to check */ - /* * Compute first and last byte offsets for the fields given. * Interprets the offsets table, which contains struct field offsets. @@ -524,4 +516,21 @@ int xfs_btree_has_record(struct xfs_btree_cur *cur, union xfs_btree_irec *low, union xfs_btree_irec *high, bool *exists); bool xfs_btree_has_more_records(struct xfs_btree_cur *cur); +/* Does this cursor point to the last block in the given level? */ +static inline bool +xfs_btree_islastblock( + xfs_btree_cur_t *cur, + int level) +{ + struct xfs_btree_block *block; + struct xfs_buf *bp; + + block = xfs_btree_get_block(cur, level, &bp); + ASSERT(block && xfs_btree_check_block(cur, block, level, bp) == 0); + + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) + return block->bb_u.l.bb_rightsib == cpu_to_be64(NULLFSBLOCK); + return block->bb_u.s.bb_rightsib == cpu_to_be32(NULLAGBLOCK); +} + #endif /* __XFS_BTREE_H__ */ ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] xfs: actually check xfs_btree_check_block return in xfs_btree_islastblock 2019-11-11 1:18 ` [PATCH 1/3] xfs: actually check xfs_btree_check_block return in xfs_btree_islastblock Darrick J. Wong @ 2019-11-11 7:57 ` Christoph Hellwig 0 siblings, 0 replies; 9+ messages in thread From: Christoph Hellwig @ 2019-11-11 7:57 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, hch Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/3] xfs: attach dquots and reserve quota blocks during unwritten conversion 2019-11-11 1:18 [PATCH 0/3] xfs: various fixes Darrick J. Wong 2019-11-11 1:18 ` [PATCH 1/3] xfs: actually check xfs_btree_check_block return in xfs_btree_islastblock Darrick J. Wong @ 2019-11-11 1:18 ` Darrick J. Wong 2019-11-11 7:59 ` Christoph Hellwig 2019-11-11 1:18 ` [PATCH 3/3] xfs: attach dquots before performing xfs_swap_extents Darrick J. Wong 2 siblings, 1 reply; 9+ messages in thread From: Darrick J. Wong @ 2019-11-11 1:18 UTC (permalink / raw) To: darrick.wong; +Cc: linux-xfs, hch From: Darrick J. Wong <darrick.wong@oracle.com> In xfs_iomap_write_unwritten, we need to ensure that dquots are attached to the inode and quota blocks reserved so that we capture in the quota counters any blocks allocated to handle a bmbt split. This can happen on the first unwritten extent conversion to a preallocated sparse file on a fresh mount. This was found by running generic/311 with quotas enabled. The bug seems to have been introduced in "[XFS] rework iocore infrastructure, remove some code and make it more" from ~2002? Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_iomap.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index f234929b4f15..97dbe3b2aec0 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -539,6 +539,11 @@ xfs_iomap_write_unwritten( */ resblks = XFS_DIOSTRAT_SPACE_RES(mp, 0) << 1; + /* Attach dquots so that bmbt splits are accounted correctly. */ + error = xfs_qm_dqattach(ip); + if (error) + return error; + do { /* * Set up a transaction to convert the range of extents @@ -557,6 +562,11 @@ xfs_iomap_write_unwritten( xfs_ilock(ip, XFS_ILOCK_EXCL); xfs_trans_ijoin(tp, ip, 0); + error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0, + XFS_QMOPT_RES_REGBLKS); + if (error) + goto error_on_bmapi_transaction; + /* * Modify the unwritten extent state of the buffer. */ ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] xfs: attach dquots and reserve quota blocks during unwritten conversion 2019-11-11 1:18 ` [PATCH 2/3] xfs: attach dquots and reserve quota blocks during unwritten conversion Darrick J. Wong @ 2019-11-11 7:59 ` Christoph Hellwig 0 siblings, 0 replies; 9+ messages in thread From: Christoph Hellwig @ 2019-11-11 7:59 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, hch On Sun, Nov 10, 2019 at 05:18:27PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > In xfs_iomap_write_unwritten, we need to ensure that dquots are attached > to the inode and quota blocks reserved so that we capture in the quota > counters any blocks allocated to handle a bmbt split. This can happen > on the first unwritten extent conversion to a preallocated sparse file > on a fresh mount. > > This was found by running generic/311 with quotas enabled. The bug > seems to have been introduced in "[XFS] rework iocore infrastructure, > remove some code and make it more" from ~2002? Wow. The fix looks correct, but I can't see how we got away with that for so long.. Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 3/3] xfs: attach dquots before performing xfs_swap_extents 2019-11-11 1:18 [PATCH 0/3] xfs: various fixes Darrick J. Wong 2019-11-11 1:18 ` [PATCH 1/3] xfs: actually check xfs_btree_check_block return in xfs_btree_islastblock Darrick J. Wong 2019-11-11 1:18 ` [PATCH 2/3] xfs: attach dquots and reserve quota blocks during unwritten conversion Darrick J. Wong @ 2019-11-11 1:18 ` Darrick J. Wong 2019-11-11 8:05 ` Christoph Hellwig 2 siblings, 1 reply; 9+ messages in thread From: Darrick J. Wong @ 2019-11-11 1:18 UTC (permalink / raw) To: darrick.wong; +Cc: linux-xfs, hch From: Darrick J. Wong <darrick.wong@oracle.com> Make sure we attach dquots to both inodes before swapping their extents. This was found via manual code inspection by looking for places where we could call xfs_trans_mod_dquot without dquots attached to inodes, and confirmed by instrumenting the kernel and running xfs/328. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- fs/xfs/xfs_bmap_util.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 9d731b71e84f..2efd78a9719e 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -1569,6 +1569,14 @@ xfs_swap_extents( goto out_unlock; } + error = xfs_qm_dqattach(ip); + if (error) + goto out_unlock; + + error = xfs_qm_dqattach(tip); + if (error) + goto out_unlock; + error = xfs_swap_extent_flush(ip); if (error) goto out_unlock; ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] xfs: attach dquots before performing xfs_swap_extents 2019-11-11 1:18 ` [PATCH 3/3] xfs: attach dquots before performing xfs_swap_extents Darrick J. Wong @ 2019-11-11 8:05 ` Christoph Hellwig 2019-11-12 22:14 ` Dave Chinner 0 siblings, 1 reply; 9+ messages in thread From: Christoph Hellwig @ 2019-11-11 8:05 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, hch On Sun, Nov 10, 2019 at 05:18:34PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > Make sure we attach dquots to both inodes before swapping their extents. > This was found via manual code inspection by looking for places where we > could call xfs_trans_mod_dquot without dquots attached to inodes, and > confirmed by instrumenting the kernel and running xfs/328. Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> Btw, for while I've been wondering if we could just get rid of the concepts of attached dquots. With the radix-tree/xarray looks up are be fairly cheap, and could be done lockless using RCU. So we could try to just kill the concept of attaching the dquot to the inode and just look it up once per operation, where operation preferally is something high-level like the actual file/inode operation and not a low-level thing inside xfs. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] xfs: attach dquots before performing xfs_swap_extents 2019-11-11 8:05 ` Christoph Hellwig @ 2019-11-12 22:14 ` Dave Chinner 2019-11-14 14:53 ` Christoph Hellwig 0 siblings, 1 reply; 9+ messages in thread From: Dave Chinner @ 2019-11-12 22:14 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Darrick J. Wong, linux-xfs On Mon, Nov 11, 2019 at 12:05:03AM -0800, Christoph Hellwig wrote: > On Sun, Nov 10, 2019 at 05:18:34PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > Make sure we attach dquots to both inodes before swapping their extents. > > This was found via manual code inspection by looking for places where we > > could call xfs_trans_mod_dquot without dquots attached to inodes, and > > confirmed by instrumenting the kernel and running xfs/328. > > Looks good: > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > Btw, for while I've been wondering if we could just get rid of the > concepts of attached dquots. With the radix-tree/xarray looks up > are be fairly cheap, and could be done lockless using RCU. So we could > try to just kill the concept of attaching the dquot to the inode and > just look it up once per operation, where operation preferally is > something high-level like the actual file/inode operation and not a > low-level thing inside xfs. If the dquots are not attached to the inode, how would you pass the 3 dquots per inode down the stack to where they are actually used inside the filesystem? I mean, we have to get the dquots attached to the transaction so we can update them in xfs_trans_commit -> xfs_trans_apply_dquot_deltas(), so somehow we'd have to get them from the high level file/inode operations down to the XFS transaction context. And things like writeback need dquots attached for delayed allocation, so various aops would need to do dquot lookups, too... I can see the advantage of doing rcu dquot cache lookups in the xfs context where we are attaching the dquots to the transaction rather than attaching them to the inode, but I can't see how the "do it at a high level" aspect of this would work.... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] xfs: attach dquots before performing xfs_swap_extents 2019-11-12 22:14 ` Dave Chinner @ 2019-11-14 14:53 ` Christoph Hellwig 0 siblings, 0 replies; 9+ messages in thread From: Christoph Hellwig @ 2019-11-14 14:53 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, Darrick J. Wong, linux-xfs On Wed, Nov 13, 2019 at 09:14:48AM +1100, Dave Chinner wrote: > If the dquots are not attached to the inode, how would you pass the > 3 dquots per inode down the stack to where they are actually used > inside the filesystem? I mean, we have to get the dquots attached to > the transaction so we can update them in xfs_trans_commit -> > xfs_trans_apply_dquot_deltas(), so somehow we'd have to get them > from the high level file/inode operations down to the XFS > transaction context. And things like writeback need dquots attached > for delayed allocation, so various aops would need to do dquot > lookups, too... My prime idea was to attach them to the transaction and keep them over transaction roles. Then see what is left and probably use an on-stack struct containing three dquots. At that point I know if that idea was feasible, because if we have too many deep callstacks where we need to pass that struct it obviously isn't. > I can see the advantage of doing rcu dquot cache lookups in the xfs > context where we are attaching the dquots to the transaction rather > than attaching them to the inode, but I can't see how the "do it at > a high level" aspect of this would work.... Most of our ops really just have one transaction / set of rolled over permanent transactions, because if they didn't they wouldn't be atomic.. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2019-11-14 14:53 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-11 1:18 [PATCH 0/3] xfs: various fixes Darrick J. Wong 2019-11-11 1:18 ` [PATCH 1/3] xfs: actually check xfs_btree_check_block return in xfs_btree_islastblock Darrick J. Wong 2019-11-11 7:57 ` Christoph Hellwig 2019-11-11 1:18 ` [PATCH 2/3] xfs: attach dquots and reserve quota blocks during unwritten conversion Darrick J. Wong 2019-11-11 7:59 ` Christoph Hellwig 2019-11-11 1:18 ` [PATCH 3/3] xfs: attach dquots before performing xfs_swap_extents Darrick J. Wong 2019-11-11 8:05 ` Christoph Hellwig 2019-11-12 22:14 ` Dave Chinner 2019-11-14 14:53 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).