All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+)
@ 2022-06-17 10:06 Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 01/11] xfs: use kmem_cache_free() for kmem_cache objects Amir Goldstein
                   ` (11 more replies)
  0 siblings, 12 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests

Hi all,

Previously posted candidates for 5.10.y followed chronological release
order.

Parts 1 and 2 of fixes from v5.10..v5.12 have already been applied to
v5.10.121.

Part 3 (from 5.13) has already been posted for review [3] on June 6,
but following feedback from Dave, I changed my focus to get the same
set of patches tested and reviewed for 5.10.y/5.15.y.

I do want to ask you guys to also find time to review part 3, because
we have a lot of catching up to do for 5.10.y, so we need to chew at
this debt at a reasonable rate.

This post has the matching set of patches for 5.10.y that goes with
Leah's first set of candidates for 5.15.y [1].

Most of the fixes are from v5.15..v5.17 except for patch 11 (v5.18-rc1).
All fix patches have been tagged with Fixes: by the author.

The patches have been soaking in kdepops since Sunday. They passed more
than 30 auto group runs with several different versions of xfsprogs.

The differences from Leah's 5.15.y:
- It is 11 patches and not 8 because of dependencies
- Patches 6,7 are non-fixes backported as dependency to patch 8 -
  they have "backported .* for dependency" in their commit message
- Patches 3,4,11 needed changes to apply to 5.10.y - they have a
  "backport" related comment in their commit message to explain what
  changes were needed
- Patch 10 is a fix from v5.12 that is re-posted as a dependency for
  patch 11

Darrick,

As the author patches 4,11 and sole reviewer of patch 3 (a.k.a
the non-cleanly applied patches), please take a closer look at those.

Patch 10 has been dropped from my part 2 candidates following concerns
raised by Dave and is now being re-posted following feedback from
Christian and Christoph [2].

If there are still concerns about patches 10 or 11, please raise a flag.
I can drop either of these patches before posting to stable if anyone
feels that they need more time to soak in master.

Thanks,
Amir.

[1] https://lore.kernel.org/linux-xfs/20220616182749.1200971-1-leah.rumancik@gmail.com/
[2] https://lore.kernel.org/linux-xfs/CAOQ4uxg4=m9zEFbDAKXx7CP7HYiMwtsYSJvq076oKpy-OhK1uw@mail.gmail.com/
[3] https://lore.kernel.org/linux-xfs/20220606160537.689915-1-amir73il@gmail.com/

Brian Foster (1):
  xfs: punch out data fork delalloc blocks on COW writeback failure

Christoph Hellwig (2):
  xfs: refactor xfs_file_fsync
  xfs: fix up non-directory creation in SGID directories

Darrick J. Wong (4):
  xfs: remove all COW fork extents when remounting readonly
  xfs: prevent UAF in xfs_log_item_in_current_chkpt
  xfs: only bother with sync_filesystem during readonly remount
  xfs: use setattr_copy to set vfs inode attributes

Dave Chinner (2):
  xfs: check sb_meta_uuid for dabuf buffer recovery
  xfs: xfs_log_force_lsn isn't passed a LSN

Rustam Kovhaev (1):
  xfs: use kmem_cache_free() for kmem_cache objects

Yang Xu (1):
  xfs: Fix the free logic of state in xfs_attr_node_hasname

 fs/xfs/libxfs/xfs_attr.c      | 13 +++---
 fs/xfs/libxfs/xfs_types.h     |  1 +
 fs/xfs/xfs_aops.c             | 15 +++++--
 fs/xfs/xfs_buf_item.c         |  2 +-
 fs/xfs/xfs_buf_item_recover.c |  2 +-
 fs/xfs/xfs_dquot_item.c       |  2 +-
 fs/xfs/xfs_extfree_item.c     |  6 +--
 fs/xfs/xfs_file.c             | 81 +++++++++++++++++++++--------------
 fs/xfs/xfs_inode.c            | 24 +++++------
 fs/xfs/xfs_inode_item.c       |  4 +-
 fs/xfs/xfs_inode_item.h       |  2 +-
 fs/xfs/xfs_iops.c             | 56 ++----------------------
 fs/xfs/xfs_log.c              | 27 ++++++------
 fs/xfs/xfs_log.h              |  4 +-
 fs/xfs/xfs_log_cil.c          | 32 ++++++--------
 fs/xfs/xfs_log_priv.h         | 15 +++----
 fs/xfs/xfs_pnfs.c             |  3 +-
 fs/xfs/xfs_super.c            | 21 ++++++---
 fs/xfs/xfs_trans.c            |  6 +--
 fs/xfs/xfs_trans.h            |  4 +-
 20 files changed, 149 insertions(+), 171 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 01/11] xfs: use kmem_cache_free() for kmem_cache objects
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 02/11] xfs: punch out data fork delalloc blocks on COW writeback failure Amir Goldstein
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Rustam Kovhaev

From: Rustam Kovhaev <rkovhaev@gmail.com>

commit c30a0cbd07ecc0eec7b3cd568f7b1c7bb7913f93 upstream.

For kmalloc() allocations SLOB prepends the blocks with a 4-byte header,
and it puts the size of the allocated blocks in that header.
Blocks allocated with kmem_cache_alloc() allocations do not have that
header.

SLOB explodes when you allocate memory with kmem_cache_alloc() and then
try to free it with kfree() instead of kmem_cache_free().
SLOB will assume that there is a header when there is none, read some
garbage to size variable and corrupt the adjacent objects, which
eventually leads to hang or panic.

Let's make XFS work with SLOB by using proper free function.

Fixes: 9749fee83f38 ("xfs: enable the xfs_defer mechanism to process extents to free")
Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_extfree_item.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 5c0395256bd1..11474770d630 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -482,7 +482,7 @@ xfs_extent_free_finish_item(
 			free->xefi_startblock,
 			free->xefi_blockcount,
 			&free->xefi_oinfo, free->xefi_skip_discard);
-	kmem_free(free);
+	kmem_cache_free(xfs_bmap_free_item_zone, free);
 	return error;
 }
 
@@ -502,7 +502,7 @@ xfs_extent_free_cancel_item(
 	struct xfs_extent_free_item	*free;
 
 	free = container_of(item, struct xfs_extent_free_item, xefi_list);
-	kmem_free(free);
+	kmem_cache_free(xfs_bmap_free_item_zone, free);
 }
 
 const struct xfs_defer_op_type xfs_extent_free_defer_type = {
@@ -564,7 +564,7 @@ xfs_agfl_free_finish_item(
 	extp->ext_len = free->xefi_blockcount;
 	efdp->efd_next_extent++;
 
-	kmem_free(free);
+	kmem_cache_free(xfs_bmap_free_item_zone, free);
 	return error;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 02/11] xfs: punch out data fork delalloc blocks on COW writeback failure
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 01/11] xfs: use kmem_cache_free() for kmem_cache objects Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 03/11] xfs: Fix the free logic of state in xfs_attr_node_hasname Amir Goldstein
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Brian Foster

From: Brian Foster <bfoster@redhat.com>

commit 5ca5916b6bc93577c360c06cb7cdf71adb9b5faf upstream.

If writeback I/O to a COW extent fails, the COW fork blocks are
punched out and the data fork blocks left alone. It is possible for
COW fork blocks to overlap non-shared data fork blocks (due to
cowextsz hint prealloc), however, and writeback unconditionally maps
to the COW fork whenever blocks exist at the corresponding offset of
the page undergoing writeback. This means it's quite possible for a
COW fork extent to overlap delalloc data fork blocks, writeback to
convert and map to the COW fork blocks, writeback to fail, and
finally for ioend completion to cancel the COW fork blocks and leave
stale data fork delalloc blocks around in the inode. The blocks are
effectively stale because writeback failure also discards dirty page
state.

If this occurs, it is likely to trigger assert failures, free space
accounting corruption and failures in unrelated file operations. For
example, a subsequent reflink attempt of the affected file to a new
target file will trip over the stale delalloc in the source file and
fail. Several of these issues are occasionally reproduced by
generic/648, but are reproducible on demand with the right sequence
of operations and timely I/O error injection.

To fix this problem, update the ioend failure path to also punch out
underlying data fork delalloc blocks on I/O error. This is analogous
to the writeback submission failure path in xfs_discard_page() where
we might fail to map data fork delalloc blocks and consistent with
the successful COW writeback completion path, which is responsible
for unmapping from the data fork and remapping in COW fork blocks.

Fixes: 787eb485509f ("xfs: fix and streamline error handling in xfs_end_io")
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_aops.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index b4186d666157..953de843d9c3 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -145,6 +145,7 @@ xfs_end_ioend(
 	struct iomap_ioend	*ioend)
 {
 	struct xfs_inode	*ip = XFS_I(ioend->io_inode);
+	struct xfs_mount	*mp = ip->i_mount;
 	xfs_off_t		offset = ioend->io_offset;
 	size_t			size = ioend->io_size;
 	unsigned int		nofs_flag;
@@ -160,18 +161,26 @@ xfs_end_ioend(
 	/*
 	 * Just clean up the in-memory strutures if the fs has been shut down.
 	 */
-	if (XFS_FORCED_SHUTDOWN(ip->i_mount)) {
+	if (XFS_FORCED_SHUTDOWN(mp)) {
 		error = -EIO;
 		goto done;
 	}
 
 	/*
-	 * Clean up any COW blocks on an I/O error.
+	 * Clean up all COW blocks and underlying data fork delalloc blocks on
+	 * I/O error. The delalloc punch is required because this ioend was
+	 * mapped to blocks in the COW fork and the associated pages are no
+	 * longer dirty. If we don't remove delalloc blocks here, they become
+	 * stale and can corrupt free space accounting on unmount.
 	 */
 	error = blk_status_to_errno(ioend->io_bio->bi_status);
 	if (unlikely(error)) {
-		if (ioend->io_flags & IOMAP_F_SHARED)
+		if (ioend->io_flags & IOMAP_F_SHARED) {
 			xfs_reflink_cancel_cow_range(ip, offset, size, true);
+			xfs_bmap_punch_delalloc_range(ip,
+						      XFS_B_TO_FSBT(mp, offset),
+						      XFS_B_TO_FSB(mp, size));
+		}
 		goto done;
 	}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 03/11] xfs: Fix the free logic of state in xfs_attr_node_hasname
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 01/11] xfs: use kmem_cache_free() for kmem_cache objects Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 02/11] xfs: punch out data fork delalloc blocks on COW writeback failure Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-22 16:32   ` Darrick J. Wong
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 04/11] xfs: remove all COW fork extents when remounting readonly Amir Goldstein
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Yang Xu

From: Yang Xu <xuyang2018.jy@fujitsu.com>

commit a1de97fe296c52eafc6590a3506f4bbd44ecb19a upstream.

When testing xfstests xfs/126 on lastest upstream kernel, it will hang on some machine.
Adding a getxattr operation after xattr corrupted, I can reproduce it 100%.

The deadlock as below:
[983.923403] task:setfattr        state:D stack:    0 pid:17639 ppid: 14687 flags:0x00000080
[  983.923405] Call Trace:
[  983.923410]  __schedule+0x2c4/0x700
[  983.923412]  schedule+0x37/0xa0
[  983.923414]  schedule_timeout+0x274/0x300
[  983.923416]  __down+0x9b/0xf0
[  983.923451]  ? xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
[  983.923453]  down+0x3b/0x50
[  983.923471]  xfs_buf_lock+0x33/0xf0 [xfs]
[  983.923490]  xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
[  983.923508]  xfs_buf_get_map+0x4c/0x320 [xfs]
[  983.923525]  xfs_buf_read_map+0x53/0x310 [xfs]
[  983.923541]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
[  983.923560]  xfs_trans_read_buf_map+0x1cf/0x360 [xfs]
[  983.923575]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
[  983.923590]  xfs_da_read_buf+0xcf/0x120 [xfs]
[  983.923606]  xfs_da3_node_read+0x1f/0x40 [xfs]
[  983.923621]  xfs_da3_node_lookup_int+0x69/0x4a0 [xfs]
[  983.923624]  ? kmem_cache_alloc+0x12e/0x270
[  983.923637]  xfs_attr_node_hasname+0x6e/0xa0 [xfs]
[  983.923651]  xfs_has_attr+0x6e/0xd0 [xfs]
[  983.923664]  xfs_attr_set+0x273/0x320 [xfs]
[  983.923683]  xfs_xattr_set+0x87/0xd0 [xfs]
[  983.923686]  __vfs_removexattr+0x4d/0x60
[  983.923688]  __vfs_removexattr_locked+0xac/0x130
[  983.923689]  vfs_removexattr+0x4e/0xf0
[  983.923690]  removexattr+0x4d/0x80
[  983.923693]  ? __check_object_size+0xa8/0x16b
[  983.923695]  ? strncpy_from_user+0x47/0x1a0
[  983.923696]  ? getname_flags+0x6a/0x1e0
[  983.923697]  ? _cond_resched+0x15/0x30
[  983.923699]  ? __sb_start_write+0x1e/0x70
[  983.923700]  ? mnt_want_write+0x28/0x50
[  983.923701]  path_removexattr+0x9b/0xb0
[  983.923702]  __x64_sys_removexattr+0x17/0x20
[  983.923704]  do_syscall_64+0x5b/0x1a0
[  983.923705]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[  983.923707] RIP: 0033:0x7f080f10ee1b

When getxattr calls xfs_attr_node_get function, xfs_da3_node_lookup_int fails with EFSCORRUPTED in
xfs_attr_node_hasname because we have use blocktrash to random it in xfs/126. So it
free state in internal and xfs_attr_node_get doesn't do xfs_buf_trans release job.

Then subsequent removexattr will hang because of it.

This bug was introduced by kernel commit 07120f1abdff ("xfs: Add xfs_has_attr and subroutines").
It adds xfs_attr_node_hasname helper and said caller will be responsible for freeing the state
in this case. But xfs_attr_node_hasname will free state itself instead of caller if
xfs_da3_node_lookup_int fails.

Fix this bug by moving the step of free state into caller.

[amir: this text from original commit is not relevant for 5.10 backport:
Also, use "goto error/out" instead of returning error directly in xfs_attr_node_addname_find_attr and
xfs_attr_node_removename_setup function because we should free state ourselves.
]

Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/libxfs/xfs_attr.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 96ac7e562b87..fcca36bbd997 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -876,21 +876,18 @@ xfs_attr_node_hasname(
 
 	state = xfs_da_state_alloc(args);
 	if (statep != NULL)
-		*statep = NULL;
+		*statep = state;
 
 	/*
 	 * Search to see if name exists, and get back a pointer to it.
 	 */
 	error = xfs_da3_node_lookup_int(state, &retval);
-	if (error) {
-		xfs_da_state_free(state);
-		return error;
-	}
+	if (error)
+		retval = error;
 
-	if (statep != NULL)
-		*statep = state;
-	else
+	if (!statep)
 		xfs_da_state_free(state);
+
 	return retval;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 04/11] xfs: remove all COW fork extents when remounting readonly
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (2 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 03/11] xfs: Fix the free logic of state in xfs_attr_node_hasname Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 05/11] xfs: check sb_meta_uuid for dabuf buffer recovery Amir Goldstein
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner,
	Chandan Babu R

From: "Darrick J. Wong" <djwong@kernel.org>

commit 089558bc7ba785c03815a49c89e28ad9b8de51f9 upstream.

[backport xfs_icwalk -> xfs_eofblocks for 5.10.y]

As part of multiple customer escalations due to file data corruption
after copy on write operations, I wrote some fstests that use fsstress
to hammer on COW to shake things loose.  Regrettably, I caught some
filesystem shutdowns due to incorrect rmap operations with the following
loop:

mount <filesystem>				# (0)
fsstress <run only readonly ops> &		# (1)
while true; do
	fsstress <run all ops>
	mount -o remount,ro			# (2)
	fsstress <run only readonly ops>
	mount -o remount,rw			# (3)
done

When (2) happens, notice that (1) is still running.  xfs_remount_ro will
call xfs_blockgc_stop to walk the inode cache to free all the COW
extents, but the blockgc mechanism races with (1)'s reader threads to
take IOLOCKs and loses, which means that it doesn't clean them all out.
Call such a file (A).

When (3) happens, xfs_remount_rw calls xfs_reflink_recover_cow, which
walks the ondisk refcount btree and frees any COW extent that it finds.
This function does not check the inode cache, which means that incore
COW forks of inode (A) is now inconsistent with the ondisk metadata.  If
one of those former COW extents are allocated and mapped into another
file (B) and someone triggers a COW to the stale reservation in (A), A's
dirty data will be written into (B) and once that's done, those blocks
will be transferred to (A)'s data fork without bumping the refcount.

The results are catastrophic -- file (B) and the refcount btree are now
corrupt.  Solve this race by forcing the xfs_blockgc_free_space to run
synchronously, which causes xfs_icwalk to return to inodes that were
skipped because the blockgc code couldn't take the IOLOCK.  This is safe
to do here because the VFS has already prohibited new writer threads.

Fixes: 10ddf64e420f ("xfs: remove leftover CoW reservations when remounting ro")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_super.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index d220a63d7883..6323974d6b3e 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1711,7 +1711,10 @@ static int
 xfs_remount_ro(
 	struct xfs_mount	*mp)
 {
-	int error;
+	struct xfs_eofblocks	eofb = {
+		.eof_flags	= XFS_EOF_FLAGS_SYNC,
+	};
+	int			error;
 
 	/*
 	 * Cancel background eofb scanning so it cannot race with the final
@@ -1719,8 +1722,13 @@ xfs_remount_ro(
 	 */
 	xfs_stop_block_reaping(mp);
 
-	/* Get rid of any leftover CoW reservations... */
-	error = xfs_icache_free_cowblocks(mp, NULL);
+	/*
+	 * Clear out all remaining COW staging extents and speculative post-EOF
+	 * preallocations so that we don't leave inodes requiring inactivation
+	 * cleanups during reclaim on a read-only mount.  We must process every
+	 * cached inode, so this requires a synchronous cache scan.
+	 */
+	error = xfs_icache_free_cowblocks(mp, &eofb);
 	if (error) {
 		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
 		return error;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 05/11] xfs: check sb_meta_uuid for dabuf buffer recovery
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (3 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 04/11] xfs: remove all COW fork extents when remounting readonly Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 06/11] xfs: refactor xfs_file_fsync Amir Goldstein
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner

From: Dave Chinner <dchinner@redhat.com>

commit 09654ed8a18cfd45027a67d6cbca45c9ea54feab upstream.

Got a report that a repeated crash test of a container host would
eventually fail with a log recovery error preventing the system from
mounting the root filesystem. It manifested as a directory leaf node
corruption on writeback like so:

 XFS (loop0): Mounting V5 Filesystem
 XFS (loop0): Starting recovery (logdev: internal)
 XFS (loop0): Metadata corruption detected at xfs_dir3_leaf_check_int+0x99/0xf0, xfs_dir3_leaf1 block 0x12faa158
 XFS (loop0): Unmount and run xfs_repair
 XFS (loop0): First 128 bytes of corrupted metadata buffer:
 00000000: 00 00 00 00 00 00 00 00 3d f1 00 00 e1 9e d5 8b  ........=.......
 00000010: 00 00 00 00 12 fa a1 58 00 00 00 29 00 00 1b cc  .......X...)....
 00000020: 91 06 78 ff f7 7e 4a 7d 8d 53 86 f2 ac 47 a8 23  ..x..~J}.S...G.#
 00000030: 00 00 00 00 17 e0 00 80 00 43 00 00 00 00 00 00  .........C......
 00000040: 00 00 00 2e 00 00 00 08 00 00 17 2e 00 00 00 0a  ................
 00000050: 02 35 79 83 00 00 00 30 04 d3 b4 80 00 00 01 50  .5y....0.......P
 00000060: 08 40 95 7f 00 00 02 98 08 41 fe b7 00 00 02 d4  .@.......A......
 00000070: 0d 62 ef a7 00 00 01 f2 14 50 21 41 00 00 00 0c  .b.......P!A....
 XFS (loop0): Corruption of in-memory data (0x8) detected at xfs_do_force_shutdown+0x1a/0x20 (fs/xfs/xfs_buf.c:1514).  Shutting down.
 XFS (loop0): Please unmount the filesystem and rectify the problem(s)
 XFS (loop0): log mount/recovery failed: error -117
 XFS (loop0): log mount failed

Tracing indicated that we were recovering changes from a transaction
at LSN 0x29/0x1c16 into a buffer that had an LSN of 0x29/0x1d57.
That is, log recovery was overwriting a buffer with newer changes on
disk than was in the transaction. Tracing indicated that we were
hitting the "recovery immediately" case in
xfs_buf_log_recovery_lsn(), and hence it was ignoring the LSN in the
buffer.

The code was extracting the LSN correctly, then ignoring it because
the UUID in the buffer did not match the superblock UUID. The
problem arises because the UUID check uses the wrong UUID - it
should be checking the sb_meta_uuid, not sb_uuid. This filesystem
has sb_uuid != sb_meta_uuid (which is fine), and the buffer has the
correct matching sb_meta_uuid in it, it's just the code checked it
against the wrong superblock uuid.

The is no corruption in the filesystem, and failing to recover the
buffer due to a write verifier failure means the recovery bug did
not propagate the corruption to disk. Hence there is no corruption
before or after this bug has manifested, the impact is limited
simply to an unmountable filesystem....

This was missed back in 2015 during an audit of incorrect sb_uuid
usage that resulted in commit fcfbe2c4ef42 ("xfs: log recovery needs
to validate against sb_meta_uuid") that fixed the magic32 buffers to
validate against sb_meta_uuid instead of sb_uuid. It missed the
magicda buffers....

Fixes: ce748eaa65f2 ("xfs: create new metadata UUID field and incompat flag")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_buf_item_recover.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index d44e8b4a3391..1d649462d731 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -805,7 +805,7 @@ xlog_recover_get_buf_lsn(
 	}
 
 	if (lsn != (xfs_lsn_t)-1) {
-		if (!uuid_equal(&mp->m_sb.sb_uuid, uuid))
+		if (!uuid_equal(&mp->m_sb.sb_meta_uuid, uuid))
 			goto recover_immediately;
 		return lsn;
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 06/11] xfs: refactor xfs_file_fsync
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (4 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 05/11] xfs: check sb_meta_uuid for dabuf buffer recovery Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 07/11] xfs: xfs_log_force_lsn isn't passed a LSN Amir Goldstein
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Brian Foster,
	Dave Chinner

From: Christoph Hellwig <hch@lst.de>

commit f22c7f87777361f94aa17f746fbadfa499248dc8 upstream.

[backported for dependency]

Factor out the log syncing logic into two helpers to make the code easier
to read and more maintainable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_file.c | 81 +++++++++++++++++++++++++++++------------------
 1 file changed, 50 insertions(+), 31 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5b0f93f73837..414d856e2e75 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -118,6 +118,54 @@ xfs_dir_fsync(
 	return xfs_log_force_inode(ip);
 }
 
+static xfs_lsn_t
+xfs_fsync_lsn(
+	struct xfs_inode	*ip,
+	bool			datasync)
+{
+	if (!xfs_ipincount(ip))
+		return 0;
+	if (datasync && !(ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
+		return 0;
+	return ip->i_itemp->ili_last_lsn;
+}
+
+/*
+ * All metadata updates are logged, which means that we just have to flush the
+ * log up to the latest LSN that touched the inode.
+ *
+ * If we have concurrent fsync/fdatasync() calls, we need them to all block on
+ * the log force before we clear the ili_fsync_fields field. This ensures that
+ * we don't get a racing sync operation that does not wait for the metadata to
+ * hit the journal before returning.  If we race with clearing ili_fsync_fields,
+ * then all that will happen is the log force will do nothing as the lsn will
+ * already be on disk.  We can't race with setting ili_fsync_fields because that
+ * is done under XFS_ILOCK_EXCL, and that can't happen because we hold the lock
+ * shared until after the ili_fsync_fields is cleared.
+ */
+static  int
+xfs_fsync_flush_log(
+	struct xfs_inode	*ip,
+	bool			datasync,
+	int			*log_flushed)
+{
+	int			error = 0;
+	xfs_lsn_t		lsn;
+
+	xfs_ilock(ip, XFS_ILOCK_SHARED);
+	lsn = xfs_fsync_lsn(ip, datasync);
+	if (lsn) {
+		error = xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC,
+					  log_flushed);
+
+		spin_lock(&ip->i_itemp->ili_lock);
+		ip->i_itemp->ili_fsync_fields = 0;
+		spin_unlock(&ip->i_itemp->ili_lock);
+	}
+	xfs_iunlock(ip, XFS_ILOCK_SHARED);
+	return error;
+}
+
 STATIC int
 xfs_file_fsync(
 	struct file		*file,
@@ -125,13 +173,10 @@ xfs_file_fsync(
 	loff_t			end,
 	int			datasync)
 {
-	struct inode		*inode = file->f_mapping->host;
-	struct xfs_inode	*ip = XFS_I(inode);
-	struct xfs_inode_log_item *iip = ip->i_itemp;
+	struct xfs_inode	*ip = XFS_I(file->f_mapping->host);
 	struct xfs_mount	*mp = ip->i_mount;
 	int			error = 0;
 	int			log_flushed = 0;
-	xfs_lsn_t		lsn = 0;
 
 	trace_xfs_file_fsync(ip);
 
@@ -155,33 +200,7 @@ xfs_file_fsync(
 	else if (mp->m_logdev_targp != mp->m_ddev_targp)
 		xfs_blkdev_issue_flush(mp->m_ddev_targp);
 
-	/*
-	 * All metadata updates are logged, which means that we just have to
-	 * flush the log up to the latest LSN that touched the inode. If we have
-	 * concurrent fsync/fdatasync() calls, we need them to all block on the
-	 * log force before we clear the ili_fsync_fields field. This ensures
-	 * that we don't get a racing sync operation that does not wait for the
-	 * metadata to hit the journal before returning. If we race with
-	 * clearing the ili_fsync_fields, then all that will happen is the log
-	 * force will do nothing as the lsn will already be on disk. We can't
-	 * race with setting ili_fsync_fields because that is done under
-	 * XFS_ILOCK_EXCL, and that can't happen because we hold the lock shared
-	 * until after the ili_fsync_fields is cleared.
-	 */
-	xfs_ilock(ip, XFS_ILOCK_SHARED);
-	if (xfs_ipincount(ip)) {
-		if (!datasync ||
-		    (iip->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
-			lsn = iip->ili_last_lsn;
-	}
-
-	if (lsn) {
-		error = xfs_log_force_lsn(mp, lsn, XFS_LOG_SYNC, &log_flushed);
-		spin_lock(&iip->ili_lock);
-		iip->ili_fsync_fields = 0;
-		spin_unlock(&iip->ili_lock);
-	}
-	xfs_iunlock(ip, XFS_ILOCK_SHARED);
+	error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
 
 	/*
 	 * If we only have a single device, and the log force about was
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 07/11] xfs: xfs_log_force_lsn isn't passed a LSN
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (5 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 06/11] xfs: refactor xfs_file_fsync Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-22 16:45   ` Darrick J. Wong
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 08/11] xfs: prevent UAF in xfs_log_item_in_current_chkpt Amir Goldstein
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner,
	Brian Foster, Allison Henderson

From: Dave Chinner <dchinner@redhat.com>

commit 5f9b4b0de8dc2fb8eb655463b438001c111570fe upstream.

[backported from CIL scalability series for dependency]

In doing an investigation into AIL push stalls, I was looking at the
log force code to see if an async CIL push could be done instead.
This lead me to xfs_log_force_lsn() and looking at how it works.

xfs_log_force_lsn() is only called from inode synchronisation
contexts such as fsync(), and it takes the ip->i_itemp->ili_last_lsn
value as the LSN to sync the log to. This gets passed to
xlog_cil_force_lsn() via xfs_log_force_lsn() to flush the CIL to the
journal, and then used by xfs_log_force_lsn() to flush the iclogs to
the journal.

The problem is that ip->i_itemp->ili_last_lsn does not store a
log sequence number. What it stores is passed to it from the
->iop_committing method, which is called by xfs_log_commit_cil().
The value this passes to the iop_committing method is the CIL
context sequence number that the item was committed to.

As it turns out, xlog_cil_force_lsn() converts the sequence to an
actual commit LSN for the related context and returns that to
xfs_log_force_lsn(). xfs_log_force_lsn() overwrites it's "lsn"
variable that contained a sequence with an actual LSN and then uses
that to sync the iclogs.

This caused me some confusion for a while, even though I originally
wrote all this code a decade ago. ->iop_committing is only used by
a couple of log item types, and only inode items use the sequence
number it is passed.

Let's clean up the API, CIL structures and inode log item to call it
a sequence number, and make it clear that the high level code is
using CIL sequence numbers and not on-disk LSNs for integrity
synchronisation purposes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/libxfs/xfs_types.h |  1 +
 fs/xfs/xfs_buf_item.c     |  2 +-
 fs/xfs/xfs_dquot_item.c   |  2 +-
 fs/xfs/xfs_file.c         | 14 +++++++-------
 fs/xfs/xfs_inode.c        | 10 +++++-----
 fs/xfs/xfs_inode_item.c   |  4 ++--
 fs/xfs/xfs_inode_item.h   |  2 +-
 fs/xfs/xfs_log.c          | 27 ++++++++++++++-------------
 fs/xfs/xfs_log.h          |  4 +---
 fs/xfs/xfs_log_cil.c      | 30 +++++++++++-------------------
 fs/xfs/xfs_log_priv.h     | 15 +++++++--------
 fs/xfs/xfs_trans.c        |  6 +++---
 fs/xfs/xfs_trans.h        |  4 ++--
 13 files changed, 56 insertions(+), 65 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
index 397d94775440..1ce06173c2f5 100644
--- a/fs/xfs/libxfs/xfs_types.h
+++ b/fs/xfs/libxfs/xfs_types.h
@@ -21,6 +21,7 @@ typedef int32_t		xfs_suminfo_t;	/* type of bitmap summary info */
 typedef uint32_t	xfs_rtword_t;	/* word type for bitmap manipulations */
 
 typedef int64_t		xfs_lsn_t;	/* log sequence number */
+typedef int64_t		xfs_csn_t;	/* CIL sequence number */
 
 typedef uint32_t	xfs_dablk_t;	/* dir/attr block number (in file) */
 typedef uint32_t	xfs_dahash_t;	/* dir/attr hash value */
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 8c6e26d62ef2..5d6535370f87 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -632,7 +632,7 @@ xfs_buf_item_release(
 STATIC void
 xfs_buf_item_committing(
 	struct xfs_log_item	*lip,
-	xfs_lsn_t		commit_lsn)
+	xfs_csn_t		seq)
 {
 	return xfs_buf_item_release(lip);
 }
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index 8c1fdf37ee8f..8ed47b739b6c 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -188,7 +188,7 @@ xfs_qm_dquot_logitem_release(
 STATIC void
 xfs_qm_dquot_logitem_committing(
 	struct xfs_log_item	*lip,
-	xfs_lsn_t		commit_lsn)
+	xfs_csn_t		seq)
 {
 	return xfs_qm_dquot_logitem_release(lip);
 }
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 414d856e2e75..4d6bf8d4974f 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -118,8 +118,8 @@ xfs_dir_fsync(
 	return xfs_log_force_inode(ip);
 }
 
-static xfs_lsn_t
-xfs_fsync_lsn(
+static xfs_csn_t
+xfs_fsync_seq(
 	struct xfs_inode	*ip,
 	bool			datasync)
 {
@@ -127,7 +127,7 @@ xfs_fsync_lsn(
 		return 0;
 	if (datasync && !(ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
 		return 0;
-	return ip->i_itemp->ili_last_lsn;
+	return ip->i_itemp->ili_commit_seq;
 }
 
 /*
@@ -150,12 +150,12 @@ xfs_fsync_flush_log(
 	int			*log_flushed)
 {
 	int			error = 0;
-	xfs_lsn_t		lsn;
+	xfs_csn_t		seq;
 
 	xfs_ilock(ip, XFS_ILOCK_SHARED);
-	lsn = xfs_fsync_lsn(ip, datasync);
-	if (lsn) {
-		error = xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC,
+	seq = xfs_fsync_seq(ip, datasync);
+	if (seq) {
+		error = xfs_log_force_seq(ip->i_mount, seq, XFS_LOG_SYNC,
 					  log_flushed);
 
 		spin_lock(&ip->i_itemp->ili_lock);
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index e958b1c74561..1ae669d12301 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2754,7 +2754,7 @@ xfs_iunpin(
 	trace_xfs_inode_unpin_nowait(ip, _RET_IP_);
 
 	/* Give the log a push to start the unpinning I/O */
-	xfs_log_force_lsn(ip->i_mount, ip->i_itemp->ili_last_lsn, 0, NULL);
+	xfs_log_force_seq(ip->i_mount, ip->i_itemp->ili_commit_seq, 0, NULL);
 
 }
 
@@ -3717,16 +3717,16 @@ int
 xfs_log_force_inode(
 	struct xfs_inode	*ip)
 {
-	xfs_lsn_t		lsn = 0;
+	xfs_csn_t		seq = 0;
 
 	xfs_ilock(ip, XFS_ILOCK_SHARED);
 	if (xfs_ipincount(ip))
-		lsn = ip->i_itemp->ili_last_lsn;
+		seq = ip->i_itemp->ili_commit_seq;
 	xfs_iunlock(ip, XFS_ILOCK_SHARED);
 
-	if (!lsn)
+	if (!seq)
 		return 0;
-	return xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC, NULL);
+	return xfs_log_force_seq(ip->i_mount, seq, XFS_LOG_SYNC, NULL);
 }
 
 /*
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 6ff91e5bf3cd..3aba4559469f 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -617,9 +617,9 @@ xfs_inode_item_committed(
 STATIC void
 xfs_inode_item_committing(
 	struct xfs_log_item	*lip,
-	xfs_lsn_t		commit_lsn)
+	xfs_csn_t		seq)
 {
-	INODE_ITEM(lip)->ili_last_lsn = commit_lsn;
+	INODE_ITEM(lip)->ili_commit_seq = seq;
 	return xfs_inode_item_release(lip);
 }
 
diff --git a/fs/xfs/xfs_inode_item.h b/fs/xfs/xfs_inode_item.h
index 4b926e32831c..403b45ab9aa2 100644
--- a/fs/xfs/xfs_inode_item.h
+++ b/fs/xfs/xfs_inode_item.h
@@ -33,7 +33,7 @@ struct xfs_inode_log_item {
 	unsigned int		ili_fields;	   /* fields to be logged */
 	unsigned int		ili_fsync_fields;  /* logged since last fsync */
 	xfs_lsn_t		ili_flush_lsn;	   /* lsn at last flush */
-	xfs_lsn_t		ili_last_lsn;	   /* lsn at last transaction */
+	xfs_csn_t		ili_commit_seq;	   /* last transaction commit */
 };
 
 static inline int xfs_inode_clean(struct xfs_inode *ip)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index b445e63cbc3c..05791456adbb 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -3210,14 +3210,13 @@ xfs_log_force(
 }
 
 static int
-__xfs_log_force_lsn(
-	struct xfs_mount	*mp,
+xlog_force_lsn(
+	struct xlog		*log,
 	xfs_lsn_t		lsn,
 	uint			flags,
 	int			*log_flushed,
 	bool			already_slept)
 {
-	struct xlog		*log = mp->m_log;
 	struct xlog_in_core	*iclog;
 
 	spin_lock(&log->l_icloglock);
@@ -3250,8 +3249,6 @@ __xfs_log_force_lsn(
 		if (!already_slept &&
 		    (iclog->ic_prev->ic_state == XLOG_STATE_WANT_SYNC ||
 		     iclog->ic_prev->ic_state == XLOG_STATE_SYNCING)) {
-			XFS_STATS_INC(mp, xs_log_force_sleep);
-
 			xlog_wait(&iclog->ic_prev->ic_write_wait,
 					&log->l_icloglock);
 			return -EAGAIN;
@@ -3289,25 +3286,29 @@ __xfs_log_force_lsn(
  * to disk, that thread will wake up all threads waiting on the queue.
  */
 int
-xfs_log_force_lsn(
+xfs_log_force_seq(
 	struct xfs_mount	*mp,
-	xfs_lsn_t		lsn,
+	xfs_csn_t		seq,
 	uint			flags,
 	int			*log_flushed)
 {
+	struct xlog		*log = mp->m_log;
+	xfs_lsn_t		lsn;
 	int			ret;
-	ASSERT(lsn != 0);
+	ASSERT(seq != 0);
 
 	XFS_STATS_INC(mp, xs_log_force);
-	trace_xfs_log_force(mp, lsn, _RET_IP_);
+	trace_xfs_log_force(mp, seq, _RET_IP_);
 
-	lsn = xlog_cil_force_lsn(mp->m_log, lsn);
+	lsn = xlog_cil_force_seq(log, seq);
 	if (lsn == NULLCOMMITLSN)
 		return 0;
 
-	ret = __xfs_log_force_lsn(mp, lsn, flags, log_flushed, false);
-	if (ret == -EAGAIN)
-		ret = __xfs_log_force_lsn(mp, lsn, flags, log_flushed, true);
+	ret = xlog_force_lsn(log, lsn, flags, log_flushed, false);
+	if (ret == -EAGAIN) {
+		XFS_STATS_INC(mp, xs_log_force_sleep);
+		ret = xlog_force_lsn(log, lsn, flags, log_flushed, true);
+	}
 	return ret;
 }
 
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index 98c913da7587..a1089f8b7169 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -106,7 +106,7 @@ struct xfs_item_ops;
 struct xfs_trans;
 
 int	  xfs_log_force(struct xfs_mount *mp, uint flags);
-int	  xfs_log_force_lsn(struct xfs_mount *mp, xfs_lsn_t lsn, uint flags,
+int	  xfs_log_force_seq(struct xfs_mount *mp, xfs_csn_t seq, uint flags,
 		int *log_forced);
 int	  xfs_log_mount(struct xfs_mount	*mp,
 			struct xfs_buftarg	*log_target,
@@ -132,8 +132,6 @@ bool	xfs_log_writable(struct xfs_mount *mp);
 struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
 void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
 
-void	xfs_log_commit_cil(struct xfs_mount *mp, struct xfs_trans *tp,
-				xfs_lsn_t *commit_lsn, bool regrant);
 void	xlog_cil_process_committed(struct list_head *list);
 bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
 
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index cd5c04dabe2e..88730883bb70 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -777,7 +777,7 @@ xlog_cil_push_work(
 	 * that higher sequences will wait for us to write out a commit record
 	 * before they do.
 	 *
-	 * xfs_log_force_lsn requires us to mirror the new sequence into the cil
+	 * xfs_log_force_seq requires us to mirror the new sequence into the cil
 	 * structure atomically with the addition of this sequence to the
 	 * committing list. This also ensures that we can do unlocked checks
 	 * against the current sequence in log forces without risking
@@ -1020,16 +1020,14 @@ xlog_cil_empty(
  * allowed again.
  */
 void
-xfs_log_commit_cil(
-	struct xfs_mount	*mp,
+xlog_cil_commit(
+	struct xlog		*log,
 	struct xfs_trans	*tp,
-	xfs_lsn_t		*commit_lsn,
+	xfs_csn_t		*commit_seq,
 	bool			regrant)
 {
-	struct xlog		*log = mp->m_log;
 	struct xfs_cil		*cil = log->l_cilp;
 	struct xfs_log_item	*lip, *next;
-	xfs_lsn_t		xc_commit_lsn;
 
 	/*
 	 * Do all necessary memory allocation before we lock the CIL.
@@ -1043,10 +1041,6 @@ xfs_log_commit_cil(
 
 	xlog_cil_insert_items(log, tp);
 
-	xc_commit_lsn = cil->xc_ctx->sequence;
-	if (commit_lsn)
-		*commit_lsn = xc_commit_lsn;
-
 	if (regrant && !XLOG_FORCED_SHUTDOWN(log))
 		xfs_log_ticket_regrant(log, tp->t_ticket);
 	else
@@ -1069,8 +1063,10 @@ xfs_log_commit_cil(
 	list_for_each_entry_safe(lip, next, &tp->t_items, li_trans) {
 		xfs_trans_del_item(lip);
 		if (lip->li_ops->iop_committing)
-			lip->li_ops->iop_committing(lip, xc_commit_lsn);
+			lip->li_ops->iop_committing(lip, cil->xc_ctx->sequence);
 	}
+	if (commit_seq)
+		*commit_seq = cil->xc_ctx->sequence;
 
 	/* xlog_cil_push_background() releases cil->xc_ctx_lock */
 	xlog_cil_push_background(log);
@@ -1087,9 +1083,9 @@ xfs_log_commit_cil(
  * iclog flush is necessary following this call.
  */
 xfs_lsn_t
-xlog_cil_force_lsn(
+xlog_cil_force_seq(
 	struct xlog	*log,
-	xfs_lsn_t	sequence)
+	xfs_csn_t	sequence)
 {
 	struct xfs_cil		*cil = log->l_cilp;
 	struct xfs_cil_ctx	*ctx;
@@ -1185,21 +1181,17 @@ bool
 xfs_log_item_in_current_chkpt(
 	struct xfs_log_item *lip)
 {
-	struct xfs_cil_ctx *ctx;
+	struct xfs_cil_ctx *ctx = lip->li_mountp->m_log->l_cilp->xc_ctx;
 
 	if (list_empty(&lip->li_cil))
 		return false;
 
-	ctx = lip->li_mountp->m_log->l_cilp->xc_ctx;
-
 	/*
 	 * li_seq is written on the first commit of a log item to record the
 	 * first checkpoint it is written to. Hence if it is different to the
 	 * current sequence, we're in a new checkpoint.
 	 */
-	if (XFS_LSN_CMP(lip->li_seq, ctx->sequence) != 0)
-		return false;
-	return true;
+	return lip->li_seq == ctx->sequence;
 }
 
 /*
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 1c6fdbf3d506..42cd1602ac25 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -230,7 +230,7 @@ struct xfs_cil;
 
 struct xfs_cil_ctx {
 	struct xfs_cil		*cil;
-	xfs_lsn_t		sequence;	/* chkpt sequence # */
+	xfs_csn_t		sequence;	/* chkpt sequence # */
 	xfs_lsn_t		start_lsn;	/* first LSN of chkpt commit */
 	xfs_lsn_t		commit_lsn;	/* chkpt commit record lsn */
 	struct xlog_ticket	*ticket;	/* chkpt ticket */
@@ -268,10 +268,10 @@ struct xfs_cil {
 	struct xfs_cil_ctx	*xc_ctx;
 
 	spinlock_t		xc_push_lock ____cacheline_aligned_in_smp;
-	xfs_lsn_t		xc_push_seq;
+	xfs_csn_t		xc_push_seq;
 	struct list_head	xc_committing;
 	wait_queue_head_t	xc_commit_wait;
-	xfs_lsn_t		xc_current_sequence;
+	xfs_csn_t		xc_current_sequence;
 	struct work_struct	xc_push_work;
 	wait_queue_head_t	xc_push_wait;	/* background push throttle */
 } ____cacheline_aligned_in_smp;
@@ -547,19 +547,18 @@ int	xlog_cil_init(struct xlog *log);
 void	xlog_cil_init_post_recovery(struct xlog *log);
 void	xlog_cil_destroy(struct xlog *log);
 bool	xlog_cil_empty(struct xlog *log);
+void	xlog_cil_commit(struct xlog *log, struct xfs_trans *tp,
+			xfs_csn_t *commit_seq, bool regrant);
 
 /*
  * CIL force routines
  */
-xfs_lsn_t
-xlog_cil_force_lsn(
-	struct xlog *log,
-	xfs_lsn_t sequence);
+xfs_lsn_t xlog_cil_force_seq(struct xlog *log, xfs_csn_t sequence);
 
 static inline void
 xlog_cil_force(struct xlog *log)
 {
-	xlog_cil_force_lsn(log, log->l_cilp->xc_current_sequence);
+	xlog_cil_force_seq(log, log->l_cilp->xc_current_sequence);
 }
 
 /*
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 36166bae24a6..73a1de7ceefc 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -832,7 +832,7 @@ __xfs_trans_commit(
 	bool			regrant)
 {
 	struct xfs_mount	*mp = tp->t_mountp;
-	xfs_lsn_t		commit_lsn = -1;
+	xfs_csn_t		commit_seq = 0;
 	int			error = 0;
 	int			sync = tp->t_flags & XFS_TRANS_SYNC;
 
@@ -874,7 +874,7 @@ __xfs_trans_commit(
 		xfs_trans_apply_sb_deltas(tp);
 	xfs_trans_apply_dquot_deltas(tp);
 
-	xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
+	xlog_cil_commit(mp->m_log, tp, &commit_seq, regrant);
 
 	xfs_trans_free(tp);
 
@@ -883,7 +883,7 @@ __xfs_trans_commit(
 	 * log out now and wait for it.
 	 */
 	if (sync) {
-		error = xfs_log_force_lsn(mp, commit_lsn, XFS_LOG_SYNC, NULL);
+		error = xfs_log_force_seq(mp, commit_seq, XFS_LOG_SYNC, NULL);
 		XFS_STATS_INC(mp, xs_trans_sync);
 	} else {
 		XFS_STATS_INC(mp, xs_trans_async);
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 075eeade4f7d..97485559008b 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -43,7 +43,7 @@ struct xfs_log_item {
 	struct list_head		li_cil;		/* CIL pointers */
 	struct xfs_log_vec		*li_lv;		/* active log vector */
 	struct xfs_log_vec		*li_lv_shadow;	/* standby vector */
-	xfs_lsn_t			li_seq;		/* CIL commit seq */
+	xfs_csn_t			li_seq;		/* CIL commit seq */
 };
 
 /*
@@ -69,7 +69,7 @@ struct xfs_item_ops {
 	void (*iop_pin)(struct xfs_log_item *);
 	void (*iop_unpin)(struct xfs_log_item *, int remove);
 	uint (*iop_push)(struct xfs_log_item *, struct list_head *);
-	void (*iop_committing)(struct xfs_log_item *, xfs_lsn_t commit_lsn);
+	void (*iop_committing)(struct xfs_log_item *lip, xfs_csn_t seq);
 	void (*iop_release)(struct xfs_log_item *);
 	xfs_lsn_t (*iop_committed)(struct xfs_log_item *, xfs_lsn_t);
 	int (*iop_recover)(struct xfs_log_item *lip,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 08/11] xfs: prevent UAF in xfs_log_item_in_current_chkpt
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (6 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 07/11] xfs: xfs_log_force_lsn isn't passed a LSN Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount Amir Goldstein
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner

From: "Darrick J. Wong" <djwong@kernel.org>

commit f8d92a66e810acbef6ddbc0bd0cbd9b117ce8acd upstream.

While I was running with KASAN and lockdep enabled, I stumbled upon an
KASAN report about a UAF to a freed CIL checkpoint.  Looking at the
comment for xfs_log_item_in_current_chkpt, it seems pretty obvious to me
that the original patch to xfs_defer_finish_noroll should have done
something to lock the CIL to prevent it from switching the CIL contexts
while the predicate runs.

For upper level code that needs to know if a given log item is new
enough not to need relogging, add a new wrapper that takes the CIL
context lock long enough to sample the current CIL context.  This is
kind of racy in that the CIL can switch the contexts immediately after
sampling, but that's ok because the consequence is that the defer ops
code is a little slow to relog items.

 ==================================================================
 BUG: KASAN: use-after-free in xfs_log_item_in_current_chkpt+0x139/0x160 [xfs]
 Read of size 8 at addr ffff88804ea5f608 by task fsstress/527999

 CPU: 1 PID: 527999 Comm: fsstress Tainted: G      D      5.16.0-rc4-xfsx #rc4
 Call Trace:
  <TASK>
  dump_stack_lvl+0x45/0x59
  print_address_description.constprop.0+0x1f/0x140
  kasan_report.cold+0x83/0xdf
  xfs_log_item_in_current_chkpt+0x139/0x160
  xfs_defer_finish_noroll+0x3bb/0x1e30
  __xfs_trans_commit+0x6c8/0xcf0
  xfs_reflink_remap_extent+0x66f/0x10e0
  xfs_reflink_remap_blocks+0x2dd/0xa90
  xfs_file_remap_range+0x27b/0xc30
  vfs_dedupe_file_range_one+0x368/0x420
  vfs_dedupe_file_range+0x37c/0x5d0
  do_vfs_ioctl+0x308/0x1260
  __x64_sys_ioctl+0xa1/0x170
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f2c71a2950b
 Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff
ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
 RSP: 002b:00007ffe8c0e03c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 00005600862a8740 RCX: 00007f2c71a2950b
 RDX: 00005600862a7be0 RSI: 00000000c0189436 RDI: 0000000000000004
 RBP: 000000000000000b R08: 0000000000000027 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000005a
 R13: 00005600862804a8 R14: 0000000000016000 R15: 00005600862a8a20
  </TASK>

 Allocated by task 464064:
  kasan_save_stack+0x1e/0x50
  __kasan_kmalloc+0x81/0xa0
  kmem_alloc+0xcd/0x2c0 [xfs]
  xlog_cil_ctx_alloc+0x17/0x1e0 [xfs]
  xlog_cil_push_work+0x141/0x13d0 [xfs]
  process_one_work+0x7f6/0x1380
  worker_thread+0x59d/0x1040
  kthread+0x3b0/0x490
  ret_from_fork+0x1f/0x30

 Freed by task 51:
  kasan_save_stack+0x1e/0x50
  kasan_set_track+0x21/0x30
  kasan_set_free_info+0x20/0x30
  __kasan_slab_free+0xed/0x130
  slab_free_freelist_hook+0x7f/0x160
  kfree+0xde/0x340
  xlog_cil_committed+0xbfd/0xfe0 [xfs]
  xlog_cil_process_committed+0x103/0x1c0 [xfs]
  xlog_state_do_callback+0x45d/0xbd0 [xfs]
  xlog_ioend_work+0x116/0x1c0 [xfs]
  process_one_work+0x7f6/0x1380
  worker_thread+0x59d/0x1040
  kthread+0x3b0/0x490
  ret_from_fork+0x1f/0x30

 Last potentially related work creation:
  kasan_save_stack+0x1e/0x50
  __kasan_record_aux_stack+0xb7/0xc0
  insert_work+0x48/0x2e0
  __queue_work+0x4e7/0xda0
  queue_work_on+0x69/0x80
  xlog_cil_push_now.isra.0+0x16b/0x210 [xfs]
  xlog_cil_force_seq+0x1b7/0x850 [xfs]
  xfs_log_force_seq+0x1c7/0x670 [xfs]
  xfs_file_fsync+0x7c1/0xa60 [xfs]
  __x64_sys_fsync+0x52/0x80
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

 The buggy address belongs to the object at ffff88804ea5f600
  which belongs to the cache kmalloc-256 of size 256
 The buggy address is located 8 bytes inside of
  256-byte region [ffff88804ea5f600, ffff88804ea5f700)
 The buggy address belongs to the page:
 page:ffffea00013a9780 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88804ea5ea00 pfn:0x4ea5e
 head:ffffea00013a9780 order:1 compound_mapcount:0
 flags: 0x4fff80000010200(slab|head|node=1|zone=1|lastcpupid=0xfff)
 raw: 04fff80000010200 ffffea0001245908 ffffea00011bd388 ffff888004c42b40
 raw: ffff88804ea5ea00 0000000000100009 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff88804ea5f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff88804ea5f580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 >ffff88804ea5f600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                       ^
  ffff88804ea5f680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff88804ea5f700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ==================================================================

Fixes: 4e919af7827a ("xfs: periodically relog deferred intent items")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_log_cil.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 88730883bb70..fbe160d5e9b9 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -1179,9 +1179,9 @@ xlog_cil_force_seq(
  */
 bool
 xfs_log_item_in_current_chkpt(
-	struct xfs_log_item *lip)
+	struct xfs_log_item	*lip)
 {
-	struct xfs_cil_ctx *ctx = lip->li_mountp->m_log->l_cilp->xc_ctx;
+	struct xfs_cil		*cil = lip->li_mountp->m_log->l_cilp;
 
 	if (list_empty(&lip->li_cil))
 		return false;
@@ -1191,7 +1191,7 @@ xfs_log_item_in_current_chkpt(
 	 * first checkpoint it is written to. Hence if it is different to the
 	 * current sequence, we're in a new checkpoint.
 	 */
-	return lip->li_seq == ctx->sequence;
+	return lip->li_seq == READ_ONCE(cil->xc_current_sequence);
 }
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (7 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 08/11] xfs: prevent UAF in xfs_log_item_in_current_chkpt Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-22 16:38   ` Darrick J. Wong
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 10/11] xfs: fix up non-directory creation in SGID directories Amir Goldstein
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner

From: "Darrick J. Wong" <djwong@kernel.org>

commit b97cca3ba9098522e5a1c3388764ead42640c1a5 upstream.

In commit 02b9984d6408, we pushed a sync_filesystem() call from the VFS
into xfs_fs_remount.  The only time that we ever need to push dirty file
data or metadata to disk for a remount is if we're remounting the
filesystem read only, so this really could be moved to xfs_remount_ro.

Once we've moved the call site, actually check the return value from
sync_filesystem.

Fixes: 02b9984d6408 ("fs: push sync_filesystem() down to the file system's remount_fs()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_super.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 6323974d6b3e..dd0439ae6732 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1716,6 +1716,11 @@ xfs_remount_ro(
 	};
 	int			error;
 
+	/* Flush all the dirty data to disk. */
+	error = sync_filesystem(mp->m_super);
+	if (error)
+		return error;
+
 	/*
 	 * Cancel background eofb scanning so it cannot race with the final
 	 * log force+buftarg wait and deadlock the remount.
@@ -1786,8 +1791,6 @@ xfs_fc_reconfigure(
 	if (error)
 		return error;
 
-	sync_filesystem(mp->m_super);
-
 	/* inode32 -> inode64 */
 	if ((mp->m_flags & XFS_MOUNT_SMALL_INUMS) &&
 	    !(new_mp->m_flags & XFS_MOUNT_SMALL_INUMS)) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 10/11] xfs: fix up non-directory creation in SGID directories
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (8 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes Amir Goldstein
  2022-06-22 23:45 ` [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Darrick J. Wong
  11 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests

From: Christoph Hellwig <hch@lst.de>

commit 01ea173e103edd5ec41acec65b9261b87e123fc2 upstream.

XFS always inherits the SGID bit if it is set on the parent inode, while
the generic inode_init_owner does not do this in a few cases where it can
create a possible security problem, see commit 0fa3ecd87848
("Fix up non-directory creation in SGID directories") for details.

Switch XFS to use the generic helper for the normal path to fix this,
just keeping the simple field inheritance open coded for the case of the
non-sgid case with the bsdgrpid mount option.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_inode.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 1ae669d12301..ae86fbdc1bab 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -802,6 +802,7 @@ xfs_ialloc(
 	xfs_buf_t	**ialloc_context,
 	xfs_inode_t	**ipp)
 {
+	struct inode	*dir = pip ? VFS_I(pip) : NULL;
 	struct xfs_mount *mp = tp->t_mountp;
 	xfs_ino_t	ino;
 	xfs_inode_t	*ip;
@@ -847,18 +848,17 @@ xfs_ialloc(
 		return error;
 	ASSERT(ip != NULL);
 	inode = VFS_I(ip);
-	inode->i_mode = mode;
 	set_nlink(inode, nlink);
-	inode->i_uid = current_fsuid();
 	inode->i_rdev = rdev;
 	ip->i_d.di_projid = prid;
 
-	if (pip && XFS_INHERIT_GID(pip)) {
-		inode->i_gid = VFS_I(pip)->i_gid;
-		if ((VFS_I(pip)->i_mode & S_ISGID) && S_ISDIR(mode))
-			inode->i_mode |= S_ISGID;
+	if (dir && !(dir->i_mode & S_ISGID) &&
+	    (mp->m_flags & XFS_MOUNT_GRPID)) {
+		inode->i_uid = current_fsuid();
+		inode->i_gid = dir->i_gid;
+		inode->i_mode = mode;
 	} else {
-		inode->i_gid = current_fsgid();
+		inode_init_owner(inode, dir, mode);
 	}
 
 	/*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (9 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 10/11] xfs: fix up non-directory creation in SGID directories Amir Goldstein
@ 2022-06-17 10:06 ` Amir Goldstein
  2022-06-22 16:41   ` Darrick J. Wong
  2022-06-22 23:45 ` [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Darrick J. Wong
  11 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-06-17 10:06 UTC (permalink / raw)
  To: Darrick J . Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner,
	Christian Brauner

From: "Darrick J. Wong" <djwong@kernel.org>

commit e014f37db1a2d109afa750042ac4d69cf3e3d88e upstream.

[remove userns argument of setattr_copy() for backport]

Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid
revocation isn't consistent with btrfs[1] or ext4.  Those two
filesystems use the VFS function setattr_copy to convey certain
attributes from struct iattr into the VFS inode structure.

Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to
decide if it should clear setgid and setuid on a file attribute update.
This is a second symptom of the problem that Filipe noticed.

XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode,
xfs_setattr_nonsize, and xfs_setattr_time.  Regrettably, setattr_copy is
/not/ a simple copy function; it contains additional logic to clear the
setgid bit when setting the mode, and XFS' version no longer matches.

The VFS implements its own setuid/setgid stripping logic, which
establishes consistent behavior.  It's a tad unfortunate that it's
scattered across notify_change, should_remove_suid, and setattr_copy but
XFS should really follow the Linux VFS.  Adapt XFS to use the VFS
functions and get rid of the old functions.

[1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y2KCV25Q@mail.gmail.com/
[2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@virtuozzo.com/

Fixes: 7fa294c8991c ("userns: Allow chown and setgid preservation")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_iops.c | 56 +++--------------------------------------------
 fs/xfs/xfs_pnfs.c |  3 ++-
 2 files changed, 5 insertions(+), 54 deletions(-)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index b7f7b31a77d5..5711c8c12625 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -595,37 +595,6 @@ xfs_vn_getattr(
 	return 0;
 }
 
-static void
-xfs_setattr_mode(
-	struct xfs_inode	*ip,
-	struct iattr		*iattr)
-{
-	struct inode		*inode = VFS_I(ip);
-	umode_t			mode = iattr->ia_mode;
-
-	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
-
-	inode->i_mode &= S_IFMT;
-	inode->i_mode |= mode & ~S_IFMT;
-}
-
-void
-xfs_setattr_time(
-	struct xfs_inode	*ip,
-	struct iattr		*iattr)
-{
-	struct inode		*inode = VFS_I(ip);
-
-	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
-
-	if (iattr->ia_valid & ATTR_ATIME)
-		inode->i_atime = iattr->ia_atime;
-	if (iattr->ia_valid & ATTR_CTIME)
-		inode->i_ctime = iattr->ia_ctime;
-	if (iattr->ia_valid & ATTR_MTIME)
-		inode->i_mtime = iattr->ia_mtime;
-}
-
 static int
 xfs_vn_change_ok(
 	struct dentry	*dentry,
@@ -740,16 +709,6 @@ xfs_setattr_nonsize(
 				goto out_cancel;
 		}
 
-		/*
-		 * CAP_FSETID overrides the following restrictions:
-		 *
-		 * The set-user-ID and set-group-ID bits of a file will be
-		 * cleared upon successful return from chown()
-		 */
-		if ((inode->i_mode & (S_ISUID|S_ISGID)) &&
-		    !capable(CAP_FSETID))
-			inode->i_mode &= ~(S_ISUID|S_ISGID);
-
 		/*
 		 * Change the ownerships and register quota modifications
 		 * in the transaction.
@@ -761,7 +720,6 @@ xfs_setattr_nonsize(
 				olddquot1 = xfs_qm_vop_chown(tp, ip,
 							&ip->i_udquot, udqp);
 			}
-			inode->i_uid = uid;
 		}
 		if (!gid_eq(igid, gid)) {
 			if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_GQUOTA_ON(mp)) {
@@ -772,15 +730,10 @@ xfs_setattr_nonsize(
 				olddquot2 = xfs_qm_vop_chown(tp, ip,
 							&ip->i_gdquot, gdqp);
 			}
-			inode->i_gid = gid;
 		}
 	}
 
-	if (mask & ATTR_MODE)
-		xfs_setattr_mode(ip, iattr);
-	if (mask & (ATTR_ATIME|ATTR_CTIME|ATTR_MTIME))
-		xfs_setattr_time(ip, iattr);
-
+	setattr_copy(inode, iattr);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
 	XFS_STATS_INC(mp, xs_ig_attrchg);
@@ -1025,11 +978,8 @@ xfs_setattr_size(
 		xfs_inode_clear_eofblocks_tag(ip);
 	}
 
-	if (iattr->ia_valid & ATTR_MODE)
-		xfs_setattr_mode(ip, iattr);
-	if (iattr->ia_valid & (ATTR_ATIME|ATTR_CTIME|ATTR_MTIME))
-		xfs_setattr_time(ip, iattr);
-
+	ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID)));
+	setattr_copy(inode, iattr);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
 	XFS_STATS_INC(mp, xs_ig_attrchg);
diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c
index f3082a957d5e..ae61094bc9d1 100644
--- a/fs/xfs/xfs_pnfs.c
+++ b/fs/xfs/xfs_pnfs.c
@@ -283,7 +283,8 @@ xfs_fs_commit_blocks(
 	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
-	xfs_setattr_time(ip, iattr);
+	ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID)));
+	setattr_copy(inode, iattr);
 	if (update_isize) {
 		i_size_write(inode, iattr->ia_size);
 		ip->i_d.di_size = iattr->ia_size;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 03/11] xfs: Fix the free logic of state in xfs_attr_node_hasname
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 03/11] xfs: Fix the free logic of state in xfs_attr_node_hasname Amir Goldstein
@ 2022-06-22 16:32   ` Darrick J. Wong
  2022-06-22 18:46     ` Amir Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2022-06-22 16:32 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Yang Xu

On Fri, Jun 17, 2022 at 01:06:33PM +0300, Amir Goldstein wrote:
> From: Yang Xu <xuyang2018.jy@fujitsu.com>
> 
> commit a1de97fe296c52eafc6590a3506f4bbd44ecb19a upstream.
> 
> When testing xfstests xfs/126 on lastest upstream kernel, it will hang on some machine.
> Adding a getxattr operation after xattr corrupted, I can reproduce it 100%.
> 
> The deadlock as below:
> [983.923403] task:setfattr        state:D stack:    0 pid:17639 ppid: 14687 flags:0x00000080
> [  983.923405] Call Trace:
> [  983.923410]  __schedule+0x2c4/0x700
> [  983.923412]  schedule+0x37/0xa0
> [  983.923414]  schedule_timeout+0x274/0x300
> [  983.923416]  __down+0x9b/0xf0
> [  983.923451]  ? xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
> [  983.923453]  down+0x3b/0x50
> [  983.923471]  xfs_buf_lock+0x33/0xf0 [xfs]
> [  983.923490]  xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
> [  983.923508]  xfs_buf_get_map+0x4c/0x320 [xfs]
> [  983.923525]  xfs_buf_read_map+0x53/0x310 [xfs]
> [  983.923541]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
> [  983.923560]  xfs_trans_read_buf_map+0x1cf/0x360 [xfs]
> [  983.923575]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
> [  983.923590]  xfs_da_read_buf+0xcf/0x120 [xfs]
> [  983.923606]  xfs_da3_node_read+0x1f/0x40 [xfs]
> [  983.923621]  xfs_da3_node_lookup_int+0x69/0x4a0 [xfs]
> [  983.923624]  ? kmem_cache_alloc+0x12e/0x270
> [  983.923637]  xfs_attr_node_hasname+0x6e/0xa0 [xfs]
> [  983.923651]  xfs_has_attr+0x6e/0xd0 [xfs]
> [  983.923664]  xfs_attr_set+0x273/0x320 [xfs]
> [  983.923683]  xfs_xattr_set+0x87/0xd0 [xfs]
> [  983.923686]  __vfs_removexattr+0x4d/0x60
> [  983.923688]  __vfs_removexattr_locked+0xac/0x130
> [  983.923689]  vfs_removexattr+0x4e/0xf0
> [  983.923690]  removexattr+0x4d/0x80
> [  983.923693]  ? __check_object_size+0xa8/0x16b
> [  983.923695]  ? strncpy_from_user+0x47/0x1a0
> [  983.923696]  ? getname_flags+0x6a/0x1e0
> [  983.923697]  ? _cond_resched+0x15/0x30
> [  983.923699]  ? __sb_start_write+0x1e/0x70
> [  983.923700]  ? mnt_want_write+0x28/0x50
> [  983.923701]  path_removexattr+0x9b/0xb0
> [  983.923702]  __x64_sys_removexattr+0x17/0x20
> [  983.923704]  do_syscall_64+0x5b/0x1a0
> [  983.923705]  entry_SYSCALL_64_after_hwframe+0x65/0xca
> [  983.923707] RIP: 0033:0x7f080f10ee1b
> 
> When getxattr calls xfs_attr_node_get function, xfs_da3_node_lookup_int fails with EFSCORRUPTED in
> xfs_attr_node_hasname because we have use blocktrash to random it in xfs/126. So it
> free state in internal and xfs_attr_node_get doesn't do xfs_buf_trans release job.
> 
> Then subsequent removexattr will hang because of it.
> 
> This bug was introduced by kernel commit 07120f1abdff ("xfs: Add xfs_has_attr and subroutines").
> It adds xfs_attr_node_hasname helper and said caller will be responsible for freeing the state
> in this case. But xfs_attr_node_hasname will free state itself instead of caller if
> xfs_da3_node_lookup_int fails.
> 
> Fix this bug by moving the step of free state into caller.
> 
> [amir: this text from original commit is not relevant for 5.10 backport:
> Also, use "goto error/out" instead of returning error directly in xfs_attr_node_addname_find_attr and
> xfs_attr_node_removename_setup function because we should free state ourselves.
> ]
> 
> Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
> Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 96ac7e562b87..fcca36bbd997 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -876,21 +876,18 @@ xfs_attr_node_hasname(
>  
>  	state = xfs_da_state_alloc(args);
>  	if (statep != NULL)
> -		*statep = NULL;
> +		*statep = state;
>  
>  	/*
>  	 * Search to see if name exists, and get back a pointer to it.
>  	 */
>  	error = xfs_da3_node_lookup_int(state, &retval);
> -	if (error) {
> -		xfs_da_state_free(state);
> -		return error;
> -	}
> +	if (error)
> +		retval = error;
>  
> -	if (statep != NULL)
> -		*statep = state;
> -	else
> +	if (!statep)
>  		xfs_da_state_free(state);
> +
>  	return retval;
>  }
>  
> -- 

Curious -- the conversion of the _node_hasname callers isn't in this
patch.  Looking at 5.10.124, I see that most of the callers already
clean up the passed-out statep, but do the callers of xfs_has_attr free
it too?

--D

> 2.25.1
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount Amir Goldstein
@ 2022-06-22 16:38   ` Darrick J. Wong
  2022-06-22 16:54     ` Amir Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2022-06-22 16:38 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner

On Fri, Jun 17, 2022 at 01:06:39PM +0300, Amir Goldstein wrote:
> From: "Darrick J. Wong" <djwong@kernel.org>
> 
> commit b97cca3ba9098522e5a1c3388764ead42640c1a5 upstream.
> 
> In commit 02b9984d6408, we pushed a sync_filesystem() call from the VFS
> into xfs_fs_remount.  The only time that we ever need to push dirty file
> data or metadata to disk for a remount is if we're remounting the
> filesystem read only, so this really could be moved to xfs_remount_ro.
> 
> Once we've moved the call site, actually check the return value from
> sync_filesystem.
> 
> Fixes: 02b9984d6408 ("fs: push sync_filesystem() down to the file system's remount_fs()")
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/xfs/xfs_super.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 6323974d6b3e..dd0439ae6732 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1716,6 +1716,11 @@ xfs_remount_ro(
>  	};
>  	int			error;
>  
> +	/* Flush all the dirty data to disk. */
> +	error = sync_filesystem(mp->m_super);

Looking at 5.10.124's fsync.c and xfs_super.c:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/sync.c?h=v5.10.124#n31
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/xfs/xfs_super.c?h=v5.10.124#n755

I think this kernel needs the patch(es) that make __sync_filesystem return
the errors passed back by ->sync_fs, and I think also the patch that
makes xfs_fs_sync_fs return errors encountered by xfs_log_force, right?

--D

> +	if (error)
> +		return error;
> +
>  	/*
>  	 * Cancel background eofb scanning so it cannot race with the final
>  	 * log force+buftarg wait and deadlock the remount.
> @@ -1786,8 +1791,6 @@ xfs_fc_reconfigure(
>  	if (error)
>  		return error;
>  
> -	sync_filesystem(mp->m_super);
> -
>  	/* inode32 -> inode64 */
>  	if ((mp->m_flags & XFS_MOUNT_SMALL_INUMS) &&
>  	    !(new_mp->m_flags & XFS_MOUNT_SMALL_INUMS)) {
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes Amir Goldstein
@ 2022-06-22 16:41   ` Darrick J. Wong
  2022-06-22 18:36     ` Amir Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2022-06-22 16:41 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner,
	Christian Brauner

On Fri, Jun 17, 2022 at 01:06:41PM +0300, Amir Goldstein wrote:
> From: "Darrick J. Wong" <djwong@kernel.org>
> 
> commit e014f37db1a2d109afa750042ac4d69cf3e3d88e upstream.
> 
> [remove userns argument of setattr_copy() for backport]
> 
> Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid
> revocation isn't consistent with btrfs[1] or ext4.  Those two
> filesystems use the VFS function setattr_copy to convey certain
> attributes from struct iattr into the VFS inode structure.
> 
> Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to
> decide if it should clear setgid and setuid on a file attribute update.
> This is a second symptom of the problem that Filipe noticed.
> 
> XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode,
> xfs_setattr_nonsize, and xfs_setattr_time.  Regrettably, setattr_copy is
> /not/ a simple copy function; it contains additional logic to clear the
> setgid bit when setting the mode, and XFS' version no longer matches.
> 
> The VFS implements its own setuid/setgid stripping logic, which
> establishes consistent behavior.  It's a tad unfortunate that it's
> scattered across notify_change, should_remove_suid, and setattr_copy but
> XFS should really follow the Linux VFS.  Adapt XFS to use the VFS
> functions and get rid of the old functions.
> 
> [1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y2KCV25Q@mail.gmail.com/
> [2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@virtuozzo.com/
> 
> Fixes: 7fa294c8991c ("userns: Allow chown and setgid preservation")
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Same question as I posted to Leah's series -- have all the necessary VFS
fixes and whatnot been backported to 5.10?  Such that all the new sgid
inheritance tests actually pass with this patch applied? :)

--D

> ---
>  fs/xfs/xfs_iops.c | 56 +++--------------------------------------------
>  fs/xfs/xfs_pnfs.c |  3 ++-
>  2 files changed, 5 insertions(+), 54 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index b7f7b31a77d5..5711c8c12625 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -595,37 +595,6 @@ xfs_vn_getattr(
>  	return 0;
>  }
>  
> -static void
> -xfs_setattr_mode(
> -	struct xfs_inode	*ip,
> -	struct iattr		*iattr)
> -{
> -	struct inode		*inode = VFS_I(ip);
> -	umode_t			mode = iattr->ia_mode;
> -
> -	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> -
> -	inode->i_mode &= S_IFMT;
> -	inode->i_mode |= mode & ~S_IFMT;
> -}
> -
> -void
> -xfs_setattr_time(
> -	struct xfs_inode	*ip,
> -	struct iattr		*iattr)
> -{
> -	struct inode		*inode = VFS_I(ip);
> -
> -	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> -
> -	if (iattr->ia_valid & ATTR_ATIME)
> -		inode->i_atime = iattr->ia_atime;
> -	if (iattr->ia_valid & ATTR_CTIME)
> -		inode->i_ctime = iattr->ia_ctime;
> -	if (iattr->ia_valid & ATTR_MTIME)
> -		inode->i_mtime = iattr->ia_mtime;
> -}
> -
>  static int
>  xfs_vn_change_ok(
>  	struct dentry	*dentry,
> @@ -740,16 +709,6 @@ xfs_setattr_nonsize(
>  				goto out_cancel;
>  		}
>  
> -		/*
> -		 * CAP_FSETID overrides the following restrictions:
> -		 *
> -		 * The set-user-ID and set-group-ID bits of a file will be
> -		 * cleared upon successful return from chown()
> -		 */
> -		if ((inode->i_mode & (S_ISUID|S_ISGID)) &&
> -		    !capable(CAP_FSETID))
> -			inode->i_mode &= ~(S_ISUID|S_ISGID);
> -
>  		/*
>  		 * Change the ownerships and register quota modifications
>  		 * in the transaction.
> @@ -761,7 +720,6 @@ xfs_setattr_nonsize(
>  				olddquot1 = xfs_qm_vop_chown(tp, ip,
>  							&ip->i_udquot, udqp);
>  			}
> -			inode->i_uid = uid;
>  		}
>  		if (!gid_eq(igid, gid)) {
>  			if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_GQUOTA_ON(mp)) {
> @@ -772,15 +730,10 @@ xfs_setattr_nonsize(
>  				olddquot2 = xfs_qm_vop_chown(tp, ip,
>  							&ip->i_gdquot, gdqp);
>  			}
> -			inode->i_gid = gid;
>  		}
>  	}
>  
> -	if (mask & ATTR_MODE)
> -		xfs_setattr_mode(ip, iattr);
> -	if (mask & (ATTR_ATIME|ATTR_CTIME|ATTR_MTIME))
> -		xfs_setattr_time(ip, iattr);
> -
> +	setattr_copy(inode, iattr);
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>  
>  	XFS_STATS_INC(mp, xs_ig_attrchg);
> @@ -1025,11 +978,8 @@ xfs_setattr_size(
>  		xfs_inode_clear_eofblocks_tag(ip);
>  	}
>  
> -	if (iattr->ia_valid & ATTR_MODE)
> -		xfs_setattr_mode(ip, iattr);
> -	if (iattr->ia_valid & (ATTR_ATIME|ATTR_CTIME|ATTR_MTIME))
> -		xfs_setattr_time(ip, iattr);
> -
> +	ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID)));
> +	setattr_copy(inode, iattr);
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>  
>  	XFS_STATS_INC(mp, xs_ig_attrchg);
> diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c
> index f3082a957d5e..ae61094bc9d1 100644
> --- a/fs/xfs/xfs_pnfs.c
> +++ b/fs/xfs/xfs_pnfs.c
> @@ -283,7 +283,8 @@ xfs_fs_commit_blocks(
>  	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>  
> -	xfs_setattr_time(ip, iattr);
> +	ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID)));
> +	setattr_copy(inode, iattr);
>  	if (update_isize) {
>  		i_size_write(inode, iattr->ia_size);
>  		ip->i_d.di_size = iattr->ia_size;
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 07/11] xfs: xfs_log_force_lsn isn't passed a LSN
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 07/11] xfs: xfs_log_force_lsn isn't passed a LSN Amir Goldstein
@ 2022-06-22 16:45   ` Darrick J. Wong
  2022-06-22 17:09     ` Amir Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2022-06-22 16:45 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner,
	Brian Foster, Allison Henderson

On Fri, Jun 17, 2022 at 01:06:37PM +0300, Amir Goldstein wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> commit 5f9b4b0de8dc2fb8eb655463b438001c111570fe upstream.
> 
> [backported from CIL scalability series for dependency]
> 
> In doing an investigation into AIL push stalls, I was looking at the
> log force code to see if an async CIL push could be done instead.
> This lead me to xfs_log_force_lsn() and looking at how it works.
> 
> xfs_log_force_lsn() is only called from inode synchronisation
> contexts such as fsync(), and it takes the ip->i_itemp->ili_last_lsn
> value as the LSN to sync the log to. This gets passed to
> xlog_cil_force_lsn() via xfs_log_force_lsn() to flush the CIL to the
> journal, and then used by xfs_log_force_lsn() to flush the iclogs to
> the journal.
> 
> The problem is that ip->i_itemp->ili_last_lsn does not store a
> log sequence number. What it stores is passed to it from the
> ->iop_committing method, which is called by xfs_log_commit_cil().
> The value this passes to the iop_committing method is the CIL
> context sequence number that the item was committed to.
> 
> As it turns out, xlog_cil_force_lsn() converts the sequence to an
> actual commit LSN for the related context and returns that to
> xfs_log_force_lsn(). xfs_log_force_lsn() overwrites it's "lsn"
> variable that contained a sequence with an actual LSN and then uses
> that to sync the iclogs.
> 
> This caused me some confusion for a while, even though I originally
> wrote all this code a decade ago. ->iop_committing is only used by
> a couple of log item types, and only inode items use the sequence
> number it is passed.
> 
> Let's clean up the API, CIL structures and inode log item to call it
> a sequence number, and make it clear that the high level code is
> using CIL sequence numbers and not on-disk LSNs for integrity
> synchronisation purposes.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Brian Foster <bfoster@redhat.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

This /looks/ right, but given the invasiveness of this patch, I'm
curious how much you had to change from upstream to get here?

Also, did you look at the other log fixes in 5.14 and decide none of
them were needed/worth the risk, or is that merely pending for the next
round?

--D

> ---
>  fs/xfs/libxfs/xfs_types.h |  1 +
>  fs/xfs/xfs_buf_item.c     |  2 +-
>  fs/xfs/xfs_dquot_item.c   |  2 +-
>  fs/xfs/xfs_file.c         | 14 +++++++-------
>  fs/xfs/xfs_inode.c        | 10 +++++-----
>  fs/xfs/xfs_inode_item.c   |  4 ++--
>  fs/xfs/xfs_inode_item.h   |  2 +-
>  fs/xfs/xfs_log.c          | 27 ++++++++++++++-------------
>  fs/xfs/xfs_log.h          |  4 +---
>  fs/xfs/xfs_log_cil.c      | 30 +++++++++++-------------------
>  fs/xfs/xfs_log_priv.h     | 15 +++++++--------
>  fs/xfs/xfs_trans.c        |  6 +++---
>  fs/xfs/xfs_trans.h        |  4 ++--
>  13 files changed, 56 insertions(+), 65 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
> index 397d94775440..1ce06173c2f5 100644
> --- a/fs/xfs/libxfs/xfs_types.h
> +++ b/fs/xfs/libxfs/xfs_types.h
> @@ -21,6 +21,7 @@ typedef int32_t		xfs_suminfo_t;	/* type of bitmap summary info */
>  typedef uint32_t	xfs_rtword_t;	/* word type for bitmap manipulations */
>  
>  typedef int64_t		xfs_lsn_t;	/* log sequence number */
> +typedef int64_t		xfs_csn_t;	/* CIL sequence number */
>  
>  typedef uint32_t	xfs_dablk_t;	/* dir/attr block number (in file) */
>  typedef uint32_t	xfs_dahash_t;	/* dir/attr hash value */
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index 8c6e26d62ef2..5d6535370f87 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -632,7 +632,7 @@ xfs_buf_item_release(
>  STATIC void
>  xfs_buf_item_committing(
>  	struct xfs_log_item	*lip,
> -	xfs_lsn_t		commit_lsn)
> +	xfs_csn_t		seq)
>  {
>  	return xfs_buf_item_release(lip);
>  }
> diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> index 8c1fdf37ee8f..8ed47b739b6c 100644
> --- a/fs/xfs/xfs_dquot_item.c
> +++ b/fs/xfs/xfs_dquot_item.c
> @@ -188,7 +188,7 @@ xfs_qm_dquot_logitem_release(
>  STATIC void
>  xfs_qm_dquot_logitem_committing(
>  	struct xfs_log_item	*lip,
> -	xfs_lsn_t		commit_lsn)
> +	xfs_csn_t		seq)
>  {
>  	return xfs_qm_dquot_logitem_release(lip);
>  }
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 414d856e2e75..4d6bf8d4974f 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -118,8 +118,8 @@ xfs_dir_fsync(
>  	return xfs_log_force_inode(ip);
>  }
>  
> -static xfs_lsn_t
> -xfs_fsync_lsn(
> +static xfs_csn_t
> +xfs_fsync_seq(
>  	struct xfs_inode	*ip,
>  	bool			datasync)
>  {
> @@ -127,7 +127,7 @@ xfs_fsync_lsn(
>  		return 0;
>  	if (datasync && !(ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
>  		return 0;
> -	return ip->i_itemp->ili_last_lsn;
> +	return ip->i_itemp->ili_commit_seq;
>  }
>  
>  /*
> @@ -150,12 +150,12 @@ xfs_fsync_flush_log(
>  	int			*log_flushed)
>  {
>  	int			error = 0;
> -	xfs_lsn_t		lsn;
> +	xfs_csn_t		seq;
>  
>  	xfs_ilock(ip, XFS_ILOCK_SHARED);
> -	lsn = xfs_fsync_lsn(ip, datasync);
> -	if (lsn) {
> -		error = xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC,
> +	seq = xfs_fsync_seq(ip, datasync);
> +	if (seq) {
> +		error = xfs_log_force_seq(ip->i_mount, seq, XFS_LOG_SYNC,
>  					  log_flushed);
>  
>  		spin_lock(&ip->i_itemp->ili_lock);
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index e958b1c74561..1ae669d12301 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -2754,7 +2754,7 @@ xfs_iunpin(
>  	trace_xfs_inode_unpin_nowait(ip, _RET_IP_);
>  
>  	/* Give the log a push to start the unpinning I/O */
> -	xfs_log_force_lsn(ip->i_mount, ip->i_itemp->ili_last_lsn, 0, NULL);
> +	xfs_log_force_seq(ip->i_mount, ip->i_itemp->ili_commit_seq, 0, NULL);
>  
>  }
>  
> @@ -3717,16 +3717,16 @@ int
>  xfs_log_force_inode(
>  	struct xfs_inode	*ip)
>  {
> -	xfs_lsn_t		lsn = 0;
> +	xfs_csn_t		seq = 0;
>  
>  	xfs_ilock(ip, XFS_ILOCK_SHARED);
>  	if (xfs_ipincount(ip))
> -		lsn = ip->i_itemp->ili_last_lsn;
> +		seq = ip->i_itemp->ili_commit_seq;
>  	xfs_iunlock(ip, XFS_ILOCK_SHARED);
>  
> -	if (!lsn)
> +	if (!seq)
>  		return 0;
> -	return xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC, NULL);
> +	return xfs_log_force_seq(ip->i_mount, seq, XFS_LOG_SYNC, NULL);
>  }
>  
>  /*
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index 6ff91e5bf3cd..3aba4559469f 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -617,9 +617,9 @@ xfs_inode_item_committed(
>  STATIC void
>  xfs_inode_item_committing(
>  	struct xfs_log_item	*lip,
> -	xfs_lsn_t		commit_lsn)
> +	xfs_csn_t		seq)
>  {
> -	INODE_ITEM(lip)->ili_last_lsn = commit_lsn;
> +	INODE_ITEM(lip)->ili_commit_seq = seq;
>  	return xfs_inode_item_release(lip);
>  }
>  
> diff --git a/fs/xfs/xfs_inode_item.h b/fs/xfs/xfs_inode_item.h
> index 4b926e32831c..403b45ab9aa2 100644
> --- a/fs/xfs/xfs_inode_item.h
> +++ b/fs/xfs/xfs_inode_item.h
> @@ -33,7 +33,7 @@ struct xfs_inode_log_item {
>  	unsigned int		ili_fields;	   /* fields to be logged */
>  	unsigned int		ili_fsync_fields;  /* logged since last fsync */
>  	xfs_lsn_t		ili_flush_lsn;	   /* lsn at last flush */
> -	xfs_lsn_t		ili_last_lsn;	   /* lsn at last transaction */
> +	xfs_csn_t		ili_commit_seq;	   /* last transaction commit */
>  };
>  
>  static inline int xfs_inode_clean(struct xfs_inode *ip)
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index b445e63cbc3c..05791456adbb 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -3210,14 +3210,13 @@ xfs_log_force(
>  }
>  
>  static int
> -__xfs_log_force_lsn(
> -	struct xfs_mount	*mp,
> +xlog_force_lsn(
> +	struct xlog		*log,
>  	xfs_lsn_t		lsn,
>  	uint			flags,
>  	int			*log_flushed,
>  	bool			already_slept)
>  {
> -	struct xlog		*log = mp->m_log;
>  	struct xlog_in_core	*iclog;
>  
>  	spin_lock(&log->l_icloglock);
> @@ -3250,8 +3249,6 @@ __xfs_log_force_lsn(
>  		if (!already_slept &&
>  		    (iclog->ic_prev->ic_state == XLOG_STATE_WANT_SYNC ||
>  		     iclog->ic_prev->ic_state == XLOG_STATE_SYNCING)) {
> -			XFS_STATS_INC(mp, xs_log_force_sleep);
> -
>  			xlog_wait(&iclog->ic_prev->ic_write_wait,
>  					&log->l_icloglock);
>  			return -EAGAIN;
> @@ -3289,25 +3286,29 @@ __xfs_log_force_lsn(
>   * to disk, that thread will wake up all threads waiting on the queue.
>   */
>  int
> -xfs_log_force_lsn(
> +xfs_log_force_seq(
>  	struct xfs_mount	*mp,
> -	xfs_lsn_t		lsn,
> +	xfs_csn_t		seq,
>  	uint			flags,
>  	int			*log_flushed)
>  {
> +	struct xlog		*log = mp->m_log;
> +	xfs_lsn_t		lsn;
>  	int			ret;
> -	ASSERT(lsn != 0);
> +	ASSERT(seq != 0);
>  
>  	XFS_STATS_INC(mp, xs_log_force);
> -	trace_xfs_log_force(mp, lsn, _RET_IP_);
> +	trace_xfs_log_force(mp, seq, _RET_IP_);
>  
> -	lsn = xlog_cil_force_lsn(mp->m_log, lsn);
> +	lsn = xlog_cil_force_seq(log, seq);
>  	if (lsn == NULLCOMMITLSN)
>  		return 0;
>  
> -	ret = __xfs_log_force_lsn(mp, lsn, flags, log_flushed, false);
> -	if (ret == -EAGAIN)
> -		ret = __xfs_log_force_lsn(mp, lsn, flags, log_flushed, true);
> +	ret = xlog_force_lsn(log, lsn, flags, log_flushed, false);
> +	if (ret == -EAGAIN) {
> +		XFS_STATS_INC(mp, xs_log_force_sleep);
> +		ret = xlog_force_lsn(log, lsn, flags, log_flushed, true);
> +	}
>  	return ret;
>  }
>  
> diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> index 98c913da7587..a1089f8b7169 100644
> --- a/fs/xfs/xfs_log.h
> +++ b/fs/xfs/xfs_log.h
> @@ -106,7 +106,7 @@ struct xfs_item_ops;
>  struct xfs_trans;
>  
>  int	  xfs_log_force(struct xfs_mount *mp, uint flags);
> -int	  xfs_log_force_lsn(struct xfs_mount *mp, xfs_lsn_t lsn, uint flags,
> +int	  xfs_log_force_seq(struct xfs_mount *mp, xfs_csn_t seq, uint flags,
>  		int *log_forced);
>  int	  xfs_log_mount(struct xfs_mount	*mp,
>  			struct xfs_buftarg	*log_target,
> @@ -132,8 +132,6 @@ bool	xfs_log_writable(struct xfs_mount *mp);
>  struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
>  void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
>  
> -void	xfs_log_commit_cil(struct xfs_mount *mp, struct xfs_trans *tp,
> -				xfs_lsn_t *commit_lsn, bool regrant);
>  void	xlog_cil_process_committed(struct list_head *list);
>  bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
>  
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index cd5c04dabe2e..88730883bb70 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -777,7 +777,7 @@ xlog_cil_push_work(
>  	 * that higher sequences will wait for us to write out a commit record
>  	 * before they do.
>  	 *
> -	 * xfs_log_force_lsn requires us to mirror the new sequence into the cil
> +	 * xfs_log_force_seq requires us to mirror the new sequence into the cil
>  	 * structure atomically with the addition of this sequence to the
>  	 * committing list. This also ensures that we can do unlocked checks
>  	 * against the current sequence in log forces without risking
> @@ -1020,16 +1020,14 @@ xlog_cil_empty(
>   * allowed again.
>   */
>  void
> -xfs_log_commit_cil(
> -	struct xfs_mount	*mp,
> +xlog_cil_commit(
> +	struct xlog		*log,
>  	struct xfs_trans	*tp,
> -	xfs_lsn_t		*commit_lsn,
> +	xfs_csn_t		*commit_seq,
>  	bool			regrant)
>  {
> -	struct xlog		*log = mp->m_log;
>  	struct xfs_cil		*cil = log->l_cilp;
>  	struct xfs_log_item	*lip, *next;
> -	xfs_lsn_t		xc_commit_lsn;
>  
>  	/*
>  	 * Do all necessary memory allocation before we lock the CIL.
> @@ -1043,10 +1041,6 @@ xfs_log_commit_cil(
>  
>  	xlog_cil_insert_items(log, tp);
>  
> -	xc_commit_lsn = cil->xc_ctx->sequence;
> -	if (commit_lsn)
> -		*commit_lsn = xc_commit_lsn;
> -
>  	if (regrant && !XLOG_FORCED_SHUTDOWN(log))
>  		xfs_log_ticket_regrant(log, tp->t_ticket);
>  	else
> @@ -1069,8 +1063,10 @@ xfs_log_commit_cil(
>  	list_for_each_entry_safe(lip, next, &tp->t_items, li_trans) {
>  		xfs_trans_del_item(lip);
>  		if (lip->li_ops->iop_committing)
> -			lip->li_ops->iop_committing(lip, xc_commit_lsn);
> +			lip->li_ops->iop_committing(lip, cil->xc_ctx->sequence);
>  	}
> +	if (commit_seq)
> +		*commit_seq = cil->xc_ctx->sequence;
>  
>  	/* xlog_cil_push_background() releases cil->xc_ctx_lock */
>  	xlog_cil_push_background(log);
> @@ -1087,9 +1083,9 @@ xfs_log_commit_cil(
>   * iclog flush is necessary following this call.
>   */
>  xfs_lsn_t
> -xlog_cil_force_lsn(
> +xlog_cil_force_seq(
>  	struct xlog	*log,
> -	xfs_lsn_t	sequence)
> +	xfs_csn_t	sequence)
>  {
>  	struct xfs_cil		*cil = log->l_cilp;
>  	struct xfs_cil_ctx	*ctx;
> @@ -1185,21 +1181,17 @@ bool
>  xfs_log_item_in_current_chkpt(
>  	struct xfs_log_item *lip)
>  {
> -	struct xfs_cil_ctx *ctx;
> +	struct xfs_cil_ctx *ctx = lip->li_mountp->m_log->l_cilp->xc_ctx;
>  
>  	if (list_empty(&lip->li_cil))
>  		return false;
>  
> -	ctx = lip->li_mountp->m_log->l_cilp->xc_ctx;
> -
>  	/*
>  	 * li_seq is written on the first commit of a log item to record the
>  	 * first checkpoint it is written to. Hence if it is different to the
>  	 * current sequence, we're in a new checkpoint.
>  	 */
> -	if (XFS_LSN_CMP(lip->li_seq, ctx->sequence) != 0)
> -		return false;
> -	return true;
> +	return lip->li_seq == ctx->sequence;
>  }
>  
>  /*
> diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
> index 1c6fdbf3d506..42cd1602ac25 100644
> --- a/fs/xfs/xfs_log_priv.h
> +++ b/fs/xfs/xfs_log_priv.h
> @@ -230,7 +230,7 @@ struct xfs_cil;
>  
>  struct xfs_cil_ctx {
>  	struct xfs_cil		*cil;
> -	xfs_lsn_t		sequence;	/* chkpt sequence # */
> +	xfs_csn_t		sequence;	/* chkpt sequence # */
>  	xfs_lsn_t		start_lsn;	/* first LSN of chkpt commit */
>  	xfs_lsn_t		commit_lsn;	/* chkpt commit record lsn */
>  	struct xlog_ticket	*ticket;	/* chkpt ticket */
> @@ -268,10 +268,10 @@ struct xfs_cil {
>  	struct xfs_cil_ctx	*xc_ctx;
>  
>  	spinlock_t		xc_push_lock ____cacheline_aligned_in_smp;
> -	xfs_lsn_t		xc_push_seq;
> +	xfs_csn_t		xc_push_seq;
>  	struct list_head	xc_committing;
>  	wait_queue_head_t	xc_commit_wait;
> -	xfs_lsn_t		xc_current_sequence;
> +	xfs_csn_t		xc_current_sequence;
>  	struct work_struct	xc_push_work;
>  	wait_queue_head_t	xc_push_wait;	/* background push throttle */
>  } ____cacheline_aligned_in_smp;
> @@ -547,19 +547,18 @@ int	xlog_cil_init(struct xlog *log);
>  void	xlog_cil_init_post_recovery(struct xlog *log);
>  void	xlog_cil_destroy(struct xlog *log);
>  bool	xlog_cil_empty(struct xlog *log);
> +void	xlog_cil_commit(struct xlog *log, struct xfs_trans *tp,
> +			xfs_csn_t *commit_seq, bool regrant);
>  
>  /*
>   * CIL force routines
>   */
> -xfs_lsn_t
> -xlog_cil_force_lsn(
> -	struct xlog *log,
> -	xfs_lsn_t sequence);
> +xfs_lsn_t xlog_cil_force_seq(struct xlog *log, xfs_csn_t sequence);
>  
>  static inline void
>  xlog_cil_force(struct xlog *log)
>  {
> -	xlog_cil_force_lsn(log, log->l_cilp->xc_current_sequence);
> +	xlog_cil_force_seq(log, log->l_cilp->xc_current_sequence);
>  }
>  
>  /*
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index 36166bae24a6..73a1de7ceefc 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -832,7 +832,7 @@ __xfs_trans_commit(
>  	bool			regrant)
>  {
>  	struct xfs_mount	*mp = tp->t_mountp;
> -	xfs_lsn_t		commit_lsn = -1;
> +	xfs_csn_t		commit_seq = 0;
>  	int			error = 0;
>  	int			sync = tp->t_flags & XFS_TRANS_SYNC;
>  
> @@ -874,7 +874,7 @@ __xfs_trans_commit(
>  		xfs_trans_apply_sb_deltas(tp);
>  	xfs_trans_apply_dquot_deltas(tp);
>  
> -	xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
> +	xlog_cil_commit(mp->m_log, tp, &commit_seq, regrant);
>  
>  	xfs_trans_free(tp);
>  
> @@ -883,7 +883,7 @@ __xfs_trans_commit(
>  	 * log out now and wait for it.
>  	 */
>  	if (sync) {
> -		error = xfs_log_force_lsn(mp, commit_lsn, XFS_LOG_SYNC, NULL);
> +		error = xfs_log_force_seq(mp, commit_seq, XFS_LOG_SYNC, NULL);
>  		XFS_STATS_INC(mp, xs_trans_sync);
>  	} else {
>  		XFS_STATS_INC(mp, xs_trans_async);
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 075eeade4f7d..97485559008b 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -43,7 +43,7 @@ struct xfs_log_item {
>  	struct list_head		li_cil;		/* CIL pointers */
>  	struct xfs_log_vec		*li_lv;		/* active log vector */
>  	struct xfs_log_vec		*li_lv_shadow;	/* standby vector */
> -	xfs_lsn_t			li_seq;		/* CIL commit seq */
> +	xfs_csn_t			li_seq;		/* CIL commit seq */
>  };
>  
>  /*
> @@ -69,7 +69,7 @@ struct xfs_item_ops {
>  	void (*iop_pin)(struct xfs_log_item *);
>  	void (*iop_unpin)(struct xfs_log_item *, int remove);
>  	uint (*iop_push)(struct xfs_log_item *, struct list_head *);
> -	void (*iop_committing)(struct xfs_log_item *, xfs_lsn_t commit_lsn);
> +	void (*iop_committing)(struct xfs_log_item *lip, xfs_csn_t seq);
>  	void (*iop_release)(struct xfs_log_item *);
>  	xfs_lsn_t (*iop_committed)(struct xfs_log_item *, xfs_lsn_t);
>  	int (*iop_recover)(struct xfs_log_item *lip,
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount
  2022-06-22 16:38   ` Darrick J. Wong
@ 2022-06-22 16:54     ` Amir Goldstein
  2022-06-22 23:42       ` Darrick J. Wong
  0 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-06-22 16:54 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner

On Wed, Jun 22, 2022 at 7:38 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Fri, Jun 17, 2022 at 01:06:39PM +0300, Amir Goldstein wrote:
> > From: "Darrick J. Wong" <djwong@kernel.org>
> >
> > commit b97cca3ba9098522e5a1c3388764ead42640c1a5 upstream.
> >
> > In commit 02b9984d6408, we pushed a sync_filesystem() call from the VFS
> > into xfs_fs_remount.  The only time that we ever need to push dirty file
> > data or metadata to disk for a remount is if we're remounting the
> > filesystem read only, so this really could be moved to xfs_remount_ro.
> >
> > Once we've moved the call site, actually check the return value from
> > sync_filesystem.

This part is not really relevant for this backport, do you want me to
emphasise that?

> >
> > Fixes: 02b9984d6408 ("fs: push sync_filesystem() down to the file system's remount_fs()")
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > Reviewed-by: Dave Chinner <dchinner@redhat.com>
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/xfs/xfs_super.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 6323974d6b3e..dd0439ae6732 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1716,6 +1716,11 @@ xfs_remount_ro(
> >       };
> >       int                     error;
> >
> > +     /* Flush all the dirty data to disk. */
> > +     error = sync_filesystem(mp->m_super);
>
> Looking at 5.10.124's fsync.c and xfs_super.c:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/sync.c?h=v5.10.124#n31
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/xfs/xfs_super.c?h=v5.10.124#n755
>
> I think this kernel needs the patch(es) that make __sync_filesystem return
> the errors passed back by ->sync_fs, and I think also the patch that
> makes xfs_fs_sync_fs return errors encountered by xfs_log_force, right?

It wasn't my intention to fix syncfs() does not return errors in 5.10.
It has always been that way and IIRC, the relevant patches did not
apply cleanly.

THIS patch however, fixes something else, not only the return of the error
to its caller, so I thought it was worth backporting.
If you think otherwise, I'll drop it.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 07/11] xfs: xfs_log_force_lsn isn't passed a LSN
  2022-06-22 16:45   ` Darrick J. Wong
@ 2022-06-22 17:09     ` Amir Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-22 17:09 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner,
	Brian Foster, Allison Henderson

On Wed, Jun 22, 2022 at 7:45 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Fri, Jun 17, 2022 at 01:06:37PM +0300, Amir Goldstein wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > commit 5f9b4b0de8dc2fb8eb655463b438001c111570fe upstream.
> >
> > [backported from CIL scalability series for dependency]
> >
> > In doing an investigation into AIL push stalls, I was looking at the
> > log force code to see if an async CIL push could be done instead.
> > This lead me to xfs_log_force_lsn() and looking at how it works.
> >
> > xfs_log_force_lsn() is only called from inode synchronisation
> > contexts such as fsync(), and it takes the ip->i_itemp->ili_last_lsn
> > value as the LSN to sync the log to. This gets passed to
> > xlog_cil_force_lsn() via xfs_log_force_lsn() to flush the CIL to the
> > journal, and then used by xfs_log_force_lsn() to flush the iclogs to
> > the journal.
> >
> > The problem is that ip->i_itemp->ili_last_lsn does not store a
> > log sequence number. What it stores is passed to it from the
> > ->iop_committing method, which is called by xfs_log_commit_cil().
> > The value this passes to the iop_committing method is the CIL
> > context sequence number that the item was committed to.
> >
> > As it turns out, xlog_cil_force_lsn() converts the sequence to an
> > actual commit LSN for the related context and returns that to
> > xfs_log_force_lsn(). xfs_log_force_lsn() overwrites it's "lsn"
> > variable that contained a sequence with an actual LSN and then uses
> > that to sync the iclogs.
> >
> > This caused me some confusion for a while, even though I originally
> > wrote all this code a decade ago. ->iop_committing is only used by
> > a couple of log item types, and only inode items use the sequence
> > number it is passed.
> >
> > Let's clean up the API, CIL structures and inode log item to call it
> > a sequence number, and make it clear that the high level code is
> > using CIL sequence numbers and not on-disk LSNs for integrity
> > synchronisation purposes.
> >
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > Reviewed-by: Brian Foster <bfoster@redhat.com>
> > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> This /looks/ right, but given the invasiveness of this patch, I'm
> curious how much you had to change from upstream to get here?

IIRC, it wasn't so bad, there may have been a lot of conflicts, but
most of them were resolved by doing search&replace on the 5.10
code and it was also pretty clean and the compiler will catch errors
if something that needed to be replaced wasn't in a conflicting hunk.

>
> Also, did you look at the other log fixes in 5.14 and decide none of
> them were needed/worth the risk, or is that merely pending for the next
> round?

There are other log fixes pending, which look like they fix scary bugs.
I just promoted this one to accommodate the joint 5.10/5.15 series.
Here are the fixes pending in xfs-5.10.y-4:

* 6e0d4f5c1a0f - (xfs-5.10.y-4) xfs: Enforce attr3 buffer recovery order
* 610edc215903 - xfs: logging the on disk inode LSN can make it go backwards
* da28279083ee - xfs: reset child dir '..' entry when unlinking child
* 35efb2d1e3a7 - xfs: remove dead stale buf unpin handling code
* fa7ae3691e6f - xfs: hold buffer across unpin and potential shutdown processing
* 0f6d6b3e8da3 - xfs: force the log offline when log intent item recovery fails
* 1883c074e795 - xfs: fix log intent recovery ENOSPC shutdowns when
inactivating inodes

I would be happy to get to post these.
If you have time to ACK the 5.13 selection [1], I will start testing
xfs-5.10.y-4.

Thanks!
Amir.

[1] https://lore.kernel.org/linux-xfs/20220606160537.689915-1-amir73il@gmail.com/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes
  2022-06-22 16:41   ` Darrick J. Wong
@ 2022-06-22 18:36     ` Amir Goldstein
  2022-06-22 22:17       ` Leah Rumancik
  0 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-06-22 18:36 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner,
	Christian Brauner

On Wed, Jun 22, 2022 at 7:41 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Fri, Jun 17, 2022 at 01:06:41PM +0300, Amir Goldstein wrote:
> > From: "Darrick J. Wong" <djwong@kernel.org>
> >
> > commit e014f37db1a2d109afa750042ac4d69cf3e3d88e upstream.
> >
> > [remove userns argument of setattr_copy() for backport]
> >
> > Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid
> > revocation isn't consistent with btrfs[1] or ext4.  Those two
> > filesystems use the VFS function setattr_copy to convey certain
> > attributes from struct iattr into the VFS inode structure.
> >
> > Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to
> > decide if it should clear setgid and setuid on a file attribute update.
> > This is a second symptom of the problem that Filipe noticed.
> >
> > XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode,
> > xfs_setattr_nonsize, and xfs_setattr_time.  Regrettably, setattr_copy is
> > /not/ a simple copy function; it contains additional logic to clear the
> > setgid bit when setting the mode, and XFS' version no longer matches.
> >
> > The VFS implements its own setuid/setgid stripping logic, which
> > establishes consistent behavior.  It's a tad unfortunate that it's
> > scattered across notify_change, should_remove_suid, and setattr_copy but
> > XFS should really follow the Linux VFS.  Adapt XFS to use the VFS
> > functions and get rid of the old functions.
> >
> > [1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y2KCV25Q@mail.gmail.com/
> > [2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@virtuozzo.com/
> >
> > Fixes: 7fa294c8991c ("userns: Allow chown and setgid preservation")
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > Reviewed-by: Dave Chinner <dchinner@redhat.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Reviewed-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> Same question as I posted to Leah's series -- have all the necessary VFS
> fixes and whatnot been backported to 5.10?  Such that all the new sgid
> inheritance tests actually pass with this patch applied? :)

The only patch I backorted to 5.10 is:
xfs: fix up non-directory creation in SGID directories

I will check which SGID tests ran on my series.

Personally, I would rather defer THIS patch to a later post to stable
(Leah's patch as well) until we have a better understanding of the state
of SGID issues.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 03/11] xfs: Fix the free logic of state in xfs_attr_node_hasname
  2022-06-22 16:32   ` Darrick J. Wong
@ 2022-06-22 18:46     ` Amir Goldstein
  2022-06-22 21:50       ` Darrick J. Wong
  0 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-06-22 18:46 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Yang Xu

On Wed, Jun 22, 2022 at 7:32 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Fri, Jun 17, 2022 at 01:06:33PM +0300, Amir Goldstein wrote:
> > From: Yang Xu <xuyang2018.jy@fujitsu.com>
> >
> > commit a1de97fe296c52eafc6590a3506f4bbd44ecb19a upstream.
> >
> > When testing xfstests xfs/126 on lastest upstream kernel, it will hang on some machine.
> > Adding a getxattr operation after xattr corrupted, I can reproduce it 100%.
> >
> > The deadlock as below:
> > [983.923403] task:setfattr        state:D stack:    0 pid:17639 ppid: 14687 flags:0x00000080
> > [  983.923405] Call Trace:
> > [  983.923410]  __schedule+0x2c4/0x700
> > [  983.923412]  schedule+0x37/0xa0
> > [  983.923414]  schedule_timeout+0x274/0x300
> > [  983.923416]  __down+0x9b/0xf0
> > [  983.923451]  ? xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
> > [  983.923453]  down+0x3b/0x50
> > [  983.923471]  xfs_buf_lock+0x33/0xf0 [xfs]
> > [  983.923490]  xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
> > [  983.923508]  xfs_buf_get_map+0x4c/0x320 [xfs]
> > [  983.923525]  xfs_buf_read_map+0x53/0x310 [xfs]
> > [  983.923541]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
> > [  983.923560]  xfs_trans_read_buf_map+0x1cf/0x360 [xfs]
> > [  983.923575]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
> > [  983.923590]  xfs_da_read_buf+0xcf/0x120 [xfs]
> > [  983.923606]  xfs_da3_node_read+0x1f/0x40 [xfs]
> > [  983.923621]  xfs_da3_node_lookup_int+0x69/0x4a0 [xfs]
> > [  983.923624]  ? kmem_cache_alloc+0x12e/0x270
> > [  983.923637]  xfs_attr_node_hasname+0x6e/0xa0 [xfs]
> > [  983.923651]  xfs_has_attr+0x6e/0xd0 [xfs]
> > [  983.923664]  xfs_attr_set+0x273/0x320 [xfs]
> > [  983.923683]  xfs_xattr_set+0x87/0xd0 [xfs]
> > [  983.923686]  __vfs_removexattr+0x4d/0x60
> > [  983.923688]  __vfs_removexattr_locked+0xac/0x130
> > [  983.923689]  vfs_removexattr+0x4e/0xf0
> > [  983.923690]  removexattr+0x4d/0x80
> > [  983.923693]  ? __check_object_size+0xa8/0x16b
> > [  983.923695]  ? strncpy_from_user+0x47/0x1a0
> > [  983.923696]  ? getname_flags+0x6a/0x1e0
> > [  983.923697]  ? _cond_resched+0x15/0x30
> > [  983.923699]  ? __sb_start_write+0x1e/0x70
> > [  983.923700]  ? mnt_want_write+0x28/0x50
> > [  983.923701]  path_removexattr+0x9b/0xb0
> > [  983.923702]  __x64_sys_removexattr+0x17/0x20
> > [  983.923704]  do_syscall_64+0x5b/0x1a0
> > [  983.923705]  entry_SYSCALL_64_after_hwframe+0x65/0xca
> > [  983.923707] RIP: 0033:0x7f080f10ee1b
> >
> > When getxattr calls xfs_attr_node_get function, xfs_da3_node_lookup_int fails with EFSCORRUPTED in
> > xfs_attr_node_hasname because we have use blocktrash to random it in xfs/126. So it
> > free state in internal and xfs_attr_node_get doesn't do xfs_buf_trans release job.
> >
> > Then subsequent removexattr will hang because of it.
> >
> > This bug was introduced by kernel commit 07120f1abdff ("xfs: Add xfs_has_attr and subroutines").
> > It adds xfs_attr_node_hasname helper and said caller will be responsible for freeing the state
> > in this case. But xfs_attr_node_hasname will free state itself instead of caller if
> > xfs_da3_node_lookup_int fails.
> >
> > Fix this bug by moving the step of free state into caller.
> >
> > [amir: this text from original commit is not relevant for 5.10 backport:
> > Also, use "goto error/out" instead of returning error directly in xfs_attr_node_addname_find_attr and
> > xfs_attr_node_removename_setup function because we should free state ourselves.
> > ]
> >
> > Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
> > Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com>
> > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/xfs/libxfs/xfs_attr.c | 13 +++++--------
> >  1 file changed, 5 insertions(+), 8 deletions(-)
> >
> > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> > index 96ac7e562b87..fcca36bbd997 100644
> > --- a/fs/xfs/libxfs/xfs_attr.c
> > +++ b/fs/xfs/libxfs/xfs_attr.c
> > @@ -876,21 +876,18 @@ xfs_attr_node_hasname(
> >
> >       state = xfs_da_state_alloc(args);
> >       if (statep != NULL)
> > -             *statep = NULL;
> > +             *statep = state;
> >
> >       /*
> >        * Search to see if name exists, and get back a pointer to it.
> >        */
> >       error = xfs_da3_node_lookup_int(state, &retval);
> > -     if (error) {
> > -             xfs_da_state_free(state);
> > -             return error;
> > -     }
> > +     if (error)
> > +             retval = error;
> >
> > -     if (statep != NULL)
> > -             *statep = state;
> > -     else
> > +     if (!statep)
> >               xfs_da_state_free(state);
> > +
> >       return retval;
> >  }
> >
> > --
>
> Curious -- the conversion of the _node_hasname callers isn't in this
> patch.  Looking at 5.10.124, I see that most of the callers already
> clean up the passed-out statep, but do the callers of xfs_has_attr free
> it too?

Is that a trick question or am I misunderstanding it :)

xfs_has_attr() passes NULL as xfs_attr_node_hasname()
statep argument... they don't get the state back.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 03/11] xfs: Fix the free logic of state in xfs_attr_node_hasname
  2022-06-22 18:46     ` Amir Goldstein
@ 2022-06-22 21:50       ` Darrick J. Wong
  0 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2022-06-22 21:50 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Yang Xu

On Wed, Jun 22, 2022 at 09:46:29PM +0300, Amir Goldstein wrote:
> On Wed, Jun 22, 2022 at 7:32 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Fri, Jun 17, 2022 at 01:06:33PM +0300, Amir Goldstein wrote:
> > > From: Yang Xu <xuyang2018.jy@fujitsu.com>
> > >
> > > commit a1de97fe296c52eafc6590a3506f4bbd44ecb19a upstream.
> > >
> > > When testing xfstests xfs/126 on lastest upstream kernel, it will hang on some machine.
> > > Adding a getxattr operation after xattr corrupted, I can reproduce it 100%.
> > >
> > > The deadlock as below:
> > > [983.923403] task:setfattr        state:D stack:    0 pid:17639 ppid: 14687 flags:0x00000080
> > > [  983.923405] Call Trace:
> > > [  983.923410]  __schedule+0x2c4/0x700
> > > [  983.923412]  schedule+0x37/0xa0
> > > [  983.923414]  schedule_timeout+0x274/0x300
> > > [  983.923416]  __down+0x9b/0xf0
> > > [  983.923451]  ? xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
> > > [  983.923453]  down+0x3b/0x50
> > > [  983.923471]  xfs_buf_lock+0x33/0xf0 [xfs]
> > > [  983.923490]  xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
> > > [  983.923508]  xfs_buf_get_map+0x4c/0x320 [xfs]
> > > [  983.923525]  xfs_buf_read_map+0x53/0x310 [xfs]
> > > [  983.923541]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
> > > [  983.923560]  xfs_trans_read_buf_map+0x1cf/0x360 [xfs]
> > > [  983.923575]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
> > > [  983.923590]  xfs_da_read_buf+0xcf/0x120 [xfs]
> > > [  983.923606]  xfs_da3_node_read+0x1f/0x40 [xfs]
> > > [  983.923621]  xfs_da3_node_lookup_int+0x69/0x4a0 [xfs]
> > > [  983.923624]  ? kmem_cache_alloc+0x12e/0x270
> > > [  983.923637]  xfs_attr_node_hasname+0x6e/0xa0 [xfs]
> > > [  983.923651]  xfs_has_attr+0x6e/0xd0 [xfs]
> > > [  983.923664]  xfs_attr_set+0x273/0x320 [xfs]
> > > [  983.923683]  xfs_xattr_set+0x87/0xd0 [xfs]
> > > [  983.923686]  __vfs_removexattr+0x4d/0x60
> > > [  983.923688]  __vfs_removexattr_locked+0xac/0x130
> > > [  983.923689]  vfs_removexattr+0x4e/0xf0
> > > [  983.923690]  removexattr+0x4d/0x80
> > > [  983.923693]  ? __check_object_size+0xa8/0x16b
> > > [  983.923695]  ? strncpy_from_user+0x47/0x1a0
> > > [  983.923696]  ? getname_flags+0x6a/0x1e0
> > > [  983.923697]  ? _cond_resched+0x15/0x30
> > > [  983.923699]  ? __sb_start_write+0x1e/0x70
> > > [  983.923700]  ? mnt_want_write+0x28/0x50
> > > [  983.923701]  path_removexattr+0x9b/0xb0
> > > [  983.923702]  __x64_sys_removexattr+0x17/0x20
> > > [  983.923704]  do_syscall_64+0x5b/0x1a0
> > > [  983.923705]  entry_SYSCALL_64_after_hwframe+0x65/0xca
> > > [  983.923707] RIP: 0033:0x7f080f10ee1b
> > >
> > > When getxattr calls xfs_attr_node_get function, xfs_da3_node_lookup_int fails with EFSCORRUPTED in
> > > xfs_attr_node_hasname because we have use blocktrash to random it in xfs/126. So it
> > > free state in internal and xfs_attr_node_get doesn't do xfs_buf_trans release job.
> > >
> > > Then subsequent removexattr will hang because of it.
> > >
> > > This bug was introduced by kernel commit 07120f1abdff ("xfs: Add xfs_has_attr and subroutines").
> > > It adds xfs_attr_node_hasname helper and said caller will be responsible for freeing the state
> > > in this case. But xfs_attr_node_hasname will free state itself instead of caller if
> > > xfs_da3_node_lookup_int fails.
> > >
> > > Fix this bug by moving the step of free state into caller.
> > >
> > > [amir: this text from original commit is not relevant for 5.10 backport:
> > > Also, use "goto error/out" instead of returning error directly in xfs_attr_node_addname_find_attr and
> > > xfs_attr_node_removename_setup function because we should free state ourselves.
> > > ]
> > >
> > > Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
> > > Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com>
> > > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_attr.c | 13 +++++--------
> > >  1 file changed, 5 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> > > index 96ac7e562b87..fcca36bbd997 100644
> > > --- a/fs/xfs/libxfs/xfs_attr.c
> > > +++ b/fs/xfs/libxfs/xfs_attr.c
> > > @@ -876,21 +876,18 @@ xfs_attr_node_hasname(
> > >
> > >       state = xfs_da_state_alloc(args);
> > >       if (statep != NULL)
> > > -             *statep = NULL;
> > > +             *statep = state;
> > >
> > >       /*
> > >        * Search to see if name exists, and get back a pointer to it.
> > >        */
> > >       error = xfs_da3_node_lookup_int(state, &retval);
> > > -     if (error) {
> > > -             xfs_da_state_free(state);
> > > -             return error;
> > > -     }
> > > +     if (error)
> > > +             retval = error;
> > >
> > > -     if (statep != NULL)
> > > -             *statep = state;
> > > -     else
> > > +     if (!statep)
> > >               xfs_da_state_free(state);
> > > +
> > >       return retval;
> > >  }
> > >
> > > --
> >
> > Curious -- the conversion of the _node_hasname callers isn't in this
> > patch.  Looking at 5.10.124, I see that most of the callers already
> > clean up the passed-out statep, but do the callers of xfs_has_attr free
> > it too?
> 
> Is that a trick question or am I misunderstanding it :)
> 
> xfs_has_attr() passes NULL as xfs_attr_node_hasname()
> statep argument... they don't get the state back.

Nope, just misreading the code.  I guess this looks fine.

--D

> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes
  2022-06-22 18:36     ` Amir Goldstein
@ 2022-06-22 22:17       ` Leah Rumancik
  2022-06-23  4:22         ` Amir Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Leah Rumancik @ 2022-06-22 22:17 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Darrick J. Wong, Luis Chamberlain, Dave Chinner,
	Christoph Hellwig, Christian Brauner, linux-xfs, fstests,
	Dave Chinner, Christian Brauner

On Wed, Jun 22, 2022 at 09:36:53PM +0300, Amir Goldstein wrote:
> On Wed, Jun 22, 2022 at 7:41 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Fri, Jun 17, 2022 at 01:06:41PM +0300, Amir Goldstein wrote:
> > > From: "Darrick J. Wong" <djwong@kernel.org>
> > >
> > > commit e014f37db1a2d109afa750042ac4d69cf3e3d88e upstream.
> > >
> > > [remove userns argument of setattr_copy() for backport]
> > >
> > > Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid
> > > revocation isn't consistent with btrfs[1] or ext4.  Those two
> > > filesystems use the VFS function setattr_copy to convey certain
> > > attributes from struct iattr into the VFS inode structure.
> > >
> > > Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to
> > > decide if it should clear setgid and setuid on a file attribute update.
> > > This is a second symptom of the problem that Filipe noticed.
> > >
> > > XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode,
> > > xfs_setattr_nonsize, and xfs_setattr_time.  Regrettably, setattr_copy is
> > > /not/ a simple copy function; it contains additional logic to clear the
> > > setgid bit when setting the mode, and XFS' version no longer matches.
> > >
> > > The VFS implements its own setuid/setgid stripping logic, which
> > > establishes consistent behavior.  It's a tad unfortunate that it's
> > > scattered across notify_change, should_remove_suid, and setattr_copy but
> > > XFS should really follow the Linux VFS.  Adapt XFS to use the VFS
> > > functions and get rid of the old functions.
> > >
> > > [1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y2KCV25Q@mail.gmail.com/
> > > [2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@virtuozzo.com/
> > >
> > > Fixes: 7fa294c8991c ("userns: Allow chown and setgid preservation")
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > Reviewed-by: Dave Chinner <dchinner@redhat.com>
> > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > Reviewed-by: Christian Brauner <brauner@kernel.org>
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >
> > Same question as I posted to Leah's series -- have all the necessary VFS
> > fixes and whatnot been backported to 5.10?  Such that all the new sgid
> > inheritance tests actually pass with this patch applied? :)
> 
> The only patch I backorted to 5.10 is:
> xfs: fix up non-directory creation in SGID directories
> 
> I will check which SGID tests ran on my series.
> 
> Personally, I would rather defer THIS patch to a later post to stable
> (Leah's patch as well) until we have a better understanding of the state
> of SGID issues.
> 
> Thanks,
> Amir.

On the latest version of the SGID tests, I see failures of
generic/68[3-7] and xfs/019 on both the baseline and backports branch.
generic/673 fails on most configs for the baseline but seems to be fixed
in the backports branch. Regardless, I am fine dropping this patch for
this round.

Best,
Leah

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount
  2022-06-22 16:54     ` Amir Goldstein
@ 2022-06-22 23:42       ` Darrick J. Wong
  2022-06-23  6:38         ` Amir Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2022-06-22 23:42 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner

On Wed, Jun 22, 2022 at 07:54:33PM +0300, Amir Goldstein wrote:
> On Wed, Jun 22, 2022 at 7:38 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Fri, Jun 17, 2022 at 01:06:39PM +0300, Amir Goldstein wrote:
> > > From: "Darrick J. Wong" <djwong@kernel.org>
> > >
> > > commit b97cca3ba9098522e5a1c3388764ead42640c1a5 upstream.
> > >
> > > In commit 02b9984d6408, we pushed a sync_filesystem() call from the VFS
> > > into xfs_fs_remount.  The only time that we ever need to push dirty file
> > > data or metadata to disk for a remount is if we're remounting the
> > > filesystem read only, so this really could be moved to xfs_remount_ro.
> > >
> > > Once we've moved the call site, actually check the return value from
> > > sync_filesystem.
> 
> This part is not really relevant for this backport, do you want me to
> emphasise that?

Not relevant?  Making sync_fs return error codes to callers was the
entire reason for creating this series...

> > >
> > > Fixes: 02b9984d6408 ("fs: push sync_filesystem() down to the file system's remount_fs()")
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > Reviewed-by: Dave Chinner <dchinner@redhat.com>
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > ---
> > >  fs/xfs/xfs_super.c | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > > index 6323974d6b3e..dd0439ae6732 100644
> > > --- a/fs/xfs/xfs_super.c
> > > +++ b/fs/xfs/xfs_super.c
> > > @@ -1716,6 +1716,11 @@ xfs_remount_ro(
> > >       };
> > >       int                     error;
> > >
> > > +     /* Flush all the dirty data to disk. */
> > > +     error = sync_filesystem(mp->m_super);
> >
> > Looking at 5.10.124's fsync.c and xfs_super.c:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/sync.c?h=v5.10.124#n31
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/xfs/xfs_super.c?h=v5.10.124#n755
> >
> > I think this kernel needs the patch(es) that make __sync_filesystem return
> > the errors passed back by ->sync_fs, and I think also the patch that
> > makes xfs_fs_sync_fs return errors encountered by xfs_log_force, right?
> 
> It wasn't my intention to fix syncfs() does not return errors in 5.10.
> It has always been that way and IIRC, the relevant patches did not
> apply cleanly.

...because right now userspace can call syncfs() on a filesystem that
dies in the process, and the VFS eats the EIO and returns 0 to
userspace.  Yes, that's the historical behavior fo 5.10, but that's a
serious problem that needs addressing.  Eliding the sync_filesystem call
during a rw remount is not itself all that exciting.

> THIS patch however, fixes something else, not only the return of the error
> to its caller, so I thought it was worth backporting.

Assuming "something else" means "moving the sync_filesystem callsite" --
that was a secondary piece that I did to get the requisite RVB tag under
time pressure after 5.17-rc6 dropped.

> If you think otherwise, I'll drop it.

On the contrary, I think the ->sync_fs fixes *also* need backporting.
It should be as simple as patching __sync_filesystem:

static int __sync_filesystem(struct super_block *sb, int wait)
{
	if (wait)
		sync_inodes_sb(sb);
	else
		writeback_inodes_sb(sb, WB_REASON_SYNC);

	if (sb->s_op->sync_fs) {
		int ret = sb->s_op->sync_fs(sb, wait);
		if (ret)
			return ret;
	}
	return __sync_blockdev(sb->s_bdev, wait);
}

Granted, that can be a part of the next batch.  If you plan to pick up
the vfs sync_fs changes then I guess this one's ok for inclusion now.

--D

> 
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+)
  2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
                   ` (10 preceding siblings ...)
  2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes Amir Goldstein
@ 2022-06-22 23:45 ` Darrick J. Wong
  2022-06-23  7:33   ` Amir Goldstein
  11 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2022-06-22 23:45 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests

On Fri, Jun 17, 2022 at 01:06:30PM +0300, Amir Goldstein wrote:
> Hi all,
> 
> Previously posted candidates for 5.10.y followed chronological release
> order.
> 
> Parts 1 and 2 of fixes from v5.10..v5.12 have already been applied to
> v5.10.121.
> 
> Part 3 (from 5.13) has already been posted for review [3] on June 6,
> but following feedback from Dave, I changed my focus to get the same
> set of patches tested and reviewed for 5.10.y/5.15.y.
> 
> I do want to ask you guys to also find time to review part 3, because
> we have a lot of catching up to do for 5.10.y, so we need to chew at
> this debt at a reasonable rate.
> 
> This post has the matching set of patches for 5.10.y that goes with
> Leah's first set of candidates for 5.15.y [1].
> 
> Most of the fixes are from v5.15..v5.17 except for patch 11 (v5.18-rc1).
> All fix patches have been tagged with Fixes: by the author.
> 
> The patches have been soaking in kdepops since Sunday. They passed more
> than 30 auto group runs with several different versions of xfsprogs.
> 
> The differences from Leah's 5.15.y:
> - It is 11 patches and not 8 because of dependencies
> - Patches 6,7 are non-fixes backported as dependency to patch 8 -
>   they have "backported .* for dependency" in their commit message
> - Patches 3,4,11 needed changes to apply to 5.10.y - they have a
>   "backport" related comment in their commit message to explain what
>   changes were needed
> - Patch 10 is a fix from v5.12 that is re-posted as a dependency for
>   patch 11
> 
> Darrick,
> 
> As the author patches 4,11 and sole reviewer of patch 3 (a.k.a
> the non-cleanly applied patches), please take a closer look at those.
> 
> Patch 10 has been dropped from my part 2 candidates following concerns
> raised by Dave and is now being re-posted following feedback from
> Christian and Christoph [2].
> 
> If there are still concerns about patches 10 or 11, please raise a flag.
> I can drop either of these patches before posting to stable if anyone
> feels that they need more time to soak in master.

At the current moment (keep in mind that I have 2,978 more emails to get
through before I'm caught up), I think it's safe to say that for patches
1-5:

Acked-by: Darrick J. Wong <djwong@kernel.org>

(patch 9 also, but see the reply I just sent for that one about grabbing
the sync_fs fixes too)

The log changes are going to take more time to go through, since that
stuff is always tricky and /not/ something for me to be messing with at
4:45pm.

--D

> Thanks,
> Amir.
> 
> [1] https://lore.kernel.org/linux-xfs/20220616182749.1200971-1-leah.rumancik@gmail.com/
> [2] https://lore.kernel.org/linux-xfs/CAOQ4uxg4=m9zEFbDAKXx7CP7HYiMwtsYSJvq076oKpy-OhK1uw@mail.gmail.com/
> [3] https://lore.kernel.org/linux-xfs/20220606160537.689915-1-amir73il@gmail.com/
> 
> Brian Foster (1):
>   xfs: punch out data fork delalloc blocks on COW writeback failure
> 
> Christoph Hellwig (2):
>   xfs: refactor xfs_file_fsync
>   xfs: fix up non-directory creation in SGID directories
> 
> Darrick J. Wong (4):
>   xfs: remove all COW fork extents when remounting readonly
>   xfs: prevent UAF in xfs_log_item_in_current_chkpt
>   xfs: only bother with sync_filesystem during readonly remount
>   xfs: use setattr_copy to set vfs inode attributes
> 
> Dave Chinner (2):
>   xfs: check sb_meta_uuid for dabuf buffer recovery
>   xfs: xfs_log_force_lsn isn't passed a LSN
> 
> Rustam Kovhaev (1):
>   xfs: use kmem_cache_free() for kmem_cache objects
> 
> Yang Xu (1):
>   xfs: Fix the free logic of state in xfs_attr_node_hasname
> 
>  fs/xfs/libxfs/xfs_attr.c      | 13 +++---
>  fs/xfs/libxfs/xfs_types.h     |  1 +
>  fs/xfs/xfs_aops.c             | 15 +++++--
>  fs/xfs/xfs_buf_item.c         |  2 +-
>  fs/xfs/xfs_buf_item_recover.c |  2 +-
>  fs/xfs/xfs_dquot_item.c       |  2 +-
>  fs/xfs/xfs_extfree_item.c     |  6 +--
>  fs/xfs/xfs_file.c             | 81 +++++++++++++++++++++--------------
>  fs/xfs/xfs_inode.c            | 24 +++++------
>  fs/xfs/xfs_inode_item.c       |  4 +-
>  fs/xfs/xfs_inode_item.h       |  2 +-
>  fs/xfs/xfs_iops.c             | 56 ++----------------------
>  fs/xfs/xfs_log.c              | 27 ++++++------
>  fs/xfs/xfs_log.h              |  4 +-
>  fs/xfs/xfs_log_cil.c          | 32 ++++++--------
>  fs/xfs/xfs_log_priv.h         | 15 +++----
>  fs/xfs/xfs_pnfs.c             |  3 +-
>  fs/xfs/xfs_super.c            | 21 ++++++---
>  fs/xfs/xfs_trans.c            |  6 +--
>  fs/xfs/xfs_trans.h            |  4 +-
>  20 files changed, 149 insertions(+), 171 deletions(-)
> 
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes
  2022-06-22 22:17       ` Leah Rumancik
@ 2022-06-23  4:22         ` Amir Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-23  4:22 UTC (permalink / raw)
  To: Leah Rumancik
  Cc: Darrick J. Wong, Luis Chamberlain, Dave Chinner,
	Christoph Hellwig, Christian Brauner, linux-xfs, fstests,
	Dave Chinner, Christian Brauner

On Thu, Jun 23, 2022 at 1:17 AM Leah Rumancik <leah.rumancik@gmail.com> wrote:
>
> On Wed, Jun 22, 2022 at 09:36:53PM +0300, Amir Goldstein wrote:
> > On Wed, Jun 22, 2022 at 7:41 PM Darrick J. Wong <djwong@kernel.org> wrote:
> > >
> > > On Fri, Jun 17, 2022 at 01:06:41PM +0300, Amir Goldstein wrote:
> > > > From: "Darrick J. Wong" <djwong@kernel.org>
> > > >
> > > > commit e014f37db1a2d109afa750042ac4d69cf3e3d88e upstream.
> > > >
> > > > [remove userns argument of setattr_copy() for backport]
> > > >
> > > > Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid
> > > > revocation isn't consistent with btrfs[1] or ext4.  Those two
> > > > filesystems use the VFS function setattr_copy to convey certain
> > > > attributes from struct iattr into the VFS inode structure.
> > > >
> > > > Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to
> > > > decide if it should clear setgid and setuid on a file attribute update.
> > > > This is a second symptom of the problem that Filipe noticed.
> > > >
> > > > XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode,
> > > > xfs_setattr_nonsize, and xfs_setattr_time.  Regrettably, setattr_copy is
> > > > /not/ a simple copy function; it contains additional logic to clear the
> > > > setgid bit when setting the mode, and XFS' version no longer matches.
> > > >
> > > > The VFS implements its own setuid/setgid stripping logic, which
> > > > establishes consistent behavior.  It's a tad unfortunate that it's
> > > > scattered across notify_change, should_remove_suid, and setattr_copy but
> > > > XFS should really follow the Linux VFS.  Adapt XFS to use the VFS
> > > > functions and get rid of the old functions.
> > > >
> > > > [1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y2KCV25Q@mail.gmail.com/
> > > > [2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@virtuozzo.com/
> > > >
> > > > Fixes: 7fa294c8991c ("userns: Allow chown and setgid preservation")
> > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > Reviewed-by: Dave Chinner <dchinner@redhat.com>
> > > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > > Reviewed-by: Christian Brauner <brauner@kernel.org>
> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > >
> > > Same question as I posted to Leah's series -- have all the necessary VFS
> > > fixes and whatnot been backported to 5.10?  Such that all the new sgid
> > > inheritance tests actually pass with this patch applied? :)
> >
> > The only patch I backorted to 5.10 is:
> > xfs: fix up non-directory creation in SGID directories
> >
> > I will check which SGID tests ran on my series.
> >
> > Personally, I would rather defer THIS patch to a later post to stable
> > (Leah's patch as well) until we have a better understanding of the state
> > of SGID issues.
> >
> > Thanks,
> > Amir.
>
> On the latest version of the SGID tests, I see failures of
> generic/68[3-7] and xfs/019 on both the baseline and backports branch.
> generic/673 fails on most configs for the baseline but seems to be fixed
> in the backports branch. Regardless, I am fine dropping this patch for
> this round.

Let's do that then.
I actually didn't plan to post this patch for this round to begin with.
I only posted it because you did.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount
  2022-06-22 23:42       ` Darrick J. Wong
@ 2022-06-23  6:38         ` Amir Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-06-23  6:38 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests, Dave Chinner

On Thu, Jun 23, 2022 at 2:42 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Wed, Jun 22, 2022 at 07:54:33PM +0300, Amir Goldstein wrote:
> > On Wed, Jun 22, 2022 at 7:38 PM Darrick J. Wong <djwong@kernel.org> wrote:
> > >
> > > On Fri, Jun 17, 2022 at 01:06:39PM +0300, Amir Goldstein wrote:
> > > > From: "Darrick J. Wong" <djwong@kernel.org>
> > > >
> > > > commit b97cca3ba9098522e5a1c3388764ead42640c1a5 upstream.
> > > >
> > > > In commit 02b9984d6408, we pushed a sync_filesystem() call from the VFS
> > > > into xfs_fs_remount.  The only time that we ever need to push dirty file
> > > > data or metadata to disk for a remount is if we're remounting the
> > > > filesystem read only, so this really could be moved to xfs_remount_ro.
> > > >
> > > > Once we've moved the call site, actually check the return value from
> > > > sync_filesystem.
> >
> > This part is not really relevant for this backport, do you want me to
> > emphasise that?
>
> Not relevant?  Making sync_fs return error codes to callers was the
> entire reason for creating this series...
>
> > > >
> > > > Fixes: 02b9984d6408 ("fs: push sync_filesystem() down to the file system's remount_fs()")
> > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > Reviewed-by: Dave Chinner <dchinner@redhat.com>
> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > ---
> > > >  fs/xfs/xfs_super.c | 7 +++++--
> > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > > > index 6323974d6b3e..dd0439ae6732 100644
> > > > --- a/fs/xfs/xfs_super.c
> > > > +++ b/fs/xfs/xfs_super.c
> > > > @@ -1716,6 +1716,11 @@ xfs_remount_ro(
> > > >       };
> > > >       int                     error;
> > > >
> > > > +     /* Flush all the dirty data to disk. */
> > > > +     error = sync_filesystem(mp->m_super);
> > >
> > > Looking at 5.10.124's fsync.c and xfs_super.c:
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/sync.c?h=v5.10.124#n31
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/xfs/xfs_super.c?h=v5.10.124#n755
> > >
> > > I think this kernel needs the patch(es) that make __sync_filesystem return
> > > the errors passed back by ->sync_fs, and I think also the patch that
> > > makes xfs_fs_sync_fs return errors encountered by xfs_log_force, right?
> >
> > It wasn't my intention to fix syncfs() does not return errors in 5.10.
> > It has always been that way and IIRC, the relevant patches did not
> > apply cleanly.
>
> ...because right now userspace can call syncfs() on a filesystem that
> dies in the process, and the VFS eats the EIO and returns 0 to
> userspace.  Yes, that's the historical behavior fo 5.10, but that's a
> serious problem that needs addressing.  Eliding the sync_filesystem call
> during a rw remount is not itself all that exciting.

Yes, this is a just cause.

>
> > THIS patch however, fixes something else, not only the return of the error
> > to its caller, so I thought it was worth backporting.
>
> Assuming "something else" means "moving the sync_filesystem callsite" --
> that was a secondary piece that I did to get the requisite RVB tag under
> time pressure after 5.17-rc6 dropped.

Right. looking back at my notes:
https://github.com/amir73il/b4/commit/fddd6d961c029441d2f0e9dd86d0f83b754d0c47

I didn't pick this patch at all because of the missing dependencies.
I must have picked it up later from Laeh's series, because
I thought it was harmless to apply the "something else".

>
> > If you think otherwise, I'll drop it.
>
> On the contrary, I think the ->sync_fs fixes *also* need backporting.
> It should be as simple as patching __sync_filesystem:
>
> static int __sync_filesystem(struct super_block *sb, int wait)
> {
>         if (wait)
>                 sync_inodes_sb(sb);
>         else
>                 writeback_inodes_sb(sb, WB_REASON_SYNC);
>
>         if (sb->s_op->sync_fs) {
>                 int ret = sb->s_op->sync_fs(sb, wait);
>                 if (ret)
>                         return ret;
>         }
>         return __sync_blockdev(sb->s_bdev, wait);
> }
>
> Granted, that can be a part of the next batch.  If you plan to pick up
> the vfs sync_fs changes then I guess this one's ok for inclusion now.

I picked up the vfs fixes, it was pretty simple as you say.
I even backported "fs: remove __sync_filesystem" for dependency
to make the backport more straightforward.

But that adds 3 more more patches to the series and needs to go
back to testing, so I will just drop patch 9 from this batch.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+)
  2022-06-22 23:45 ` [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Darrick J. Wong
@ 2022-06-23  7:33   ` Amir Goldstein
  2022-06-23 16:05     ` Darrick J. Wong
  0 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-06-23  7:33 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests

On Thu, Jun 23, 2022 at 2:45 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Fri, Jun 17, 2022 at 01:06:30PM +0300, Amir Goldstein wrote:
> > Hi all,
> >
> > Previously posted candidates for 5.10.y followed chronological release
> > order.
> >
> > Parts 1 and 2 of fixes from v5.10..v5.12 have already been applied to
> > v5.10.121.
> >
> > Part 3 (from 5.13) has already been posted for review [3] on June 6,
> > but following feedback from Dave, I changed my focus to get the same
> > set of patches tested and reviewed for 5.10.y/5.15.y.
> >
> > I do want to ask you guys to also find time to review part 3, because
> > we have a lot of catching up to do for 5.10.y, so we need to chew at
> > this debt at a reasonable rate.
> >
> > This post has the matching set of patches for 5.10.y that goes with
> > Leah's first set of candidates for 5.15.y [1].
> >
> > Most of the fixes are from v5.15..v5.17 except for patch 11 (v5.18-rc1).
> > All fix patches have been tagged with Fixes: by the author.
> >
> > The patches have been soaking in kdepops since Sunday. They passed more
> > than 30 auto group runs with several different versions of xfsprogs.
> >
> > The differences from Leah's 5.15.y:
> > - It is 11 patches and not 8 because of dependencies
> > - Patches 6,7 are non-fixes backported as dependency to patch 8 -
> >   they have "backported .* for dependency" in their commit message
> > - Patches 3,4,11 needed changes to apply to 5.10.y - they have a
> >   "backport" related comment in their commit message to explain what
> >   changes were needed
> > - Patch 10 is a fix from v5.12 that is re-posted as a dependency for
> >   patch 11
> >
> > Darrick,
> >
> > As the author patches 4,11 and sole reviewer of patch 3 (a.k.a
> > the non-cleanly applied patches), please take a closer look at those.
> >
> > Patch 10 has been dropped from my part 2 candidates following concerns
> > raised by Dave and is now being re-posted following feedback from
> > Christian and Christoph [2].
> >
> > If there are still concerns about patches 10 or 11, please raise a flag.
> > I can drop either of these patches before posting to stable if anyone
> > feels that they need more time to soak in master.
>
> At the current moment (keep in mind that I have 2,978 more emails to get

Oh boy! Thank you for getting to my series so soon.

> through before I'm caught up), I think it's safe to say that for patches
> 1-5:
>
> Acked-by: Darrick J. Wong <djwong@kernel.org>
>
> (patch 9 also, but see the reply I just sent for that one about grabbing
> the sync_fs fixes too)
>
> The log changes are going to take more time to go through, since that
> stuff is always tricky and /not/ something for me to be messing with at
> 4:45pm.

Let's make it easier for you then.
I already decided to defer patches 9-11.

Since you already started looking at patches 6-8, if you want to finish
that review let me know and I will wait, but if you prefer, I can also defer
the log changes 6-8 and post them along with the other log fixes from 5.14.
That means that I have a 5 patch series ACKed and ready to go to stable.

Let me know what you prefer.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+)
  2022-06-23  7:33   ` Amir Goldstein
@ 2022-06-23 16:05     ` Darrick J. Wong
  2022-07-24  8:36       ` Amir Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2022-06-23 16:05 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests

On Thu, Jun 23, 2022 at 10:33:47AM +0300, Amir Goldstein wrote:
> On Thu, Jun 23, 2022 at 2:45 AM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Fri, Jun 17, 2022 at 01:06:30PM +0300, Amir Goldstein wrote:
> > > Hi all,
> > >
> > > Previously posted candidates for 5.10.y followed chronological release
> > > order.
> > >
> > > Parts 1 and 2 of fixes from v5.10..v5.12 have already been applied to
> > > v5.10.121.
> > >
> > > Part 3 (from 5.13) has already been posted for review [3] on June 6,
> > > but following feedback from Dave, I changed my focus to get the same
> > > set of patches tested and reviewed for 5.10.y/5.15.y.
> > >
> > > I do want to ask you guys to also find time to review part 3, because
> > > we have a lot of catching up to do for 5.10.y, so we need to chew at
> > > this debt at a reasonable rate.
> > >
> > > This post has the matching set of patches for 5.10.y that goes with
> > > Leah's first set of candidates for 5.15.y [1].
> > >
> > > Most of the fixes are from v5.15..v5.17 except for patch 11 (v5.18-rc1).
> > > All fix patches have been tagged with Fixes: by the author.
> > >
> > > The patches have been soaking in kdepops since Sunday. They passed more
> > > than 30 auto group runs with several different versions of xfsprogs.
> > >
> > > The differences from Leah's 5.15.y:
> > > - It is 11 patches and not 8 because of dependencies
> > > - Patches 6,7 are non-fixes backported as dependency to patch 8 -
> > >   they have "backported .* for dependency" in their commit message
> > > - Patches 3,4,11 needed changes to apply to 5.10.y - they have a
> > >   "backport" related comment in their commit message to explain what
> > >   changes were needed
> > > - Patch 10 is a fix from v5.12 that is re-posted as a dependency for
> > >   patch 11
> > >
> > > Darrick,
> > >
> > > As the author patches 4,11 and sole reviewer of patch 3 (a.k.a
> > > the non-cleanly applied patches), please take a closer look at those.
> > >
> > > Patch 10 has been dropped from my part 2 candidates following concerns
> > > raised by Dave and is now being re-posted following feedback from
> > > Christian and Christoph [2].
> > >
> > > If there are still concerns about patches 10 or 11, please raise a flag.
> > > I can drop either of these patches before posting to stable if anyone
> > > feels that they need more time to soak in master.
> >
> > At the current moment (keep in mind that I have 2,978 more emails to get
> 
> Oh boy! Thank you for getting to my series so soon.
> 
> > through before I'm caught up), I think it's safe to say that for patches
> > 1-5:
> >
> > Acked-by: Darrick J. Wong <djwong@kernel.org>
> >
> > (patch 9 also, but see the reply I just sent for that one about grabbing
> > the sync_fs fixes too)
> >
> > The log changes are going to take more time to go through, since that
> > stuff is always tricky and /not/ something for me to be messing with at
> > 4:45pm.
> 
> Let's make it easier for you then.
> I already decided to defer patches 9-11.
> 
> Since you already started looking at patches 6-8, if you want to finish
> that review let me know and I will wait, but if you prefer, I can also defer
> the log changes 6-8 and post them along with the other log fixes from 5.14.
> That means that I have a 5 patch series ACKed and ready to go to stable.
> 
> Let me know what you prefer.

I wouldn't hold back on sending 1-5 to stable; yesterday was quick
triage of the list traffic to figure out who I could unblock most
rapidly.

--D

> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+)
  2022-06-23 16:05     ` Darrick J. Wong
@ 2022-07-24  8:36       ` Amir Goldstein
  2022-07-26  2:10         ` Darrick J. Wong
  0 siblings, 1 reply; 31+ messages in thread
From: Amir Goldstein @ 2022-07-24  8:36 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests

On Thu, Jun 23, 2022 at 6:05 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Thu, Jun 23, 2022 at 10:33:47AM +0300, Amir Goldstein wrote:
> > On Thu, Jun 23, 2022 at 2:45 AM Darrick J. Wong <djwong@kernel.org> wrote:
> > >
> > > On Fri, Jun 17, 2022 at 01:06:30PM +0300, Amir Goldstein wrote:
> > > > Hi all,
> > > >
> > > > Previously posted candidates for 5.10.y followed chronological release
> > > > order.
> > > >
> > > > Parts 1 and 2 of fixes from v5.10..v5.12 have already been applied to
> > > > v5.10.121.
> > > >
> > > > Part 3 (from 5.13) has already been posted for review [3] on June 6,
> > > > but following feedback from Dave, I changed my focus to get the same
> > > > set of patches tested and reviewed for 5.10.y/5.15.y.
> > > >
> > > > I do want to ask you guys to also find time to review part 3, because
> > > > we have a lot of catching up to do for 5.10.y, so we need to chew at
> > > > this debt at a reasonable rate.
> > > >
> > > > This post has the matching set of patches for 5.10.y that goes with
> > > > Leah's first set of candidates for 5.15.y [1].
> > > >
> > > > Most of the fixes are from v5.15..v5.17 except for patch 11 (v5.18-rc1).
> > > > All fix patches have been tagged with Fixes: by the author.
> > > >
> > > > The patches have been soaking in kdepops since Sunday. They passed more
> > > > than 30 auto group runs with several different versions of xfsprogs.
> > > >
> > > > The differences from Leah's 5.15.y:
> > > > - It is 11 patches and not 8 because of dependencies
> > > > - Patches 6,7 are non-fixes backported as dependency to patch 8 -
> > > >   they have "backported .* for dependency" in their commit message
> > > > - Patches 3,4,11 needed changes to apply to 5.10.y - they have a
> > > >   "backport" related comment in their commit message to explain what
> > > >   changes were needed
> > > > - Patch 10 is a fix from v5.12 that is re-posted as a dependency for
> > > >   patch 11
> > > >
> > > > Darrick,
> > > >
> > > > As the author patches 4,11 and sole reviewer of patch 3 (a.k.a
> > > > the non-cleanly applied patches), please take a closer look at those.
> > > >
> > > > Patch 10 has been dropped from my part 2 candidates following concerns
> > > > raised by Dave and is now being re-posted following feedback from
> > > > Christian and Christoph [2].
> > > >
> > > > If there are still concerns about patches 10 or 11, please raise a flag.
> > > > I can drop either of these patches before posting to stable if anyone
> > > > feels that they need more time to soak in master.
> > >
> > > At the current moment (keep in mind that I have 2,978 more emails to get
> >
> > Oh boy! Thank you for getting to my series so soon.
> >
> > > through before I'm caught up), I think it's safe to say that for patches
> > > 1-5:
> > >
> > > Acked-by: Darrick J. Wong <djwong@kernel.org>
> > >
> > > (patch 9 also, but see the reply I just sent for that one about grabbing
> > > the sync_fs fixes too)
> > >
> > > The log changes are going to take more time to go through, since that
> > > stuff is always tricky and /not/ something for me to be messing with at
> > > 4:45pm.
> >
> > Let's make it easier for you then.
> > I already decided to defer patches 9-11.
> >
> > Since you already started looking at patches 6-8, if you want to finish
> > that review let me know and I will wait, but if you prefer, I can also defer
> > the log changes 6-8 and post them along with the other log fixes from 5.14.

Hi Darrick,

FYI, I started testing the log fixes backports from v5.14 along with
the deferred
patches 6-8 [1] with extra focus on recoveryloop tests.

I know that Leah is also testing another batch of 5.15-only patches, so she
may yet post another 5.15-only series before my 5.10-only series.

In the meanwhile, if you have some spare time due to rc8, please try to
look at the already posted patches 6-8 [2] that were deferred from the original
stable submission per your request.

Thanks,
Amir.

[1] https://github.com/amir73il/linux/commits/xfs-5.10.y-for-review
[2] https://lore.kernel.org/linux-xfs/20220617100641.1653164-1-amir73il@gmail.com/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+)
  2022-07-24  8:36       ` Amir Goldstein
@ 2022-07-26  2:10         ` Darrick J. Wong
  2022-07-26  8:41           ` Amir Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2022-07-26  2:10 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests

On Sun, Jul 24, 2022 at 10:36:15AM +0200, Amir Goldstein wrote:
> On Thu, Jun 23, 2022 at 6:05 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Thu, Jun 23, 2022 at 10:33:47AM +0300, Amir Goldstein wrote:
> > > On Thu, Jun 23, 2022 at 2:45 AM Darrick J. Wong <djwong@kernel.org> wrote:
> > > >
> > > > On Fri, Jun 17, 2022 at 01:06:30PM +0300, Amir Goldstein wrote:
> > > > > Hi all,
> > > > >
> > > > > Previously posted candidates for 5.10.y followed chronological release
> > > > > order.
> > > > >
> > > > > Parts 1 and 2 of fixes from v5.10..v5.12 have already been applied to
> > > > > v5.10.121.
> > > > >
> > > > > Part 3 (from 5.13) has already been posted for review [3] on June 6,
> > > > > but following feedback from Dave, I changed my focus to get the same
> > > > > set of patches tested and reviewed for 5.10.y/5.15.y.
> > > > >
> > > > > I do want to ask you guys to also find time to review part 3, because
> > > > > we have a lot of catching up to do for 5.10.y, so we need to chew at
> > > > > this debt at a reasonable rate.
> > > > >
> > > > > This post has the matching set of patches for 5.10.y that goes with
> > > > > Leah's first set of candidates for 5.15.y [1].
> > > > >
> > > > > Most of the fixes are from v5.15..v5.17 except for patch 11 (v5.18-rc1).
> > > > > All fix patches have been tagged with Fixes: by the author.
> > > > >
> > > > > The patches have been soaking in kdepops since Sunday. They passed more
> > > > > than 30 auto group runs with several different versions of xfsprogs.
> > > > >
> > > > > The differences from Leah's 5.15.y:
> > > > > - It is 11 patches and not 8 because of dependencies
> > > > > - Patches 6,7 are non-fixes backported as dependency to patch 8 -
> > > > >   they have "backported .* for dependency" in their commit message
> > > > > - Patches 3,4,11 needed changes to apply to 5.10.y - they have a
> > > > >   "backport" related comment in their commit message to explain what
> > > > >   changes were needed
> > > > > - Patch 10 is a fix from v5.12 that is re-posted as a dependency for
> > > > >   patch 11
> > > > >
> > > > > Darrick,
> > > > >
> > > > > As the author patches 4,11 and sole reviewer of patch 3 (a.k.a
> > > > > the non-cleanly applied patches), please take a closer look at those.
> > > > >
> > > > > Patch 10 has been dropped from my part 2 candidates following concerns
> > > > > raised by Dave and is now being re-posted following feedback from
> > > > > Christian and Christoph [2].
> > > > >
> > > > > If there are still concerns about patches 10 or 11, please raise a flag.
> > > > > I can drop either of these patches before posting to stable if anyone
> > > > > feels that they need more time to soak in master.
> > > >
> > > > At the current moment (keep in mind that I have 2,978 more emails to get
> > >
> > > Oh boy! Thank you for getting to my series so soon.
> > >
> > > > through before I'm caught up), I think it's safe to say that for patches
> > > > 1-5:
> > > >
> > > > Acked-by: Darrick J. Wong <djwong@kernel.org>
> > > >
> > > > (patch 9 also, but see the reply I just sent for that one about grabbing
> > > > the sync_fs fixes too)
> > > >
> > > > The log changes are going to take more time to go through, since that
> > > > stuff is always tricky and /not/ something for me to be messing with at
> > > > 4:45pm.
> > >
> > > Let's make it easier for you then.
> > > I already decided to defer patches 9-11.
> > >
> > > Since you already started looking at patches 6-8, if you want to finish
> > > that review let me know and I will wait, but if you prefer, I can also defer
> > > the log changes 6-8 and post them along with the other log fixes from 5.14.
> 
> Hi Darrick,
> 
> FYI, I started testing the log fixes backports from v5.14 along with
> the deferred
> patches 6-8 [1] with extra focus on recoveryloop tests.
> 
> I know that Leah is also testing another batch of 5.15-only patches, so she
> may yet post another 5.15-only series before my 5.10-only series.
> 
> In the meanwhile, if you have some spare time due to rc8, please try to
> look at the already posted patches 6-8 [2] that were deferred from the original
> stable submission per your request.

This is pretty difficult request -- while I /think/ the LSN->CSN
conversion for the upper layers in patch 7 is correct, I'm not as
familiar with where 5.10 is right now as I was when that series was
being proposed for upstream.

It /looks/ ok, but were I maintaining the 5.10 tree I'd be a lot more
comfortable if I had in hand all the results from running the long-soak
log recovery tests for a week.

(Which itself might be fairly difficult for 5.10...)

--D

> Thanks,
> Amir.
> 
> [1] https://github.com/amir73il/linux/commits/xfs-5.10.y-for-review
> [2] https://lore.kernel.org/linux-xfs/20220617100641.1653164-1-amir73il@gmail.com/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+)
  2022-07-26  2:10         ` Darrick J. Wong
@ 2022-07-26  8:41           ` Amir Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Amir Goldstein @ 2022-07-26  8:41 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Leah Rumancik, Luis Chamberlain, Dave Chinner, Christoph Hellwig,
	Christian Brauner, linux-xfs, fstests

> > Hi Darrick,
> >
> > FYI, I started testing the log fixes backports from v5.14 along with
> > the deferred
> > patches 6-8 [1] with extra focus on recoveryloop tests.
> >
> > I know that Leah is also testing another batch of 5.15-only patches, so she
> > may yet post another 5.15-only series before my 5.10-only series.
> >
> > In the meanwhile, if you have some spare time due to rc8, please try to
> > look at the already posted patches 6-8 [2] that were deferred from the original
> > stable submission per your request.
>
> This is pretty difficult request -- while I /think/ the LSN->CSN
> conversion for the upper layers in patch 7 is correct, I'm not as
> familiar with where 5.10 is right now as I was when that series was
> being proposed for upstream.
>
> It /looks/ ok, but were I maintaining the 5.10 tree I'd be a lot more
> comfortable if I had in hand all the results from running the long-soak
> log recovery tests for a week.

Sure, I can do that.

>
> (Which itself might be fairly difficult for 5.10...)

That depends on the exact definition of running the long-soak
and log recovery tests for a week.

I did run the 'recoveryloop' group 100 times on baseline vs. backports
and saw no regressions observed.
Also ran 'auto' 10 times and 'soak' 10 times (and counting) with no
regressions so far.

However, some recoveryloop tests are failing in baseline, so running them
"for a week" is meaningless for those tests:

generic/019, generic/475, generic/648: failing often in all config
generic/388: failing often with reflink_1024
generic/388: failing at ~1/50 rate for any config
generic/482: failing often on V4 configs
generic/482: failing at ~1/100 rate for V5 configs
xfs/057: failing at ~1/200 rate for any config

Note that some of those recoverloop tests may be failing due to older
xfs_repair [*].

The following recoveryloop/soak tests have NOT failed so far:
generic,455, generic/457, generic/646
generic/521, generic/522, generic/642

Should I just keep running those for a week?

Another option is to declare those fixes too risky for LTS 5.10.
That is always an option we need to consider.

Anyway, I will post the full v5.14 series for review and continue
running the long soak tests and the passing tests.

Thanks,
Amir.

[*] I should note that the 5.10.y tests are run with xfsprogs 5.10
When I started testing 5.10.y, I used debian/testing with newer xfsprogs
but that led to many issues compared to debain/bullseye with xfsprogs 5.10.
I figured that it is not very likely for a distro to use kernel 5.10
and much newer
xfsprogs (?) so I preferred to test more aligned versions for kernel/xfsprogs
to match the real life use cases.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2022-07-26  8:41 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-17 10:06 [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 01/11] xfs: use kmem_cache_free() for kmem_cache objects Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 02/11] xfs: punch out data fork delalloc blocks on COW writeback failure Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 03/11] xfs: Fix the free logic of state in xfs_attr_node_hasname Amir Goldstein
2022-06-22 16:32   ` Darrick J. Wong
2022-06-22 18:46     ` Amir Goldstein
2022-06-22 21:50       ` Darrick J. Wong
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 04/11] xfs: remove all COW fork extents when remounting readonly Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 05/11] xfs: check sb_meta_uuid for dabuf buffer recovery Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 06/11] xfs: refactor xfs_file_fsync Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 07/11] xfs: xfs_log_force_lsn isn't passed a LSN Amir Goldstein
2022-06-22 16:45   ` Darrick J. Wong
2022-06-22 17:09     ` Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 08/11] xfs: prevent UAF in xfs_log_item_in_current_chkpt Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 09/11] xfs: only bother with sync_filesystem during readonly remount Amir Goldstein
2022-06-22 16:38   ` Darrick J. Wong
2022-06-22 16:54     ` Amir Goldstein
2022-06-22 23:42       ` Darrick J. Wong
2022-06-23  6:38         ` Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 10/11] xfs: fix up non-directory creation in SGID directories Amir Goldstein
2022-06-17 10:06 ` [PATCH 5.10 CANDIDATE 11/11] xfs: use setattr_copy to set vfs inode attributes Amir Goldstein
2022-06-22 16:41   ` Darrick J. Wong
2022-06-22 18:36     ` Amir Goldstein
2022-06-22 22:17       ` Leah Rumancik
2022-06-23  4:22         ` Amir Goldstein
2022-06-22 23:45 ` [PATCH 5.10 CANDIDATE 00/11] xfs stable candidate patches for 5.10.y (v5.15+) Darrick J. Wong
2022-06-23  7:33   ` Amir Goldstein
2022-06-23 16:05     ` Darrick J. Wong
2022-07-24  8:36       ` Amir Goldstein
2022-07-26  2:10         ` Darrick J. Wong
2022-07-26  8:41           ` Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.