linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 5.15 00/28] 5.15.52-rc1 review
@ 2022-06-30 13:46 Greg Kroah-Hartman
  2022-06-30 13:46 ` [PATCH 5.15 01/28] tick/nohz: unexport __init-annotated tick_nohz_full_setup() Greg Kroah-Hartman
                   ` (35 more replies)
  0 siblings, 36 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, slade

This is the start of the stable review cycle for the 5.15.52 release.
There are 28 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 5.15.52-rc1

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: fix not locked access to fixed buf table

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: mscc: ocelot: allow unregistered IP multicast flooding to CPU

Ping-Ke Shih <pkshih@realtek.com>
    rtw88: rtw8821c: enable rfe 6 devices

Guo-Feng Fan <vincent_fann@realtek.com>
    rtw88: 8821c: support RFE type4 wifi NIC

Christian Brauner <brauner@kernel.org>
    fs: account for group membership

Christian Brauner <brauner@kernel.org>
    fs: fix acl translation

Christian Brauner <christian.brauner@ubuntu.com>
    fs: support mapped mounts of mapped filesystems

Christian Brauner <christian.brauner@ubuntu.com>
    fs: add i_user_ns() helper

Christian Brauner <christian.brauner@ubuntu.com>
    fs: port higher-level mapping helpers

Christian Brauner <christian.brauner@ubuntu.com>
    fs: remove unused low-level mapping helpers

Christian Brauner <christian.brauner@ubuntu.com>
    fs: use low-level mapping helpers

Christian Brauner <christian.brauner@ubuntu.com>
    docs: update mapping documentation

Christian Brauner <christian.brauner@ubuntu.com>
    fs: account for filesystem mappings

Christian Brauner <christian.brauner@ubuntu.com>
    fs: tweak fsuidgid_has_mapping()

Christian Brauner <christian.brauner@ubuntu.com>
    fs: move mapping helpers

Christian Brauner <christian.brauner@ubuntu.com>
    fs: add is_idmapped_mnt() helper

Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    powerpc/ftrace: Remove ftrace init tramp once kernel init is complete

Darrick J. Wong <djwong@kernel.org>
    xfs: only bother with sync_filesystem during readonly remount

Darrick J. Wong <djwong@kernel.org>
    xfs: prevent UAF in xfs_log_item_in_current_chkpt

Dave Chinner <dchinner@redhat.com>
    xfs: check sb_meta_uuid for dabuf buffer recovery

Darrick J. Wong <djwong@kernel.org>
    xfs: remove all COW fork extents when remounting readonly

Yang Xu <xuyang2018.jy@fujitsu.com>
    xfs: Fix the free logic of state in xfs_attr_node_hasname

Brian Foster <bfoster@redhat.com>
    xfs: punch out data fork delalloc blocks on COW writeback failure

Rustam Kovhaev <rkovhaev@gmail.com>
    xfs: use kmem_cache_free() for kmem_cache objects

Coly Li <colyli@suse.de>
    bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    x86, kvm: use proper ASM macros for kvm_vcpu_is_preempted

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup()

Masahiro Yamada <masahiroy@kernel.org>
    tick/nohz: unexport __init-annotated tick_nohz_full_setup()


-------------

Diffstat:

 Documentation/filesystems/idmappings.rst      |  72 --------
 Makefile                                      |   4 +-
 arch/powerpc/include/asm/ftrace.h             |   4 +-
 arch/powerpc/kernel/trace/ftrace.c            |  15 +-
 arch/powerpc/mm/mem.c                         |   2 +
 arch/x86/kernel/kvm.c                         |   2 +-
 drivers/clocksource/mmio.c                    |   2 +-
 drivers/clocksource/timer-ixp4xx.c            |  10 +-
 drivers/md/bcache/btree.c                     |   1 +
 drivers/md/bcache/writeback.c                 |   1 +
 drivers/net/ethernet/mscc/ocelot.c            |   8 +-
 drivers/net/wireless/realtek/rtw88/rtw8821c.c |  14 +-
 fs/attr.c                                     |  26 ++-
 fs/cachefiles/bind.c                          |   2 +-
 fs/ecryptfs/main.c                            |   2 +-
 fs/io_uring.c                                 |  28 +--
 fs/ksmbd/smbacl.c                             |  19 +--
 fs/ksmbd/smbacl.h                             |   5 +-
 fs/namespace.c                                |  53 ++++--
 fs/nfsd/export.c                              |   2 +-
 fs/open.c                                     |   8 +-
 fs/overlayfs/super.c                          |   2 +-
 fs/posix_acl.c                                |  27 ++-
 fs/proc_namespace.c                           |   2 +-
 fs/xattr.c                                    |   6 +-
 fs/xfs/libxfs/xfs_attr.c                      |  17 +-
 fs/xfs/xfs_aops.c                             |  15 +-
 fs/xfs/xfs_buf_item_recover.c                 |   2 +-
 fs/xfs/xfs_extfree_item.c                     |   6 +-
 fs/xfs/xfs_inode.c                            |   8 +-
 fs/xfs/xfs_linux.h                            |   1 +
 fs/xfs/xfs_log_cil.c                          |   6 +-
 fs/xfs/xfs_super.c                            |  21 ++-
 fs/xfs/xfs_symlink.c                          |   4 +-
 include/linux/fs.h                            | 141 +++++-----------
 include/linux/mnt_idmapping.h                 | 234 ++++++++++++++++++++++++++
 include/linux/platform_data/timer-ixp4xx.h    |   5 +-
 include/linux/posix_acl_xattr.h               |   4 +
 kernel/time/tick-sched.c                      |   1 -
 security/commoncap.c                          |  15 +-
 40 files changed, 498 insertions(+), 299 deletions(-)



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 01/28] tick/nohz: unexport __init-annotated tick_nohz_full_setup()
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
@ 2022-06-30 13:46 ` Greg Kroah-Hartman
  2022-06-30 13:46 ` [PATCH 5.15 02/28] clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup() Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Linus Torvalds, Masahiro Yamada,
	Paul E. McKenney, Thomas Backlund

From: Masahiro Yamada <masahiroy@kernel.org>

commit 2390095113e98fc52fffe35c5206d30d9efe3f78 upstream.

EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

modpost used to detect it, but it had been broken for a decade.

Commit 28438794aba4 ("modpost: fix section mismatch check for exported
init/exit sections") fixed it so modpost started to warn it again, then
this showed up:

    MODPOST vmlinux.symvers
  WARNING: modpost: vmlinux.o(___ksymtab_gpl+tick_nohz_full_setup+0x0): Section mismatch in reference from the variable __ksymtab_tick_nohz_full_setup to the function .init.text:tick_nohz_full_setup()
  The symbol tick_nohz_full_setup is exported and annotated __init
  Fix this by removing the __init annotation of tick_nohz_full_setup or drop the export.

Drop the export because tick_nohz_full_setup() is only called from the
built-in code in kernel/sched/isolation.c.

Fixes: ae9e557b5be2 ("time: Export tick start/stop functions for rcutorture")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Backlund <tmb@tmb.nu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/time/tick-sched.c |    1 -
 1 file changed, 1 deletion(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -509,7 +509,6 @@ void __init tick_nohz_full_setup(cpumask
 	cpumask_copy(tick_nohz_full_mask, cpumask);
 	tick_nohz_full_running = true;
 }
-EXPORT_SYMBOL_GPL(tick_nohz_full_setup);
 
 static int tick_nohz_cpu_down(unsigned int cpu)
 {



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 02/28] clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup()
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
  2022-06-30 13:46 ` [PATCH 5.15 01/28] tick/nohz: unexport __init-annotated tick_nohz_full_setup() Greg Kroah-Hartman
@ 2022-06-30 13:46 ` Greg Kroah-Hartman
  2022-07-01 15:31   ` Nathan Chancellor
  2022-06-30 13:46 ` [PATCH 5.15 03/28] x86, kvm: use proper ASM macros for kvm_vcpu_is_preempted Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  35 siblings, 1 reply; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Linus Walleij, Daniel Lezcano

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ixp4xx_timer_setup is exported, and so can not be an __init function.
Remove the __init marking as the build system is rightfully claiming
this is an error in older kernels.

This is fixed "properly" in commit 41929c9f628b
("clocksource/drivers/ixp4xx: Drop boardfile probe path") but that can
not be backported to older kernels as the reworking of the IXP4xx
codebase is not suitable for stable releases.

Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clocksource/mmio.c                 |    2 +-
 drivers/clocksource/timer-ixp4xx.c         |   10 ++++------
 include/linux/platform_data/timer-ixp4xx.h |    5 ++---
 3 files changed, 7 insertions(+), 10 deletions(-)

--- a/drivers/clocksource/mmio.c
+++ b/drivers/clocksource/mmio.c
@@ -46,7 +46,7 @@ u64 clocksource_mmio_readw_down(struct c
  * @bits:	Number of valid bits
  * @read:	One of clocksource_mmio_read*() above
  */
-int __init clocksource_mmio_init(void __iomem *base, const char *name,
+int clocksource_mmio_init(void __iomem *base, const char *name,
 	unsigned long hz, int rating, unsigned bits,
 	u64 (*read)(struct clocksource *))
 {
--- a/drivers/clocksource/timer-ixp4xx.c
+++ b/drivers/clocksource/timer-ixp4xx.c
@@ -161,9 +161,8 @@ static int ixp4xx_resume(struct clock_ev
  * We use OS timer1 on the CPU for the timer tick and the timestamp
  * counter as a source of real clock ticks to account for missed jiffies.
  */
-static __init int ixp4xx_timer_register(void __iomem *base,
-					int timer_irq,
-					unsigned int timer_freq)
+static int ixp4xx_timer_register(void __iomem *base, int timer_irq,
+				 unsigned int timer_freq)
 {
 	struct ixp4xx_timer *tmr;
 	int ret;
@@ -269,9 +268,8 @@ builtin_platform_driver(ixp4xx_timer_dri
  * @timer_irq: Linux IRQ number for the timer
  * @timer_freq: Fixed frequency of the timer
  */
-void __init ixp4xx_timer_setup(resource_size_t timerbase,
-			       int timer_irq,
-			       unsigned int timer_freq)
+void ixp4xx_timer_setup(resource_size_t timerbase, int timer_irq,
+			unsigned int timer_freq)
 {
 	void __iomem *base;
 
--- a/include/linux/platform_data/timer-ixp4xx.h
+++ b/include/linux/platform_data/timer-ixp4xx.h
@@ -4,8 +4,7 @@
 
 #include <linux/ioport.h>
 
-void __init ixp4xx_timer_setup(resource_size_t timerbase,
-			       int timer_irq,
-			       unsigned int timer_freq);
+void ixp4xx_timer_setup(resource_size_t timerbase, int timer_irq,
+			unsigned int timer_freq);
 
 #endif



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 03/28] x86, kvm: use proper ASM macros for kvm_vcpu_is_preempted
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
  2022-06-30 13:46 ` [PATCH 5.15 01/28] tick/nohz: unexport __init-annotated tick_nohz_full_setup() Greg Kroah-Hartman
  2022-06-30 13:46 ` [PATCH 5.15 02/28] clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup() Greg Kroah-Hartman
@ 2022-06-30 13:46 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 04/28] bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Paolo Bonzini

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

The build rightfully complains about:
	arch/x86/kernel/kvm.o: warning: objtool: __raw_callee_save___kvm_vcpu_is_preempted()+0x12: missing int3 after ret

because the ASM_RET call is not being used correctly in kvm_vcpu_is_preempted().

This was hand-fixed-up in the kvm merge commit a4cfff3f0f8c ("Merge branch
'kvm-older-features' into HEAD") which of course can not be backported to
stable kernels, so just fix this up directly instead.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/kvm.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -948,7 +948,7 @@ asm(
 "movq	__per_cpu_offset(,%rdi,8), %rax;"
 "cmpb	$0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
 "setne	%al;"
-"ret;"
+ASM_RET
 ".size __raw_callee_save___kvm_vcpu_is_preempted, .-__raw_callee_save___kvm_vcpu_is_preempted;"
 ".popsection");
 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 04/28] bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2022-06-30 13:46 ` [PATCH 5.15 03/28] x86, kvm: use proper ASM macros for kvm_vcpu_is_preempted Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 05/28] xfs: use kmem_cache_free() for kmem_cache objects Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Coly Li, Jens Axboe

From: Coly Li <colyli@suse.de>

commit 7d6b902ea0e02b2a25c480edf471cbaa4ebe6b3c upstream.

The local variables check_state (in bch_btree_check()) and state (in
bch_sectors_dirty_init()) should be fully filled by 0, because before
allocating them on stack, they were dynamically allocated by kzalloc().

Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20220527152818.27545-2-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/md/bcache/btree.c     |    1 +
 drivers/md/bcache/writeback.c |    1 +
 2 files changed, 2 insertions(+)

--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -2017,6 +2017,7 @@ int bch_btree_check(struct cache_set *c)
 	if (c->root->level == 0)
 		return 0;
 
+	memset(&check_state, 0, sizeof(struct btree_check_state));
 	check_state.c = c;
 	check_state.total_threads = bch_btree_chkthread_nr();
 	check_state.key_idx = 0;
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -947,6 +947,7 @@ void bch_sectors_dirty_init(struct bcach
 		return;
 	}
 
+	memset(&state, 0, sizeof(struct bch_dirty_init_state));
 	state.c = c;
 	state.d = d;
 	state.total_threads = bch_btre_dirty_init_thread_nr();



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 05/28] xfs: use kmem_cache_free() for kmem_cache objects
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 04/28] bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 06/28] xfs: punch out data fork delalloc blocks on COW writeback failure Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Rustam Kovhaev, Darrick J. Wong, Leah Rumancik

From: Rustam Kovhaev <rkovhaev@gmail.com>

[ Upstream commit c30a0cbd07ecc0eec7b3cd568f7b1c7bb7913f93 ]

For kmalloc() allocations SLOB prepends the blocks with a 4-byte header,
and it puts the size of the allocated blocks in that header.
Blocks allocated with kmem_cache_alloc() allocations do not have that
header.

SLOB explodes when you allocate memory with kmem_cache_alloc() and then
try to free it with kfree() instead of kmem_cache_free().
SLOB will assume that there is a header when there is none, read some
garbage to size variable and corrupt the adjacent objects, which
eventually leads to hang or panic.

Let's make XFS work with SLOB by using proper free function.

Fixes: 9749fee83f38 ("xfs: enable the xfs_defer mechanism to process extents to free")
Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_extfree_item.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -482,7 +482,7 @@ xfs_extent_free_finish_item(
 			free->xefi_startblock,
 			free->xefi_blockcount,
 			&free->xefi_oinfo, free->xefi_skip_discard);
-	kmem_free(free);
+	kmem_cache_free(xfs_bmap_free_item_zone, free);
 	return error;
 }
 
@@ -502,7 +502,7 @@ xfs_extent_free_cancel_item(
 	struct xfs_extent_free_item	*free;
 
 	free = container_of(item, struct xfs_extent_free_item, xefi_list);
-	kmem_free(free);
+	kmem_cache_free(xfs_bmap_free_item_zone, free);
 }
 
 const struct xfs_defer_op_type xfs_extent_free_defer_type = {
@@ -564,7 +564,7 @@ xfs_agfl_free_finish_item(
 	extp->ext_len = free->xefi_blockcount;
 	efdp->efd_next_extent++;
 
-	kmem_free(free);
+	kmem_cache_free(xfs_bmap_free_item_zone, free);
 	return error;
 }
 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 06/28] xfs: punch out data fork delalloc blocks on COW writeback failure
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 05/28] xfs: use kmem_cache_free() for kmem_cache objects Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 07/28] xfs: Fix the free logic of state in xfs_attr_node_hasname Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Brian Foster, Darrick J. Wong, Leah Rumancik

From: Brian Foster <bfoster@redhat.com>

[ Upstream commit 5ca5916b6bc93577c360c06cb7cdf71adb9b5faf ]

If writeback I/O to a COW extent fails, the COW fork blocks are
punched out and the data fork blocks left alone. It is possible for
COW fork blocks to overlap non-shared data fork blocks (due to
cowextsz hint prealloc), however, and writeback unconditionally maps
to the COW fork whenever blocks exist at the corresponding offset of
the page undergoing writeback. This means it's quite possible for a
COW fork extent to overlap delalloc data fork blocks, writeback to
convert and map to the COW fork blocks, writeback to fail, and
finally for ioend completion to cancel the COW fork blocks and leave
stale data fork delalloc blocks around in the inode. The blocks are
effectively stale because writeback failure also discards dirty page
state.

If this occurs, it is likely to trigger assert failures, free space
accounting corruption and failures in unrelated file operations. For
example, a subsequent reflink attempt of the affected file to a new
target file will trip over the stale delalloc in the source file and
fail. Several of these issues are occasionally reproduced by
generic/648, but are reproducible on demand with the right sequence
of operations and timely I/O error injection.

To fix this problem, update the ioend failure path to also punch out
underlying data fork delalloc blocks on I/O error. This is analogous
to the writeback submission failure path in xfs_discard_page() where
we might fail to map data fork delalloc blocks and consistent with
the successful COW writeback completion path, which is responsible
for unmapping from the data fork and remapping in COW fork blocks.

Fixes: 787eb485509f ("xfs: fix and streamline error handling in xfs_end_io")
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_aops.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -82,6 +82,7 @@ xfs_end_ioend(
 	struct iomap_ioend	*ioend)
 {
 	struct xfs_inode	*ip = XFS_I(ioend->io_inode);
+	struct xfs_mount	*mp = ip->i_mount;
 	xfs_off_t		offset = ioend->io_offset;
 	size_t			size = ioend->io_size;
 	unsigned int		nofs_flag;
@@ -97,18 +98,26 @@ xfs_end_ioend(
 	/*
 	 * Just clean up the in-memory structures if the fs has been shut down.
 	 */
-	if (xfs_is_shutdown(ip->i_mount)) {
+	if (xfs_is_shutdown(mp)) {
 		error = -EIO;
 		goto done;
 	}
 
 	/*
-	 * Clean up any COW blocks on an I/O error.
+	 * Clean up all COW blocks and underlying data fork delalloc blocks on
+	 * I/O error. The delalloc punch is required because this ioend was
+	 * mapped to blocks in the COW fork and the associated pages are no
+	 * longer dirty. If we don't remove delalloc blocks here, they become
+	 * stale and can corrupt free space accounting on unmount.
 	 */
 	error = blk_status_to_errno(ioend->io_bio->bi_status);
 	if (unlikely(error)) {
-		if (ioend->io_flags & IOMAP_F_SHARED)
+		if (ioend->io_flags & IOMAP_F_SHARED) {
 			xfs_reflink_cancel_cow_range(ip, offset, size, true);
+			xfs_bmap_punch_delalloc_range(ip,
+						      XFS_B_TO_FSBT(mp, offset),
+						      XFS_B_TO_FSB(mp, size));
+		}
 		goto done;
 	}
 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 07/28] xfs: Fix the free logic of state in xfs_attr_node_hasname
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 06/28] xfs: punch out data fork delalloc blocks on COW writeback failure Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 08/28] xfs: remove all COW fork extents when remounting readonly Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Yang Xu, Darrick J. Wong, Leah Rumancik

From: Yang Xu <xuyang2018.jy@fujitsu.com>

[ Upstream commit a1de97fe296c52eafc6590a3506f4bbd44ecb19a ]

When testing xfstests xfs/126 on lastest upstream kernel, it will hang on some machine.
Adding a getxattr operation after xattr corrupted, I can reproduce it 100%.

The deadlock as below:
[983.923403] task:setfattr        state:D stack:    0 pid:17639 ppid: 14687 flags:0x00000080
[  983.923405] Call Trace:
[  983.923410]  __schedule+0x2c4/0x700
[  983.923412]  schedule+0x37/0xa0
[  983.923414]  schedule_timeout+0x274/0x300
[  983.923416]  __down+0x9b/0xf0
[  983.923451]  ? xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
[  983.923453]  down+0x3b/0x50
[  983.923471]  xfs_buf_lock+0x33/0xf0 [xfs]
[  983.923490]  xfs_buf_find.isra.29+0x3c8/0x5f0 [xfs]
[  983.923508]  xfs_buf_get_map+0x4c/0x320 [xfs]
[  983.923525]  xfs_buf_read_map+0x53/0x310 [xfs]
[  983.923541]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
[  983.923560]  xfs_trans_read_buf_map+0x1cf/0x360 [xfs]
[  983.923575]  ? xfs_da_read_buf+0xcf/0x120 [xfs]
[  983.923590]  xfs_da_read_buf+0xcf/0x120 [xfs]
[  983.923606]  xfs_da3_node_read+0x1f/0x40 [xfs]
[  983.923621]  xfs_da3_node_lookup_int+0x69/0x4a0 [xfs]
[  983.923624]  ? kmem_cache_alloc+0x12e/0x270
[  983.923637]  xfs_attr_node_hasname+0x6e/0xa0 [xfs]
[  983.923651]  xfs_has_attr+0x6e/0xd0 [xfs]
[  983.923664]  xfs_attr_set+0x273/0x320 [xfs]
[  983.923683]  xfs_xattr_set+0x87/0xd0 [xfs]
[  983.923686]  __vfs_removexattr+0x4d/0x60
[  983.923688]  __vfs_removexattr_locked+0xac/0x130
[  983.923689]  vfs_removexattr+0x4e/0xf0
[  983.923690]  removexattr+0x4d/0x80
[  983.923693]  ? __check_object_size+0xa8/0x16b
[  983.923695]  ? strncpy_from_user+0x47/0x1a0
[  983.923696]  ? getname_flags+0x6a/0x1e0
[  983.923697]  ? _cond_resched+0x15/0x30
[  983.923699]  ? __sb_start_write+0x1e/0x70
[  983.923700]  ? mnt_want_write+0x28/0x50
[  983.923701]  path_removexattr+0x9b/0xb0
[  983.923702]  __x64_sys_removexattr+0x17/0x20
[  983.923704]  do_syscall_64+0x5b/0x1a0
[  983.923705]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[  983.923707] RIP: 0033:0x7f080f10ee1b

When getxattr calls xfs_attr_node_get function, xfs_da3_node_lookup_int fails with EFSCORRUPTED in
xfs_attr_node_hasname because we have use blocktrash to random it in xfs/126. So it
free state in internal and xfs_attr_node_get doesn't do xfs_buf_trans release job.

Then subsequent removexattr will hang because of it.

This bug was introduced by kernel commit 07120f1abdff ("xfs: Add xfs_has_attr and subroutines").
It adds xfs_attr_node_hasname helper and said caller will be responsible for freeing the state
in this case. But xfs_attr_node_hasname will free state itself instead of caller if
xfs_da3_node_lookup_int fails.

Fix this bug by moving the step of free state into caller.

Also, use "goto error/out" instead of returning error directly in xfs_attr_node_addname_find_attr and
xfs_attr_node_removename_setup function because we should free state ourselves.

Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/libxfs/xfs_attr.c |   17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -1077,21 +1077,18 @@ xfs_attr_node_hasname(
 
 	state = xfs_da_state_alloc(args);
 	if (statep != NULL)
-		*statep = NULL;
+		*statep = state;
 
 	/*
 	 * Search to see if name exists, and get back a pointer to it.
 	 */
 	error = xfs_da3_node_lookup_int(state, &retval);
-	if (error) {
-		xfs_da_state_free(state);
-		return error;
-	}
+	if (error)
+		retval = error;
 
-	if (statep != NULL)
-		*statep = state;
-	else
+	if (!statep)
 		xfs_da_state_free(state);
+
 	return retval;
 }
 
@@ -1112,7 +1109,7 @@ xfs_attr_node_addname_find_attr(
 	 */
 	retval = xfs_attr_node_hasname(args, &dac->da_state);
 	if (retval != -ENOATTR && retval != -EEXIST)
-		return retval;
+		goto error;
 
 	if (retval == -ENOATTR && (args->attr_flags & XATTR_REPLACE))
 		goto error;
@@ -1337,7 +1334,7 @@ int xfs_attr_node_removename_setup(
 
 	error = xfs_attr_node_hasname(args, state);
 	if (error != -EEXIST)
-		return error;
+		goto out;
 	error = 0;
 
 	ASSERT((*state)->path.blk[(*state)->path.active - 1].bp != NULL);



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 08/28] xfs: remove all COW fork extents when remounting readonly
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 07/28] xfs: Fix the free logic of state in xfs_attr_node_hasname Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 09/28] xfs: check sb_meta_uuid for dabuf buffer recovery Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Darrick J. Wong, Dave Chinner,
	Chandan Babu R, Leah Rumancik

From: "Darrick J. Wong" <djwong@kernel.org>

[ Upstream commit 089558bc7ba785c03815a49c89e28ad9b8de51f9 ]

As part of multiple customer escalations due to file data corruption
after copy on write operations, I wrote some fstests that use fsstress
to hammer on COW to shake things loose.  Regrettably, I caught some
filesystem shutdowns due to incorrect rmap operations with the following
loop:

mount <filesystem>				# (0)
fsstress <run only readonly ops> &		# (1)
while true; do
	fsstress <run all ops>
	mount -o remount,ro			# (2)
	fsstress <run only readonly ops>
	mount -o remount,rw			# (3)
done

When (2) happens, notice that (1) is still running.  xfs_remount_ro will
call xfs_blockgc_stop to walk the inode cache to free all the COW
extents, but the blockgc mechanism races with (1)'s reader threads to
take IOLOCKs and loses, which means that it doesn't clean them all out.
Call such a file (A).

When (3) happens, xfs_remount_rw calls xfs_reflink_recover_cow, which
walks the ondisk refcount btree and frees any COW extent that it finds.
This function does not check the inode cache, which means that incore
COW forks of inode (A) is now inconsistent with the ondisk metadata.  If
one of those former COW extents are allocated and mapped into another
file (B) and someone triggers a COW to the stale reservation in (A), A's
dirty data will be written into (B) and once that's done, those blocks
will be transferred to (A)'s data fork without bumping the refcount.

The results are catastrophic -- file (B) and the refcount btree are now
corrupt.  Solve this race by forcing the xfs_blockgc_free_space to run
synchronously, which causes xfs_icwalk to return to inodes that were
skipped because the blockgc code couldn't take the IOLOCK.  This is safe
to do here because the VFS has already prohibited new writer threads.

Fixes: 10ddf64e420f ("xfs: remove leftover CoW reservations when remounting ro")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_super.c |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1768,7 +1768,10 @@ static int
 xfs_remount_ro(
 	struct xfs_mount	*mp)
 {
-	int error;
+	struct xfs_icwalk	icw = {
+		.icw_flags	= XFS_ICWALK_FLAG_SYNC,
+	};
+	int			error;
 
 	/*
 	 * Cancel background eofb scanning so it cannot race with the final
@@ -1776,8 +1779,13 @@ xfs_remount_ro(
 	 */
 	xfs_blockgc_stop(mp);
 
-	/* Get rid of any leftover CoW reservations... */
-	error = xfs_blockgc_free_space(mp, NULL);
+	/*
+	 * Clear out all remaining COW staging extents and speculative post-EOF
+	 * preallocations so that we don't leave inodes requiring inactivation
+	 * cleanups during reclaim on a read-only mount.  We must process every
+	 * cached inode, so this requires a synchronous cache scan.
+	 */
+	error = xfs_blockgc_free_space(mp, &icw);
 	if (error) {
 		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
 		return error;



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 09/28] xfs: check sb_meta_uuid for dabuf buffer recovery
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 08/28] xfs: remove all COW fork extents when remounting readonly Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 10/28] xfs: prevent UAF in xfs_log_item_in_current_chkpt Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Dave Chinner, Darrick J. Wong, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit 09654ed8a18cfd45027a67d6cbca45c9ea54feab ]

Got a report that a repeated crash test of a container host would
eventually fail with a log recovery error preventing the system from
mounting the root filesystem. It manifested as a directory leaf node
corruption on writeback like so:

 XFS (loop0): Mounting V5 Filesystem
 XFS (loop0): Starting recovery (logdev: internal)
 XFS (loop0): Metadata corruption detected at xfs_dir3_leaf_check_int+0x99/0xf0, xfs_dir3_leaf1 block 0x12faa158
 XFS (loop0): Unmount and run xfs_repair
 XFS (loop0): First 128 bytes of corrupted metadata buffer:
 00000000: 00 00 00 00 00 00 00 00 3d f1 00 00 e1 9e d5 8b  ........=.......
 00000010: 00 00 00 00 12 fa a1 58 00 00 00 29 00 00 1b cc  .......X...)....
 00000020: 91 06 78 ff f7 7e 4a 7d 8d 53 86 f2 ac 47 a8 23  ..x..~J}.S...G.#
 00000030: 00 00 00 00 17 e0 00 80 00 43 00 00 00 00 00 00  .........C......
 00000040: 00 00 00 2e 00 00 00 08 00 00 17 2e 00 00 00 0a  ................
 00000050: 02 35 79 83 00 00 00 30 04 d3 b4 80 00 00 01 50  .5y....0.......P
 00000060: 08 40 95 7f 00 00 02 98 08 41 fe b7 00 00 02 d4  .@.......A......
 00000070: 0d 62 ef a7 00 00 01 f2 14 50 21 41 00 00 00 0c  .b.......P!A....
 XFS (loop0): Corruption of in-memory data (0x8) detected at xfs_do_force_shutdown+0x1a/0x20 (fs/xfs/xfs_buf.c:1514).  Shutting down.
 XFS (loop0): Please unmount the filesystem and rectify the problem(s)
 XFS (loop0): log mount/recovery failed: error -117
 XFS (loop0): log mount failed

Tracing indicated that we were recovering changes from a transaction
at LSN 0x29/0x1c16 into a buffer that had an LSN of 0x29/0x1d57.
That is, log recovery was overwriting a buffer with newer changes on
disk than was in the transaction. Tracing indicated that we were
hitting the "recovery immediately" case in
xfs_buf_log_recovery_lsn(), and hence it was ignoring the LSN in the
buffer.

The code was extracting the LSN correctly, then ignoring it because
the UUID in the buffer did not match the superblock UUID. The
problem arises because the UUID check uses the wrong UUID - it
should be checking the sb_meta_uuid, not sb_uuid. This filesystem
has sb_uuid != sb_meta_uuid (which is fine), and the buffer has the
correct matching sb_meta_uuid in it, it's just the code checked it
against the wrong superblock uuid.

The is no corruption in the filesystem, and failing to recover the
buffer due to a write verifier failure means the recovery bug did
not propagate the corruption to disk. Hence there is no corruption
before or after this bug has manifested, the impact is limited
simply to an unmountable filesystem....

This was missed back in 2015 during an audit of incorrect sb_uuid
usage that resulted in commit fcfbe2c4ef42 ("xfs: log recovery needs
to validate against sb_meta_uuid") that fixed the magic32 buffers to
validate against sb_meta_uuid instead of sb_uuid. It missed the
magicda buffers....

Fixes: ce748eaa65f2 ("xfs: create new metadata UUID field and incompat flag")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_buf_item_recover.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -816,7 +816,7 @@ xlog_recover_get_buf_lsn(
 	}
 
 	if (lsn != (xfs_lsn_t)-1) {
-		if (!uuid_equal(&mp->m_sb.sb_uuid, uuid))
+		if (!uuid_equal(&mp->m_sb.sb_meta_uuid, uuid))
 			goto recover_immediately;
 		return lsn;
 	}



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 10/28] xfs: prevent UAF in xfs_log_item_in_current_chkpt
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 09/28] xfs: check sb_meta_uuid for dabuf buffer recovery Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 11/28] xfs: only bother with sync_filesystem during readonly remount Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Darrick J. Wong, Dave Chinner, Leah Rumancik

From: "Darrick J. Wong" <djwong@kernel.org>

[ Upstream commit f8d92a66e810acbef6ddbc0bd0cbd9b117ce8acd ]

While I was running with KASAN and lockdep enabled, I stumbled upon an
KASAN report about a UAF to a freed CIL checkpoint.  Looking at the
comment for xfs_log_item_in_current_chkpt, it seems pretty obvious to me
that the original patch to xfs_defer_finish_noroll should have done
something to lock the CIL to prevent it from switching the CIL contexts
while the predicate runs.

For upper level code that needs to know if a given log item is new
enough not to need relogging, add a new wrapper that takes the CIL
context lock long enough to sample the current CIL context.  This is
kind of racy in that the CIL can switch the contexts immediately after
sampling, but that's ok because the consequence is that the defer ops
code is a little slow to relog items.

 ==================================================================
 BUG: KASAN: use-after-free in xfs_log_item_in_current_chkpt+0x139/0x160 [xfs]
 Read of size 8 at addr ffff88804ea5f608 by task fsstress/527999

 CPU: 1 PID: 527999 Comm: fsstress Tainted: G      D      5.16.0-rc4-xfsx #rc4
 Call Trace:
  <TASK>
  dump_stack_lvl+0x45/0x59
  print_address_description.constprop.0+0x1f/0x140
  kasan_report.cold+0x83/0xdf
  xfs_log_item_in_current_chkpt+0x139/0x160
  xfs_defer_finish_noroll+0x3bb/0x1e30
  __xfs_trans_commit+0x6c8/0xcf0
  xfs_reflink_remap_extent+0x66f/0x10e0
  xfs_reflink_remap_blocks+0x2dd/0xa90
  xfs_file_remap_range+0x27b/0xc30
  vfs_dedupe_file_range_one+0x368/0x420
  vfs_dedupe_file_range+0x37c/0x5d0
  do_vfs_ioctl+0x308/0x1260
  __x64_sys_ioctl+0xa1/0x170
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f2c71a2950b
 Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff
ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
 RSP: 002b:00007ffe8c0e03c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 00005600862a8740 RCX: 00007f2c71a2950b
 RDX: 00005600862a7be0 RSI: 00000000c0189436 RDI: 0000000000000004
 RBP: 000000000000000b R08: 0000000000000027 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000005a
 R13: 00005600862804a8 R14: 0000000000016000 R15: 00005600862a8a20
  </TASK>

 Allocated by task 464064:
  kasan_save_stack+0x1e/0x50
  __kasan_kmalloc+0x81/0xa0
  kmem_alloc+0xcd/0x2c0 [xfs]
  xlog_cil_ctx_alloc+0x17/0x1e0 [xfs]
  xlog_cil_push_work+0x141/0x13d0 [xfs]
  process_one_work+0x7f6/0x1380
  worker_thread+0x59d/0x1040
  kthread+0x3b0/0x490
  ret_from_fork+0x1f/0x30

 Freed by task 51:
  kasan_save_stack+0x1e/0x50
  kasan_set_track+0x21/0x30
  kasan_set_free_info+0x20/0x30
  __kasan_slab_free+0xed/0x130
  slab_free_freelist_hook+0x7f/0x160
  kfree+0xde/0x340
  xlog_cil_committed+0xbfd/0xfe0 [xfs]
  xlog_cil_process_committed+0x103/0x1c0 [xfs]
  xlog_state_do_callback+0x45d/0xbd0 [xfs]
  xlog_ioend_work+0x116/0x1c0 [xfs]
  process_one_work+0x7f6/0x1380
  worker_thread+0x59d/0x1040
  kthread+0x3b0/0x490
  ret_from_fork+0x1f/0x30

 Last potentially related work creation:
  kasan_save_stack+0x1e/0x50
  __kasan_record_aux_stack+0xb7/0xc0
  insert_work+0x48/0x2e0
  __queue_work+0x4e7/0xda0
  queue_work_on+0x69/0x80
  xlog_cil_push_now.isra.0+0x16b/0x210 [xfs]
  xlog_cil_force_seq+0x1b7/0x850 [xfs]
  xfs_log_force_seq+0x1c7/0x670 [xfs]
  xfs_file_fsync+0x7c1/0xa60 [xfs]
  __x64_sys_fsync+0x52/0x80
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

 The buggy address belongs to the object at ffff88804ea5f600
  which belongs to the cache kmalloc-256 of size 256
 The buggy address is located 8 bytes inside of
  256-byte region [ffff88804ea5f600, ffff88804ea5f700)
 The buggy address belongs to the page:
 page:ffffea00013a9780 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88804ea5ea00 pfn:0x4ea5e
 head:ffffea00013a9780 order:1 compound_mapcount:0
 flags: 0x4fff80000010200(slab|head|node=1|zone=1|lastcpupid=0xfff)
 raw: 04fff80000010200 ffffea0001245908 ffffea00011bd388 ffff888004c42b40
 raw: ffff88804ea5ea00 0000000000100009 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff88804ea5f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff88804ea5f580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 >ffff88804ea5f600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                       ^
  ffff88804ea5f680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff88804ea5f700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ==================================================================

Fixes: 4e919af7827a ("xfs: periodically relog deferred intent items")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_log_cil.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -1442,9 +1442,9 @@ out_shutdown:
  */
 bool
 xfs_log_item_in_current_chkpt(
-	struct xfs_log_item *lip)
+	struct xfs_log_item	*lip)
 {
-	struct xfs_cil_ctx *ctx = lip->li_mountp->m_log->l_cilp->xc_ctx;
+	struct xfs_cil		*cil = lip->li_mountp->m_log->l_cilp;
 
 	if (list_empty(&lip->li_cil))
 		return false;
@@ -1454,7 +1454,7 @@ xfs_log_item_in_current_chkpt(
 	 * first checkpoint it is written to. Hence if it is different to the
 	 * current sequence, we're in a new checkpoint.
 	 */
-	return lip->li_seq == ctx->sequence;
+	return lip->li_seq == READ_ONCE(cil->xc_current_sequence);
 }
 
 /*



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 11/28] xfs: only bother with sync_filesystem during readonly remount
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 10/28] xfs: prevent UAF in xfs_log_item_in_current_chkpt Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 12/28] powerpc/ftrace: Remove ftrace init tramp once kernel init is complete Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Darrick J. Wong, Dave Chinner, Leah Rumancik

From: "Darrick J. Wong" <djwong@kernel.org>

[ Upstream commit b97cca3ba9098522e5a1c3388764ead42640c1a5 ]

In commit 02b9984d6408, we pushed a sync_filesystem() call from the VFS
into xfs_fs_remount.  The only time that we ever need to push dirty file
data or metadata to disk for a remount is if we're remounting the
filesystem read only, so this really could be moved to xfs_remount_ro.

Once we've moved the call site, actually check the return value from
sync_filesystem.

Fixes: 02b9984d6408 ("fs: push sync_filesystem() down to the file system's remount_fs()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_super.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1773,6 +1773,11 @@ xfs_remount_ro(
 	};
 	int			error;
 
+	/* Flush all the dirty data to disk. */
+	error = sync_filesystem(mp->m_super);
+	if (error)
+		return error;
+
 	/*
 	 * Cancel background eofb scanning so it cannot race with the final
 	 * log force+buftarg wait and deadlock the remount.
@@ -1851,8 +1856,6 @@ xfs_fs_reconfigure(
 	if (error)
 		return error;
 
-	sync_filesystem(mp->m_super);
-
 	/* inode32 -> inode64 */
 	if (xfs_has_small_inums(mp) && !xfs_has_small_inums(new_mp)) {
 		mp->m_features &= ~XFS_FEAT_SMALL_INUMS;



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 12/28] powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 11/28] xfs: only bother with sync_filesystem during readonly remount Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 13/28] fs: add is_idmapped_mnt() helper Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Naveen N. Rao, Michael Ellerman

From: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

commit 84ade0a6655bee803d176525ef457175cbf4df22 upstream.

Stop using the ftrace trampoline for init section once kernel init is
complete.

Fixes: 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220516071422.463738-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/include/asm/ftrace.h  |    4 +++-
 arch/powerpc/kernel/trace/ftrace.c |   15 ++++++++++++---
 arch/powerpc/mm/mem.c              |    2 ++
 3 files changed, 17 insertions(+), 4 deletions(-)

--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -96,7 +96,7 @@ static inline bool arch_syscall_match_sy
 #endif /* PPC64_ELF_ABI_v1 */
 #endif /* CONFIG_FTRACE_SYSCALLS */
 
-#ifdef CONFIG_PPC64
+#if defined(CONFIG_PPC64) && defined(CONFIG_FUNCTION_TRACER)
 #include <asm/paca.h>
 
 static inline void this_cpu_disable_ftrace(void)
@@ -120,11 +120,13 @@ static inline u8 this_cpu_get_ftrace_ena
 	return get_paca()->ftrace_enabled;
 }
 
+void ftrace_free_init_tramp(void);
 #else /* CONFIG_PPC64 */
 static inline void this_cpu_disable_ftrace(void) { }
 static inline void this_cpu_enable_ftrace(void) { }
 static inline void this_cpu_set_ftrace_enabled(u8 ftrace_enabled) { }
 static inline u8 this_cpu_get_ftrace_enabled(void) { return 1; }
+static inline void ftrace_free_init_tramp(void) { }
 #endif /* CONFIG_PPC64 */
 #endif /* !__ASSEMBLY__ */
 
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -336,9 +336,7 @@ static int setup_mcount_compiler_tramp(u
 
 	/* Is this a known long jump tramp? */
 	for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
-		if (!ftrace_tramps[i])
-			break;
-		else if (ftrace_tramps[i] == tramp)
+		if (ftrace_tramps[i] == tramp)
 			return 0;
 
 	/* Is this a known plt tramp? */
@@ -881,6 +879,17 @@ void arch_ftrace_update_code(int command
 
 extern unsigned int ftrace_tramp_text[], ftrace_tramp_init[];
 
+void ftrace_free_init_tramp(void)
+{
+	int i;
+
+	for (i = 0; i < NUM_FTRACE_TRAMPS && ftrace_tramps[i]; i++)
+		if (ftrace_tramps[i] == (unsigned long)ftrace_tramp_init) {
+			ftrace_tramps[i] = 0;
+			return;
+		}
+}
+
 int __init ftrace_dyn_arch_init(void)
 {
 	int i;
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -22,6 +22,7 @@
 #include <asm/kasan.h>
 #include <asm/svm.h>
 #include <asm/mmzone.h>
+#include <asm/ftrace.h>
 
 #include <mm/mmu_decl.h>
 
@@ -314,6 +315,7 @@ void free_initmem(void)
 	mark_initmem_nx();
 	init_mem_is_free = true;
 	free_initmem_default(POISON_FREE_INITMEM);
+	ftrace_free_init_tramp();
 }
 
 /*



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 13/28] fs: add is_idmapped_mnt() helper
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 12/28] powerpc/ftrace: Remove ftrace init tramp once kernel init is complete Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 14/28] fs: move mapping helpers Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Christoph Hellwig,
	Al Viro, linux-fsdevel, Amir Goldstein, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit bb49e9e730c2906a958eee273a7819f401543d6c upstream.

Multiple places open-code the same check to determine whether a given
mount is idmapped. Introduce a simple helper function that can be used
instead. This allows us to get rid of the fragile open-coding. We will
later change the check that is used to determine whether a given mount
is idmapped. Introducing a helper allows us to do this in a single
place instead of doing it for multiple places.

Link: https://lore.kernel.org/r/20211123114227.3124056-2-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-2-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-2-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/cachefiles/bind.c |    2 +-
 fs/ecryptfs/main.c   |    2 +-
 fs/namespace.c       |    2 +-
 fs/nfsd/export.c     |    2 +-
 fs/overlayfs/super.c |    2 +-
 fs/proc_namespace.c  |    2 +-
 include/linux/fs.h   |   14 ++++++++++++++
 7 files changed, 20 insertions(+), 6 deletions(-)

--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -117,7 +117,7 @@ static int cachefiles_daemon_add_cache(s
 	root = path.dentry;
 
 	ret = -EINVAL;
-	if (mnt_user_ns(path.mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path.mnt)) {
 		pr_warn("File cache on idmapped mounts not supported");
 		goto error_unsupported;
 	}
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -537,7 +537,7 @@ static struct dentry *ecryptfs_mount(str
 		goto out_free;
 	}
 
-	if (mnt_user_ns(path.mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path.mnt)) {
 		rc = -EINVAL;
 		printk(KERN_ERR "Mounting on idmapped mounts currently disallowed\n");
 		goto out_free;
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3936,7 +3936,7 @@ static int can_idmap_mount(const struct
 	 * mapping. It makes things simpler and callers can just create
 	 * another bind-mount they can idmap if they want to.
 	 */
-	if (mnt_user_ns(m) != &init_user_ns)
+	if (is_idmapped_mnt(m))
 		return -EPERM;
 
 	/* The underlying filesystem doesn't support idmapped mounts yet. */
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -427,7 +427,7 @@ static int check_export(struct path *pat
 		return -EINVAL;
 	}
 
-	if (mnt_user_ns(path->mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path->mnt)) {
 		dprintk("exp_export: export of idmapped mounts not yet supported.\n");
 		return -EINVAL;
 	}
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -873,7 +873,7 @@ static int ovl_mount_dir_noesc(const cha
 		pr_err("filesystem on '%s' not supported\n", name);
 		goto out_put;
 	}
-	if (mnt_user_ns(path->mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path->mnt)) {
 		pr_err("idmapped layers are currently not supported\n");
 		goto out_put;
 	}
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -80,7 +80,7 @@ static void show_mnt_opts(struct seq_fil
 			seq_puts(m, fs_infop->str);
 	}
 
-	if (mnt_user_ns(mnt) != &init_user_ns)
+	if (is_idmapped_mnt(mnt))
 		seq_puts(m, ",idmapped");
 }
 
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2726,6 +2726,20 @@ static inline struct user_namespace *fil
 {
 	return mnt_user_ns(file->f_path.mnt);
 }
+
+/**
+ * is_idmapped_mnt - check whether a mount is mapped
+ * @mnt: the mount to check
+ *
+ * If @mnt has an idmapping attached to it @mnt is mapped.
+ *
+ * Return: true if mount is mapped, false if not.
+ */
+static inline bool is_idmapped_mnt(const struct vfsmount *mnt)
+{
+	return mnt_user_ns(mnt) != &init_user_ns;
+}
+
 extern long vfs_truncate(const struct path *, loff_t);
 int do_truncate(struct user_namespace *, struct dentry *, loff_t start,
 		unsigned int time_attrs, struct file *filp);



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 14/28] fs: move mapping helpers
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 13/28] fs: add is_idmapped_mnt() helper Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 15/28] fs: tweak fsuidgid_has_mapping() Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Christoph Hellwig,
	Al Viro, linux-fsdevel, Amir Goldstein, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit a793d79ea3e041081cd7cbd8ee43d0b5e4914a2b upstream.

The low-level mapping helpers were so far crammed into fs.h. They are
out of place there. The fs.h header should just contain the higher-level
mapping helpers that interact directly with vfs objects such as struct
super_block or struct inode and not the bare mapping helpers. Similarly,
only vfs and specific fs code shall interact with low-level mapping
helpers. And so they won't be made accessible automatically through
regular {g,u}id helpers.

Link: https://lore.kernel.org/r/20211123114227.3124056-3-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-3-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-3-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/ksmbd/smbacl.c             |    1 
 fs/ksmbd/smbacl.h             |    1 
 fs/open.c                     |    1 
 fs/posix_acl.c                |    1 
 fs/xfs/xfs_linux.h            |    1 
 include/linux/fs.h            |   91 -------------------------------------
 include/linux/mnt_idmapping.h |  101 ++++++++++++++++++++++++++++++++++++++++++
 security/commoncap.c          |    1 
 8 files changed, 108 insertions(+), 90 deletions(-)
 create mode 100644 include/linux/mnt_idmapping.h

--- a/fs/ksmbd/smbacl.c
+++ b/fs/ksmbd/smbacl.c
@@ -9,6 +9,7 @@
 #include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/string.h>
+#include <linux/mnt_idmapping.h>
 
 #include "smbacl.h"
 #include "smb_common.h"
--- a/fs/ksmbd/smbacl.h
+++ b/fs/ksmbd/smbacl.h
@@ -11,6 +11,7 @@
 #include <linux/fs.h>
 #include <linux/namei.h>
 #include <linux/posix_acl.h>
+#include <linux/mnt_idmapping.h>
 
 #include "mgmt/tree_connect.h"
 
--- a/fs/open.c
+++ b/fs/open.c
@@ -32,6 +32,7 @@
 #include <linux/ima.h>
 #include <linux/dnotify.h>
 #include <linux/compat.h>
+#include <linux/mnt_idmapping.h>
 
 #include "internal.h"
 
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -23,6 +23,7 @@
 #include <linux/export.h>
 #include <linux/user_namespace.h>
 #include <linux/namei.h>
+#include <linux/mnt_idmapping.h>
 
 static struct posix_acl **acl_by_type(struct inode *inode, int type)
 {
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -61,6 +61,7 @@ typedef __u32			xfs_nlink_t;
 #include <linux/ratelimit.h>
 #include <linux/rhashtable.h>
 #include <linux/xattr.h>
+#include <linux/mnt_idmapping.h>
 
 #include <asm/page.h>
 #include <asm/div64.h>
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -41,6 +41,7 @@
 #include <linux/stddef.h>
 #include <linux/mount.h>
 #include <linux/cred.h>
+#include <linux/mnt_idmapping.h>
 
 #include <asm/byteorder.h>
 #include <uapi/linux/fs.h>
@@ -1627,34 +1628,6 @@ static inline void i_gid_write(struct in
 }
 
 /**
- * kuid_into_mnt - map a kuid down into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- * @kuid: kuid to be mapped
- *
- * Return: @kuid mapped according to @mnt_userns.
- * If @kuid has no mapping INVALID_UID is returned.
- */
-static inline kuid_t kuid_into_mnt(struct user_namespace *mnt_userns,
-				   kuid_t kuid)
-{
-	return make_kuid(mnt_userns, __kuid_val(kuid));
-}
-
-/**
- * kgid_into_mnt - map a kgid down into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- * @kgid: kgid to be mapped
- *
- * Return: @kgid mapped according to @mnt_userns.
- * If @kgid has no mapping INVALID_GID is returned.
- */
-static inline kgid_t kgid_into_mnt(struct user_namespace *mnt_userns,
-				   kgid_t kgid)
-{
-	return make_kgid(mnt_userns, __kgid_val(kgid));
-}
-
-/**
  * i_uid_into_mnt - map an inode's i_uid down into a mnt_userns
  * @mnt_userns: user namespace of the mount the inode was found from
  * @inode: inode to map
@@ -1683,68 +1656,6 @@ static inline kgid_t i_gid_into_mnt(stru
 }
 
 /**
- * kuid_from_mnt - map a kuid up into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- * @kuid: kuid to be mapped
- *
- * Return: @kuid mapped up according to @mnt_userns.
- * If @kuid has no mapping INVALID_UID is returned.
- */
-static inline kuid_t kuid_from_mnt(struct user_namespace *mnt_userns,
-				   kuid_t kuid)
-{
-	return KUIDT_INIT(from_kuid(mnt_userns, kuid));
-}
-
-/**
- * kgid_from_mnt - map a kgid up into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- * @kgid: kgid to be mapped
- *
- * Return: @kgid mapped up according to @mnt_userns.
- * If @kgid has no mapping INVALID_GID is returned.
- */
-static inline kgid_t kgid_from_mnt(struct user_namespace *mnt_userns,
-				   kgid_t kgid)
-{
-	return KGIDT_INIT(from_kgid(mnt_userns, kgid));
-}
-
-/**
- * mapped_fsuid - return caller's fsuid mapped up into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- *
- * Use this helper to initialize a new vfs or filesystem object based on
- * the caller's fsuid. A common example is initializing the i_uid field of
- * a newly allocated inode triggered by a creation event such as mkdir or
- * O_CREAT. Other examples include the allocation of quotas for a specific
- * user.
- *
- * Return: the caller's current fsuid mapped up according to @mnt_userns.
- */
-static inline kuid_t mapped_fsuid(struct user_namespace *mnt_userns)
-{
-	return kuid_from_mnt(mnt_userns, current_fsuid());
-}
-
-/**
- * mapped_fsgid - return caller's fsgid mapped up into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- *
- * Use this helper to initialize a new vfs or filesystem object based on
- * the caller's fsgid. A common example is initializing the i_gid field of
- * a newly allocated inode triggered by a creation event such as mkdir or
- * O_CREAT. Other examples include the allocation of quotas for a specific
- * user.
- *
- * Return: the caller's current fsgid mapped up according to @mnt_userns.
- */
-static inline kgid_t mapped_fsgid(struct user_namespace *mnt_userns)
-{
-	return kgid_from_mnt(mnt_userns, current_fsgid());
-}
-
-/**
  * inode_fsuid_set - initialize inode's i_uid field with callers fsuid
  * @inode: inode to initialize
  * @mnt_userns: user namespace of the mount the inode was found from
--- /dev/null
+++ b/include/linux/mnt_idmapping.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_MNT_IDMAPPING_H
+#define _LINUX_MNT_IDMAPPING_H
+
+#include <linux/types.h>
+#include <linux/uidgid.h>
+
+struct user_namespace;
+extern struct user_namespace init_user_ns;
+
+/**
+ * kuid_into_mnt - map a kuid down into a mnt_userns
+ * @mnt_userns: user namespace of the relevant mount
+ * @kuid: kuid to be mapped
+ *
+ * Return: @kuid mapped according to @mnt_userns.
+ * If @kuid has no mapping INVALID_UID is returned.
+ */
+static inline kuid_t kuid_into_mnt(struct user_namespace *mnt_userns,
+				   kuid_t kuid)
+{
+	return make_kuid(mnt_userns, __kuid_val(kuid));
+}
+
+/**
+ * kgid_into_mnt - map a kgid down into a mnt_userns
+ * @mnt_userns: user namespace of the relevant mount
+ * @kgid: kgid to be mapped
+ *
+ * Return: @kgid mapped according to @mnt_userns.
+ * If @kgid has no mapping INVALID_GID is returned.
+ */
+static inline kgid_t kgid_into_mnt(struct user_namespace *mnt_userns,
+				   kgid_t kgid)
+{
+	return make_kgid(mnt_userns, __kgid_val(kgid));
+}
+
+/**
+ * kuid_from_mnt - map a kuid up into a mnt_userns
+ * @mnt_userns: user namespace of the relevant mount
+ * @kuid: kuid to be mapped
+ *
+ * Return: @kuid mapped up according to @mnt_userns.
+ * If @kuid has no mapping INVALID_UID is returned.
+ */
+static inline kuid_t kuid_from_mnt(struct user_namespace *mnt_userns,
+				   kuid_t kuid)
+{
+	return KUIDT_INIT(from_kuid(mnt_userns, kuid));
+}
+
+/**
+ * kgid_from_mnt - map a kgid up into a mnt_userns
+ * @mnt_userns: user namespace of the relevant mount
+ * @kgid: kgid to be mapped
+ *
+ * Return: @kgid mapped up according to @mnt_userns.
+ * If @kgid has no mapping INVALID_GID is returned.
+ */
+static inline kgid_t kgid_from_mnt(struct user_namespace *mnt_userns,
+				   kgid_t kgid)
+{
+	return KGIDT_INIT(from_kgid(mnt_userns, kgid));
+}
+
+/**
+ * mapped_fsuid - return caller's fsuid mapped up into a mnt_userns
+ * @mnt_userns: user namespace of the relevant mount
+ *
+ * Use this helper to initialize a new vfs or filesystem object based on
+ * the caller's fsuid. A common example is initializing the i_uid field of
+ * a newly allocated inode triggered by a creation event such as mkdir or
+ * O_CREAT. Other examples include the allocation of quotas for a specific
+ * user.
+ *
+ * Return: the caller's current fsuid mapped up according to @mnt_userns.
+ */
+static inline kuid_t mapped_fsuid(struct user_namespace *mnt_userns)
+{
+	return kuid_from_mnt(mnt_userns, current_fsuid());
+}
+
+/**
+ * mapped_fsgid - return caller's fsgid mapped up into a mnt_userns
+ * @mnt_userns: user namespace of the relevant mount
+ *
+ * Use this helper to initialize a new vfs or filesystem object based on
+ * the caller's fsgid. A common example is initializing the i_gid field of
+ * a newly allocated inode triggered by a creation event such as mkdir or
+ * O_CREAT. Other examples include the allocation of quotas for a specific
+ * user.
+ *
+ * Return: the caller's current fsgid mapped up according to @mnt_userns.
+ */
+static inline kgid_t mapped_fsgid(struct user_namespace *mnt_userns)
+{
+	return kgid_from_mnt(mnt_userns, current_fsgid());
+}
+
+#endif /* _LINUX_MNT_IDMAPPING_H */
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -24,6 +24,7 @@
 #include <linux/user_namespace.h>
 #include <linux/binfmts.h>
 #include <linux/personality.h>
+#include <linux/mnt_idmapping.h>
 
 /*
  * If a non-root user executes a setuid-root binary in



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 15/28] fs: tweak fsuidgid_has_mapping()
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 14/28] fs: move mapping helpers Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 16/28] fs: account for filesystem mappings Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Christoph Hellwig,
	Al Viro, linux-fsdevel, Amir Goldstein, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit 476860b3eb4a50958243158861d5340066df5af2 upstream.

If the caller's fs{g,u}id aren't mapped in the mount's idmapping we can
return early and skip the check whether the mapped fs{g,u}id also have a
mapping in the filesystem's idmapping. If the fs{g,u}id aren't mapped in
the mount's idmapping they consequently can't be mapped in the
filesystem's idmapping. So there's no point in checking that.

Link: https://lore.kernel.org/r/20211123114227.3124056-4-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-4-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-4-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/fs.h |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1697,10 +1697,18 @@ static inline void inode_fsgid_set(struc
 static inline bool fsuidgid_has_mapping(struct super_block *sb,
 					struct user_namespace *mnt_userns)
 {
-	struct user_namespace *s_user_ns = sb->s_user_ns;
+	struct user_namespace *fs_userns = sb->s_user_ns;
+	kuid_t kuid;
+	kgid_t kgid;
 
-	return kuid_has_mapping(s_user_ns, mapped_fsuid(mnt_userns)) &&
-	       kgid_has_mapping(s_user_ns, mapped_fsgid(mnt_userns));
+	kuid = mapped_fsuid(mnt_userns);
+	if (!uid_valid(kuid))
+		return false;
+	kgid = mapped_fsgid(mnt_userns);
+	if (!gid_valid(kgid))
+		return false;
+	return kuid_has_mapping(fs_userns, kuid) &&
+	       kgid_has_mapping(fs_userns, kgid);
 }
 
 extern struct timespec64 current_time(struct inode *inode);



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 16/28] fs: account for filesystem mappings
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 15/28] fs: tweak fsuidgid_has_mapping() Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 17/28] docs: update mapping documentation Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Amir Goldstein,
	Christoph Hellwig, Al Viro, linux-fsdevel, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit 1ac2a4104968e0a60b4b3572216a92aab5c1b025 upstream.

Currently we only support idmapped mounts for filesystems mounted
without an idmapping. This was a conscious decision mentioned in
multiple places (cf. e.g. [1]).

As explained at length in [3] it is perfectly fine to extend support for
idmapped mounts to filesystem's mounted with an idmapping should the
need arise. The need has been there for some time now. Various container
projects in userspace need this to run unprivileged and nested
unprivileged containers (cf. [2]).

Before we can port any filesystem that is mountable with an idmapping to
support idmapped mounts we need to first extend the mapping helpers to
account for the filesystem's idmapping. This again, is explained at
length in our documentation at [3] but I'll give an overview here again.

Currently, the low-level mapping helpers implement the remapping
algorithms described in [3] in a simplified manner. Because we could
rely on the fact that all filesystems supporting idmapped mounts are
mounted without an idmapping the translation step from or into the
filesystem idmapping could be skipped.

In order to support idmapped mounts of filesystem's mountable with an
idmapping the translation step we were able to skip before cannot be
skipped anymore. A filesystem mounted with an idmapping is very likely
to not use an identity mapping and will instead use a non-identity
mapping. So the translation step from or into the filesystem's idmapping
in the remapping algorithm cannot be skipped for such filesystems. More
details with examples can be found in [3].

This patch adds a few new and prepares some already existing low-level
mapping helpers to perform the full translation algorithm explained in
[3]. The low-level helpers can be written in a way that they only
perform the additional translation step when the filesystem is indeed
mounted with an idmapping.

If the low-level helpers detect that they are not dealing with an
idmapped mount they can simply return the relevant k{g,u}id unchanged;
no remapping needs to be performed at all. The no_idmapping() helper
detects whether the shortcut can be used.

If the low-level helpers detected that they are dealing with an idmapped
mount but the underlying filesystem is mounted without an idmapping we
can rely on the previous shorcut and can continue to skip the
translation step from or into the filesystem's idmapping.

These checks guarantee that only the minimal amount of work is
performed. As before, if idmapped mounts aren't used the low-level
helpers are idempotent and no work is performed at all.

This patch adds the helpers mapped_k{g,u}id_fs() and
mapped_k{g,u}id_user(). Following patches will port all places to
replace the old k{g,u}id_into_mnt() and k{g,u}id_from_mnt() with these
two new helpers. After the conversion is done k{g,u}id_into_mnt() and
k{g,u}id_from_mnt() will be removed. This also concludes the renaming of
the mapping helpers we started in [4]. Now, all mapping helpers will
started with the "mapped_" prefix making everything nice and consistent.

The mapped_k{g,u}id_fs() helpers replace the k{g,u}id_into_mnt()
helpers. They are to be used when k{g,u}ids are to be mapped from the
vfs, e.g. from from struct inode's i_{g,u}id.  Conversely, the
mapped_k{g,u}id_user() helpers replace the k{g,u}id_from_mnt() helpers.
They are to be used when k{g,u}ids are to be written to disk, e.g. when
entering from a system call to change ownership of a file.

This patch only introduces the helpers. It doesn't yet convert the
relevant places to account for filesystem mounted with an idmapping.

[1]: commit 2ca4dcc4909d ("fs/mount_setattr: tighten permission checks")
[2]: https://github.com/containers/podman/issues/10374
[3]: Documentations/filesystems/idmappings.rst
[4]: commit a65e58e791a1 ("fs: document and rename fsid helpers")

Link: https://lore.kernel.org/r/20211123114227.3124056-5-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-5-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-5-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/fs.h            |    4 
 include/linux/mnt_idmapping.h |  193 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 191 insertions(+), 6 deletions(-)

--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1638,7 +1638,7 @@ static inline void i_gid_write(struct in
 static inline kuid_t i_uid_into_mnt(struct user_namespace *mnt_userns,
 				    const struct inode *inode)
 {
-	return kuid_into_mnt(mnt_userns, inode->i_uid);
+	return mapped_kuid_fs(mnt_userns, &init_user_ns, inode->i_uid);
 }
 
 /**
@@ -1652,7 +1652,7 @@ static inline kuid_t i_uid_into_mnt(stru
 static inline kgid_t i_gid_into_mnt(struct user_namespace *mnt_userns,
 				    const struct inode *inode)
 {
-	return kgid_into_mnt(mnt_userns, inode->i_gid);
+	return mapped_kgid_fs(mnt_userns, &init_user_ns, inode->i_gid);
 }
 
 /**
--- a/include/linux/mnt_idmapping.h
+++ b/include/linux/mnt_idmapping.h
@@ -6,6 +6,11 @@
 #include <linux/uidgid.h>
 
 struct user_namespace;
+/*
+ * Carries the initial idmapping of 0:0:4294967295 which is an identity
+ * mapping. This means that {g,u}id 0 is mapped to {g,u}id 0, {g,u}id 1 is
+ * mapped to {g,u}id 1, [...], {g,u}id 1000 to {g,u}id 1000, [...].
+ */
 extern struct user_namespace init_user_ns;
 
 /**
@@ -65,8 +70,188 @@ static inline kgid_t kgid_from_mnt(struc
 }
 
 /**
+ * initial_idmapping - check whether this is the initial mapping
+ * @ns: idmapping to check
+ *
+ * Check whether this is the initial mapping, mapping 0 to 0, 1 to 1,
+ * [...], 1000 to 1000 [...].
+ *
+ * Return: true if this is the initial mapping, false if not.
+ */
+static inline bool initial_idmapping(const struct user_namespace *ns)
+{
+	return ns == &init_user_ns;
+}
+
+/**
+ * no_idmapping - check whether we can skip remapping a kuid/gid
+ * @mnt_userns: the mount's idmapping
+ * @fs_userns: the filesystem's idmapping
+ *
+ * This function can be used to check whether a remapping between two
+ * idmappings is required.
+ * An idmapped mount is a mount that has an idmapping attached to it that
+ * is different from the filsystem's idmapping and the initial idmapping.
+ * If the initial mapping is used or the idmapping of the mount and the
+ * filesystem are identical no remapping is required.
+ *
+ * Return: true if remapping can be skipped, false if not.
+ */
+static inline bool no_idmapping(const struct user_namespace *mnt_userns,
+				const struct user_namespace *fs_userns)
+{
+	return initial_idmapping(mnt_userns) || mnt_userns == fs_userns;
+}
+
+/**
+ * mapped_kuid_fs - map a filesystem kuid into a mnt_userns
+ * @mnt_userns: the mount's idmapping
+ * @fs_userns: the filesystem's idmapping
+ * @kuid : kuid to be mapped
+ *
+ * Take a @kuid and remap it from @fs_userns into @mnt_userns. Use this
+ * function when preparing a @kuid to be reported to userspace.
+ *
+ * If no_idmapping() determines that this is not an idmapped mount we can
+ * simply return @kuid unchanged.
+ * If initial_idmapping() tells us that the filesystem is not mounted with an
+ * idmapping we know the value of @kuid won't change when calling
+ * from_kuid() so we can simply retrieve the value via __kuid_val()
+ * directly.
+ *
+ * Return: @kuid mapped according to @mnt_userns.
+ * If @kuid has no mapping in either @mnt_userns or @fs_userns INVALID_UID is
+ * returned.
+ */
+static inline kuid_t mapped_kuid_fs(struct user_namespace *mnt_userns,
+				    struct user_namespace *fs_userns,
+				    kuid_t kuid)
+{
+	uid_t uid;
+
+	if (no_idmapping(mnt_userns, fs_userns))
+		return kuid;
+	if (initial_idmapping(fs_userns))
+		uid = __kuid_val(kuid);
+	else
+		uid = from_kuid(fs_userns, kuid);
+	if (uid == (uid_t)-1)
+		return INVALID_UID;
+	return make_kuid(mnt_userns, uid);
+}
+
+/**
+ * mapped_kgid_fs - map a filesystem kgid into a mnt_userns
+ * @mnt_userns: the mount's idmapping
+ * @fs_userns: the filesystem's idmapping
+ * @kgid : kgid to be mapped
+ *
+ * Take a @kgid and remap it from @fs_userns into @mnt_userns. Use this
+ * function when preparing a @kgid to be reported to userspace.
+ *
+ * If no_idmapping() determines that this is not an idmapped mount we can
+ * simply return @kgid unchanged.
+ * If initial_idmapping() tells us that the filesystem is not mounted with an
+ * idmapping we know the value of @kgid won't change when calling
+ * from_kgid() so we can simply retrieve the value via __kgid_val()
+ * directly.
+ *
+ * Return: @kgid mapped according to @mnt_userns.
+ * If @kgid has no mapping in either @mnt_userns or @fs_userns INVALID_GID is
+ * returned.
+ */
+static inline kgid_t mapped_kgid_fs(struct user_namespace *mnt_userns,
+				    struct user_namespace *fs_userns,
+				    kgid_t kgid)
+{
+	gid_t gid;
+
+	if (no_idmapping(mnt_userns, fs_userns))
+		return kgid;
+	if (initial_idmapping(fs_userns))
+		gid = __kgid_val(kgid);
+	else
+		gid = from_kgid(fs_userns, kgid);
+	if (gid == (gid_t)-1)
+		return INVALID_GID;
+	return make_kgid(mnt_userns, gid);
+}
+
+/**
+ * mapped_kuid_user - map a user kuid into a mnt_userns
+ * @mnt_userns: the mount's idmapping
+ * @fs_userns: the filesystem's idmapping
+ * @kuid : kuid to be mapped
+ *
+ * Use the idmapping of @mnt_userns to remap a @kuid into @fs_userns. Use this
+ * function when preparing a @kuid to be written to disk or inode.
+ *
+ * If no_idmapping() determines that this is not an idmapped mount we can
+ * simply return @kuid unchanged.
+ * If initial_idmapping() tells us that the filesystem is not mounted with an
+ * idmapping we know the value of @kuid won't change when calling
+ * make_kuid() so we can simply retrieve the value via KUIDT_INIT()
+ * directly.
+ *
+ * Return: @kuid mapped according to @mnt_userns.
+ * If @kuid has no mapping in either @mnt_userns or @fs_userns INVALID_UID is
+ * returned.
+ */
+static inline kuid_t mapped_kuid_user(struct user_namespace *mnt_userns,
+				      struct user_namespace *fs_userns,
+				      kuid_t kuid)
+{
+	uid_t uid;
+
+	if (no_idmapping(mnt_userns, fs_userns))
+		return kuid;
+	uid = from_kuid(mnt_userns, kuid);
+	if (uid == (uid_t)-1)
+		return INVALID_UID;
+	if (initial_idmapping(fs_userns))
+		return KUIDT_INIT(uid);
+	return make_kuid(fs_userns, uid);
+}
+
+/**
+ * mapped_kgid_user - map a user kgid into a mnt_userns
+ * @mnt_userns: the mount's idmapping
+ * @fs_userns: the filesystem's idmapping
+ * @kgid : kgid to be mapped
+ *
+ * Use the idmapping of @mnt_userns to remap a @kgid into @fs_userns. Use this
+ * function when preparing a @kgid to be written to disk or inode.
+ *
+ * If no_idmapping() determines that this is not an idmapped mount we can
+ * simply return @kgid unchanged.
+ * If initial_idmapping() tells us that the filesystem is not mounted with an
+ * idmapping we know the value of @kgid won't change when calling
+ * make_kgid() so we can simply retrieve the value via KGIDT_INIT()
+ * directly.
+ *
+ * Return: @kgid mapped according to @mnt_userns.
+ * If @kgid has no mapping in either @mnt_userns or @fs_userns INVALID_GID is
+ * returned.
+ */
+static inline kgid_t mapped_kgid_user(struct user_namespace *mnt_userns,
+				      struct user_namespace *fs_userns,
+				      kgid_t kgid)
+{
+	gid_t gid;
+
+	if (no_idmapping(mnt_userns, fs_userns))
+		return kgid;
+	gid = from_kgid(mnt_userns, kgid);
+	if (gid == (gid_t)-1)
+		return INVALID_GID;
+	if (initial_idmapping(fs_userns))
+		return KGIDT_INIT(gid);
+	return make_kgid(fs_userns, gid);
+}
+
+/**
  * mapped_fsuid - return caller's fsuid mapped up into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
+ * @mnt_userns: the mount's idmapping
  *
  * Use this helper to initialize a new vfs or filesystem object based on
  * the caller's fsuid. A common example is initializing the i_uid field of
@@ -78,12 +263,12 @@ static inline kgid_t kgid_from_mnt(struc
  */
 static inline kuid_t mapped_fsuid(struct user_namespace *mnt_userns)
 {
-	return kuid_from_mnt(mnt_userns, current_fsuid());
+	return mapped_kuid_user(mnt_userns, &init_user_ns, current_fsuid());
 }
 
 /**
  * mapped_fsgid - return caller's fsgid mapped up into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
+ * @mnt_userns: the mount's idmapping
  *
  * Use this helper to initialize a new vfs or filesystem object based on
  * the caller's fsgid. A common example is initializing the i_gid field of
@@ -95,7 +280,7 @@ static inline kuid_t mapped_fsuid(struct
  */
 static inline kgid_t mapped_fsgid(struct user_namespace *mnt_userns)
 {
-	return kgid_from_mnt(mnt_userns, current_fsgid());
+	return mapped_kgid_user(mnt_userns, &init_user_ns, current_fsgid());
 }
 
 #endif /* _LINUX_MNT_IDMAPPING_H */



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 17/28] docs: update mapping documentation
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 16/28] fs: account for filesystem mappings Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 18/28] fs: use low-level mapping helpers Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Amir Goldstein,
	Christoph Hellwig, Al Viro, linux-fsdevel, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit 8cc5c54de44c5e8e104d364a627ac4296845fc7f upstream.

Now that we implement the full remapping algorithms described in our
documentation remove the section about shortcircuting them.

Link: https://lore.kernel.org/r/20211123114227.3124056-6-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-6-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-6-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/filesystems/idmappings.rst |   72 -------------------------------
 1 file changed, 72 deletions(-)

--- a/Documentation/filesystems/idmappings.rst
+++ b/Documentation/filesystems/idmappings.rst
@@ -952,75 +952,3 @@ The raw userspace id that is put on disk
 their home directory back to their home computer where they are assigned
 ``u1000`` using the initial idmapping and mount the filesystem with the initial
 idmapping they will see all those files owned by ``u1000``.
-
-Shortcircuting
---------------
-
-Currently, the implementation of idmapped mounts enforces that the filesystem
-is mounted with the initial idmapping. The reason is simply that none of the
-filesystems that we targeted were mountable with a non-initial idmapping. But
-that might change soon enough. As we've seen above, thanks to the properties of
-idmappings the translation works for both filesystems mounted with the initial
-idmapping and filesystem with non-initial idmappings.
-
-Based on this current restriction to filesystem mounted with the initial
-idmapping two noticeable shortcuts have been taken:
-
-1. We always stash a reference to the initial user namespace in ``struct
-   vfsmount``. Idmapped mounts are thus mounts that have a non-initial user
-   namespace attached to them.
-
-   In order to support idmapped mounts this needs to be changed. Instead of
-   stashing the initial user namespace the user namespace the filesystem was
-   mounted with must be stashed. An idmapped mount is then any mount that has
-   a different user namespace attached then the filesystem was mounted with.
-   This has no user-visible consequences.
-
-2. The translation algorithms in ``mapped_fs*id()`` and ``i_*id_into_mnt()``
-   are simplified.
-
-   Let's consider ``mapped_fs*id()`` first. This function translates the
-   caller's kernel id into a kernel id in the filesystem's idmapping via
-   a mount's idmapping. The full algorithm is::
-
-    mapped_fsuid(kid):
-      /* Map the kernel id up into a userspace id in the mount's idmapping. */
-      from_kuid(mount-idmapping, kid) = uid
-
-      /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
-      make_kuid(filesystem-idmapping, uid) = kuid
-
-   We know that the filesystem is always mounted with the initial idmapping as
-   we enforce this in ``mount_setattr()``. So this can be shortened to::
-
-    mapped_fsuid(kid):
-      /* Map the kernel id up into a userspace id in the mount's idmapping. */
-      from_kuid(mount-idmapping, kid) = uid
-
-      /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
-      KUIDT_INIT(uid) = kuid
-
-   Similarly, for ``i_*id_into_mnt()`` which translated the filesystem's kernel
-   id into a mount's kernel id::
-
-    i_uid_into_mnt(kid):
-      /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
-      from_kuid(filesystem-idmapping, kid) = uid
-
-      /* Map the userspace id down into a kernel id in the mounts's idmapping. */
-      make_kuid(mount-idmapping, uid) = kuid
-
-   Again, we know that the filesystem is always mounted with the initial
-   idmapping as we enforce this in ``mount_setattr()``. So this can be
-   shortened to::
-
-    i_uid_into_mnt(kid):
-      /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
-      __kuid_val(kid) = uid
-
-      /* Map the userspace id down into a kernel id in the mounts's idmapping. */
-      make_kuid(mount-idmapping, uid) = kuid
-
-Handling filesystems mounted with non-initial idmappings requires that the
-translation functions be converted to their full form. They can still be
-shortcircuited on non-idmapped mounts. This has no user-visible consequences.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 18/28] fs: use low-level mapping helpers
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 17/28] docs: update mapping documentation Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 19/28] fs: remove unused " Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Amir Goldstein,
	Christoph Hellwig, Al Viro, linux-fsdevel, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit 4472071331549e911a5abad41aea6e3be855a1a4 upstream.

In a few places the vfs needs to interact with bare k{g,u}ids directly
instead of struct inode. These are just a few. In previous patches we
introduced low-level mapping helpers that are able to support
filesystems mounted an idmapping. This patch simply converts the places
to use these new helpers.

Link: https://lore.kernel.org/r/20211123114227.3124056-7-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-7-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-7-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/ksmbd/smbacl.c    |   18 ++----------------
 fs/ksmbd/smbacl.h    |    4 ++--
 fs/open.c            |    4 ++--
 fs/posix_acl.c       |   16 ++++++++++------
 security/commoncap.c |   13 ++++++++-----
 5 files changed, 24 insertions(+), 31 deletions(-)

--- a/fs/ksmbd/smbacl.c
+++ b/fs/ksmbd/smbacl.c
@@ -275,14 +275,7 @@ static int sid_to_id(struct user_namespa
 		uid_t id;
 
 		id = le32_to_cpu(psid->sub_auth[psid->num_subauth - 1]);
-		/*
-		 * Translate raw sid into kuid in the server's user
-		 * namespace.
-		 */
-		uid = make_kuid(&init_user_ns, id);
-
-		/* If this is an idmapped mount, apply the idmapping. */
-		uid = kuid_from_mnt(user_ns, uid);
+		uid = mapped_kuid_user(user_ns, &init_user_ns, KUIDT_INIT(id));
 		if (uid_valid(uid)) {
 			fattr->cf_uid = uid;
 			rc = 0;
@@ -292,14 +285,7 @@ static int sid_to_id(struct user_namespa
 		gid_t id;
 
 		id = le32_to_cpu(psid->sub_auth[psid->num_subauth - 1]);
-		/*
-		 * Translate raw sid into kgid in the server's user
-		 * namespace.
-		 */
-		gid = make_kgid(&init_user_ns, id);
-
-		/* If this is an idmapped mount, apply the idmapping. */
-		gid = kgid_from_mnt(user_ns, gid);
+		gid = mapped_kgid_user(user_ns, &init_user_ns, KGIDT_INIT(id));
 		if (gid_valid(gid)) {
 			fattr->cf_gid = gid;
 			rc = 0;
--- a/fs/ksmbd/smbacl.h
+++ b/fs/ksmbd/smbacl.h
@@ -217,7 +217,7 @@ static inline uid_t posix_acl_uid_transl
 	kuid_t kuid;
 
 	/* If this is an idmapped mount, apply the idmapping. */
-	kuid = kuid_into_mnt(mnt_userns, pace->e_uid);
+	kuid = mapped_kuid_fs(mnt_userns, &init_user_ns, pace->e_uid);
 
 	/* Translate the kuid into a userspace id ksmbd would see. */
 	return from_kuid(&init_user_ns, kuid);
@@ -229,7 +229,7 @@ static inline gid_t posix_acl_gid_transl
 	kgid_t kgid;
 
 	/* If this is an idmapped mount, apply the idmapping. */
-	kgid = kgid_into_mnt(mnt_userns, pace->e_gid);
+	kgid = mapped_kgid_fs(mnt_userns, &init_user_ns, pace->e_gid);
 
 	/* Translate the kgid into a userspace id ksmbd would see. */
 	return from_kgid(&init_user_ns, kgid);
--- a/fs/open.c
+++ b/fs/open.c
@@ -653,8 +653,8 @@ int chown_common(const struct path *path
 	gid = make_kgid(current_user_ns(), group);
 
 	mnt_userns = mnt_user_ns(path->mnt);
-	uid = kuid_from_mnt(mnt_userns, uid);
-	gid = kgid_from_mnt(mnt_userns, gid);
+	uid = mapped_kuid_user(mnt_userns, &init_user_ns, uid);
+	gid = mapped_kgid_user(mnt_userns, &init_user_ns, gid);
 
 retry_deleg:
 	newattrs.ia_valid =  ATTR_CTIME;
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -376,7 +376,9 @@ posix_acl_permission(struct user_namespa
                                         goto check_perm;
                                 break;
                         case ACL_USER:
-				uid = kuid_into_mnt(mnt_userns, pa->e_uid);
+				uid = mapped_kuid_fs(mnt_userns,
+						      &init_user_ns,
+						      pa->e_uid);
 				if (uid_eq(uid, current_fsuid()))
                                         goto mask;
 				break;
@@ -389,7 +391,9 @@ posix_acl_permission(struct user_namespa
                                 }
 				break;
                         case ACL_GROUP:
-				gid = kgid_into_mnt(mnt_userns, pa->e_gid);
+				gid = mapped_kgid_fs(mnt_userns,
+						      &init_user_ns,
+						      pa->e_gid);
 				if (in_group_p(gid)) {
 					found = 1;
 					if ((pa->e_perm & want) == want)
@@ -736,17 +740,17 @@ static void posix_acl_fix_xattr_userns(
 		case ACL_USER:
 			uid = make_kuid(from, le32_to_cpu(entry->e_id));
 			if (from_user)
-				uid = kuid_from_mnt(mnt_userns, uid);
+				uid = mapped_kuid_user(mnt_userns, &init_user_ns, uid);
 			else
-				uid = kuid_into_mnt(mnt_userns, uid);
+				uid = mapped_kuid_fs(mnt_userns, &init_user_ns, uid);
 			entry->e_id = cpu_to_le32(from_kuid(to, uid));
 			break;
 		case ACL_GROUP:
 			gid = make_kgid(from, le32_to_cpu(entry->e_id));
 			if (from_user)
-				gid = kgid_from_mnt(mnt_userns, gid);
+				gid = mapped_kgid_user(mnt_userns, &init_user_ns, gid);
 			else
-				gid = kgid_into_mnt(mnt_userns, gid);
+				gid = mapped_kgid_fs(mnt_userns, &init_user_ns, gid);
 			entry->e_id = cpu_to_le32(from_kgid(to, gid));
 			break;
 		default:
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -419,7 +419,7 @@ int cap_inode_getsecurity(struct user_na
 	kroot = make_kuid(fs_ns, root);
 
 	/* If this is an idmapped mount shift the kuid. */
-	kroot = kuid_into_mnt(mnt_userns, kroot);
+	kroot = mapped_kuid_fs(mnt_userns, &init_user_ns, kroot);
 
 	/* If the root kuid maps to a valid uid in current ns, then return
 	 * this as a nscap. */
@@ -489,6 +489,7 @@ out_free:
  * @size:	size of @ivalue
  * @task_ns:	user namespace of the caller
  * @mnt_userns:	user namespace of the mount the inode was found from
+ * @fs_userns:	user namespace of the filesystem
  *
  * If the inode has been found through an idmapped mount the user namespace of
  * the vfsmount must be passed through @mnt_userns. This function will then
@@ -498,7 +499,8 @@ out_free:
  */
 static kuid_t rootid_from_xattr(const void *value, size_t size,
 				struct user_namespace *task_ns,
-				struct user_namespace *mnt_userns)
+				struct user_namespace *mnt_userns,
+				struct user_namespace *fs_userns)
 {
 	const struct vfs_ns_cap_data *nscap = value;
 	kuid_t rootkid;
@@ -508,7 +510,7 @@ static kuid_t rootid_from_xattr(const vo
 		rootid = le32_to_cpu(nscap->rootid);
 
 	rootkid = make_kuid(task_ns, rootid);
-	return kuid_from_mnt(mnt_userns, rootkid);
+	return mapped_kuid_user(mnt_userns, fs_userns, rootkid);
 }
 
 static bool validheader(size_t size, const struct vfs_cap_data *cap)
@@ -559,7 +561,8 @@ int cap_convert_nscap(struct user_namesp
 			/* user is privileged, just write the v2 */
 			return size;
 
-	rootid = rootid_from_xattr(*ivalue, size, task_ns, mnt_userns);
+	rootid = rootid_from_xattr(*ivalue, size, task_ns, mnt_userns,
+				   &init_user_ns);
 	if (!uid_valid(rootid))
 		return -EINVAL;
 
@@ -700,7 +703,7 @@ int get_vfs_caps_from_disk(struct user_n
 	/* Limit the caps to the mounter of the filesystem
 	 * or the more limited uid specified in the xattr.
 	 */
-	rootkuid = kuid_into_mnt(mnt_userns, rootkuid);
+	rootkuid = mapped_kuid_fs(mnt_userns, &init_user_ns, rootkuid);
 	if (!rootid_owns_currentns(rootkuid))
 		return -ENODATA;
 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 19/28] fs: remove unused low-level mapping helpers
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 18/28] fs: use low-level mapping helpers Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 20/28] fs: port higher-level " Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Christoph Hellwig,
	Al Viro, linux-fsdevel, Amir Goldstein, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit 02e4079913500f24ceb082d8d87d8665f044b298 upstream.

Now that we ported all places to use the new low-level mapping helpers
that are able to support filesystems mounted with an idmapping we can
remove the old low-level mapping helpers. With the removal of these old
helpers we also conclude the renaming of the mapping helpers we started
in commit a65e58e791a1 ("fs: document and rename fsid helpers").

Link: https://lore.kernel.org/r/20211123114227.3124056-8-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-8-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-8-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/mnt_idmapping.h |   56 ------------------------------------------
 1 file changed, 56 deletions(-)

--- a/include/linux/mnt_idmapping.h
+++ b/include/linux/mnt_idmapping.h
@@ -14,62 +14,6 @@ struct user_namespace;
 extern struct user_namespace init_user_ns;
 
 /**
- * kuid_into_mnt - map a kuid down into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- * @kuid: kuid to be mapped
- *
- * Return: @kuid mapped according to @mnt_userns.
- * If @kuid has no mapping INVALID_UID is returned.
- */
-static inline kuid_t kuid_into_mnt(struct user_namespace *mnt_userns,
-				   kuid_t kuid)
-{
-	return make_kuid(mnt_userns, __kuid_val(kuid));
-}
-
-/**
- * kgid_into_mnt - map a kgid down into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- * @kgid: kgid to be mapped
- *
- * Return: @kgid mapped according to @mnt_userns.
- * If @kgid has no mapping INVALID_GID is returned.
- */
-static inline kgid_t kgid_into_mnt(struct user_namespace *mnt_userns,
-				   kgid_t kgid)
-{
-	return make_kgid(mnt_userns, __kgid_val(kgid));
-}
-
-/**
- * kuid_from_mnt - map a kuid up into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- * @kuid: kuid to be mapped
- *
- * Return: @kuid mapped up according to @mnt_userns.
- * If @kuid has no mapping INVALID_UID is returned.
- */
-static inline kuid_t kuid_from_mnt(struct user_namespace *mnt_userns,
-				   kuid_t kuid)
-{
-	return KUIDT_INIT(from_kuid(mnt_userns, kuid));
-}
-
-/**
- * kgid_from_mnt - map a kgid up into a mnt_userns
- * @mnt_userns: user namespace of the relevant mount
- * @kgid: kgid to be mapped
- *
- * Return: @kgid mapped up according to @mnt_userns.
- * If @kgid has no mapping INVALID_GID is returned.
- */
-static inline kgid_t kgid_from_mnt(struct user_namespace *mnt_userns,
-				   kgid_t kgid)
-{
-	return KGIDT_INIT(from_kgid(mnt_userns, kgid));
-}
-
-/**
  * initial_idmapping - check whether this is the initial mapping
  * @ns: idmapping to check
  *



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 20/28] fs: port higher-level mapping helpers
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 19/28] fs: remove unused " Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 21/28] fs: add i_user_ns() helper Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Christoph Hellwig,
	Al Viro, linux-fsdevel, Amir Goldstein, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit 209188ce75d0d357c292f6bb81d712acdd4e7db7 upstream.

Enable the mapped_fs{g,u}id() helpers to support filesystems mounted
with an idmapping. Apart from core mapping helpers that use
mapped_fs{g,u}id() to initialize struct inode's i_{g,u}id fields xfs is
the only place that uses these low-level helpers directly.

The patch only extends the helpers to be able to take the filesystem
idmapping into account. Since we don't actually yet pass the
filesystem's idmapping in no functional changes happen. This will happen
in a final patch.

Link: https://lore.kernel.org/r/20211123114227.3124056-9-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-9-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-9-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_inode.c            |    8 ++++----
 fs/xfs/xfs_symlink.c          |    4 ++--
 include/linux/fs.h            |    8 ++++----
 include/linux/mnt_idmapping.h |   12 ++++++++----
 4 files changed, 18 insertions(+), 14 deletions(-)

--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -994,8 +994,8 @@ xfs_create(
 	/*
 	 * Make sure that we have allocated dquot(s) on disk.
 	 */
-	error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(mnt_userns),
-			mapped_fsgid(mnt_userns), prid,
+	error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(mnt_userns, &init_user_ns),
+			mapped_fsgid(mnt_userns, &init_user_ns), prid,
 			XFS_QMOPT_QUOTALL | XFS_QMOPT_INHERIT,
 			&udqp, &gdqp, &pdqp);
 	if (error)
@@ -1148,8 +1148,8 @@ xfs_create_tmpfile(
 	/*
 	 * Make sure that we have allocated dquot(s) on disk.
 	 */
-	error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(mnt_userns),
-			mapped_fsgid(mnt_userns), prid,
+	error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(mnt_userns, &init_user_ns),
+			mapped_fsgid(mnt_userns, &init_user_ns), prid,
 			XFS_QMOPT_QUOTALL | XFS_QMOPT_INHERIT,
 			&udqp, &gdqp, &pdqp);
 	if (error)
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -184,8 +184,8 @@ xfs_symlink(
 	/*
 	 * Make sure that we have allocated dquot(s) on disk.
 	 */
-	error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(mnt_userns),
-			mapped_fsgid(mnt_userns), prid,
+	error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(mnt_userns, &init_user_ns),
+			mapped_fsgid(mnt_userns, &init_user_ns), prid,
 			XFS_QMOPT_QUOTALL | XFS_QMOPT_INHERIT,
 			&udqp, &gdqp, &pdqp);
 	if (error)
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1666,7 +1666,7 @@ static inline kgid_t i_gid_into_mnt(stru
 static inline void inode_fsuid_set(struct inode *inode,
 				   struct user_namespace *mnt_userns)
 {
-	inode->i_uid = mapped_fsuid(mnt_userns);
+	inode->i_uid = mapped_fsuid(mnt_userns, &init_user_ns);
 }
 
 /**
@@ -1680,7 +1680,7 @@ static inline void inode_fsuid_set(struc
 static inline void inode_fsgid_set(struct inode *inode,
 				   struct user_namespace *mnt_userns)
 {
-	inode->i_gid = mapped_fsgid(mnt_userns);
+	inode->i_gid = mapped_fsgid(mnt_userns, &init_user_ns);
 }
 
 /**
@@ -1701,10 +1701,10 @@ static inline bool fsuidgid_has_mapping(
 	kuid_t kuid;
 	kgid_t kgid;
 
-	kuid = mapped_fsuid(mnt_userns);
+	kuid = mapped_fsuid(mnt_userns, &init_user_ns);
 	if (!uid_valid(kuid))
 		return false;
-	kgid = mapped_fsgid(mnt_userns);
+	kgid = mapped_fsgid(mnt_userns, &init_user_ns);
 	if (!gid_valid(kgid))
 		return false;
 	return kuid_has_mapping(fs_userns, kuid) &&
--- a/include/linux/mnt_idmapping.h
+++ b/include/linux/mnt_idmapping.h
@@ -196,6 +196,7 @@ static inline kgid_t mapped_kgid_user(st
 /**
  * mapped_fsuid - return caller's fsuid mapped up into a mnt_userns
  * @mnt_userns: the mount's idmapping
+ * @fs_userns: the filesystem's idmapping
  *
  * Use this helper to initialize a new vfs or filesystem object based on
  * the caller's fsuid. A common example is initializing the i_uid field of
@@ -205,14 +206,16 @@ static inline kgid_t mapped_kgid_user(st
  *
  * Return: the caller's current fsuid mapped up according to @mnt_userns.
  */
-static inline kuid_t mapped_fsuid(struct user_namespace *mnt_userns)
+static inline kuid_t mapped_fsuid(struct user_namespace *mnt_userns,
+				  struct user_namespace *fs_userns)
 {
-	return mapped_kuid_user(mnt_userns, &init_user_ns, current_fsuid());
+	return mapped_kuid_user(mnt_userns, fs_userns, current_fsuid());
 }
 
 /**
  * mapped_fsgid - return caller's fsgid mapped up into a mnt_userns
  * @mnt_userns: the mount's idmapping
+ * @fs_userns: the filesystem's idmapping
  *
  * Use this helper to initialize a new vfs or filesystem object based on
  * the caller's fsgid. A common example is initializing the i_gid field of
@@ -222,9 +225,10 @@ static inline kuid_t mapped_fsuid(struct
  *
  * Return: the caller's current fsgid mapped up according to @mnt_userns.
  */
-static inline kgid_t mapped_fsgid(struct user_namespace *mnt_userns)
+static inline kgid_t mapped_fsgid(struct user_namespace *mnt_userns,
+				  struct user_namespace *fs_userns)
 {
-	return mapped_kgid_user(mnt_userns, &init_user_ns, current_fsgid());
+	return mapped_kgid_user(mnt_userns, fs_userns, current_fsgid());
 }
 
 #endif /* _LINUX_MNT_IDMAPPING_H */



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 21/28] fs: add i_user_ns() helper
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 20/28] fs: port higher-level " Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 22/28] fs: support mapped mounts of mapped filesystems Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Christoph Hellwig,
	Al Viro, linux-fsdevel, Amir Goldstein, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit a1ec9040a2a9122605ac26e5725c6de019184419 upstream.

Since we'll be passing the filesystem's idmapping in even more places in
the following patches and we do already dereference struct inode to get
to the filesystem's idmapping multiple times add a tiny helper.

Link: https://lore.kernel.org/r/20211123114227.3124056-10-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-10-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-10-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/fs.h |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1602,6 +1602,11 @@ struct super_block {
 	struct list_head	s_inodes_wb;	/* writeback inodes */
 } __randomize_layout;
 
+static inline struct user_namespace *i_user_ns(const struct inode *inode)
+{
+	return inode->i_sb->s_user_ns;
+}
+
 /* Helper functions so that in most cases filesystems will
  * not need to deal directly with kuid_t and kgid_t and can
  * instead deal with the raw numeric values that are stored
@@ -1609,22 +1614,22 @@ struct super_block {
  */
 static inline uid_t i_uid_read(const struct inode *inode)
 {
-	return from_kuid(inode->i_sb->s_user_ns, inode->i_uid);
+	return from_kuid(i_user_ns(inode), inode->i_uid);
 }
 
 static inline gid_t i_gid_read(const struct inode *inode)
 {
-	return from_kgid(inode->i_sb->s_user_ns, inode->i_gid);
+	return from_kgid(i_user_ns(inode), inode->i_gid);
 }
 
 static inline void i_uid_write(struct inode *inode, uid_t uid)
 {
-	inode->i_uid = make_kuid(inode->i_sb->s_user_ns, uid);
+	inode->i_uid = make_kuid(i_user_ns(inode), uid);
 }
 
 static inline void i_gid_write(struct inode *inode, gid_t gid)
 {
-	inode->i_gid = make_kgid(inode->i_sb->s_user_ns, gid);
+	inode->i_gid = make_kgid(i_user_ns(inode), gid);
 }
 
 /**



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 22/28] fs: support mapped mounts of mapped filesystems
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 21/28] fs: add i_user_ns() helper Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 23/28] fs: fix acl translation Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Amir Goldstein,
	Christoph Hellwig, Al Viro, linux-fsdevel, Christian Brauner,
	Christian Brauner (Microsoft)

From: Christian Brauner <christian.brauner@ubuntu.com>

commit bd303368b776eead1c29e6cdda82bde7128b82a7 upstream.

In previous patches we added new and modified existing helpers to handle
idmapped mounts of filesystems mounted with an idmapping. In this final
patch we convert all relevant places in the vfs to actually pass the
filesystem's idmapping into these helpers.

With this the vfs is in shape to handle idmapped mounts of filesystems
mounted with an idmapping. Note that this is just the generic
infrastructure. Actually adding support for idmapped mounts to a
filesystem mountable with an idmapping is follow-up work.

In this patch we extend the definition of an idmapped mount from a mount
that that has the initial idmapping attached to it to a mount that has
an idmapping attached to it which is not the same as the idmapping the
filesystem was mounted with.

As before we do not allow the initial idmapping to be attached to a
mount. In addition this patch prevents that the idmapping the filesystem
was mounted with can be attached to a mount created based on this
filesystem.

This has multiple reasons and advantages. First, attaching the initial
idmapping or the filesystem's idmapping doesn't make much sense as in
both cases the values of the i_{g,u}id and other places where k{g,u}ids
are used do not change. Second, a user that really wants to do this for
whatever reason can just create a separate dedicated identical idmapping
to attach to the mount. Third, we can continue to use the initial
idmapping as an indicator that a mount is not idmapped allowing us to
continue to keep passing the initial idmapping into the mapping helpers
to tell them that something isn't an idmapped mount even if the
filesystem is mounted with an idmapping.

Link: https://lore.kernel.org/r/20211123114227.3124056-11-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-11-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-11-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/namespace.c       |   51 ++++++++++++++++++++++++++++++++++++++-------------
 fs/open.c            |    7 ++++---
 fs/posix_acl.c       |    8 ++++----
 include/linux/fs.h   |   17 +++++++++--------
 security/commoncap.c |    9 ++++-----
 5 files changed, 59 insertions(+), 33 deletions(-)

--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -31,6 +31,7 @@
 #include <uapi/linux/mount.h>
 #include <linux/fs_context.h>
 #include <linux/shmem_fs.h>
+#include <linux/mnt_idmapping.h>
 
 #include "pnode.h"
 #include "internal.h"
@@ -561,7 +562,7 @@ static void free_vfsmnt(struct mount *mn
 	struct user_namespace *mnt_userns;
 
 	mnt_userns = mnt_user_ns(&mnt->mnt);
-	if (mnt_userns != &init_user_ns)
+	if (!initial_idmapping(mnt_userns))
 		put_user_ns(mnt_userns);
 	kfree_const(mnt->mnt_devname);
 #ifdef CONFIG_SMP
@@ -965,6 +966,7 @@ static struct mount *skip_mnt_tree(struc
 struct vfsmount *vfs_create_mount(struct fs_context *fc)
 {
 	struct mount *mnt;
+	struct user_namespace *fs_userns;
 
 	if (!fc->root)
 		return ERR_PTR(-EINVAL);
@@ -982,6 +984,10 @@ struct vfsmount *vfs_create_mount(struct
 	mnt->mnt_mountpoint	= mnt->mnt.mnt_root;
 	mnt->mnt_parent		= mnt;
 
+	fs_userns = mnt->mnt.mnt_sb->s_user_ns;
+	if (!initial_idmapping(fs_userns))
+		mnt->mnt.mnt_userns = get_user_ns(fs_userns);
+
 	lock_mount_hash();
 	list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts);
 	unlock_mount_hash();
@@ -1072,7 +1078,7 @@ static struct mount *clone_mnt(struct mo
 
 	atomic_inc(&sb->s_active);
 	mnt->mnt.mnt_userns = mnt_user_ns(&old->mnt);
-	if (mnt->mnt.mnt_userns != &init_user_ns)
+	if (!initial_idmapping(mnt->mnt.mnt_userns))
 		mnt->mnt.mnt_userns = get_user_ns(mnt->mnt.mnt_userns);
 	mnt->mnt.mnt_sb = sb;
 	mnt->mnt.mnt_root = dget(root);
@@ -3927,11 +3933,19 @@ static unsigned int recalc_flags(struct
 static int can_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
 {
 	struct vfsmount *m = &mnt->mnt;
+	struct user_namespace *fs_userns = m->mnt_sb->s_user_ns;
 
 	if (!kattr->mnt_userns)
 		return 0;
 
 	/*
+	 * Creating an idmapped mount with the filesystem wide idmapping
+	 * doesn't make sense so block that. We don't allow mushy semantics.
+	 */
+	if (kattr->mnt_userns == fs_userns)
+		return -EINVAL;
+
+	/*
 	 * Once a mount has been idmapped we don't allow it to change its
 	 * mapping. It makes things simpler and callers can just create
 	 * another bind-mount they can idmap if they want to.
@@ -3943,12 +3957,8 @@ static int can_idmap_mount(const struct
 	if (!(m->mnt_sb->s_type->fs_flags & FS_ALLOW_IDMAP))
 		return -EINVAL;
 
-	/* Don't yet support filesystem mountable in user namespaces. */
-	if (m->mnt_sb->s_user_ns != &init_user_ns)
-		return -EINVAL;
-
 	/* We're not controlling the superblock. */
-	if (!capable(CAP_SYS_ADMIN))
+	if (!ns_capable(fs_userns, CAP_SYS_ADMIN))
 		return -EPERM;
 
 	/* Mount has already been visible in the filesystem hierarchy. */
@@ -4002,14 +4012,27 @@ out:
 
 static void do_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
 {
-	struct user_namespace *mnt_userns;
+	struct user_namespace *mnt_userns, *old_mnt_userns;
 
 	if (!kattr->mnt_userns)
 		return;
 
+	/*
+	 * We're the only ones able to change the mount's idmapping. So
+	 * mnt->mnt.mnt_userns is stable and we can retrieve it directly.
+	 */
+	old_mnt_userns = mnt->mnt.mnt_userns;
+
 	mnt_userns = get_user_ns(kattr->mnt_userns);
 	/* Pairs with smp_load_acquire() in mnt_user_ns(). */
 	smp_store_release(&mnt->mnt.mnt_userns, mnt_userns);
+
+	/*
+	 * If this is an idmapped filesystem drop the reference we've taken
+	 * in vfs_create_mount() before.
+	 */
+	if (!initial_idmapping(old_mnt_userns))
+		put_user_ns(old_mnt_userns);
 }
 
 static void mount_setattr_commit(struct mount_kattr *kattr,
@@ -4133,13 +4156,15 @@ static int build_mount_idmapped(const st
 	}
 
 	/*
-	 * The init_user_ns is used to indicate that a vfsmount is not idmapped.
-	 * This is simpler than just having to treat NULL as unmapped. Users
-	 * wanting to idmap a mount to init_user_ns can just use a namespace
-	 * with an identity mapping.
+	 * The initial idmapping cannot be used to create an idmapped
+	 * mount. We use the initial idmapping as an indicator of a mount
+	 * that is not idmapped. It can simply be passed into helpers that
+	 * are aware of idmapped mounts as a convenient shortcut. A user
+	 * can just create a dedicated identity mapping to achieve the same
+	 * result.
 	 */
 	mnt_userns = container_of(ns, struct user_namespace, ns);
-	if (mnt_userns == &init_user_ns) {
+	if (initial_idmapping(mnt_userns)) {
 		err = -EPERM;
 		goto out_fput;
 	}
--- a/fs/open.c
+++ b/fs/open.c
@@ -641,7 +641,7 @@ SYSCALL_DEFINE2(chmod, const char __user
 
 int chown_common(const struct path *path, uid_t user, gid_t group)
 {
-	struct user_namespace *mnt_userns;
+	struct user_namespace *mnt_userns, *fs_userns;
 	struct inode *inode = path->dentry->d_inode;
 	struct inode *delegated_inode = NULL;
 	int error;
@@ -653,8 +653,9 @@ int chown_common(const struct path *path
 	gid = make_kgid(current_user_ns(), group);
 
 	mnt_userns = mnt_user_ns(path->mnt);
-	uid = mapped_kuid_user(mnt_userns, &init_user_ns, uid);
-	gid = mapped_kgid_user(mnt_userns, &init_user_ns, gid);
+	fs_userns = i_user_ns(inode);
+	uid = mapped_kuid_user(mnt_userns, fs_userns, uid);
+	gid = mapped_kgid_user(mnt_userns, fs_userns, gid);
 
 retry_deleg:
 	newattrs.ia_valid =  ATTR_CTIME;
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -377,8 +377,8 @@ posix_acl_permission(struct user_namespa
                                 break;
                         case ACL_USER:
 				uid = mapped_kuid_fs(mnt_userns,
-						      &init_user_ns,
-						      pa->e_uid);
+						     i_user_ns(inode),
+						     pa->e_uid);
 				if (uid_eq(uid, current_fsuid()))
                                         goto mask;
 				break;
@@ -392,8 +392,8 @@ posix_acl_permission(struct user_namespa
 				break;
                         case ACL_GROUP:
 				gid = mapped_kgid_fs(mnt_userns,
-						      &init_user_ns,
-						      pa->e_gid);
+						     i_user_ns(inode),
+						     pa->e_gid);
 				if (in_group_p(gid)) {
 					found = 1;
 					if ((pa->e_perm & want) == want)
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1643,7 +1643,7 @@ static inline void i_gid_write(struct in
 static inline kuid_t i_uid_into_mnt(struct user_namespace *mnt_userns,
 				    const struct inode *inode)
 {
-	return mapped_kuid_fs(mnt_userns, &init_user_ns, inode->i_uid);
+	return mapped_kuid_fs(mnt_userns, i_user_ns(inode), inode->i_uid);
 }
 
 /**
@@ -1657,7 +1657,7 @@ static inline kuid_t i_uid_into_mnt(stru
 static inline kgid_t i_gid_into_mnt(struct user_namespace *mnt_userns,
 				    const struct inode *inode)
 {
-	return mapped_kgid_fs(mnt_userns, &init_user_ns, inode->i_gid);
+	return mapped_kgid_fs(mnt_userns, i_user_ns(inode), inode->i_gid);
 }
 
 /**
@@ -1671,7 +1671,7 @@ static inline kgid_t i_gid_into_mnt(stru
 static inline void inode_fsuid_set(struct inode *inode,
 				   struct user_namespace *mnt_userns)
 {
-	inode->i_uid = mapped_fsuid(mnt_userns, &init_user_ns);
+	inode->i_uid = mapped_fsuid(mnt_userns, i_user_ns(inode));
 }
 
 /**
@@ -1685,7 +1685,7 @@ static inline void inode_fsuid_set(struc
 static inline void inode_fsgid_set(struct inode *inode,
 				   struct user_namespace *mnt_userns)
 {
-	inode->i_gid = mapped_fsgid(mnt_userns, &init_user_ns);
+	inode->i_gid = mapped_fsgid(mnt_userns, i_user_ns(inode));
 }
 
 /**
@@ -1706,10 +1706,10 @@ static inline bool fsuidgid_has_mapping(
 	kuid_t kuid;
 	kgid_t kgid;
 
-	kuid = mapped_fsuid(mnt_userns, &init_user_ns);
+	kuid = mapped_fsuid(mnt_userns, fs_userns);
 	if (!uid_valid(kuid))
 		return false;
-	kgid = mapped_fsgid(mnt_userns, &init_user_ns);
+	kgid = mapped_fsgid(mnt_userns, fs_userns);
 	if (!gid_valid(kgid))
 		return false;
 	return kuid_has_mapping(fs_userns, kuid) &&
@@ -2655,13 +2655,14 @@ static inline struct user_namespace *fil
  * is_idmapped_mnt - check whether a mount is mapped
  * @mnt: the mount to check
  *
- * If @mnt has an idmapping attached to it @mnt is mapped.
+ * If @mnt has an idmapping attached different from the
+ * filesystem's idmapping then @mnt is mapped.
  *
  * Return: true if mount is mapped, false if not.
  */
 static inline bool is_idmapped_mnt(const struct vfsmount *mnt)
 {
-	return mnt_user_ns(mnt) != &init_user_ns;
+	return mnt_user_ns(mnt) != mnt->mnt_sb->s_user_ns;
 }
 
 extern long vfs_truncate(const struct path *, loff_t);
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -419,7 +419,7 @@ int cap_inode_getsecurity(struct user_na
 	kroot = make_kuid(fs_ns, root);
 
 	/* If this is an idmapped mount shift the kuid. */
-	kroot = mapped_kuid_fs(mnt_userns, &init_user_ns, kroot);
+	kroot = mapped_kuid_fs(mnt_userns, fs_ns, kroot);
 
 	/* If the root kuid maps to a valid uid in current ns, then return
 	 * this as a nscap. */
@@ -556,13 +556,12 @@ int cap_convert_nscap(struct user_namesp
 		return -EINVAL;
 	if (!capable_wrt_inode_uidgid(mnt_userns, inode, CAP_SETFCAP))
 		return -EPERM;
-	if (size == XATTR_CAPS_SZ_2 && (mnt_userns == &init_user_ns))
+	if (size == XATTR_CAPS_SZ_2 && (mnt_userns == fs_ns))
 		if (ns_capable(inode->i_sb->s_user_ns, CAP_SETFCAP))
 			/* user is privileged, just write the v2 */
 			return size;
 
-	rootid = rootid_from_xattr(*ivalue, size, task_ns, mnt_userns,
-				   &init_user_ns);
+	rootid = rootid_from_xattr(*ivalue, size, task_ns, mnt_userns, fs_ns);
 	if (!uid_valid(rootid))
 		return -EINVAL;
 
@@ -703,7 +702,7 @@ int get_vfs_caps_from_disk(struct user_n
 	/* Limit the caps to the mounter of the filesystem
 	 * or the more limited uid specified in the xattr.
 	 */
-	rootkuid = mapped_kuid_fs(mnt_userns, &init_user_ns, rootkuid);
+	rootkuid = mapped_kuid_fs(mnt_userns, fs_ns, rootkuid);
 	if (!rootid_owns_currentns(rootkuid))
 		return -ENODATA;
 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 23/28] fs: fix acl translation
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 22/28] fs: support mapped mounts of mapped filesystems Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 24/28] fs: account for group membership Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Christoph Hellwig,
	regressions, Christian Brauner (Microsoft),
	Linus Torvalds

From: Christian Brauner <brauner@kernel.org>

commit 705191b03d507744c7e097f78d583621c14988ac upstream.

Last cycle we extended the idmapped mounts infrastructure to support
idmapped mounts of idmapped filesystems (No such filesystem yet exist.).
Since then, the meaning of an idmapped mount is a mount whose idmapping
is different from the filesystems idmapping.

While doing that work we missed to adapt the acl translation helpers.
They still assume that checking for the identity mapping is enough.  But
they need to use the no_idmapping() helper instead.

Note, POSIX ACLs are always translated right at the userspace-kernel
boundary using the caller's current idmapping and the initial idmapping.
The order depends on whether we're coming from or going to userspace.
The filesystem's idmapping doesn't matter at the border.

Consequently, if a non-idmapped mount is passed we need to make sure to
always pass the initial idmapping as the mount's idmapping and not the
filesystem idmapping.  Since it's irrelevant here it would yield invalid
ids and prevent setting acls for filesystems that are mountable in a
userns and support posix acls (tmpfs and fuse).

I verified the regression reported in [1] and verified that this patch
fixes it.  A regression test will be added to xfstests in parallel.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215849 [1]
Fixes: bd303368b776 ("fs: support mapped mounts of mapped filesystems")
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org> # 5.15+
Cc: <regressions@lists.linux.dev>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/posix_acl.c                  |   10 ++++++++++
 fs/xattr.c                      |    6 ++++--
 include/linux/posix_acl_xattr.h |    4 ++++
 3 files changed, 18 insertions(+), 2 deletions(-)

--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -760,9 +760,14 @@ static void posix_acl_fix_xattr_userns(
 }
 
 void posix_acl_fix_xattr_from_user(struct user_namespace *mnt_userns,
+				   struct inode *inode,
 				   void *value, size_t size)
 {
 	struct user_namespace *user_ns = current_user_ns();
+
+	/* Leave ids untouched on non-idmapped mounts. */
+	if (no_idmapping(mnt_userns, i_user_ns(inode)))
+		mnt_userns = &init_user_ns;
 	if ((user_ns == &init_user_ns) && (mnt_userns == &init_user_ns))
 		return;
 	posix_acl_fix_xattr_userns(&init_user_ns, user_ns, mnt_userns, value,
@@ -770,9 +775,14 @@ void posix_acl_fix_xattr_from_user(struc
 }
 
 void posix_acl_fix_xattr_to_user(struct user_namespace *mnt_userns,
+				 struct inode *inode,
 				 void *value, size_t size)
 {
 	struct user_namespace *user_ns = current_user_ns();
+
+	/* Leave ids untouched on non-idmapped mounts. */
+	if (no_idmapping(mnt_userns, i_user_ns(inode)))
+		mnt_userns = &init_user_ns;
 	if ((user_ns == &init_user_ns) && (mnt_userns == &init_user_ns))
 		return;
 	posix_acl_fix_xattr_userns(user_ns, &init_user_ns, mnt_userns, value,
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -569,7 +569,8 @@ setxattr(struct user_namespace *mnt_user
 		}
 		if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
 		    (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
-			posix_acl_fix_xattr_from_user(mnt_userns, kvalue, size);
+			posix_acl_fix_xattr_from_user(mnt_userns, d_inode(d),
+						      kvalue, size);
 	}
 
 	error = vfs_setxattr(mnt_userns, d, kname, kvalue, size, flags);
@@ -667,7 +668,8 @@ getxattr(struct user_namespace *mnt_user
 	if (error > 0) {
 		if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
 		    (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
-			posix_acl_fix_xattr_to_user(mnt_userns, kvalue, error);
+			posix_acl_fix_xattr_to_user(mnt_userns, d_inode(d),
+						    kvalue, error);
 		if (size && copy_to_user(value, kvalue, error))
 			error = -EFAULT;
 	} else if (error == -ERANGE && size >= XATTR_SIZE_MAX) {
--- a/include/linux/posix_acl_xattr.h
+++ b/include/linux/posix_acl_xattr.h
@@ -34,15 +34,19 @@ posix_acl_xattr_count(size_t size)
 
 #ifdef CONFIG_FS_POSIX_ACL
 void posix_acl_fix_xattr_from_user(struct user_namespace *mnt_userns,
+				   struct inode *inode,
 				   void *value, size_t size);
 void posix_acl_fix_xattr_to_user(struct user_namespace *mnt_userns,
+				   struct inode *inode,
 				 void *value, size_t size);
 #else
 static inline void posix_acl_fix_xattr_from_user(struct user_namespace *mnt_userns,
+						 struct inode *inode,
 						 void *value, size_t size)
 {
 }
 static inline void posix_acl_fix_xattr_to_user(struct user_namespace *mnt_userns,
+					       struct inode *inode,
 					       void *value, size_t size)
 {
 }



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 24/28] fs: account for group membership
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 23/28] fs: fix acl translation Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 25/28] rtw88: 8821c: support RFE type4 wifi NIC Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee, Christoph Hellwig,
	Aleksa Sarai, Al Viro, linux-fsdevel,
	Christian Brauner (Microsoft)

From: Christian Brauner <brauner@kernel.org>

commit 168f912893407a5acb798a4a58613b5f1f98c717 upstream.

When calling setattr_prepare() to determine the validity of the
attributes the ia_{g,u}id fields contain the value that will be written
to inode->i_{g,u}id. This is exactly the same for idmapped and
non-idmapped mounts and allows callers to pass in the values they want
to see written to inode->i_{g,u}id.

When group ownership is changed a caller whose fsuid owns the inode can
change the group of the inode to any group they are a member of. When
searching through the caller's groups we need to use the gid mapped
according to the idmapped mount otherwise we will fail to change
ownership for unprivileged users.

Consider a caller running with fsuid and fsgid 1000 using an idmapped
mount that maps id 65534 to 1000 and 65535 to 1001. Consequently, a file
owned by 65534:65535 in the filesystem will be owned by 1000:1001 in the
idmapped mount.

The caller now requests the gid of the file to be changed to 1000 going
through the idmapped mount. In the vfs we will immediately map the
requested gid to the value that will need to be written to inode->i_gid
and place it in attr->ia_gid. Since this idmapped mount maps 65534 to
1000 we place 65534 in attr->ia_gid.

When we check whether the caller is allowed to change group ownership we
first validate that their fsuid matches the inode's uid. The
inode->i_uid is 65534 which is mapped to uid 1000 in the idmapped mount.
Since the caller's fsuid is 1000 we pass the check.

We now check whether the caller is allowed to change inode->i_gid to the
requested gid by calling in_group_p(). This will compare the passed in
gid to the caller's fsgid and search the caller's additional groups.

Since we're dealing with an idmapped mount we need to pass in the gid
mapped according to the idmapped mount. This is akin to checking whether
a caller is privileged over the future group the inode is owned by. And
that needs to take the idmapped mount into account. Note, all helpers
are nops without idmapped mounts.

New regression test sent to xfstests.

Link: https://github.com/lxc/lxd/issues/10537
Link: https://lore.kernel.org/r/20220613111517.2186646-1-brauner@kernel.org
Fixes: 2f221d6f7b88 ("attr: handle idmapped mounts")
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org # 5.15+
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/attr.c |   26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

--- a/fs/attr.c
+++ b/fs/attr.c
@@ -61,9 +61,15 @@ static bool chgrp_ok(struct user_namespa
 		     const struct inode *inode, kgid_t gid)
 {
 	kgid_t kgid = i_gid_into_mnt(mnt_userns, inode);
-	if (uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) &&
-	    (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
-		return true;
+	if (uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode))) {
+		kgid_t mapped_gid;
+
+		if (gid_eq(gid, inode->i_gid))
+			return true;
+		mapped_gid = mapped_kgid_fs(mnt_userns, i_user_ns(inode), gid);
+		if (in_group_p(mapped_gid))
+			return true;
+	}
 	if (capable_wrt_inode_uidgid(mnt_userns, inode, CAP_CHOWN))
 		return true;
 	if (gid_eq(kgid, INVALID_GID) &&
@@ -123,12 +129,20 @@ int setattr_prepare(struct user_namespac
 
 	/* Make sure a caller can chmod. */
 	if (ia_valid & ATTR_MODE) {
+		kgid_t mapped_gid;
+
 		if (!inode_owner_or_capable(mnt_userns, inode))
 			return -EPERM;
+
+		if (ia_valid & ATTR_GID)
+			mapped_gid = mapped_kgid_fs(mnt_userns,
+						i_user_ns(inode), attr->ia_gid);
+		else
+			mapped_gid = i_gid_into_mnt(mnt_userns, inode);
+
 		/* Also check the setgid bit! */
-               if (!in_group_p((ia_valid & ATTR_GID) ? attr->ia_gid :
-                                i_gid_into_mnt(mnt_userns, inode)) &&
-                    !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
+		if (!in_group_p(mapped_gid) &&
+		    !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
 			attr->ia_mode &= ~S_ISGID;
 	}
 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 25/28] rtw88: 8821c: support RFE type4 wifi NIC
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 24/28] fs: account for group membership Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 26/28] rtw88: rtw8821c: enable rfe 6 devices Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Guo-Feng Fan, Ping-Ke Shih,
	Kalle Valo, Meng Tang

From: Guo-Feng Fan <vincent_fann@realtek.com>

commit b789e3fe7047296be0ccdbb7ceb0b58856053572 upstream.

RFE type4 is a new NIC which has one RF antenna shares with BT.
RFE type4 HW is the same as RFE type2 but attaching antenna to
aux antenna connector.

RFE type2 attach antenna to main antenna connector.
Load the same parameter as RFE type2 when initializing NIC.

Signed-off-by: Guo-Feng Fan <vincent_fann@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210922023637.9357-1-pkshih@realtek.com
Signed-off-by: Meng Tang <tangmeng@uniontech.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/wireless/realtek/rtw88/rtw8821c.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/drivers/net/wireless/realtek/rtw88/rtw8821c.c
+++ b/drivers/net/wireless/realtek/rtw88/rtw8821c.c
@@ -304,7 +304,8 @@ static void rtw8821c_set_channel_rf(stru
 	if (channel <= 14) {
 		if (rtwdev->efuse.rfe_option == 0)
 			rtw8821c_switch_rf_set(rtwdev, SWITCH_TO_WLG);
-		else if (rtwdev->efuse.rfe_option == 2)
+		else if (rtwdev->efuse.rfe_option == 2 ||
+			 rtwdev->efuse.rfe_option == 4)
 			rtw8821c_switch_rf_set(rtwdev, SWITCH_TO_BTG);
 		rtw_write_rf(rtwdev, RF_PATH_A, RF_LUTDBG, BIT(6), 0x1);
 		rtw_write_rf(rtwdev, RF_PATH_A, 0x64, 0xf, 0xf);
@@ -777,6 +778,15 @@ static void rtw8821c_coex_cfg_ant_switch
 	if (switch_status == coex_dm->cur_switch_status)
 		return;
 
+	if (coex_rfe->wlg_at_btg) {
+		ctrl_type = COEX_SWITCH_CTRL_BY_BBSW;
+
+		if (coex_rfe->ant_switch_polarity)
+			pos_type = COEX_SWITCH_TO_WLA;
+		else
+			pos_type = COEX_SWITCH_TO_WLG_BT;
+	}
+
 	coex_dm->cur_switch_status = switch_status;
 
 	if (coex_rfe->ant_switch_diversity &&
@@ -1502,6 +1512,7 @@ static const struct rtw_intf_phy_para_ta
 static const struct rtw_rfe_def rtw8821c_rfe_defs[] = {
 	[0] = RTW_DEF_RFE(8821c, 0, 0),
 	[2] = RTW_DEF_RFE_EXT(8821c, 0, 0, 2),
+	[4] = RTW_DEF_RFE_EXT(8821c, 0, 0, 2),
 };
 
 static struct rtw_hw_reg rtw8821c_dig[] = {



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 26/28] rtw88: rtw8821c: enable rfe 6 devices
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 25/28] rtw88: 8821c: support RFE type4 wifi NIC Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 27/28] net: mscc: ocelot: allow unregistered IP multicast flooding to CPU Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ping-Ke Shih, Larry Finger,
	Kalle Valo, Meng Tang, masterzorag

From: Ping-Ke Shih <pkshih@realtek.com>

commit e109e3617e5d563b431a52e6e2f07f0fc65a93ae upstream.

Ping-Ke Shih answered[1] a question for a user about an rtl8821ce device that
reported RFE 6, which the driver did not support. Ping-Ke suggested a possible
fix, but the user never reported back.

A second user discovered the above thread and tested the proposed fix.
Accordingly, I am pushing this change, even though I am not the author.

[1] https://lore.kernel.org/linux-wireless/3f5e2f6eac344316b5dd518ebfea2f95@realtek.com/

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reported-and-tested-by: masterzorag <masterzorag@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220107024739.20967-1-Larry.Finger@lwfinger.net
Signed-off-by: Meng Tang <tangmeng@uniontech.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/wireless/realtek/rtw88/rtw8821c.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/net/wireless/realtek/rtw88/rtw8821c.c
+++ b/drivers/net/wireless/realtek/rtw88/rtw8821c.c
@@ -1513,6 +1513,7 @@ static const struct rtw_rfe_def rtw8821c
 	[0] = RTW_DEF_RFE(8821c, 0, 0),
 	[2] = RTW_DEF_RFE_EXT(8821c, 0, 0, 2),
 	[4] = RTW_DEF_RFE_EXT(8821c, 0, 0, 2),
+	[6] = RTW_DEF_RFE(8821c, 0, 0),
 };
 
 static struct rtw_hw_reg rtw8821c_dig[] = {



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 27/28] net: mscc: ocelot: allow unregistered IP multicast flooding to CPU
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 26/28] rtw88: rtw8821c: enable rfe 6 devices Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 13:47 ` [PATCH 5.15 28/28] io_uring: fix not locked access to fixed buf table Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, stable, Vladimir Oltean

From: Vladimir Oltean <vladimir.oltean@nxp.com>

Since commit 4cf35a2b627a ("net: mscc: ocelot: fix broken IP multicast
flooding") from v5.12, unregistered IP multicast flooding is
configurable in the ocelot driver for bridged ports. However, by writing
0 to the PGID_MCIPV4 and PGID_MCIPV6 port masks at initialization time,
the CPU port module, for which ocelot_port_set_mcast_flood() is not
called, will have unknown IP multicast flooding disabled.

This makes it impossible for an application such as smcroute to work
properly, since all IP multicast traffic received on a standalone port
is treated as unregistered (and dropped).

Starting with commit 7569459a52c9 ("net: dsa: manage flooding on the CPU
ports"), the limitation above has been lifted, because when standalone
ports become IFF_PROMISC or IFF_ALLMULTI, ocelot_port_set_mcast_flood()
would be called on the CPU port module, so unregistered multicast is
flooded to the CPU on an as-needed basis.

But between v5.12 and v5.18, IP multicast flooding to the CPU has
remained broken, promiscuous or not.

Delete the inexplicable premature optimization of clearing PGID_MCIPV4
and PGID_MCIPV6 as part of the init sequence, and allow unregistered IP
multicast to be flooded freely to the CPU port module.

Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Cc: stable@kernel.org
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mscc/ocelot.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -2208,9 +2208,13 @@ int ocelot_init(struct ocelot *ocelot)
 		       ANA_PGID_PGID, PGID_MC);
 	ocelot_rmw_rix(ocelot, ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)),
 		       ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)),
+		       ANA_PGID_PGID, PGID_MCIPV4);
+	ocelot_rmw_rix(ocelot, ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)),
+		       ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)),
+		       ANA_PGID_PGID, PGID_MCIPV6);
+	ocelot_rmw_rix(ocelot, ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)),
+		       ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)),
 		       ANA_PGID_PGID, PGID_BC);
-	ocelot_write_rix(ocelot, 0, ANA_PGID_PGID, PGID_MCIPV4);
-	ocelot_write_rix(ocelot, 0, ANA_PGID_PGID, PGID_MCIPV6);
 
 	/* Allow manual injection via DEVCPU_QS registers, and byte swap these
 	 * registers endianness.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5.15 28/28] io_uring: fix not locked access to fixed buf table
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 27/28] net: mscc: ocelot: allow unregistered IP multicast flooding to CPU Greg Kroah-Hartman
@ 2022-06-30 13:47 ` Greg Kroah-Hartman
  2022-06-30 23:17 ` [PATCH 5.15 00/28] 5.15.52-rc1 review Shuah Khan
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-30 13:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pavel Begunkov

From: Pavel Begunkov <asml.silence@gmail.com>

commit 05b538c1765f8d14a71ccf5f85258dcbeaf189f7 upstream.

We can look inside the fixed buffer table only while holding
->uring_lock, however in some cases we don't do the right async prep for
IORING_OP_{WRITE,READ}_FIXED ending up with NULL req->imu forcing making
an io-wq worker to try to resolve the fixed buffer without proper
locking.

Move req->imu setup into early req init paths, i.e. io_prep_rw(), which
is called unconditionally for rw requests and under uring_lock.

Fixes: 634d00df5e1cf ("io_uring: add full-fledged dynamic buffers support")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/io_uring.c |   28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2932,15 +2932,24 @@ static int io_prep_rw(struct io_kiocb *r
 		kiocb->ki_complete = io_complete_rw;
 	}
 
+	/* used for fixed read/write too - just read unconditionally */
+	req->buf_index = READ_ONCE(sqe->buf_index);
+	req->imu = NULL;
+
 	if (req->opcode == IORING_OP_READ_FIXED ||
 	    req->opcode == IORING_OP_WRITE_FIXED) {
-		req->imu = NULL;
+		struct io_ring_ctx *ctx = req->ctx;
+		u16 index;
+
+		if (unlikely(req->buf_index >= ctx->nr_user_bufs))
+			return -EFAULT;
+		index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
+		req->imu = ctx->user_bufs[index];
 		io_req_set_rsrc_node(req);
 	}
 
 	req->rw.addr = READ_ONCE(sqe->addr);
 	req->rw.len = READ_ONCE(sqe->len);
-	req->buf_index = READ_ONCE(sqe->buf_index);
 	return 0;
 }
 
@@ -3066,18 +3075,9 @@ static int __io_import_fixed(struct io_k
 
 static int io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter)
 {
-	struct io_ring_ctx *ctx = req->ctx;
-	struct io_mapped_ubuf *imu = req->imu;
-	u16 index, buf_index = req->buf_index;
-
-	if (likely(!imu)) {
-		if (unlikely(buf_index >= ctx->nr_user_bufs))
-			return -EFAULT;
-		index = array_index_nospec(buf_index, ctx->nr_user_bufs);
-		imu = READ_ONCE(ctx->user_bufs[index]);
-		req->imu = imu;
-	}
-	return __io_import_fixed(req, rw, iter, imu);
+	if (WARN_ON_ONCE(!req->imu))
+		return -EFAULT;
+	return __io_import_fixed(req, rw, iter, req->imu);
 }
 
 static void io_ring_submit_unlock(struct io_ring_ctx *ctx, bool needs_lock)



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2022-06-30 13:47 ` [PATCH 5.15 28/28] io_uring: fix not locked access to fixed buf table Greg Kroah-Hartman
@ 2022-06-30 23:17 ` Shuah Khan
  2022-06-30 23:21 ` Florian Fainelli
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Shuah Khan @ 2022-06-30 23:17 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, slade,
	Shuah Khan

On 6/30/22 7:46 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2022-06-30 23:17 ` [PATCH 5.15 00/28] 5.15.52-rc1 review Shuah Khan
@ 2022-06-30 23:21 ` Florian Fainelli
  2022-07-01  0:58 ` Guenter Roeck
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Florian Fainelli @ 2022-06-30 23:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, sudipm.mukherjee, slade



On 6/30/2022 6:46 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:

Tested-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2022-06-30 23:21 ` Florian Fainelli
@ 2022-07-01  0:58 ` Guenter Roeck
  2022-07-01  3:51 ` Bagas Sanjaya
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Guenter Roeck @ 2022-07-01  0:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Thu, Jun 30, 2022 at 03:46:56PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> Anything received after that time might be too late.
> 

Build results:
	total: 159 pass: 159 fail: 0
Qemu test results:
	total: 488 pass: 488 fail: 0

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2022-07-01  0:58 ` Guenter Roeck
@ 2022-07-01  3:51 ` Bagas Sanjaya
  2022-07-01  6:18 ` Naresh Kamboju
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Bagas Sanjaya @ 2022-07-01  3:51 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade, linuxppc-dev, Paul Mackerras, Benjamin Herrenschmidt,
	Michael Ellerman

On Thu, Jun 30, 2022 at 03:46:56PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 

Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 12.1.0)
and powerpc (ps3_defconfig, GCC 12.1.0).

I get two warnings on powerpc build (vdso related):

  VDSO32L arch/powerpc/kernel/vdso32/vdso32.so.dbg
<prefix>/lib/gcc/powerpc64-unknown-linux-gnu/12.1.0/../../../../powerpc64-unknown-linux-gnu/bin/ld: warning: cannot find entry symbol nable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang; not setting start address
  VDSO64L arch/powerpc/kernel/vdso64/vdso64.so.dbg
<prefix>/lib/gcc/powerpc64-unknown-linux-gnu/12.1.0/../../../../powerpc64-unknown-linux-gnu/bin/ld: warning: cannot find entry symbol nable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang; not setting start address

Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2022-07-01  3:51 ` Bagas Sanjaya
@ 2022-07-01  6:18 ` Naresh Kamboju
  2022-07-01  8:51 ` Christian Brauner
                   ` (2 subsequent siblings)
  35 siblings, 0 replies; 41+ messages in thread
From: Naresh Kamboju @ 2022-07-01  6:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Thu, 30 Jun 2022 at 19:25, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h


Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>

## Build
* kernel: 5.15.52-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-5.15.y
* git commit: 49249bfc4d1bc5a4f6c82471a12a13bcc78a40c4
* git describe: v5.15.51-29-g49249bfc4d1b
* test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15.51-29-g49249bfc4d1b

## Test Regressions (compared to v5.15.51)
No test regressions found.

## Metric Regressions (compared to v5.15.51)
No metric regressions found.

## Test Fixes (compared to v5.15.51)
No test fixes found.

## Metric Fixes (compared to v5.15.51)
No metric fixes found.

## Test result summary
total: 132820, pass: 118669, fail: 310, skip: 12977, xfail: 864

## Build Summary
* arc: 10 total, 10 passed, 0 failed
* arm: 308 total, 308 passed, 0 failed
* arm64: 62 total, 62 passed, 0 failed
* i386: 52 total, 49 passed, 3 failed
* mips: 48 total, 48 passed, 0 failed
* parisc: 12 total, 12 passed, 0 failed
* powerpc: 54 total, 54 passed, 0 failed
* riscv: 22 total, 22 passed, 0 failed
* s390: 20 total, 20 passed, 0 failed
* sh: 24 total, 24 passed, 0 failed
* sparc: 12 total, 12 passed, 0 failed
* x86_64: 56 total, 55 passed, 1 failed

## Test suites summary
* fwts
* igt-gpu-tools
* kunit
* kvm-unit-tests
* libgpiod
* libhugetlbfs
* log-parser-boot
* log-parser-test
* ltp-cap_bounds
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-filecaps
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-fsx
* ltp-hugetlb
* ltp-io
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-open-posix-tests
* ltp-pty
* ltp-sched
* ltp-securebits
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* network-basic-tests
* packetdrill
* perf
* perf/Zstd-perf.data-compression
* rcutorture
* ssuite
* v4l2-compliance
* vdso

--
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2022-07-01  6:18 ` Naresh Kamboju
@ 2022-07-01  8:51 ` Christian Brauner
  2022-07-01  8:57   ` Greg Kroah-Hartman
  2022-07-01 10:36 ` Sudip Mukherjee
  2022-07-01 13:55 ` Ron Economos
  35 siblings, 1 reply; 41+ messages in thread
From: Christian Brauner @ 2022-07-01  8:51 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Thu, Jun 30, 2022 at 03:46:56PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

Ran xfstests after the backport series I gave you:

sudo ./check -g idmapped (xfs, ext4, btrfs, ...)
sudo ./runltp -f fs_perms_simple,fs_bind,containers,cap_bounds,cve,uevent,filecaps

all tests pass.

Thank you!
Christian

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-07-01  8:51 ` Christian Brauner
@ 2022-07-01  8:57   ` Greg Kroah-Hartman
  2022-07-01 12:51     ` Christian Brauner
  0 siblings, 1 reply; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-07-01  8:57 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Fri, Jul 01, 2022 at 10:51:23AM +0200, Christian Brauner wrote:
> On Thu, Jun 30, 2022 at 03:46:56PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.15.52 release.
> > There are 28 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc1.gz
> > or in the git tree and branch at:
> > 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Ran xfstests after the backport series I gave you:
> 
> sudo ./check -g idmapped (xfs, ext4, btrfs, ...)
> sudo ./runltp -f fs_perms_simple,fs_bind,containers,cap_bounds,cve,uevent,filecaps
> 
> all tests pass.

Nice!

Want to respond with a Tested-by so that the tools will pick it up for
the release commit?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2022-07-01  8:51 ` Christian Brauner
@ 2022-07-01 10:36 ` Sudip Mukherjee
  2022-07-01 13:55 ` Ron Economos
  35 siblings, 0 replies; 41+ messages in thread
From: Sudip Mukherjee @ 2022-07-01 10:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, slade

Hi Greg,

On Thu, Jun 30, 2022 at 03:46:56PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> Anything received after that time might be too late.

Build test (gcc version 11.3.1 20220627):
mips: 62 configs -> no failure
arm: 99 configs -> no failure
arm64: 3 configs -> no failure
x86_64: 4 configs -> no failure
alpha allmodconfig -> no failure
csky allmodconfig -> no failure
powerpc allmodconfig -> no failure
riscv allmodconfig -> no failure
s390 allmodconfig -> no failure
xtensa allmodconfig -> no failure

Boot test:
x86_64: Booted on my test laptop. No regression.
x86_64: Booted on qemu. No regression. [1]
arm64: Booted on rpi4b (4GB model). No regression. [2]
mips: Booted on ci20 board. No regression. [3]

[1]. https://openqa.qa.codethink.co.uk/tests/1434
[2]. https://openqa.qa.codethink.co.uk/tests/1437
[3]. https://openqa.qa.codethink.co.uk/tests/1439

Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>

--
Regards
Sudip

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-07-01  8:57   ` Greg Kroah-Hartman
@ 2022-07-01 12:51     ` Christian Brauner
  0 siblings, 0 replies; 41+ messages in thread
From: Christian Brauner @ 2022-07-01 12:51 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Fri, Jul 01, 2022 at 10:57:48AM +0200, Greg Kroah-Hartman wrote:
> On Fri, Jul 01, 2022 at 10:51:23AM +0200, Christian Brauner wrote:
> > On Thu, Jun 30, 2022 at 03:46:56PM +0200, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 5.15.52 release.
> > > There are 28 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > > 
> > > Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> > > Anything received after that time might be too late.
> > > 
> > > The whole patch series can be found in one patch at:
> > > 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc1.gz
> > > or in the git tree and branch at:
> > > 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > and the diffstat can be found below.
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > 
> > Ran xfstests after the backport series I gave you:
> > 
> > sudo ./check -g idmapped (xfs, ext4, btrfs, ...)
> > sudo ./runltp -f fs_perms_simple,fs_bind,containers,cap_bounds,cve,uevent,filecaps
> > 
> > all tests pass.
> 
> Nice!
> 
> Want to respond with a Tested-by so that the tools will pick it up for
> the release commit?

Absolutely,
Tested-by: Christian Brauner (Microsoft) <brauner@kernel.org> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 00/28] 5.15.52-rc1 review
  2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2022-07-01 10:36 ` Sudip Mukherjee
@ 2022-07-01 13:55 ` Ron Economos
  35 siblings, 0 replies; 41+ messages in thread
From: Ron Economos @ 2022-07-01 13:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, slade

On 6/30/22 6:46 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Built and booted successfully on RISC-V RV64 (HiFive Unmatched).

Tested-by: Ron Economos <re@w6rz.net>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 02/28] clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup()
  2022-06-30 13:46 ` [PATCH 5.15 02/28] clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup() Greg Kroah-Hartman
@ 2022-07-01 15:31   ` Nathan Chancellor
  2022-07-01 15:50     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 41+ messages in thread
From: Nathan Chancellor @ 2022-07-01 15:31 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, stable, Linus Walleij, Daniel Lezcano

Hi Greg,

On Thu, Jun 30, 2022 at 03:46:58PM +0200, Greg Kroah-Hartman wrote:
> From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ixp4xx_timer_setup is exported, and so can not be an __init function.
> Remove the __init marking as the build system is rightfully claiming
> this is an error in older kernels.
> 
> This is fixed "properly" in commit 41929c9f628b
> ("clocksource/drivers/ixp4xx: Drop boardfile probe path") but that can
> not be backported to older kernels as the reworking of the IXP4xx
> codebase is not suitable for stable releases.
> 
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

This patch causes the following warnings with clang when building
ARCH=arm allmodconfig on 5.15, 5.10, and 5.4. I am surprised nobody else
saw them.

  WARNING: modpost: vmlinux.o(.text+0x1219ccc): Section mismatch in reference from the function ixp4xx_timer_register() to the function .init.text:sched_clock_register()
  The function ixp4xx_timer_register() references
  the function __init sched_clock_register().
  This is often because ixp4xx_timer_register lacks a __init
  annotation or the annotation of sched_clock_register is wrong.

  WARNING: modpost: vmlinux.o(.text+0x1219cf4): Section mismatch in reference from the function ixp4xx_timer_register() to the function .init.text:register_current_timer_delay()
  The function ixp4xx_timer_register() references
  the function __init register_current_timer_delay().
  This is often because ixp4xx_timer_register lacks a __init
  annotation or the annotation of register_current_timer_delay is wrong.

I think it would just be better to remove the export of
ixp4xx_timer_setup(), rather than removing __init, as it is only called
in arch/arm/mach-ixp4xx/common.c, which has to be built into the kernel
image as it is 'obj-y' in arch/arm/mach-ixp4xx/Makefile.

Cheers,
Nathan

> ---
>  drivers/clocksource/mmio.c                 |    2 +-
>  drivers/clocksource/timer-ixp4xx.c         |   10 ++++------
>  include/linux/platform_data/timer-ixp4xx.h |    5 ++---
>  3 files changed, 7 insertions(+), 10 deletions(-)
> 
> --- a/drivers/clocksource/mmio.c
> +++ b/drivers/clocksource/mmio.c
> @@ -46,7 +46,7 @@ u64 clocksource_mmio_readw_down(struct c
>   * @bits:	Number of valid bits
>   * @read:	One of clocksource_mmio_read*() above
>   */
> -int __init clocksource_mmio_init(void __iomem *base, const char *name,
> +int clocksource_mmio_init(void __iomem *base, const char *name,
>  	unsigned long hz, int rating, unsigned bits,
>  	u64 (*read)(struct clocksource *))
>  {
> --- a/drivers/clocksource/timer-ixp4xx.c
> +++ b/drivers/clocksource/timer-ixp4xx.c
> @@ -161,9 +161,8 @@ static int ixp4xx_resume(struct clock_ev
>   * We use OS timer1 on the CPU for the timer tick and the timestamp
>   * counter as a source of real clock ticks to account for missed jiffies.
>   */
> -static __init int ixp4xx_timer_register(void __iomem *base,
> -					int timer_irq,
> -					unsigned int timer_freq)
> +static int ixp4xx_timer_register(void __iomem *base, int timer_irq,
> +				 unsigned int timer_freq)
>  {
>  	struct ixp4xx_timer *tmr;
>  	int ret;
> @@ -269,9 +268,8 @@ builtin_platform_driver(ixp4xx_timer_dri
>   * @timer_irq: Linux IRQ number for the timer
>   * @timer_freq: Fixed frequency of the timer
>   */
> -void __init ixp4xx_timer_setup(resource_size_t timerbase,
> -			       int timer_irq,
> -			       unsigned int timer_freq)
> +void ixp4xx_timer_setup(resource_size_t timerbase, int timer_irq,
> +			unsigned int timer_freq)
>  {
>  	void __iomem *base;
>  
> --- a/include/linux/platform_data/timer-ixp4xx.h
> +++ b/include/linux/platform_data/timer-ixp4xx.h
> @@ -4,8 +4,7 @@
>  
>  #include <linux/ioport.h>
>  
> -void __init ixp4xx_timer_setup(resource_size_t timerbase,
> -			       int timer_irq,
> -			       unsigned int timer_freq);
> +void ixp4xx_timer_setup(resource_size_t timerbase, int timer_irq,
> +			unsigned int timer_freq);
>  
>  #endif
> 
> 
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5.15 02/28] clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup()
  2022-07-01 15:31   ` Nathan Chancellor
@ 2022-07-01 15:50     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 41+ messages in thread
From: Greg Kroah-Hartman @ 2022-07-01 15:50 UTC (permalink / raw)
  To: Nathan Chancellor; +Cc: linux-kernel, stable, Linus Walleij, Daniel Lezcano

On Fri, Jul 01, 2022 at 08:31:23AM -0700, Nathan Chancellor wrote:
> Hi Greg,
> 
> On Thu, Jun 30, 2022 at 03:46:58PM +0200, Greg Kroah-Hartman wrote:
> > From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > 
> > ixp4xx_timer_setup is exported, and so can not be an __init function.
> > Remove the __init marking as the build system is rightfully claiming
> > this is an error in older kernels.
> > 
> > This is fixed "properly" in commit 41929c9f628b
> > ("clocksource/drivers/ixp4xx: Drop boardfile probe path") but that can
> > not be backported to older kernels as the reworking of the IXP4xx
> > codebase is not suitable for stable releases.
> > 
> > Cc: Linus Walleij <linus.walleij@linaro.org>
> > Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> This patch causes the following warnings with clang when building
> ARCH=arm allmodconfig on 5.15, 5.10, and 5.4. I am surprised nobody else
> saw them.
> 
>   WARNING: modpost: vmlinux.o(.text+0x1219ccc): Section mismatch in reference from the function ixp4xx_timer_register() to the function .init.text:sched_clock_register()
>   The function ixp4xx_timer_register() references
>   the function __init sched_clock_register().
>   This is often because ixp4xx_timer_register lacks a __init
>   annotation or the annotation of sched_clock_register is wrong.
> 
>   WARNING: modpost: vmlinux.o(.text+0x1219cf4): Section mismatch in reference from the function ixp4xx_timer_register() to the function .init.text:register_current_timer_delay()
>   The function ixp4xx_timer_register() references
>   the function __init register_current_timer_delay().
>   This is often because ixp4xx_timer_register lacks a __init
>   annotation or the annotation of register_current_timer_delay is wrong.
> 
> I think it would just be better to remove the export of
> ixp4xx_timer_setup(), rather than removing __init, as it is only called
> in arch/arm/mach-ixp4xx/common.c, which has to be built into the kernel
> image as it is 'obj-y' in arch/arm/mach-ixp4xx/Makefile.

Ah it can?  That would have been much better/easier.  I'll drop this and
go do that for the next release after this one, thanks for letting me
know as this did not show up in my local testing either.

greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2022-07-01 15:51 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-30 13:46 [PATCH 5.15 00/28] 5.15.52-rc1 review Greg Kroah-Hartman
2022-06-30 13:46 ` [PATCH 5.15 01/28] tick/nohz: unexport __init-annotated tick_nohz_full_setup() Greg Kroah-Hartman
2022-06-30 13:46 ` [PATCH 5.15 02/28] clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup() Greg Kroah-Hartman
2022-07-01 15:31   ` Nathan Chancellor
2022-07-01 15:50     ` Greg Kroah-Hartman
2022-06-30 13:46 ` [PATCH 5.15 03/28] x86, kvm: use proper ASM macros for kvm_vcpu_is_preempted Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 04/28] bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 05/28] xfs: use kmem_cache_free() for kmem_cache objects Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 06/28] xfs: punch out data fork delalloc blocks on COW writeback failure Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 07/28] xfs: Fix the free logic of state in xfs_attr_node_hasname Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 08/28] xfs: remove all COW fork extents when remounting readonly Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 09/28] xfs: check sb_meta_uuid for dabuf buffer recovery Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 10/28] xfs: prevent UAF in xfs_log_item_in_current_chkpt Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 11/28] xfs: only bother with sync_filesystem during readonly remount Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 12/28] powerpc/ftrace: Remove ftrace init tramp once kernel init is complete Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 13/28] fs: add is_idmapped_mnt() helper Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 14/28] fs: move mapping helpers Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 15/28] fs: tweak fsuidgid_has_mapping() Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 16/28] fs: account for filesystem mappings Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 17/28] docs: update mapping documentation Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 18/28] fs: use low-level mapping helpers Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 19/28] fs: remove unused " Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 20/28] fs: port higher-level " Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 21/28] fs: add i_user_ns() helper Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 22/28] fs: support mapped mounts of mapped filesystems Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 23/28] fs: fix acl translation Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 24/28] fs: account for group membership Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 25/28] rtw88: 8821c: support RFE type4 wifi NIC Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 26/28] rtw88: rtw8821c: enable rfe 6 devices Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 27/28] net: mscc: ocelot: allow unregistered IP multicast flooding to CPU Greg Kroah-Hartman
2022-06-30 13:47 ` [PATCH 5.15 28/28] io_uring: fix not locked access to fixed buf table Greg Kroah-Hartman
2022-06-30 23:17 ` [PATCH 5.15 00/28] 5.15.52-rc1 review Shuah Khan
2022-06-30 23:21 ` Florian Fainelli
2022-07-01  0:58 ` Guenter Roeck
2022-07-01  3:51 ` Bagas Sanjaya
2022-07-01  6:18 ` Naresh Kamboju
2022-07-01  8:51 ` Christian Brauner
2022-07-01  8:57   ` Greg Kroah-Hartman
2022-07-01 12:51     ` Christian Brauner
2022-07-01 10:36 ` Sudip Mukherjee
2022-07-01 13:55 ` Ron Economos

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).