linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kernfs: fix use-after-free in __kernfs_remove
@ 2022-09-07 20:08 Christian A. Ehrhardt
  2022-09-08 17:14 ` Tejun Heo
  0 siblings, 1 reply; 7+ messages in thread
From: Christian A. Ehrhardt @ 2022-09-07 20:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: Christian A. Ehrhardt, Greg Kroah-Hartman, Tejun Heo

Concurrent calls to __kernfs_remove can race on the removal
of the root node: The race occurs if the root node(kn) is freed
during kernfs_drain. The child node(pos) is explicitly protected
with an additional ref count. Do the same for the root node.

Found by syzkaller with the following reproducer:

syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0)
r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0)
close(r0)
pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800)
mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}})

Sample report:

==================================================================
BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline]
BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857

CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3e60b #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:317 [inline]
 print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433
 kasan_report+0xa3/0x130 mm/kasan/report.c:495
 kernfs_type include/linux/kernfs.h:335 [inline]
 kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
 __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
 __kernfs_remove fs/kernfs/dir.c:1356 [inline]
 kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
 sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f725f983aed
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed
RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000
RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000
R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000
 </TASK>

Allocated by task 855:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:45 [inline]
 set_alloc_info mm/kasan/common.c:437 [inline]
 __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470
 kasan_slab_alloc include/linux/kasan.h:224 [inline]
 slab_post_alloc_hook mm/slab.h:727 [inline]
 slab_alloc_node mm/slub.c:3243 [inline]
 slab_alloc mm/slub.c:3251 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3258 [inline]
 kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268
 kmem_cache_zalloc include/linux/slab.h:723 [inline]
 __kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593
 kernfs_new_node fs/kernfs/dir.c:655 [inline]
 kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010
 sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59
 create_dir lib/kobject.c:63 [inline]
 kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223
 kobject_add_varg lib/kobject.c:358 [inline]
 kobject_init_and_add+0x101/0x160 lib/kobject.c:441
 sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 857:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:45
 kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370
 ____kasan_slab_free mm/kasan/common.c:367 [inline]
 ____kasan_slab_free mm/kasan/common.c:329 [inline]
 __kasan_slab_free+0x108/0x190 mm/kasan/common.c:375
 kasan_slab_free include/linux/kasan.h:200 [inline]
 slab_free_hook mm/slub.c:1754 [inline]
 slab_free_freelist_hook mm/slub.c:1780 [inline]
 slab_free mm/slub.c:3534 [inline]
 kmem_cache_free+0x9c/0x340 mm/slub.c:3551
 kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547
 kernfs_put+0x42/0x50 fs/kernfs/dir.c:521
 __kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407
 __kernfs_remove fs/kernfs/dir.c:1356 [inline]
 kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
 sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888008880780
 which belongs to the cache kernfs_node_cache of size 128
The buggy address is located 112 bytes inside of
 128-byte region [ffff888008880780, ffff888008880800)

The buggy address belongs to the physical page:
page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880
flags: 0x100000000000200(slab|node=0|zone=1)
raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280
raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
 ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                             ^
 ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
 ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================

cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
---
 fs/kernfs/dir.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 1cc88ba6de90..66ccf8480592 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1365,6 +1365,7 @@ static void __kernfs_remove(struct kernfs_node *kn)
 			atomic_add(KN_DEACTIVATED_BIAS, &pos->active);
 
 	/* deactivate and unlink the subtree node-by-node */
+	kernfs_get(kn);
 	do {
 		pos = kernfs_leftmost_descendant(kn);
 
@@ -1406,6 +1407,7 @@ static void __kernfs_remove(struct kernfs_node *kn)
 
 		kernfs_put(pos);
 	} while (pos != kn);
+	kernfs_put(kn);
 }
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove
  2022-09-07 20:08 [PATCH] kernfs: fix use-after-free in __kernfs_remove Christian A. Ehrhardt
@ 2022-09-08 17:14 ` Tejun Heo
  2022-09-08 22:25   ` Christian A. Ehrhardt
  0 siblings, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2022-09-08 17:14 UTC (permalink / raw)
  To: Christian A. Ehrhardt; +Cc: linux-kernel, Greg Kroah-Hartman

Hello, Christian.

On Wed, Sep 07, 2022 at 10:08:11PM +0200, Christian A. Ehrhardt wrote:
> Concurrent calls to __kernfs_remove can race on the removal
> of the root node: The race occurs if the root node(kn) is freed
> during kernfs_drain. The child node(pos) is explicitly protected
> with an additional ref count. Do the same for the root node.

I don't think this is right. We don't support parallel invocations of
__kernfs_remove() this way. If @kn can be freed during kernfs_drain(), it
also means that it can be freed before kernfs_rwsem is grabbed in
kernfs_remove(). The caller must be responsible for ensuring that @kn
remains allocated. Otherwise, it can't be made reliable.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove
  2022-09-08 17:14 ` Tejun Heo
@ 2022-09-08 22:25   ` Christian A. Ehrhardt
  2022-09-12 21:24     ` Christian A. Ehrhardt
  0 siblings, 1 reply; 7+ messages in thread
From: Christian A. Ehrhardt @ 2022-09-08 22:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Greg Kroah-Hartman


Hello Tejun,

On Thu, Sep 08, 2022 at 07:14:43AM -1000, Tejun Heo wrote:
> Hello, Christian.
> 
> On Wed, Sep 07, 2022 at 10:08:11PM +0200, Christian A. Ehrhardt wrote:
> > Concurrent calls to __kernfs_remove can race on the removal
> > of the root node: The race occurs if the root node(kn) is freed
> > during kernfs_drain. The child node(pos) is explicitly protected
> > with an additional ref count. Do the same for the root node.
> 
> I don't think this is right. We don't support parallel invocations of
> __kernfs_remove() this way. If @kn can be freed during kernfs_drain(), it
> also means that it can be freed before kernfs_rwsem is grabbed in
> kernfs_remove().

Point taken. However, the syzkaller reproducer reliably triggers
the bug without the patch and the bug is gone with the patch.

> The caller must be responsible for ensuring that @kn
> remains allocated. Otherwise, it can't be made reliable.

In this case the caller of __kernfs_remove is not kernfs_remove but
kernfs_remove_by_name_ns and it fails to take a reference for the
node that it looks up and deletes. Thus a second call to
kernfs_remove_by_name_ns can remove the node while kernfs_drain
drops the semaphore.

I'll post an updated patch tomorrow.

      regards   Christian


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove
  2022-09-08 22:25   ` Christian A. Ehrhardt
@ 2022-09-12 21:24     ` Christian A. Ehrhardt
  2022-09-12 21:39       ` Tejun Heo
  0 siblings, 1 reply; 7+ messages in thread
From: Christian A. Ehrhardt @ 2022-09-12 21:24 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Greg Kroah-Hartman


Hello, Tejun,

On Fri, Sep 09, 2022 at 12:25:36AM +0200, Christian A. Ehrhardt wrote:
> 
> Hello Tejun,
> 
> On Thu, Sep 08, 2022 at 07:14:43AM -1000, Tejun Heo wrote:
> > Hello, Christian.
> > 
> > On Wed, Sep 07, 2022 at 10:08:11PM +0200, Christian A. Ehrhardt wrote:
> > > Concurrent calls to __kernfs_remove can race on the removal
> > > of the root node: The race occurs if the root node(kn) is freed
> > > during kernfs_drain. The child node(pos) is explicitly protected
> > > with an additional ref count. Do the same for the root node.
> > 
> > I don't think this is right. We don't support parallel invocations of
> > __kernfs_remove() this way. If @kn can be freed during kernfs_drain(), it
> > also means that it can be freed before kernfs_rwsem is grabbed in
> > kernfs_remove().
> 
> Point taken. However, the syzkaller reproducer reliably triggers
> the bug without the patch and the bug is gone with the patch.
> 
> > The caller must be responsible for ensuring that @kn
> > remains allocated. Otherwise, it can't be made reliable.
> 
> In this case the caller of __kernfs_remove is not kernfs_remove but
> kernfs_remove_by_name_ns and it fails to take a reference for the
> node that it looks up and deletes. Thus a second call to
> kernfs_remove_by_name_ns can remove the node while kernfs_drain
> drops the semaphore.
> 
> I'll post an updated patch tomorrow.

Sorry, no patch (yet). But here's the whole story of the initial
syzkaller report (form top to bottom). I'm not sure where we go wrong
but I think several places could do better here:

The code in net/9p/client.c creates one kmem-cache per client session.
All of these kmem caches use the same name ("9p-fcall-cache").
Is it ok to create several kmem caches with the same name? My feeling is
that this is somewhat unexpected but should probably be allowed.

In the setup in question slab caches are not merged. Thus the slub
code uses the kmem cache name directly to create the sysfs directory for
the slub cache. If we allow multiple kmem caches with the same name the
slub code should somehow make the sysfs names unique (e.g. by adding a
serial numer to the name), right?

Before creating the sysfs directory the slub code uses sysfs_remove_link
(aka kernfs_remove_by_name) with the following comment:
"If we have a leftover link remove it." In fact these "leftover"s
are the sysfs files of an active kmem cache with the same name.

Additionally, sysfs_remove_link() looks like a bit of a misnomer
as it removes whatever exists under the given name. This may be a
symlink but it can be an entire directory hierarchy, too.
Is this intentional? At least it's been like that forever.

If kmem cache creation is done in parallel we can now have
concurrent invocations of kernfs_remove_by_name_ns() for the same
parent and the same name. This is what eventually causes the race.

The race is possible as kernfs_remove_by_name_ns() looks up the
name of the target in its parent but does not acquire a ref count
on the target before calling __kernfs_remove(). __kernfs_remove()
may drop the kernfs_rwsem in kernfs_drain(). Thus a concurrent call
to __kernfs_remove can complete the removal except for the nodes
that the first instance of __kernfs_remove() holds refs for.
As explained above no ref is held on the root of the removed tree.
This causes the use-after-free that KASAN sees and complains about.

For kernfs_remove it is reasonable to expect the caller to hold
some kind of reference to prevent this type of race and from a quick
check, the callers seem to get this correct. The only exception that
I could find is kernfs_remove_by_name_ns. This is easy to fix if
kernfs_remove_by_name_ns() hold a reference on the root node across
the call to __kernfs_remove().
IMHO this should be done. Should I sent a patch?

Thanks for bearing with me. Oppinons?

      regards   Christian


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove
  2022-09-12 21:24     ` Christian A. Ehrhardt
@ 2022-09-12 21:39       ` Tejun Heo
  2022-09-13 12:17         ` [PATCH v2] " Christian A. Ehrhardt
  0 siblings, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2022-09-12 21:39 UTC (permalink / raw)
  To: Christian A. Ehrhardt; +Cc: linux-kernel, Greg Kroah-Hartman

Hello,

On Mon, Sep 12, 2022 at 11:24:52PM +0200, Christian A. Ehrhardt wrote:
> Sorry, no patch (yet). But here's the whole story of the initial
> syzkaller report (form top to bottom). I'm not sure where we go wrong
> but I think several places could do better here:
> 
> The code in net/9p/client.c creates one kmem-cache per client session.
> All of these kmem caches use the same name ("9p-fcall-cache").
> Is it ok to create several kmem caches with the same name? My feeling is
> that this is somewhat unexpected but should probably be allowed.

I don't think that's supported. kmem_caches are exposed in sysfs and having
the same name is gonna cause issues.

> In the setup in question slab caches are not merged. Thus the slub
> code uses the kmem cache name directly to create the sysfs directory for
> the slub cache. If we allow multiple kmem caches with the same name the
> slub code should somehow make the sysfs names unique (e.g. by adding a
> serial numer to the name), right?

I think we're just in uncharted terriotory. Maybe some things will work
while others don't. Nobody really thought about or tested this usage.

> Before creating the sysfs directory the slub code uses sysfs_remove_link
> (aka kernfs_remove_by_name) with the following comment:
> "If we have a leftover link remove it." In fact these "leftover"s
> are the sysfs files of an active kmem cache with the same name.

Hahahaha

> Additionally, sysfs_remove_link() looks like a bit of a misnomer
> as it removes whatever exists under the given name. This may be a
> symlink but it can be an entire directory hierarchy, too.
> Is this intentional? At least it's been like that forever.

It's a historical artifact. The backend implementation has changed while the
wrapping sysfs interface remained the same.

> If kmem cache creation is done in parallel we can now have
> concurrent invocations of kernfs_remove_by_name_ns() for the same
> parent and the same name. This is what eventually causes the race.
> 
> The race is possible as kernfs_remove_by_name_ns() looks up the
> name of the target in its parent but does not acquire a ref count
> on the target before calling __kernfs_remove(). __kernfs_remove()
> may drop the kernfs_rwsem in kernfs_drain(). Thus a concurrent call
> to __kernfs_remove can complete the removal except for the nodes
> that the first instance of __kernfs_remove() holds refs for.
> As explained above no ref is held on the root of the removed tree.
> This causes the use-after-free that KASAN sees and complains about.
> 
> For kernfs_remove it is reasonable to expect the caller to hold
> some kind of reference to prevent this type of race and from a quick
> check, the callers seem to get this correct. The only exception that
> I could find is kernfs_remove_by_name_ns. This is easy to fix if
> kernfs_remove_by_name_ns() hold a reference on the root node across
> the call to __kernfs_remove().
> IMHO this should be done. Should I sent a patch?

So, no matter what, I think it'd be a good idea to make remove_by_name hold
onto the kn it's removing, so please send a patch to do so. For the rest of
the situation, I think the right thing to do would be updating 9p so that it
doesn't create multiple kmem_caches with the same name.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] kernfs: fix use-after-free in __kernfs_remove
  2022-09-12 21:39       ` Tejun Heo
@ 2022-09-13 12:17         ` Christian A. Ehrhardt
  2022-09-19 17:35           ` Tejun Heo
  0 siblings, 1 reply; 7+ messages in thread
From: Christian A. Ehrhardt @ 2022-09-13 12:17 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Christian A. Ehrhardt, linux-kernel, Greg Kroah-Hartman

Syzkaller managed to trigger concurrent calls to
kernfs_remove_by_name_ns() for the same file resulting in
a KASAN detected use-after-free. The race occurs when the root
node is freed during kernfs_drain().

To prevent this acquire an additional reference for the root
of the tree that is removed before calling __kernfs_remove().

Found by syzkaller with the following reproducer (slab_nomerge is
required):

syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0)
r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0)
close(r0)
pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800)
mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}})

Sample report:

==================================================================
BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline]
BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857

CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3e60b #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:317 [inline]
 print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433
 kasan_report+0xa3/0x130 mm/kasan/report.c:495
 kernfs_type include/linux/kernfs.h:335 [inline]
 kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
 __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
 __kernfs_remove fs/kernfs/dir.c:1356 [inline]
 kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
 sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f725f983aed
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed
RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000
RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000
R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000
 </TASK>

Allocated by task 855:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:45 [inline]
 set_alloc_info mm/kasan/common.c:437 [inline]
 __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470
 kasan_slab_alloc include/linux/kasan.h:224 [inline]
 slab_post_alloc_hook mm/slab.h:727 [inline]
 slab_alloc_node mm/slub.c:3243 [inline]
 slab_alloc mm/slub.c:3251 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3258 [inline]
 kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268
 kmem_cache_zalloc include/linux/slab.h:723 [inline]
 __kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593
 kernfs_new_node fs/kernfs/dir.c:655 [inline]
 kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010
 sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59
 create_dir lib/kobject.c:63 [inline]
 kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223
 kobject_add_varg lib/kobject.c:358 [inline]
 kobject_init_and_add+0x101/0x160 lib/kobject.c:441
 sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 857:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:45
 kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370
 ____kasan_slab_free mm/kasan/common.c:367 [inline]
 ____kasan_slab_free mm/kasan/common.c:329 [inline]
 __kasan_slab_free+0x108/0x190 mm/kasan/common.c:375
 kasan_slab_free include/linux/kasan.h:200 [inline]
 slab_free_hook mm/slub.c:1754 [inline]
 slab_free_freelist_hook mm/slub.c:1780 [inline]
 slab_free mm/slub.c:3534 [inline]
 kmem_cache_free+0x9c/0x340 mm/slub.c:3551
 kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547
 kernfs_put+0x42/0x50 fs/kernfs/dir.c:521
 __kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407
 __kernfs_remove fs/kernfs/dir.c:1356 [inline]
 kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
 sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888008880780
 which belongs to the cache kernfs_node_cache of size 128
The buggy address is located 112 bytes inside of
 128-byte region [ffff888008880780, ffff888008880800)

The buggy address belongs to the physical page:
page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880
flags: 0x100000000000200(slab|node=0|zone=1)
raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280
raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
 ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                             ^
 ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
 ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================

cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
---
 fs/kernfs/dir.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 1cc88ba6de90..2e9313988871 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1585,8 +1585,11 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
 	down_write(&root->kernfs_rwsem);
 
 	kn = kernfs_find_ns(parent, name, ns);
-	if (kn)
+	if (kn) {
+		kernfs_get(kn);
 		__kernfs_remove(kn);
+		kernfs_put(kn);
+	}
 
 	up_write(&root->kernfs_rwsem);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] kernfs: fix use-after-free in __kernfs_remove
  2022-09-13 12:17         ` [PATCH v2] " Christian A. Ehrhardt
@ 2022-09-19 17:35           ` Tejun Heo
  0 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2022-09-19 17:35 UTC (permalink / raw)
  To: Christian A. Ehrhardt; +Cc: linux-kernel, Greg Kroah-Hartman

On Tue, Sep 13, 2022 at 02:17:23PM +0200, Christian A. Ehrhardt wrote:
> Syzkaller managed to trigger concurrent calls to
> kernfs_remove_by_name_ns() for the same file resulting in
> a KASAN detected use-after-free. The race occurs when the root
> node is freed during kernfs_drain().
> 
> To prevent this acquire an additional reference for the root
> of the tree that is removed before calling __kernfs_remove().
...
> cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> cc: Tejun Heo <tj@kernel.org>
> Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-09-19 17:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-07 20:08 [PATCH] kernfs: fix use-after-free in __kernfs_remove Christian A. Ehrhardt
2022-09-08 17:14 ` Tejun Heo
2022-09-08 22:25   ` Christian A. Ehrhardt
2022-09-12 21:24     ` Christian A. Ehrhardt
2022-09-12 21:39       ` Tejun Heo
2022-09-13 12:17         ` [PATCH v2] " Christian A. Ehrhardt
2022-09-19 17:35           ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).