linux-kernel.vger.kernel.org archive mirror
* BUG: KASAN: use-after-free in dec_rlimit_ucounts
@ 2021-11-17 22:00 Qian Cai
  2021-11-18 19:46 ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Qian Cai @ 2021-11-17 22:00 UTC (permalink / raw)
  To: Eric W. Biederman, Alexey Gladkov; +Cc: Yu Zhao, linux-kernel

Hi there, I can still reproduce this quickly on today's linux-next, and all
the way back to 5.15-rc6, by running a syscall fuzzer for a while. The trace
points to this line:

        for (iter = ucounts; iter; iter = iter->ns->ucounts) {

It looks like KASAN is indicating that "ns" had already been freed. Is that
possible, or is this perhaps more of a refcount issue?
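
For readers unfamiliar with the loop, here is a minimal userspace sketch of
the walk. It assumes simplified stand-in structures: "struct ns_model",
"struct ucounts_model" and dec_walk() are hypothetical names, not the kernel
definitions. Each ucounts points to its namespace, and each namespace points
to the ucounts of its creator, so every step reads iter->ns; if that
namespace has already been freed, the read is a use-after-free.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ucounts_model;

struct ns_model {
	/* ucounts of the creator in the parent namespace; NULL at the root */
	struct ucounts_model *ucounts;
};

struct ucounts_model {
	struct ns_model *ns;	/* namespace this ucounts belongs to */
	long count;		/* a single rlimit counter */
};

/*
 * Subtract v from the counter at every level, from the given ucounts up to
 * the root, and return the number of levels visited. Reading
 * iter->ns->ucounts is only safe while iter->ns is still allocated.
 */
static int dec_walk(struct ucounts_model *ucounts, long v)
{
	struct ucounts_model *iter;
	int levels = 0;

	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
		iter->count -= v;
		levels++;
	}
	return levels;
}
```

With a root namespace and one child namespace, calling dec_walk() on the
child's ucounts visits two levels and decrements both counters.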

 BUG: KASAN: use-after-free in dec_rlimit_ucounts
 Read of size 8 at addr ffff0008c0739860 by task trinity-c27/10924

 CPU: 27 PID: 10924 Comm: trinity-c27 Not tainted 5.15.0-next-20211115-dirty #192
 Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
 Call trace:
  dump_backtrace
  show_stack
  dump_stack_lvl
  print_address_description.constprop.0
  kasan_report
  __asan_report_load8_noabort
  dec_rlimit_ucounts
  dec_rlimit_ucounts at kernel/ucount.c:284
  mqueue_evict_inode
  mqueue_evict_inode at ipc/mqueue.c:544
  evict
  iput.part.0
  iput
  __arm64_sys_mq_unlink
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 Allocated by task 10615:
  kasan_save_stack
  __kasan_slab_alloc
  slab_post_alloc_hook
  kmem_cache_alloc
  create_user_ns
  unshare_userns
  ksys_unshare
  __arm64_sys_unshare
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 Freed by task 8660:
  kasan_save_stack
  kasan_set_track
  kasan_set_free_info
  __kasan_slab_free
  slab_free_freelist_hook
  kmem_cache_free
  free_user_ns
  process_one_work
  worker_thread
  kthread
  ret_from_fork

 Last potentially related work creation:
  kasan_save_stack
  __kasan_record_aux_stack
  kasan_record_aux_stack_noalloc
  insert_work
  __queue_work
  queue_work_on
  __put_user_ns
  put_cred_rcu
  rcu_do_batch
  rcu_core
  rcu_core_si
  __do_softirq

 The buggy address belongs to the object at ffff0008c07395e8
  which belongs to the cache user_namespace of size 768
 The buggy address is located 632 bytes inside of
  768-byte region [ffff0008c07395e8, ffff0008c07398e8)
 The buggy address belongs to the page:
 page:fffffc002301ce00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff0008c073cec8 pfn:0x940738
 head:fffffc002301ce00 order:3 compound_mapcount:0 compound_pincount:0
 memcg:ffff0008b9b5f101
 flags: 0xbfffc0000010200(slab|head|node=0|zone=2|lastcpupid=0xffff)
 raw: 0bfffc0000010200 ffff000800f3e9c8 ffff000800f3e9c8 ffff000802e69b80
 raw: ffff0008c073cec8 00000000001d0012 00000001ffffffff ffff0008b9b5f101
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff0008c0739700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff0008c0739780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 >ffff0008c0739800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                        ^
  ffff0008c0739880: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
  ffff0008c0739900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
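
The offset KASAN prints above can be checked by hand: the faulting address
minus the object start gives the position of the read inside the freed
user_namespace object. Below is a small sketch of that arithmetic; the helper
names are made up for illustration.

```c
#include <stdint.h>

/* Byte offset of a faulting address inside its slab object. */
static uint64_t offset_in_object(uint64_t fault_addr, uint64_t object_start)
{
	return fault_addr - object_start;
}

/* Nonzero if fault_addr lies within [object_start, object_start + size). */
static int inside_object(uint64_t fault_addr, uint64_t object_start,
			 uint64_t size)
{
	return fault_addr >= object_start && fault_addr - object_start < size;
}
```

For this report, offset_in_object(0xffff0008c0739860, 0xffff0008c07395e8)
is 0x278, that is 632, which matches "located 632 bytes inside of 768-byte
region".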

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-11-17 22:00 BUG: KASAN: use-after-free in dec_rlimit_ucounts Qian Cai
@ 2021-11-18 19:46 ` Eric W. Biederman
  2021-11-18 20:32   ` Qian Cai
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2021-11-18 19:46 UTC (permalink / raw)
  To: Qian Cai; +Cc: Alexey Gladkov, Yu Zhao, linux-kernel

Qian Cai <quic_qiancai@quicinc.com> writes:

> Hi there, I can still reproduce this quickly on today's linux-next and all
> the way back to 5.15-rc6 by running a syscall fuzzer for a while. The trace
> points out to this line,
>
>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
>
> It looks KASAN indicated that that "ns" had already been freed. Is that
> possible or perhaps this is more of refcount issue?

Is it possible?  Yes, it is possible.  That is one place where a
use-after-free has shown up, and where I expect one would show up in the
future.

That said, it is hard to believe there is still a use-after-free in the
code.  We spent the last kernel development cycle poring over it and
correcting everything we saw, until we ultimately found one very subtle
use-after-free.

If you have a reliable reproducer that you can share, we can look into
this and see if we can track down where the reference count is going
bad.

Tracking down a use-after-free tends to take instrumenting the entire life
cycle, every increment and every decrement, and then poring over the logs.
That is not something we can really do without a reproducer.
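
As a rough illustration of the kind of instrumentation described above, here
is a userspace sketch that records every increment and decrement of a
counter together with a caller-supplied label. All names (traced_ref,
traced_get, traced_put, first_underflow, traced_dump) are hypothetical; in
the kernel one would more likely add printk() calls or tracepoints around
the real get/put helpers.

```c
#include <stdio.h>

#define TRACE_MAX 64

struct ref_event {
	const char *site;	/* caller-supplied label for the call site */
	int delta;		/* +1 for a get, -1 for a put */
	int result;		/* counter value after the operation */
};

struct traced_ref {
	int count;
	int nr_events;
	struct ref_event log[TRACE_MAX];
};

/* Apply delta and remember who did it (events past TRACE_MAX are dropped). */
static void traced_record(struct traced_ref *r, const char *site, int delta)
{
	r->count += delta;
	if (r->nr_events < TRACE_MAX) {
		struct ref_event *e = &r->log[r->nr_events++];

		e->site = site;
		e->delta = delta;
		e->result = r->count;
	}
}

static void traced_get(struct traced_ref *r, const char *site)
{
	traced_record(r, site, 1);
}

/* Returns 1 when this put dropped the count to zero, otherwise 0. */
static int traced_put(struct traced_ref *r, const char *site)
{
	traced_record(r, site, -1);
	return r->count == 0;
}

/* Index of the first event that took the count below zero, or -1. */
static int first_underflow(const struct traced_ref *r)
{
	int i;

	for (i = 0; i < r->nr_events; i++)
		if (r->log[i].result < 0)
			return i;
	return -1;
}

/* Print the whole life cycle, one line per increment or decrement. */
static void traced_dump(const struct traced_ref *r, FILE *out)
{
	int i;

	for (i = 0; i < r->nr_events; i++)
		fprintf(out, "%-16s %+d -> %d\n", r->log[i].site,
			r->log[i].delta, r->log[i].result);
}
```

Reading the dumped log for the first unbalanced put is the userspace
equivalent of going through the kernel logs for the bad decrement.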

Eric


* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-11-18 19:46 ` Eric W. Biederman
@ 2021-11-18 20:32   ` Qian Cai
  2021-11-18 20:57     ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Qian Cai @ 2021-11-18 20:32 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Alexey Gladkov, Yu Zhao, linux-kernel

On Thu, Nov 18, 2021 at 01:46:05PM -0600, Eric W. Biederman wrote:
> Is it possible?  Yes it is possible.  That is one place where
> a use-after-free has shown up and I expect would show up in the
> future.
> 
> That said it is hard to believe there is still a use-after-free in the
> code.  We spent the last kernel development cycle poring over and
> correcting everything we saw until we ultimately found one very subtle
> use-after-free.
> 
> If you have a reliable reproducer that you can share, we can look into
> this and see if we can track down where the reference count is going
> bad.
> 
> It tends to take instrumenting the entire life cycle every increment and
> every decrement and then poring over the logs to track down a
> use-after-free.  Which is not something we can really do without a
> reproducer.

The reproducer is just to run trinity as an unprivileged user on a defconfig
kernel with KASAN enabled. (On linux-next, you can do "make defconfig
debug.conf" [1], but I don't think the other debugging options are relevant
here.)

$ trinity -C 31 -N 10000000

So far it has always reproduced on an arm64 server here within 5 minutes.
Some debugging progress so far: BTW, this can happen on the
user_shm_unlock() path as well.

 Call trace:
  dec_rlimit_ucounts
  user_shm_unlock
  (inlined by) user_shm_unlock at mm/mlock.c:854
  shmem_lock
  shmctl_do_lock
  ksys_shmctl.constprop.0
  __arm64_sys_shmctl
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

I noticed that in dec_rlimit_ucounts(), dec == 0 and type ==
UCOUNT_RLIMIT_MEMLOCK.

[1] https://lore.kernel.org/lkml/20211115134754.7334-1-quic_qiancai@quicinc.com/


* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-11-18 20:32   ` Qian Cai
@ 2021-11-18 20:57     ` Eric W. Biederman
  2021-11-19 13:32       ` Qian Cai
  2021-11-24 21:49       ` Qian Cai
  0 siblings, 2 replies; 10+ messages in thread
From: Eric W. Biederman @ 2021-11-18 20:57 UTC (permalink / raw)
  To: Qian Cai; +Cc: Alexey Gladkov, Yu Zhao, linux-kernel

Qian Cai <quic_qiancai@quicinc.com> writes:

> The reproducer is just to run trinity by an unprivileged user on defconfig
> with KASAN enabled (On linux-next, you can do "make defconfig debug.conf"
> [1], but don't think other debugging options are relevant here.)
>
> $ trinity -C 31 -N 10000000
>
> It is always reproduced on an arm64 server here within 5 minutes so far.
> Some debugging progress so far. BTW, this could happen on user_shm_unlock()
> path as well.

Does this only happen on a single architecture?  If so, I wonder if
perhaps some of the architecture's atomic primitives are implemented
improperly.

Unfortunately I don't have any arm64 machines where I can easily test
this.

The call path you posted from user_shm_unlock is another path where
a use-after-free has shown up in the past.

My blind guess would be that I made an implementation mistake in
inc_rlimit_get_ucounts or dec_rlimit_put_ucounts, but I can't see it
right now.

Eric


* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-11-18 20:57     ` Eric W. Biederman
@ 2021-11-19 13:32       ` Qian Cai
  2021-11-24 21:49       ` Qian Cai
  1 sibling, 0 replies; 10+ messages in thread
From: Qian Cai @ 2021-11-19 13:32 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Alexey Gladkov, Yu Zhao, linux-kernel

On Thu, Nov 18, 2021 at 02:57:17PM -0600, Eric W. Biederman wrote:
> Does this only happen on a single architecture?  If so I wonder if
> perhaps some of the architecture's atomic primitives are implemented
> improperly.

No, I just don't have another arch to test this on, and I see no reason
why it wouldn't reproduce on x86. If the arm64 atomic primitives were
problematic, it would likely blow up elsewhere, and that has not been the
case in the many years our daily CI regression testing has been running.


* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-11-18 20:57     ` Eric W. Biederman
  2021-11-19 13:32       ` Qian Cai
@ 2021-11-24 21:49       ` Qian Cai
  2021-11-26  5:34         ` Qian Cai
  1 sibling, 1 reply; 10+ messages in thread
From: Qian Cai @ 2021-11-24 21:49 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alexey Gladkov, Yu Zhao, linux-kernel, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel

On Thu, Nov 18, 2021 at 02:57:17PM -0600, Eric W. Biederman wrote:
> Does this only happen on a single architecture?  If so I wonder if
> perhaps some of the architecture's atomic primitives are implemented
> improperly.

Hmm, I don't know if it is that, or if this platform is just lucky enough to
trigger the race condition quickly, but I can't reproduce it on x86 so far.
I am Cc'ing a few arm64 people to see if they have spotted anything I might
be missing. The original bug report is here:

https://lore.kernel.org/lkml/YZV7Z+yXbsx9p3JN@fixkernel.com/

I did narrow it down: the same traces were first introduced by these
commits:

d7c9e99aee48 Reimplement RLIMIT_MEMLOCK on top of ucounts
d64696905554 Reimplement RLIMIT_SIGPENDING on top of ucounts
6e52a9f0532f Reimplement RLIMIT_MSGQUEUE on top of ucounts
21d1c5e386bc Reimplement RLIMIT_NPROC on top of ucounts
b6c336528926 Use atomic_t for ucounts reference counting
905ae01c4ae2 Add a reference to ucounts for each cred
f9c82a4ea89c Increase size of ucounts to atomic_long_t

Also, I added a debugging patch here:

--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -847,8 +847,14 @@ int user_shm_lock(size_t size, struct ucounts *ucounts)

 void user_shm_unlock(size_t size, struct ucounts *ucounts)
 {
+       int i;
+
        spin_lock(&shmlock_user_lock);
+       printk("KK user_shm_unlock ucounts = %d\n", atomic_read(&ucounts->count));
+       for (i = 0; i < UCOUNT_COUNTS; i++)
+               printk("KK type = %d, count = %ld\n", i, atomic_long_read(&ucounts->ucount[i]));
        dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_MEMLOCK, (size + PAGE_SIZE - 1) >> PAGE_SHIFT);
+       printk("size = %zu, count = %ld\n", size, atomic_long_read(&ucounts->ucount[UCOUNT_RLIMIT_MEMLOCK]));
        spin_unlock(&shmlock_user_lock);
        put_ucounts(ucounts);

Then, I noticed that ucounts->count looks off by one. Since the later
put_ucounts() would free the "ucounts", I am wondering whether it is actually
correct that "ucounts->count == 1" when entering user_shm_unlock() while
ucounts->ns has already gone. If so, dec_rlimit_ucounts() should not
blindly traverse ucounts->ns?
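
One way to reason about this question is a toy lifetime model. The sketch
below is not kernel code and is not meant as the fix; every name in it is
hypothetical. It assumes, purely as an assumption, that a ucounts object
does not hold its own reference on its namespace unless "pins_ns" is set. In
that case the namespace can go away while the ucounts still has a perfectly
valid count of 1, and a later walk through ucounts->ns touches freed memory.

```c
/* Toy model: "freed" is a flag so the state can be inspected safely. */
struct ns_model {
	int refs;	/* references held on the namespace */
	int freed;	/* set where the real code would free the object */
};

struct uc_model {
	struct ns_model *ns;
	int count;	/* references held on this ucounts */
	int pins_ns;	/* nonzero if this ucounts holds a namespace reference */
};

static void ns_put(struct ns_model *ns)
{
	if (--ns->refs == 0)
		ns->freed = 1;
}

/* Create a ucounts with one reference, optionally pinning its namespace. */
static void uc_init(struct uc_model *uc, struct ns_model *ns, int pins_ns)
{
	uc->ns = ns;
	uc->count = 1;
	uc->pins_ns = pins_ns;
	if (pins_ns)
		ns->refs++;
}

/* Drop a ucounts reference; the last put releases the pinned namespace. */
static void uc_put(struct uc_model *uc)
{
	if (--uc->count == 0 && uc->pins_ns)
		ns_put(uc->ns);
}

/* Nonzero if walking uc->ns right now would touch a freed namespace. */
static int uc_walk_would_uaf(const struct uc_model *uc)
{
	return uc->count > 0 && uc->ns->freed;
}
```

In this model, dropping the last outside reference on the namespace makes
uc_walk_would_uaf() return 1 for an unpinned ucounts with count == 1, and 0
for a pinned one, where the namespace is only released by the final
uc_put().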

[  214.541754] KK user_shm_unlock ucounts = 1
[  214.545871] KK type = 0, count = 0
[  214.549288] KK type = 1, count = 0
[  214.552697] KK type = 2, count = 0
[  214.556104] KK type = 3, count = 0
[  214.559511] KK type = 4, count = 0
[  214.562920] KK type = 5, count = 0
[  214.566314] KK type = 6, count = 0
[  214.569718] KK type = 7, count = 0
[  214.573132] KK type = 8, count = 0
[  214.576537] KK type = 9, count = 0
[  214.579945] KK type = 10, count = 0
[  214.583441] KK type = 11, count = 0
[  214.586940] KK type = 12, count = 0
[  214.590420] KK type = 13, count = 1
[  214.593917] ==================================================================
[  214.601130] BUG: KASAN: use-after-free in dec_rlimit_ucounts+0xe8/0xf0
[  214.607657] Read of size 8 at addr ffff000905ee12f0 by task trinity-c2/9708
[  214.614611] 
[  214.616093] CPU: 13 PID: 9708 Comm: trinity-c2 Not tainted 5.12.0-00007-gd7c9e99aee48-dirty #221
[  214.624870] Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
[  214.632689] Call trace:
[  214.635124]  dump_backtrace+0x0/0x350
[  214.638781]  show_stack+0x18/0x28
[  214.642088]  dump_stack+0x120/0x18c
[  214.645570]  print_address_description.constprop.0+0x6c/0x30c
[  214.651309]  kasan_report+0x1d8/0x1f0
[  214.654964]  __asan_report_load8_noabort+0x34/0x60
[  214.659747]  dec_rlimit_ucounts+0xe8/0xf0
[  214.663748]  user_shm_unlock+0xdc/0x338
[  214.667577]  shmem_lock+0x154/0x250
[  214.671057]  shmctl_do_lock+0x310/0x5d8
[  214.674886]  ksys_shmctl.constprop.0+0x200/0x588
[  214.679496]  __arm64_sys_shmctl+0x6c/0xa0
[  214.683497]  el0_svc_common.constprop.0+0xe4/0x300
[  214.688281]  do_el0_svc+0x48/0xd0
[  214.691587]  el0_svc+0x24/0x38
[  214.694633]  el0_sync_handler+0xb0/0xb8
[  214.698460]  el0_sync+0x174/0x180
[  214.701766] 
[  214.703247] Allocated by task 9392:
[  214.706726]  kasan_save_stack+0x28/0x58
[  214.710555]  __kasan_slab_alloc+0x88/0xa8
[  214.714555]  kmem_cache_alloc+0x190/0x5b0
[  214.718555]  create_user_ns+0x158/0xa60
[  214.722384]  unshare_userns+0x44/0xe0
[  214.726038]  ksys_unshare+0x23c/0x580
[  214.729693]  __arm64_sys_unshare+0x30/0x50
[  214.733781]  el0_svc_common.constprop.0+0xe4/0x300
[  214.738564]  do_el0_svc+0x48/0xd0
[  214.741871]  e
[  214.752048]  kasan_set_track+0x28/0x40
[  214.764227]  kasan_set_free_info+0x28/0x50
[  214.768314]  __kasan_slab_free+0xd0/0x130
[  214.772316]  kmem_cache_free+0xb4/0x390
[  214.776146]  free_user_ns+0x108/0x2a8
[  214.779802]  process_one_work+0x684/0xfd0
[  214.783804]  worker_thread+0x314/0xc78
[  214.787543]  kthread+0x3a4/0x460
[  214.790763]  ret_from_fork+0x10/0x30
[  214.794330] 
[  214.795811] Last potentially related work creation:
[  214.800678]  kasan_save_stack+0x28/0x58
[  214.804505]  kasan_record_aux_stack+0xc0/0xd8
[  214.808853]  insert_work+0x50/0x2f0
[  214.812334]  __queue_work+0x314/0xac8
[  214.815988]  queue_work_on+0x94/0xc8
[  214.819555]  __put_user_ns+0x3c/0x60
[  214.823122]  put_cred_rcu+0x208/0x2f8
[  214.826775]  rcu_core+0x734/0xf68
[  214.830083]  rcu_core_si+0x10/0x20
[  214.833477]  __do_softirq+0x28c/0x774
[  214.837130] 
[  214.838610] The buggy address belongs to the object at ffff000905ee1110
[  214.838610]  which belongs to the cache user_namespace of size 600
[  214.851378] The buggy address is located 480 bytes inside of
[  214.851378]  600-byte region [ffff000905ee1110, ffff000905ee1368)
[  214.863105] The buggy address belongs to the page:
[  214.867886] page:000000000a048a0d refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x985ee0
[  214.877271] head:000000000a048a0d order:3 compound_mapcount:0 compound_pincount:0
[  214.884744] flags: 0xbfffc0000010200(slab|head)
[  214.889270] raw: 0bfffc0000010200 dead000000000100 dead000000000122 ffff0008002a3180
[  214.897003] raw: 0000000000000000 00000000802d002d 00000001ffffffff 0000000000000000
[  214.904734] page dumped because: kasan: bad access detected
[  214.910296] 
[  214.911776] Memory state around the buggy address:
[  214.916557]  ffff000905ee1180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  214.923769]  ffff000905ee1200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  214.930981] >ffff000905ee1280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  214.938191]                                                              ^
[  214.945056]  ffff000905ee1300: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
[  214.952267]  ffff000905ee1380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  214.959477] ==================================================================
[  214.967070] Disabling lock debugging due to kernel taint
[  214.972398] size = 4096, count = 0


* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-11-24 21:49       ` Qian Cai
@ 2021-11-26  5:34         ` Qian Cai
  2021-12-20  5:58           ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Qian Cai @ 2021-11-26  5:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alexey Gladkov, Yu Zhao, linux-kernel, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel

On Wed, Nov 24, 2021 at 04:49:19PM -0500, Qian Cai wrote:
> Hmm, I don't know if that or it is just this platform is lucky to trigger
> the race condition quickly, but I can't reproduce it on x86 so far. I am
> Cc'ing a few arm64 people to see if they have spotted anything I might be
> missing. The original bug report is here:
> 
> https://lore.kernel.org/lkml/YZV7Z+yXbsx9p3JN@fixkernel.com/

Okay, I am finally able to reproduce this on x86_64 with the latest
mainline as well, by enabling CONFIG_USER_NS and KASAN on top of
defconfig (I had not realized that defconfig does not select CONFIG_USER_NS
in the first place). Anyway, it still took less than 5 minutes running:

$ trinity -C 48


* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-11-26  5:34         ` Qian Cai
@ 2021-12-20  5:58           ` Eric W. Biederman
  2021-12-21 13:09             ` Alexey Gladkov
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2021-12-20  5:58 UTC (permalink / raw)
  To: Qian Cai
  Cc: Alexey Gladkov, Yu Zhao, linux-kernel, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel

Qian Cai <quic_qiancai@quicinc.com> writes:

> Okay, I am finally able to reproduce this on x86_64 with the latest
> mainline as well by setting CONFIG_USER_NS and KASAN on top of
> defconfig (I did not realize it did not select CONFIG_USER_NS in the first
> place). Anyway, it still took less than 5 minutes by running:
>
> $ trinity -C 48

It took me a while to get to the point of reproducing this, but I can
confirm I see this on a 2-core VM running 5.16.0-rc4.

Running trinity 2019.06 as packaged in Debian 11.

I didn't watch, so I don't know if it was 5 minutes, but I do know it took
less than an hour.

Now I am puzzled why there are no other reports of problems.

Now to start drilling down to figure out why the user namespace was
freed early.
----

The failure I got looked like:
> BUG: KASAN: use-after-free in dec_rlimit_ucounts+0x7b/0xb0
> Read of size 8 at addr ffff88800b7dd018 by task trinity-c3/67982
> 
> CPU: 1 PID: 67982 Comm: trinity-c3 Tainted: G  O 5.16.0-rc4 #1
> Hardware name: Xen HVM domU, BIOS 4.8.5-35.fc25 08/25/2021
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x48/0x5e
>  print_address_description.constprop.0+0x1f/0x140
>  ? dec_rlimit_ucounts+0x7b/0xb0
>  ? dec_rlimit_ucounts+0x7b/0xb0
>  kasan_report.cold+0x7f/0xe0
>  ? _raw_spin_lock+0x7f/0x11b
>  ? dec_rlimit_ucounts+0x7b/0xb0
>  dec_rlimit_ucounts+0x7b/0xb0
>  mqueue_evict_inode+0x417/0x590
>  ? perf_trace_global_dirty_state+0x350/0x350
>  ? __x64_sys_mq_unlink+0x250/0x250
>  ? _raw_spin_lock_bh+0xe0/0xe0
>  ? _raw_spin_lock_bh+0xe0/0xe0
>  evict+0x155/0x2a0
>  __x64_sys_mq_unlink+0x1a7/0x250
>  do_syscall_64+0x3b/0x90
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f0505ebc9b9
> Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 00 0f 1f 44 00 00 48 89 ....
> 
> Allocated by task 67717
> Freed by task 6027
> 
> The buggy address belongs to the object at ffff88800b7dce38
>  which belongs to the cache user_namespace of size 600
> The buggy address is located 480 bytes inside of
>  600-byte region [ffff88800b7dce38, ffff88800b7dd090]
> The buggy address belongs to the page:
> 
> trinity: Detected kernel tainting. Last seed was 1891442794

Eric


* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-12-20  5:58           ` Eric W. Biederman
@ 2021-12-21 13:09             ` Alexey Gladkov
  2021-12-27 15:22               ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Gladkov @ 2021-12-21 13:09 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Qian Cai, Yu Zhao, linux-kernel, Catalin Marinas, Will Deacon,
	Mark Rutland, linux-arm-kernel

On Sun, Dec 19, 2021 at 11:58:41PM -0600, Eric W. Biederman wrote:
> It took me a while to get to the point of reproducing this but I can
> confirm I see this with 2 core VM, running 5.16.0-rc4.
> 
> Running trinity 2019.06 packaged in debian 11.

I still can't reproduce :(

> I didn't watch so I don't know if it was 5 minutes but I do know it took
> less than an hour.

--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -209,6 +209,7 @@ void put_ucounts(struct ucounts *ucounts)

        if (atomic_dec_and_lock_irqsave(&ucounts->count, &ucounts_lock, flags)) {
                hlist_del_init(&ucounts->node);
+               ucounts->ns = NULL;
                spin_unlock_irqrestore(&ucounts_lock, flags);
                kfree(ucounts);
        }

Does the previous hack increase the likelihood of an error being
triggered?


-- 
Rgrds, legion



* Re: BUG: KASAN: use-after-free in dec_rlimit_ucounts
  2021-12-21 13:09             ` Alexey Gladkov
@ 2021-12-27 15:22               ` Eric W. Biederman
  0 siblings, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2021-12-27 15:22 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: Qian Cai, Yu Zhao, linux-kernel, Catalin Marinas, Will Deacon,
	Mark Rutland, linux-arm-kernel

Alexey Gladkov <legion@kernel.org> writes:

> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -209,6 +209,7 @@ void put_ucounts(struct ucounts *ucounts)
>
>         if (atomic_dec_and_lock_irqsave(&ucounts->count, &ucounts_lock, flags)) {
>                 hlist_del_init(&ucounts->node);
> +               ucounts->ns = NULL;
>                 spin_unlock_irqrestore(&ucounts_lock, flags);
>                 kfree(ucounts);
>         }
>
> Does the previous hack increase the likelihood of an error being
> triggered?

It doesn't seem to make a difference.  That makes sense as the kernel
address sanitizer is part of the kernel configuration required to
reproduce the issue.

Eric


Thread overview: 10+ messages
2021-11-17 22:00 BUG: KASAN: use-after-free in dec_rlimit_ucounts Qian Cai
2021-11-18 19:46 ` Eric W. Biederman
2021-11-18 20:32   ` Qian Cai
2021-11-18 20:57     ` Eric W. Biederman
2021-11-19 13:32       ` Qian Cai
2021-11-24 21:49       ` Qian Cai
2021-11-26  5:34         ` Qian Cai
2021-12-20  5:58           ` Eric W. Biederman
2021-12-21 13:09             ` Alexey Gladkov
2021-12-27 15:22               ` Eric W. Biederman
