All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found] ` <C4CB953C-4712-43F3-9E75-1A12C14B8A4B-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-03-03 15:37   ` Nikolay Borisov
       [not found]     ` <180fb7dc-790e-8e82-0cc1-c6e15ddcd20b-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolay Borisov @ 2017-03-03 15:37 UTC (permalink / raw)
  To: JongHwan Kim
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	Eric W. Biederman

[Added containers ml, Eric Biederman and Jan Kara]. Please,
next time don't add random people but take the time to see who touched
the code.

On  3.03.2017 14:16, JongHwan Kim wrote:
> I've got the following report with syzkaller fuzzer
> 
> 
> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
> 
> ==================================================================
> BUG: KASAN: use-after-free in __read_once_size
> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
> BUG: KASAN: use-after-free in atomic_read
> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
> BUG: KASAN: use-after-free in atomic_dec_if_positive
> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
> at addr ffff88006d399bc4
> Read of size 4 by task syz-executor3/19713
> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>  print_address_description mm/kasan/report.c:200 [inline]
>  kasan_report_error mm/kasan/report.c:289 [inline]
>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>  kasan_report mm/kasan/report.c:331 [inline]
>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>  __read_once_size include/linux/compiler.h:254 [inline]
>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>  __fput+0x327/0x7e0 fs/file_table.c:208
>  ____fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>  do_group_exit+0x149/0x400 kernel/exit.c:977
>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>  entry_SYSCALL_64_fastpath+0xc0/0xc2

So PID 19713 is exitting and as part of it it's freeing its file
descriptors, one of which is apparently an inotify fd. And this has
already been freed.


> RIP: 0033:0x44fb79
> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
> Allocated:
> PID = 19691
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>  set_track mm/kasan/kasan.c:514 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>  kmalloc include/linux/slab.h:490 [inline]
>  kzalloc include/linux/slab.h:663 [inline]
>  get_ucounts kernel/ucount.c:140 [inline]
>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>  entry_SYSCALL_64_fastpath+0x1f/0xc2

However, it has been actually allocated by a different process with pid
19691.


> Freed:
> PID = 19708
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>  set_track mm/kasan/kasan.c:514 [inline]
>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>  slab_free_hook mm/slub.c:1357 [inline]
>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>  slab_free mm/slub.c:2961 [inline]
>  kfree+0xe8/0x2c0 mm/slub.c:3882
>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>  __fput+0x327/0x7e0 fs/file_table.c:208
>  ____fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>  do_group_exit+0x149/0x400 kernel/exit.c:977
>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>  entry_SYSCALL_64_fastpath+0xc0/0xc2

And yet we have a third process which freed it, PID 19708. So there is
some dance happening with this fd, being allocated by one process,
handed over to 2 more, which are freeing it. Is this a valid usage
scenario of inotify descriptors?


> Memory state around the buggy address:
>  ffff88006d399a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88006d399b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>ffff88006d399b80: fc fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
>                                            ^
>  ffff88006d399c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88006d399c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
> Disabling lock debugging due to kernel taint
> Kernel panic - not syncing: panic_on_warn set ...
> 
> CPU: 1 PID: 19713 Comm: syz-executor3 Tainted: G    B           4.10.0+ #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>  panic+0x1b4/0x392 kernel/panic.c:179
>  kasan_end_report+0x5b/0x60 mm/kasan/report.c:141
>  kasan_report_error mm/kasan/report.c:293 [inline]
>  kasan_report.part.1+0x40a/0x4e0 mm/kasan/report.c:311
>  kasan_report mm/kasan/report.c:331 [inline]
>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>  __read_once_size include/linux/compiler.h:254 [inline]
>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>  __fput+0x327/0x7e0 fs/file_table.c:208
>  ____fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>  do_group_exit+0x149/0x400 kernel/exit.c:977
>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>  entry_SYSCALL_64_fastpath+0xc0/0xc2
> RIP: 0033:0x44fb79
> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
> 
> 
> Syzkaller reproducer:
> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
> Repro:false}
> inotify_init()
> 
> Syzkaller hit 'KASAN: use-after-free Read in inc_ucount' bug on commit .
> 
> ==================================================================
> BUG: KASAN: use-after-free in inc_ucount+0x90f/0xa70 kernel/ucount.c:198
> at addr ffff88003a806c20
> Read of size 8 by task syz-executor3/31498
> CPU: 3 PID: 31498 Comm: syz-executor3 Not tainted 4.10.0+ #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>  print_address_description mm/kasan/report.c:200 [inline]
>  kasan_report_error mm/kasan/report.c:289 [inline]
>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>  kasan_report mm/kasan/report.c:332 [inline]
>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
>  inc_ucount+0x90f/0xa70 kernel/ucount.c:198
>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> RIP: 0033:0x44fb79
> RSP: 002b:00007f13df946b58 EFLAGS: 00000212 ORIG_RAX: 00000000000000fd
> RAX: ffffffffffffffda RBX: 0000000000708000 RCX: 000000000044fb79
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
> R13: 00007fff6a0bde1f R14: 00007f13df9479c0 R15: 0000000000000000
> Object at ffff88003a806c10, in cache kmalloc-96 size: 96
> Allocated:
> PID = 31470
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>  set_track mm/kasan/kasan.c:514 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>  kmalloc include/linux/slab.h:490 [inline]
>  kzalloc include/linux/slab.h:663 [inline]
>  get_ucounts kernel/ucount.c:140 [inline]
>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> Freed:
> PID = 31485
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>  set_track mm/kasan/kasan.c:514 [inline]
>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>  slab_free_hook mm/slub.c:1357 [inline]
>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>  slab_free mm/slub.c:2961 [inline]
>  kfree+0xe8/0x2c0 mm/slub.c:3882
>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>  __fput+0x327/0x7e0 fs/file_table.c:208
>  ____fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>  do_group_exit+0x149/0x400 kernel/exit.c:977
>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>  entry_SYSCALL_64_fastpath+0xc0/0xc2
> Memory state around the buggy address:
>  ffff88003a806b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88003a806b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>ffff88003a806c00: fc fc fb fb fb fb fb fb fb fb fb fb fb fb fc fc
>                                ^
>  ffff88003a806c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88003a806d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
> Disabling lock debugging due to kernel taint
> Kernel panic - not syncing: panic_on_warn set ...
> 
> CPU: 3 PID: 31498 Comm: syz-executor3 Tainted: G    B           4.10.0+ #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>  panic+0x1b4/0x392 kernel/panic.c:179
>  kasan_end_report+0x5b/0x60 mm/kasan/report.c:141
>  kasan_report_error mm/kasan/report.c:293 [inline]
>  kasan_report.part.1+0x40a/0x4e0 mm/kasan/report.c:311
>  kasan_report mm/kasan/report.c:332 [inline]
>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
>  inc_ucount+0x90f/0xa70 kernel/ucount.c:198
>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> RIP: 0033:0x44fb79
> RSP: 002b:00007f13df946b58 EFLAGS: 00000212 ORIG_RAX: 00000000000000fd
> RAX: ffffffffffffffda RBX: 0000000000708000 RCX: 000000000044fb79
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
> R13: 00007fff6a0bde1f R14: 00007f13df9479c0 R15: 0000000000000000
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
> 
> 
> Syzkaller reproducer:
> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
> Repro:false}
> inotify_init()
> mknod$loop(&(0x7f0000002000-0x8)="2e2f66696c653000", 0x82, 0x0)
> tee(0xffffffffffffffff, 0xffffffffffffffff, 0x2, 0x7)
> recvfrom$unix(0xffffffffffffffff, &(0x7f0000001000-0x91)="", 0x0, 0x161,
> &(0x7f0000001000)=@file={0x1, "2e2f66696c653000"}, 0xa)

This reproducer doesn't seem like valid C code. Can you use pastebin to
paste what syzkaller has really produced?

> 
> 
> I think that previous ucount
> patch(https://github.com/torvalds/linux/commit/880a38547ff08715ce4f1daf9a4bb30c87676e68)
> isn't perfect.

I don't think that commit is the culprit. The UAF manifests in code
which I created and has soaked for quite a time in -next without any
problems.

Eric, Jan, do you have any ideas how we might be hitting a
user-after-free when the inotify fd is being passed around? Could it be
that those processes are children of the one which initially allocated
the inotify fd and they are forgetting incrementing a reference count on
fork ? Those are just guesses.

Regards,
Nikolay

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]     ` <180fb7dc-790e-8e82-0cc1-c6e15ddcd20b-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-03-03 16:30       ` Eric W. Biederman
       [not found]         ` <87pohy1fx6.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Eric W. Biederman @ 2017-03-03 16:30 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	JongHwan Kim

Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> [Added containers ml, Eric Biederman and Jan Kara]. Please,
> next time don't add random people but take the time to see who touched
> the code.

Comments below.

> On  3.03.2017 14:16, JongHwan Kim wrote:
>> I've got the following report with syzkaller fuzzer
>> 
>> 
>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>> 
>> ==================================================================
>> BUG: KASAN: use-after-free in __read_once_size
>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>> BUG: KASAN: use-after-free in atomic_read
>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>> at addr ffff88006d399bc4
>> Read of size 4 by task syz-executor3/19713
>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:15 [inline]
>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>  print_address_description mm/kasan/report.c:200 [inline]
>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>  kasan_report mm/kasan/report.c:331 [inline]
>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>  __read_once_size include/linux/compiler.h:254 [inline]
>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>  ____fput+0x15/0x20 fs/file_table.c:244
>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>  exit_task_work include/linux/task_work.h:21 [inline]
>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>
> So PID 19713 is exitting and as part of it it's freeing its file
> descriptors, one of which is apparently an inotify fd. And this has
> already been freed.
>
>
>> RIP: 0033:0x44fb79
>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>> Allocated:
>> PID = 19691
>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>  set_track mm/kasan/kasan.c:514 [inline]
>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>  kmalloc include/linux/slab.h:490 [inline]
>>  kzalloc include/linux/slab.h:663 [inline]
>>  get_ucounts kernel/ucount.c:140 [inline]
>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>
> However, it has been actually allocated by a different process with pid
> 19691.
>
>
>> Freed:
>> PID = 19708
>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>  set_track mm/kasan/kasan.c:514 [inline]
>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>  slab_free_hook mm/slub.c:1357 [inline]
>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>  slab_free mm/slub.c:2961 [inline]
>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>  ____fput+0x15/0x20 fs/file_table.c:244
>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>  exit_task_work include/linux/task_work.h:21 [inline]
>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>
> And yet we have a third process which freed it, PID 19708. So there is
> some dance happening with this fd, being allocated by one process,
> handed over to 2 more, which are freeing it. Is this a valid usage
> scenario of inotify descriptors?

They are file descriptors so passing them around is valid.  That is
something unix domain sockets have allowed since the dawn of linux.

The dance would need to be the fd being passed to the addtional
processes and then closed in the original before being closed
in the processes the fd was passed to.

If those additional processes last longer than the original process this
is easy to achieve.

My guess is that someone just taught syskallzer to pass file descriptors
around.  So this may be an old bug.  Either that or syskallzer hasn't
been looking at linux-next with KASAN enabled in the kernel.

This is definitely a weird usage pattern.

>> Memory state around the buggy address:
>>  ffff88006d399a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  ffff88006d399b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> ffff88006d399b80: fc fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
>>                                            ^
>>  ffff88006d399c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  ffff88006d399c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ==================================================================
>> Disabling lock debugging due to kernel taint
>> Kernel panic - not syncing: panic_on_warn set ...
>> 
>> CPU: 1 PID: 19713 Comm: syz-executor3 Tainted: G    B           4.10.0+ #4
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:15 [inline]
>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>  panic+0x1b4/0x392 kernel/panic.c:179
>>  kasan_end_report+0x5b/0x60 mm/kasan/report.c:141
>>  kasan_report_error mm/kasan/report.c:293 [inline]
>>  kasan_report.part.1+0x40a/0x4e0 mm/kasan/report.c:311
>>  kasan_report mm/kasan/report.c:331 [inline]
>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>  __read_once_size include/linux/compiler.h:254 [inline]
>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>  ____fput+0x15/0x20 fs/file_table.c:244
>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>  exit_task_work include/linux/task_work.h:21 [inline]
>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>> RIP: 0033:0x44fb79
>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..
>> 
>> 
>> Syzkaller reproducer:
>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>> Repro:false}
>> inotify_init()
>> 
>> Syzkaller hit 'KASAN: use-after-free Read in inc_ucount' bug on commit .
>> 
>> ==================================================================
>> BUG: KASAN: use-after-free in inc_ucount+0x90f/0xa70 kernel/ucount.c:198
>> at addr ffff88003a806c20
>> Read of size 8 by task syz-executor3/31498
>> CPU: 3 PID: 31498 Comm: syz-executor3 Not tainted 4.10.0+ #4
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:15 [inline]
>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>  print_address_description mm/kasan/report.c:200 [inline]
>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>  kasan_report mm/kasan/report.c:332 [inline]
>>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
>>  inc_ucount+0x90f/0xa70 kernel/ucount.c:198
>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>> RIP: 0033:0x44fb79
>> RSP: 002b:00007f13df946b58 EFLAGS: 00000212 ORIG_RAX: 00000000000000fd
>> RAX: ffffffffffffffda RBX: 0000000000708000 RCX: 000000000044fb79
>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>> RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
>> R13: 00007fff6a0bde1f R14: 00007f13df9479c0 R15: 0000000000000000
>> Object at ffff88003a806c10, in cache kmalloc-96 size: 96
>> Allocated:
>> PID = 31470
>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>  set_track mm/kasan/kasan.c:514 [inline]
>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>  kmalloc include/linux/slab.h:490 [inline]
>>  kzalloc include/linux/slab.h:663 [inline]
>>  get_ucounts kernel/ucount.c:140 [inline]
>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>> Freed:
>> PID = 31485
>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>  set_track mm/kasan/kasan.c:514 [inline]
>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>  slab_free_hook mm/slub.c:1357 [inline]
>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>  slab_free mm/slub.c:2961 [inline]
>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>  ____fput+0x15/0x20 fs/file_table.c:244
>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>  exit_task_work include/linux/task_work.h:21 [inline]
>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>> Memory state around the buggy address:
>>  ffff88003a806b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  ffff88003a806b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> ffff88003a806c00: fc fc fb fb fb fb fb fb fb fb fb fb fb fb fc fc
>>                                ^
>>  ffff88003a806c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  ffff88003a806d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ==================================================================
>> Disabling lock debugging due to kernel taint
>> Kernel panic - not syncing: panic_on_warn set ...
>> 
>> CPU: 3 PID: 31498 Comm: syz-executor3 Tainted: G    B           4.10.0+ #4
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:15 [inline]
>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>  panic+0x1b4/0x392 kernel/panic.c:179
>>  kasan_end_report+0x5b/0x60 mm/kasan/report.c:141
>>  kasan_report_error mm/kasan/report.c:293 [inline]
>>  kasan_report.part.1+0x40a/0x4e0 mm/kasan/report.c:311
>>  kasan_report mm/kasan/report.c:332 [inline]
>>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
>>  inc_ucount+0x90f/0xa70 kernel/ucount.c:198
>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>> RIP: 0033:0x44fb79
>> RSP: 002b:00007f13df946b58 EFLAGS: 00000212 ORIG_RAX: 00000000000000fd
>> RAX: ffffffffffffffda RBX: 0000000000708000 RCX: 000000000044fb79
>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>> RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
>> R13: 00007fff6a0bde1f R14: 00007f13df9479c0 R15: 0000000000000000
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..
>> 
>> 
>> Syzkaller reproducer:
>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>> Repro:false}
>> inotify_init()
>> mknod$loop(&(0x7f0000002000-0x8)="2e2f66696c653000", 0x82, 0x0)
>> tee(0xffffffffffffffff, 0xffffffffffffffff, 0x2, 0x7)
>> recvfrom$unix(0xffffffffffffffff, &(0x7f0000001000-0x91)="", 0x0, 0x161,
>> &(0x7f0000001000)=@file={0x1, "2e2f66696c653000"}, 0xa)
>
> This reproducer doesn't seem like valid C code. Can you use pastebin to
> paste what syzkaller has really produced?
>
>> 
>> 
>> I think that previous ucount
>> patch(https://github.com/torvalds/linux/commit/880a38547ff08715ce4f1daf9a4bb30c87676e68)
>> isn't perfect.
>
> I don't think that commit is the culprit. The UAF manifests in code
> which I created and has soaked for quite a time in -next without any
> problems.

I agree that commit is very unlikely to be the culprit.  I don't expect
anyone has been testing the file file descriptor passing case in
linux-next or much of elsewhere as passing inotify file descriptor
around is not particularly useful.

> Eric, Jan, do you have any ideas how we might be hitting a
> user-after-free when the inotify fd is being passed around? Could it be
> that those processes are children of the one which initially allocated
> the inotify fd and they are forgetting incrementing a reference count on
> fork ? Those are just guesses.

I am not at all certain.   There is nothing obvious in your patch that
would change reference counting behavior.  So this may be an old bug
manifesting now.  Possibly because of the modest change in data
structures of your patch.

So I suspect the best course for tracking this down is going to be going
through the inotify code and seeing if there is anything that could
cause a leak.

Well that and setting up a small reproducer that passes file descriptors
around so you can reproduce this.  Which will help to know if you have
fixed this.  This definitely looks like it is worth looking into.

Eric

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]         ` <87pohy1fx6.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-03-03 16:45           ` Nikolay Borisov
       [not found]             ` <d0a23a9c-65a6-063f-748b-e399bed6dacd-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-03-04 10:58           ` Nikolay Borisov
  1 sibling, 1 reply; 23+ messages in thread
From: Nikolay Borisov @ 2017-03-03 16:45 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	JongHwan Kim



On  3.03.2017 18:30, Eric W. Biederman wrote:
> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> 
>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>> next time don't add random people but take the time to see who touched
>> the code.
> 
> Comments below.
> 
>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>> I've got the following report with syzkaller fuzzer
>>>
>>>
>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>
>>> ==================================================================
>>> BUG: KASAN: use-after-free in __read_once_size
>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>> BUG: KASAN: use-after-free in atomic_read
>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>> at addr ffff88006d399bc4
>>> Read of size 4 by task syz-executor3/19713
>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>
>> So PID 19713 is exitting and as part of it it's freeing its file
>> descriptors, one of which is apparently an inotify fd. And this has
>> already been freed.
>>
>>
>>> RIP: 0033:0x44fb79
>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>> Allocated:
>>> PID = 19691
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>  kmalloc include/linux/slab.h:490 [inline]
>>>  kzalloc include/linux/slab.h:663 [inline]
>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>
>> However, it has been actually allocated by a different process with pid
>> 19691.
>>
>>
>>> Freed:
>>> PID = 19708
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>  slab_free mm/slub.c:2961 [inline]
>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>
>> And yet we have a third process which freed it, PID 19708. So there is
>> some dance happening with this fd, being allocated by one process,
>> handed over to 2 more, which are freeing it. Is this a valid usage
>> scenario of inotify descriptors?
> 
> They are file descriptors so passing them around is valid.  That is
> something unix domain sockets have allowed since the dawn of linux.
> 
> The dance would need to be the fd being passed to the addtional
> processes and then closed in the original before being closed
> in the processes the fd was passed to.
> 
> If those additional processes last longer than the original process this
> is easy to achieve.
> 
> My guess is that someone just taught syskallzer to pass file descriptors
> around.  So this may be an old bug.  Either that or syskallzer hasn't
> been looking at linux-next with KASAN enabled in the kernel.
> 
> This is definitely a weird usage pattern.

If we assume that passing the fds is the problem then this *might*
explain the first bug. However, if you look closely into the second set
of stacktraces, you see how the crash occurs on inotify fd
initialization, and according to the trace this line is the culprit:
tns = iter->ns;

And this points to a deeper problem, if ucounts can disappear in the
middle of iterating through them.

> 
>>> Memory state around the buggy address:
>>>  ffff88006d399a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>  ffff88006d399b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>> ffff88006d399b80: fc fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
>>>                                            ^
>>>  ffff88006d399c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>  ffff88006d399c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> ==================================================================
>>> Disabling lock debugging due to kernel taint
>>> Kernel panic - not syncing: panic_on_warn set ...
>>>
>>> CPU: 1 PID: 19713 Comm: syz-executor3 Tainted: G    B           4.10.0+ #4
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>  panic+0x1b4/0x392 kernel/panic.c:179
>>>  kasan_end_report+0x5b/0x60 mm/kasan/report.c:141
>>>  kasan_report_error mm/kasan/report.c:293 [inline]
>>>  kasan_report.part.1+0x40a/0x4e0 mm/kasan/report.c:311
>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>> RIP: 0033:0x44fb79
>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>> Dumping ftrace buffer:
>>>    (ftrace buffer empty)
>>> Kernel Offset: disabled
>>> Rebooting in 86400 seconds..
>>>
>>>
>>> Syzkaller reproducer:
>>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>>> Repro:false}
>>> inotify_init()
>>>
>>> Syzkaller hit 'KASAN: use-after-free Read in inc_ucount' bug on commit .
>>>
>>> ==================================================================
>>> BUG: KASAN: use-after-free in inc_ucount+0x90f/0xa70 kernel/ucount.c:198
>>> at addr ffff88003a806c20
>>> Read of size 8 by task syz-executor3/31498
>>> CPU: 3 PID: 31498 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>  kasan_report mm/kasan/report.c:332 [inline]
>>>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
>>>  inc_ucount+0x90f/0xa70 kernel/ucount.c:198
>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>> RIP: 0033:0x44fb79
>>> RSP: 002b:00007f13df946b58 EFLAGS: 00000212 ORIG_RAX: 00000000000000fd
>>> RAX: ffffffffffffffda RBX: 0000000000708000 RCX: 000000000044fb79
>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>>> RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
>>> R13: 00007fff6a0bde1f R14: 00007f13df9479c0 R15: 0000000000000000
>>> Object at ffff88003a806c10, in cache kmalloc-96 size: 96
>>> Allocated:
>>> PID = 31470
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>  kmalloc include/linux/slab.h:490 [inline]
>>>  kzalloc include/linux/slab.h:663 [inline]
>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>> Freed:
>>> PID = 31485
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>  slab_free mm/slub.c:2961 [inline]
>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>> Memory state around the buggy address:
>>>  ffff88003a806b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>  ffff88003a806b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>> ffff88003a806c00: fc fc fb fb fb fb fb fb fb fb fb fb fb fb fc fc
>>>                                ^
>>>  ffff88003a806c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>  ffff88003a806d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> ==================================================================
>>> Disabling lock debugging due to kernel taint
>>> Kernel panic - not syncing: panic_on_warn set ...
>>>
>>> CPU: 3 PID: 31498 Comm: syz-executor3 Tainted: G    B           4.10.0+ #4
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>  panic+0x1b4/0x392 kernel/panic.c:179
>>>  kasan_end_report+0x5b/0x60 mm/kasan/report.c:141
>>>  kasan_report_error mm/kasan/report.c:293 [inline]
>>>  kasan_report.part.1+0x40a/0x4e0 mm/kasan/report.c:311
>>>  kasan_report mm/kasan/report.c:332 [inline]
>>>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
>>>  inc_ucount+0x90f/0xa70 kernel/ucount.c:198
>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>> RIP: 0033:0x44fb79
>>> RSP: 002b:00007f13df946b58 EFLAGS: 00000212 ORIG_RAX: 00000000000000fd
>>> RAX: ffffffffffffffda RBX: 0000000000708000 RCX: 000000000044fb79
>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>>> RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
>>> R13: 00007fff6a0bde1f R14: 00007f13df9479c0 R15: 0000000000000000
>>> Dumping ftrace buffer:
>>>    (ftrace buffer empty)
>>> Kernel Offset: disabled
>>> Rebooting in 86400 seconds..
>>>
>>>
>>> Syzkaller reproducer:
>>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>>> Repro:false}
>>> inotify_init()
>>> mknod$loop(&(0x7f0000002000-0x8)="2e2f66696c653000", 0x82, 0x0)
>>> tee(0xffffffffffffffff, 0xffffffffffffffff, 0x2, 0x7)
>>> recvfrom$unix(0xffffffffffffffff, &(0x7f0000001000-0x91)="", 0x0, 0x161,
>>> &(0x7f0000001000)=@file={0x1, "2e2f66696c653000"}, 0xa)
>>
>> This reproducer doesn't seem like valid C code. Can you use pastebin to
>> paste what syzkaller has really produced?
>>
>>>
>>>
>>> I think that previous ucount
>>> patch(https://github.com/torvalds/linux/commit/880a38547ff08715ce4f1daf9a4bb30c87676e68)
>>> isn't perfect.
>>
>> I don't think that commit is the culprit. The UAF manifests in code
>> which I created and has soaked for quite a time in -next without any
>> problems.
> 
> I agree that commit is very unlikely to be the culprit.  I don't expect
> anyone has been testing the file file descriptor passing case in
> linux-next or much of elsewhere as passing inotify file descriptor
> around is not particularly useful.
> 
>> Eric, Jan, do you have any ideas how we might be hitting a
>> user-after-free when the inotify fd is being passed around? Could it be
>> that those processes are children of the one which initially allocated
>> the inotify fd and they are forgetting incrementing a reference count on
>> fork ? Those are just guesses.
> 
> I am not at all certain.   There is nothing obvious in your patch that
> would change reference counting behavior.  So this may be an old bug
> manifesting now.  Possibly because of the modest change in data
> structures of your patch.
> 
> So I suspect the best course for tracking this down is going to be going
> through the inotify code and seeing if there is anything that could
> cause a leak.
> 
> Well that and setting up a small reproducer that passes file descriptors
> around so you can reproduce this.  Which will help to know if you have
> fixed this.  This definitely looks like it is worth looking into.
> 
> Eric
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]             ` <d0a23a9c-65a6-063f-748b-e399bed6dacd-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-03-03 16:46               ` Eric W. Biederman
  0 siblings, 0 replies; 23+ messages in thread
From: Eric W. Biederman @ 2017-03-03 16:46 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	JongHwan Kim

Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> On  3.03.2017 18:30, Eric W. Biederman wrote:
>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>> 
>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>> next time don't add random people but take the time to see who touched
>>> the code.
>> 
>> Comments below.
>> 
>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>> I've got the following report with syzkaller fuzzer
>>>>
>>>>
>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>
>>>> ==================================================================
>>>> BUG: KASAN: use-after-free in __read_once_size
>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>> BUG: KASAN: use-after-free in atomic_read
>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>> at addr ffff88006d399bc4
>>>> Read of size 4 by task syz-executor3/19713
>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>> Call Trace:
>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>
>>> So PID 19713 is exitting and as part of it it's freeing its file
>>> descriptors, one of which is apparently an inotify fd. And this has
>>> already been freed.
>>>
>>>
>>>> RIP: 0033:0x44fb79
>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>> Allocated:
>>>> PID = 19691
>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>
>>> However, it has been actually allocated by a different process with pid
>>> 19691.
>>>
>>>
>>>> Freed:
>>>> PID = 19708
>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>  slab_free mm/slub.c:2961 [inline]
>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>
>>> And yet we have a third process which freed it, PID 19708. So there is
>>> some dance happening with this fd, being allocated by one process,
>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>> scenario of inotify descriptors?
>> 
>> They are file descriptors so passing them around is valid.  That is
>> something unix domain sockets have allowed since the dawn of linux.
>> 
>> The dance would need to be the fd being passed to the addtional
>> processes and then closed in the original before being closed
>> in the processes the fd was passed to.
>> 
>> If those additional processes last longer than the original process this
>> is easy to achieve.
>> 
>> My guess is that someone just taught syskallzer to pass file descriptors
>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>> been looking at linux-next with KASAN enabled in the kernel.
>> 
>> This is definitely a weird usage pattern.
>
> If we assume that passing the fds is the problem then this *might*
> explain the first bug. However, if you look closely into the second set
> of stacktraces, you see how the crash occurs on inotify fd
> initialization, and according to the trace this line is the culprit:
> tns = iter->ns;
>
> And this points to a deeper problem, if ucounts can disappear in the
> middle of iterating through them.

Ugh.  Yes.  That does sound like a data structure corruption problem.
I had not seen that.  It could just be a reference counting problem
as children are supposed to have references to their parents, but
this looks more like a memory stomp.

Grrr.

Tracking a memory stomp really needs a reproducer because those are
ugly.

Eric

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]         ` <87pohy1fx6.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  2017-03-03 16:45           ` Nikolay Borisov
@ 2017-03-04 10:58           ` Nikolay Borisov
       [not found]             ` <1aafd5e9-d9de-e5d9-a77d-cf245c5a5c6a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: Nikolay Borisov @ 2017-03-04 10:58 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	Eric W. Biederman, JongHwan Kim

[Addressing Dmitry Vyukov to ask for syzkaller clarification]

On  3.03.2017 18:30, Eric W. Biederman wrote:
> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> 
>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>> next time don't add random people but take the time to see who touched
>> the code.
> 
> Comments below.
> 
>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>> I've got the following report with syzkaller fuzzer
>>>
>>>
>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>
>>> ==================================================================
>>> BUG: KASAN: use-after-free in __read_once_size
>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>> BUG: KASAN: use-after-free in atomic_read
>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>> at addr ffff88006d399bc4
>>> Read of size 4 by task syz-executor3/19713
>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>
>> So PID 19713 is exitting and as part of it it's freeing its file
>> descriptors, one of which is apparently an inotify fd. And this has
>> already been freed.
>>
>>
>>> RIP: 0033:0x44fb79
>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>> Allocated:
>>> PID = 19691
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>  kmalloc include/linux/slab.h:490 [inline]
>>>  kzalloc include/linux/slab.h:663 [inline]
>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>
>> However, it has been actually allocated by a different process with pid
>> 19691.
>>
>>
>>> Freed:
>>> PID = 19708
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>  slab_free mm/slub.c:2961 [inline]
>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>
>> And yet we have a third process which freed it, PID 19708. So there is
>> some dance happening with this fd, being allocated by one process,
>> handed over to 2 more, which are freeing it. Is this a valid usage
>> scenario of inotify descriptors?
> 
> They are file descriptors so passing them around is valid.  That is
> something unix domain sockets have allowed since the dawn of linux.
> 
> The dance would need to be the fd being passed to the addtional
> processes and then closed in the original before being closed
> in the processes the fd was passed to.
> 
> If those additional processes last longer than the original process this
> is easy to achieve.
> 
> My guess is that someone just taught syskallzer to pass file descriptors
> around.  So this may be an old bug.  Either that or syskallzer hasn't
> been looking at linux-next with KASAN enabled in the kernel.

Dmitry, can you tell if syzkaller tests sending file descriptors across
sockets? Since the calltraces here show multiple processes being
involved in different operations on the exact same file descriptor.

Also JongHwan, can you provide the full, compilable reproducer to try
and track this issue down?


[SNIP]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]             ` <1aafd5e9-d9de-e5d9-a77d-cf245c5a5c6a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-03-04 11:44               ` Dmitry Vyukov via Containers
       [not found]                 ` <CACT4Y+a043_qFN=2uzvYhDFwF4JS9_p3jnOykpMk=RcpqQfKGQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-04 11:44 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, Eric W. Biederman, JongHwan Kim

On Sat, Mar 4, 2017 at 11:58 AM, Nikolay Borisov
<n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>
> On  3.03.2017 18:30, Eric W. Biederman wrote:
>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>
>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>> next time don't add random people but take the time to see who touched
>>> the code.
>>
>> Comments below.
>>
>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>> I've got the following report with syzkaller fuzzer
>>>>
>>>>
>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>
>>>> ==================================================================
>>>> BUG: KASAN: use-after-free in __read_once_size
>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>> BUG: KASAN: use-after-free in atomic_read
>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>> at addr ffff88006d399bc4
>>>> Read of size 4 by task syz-executor3/19713
>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>> Call Trace:
>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>
>>> So PID 19713 is exitting and as part of it it's freeing its file
>>> descriptors, one of which is apparently an inotify fd. And this has
>>> already been freed.
>>>
>>>
>>>> RIP: 0033:0x44fb79
>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>> Allocated:
>>>> PID = 19691
>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>
>>> However, it has been actually allocated by a different process with pid
>>> 19691.
>>>
>>>
>>>> Freed:
>>>> PID = 19708
>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>  slab_free mm/slub.c:2961 [inline]
>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>
>>> And yet we have a third process which freed it, PID 19708. So there is
>>> some dance happening with this fd, being allocated by one process,
>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>> scenario of inotify descriptors?
>>
>> They are file descriptors so passing them around is valid.  That is
>> something unix domain sockets have allowed since the dawn of linux.
>>
>> The dance would need to be the fd being passed to the addtional
>> processes and then closed in the original before being closed
>> in the processes the fd was passed to.
>>
>> If those additional processes last longer than the original process this
>> is easy to achieve.
>>
>> My guess is that someone just taught syskallzer to pass file descriptors
>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>> been looking at linux-next with KASAN enabled in the kernel.
>
> Dmitry, can you tell if syzkaller tests sending file descriptors across
> sockets? Since the calltraces here show multiple processes being
> involved in different operations on the exact same file descriptor.
>
> Also JongHwan, can you provide the full, compilable reproducer to try
> and track this issue down?


syzkaller can pass descriptors across sockets, but currently only
within a single multi-threaded process.

Are you sure it's the same descriptor? It seems to me that it's struct
ucounts, which is shared via the global ucounts_hashtable, so no
sharing is required in user processes.

Unless I am missing something, we want:

@@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
user_namespace *ns, kuid_t uid)
                        ucounts = new;
                }
        }
-       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
+       if (!atomic_add_unless(&ucounts->count, 1, 0))
                ucounts = NULL;
        spin_unlock_irq(&ucounts_lock);
        return ucounts;

no?

put_ucounts drops the last reference, then get_ucounts finds the
ucounts and successfully increments refcount as it's not INT_MAX (it's
0) and starts using it, meanwhile put_ucounts proceeds to
unconditionally deleting the ucounts.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                 ` <CACT4Y+a043_qFN=2uzvYhDFwF4JS9_p3jnOykpMk=RcpqQfKGQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-04 11:50                   ` Dmitry Vyukov via Containers
       [not found]                     ` <CACT4Y+Yev63VXYm+kZdii5kheV_ACBn2cehFcFdz6LBVam3Q2g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-04 11:50 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, Eric W. Biederman, JongHwan Kim

On Sat, Mar 4, 2017 at 12:44 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> On Sat, Mar 4, 2017 at 11:58 AM, Nikolay Borisov
> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>>
>> On  3.03.2017 18:30, Eric W. Biederman wrote:
>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>
>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>>> next time don't add random people but take the time to see who touched
>>>> the code.
>>>
>>> Comments below.
>>>
>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>>> I've got the following report with syzkaller fuzzer
>>>>>
>>>>>
>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>>
>>>>> ==================================================================
>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>>> BUG: KASAN: use-after-free in atomic_read
>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>> at addr ffff88006d399bc4
>>>>> Read of size 4 by task syz-executor3/19713
>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>>> Call Trace:
>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>
>>>> So PID 19713 is exitting and as part of it it's freeing its file
>>>> descriptors, one of which is apparently an inotify fd. And this has
>>>> already been freed.
>>>>
>>>>
>>>>> RIP: 0033:0x44fb79
>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>>> Allocated:
>>>>> PID = 19691
>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>
>>>> However, it has been actually allocated by a different process with pid
>>>> 19691.
>>>>
>>>>
>>>>> Freed:
>>>>> PID = 19708
>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>>  slab_free mm/slub.c:2961 [inline]
>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>
>>>> And yet we have a third process which freed it, PID 19708. So there is
>>>> some dance happening with this fd, being allocated by one process,
>>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>>> scenario of inotify descriptors?
>>>
>>> They are file descriptors so passing them around is valid.  That is
>>> something unix domain sockets have allowed since the dawn of linux.
>>>
>>> The dance would need to be the fd being passed to the addtional
>>> processes and then closed in the original before being closed
>>> in the processes the fd was passed to.
>>>
>>> If those additional processes last longer than the original process this
>>> is easy to achieve.
>>>
>>> My guess is that someone just taught syskallzer to pass file descriptors
>>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>>> been looking at linux-next with KASAN enabled in the kernel.
>>
>> Dmitry, can you tell if syzkaller tests sending file descriptors across
>> sockets? Since the calltraces here show multiple processes being
>> involved in different operations on the exact same file descriptor.
>>
>> Also JongHwan, can you provide the full, compilable reproducer to try
>> and track this issue down?
>
>
> syzkaller can pass descriptors across sockets, but currently only
> within a single multi-threaded process.
>
> Are you sure it's the same descriptor? It seems to me that it's struct
> ucounts, which is shared via the global ucounts_hashtable, so no
> sharing is required in user processes.
>
> Unless I am missing something, we want:
>
> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
> user_namespace *ns, kuid_t uid)
>                         ucounts = new;
>                 }
>         }
> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
>                 ucounts = NULL;
>         spin_unlock_irq(&ucounts_lock);
>         return ucounts;
>
> no?
>
> put_ucounts drops the last reference, then get_ucounts finds the
> ucounts and successfully increments refcount as it's not INT_MAX (it's
> 0) and starts using it, meanwhile put_ucounts proceeds to
> unconditionally deleting the ucounts.


It also seems that a concurrent put_ucounts can make get_ucounts fail
_spuriously_, which does not look good.
Don't we want something along the following lines?

diff --git a/kernel/ucount.c b/kernel/ucount.c
index 8a11fc0cb459..233c8e46acd5 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
user_namespace *ns, kuid_t uid)

                new->ns = ns;
                new->uid = uid;
-               atomic_set(&new->count, 0);
+               atomic_set(&new->count, 1);

                spin_lock_irq(&ucounts_lock);
                ucounts = find_ucounts(ns, uid, hashent);
                if (ucounts) {
+                       atomic_inc(&ucounts->count);
                        kfree(new);
                } else {
                        hlist_add_head(&new->node, hashent);
                        ucounts = new;
                }
        }
-       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
-               ucounts = NULL;
        spin_unlock_irq(&ucounts_lock);
        return ucounts;
 }
@@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)

        if (atomic_dec_and_test(&ucounts->count)) {
                spin_lock_irqsave(&ucounts_lock, flags);
-               hlist_del_init(&ucounts->node);
+               if (atomic_read(&ucounts->count) == 0)
+                       hlist_del_init(&ucounts->node);
+               else
+                       ucounts = NULL;
                spin_unlock_irqrestore(&ucounts_lock, flags);

                kfree(ucounts);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                     ` <CACT4Y+Yev63VXYm+kZdii5kheV_ACBn2cehFcFdz6LBVam3Q2g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-04 11:57                       ` Dmitry Vyukov via Containers
  2017-03-04 12:01                       ` Dmitry Vyukov via Containers
  1 sibling, 0 replies; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-04 11:57 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, Eric W. Biederman, JongHwan Kim

On Sat, Mar 4, 2017 at 12:50 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> On Sat, Mar 4, 2017 at 12:44 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>> On Sat, Mar 4, 2017 at 11:58 AM, Nikolay Borisov
>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>>>
>>> On  3.03.2017 18:30, Eric W. Biederman wrote:
>>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>>
>>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>>>> next time don't add random people but take the time to see who touched
>>>>> the code.
>>>>
>>>> Comments below.
>>>>
>>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>>>> I've got the following report with syzkaller fuzzer
>>>>>>
>>>>>>
>>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>>>
>>>>>> ==================================================================
>>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>>>> BUG: KASAN: use-after-free in atomic_read
>>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>> at addr ffff88006d399bc4
>>>>>> Read of size 4 by task syz-executor3/19713
>>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>>>> Call Trace:
>>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>
>>>>> So PID 19713 is exitting and as part of it it's freeing its file
>>>>> descriptors, one of which is apparently an inotify fd. And this has
>>>>> already been freed.
>>>>>
>>>>>
>>>>>> RIP: 0033:0x44fb79
>>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>>>> Allocated:
>>>>>> PID = 19691
>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>
>>>>> However, it has been actually allocated by a different process with pid
>>>>> 19691.
>>>>>
>>>>>
>>>>>> Freed:
>>>>>> PID = 19708
>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>>>  slab_free mm/slub.c:2961 [inline]
>>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>
>>>>> And yet we have a third process which freed it, PID 19708. So there is
>>>>> some dance happening with this fd, being allocated by one process,
>>>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>>>> scenario of inotify descriptors?
>>>>
>>>> They are file descriptors so passing them around is valid.  That is
>>>> something unix domain sockets have allowed since the dawn of linux.
>>>>
>>>> The dance would need to be the fd being passed to the addtional
>>>> processes and then closed in the original before being closed
>>>> in the processes the fd was passed to.
>>>>
>>>> If those additional processes last longer than the original process this
>>>> is easy to achieve.
>>>>
>>>> My guess is that someone just taught syskallzer to pass file descriptors
>>>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>>>> been looking at linux-next with KASAN enabled in the kernel.
>>>
>>> Dmitry, can you tell if syzkaller tests sending file descriptors across
>>> sockets? Since the calltraces here show multiple processes being
>>> involved in different operations on the exact same file descriptor.
>>>
>>> Also JongHwan, can you provide the full, compilable reproducer to try
>>> and track this issue down?
>>
>>
>> syzkaller can pass descriptors across sockets, but currently only
>> within a single multi-threaded process.
>>
>> Are you sure it's the same descriptor? It seems to me that it's struct
>> ucounts, which is shared via the global ucounts_hashtable, so no
>> sharing is required in user processes.
>>
>> Unless I am missing something, we want:
>>
>> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
>> user_namespace *ns, kuid_t uid)
>>                         ucounts = new;
>>                 }
>>         }
>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
>>                 ucounts = NULL;
>>         spin_unlock_irq(&ucounts_lock);
>>         return ucounts;
>>
>> no?
>>
>> put_ucounts drops the last reference, then get_ucounts finds the
>> ucounts and successfully increments refcount as it's not INT_MAX (it's
>> 0) and starts using it, meanwhile put_ucounts proceeds to
>> unconditionally deleting the ucounts.
>
>
> It also seems that a concurrent put_ucounts can make get_ucounts fail
> _spuriously_, which does not look good.
> Don't we want something along the following lines?
>
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index 8a11fc0cb459..233c8e46acd5 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
> user_namespace *ns, kuid_t uid)
>
>                 new->ns = ns;
>                 new->uid = uid;
> -               atomic_set(&new->count, 0);
> +               atomic_set(&new->count, 1);
>
>                 spin_lock_irq(&ucounts_lock);
>                 ucounts = find_ucounts(ns, uid, hashent);
>                 if (ucounts) {
> +                       atomic_inc(&ucounts->count);
>                         kfree(new);
>                 } else {
>                         hlist_add_head(&new->node, hashent);
>                         ucounts = new;
>                 }
>         }
> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
> -               ucounts = NULL;
>         spin_unlock_irq(&ucounts_lock);
>         return ucounts;
>  }
> @@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)
>
>         if (atomic_dec_and_test(&ucounts->count)) {
>                 spin_lock_irqsave(&ucounts_lock, flags);
> -               hlist_del_init(&ucounts->node);
> +               if (atomic_read(&ucounts->count) == 0)
> +                       hlist_del_init(&ucounts->node);
> +               else
> +                       ucounts = NULL;
>                 spin_unlock_irqrestore(&ucounts_lock, flags);
>
>                 kfree(ucounts);


I was able to reproduce it by adding udelay(100) into put_ucounts
after the decrement. The workload was just lots of parallel processes
creating private inotify's.


==================================================================
BUG: KASAN: use-after-free in atomic_dec_if_positive
include/linux/compiler.h:254 [inline] at addr ffff88006c86de3c
BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210
kernel/ucount.c:219 at addr ffff88006c86de3c
Read of size 4 by task udevadm/1644
CPU: 2 PID: 1644 Comm: udevadm Not tainted 4.10.0+ #278
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 kasan_object_err+0x1c/0x70 mm/kasan/report.c:166
 print_address_description mm/kasan/report.c:204 [inline]
 kasan_report_error mm/kasan/report.c:288 [inline]
 kasan_report.part.2+0x198/0x440 mm/kasan/report.c:310
 kasan_report mm/kasan/report.c:330 [inline]
 __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:330
 atomic_dec_if_positive include/linux/compiler.h:254 [inline]
 dec_ucount+0x1e5/0x210 kernel/ucount.c:219
 dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
 inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
 fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
 fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
 fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
 inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
 __fput+0x332/0x7f0 fs/file_table.c:208
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 tracehook_notify_resume include/linux/tracehook.h:191 [inline]
 exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2
RIP: 0033:0x7f14a12fa2b0
RSP: 002b:00007ffdeea0a6c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f14a12fa2b0
RDX: 0000000000e03000 RSI: 00007f14a15b0e40 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00007f14a1bf37a0 R09: 0000000000000000
R10: 1999999999999999 R11: 0000000000000246 R12: 0000000000000078
R13: 0000000000000000 R14: 431bde82d7b634db R15: 0000000000de2130
Object at ffff88006c86de00, in cache kmalloc-96 size: 96
Allocated:
PID = 0
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514 [inline]
 kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:605
 kmem_cache_alloc_trace+0x10b/0x6e0 mm/slab.c:3637
 kmalloc include/linux/slab.h:490 [inline]
 kzalloc include/linux/slab.h:663 [inline]
 get_ucounts kernel/ucount.c:140 [inline]
 inc_ucount+0x482/0x950 kernel/ucount.c:197
 inc_mnt_namespaces fs/namespace.c:2843 [inline]
 alloc_mnt_ns+0xfe/0x560 fs/namespace.c:2874
 create_mnt_ns+0x6e/0x310 fs/namespace.c:2984
 init_mount_tree fs/namespace.c:3224 [inline]
 mnt_init+0x2b8/0x471 fs/namespace.c:3276
 vfs_caches_init+0xaa/0x156 fs/dcache.c:3626
 start_kernel+0x72e/0x7d2 init/main.c:648
 x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:195
 x86_64_start_kernel+0x13c/0x149 arch/x86/kernel/head64.c:176
 verify_cpu+0x0/0xfc
Freed:
PID = 1644
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514 [inline]
 kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:578
 __cache_free mm/slab.c:3513 [inline]
 kfree+0xd3/0x250 mm/slab.c:3830
 put_ucounts+0x341/0x3e0 kernel/ucount.c:174
 dec_ucount+0x172/0x210 kernel/ucount.c:222
 dec_inotify_watches fs/notify/inotify/inotify.h:47 [inline]
 inotify_ignored_and_remove_idr+0x80/0x90 fs/notify/inotify/inotify_user.c:501
 inotify_freeing_mark+0x1d/0x30 fs/notify/inotify/inotify_fsnotify.c:125
 __fsnotify_free_mark+0x13c/0x2b0 fs/notify/mark.c:203
 fsnotify_detach_group_marks+0xd2/0x180 fs/notify/mark.c:508
 fsnotify_destroy_group+0x62/0x120 fs/notify/group.c:70
 inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
 __fput+0x332/0x7f0 fs/file_table.c:208
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 tracehook_notify_resume include/linux/tracehook.h:191 [inline]
 exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2
Memory state around the buggy address:
 ffff88006c86dd00: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
 ffff88006c86dd80: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
>ffff88006c86de00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                                        ^
 ffff88006c86de80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
 ffff88006c86df00: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
==================================================================

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                     ` <CACT4Y+Yev63VXYm+kZdii5kheV_ACBn2cehFcFdz6LBVam3Q2g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-03-04 11:57                       ` Dmitry Vyukov via Containers
@ 2017-03-04 12:01                       ` Dmitry Vyukov via Containers
       [not found]                         ` <CACT4Y+b7YXnoOkdYTjp6D6XH-zuYRx063hMayOVMyJVc735dtQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-04 12:01 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, Eric W. Biederman, JongHwan Kim

On Sat, Mar 4, 2017 at 12:50 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>>>
>>> On  3.03.2017 18:30, Eric W. Biederman wrote:
>>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>>
>>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>>>> next time don't add random people but take the time to see who touched
>>>>> the code.
>>>>
>>>> Comments below.
>>>>
>>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>>>> I've got the following report with syzkaller fuzzer
>>>>>>
>>>>>>
>>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>>>
>>>>>> ==================================================================
>>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>>>> BUG: KASAN: use-after-free in atomic_read
>>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>> at addr ffff88006d399bc4
>>>>>> Read of size 4 by task syz-executor3/19713
>>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>>>> Call Trace:
>>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>
>>>>> So PID 19713 is exitting and as part of it it's freeing its file
>>>>> descriptors, one of which is apparently an inotify fd. And this has
>>>>> already been freed.
>>>>>
>>>>>
>>>>>> RIP: 0033:0x44fb79
>>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>>>> Allocated:
>>>>>> PID = 19691
>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>
>>>>> However, it has been actually allocated by a different process with pid
>>>>> 19691.
>>>>>
>>>>>
>>>>>> Freed:
>>>>>> PID = 19708
>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>>>  slab_free mm/slub.c:2961 [inline]
>>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>
>>>>> And yet we have a third process which freed it, PID 19708. So there is
>>>>> some dance happening with this fd, being allocated by one process,
>>>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>>>> scenario of inotify descriptors?
>>>>
>>>> They are file descriptors so passing them around is valid.  That is
>>>> something unix domain sockets have allowed since the dawn of linux.
>>>>
>>>> The dance would need to be the fd being passed to the addtional
>>>> processes and then closed in the original before being closed
>>>> in the processes the fd was passed to.
>>>>
>>>> If those additional processes last longer than the original process this
>>>> is easy to achieve.
>>>>
>>>> My guess is that someone just taught syskallzer to pass file descriptors
>>>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>>>> been looking at linux-next with KASAN enabled in the kernel.
>>>
>>> Dmitry, can you tell if syzkaller tests sending file descriptors across
>>> sockets? Since the calltraces here show multiple processes being
>>> involved in different operations on the exact same file descriptor.
>>>
>>> Also JongHwan, can you provide the full, compilable reproducer to try
>>> and track this issue down?
>>
>>
>> syzkaller can pass descriptors across sockets, but currently only
>> within a single multi-threaded process.
>>
>> Are you sure it's the same descriptor? It seems to me that it's struct
>> ucounts, which is shared via the global ucounts_hashtable, so no
>> sharing is required in user processes.
>>
>> Unless I am missing something, we want:
>>
>> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
>> user_namespace *ns, kuid_t uid)
>>                         ucounts = new;
>>                 }
>>         }
>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
>>                 ucounts = NULL;
>>         spin_unlock_irq(&ucounts_lock);
>>         return ucounts;
>>
>> no?
>>
>> put_ucounts drops the last reference, then get_ucounts finds the
>> ucounts and successfully increments refcount as it's not INT_MAX (it's
>> 0) and starts using it, meanwhile put_ucounts proceeds to
>> unconditionally deleting the ucounts.
>
>
> It also seems that a concurrent put_ucounts can make get_ucounts fail
> _spuriously_, which does not look good.
> Don't we want something along the following lines?
>
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index 8a11fc0cb459..233c8e46acd5 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
> user_namespace *ns, kuid_t uid)
>
>                 new->ns = ns;
>                 new->uid = uid;
> -               atomic_set(&new->count, 0);
> +               atomic_set(&new->count, 1);
>
>                 spin_lock_irq(&ucounts_lock);
>                 ucounts = find_ucounts(ns, uid, hashent);
>                 if (ucounts) {
> +                       atomic_inc(&ucounts->count);
>                         kfree(new);
>                 } else {
>                         hlist_add_head(&new->node, hashent);
>                         ucounts = new;
>                 }
>         }
> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
> -               ucounts = NULL;
>         spin_unlock_irq(&ucounts_lock);
>         return ucounts;
>  }
> @@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)
>
>         if (atomic_dec_and_test(&ucounts->count)) {
>                 spin_lock_irqsave(&ucounts_lock, flags);
> -               hlist_del_init(&ucounts->node);
> +               if (atomic_read(&ucounts->count) == 0)
> +                       hlist_del_init(&ucounts->node);
> +               else
> +                       ucounts = NULL;
>                 spin_unlock_irqrestore(&ucounts_lock, flags);
>
>                 kfree(ucounts);


/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\

This is broken per se. Need something more elaborate.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                         ` <CACT4Y+b7YXnoOkdYTjp6D6XH-zuYRx063hMayOVMyJVc735dtQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-04 12:10                           ` Nikolay Borisov
       [not found]                             ` <c69f8f03-4324-f934-eed2-643c91d703c0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolay Borisov @ 2017-03-04 12:10 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, Eric W. Biederman, JongHwan Kim



On  4.03.2017 14:01, Dmitry Vyukov wrote:
> On Sat, Mar 4, 2017 at 12:50 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>>>>
>>>> On  3.03.2017 18:30, Eric W. Biederman wrote:
>>>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>>>
>>>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>>>>> next time don't add random people but take the time to see who touched
>>>>>> the code.
>>>>>
>>>>> Comments below.
>>>>>
>>>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>>>>> I've got the following report with syzkaller fuzzer
>>>>>>>
>>>>>>>
>>>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>>>>
>>>>>>> ==================================================================
>>>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>>>>> BUG: KASAN: use-after-free in atomic_read
>>>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>> at addr ffff88006d399bc4
>>>>>>> Read of size 4 by task syz-executor3/19713
>>>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>>>>> Call Trace:
>>>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>
>>>>>> So PID 19713 is exitting and as part of it it's freeing its file
>>>>>> descriptors, one of which is apparently an inotify fd. And this has
>>>>>> already been freed.
>>>>>>
>>>>>>
>>>>>>> RIP: 0033:0x44fb79
>>>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>>>>> Allocated:
>>>>>>> PID = 19691
>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>>
>>>>>> However, it has been actually allocated by a different process with pid
>>>>>> 19691.
>>>>>>
>>>>>>
>>>>>>> Freed:
>>>>>>> PID = 19708
>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>>>>  slab_free mm/slub.c:2961 [inline]
>>>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>
>>>>>> And yet we have a third process which freed it, PID 19708. So there is
>>>>>> some dance happening with this fd, being allocated by one process,
>>>>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>>>>> scenario of inotify descriptors?
>>>>>
>>>>> They are file descriptors so passing them around is valid.  That is
>>>>> something unix domain sockets have allowed since the dawn of linux.
>>>>>
>>>>> The dance would need to be the fd being passed to the addtional
>>>>> processes and then closed in the original before being closed
>>>>> in the processes the fd was passed to.
>>>>>
>>>>> If those additional processes last longer than the original process this
>>>>> is easy to achieve.
>>>>>
>>>>> My guess is that someone just taught syskallzer to pass file descriptors
>>>>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>>>>> been looking at linux-next with KASAN enabled in the kernel.
>>>>
>>>> Dmitry, can you tell if syzkaller tests sending file descriptors across
>>>> sockets? Since the calltraces here show multiple processes being
>>>> involved in different operations on the exact same file descriptor.
>>>>
>>>> Also JongHwan, can you provide the full, compilable reproducer to try
>>>> and track this issue down?
>>>
>>>
>>> syzkaller can pass descriptors across sockets, but currently only
>>> within a single multi-threaded process.
>>>
>>> Are you sure it's the same descriptor? It seems to me that it's struct
>>> ucounts, which is shared via the global ucounts_hashtable, so no
>>> sharing is required in user processes.
>>>
>>> Unless I am missing something, we want:
>>>
>>> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
>>> user_namespace *ns, kuid_t uid)
>>>                         ucounts = new;
>>>                 }
>>>         }
>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
>>>                 ucounts = NULL;
>>>         spin_unlock_irq(&ucounts_lock);
>>>         return ucounts;
>>>
>>> no?
>>>
>>> put_ucounts drops the last reference, then get_ucounts finds the
>>> ucounts and successfully increments refcount as it's not INT_MAX (it's
>>> 0) and starts using it, meanwhile put_ucounts proceeds to
>>> unconditionally deleting the ucounts.
>>
>>
>> It also seems that a concurrent put_ucounts can make get_ucounts fail
>> _spuriously_, which does not look good.
>> Don't we want something along the following lines?
>>
>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>> index 8a11fc0cb459..233c8e46acd5 100644
>> --- a/kernel/ucount.c
>> +++ b/kernel/ucount.c
>> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
>> user_namespace *ns, kuid_t uid)
>>
>>                 new->ns = ns;
>>                 new->uid = uid;
>> -               atomic_set(&new->count, 0);
>> +               atomic_set(&new->count, 1);
>>
>>                 spin_lock_irq(&ucounts_lock);
>>                 ucounts = find_ucounts(ns, uid, hashent);
>>                 if (ucounts) {
>> +                       atomic_inc(&ucounts->count);
>>                         kfree(new);
>>                 } else {
>>                         hlist_add_head(&new->node, hashent);
>>                         ucounts = new;
>>                 }
>>         }
>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>> -               ucounts = NULL;
>>         spin_unlock_irq(&ucounts_lock);
>>         return ucounts;
>>  }
>> @@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)
>>
>>         if (atomic_dec_and_test(&ucounts->count)) {
>>                 spin_lock_irqsave(&ucounts_lock, flags);
>> -               hlist_del_init(&ucounts->node);
>> +               if (atomic_read(&ucounts->count) == 0)
>> +                       hlist_del_init(&ucounts->node);
>> +               else
>> +                       ucounts = NULL;
>>                 spin_unlock_irqrestore(&ucounts_lock, flags);
>>
>>                 kfree(ucounts);
> 
> 
> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
> 
> This is broken per se. Need something more elaborate.
> 


How about this :

diff --git a/kernel/ucount.c b/kernel/ucount.c
index 8a11fc0cb459..b817ac0e587c 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -166,11 +166,15 @@ static void put_ucounts(struct ucounts *ucounts)

        if (atomic_dec_and_test(&ucounts->count)) {
                spin_lock_irqsave(&ucounts_lock, flags);
-               hlist_del_init(&ucounts->node);
-               spin_unlock_irqrestore(&ucounts_lock, flags);
-
-               kfree(ucounts);
+               if (!atomic_read(&ucounts->count)) {
+                       hlist_del_init(&ucounts->node);
+                       spin_unlock_irqrestore(&ucounts_lock, flags);
+                       kfree(ucounts);
+                       return;
+               }
        }
+
+       spin_unlock_irqrestore(&ucounts_lock, flags);
 }



This makes the atomic_dec_and_test and hashtable removal atomic in essence.

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                             ` <c69f8f03-4324-f934-eed2-643c91d703c0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-03-04 12:15                               ` Dmitry Vyukov via Containers
       [not found]                                 ` <CACT4Y+avCy8Susvb09DLHkGym0K_15+EeZ8yBH7tcgjvnb3Yhw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-04 12:15 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, Eric W. Biederman, JongHwan Kim

On Sat, Mar 4, 2017 at 1:10 PM, Nikolay Borisov
<n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>
> On  4.03.2017 14:01, Dmitry Vyukov wrote:
>> On Sat, Mar 4, 2017 at 12:50 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>>>>>
>>>>> On  3.03.2017 18:30, Eric W. Biederman wrote:
>>>>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>>>>
>>>>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>>>>>> next time don't add random people but take the time to see who touched
>>>>>>> the code.
>>>>>>
>>>>>> Comments below.
>>>>>>
>>>>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>>>>>> I've got the following report with syzkaller fuzzer
>>>>>>>>
>>>>>>>>
>>>>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>>>>>
>>>>>>>> ==================================================================
>>>>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>>>>>> BUG: KASAN: use-after-free in atomic_read
>>>>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>> at addr ffff88006d399bc4
>>>>>>>> Read of size 4 by task syz-executor3/19713
>>>>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>>>>>> Call Trace:
>>>>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>>
>>>>>>> So PID 19713 is exitting and as part of it it's freeing its file
>>>>>>> descriptors, one of which is apparently an inotify fd. And this has
>>>>>>> already been freed.
>>>>>>>
>>>>>>>
>>>>>>>> RIP: 0033:0x44fb79
>>>>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>>>>>> Allocated:
>>>>>>>> PID = 19691
>>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>>>
>>>>>>> However, it has been actually allocated by a different process with pid
>>>>>>> 19691.
>>>>>>>
>>>>>>>
>>>>>>>> Freed:
>>>>>>>> PID = 19708
>>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>>>>>  slab_free mm/slub.c:2961 [inline]
>>>>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>>
>>>>>>> And yet we have a third process which freed it, PID 19708. So there is
>>>>>>> some dance happening with this fd, being allocated by one process,
>>>>>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>>>>>> scenario of inotify descriptors?
>>>>>>
>>>>>> They are file descriptors so passing them around is valid.  That is
>>>>>> something unix domain sockets have allowed since the dawn of linux.
>>>>>>
>>>>>> The dance would need to be the fd being passed to the addtional
>>>>>> processes and then closed in the original before being closed
>>>>>> in the processes the fd was passed to.
>>>>>>
>>>>>> If those additional processes last longer than the original process this
>>>>>> is easy to achieve.
>>>>>>
>>>>>> My guess is that someone just taught syskallzer to pass file descriptors
>>>>>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>>>>>> been looking at linux-next with KASAN enabled in the kernel.
>>>>>
>>>>> Dmitry, can you tell if syzkaller tests sending file descriptors across
>>>>> sockets? Since the calltraces here show multiple processes being
>>>>> involved in different operations on the exact same file descriptor.
>>>>>
>>>>> Also JongHwan, can you provide the full, compilable reproducer to try
>>>>> and track this issue down?
>>>>
>>>>
>>>> syzkaller can pass descriptors across sockets, but currently only
>>>> within a single multi-threaded process.
>>>>
>>>> Are you sure it's the same descriptor? It seems to me that it's struct
>>>> ucounts, which is shared via the global ucounts_hashtable, so no
>>>> sharing is required in user processes.
>>>>
>>>> Unless I am missing something, we want:
>>>>
>>>> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
>>>> user_namespace *ns, kuid_t uid)
>>>>                         ucounts = new;
>>>>                 }
>>>>         }
>>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>>> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
>>>>                 ucounts = NULL;
>>>>         spin_unlock_irq(&ucounts_lock);
>>>>         return ucounts;
>>>>
>>>> no?
>>>>
>>>> put_ucounts drops the last reference, then get_ucounts finds the
>>>> ucounts and successfully increments refcount as it's not INT_MAX (it's
>>>> 0) and starts using it, meanwhile put_ucounts proceeds to
>>>> unconditionally deleting the ucounts.
>>>
>>>
>>> It also seems that a concurrent put_ucounts can make get_ucounts fail
>>> _spuriously_, which does not look good.
>>> Don't we want something along the following lines?
>>>
>>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>>> index 8a11fc0cb459..233c8e46acd5 100644
>>> --- a/kernel/ucount.c
>>> +++ b/kernel/ucount.c
>>> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
>>> user_namespace *ns, kuid_t uid)
>>>
>>>                 new->ns = ns;
>>>                 new->uid = uid;
>>> -               atomic_set(&new->count, 0);
>>> +               atomic_set(&new->count, 1);
>>>
>>>                 spin_lock_irq(&ucounts_lock);
>>>                 ucounts = find_ucounts(ns, uid, hashent);
>>>                 if (ucounts) {
>>> +                       atomic_inc(&ucounts->count);
>>>                         kfree(new);
>>>                 } else {
>>>                         hlist_add_head(&new->node, hashent);
>>>                         ucounts = new;
>>>                 }
>>>         }
>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>> -               ucounts = NULL;
>>>         spin_unlock_irq(&ucounts_lock);
>>>         return ucounts;
>>>  }
>>> @@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)
>>>
>>>         if (atomic_dec_and_test(&ucounts->count)) {
>>>                 spin_lock_irqsave(&ucounts_lock, flags);
>>> -               hlist_del_init(&ucounts->node);
>>> +               if (atomic_read(&ucounts->count) == 0)
>>> +                       hlist_del_init(&ucounts->node);
>>> +               else
>>> +                       ucounts = NULL;
>>>                 spin_unlock_irqrestore(&ucounts_lock, flags);
>>>
>>>                 kfree(ucounts);
>>
>>
>> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
>>
>> This is broken per se. Need something more elaborate.
>>
>
>
> How about this :
>
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index 8a11fc0cb459..b817ac0e587c 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -166,11 +166,15 @@ static void put_ucounts(struct ucounts *ucounts)
>
>         if (atomic_dec_and_test(&ucounts->count)) {
>                 spin_lock_irqsave(&ucounts_lock, flags);
> -               hlist_del_init(&ucounts->node);
> -               spin_unlock_irqrestore(&ucounts_lock, flags);
> -
> -               kfree(ucounts);
> +               if (!atomic_read(&ucounts->count)) {
> +                       hlist_del_init(&ucounts->node);
> +                       spin_unlock_irqrestore(&ucounts_lock, flags);
> +                       kfree(ucounts);
> +                       return;
> +               }
>         }
> +
> +       spin_unlock_irqrestore(&ucounts_lock, flags);
>  }
>
>
>
> This makes the atomic_dec_and_test and hashtable removal atomic in essence.


This won't work.
Consider the following scenario:
Thread 0 decrements count to 0 here:
  if (atomic_dec_and_test(&ucounts->count)) {
Then thread 1 calls get_ucounts, increments count to 1, then calls
put_ucounts, decrements count to 0 and unhashes and frees ucounts.
Not thread 0 does:
  if (!atomic_read(&ucounts->count)) {
but ucounts is already freed!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                 ` <CACT4Y+avCy8Susvb09DLHkGym0K_15+EeZ8yBH7tcgjvnb3Yhw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-04 12:35                                   ` 쪼르
       [not found]                                     ` <CALRZ7Ut=bAgX+XwS3h8-V-zGqiUFkuSCi_C=j6uN_=OptU-+RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-03-04 12:38                                   ` Dmitry Vyukov via Containers
  1 sibling, 1 reply; 23+ messages in thread
From: 쪼르 @ 2017-03-04 12:35 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	Eric W. Biederman, syzkaller

Hi, This is my new one report about dec_ucount:
ps.Sorry for my uncomfortable report. This is my first usage of lkml.

Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .

==================================================================
BUG: KASAN: use-after-free in __read_once_size
include/linux/compiler.h:254 [inline] at addr ffff8800372f21fc
BUG: KASAN: use-after-free in atomic_read
arch/x86/include/asm/atomic.h:26 [inline] at addr ffff8800372f21fc
BUG: KASAN: use-after-free in atomic_dec_if_positive
include/linux/atomic.h:616 [inline] at addr ffff8800372f21fc
BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210
kernel/ucount.c:217 at addr ffff8800372f21fc
Read of size 4 by task syz-executor0/19190
CPU: 1 PID: 19190 Comm: syz-executor0 Not tainted 4.10.0+ #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x115/0x1cf lib/dump_stack.c:51
 kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
 print_address_description mm/kasan/report.c:200 [inline]
 kasan_report_error mm/kasan/report.c:289 [inline]
 kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
 kasan_report mm/kasan/report.c:331 [inline]
 __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
 __read_once_size include/linux/compiler.h:254 [inline]
 atomic_read arch/x86/include/asm/atomic.h:26 [inline]
 atomic_dec_if_positive include/linux/atomic.h:616 [inline]
 dec_ucount+0x1e5/0x210 kernel/ucount.c:217
 dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
 inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
 fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
 fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
 fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
 inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
 __fput+0x327/0x7e0 fs/file_table.c:208
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0xa45/0x1b20 kernel/exit.c:873
 do_group_exit+0x149/0x400 kernel/exit.c:977
 get_signal+0x7d5/0x1810 kernel/signal.c:2313
 do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
 exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2
RIP: 0033:0x44fb79
RSP: 002b:00007f048a1cacf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 0000000000708218 RCX: 000000000044fb79
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708218
RBP: 00000000007081f8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff84d701af R14: 00007f048a1cb9c0 R15: 000000000000000e
Object at ffff8800372f21c0, in cache kmalloc-96 size: 96
Allocated:
PID = 19163
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
 kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
 kmalloc include/linux/slab.h:490 [inline]
 kzalloc include/linux/slab.h:663 [inline]
 get_ucounts kernel/ucount.c:140 [inline]
 inc_ucount+0x538/0xa70 kernel/ucount.c:195
 inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
 SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
 SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
 sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
 entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 19163
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514 [inline]
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
 slab_free_hook mm/slub.c:1357 [inline]
 slab_free_freelist_hook mm/slub.c:1379 [inline]
 slab_free mm/slub.c:2961 [inline]
 kfree+0xe8/0x2c0 mm/slub.c:3882
 put_ucounts+0x1dd/0x270 kernel/ucount.c:172
 dec_ucount+0x172/0x210 kernel/ucount.c:220
 dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
 inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
 fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
 fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
 fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
 inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
 __fput+0x327/0x7e0 fs/file_table.c:208
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0xa45/0x1b20 kernel/exit.c:873
 do_group_exit+0x149/0x400 kernel/exit.c:977
 get_signal+0x7d5/0x1810 kernel/signal.c:2313
 do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
 exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2
Memory state around the buggy address:
 ffff8800372f2080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8800372f2100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8800372f2180: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                                ^
 ffff8800372f2200: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8800372f2280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 19190 Comm: syz-executor0 Tainted: G    B           4.10.0+ #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x115/0x1cf lib/dump_stack.c:51
 panic+0x1b4/0x392 kernel/panic.c:179
 kasan_end_report+0x5b/0x60 mm/kasan/report.c:141
 kasan_report_error mm/kasan/report.c:293 [inline]
 kasan_report.part.1+0x40a/0x4e0 mm/kasan/report.c:311
 kasan_report mm/kasan/report.c:331 [inline]
 __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
 __read_once_size include/linux/compiler.h:254 [inline]
 atomic_read arch/x86/include/asm/atomic.h:26 [inline]
 atomic_dec_if_positive include/linux/atomic.h:616 [inline]
 dec_ucount+0x1e5/0x210 kernel/ucount.c:217
 dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
 inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
 fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
 fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
 fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
 inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
 __fput+0x327/0x7e0 fs/file_table.c:208
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0xa45/0x1b20 kernel/exit.c:873
 do_group_exit+0x149/0x400 kernel/exit.c:977
 get_signal+0x7d5/0x1810 kernel/signal.c:2313
 do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
 exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2
RIP: 0033:0x44fb79
RSP: 002b:00007f048a1cacf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 0000000000708218 RCX: 000000000044fb79
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708218
RBP: 00000000007081f8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff84d701af R14: 00007f048a1cb9c0 R15: 000000000000000e
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..


Syzkaller reproducer:
# {Threaded:false Collide:false Repeat:true Procs:1 Sandbox:setuid Repro:false}
semget$private(0x0, 0x400001003, 0x181)


C reproducer:
// autogenerated by syzkaller (http://github.com/google/syzkaller)

#ifndef __NR_semget
#define __NR_semget 64
#endif

#define __STDC_VERSION__ 201112L

#define _GNU_SOURCE

#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/mount.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <linux/capability.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <linux/kvm.h>
#include <linux/sched.h>
#include <net/if_arp.h>

#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

const int kFailStatus = 67;
const int kErrorStatus = 68;
const int kRetryStatus = 69;

__attribute__((noreturn)) void doexit(int status)
{
  volatile unsigned i;
  syscall(__NR_exit_group, status);
  for (i = 0;; i++) {
  }
}

__attribute__((noreturn)) void fail(const char* msg, ...)
{
  int e = errno;
  fflush(stdout);
  va_list args;
  va_start(args, msg);
  vfprintf(stderr, msg, args);
  va_end(args);
  fprintf(stderr, " (errno %d)\n", e);
  doexit(e == ENOMEM ? kRetryStatus : kFailStatus);
}

__attribute__((noreturn)) void exitf(const char* msg, ...)
{
  int e = errno;
  fflush(stdout);
  va_list args;
  va_start(args, msg);
  vfprintf(stderr, msg, args);
  va_end(args);
  fprintf(stderr, " (errno %d)\n", e);
  doexit(kRetryStatus);
}

static int flag_debug;

void debug(const char* msg, ...)
{
  if (!flag_debug)
    return;
  va_list args;
  va_start(args, msg);
  vfprintf(stdout, msg, args);
  va_end(args);
  fflush(stdout);
}

__thread int skip_segv;
__thread jmp_buf segv_env;

static void segv_handler(int sig, siginfo_t* info, void* uctx)
{
  uintptr_t addr = (uintptr_t)info->si_addr;
  const uintptr_t prog_start = 1 << 20;
  const uintptr_t prog_end = 100 << 20;
  if (__atomic_load_n(&skip_segv, __ATOMIC_RELAXED) &&
      (addr < prog_start || addr > prog_end)) {
    debug("SIGSEGV on %p, skipping\n", addr);
    _longjmp(segv_env, 1);
  }
  debug("SIGSEGV on %p, exiting\n", addr);
  doexit(sig);
  for (;;) {
  }
}

static void install_segv_handler()
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = segv_handler;
  sa.sa_flags = SA_NODEFER | SA_SIGINFO;
  sigaction(SIGSEGV, &sa, NULL);
  sigaction(SIGBUS, &sa, NULL);
}

#define NONFAILING(...)                                                \
  {                                                                    \
    __atomic_fetch_add(&skip_segv, 1, __ATOMIC_SEQ_CST);               \
    if (_setjmp(segv_env) == 0) {                                      \
      __VA_ARGS__;                                                     \
    }                                                                  \
    __atomic_fetch_sub(&skip_segv, 1, __ATOMIC_SEQ_CST);               \
  }

#define BITMASK_LEN(type, bf_len) (type)((1ull << (bf_len)) - 1)

#define BITMASK_LEN_OFF(type, bf_off, bf_len)                          \
  (type)(BITMASK_LEN(type, (bf_len)) << (bf_off))

#define STORE_BY_BITMASK(type, addr, val, bf_off, bf_len)              \
  if ((bf_off) == 0 && (bf_len) == 0) {                                \
    *(type*)(addr) = (type)(val);                                      \
  } else {                                                             \
    type new_val = *(type*)(addr);                                     \
    new_val &= ~BITMASK_LEN_OFF(type, (bf_off), (bf_len));             \
    new_val |= ((type)(val)&BITMASK_LEN(type, (bf_len))) << (bf_off);  \
    *(type*)(addr) = new_val;                                          \
  }

static uintptr_t execute_syscall(int nr, uintptr_t a0, uintptr_t a1,
                                 uintptr_t a2, uintptr_t a3,
                                 uintptr_t a4, uintptr_t a5,
                                 uintptr_t a6, uintptr_t a7,
                                 uintptr_t a8)
{
  switch (nr) {
  default:
    return syscall(nr, a0, a1, a2, a3, a4, a5);
  }
}

static void setup_main_process()
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = SIG_IGN;
  syscall(SYS_rt_sigaction, 0x20, &sa, NULL, 8);
  syscall(SYS_rt_sigaction, 0x21, &sa, NULL, 8);
  install_segv_handler();

  char tmpdir_template[] = "./syzkaller.XXXXXX";
  char* tmpdir = mkdtemp(tmpdir_template);
  if (!tmpdir)
    fail("failed to mkdtemp");
  if (chmod(tmpdir, 0777))
    fail("failed to chmod");
  if (chdir(tmpdir))
    fail("failed to chdir");
}

static void loop();

static void sandbox_common()
{
  prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
  setpgrp();
  setsid();

  struct rlimit rlim;
  rlim.rlim_cur = rlim.rlim_max = 128 << 20;
  setrlimit(RLIMIT_AS, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 1 << 20;
  setrlimit(RLIMIT_FSIZE, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 1 << 20;
  setrlimit(RLIMIT_STACK, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 0;
  setrlimit(RLIMIT_CORE, &rlim);

  unshare(CLONE_NEWNS);
  unshare(CLONE_NEWIPC);
  unshare(CLONE_IO);
}

static int do_sandbox_setuid(int executor_pid, bool enable_tun)
{
  int pid = fork();
  if (pid)
    return pid;

  sandbox_common();

  const int nobody = 65534;
  if (setgroups(0, NULL))
    fail("failed to setgroups");
  if (syscall(SYS_setresgid, nobody, nobody, nobody))
    fail("failed to setresgid");
  if (syscall(SYS_setresuid, nobody, nobody, nobody))
    fail("failed to setresuid");

  loop();
  doexit(1);
}

static void remove_dir(const char* dir)
{
  DIR* dp;
  struct dirent* ep;
  int iter = 0;
retry:
  dp = opendir(dir);
  if (dp == NULL) {
    if (errno == EMFILE) {
      exitf("opendir(%s) failed due to NOFILE, exiting");
    }
    exitf("opendir(%s) failed", dir);
  }
  while ((ep = readdir(dp))) {
    if (strcmp(ep->d_name, ".") == 0 || strcmp(ep->d_name, "..") == 0)
      continue;
    char filename[FILENAME_MAX];
    snprintf(filename, sizeof(filename), "%s/%s", dir, ep->d_name);
    struct stat st;
    if (lstat(filename, &st))
      exitf("lstat(%s) failed", filename);
    if (S_ISDIR(st.st_mode)) {
      remove_dir(filename);
      continue;
    }
    int i;
    for (i = 0;; i++) {
      debug("unlink(%s)\n", filename);
      if (unlink(filename) == 0)
        break;
      if (errno == EROFS) {
        debug("ignoring EROFS\n");
        break;
      }
      if (errno != EBUSY || i > 100)
        exitf("unlink(%s) failed", filename);
      debug("umount(%s)\n", filename);
      if (umount2(filename, MNT_DETACH))
        exitf("umount(%s) failed", filename);
    }
  }
  closedir(dp);
  int i;
  for (i = 0;; i++) {
    debug("rmdir(%s)\n", dir);
    if (rmdir(dir) == 0)
      break;
    if (i < 100) {
      if (errno == EROFS) {
        debug("ignoring EROFS\n");
        break;
      }
      if (errno == EBUSY) {
        debug("umount(%s)\n", dir);
        if (umount2(dir, MNT_DETACH))
          exitf("umount(%s) failed", dir);
        continue;
      }
      if (errno == ENOTEMPTY) {
        if (iter < 100) {
          iter++;
          goto retry;
        }
      }
    }
    exitf("rmdir(%s) failed", dir);
  }
}

static uint64_t current_time_ms()
{
  struct timespec ts;

  if (clock_gettime(CLOCK_MONOTONIC, &ts))
    fail("clock_gettime failed");
  return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

static void test();

void loop()
{
  int iter;
  for (iter = 0;; iter++) {
    char cwdbuf[256];
    sprintf(cwdbuf, "./%d", iter);
    if (mkdir(cwdbuf, 0777))
      fail("failed to mkdir");
    int pid = fork();
    if (pid < 0)
      fail("clone failed");
    if (pid == 0) {
      prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
      setpgrp();
      if (chdir(cwdbuf))
        fail("failed to chdir");
      test();
      doexit(0);
    }
    int status = 0;
    uint64_t start = current_time_ms();
    for (;;) {
      int res = waitpid(-1, &status, __WALL | WNOHANG);
      if (res == pid)
        break;
      usleep(1000);
      if (current_time_ms() - start > 5 * 1000) {
        kill(-pid, SIGKILL);
        kill(pid, SIGKILL);
        while (waitpid(-1, &status, __WALL) != pid) {
        }
        break;
      }
    }
    remove_dir(cwdbuf);
  }
}

long r[1];
void test()
{
  memset(r, -1, sizeof(r));
  r[0] = execute_syscall(__NR_semget, 0x0ul, 0x400001003ul, 0x181ul, 0,
                         0, 0, 0, 0, 0);
}
int main()
{
  setup_main_process();
  int pid = do_sandbox_setuid(0, false);
  int status = 0;
  while (waitpid(pid, &status, __WALL) != pid) {
  }
  return 0;
}



On Sat, Mar 4, 2017 at 9:15 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:

> On Sat, Mar 4, 2017 at 1:10 PM, Nikolay Borisov
> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> >
> > On  4.03.2017 14:01, Dmitry Vyukov wrote:
> >> On Sat, Mar 4, 2017 at 12:50 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> wrote:
> >>>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >>>>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
> >>>>>
> >>>>> On  3.03.2017 18:30, Eric W. Biederman wrote:
> >>>>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> >>>>>>
> >>>>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
> >>>>>>> next time don't add random people but take the time to see who
> touched
> >>>>>>> the code.
> >>>>>>
> >>>>>> Comments below.
> >>>>>>
> >>>>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
> >>>>>>>> I've got the following report with syzkaller fuzzer
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on
> commit .
> >>>>>>>>
> >>>>>>>> ============================================================
> ======
> >>>>>>>> BUG: KASAN: use-after-free in __read_once_size
> >>>>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
> >>>>>>>> BUG: KASAN: use-after-free in atomic_read
> >>>>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr
> ffff88006d399bc4
> >>>>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
> >>>>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
> >>>>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210
> kernel/ucount.c:217
> >>>>>>>> at addr ffff88006d399bc4
> >>>>>>>> Read of size 4 by task syz-executor3/19713
> >>>>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
> >>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> >>>>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> >>>>>>>> Call Trace:
> >>>>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
> >>>>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
> >>>>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
> >>>>>>>>  print_address_description mm/kasan/report.c:200 [inline]
> >>>>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
> >>>>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
> >>>>>>>>  kasan_report mm/kasan/report.c:331 [inline]
> >>>>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
> >>>>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
> >>>>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
> >>>>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
> >>>>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
> >>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
> >>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_
> fsnotify.c:169
> >>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
> >>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
> >>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
> >>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
> >>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
> >>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
> >>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
> >>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
> >>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
> >>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
> >>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
> >>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
> >>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
> >>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
> >>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
> >>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
> >>>>>>>
> >>>>>>> So PID 19713 is exitting and as part of it it's freeing its file
> >>>>>>> descriptors, one of which is apparently an inotify fd. And this has
> >>>>>>> already been freed.
> >>>>>>>
> >>>>>>>
> >>>>>>>> RIP: 0033:0x44fb79
> >>>>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX:
> 00000000000000ca
> >>>>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
> >>>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
> >>>>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
> >>>>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
> >>>>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
> >>>>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
> >>>>>>>> Allocated:
> >>>>>>>> PID = 19691
> >>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
> >>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
> >>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
> >>>>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
> >>>>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
> >>>>>>>>  kmalloc include/linux/slab.h:490 [inline]
> >>>>>>>>  kzalloc include/linux/slab.h:663 [inline]
> >>>>>>>>  get_ucounts kernel/ucount.c:140 [inline]
> >>>>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
> >>>>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_
> user.c:655
> >>>>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
> >>>>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
> >>>>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
> >>>>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> >>>>>>>
> >>>>>>> However, it has been actually allocated by a different process
> with pid
> >>>>>>> 19691.
> >>>>>>>
> >>>>>>>
> >>>>>>>> Freed:
> >>>>>>>> PID = 19708
> >>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
> >>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
> >>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
> >>>>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
> >>>>>>>>  slab_free_hook mm/slub.c:1357 [inline]
> >>>>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
> >>>>>>>>  slab_free mm/slub.c:2961 [inline]
> >>>>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
> >>>>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
> >>>>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
> >>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
> >>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_
> fsnotify.c:169
> >>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
> >>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
> >>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
> >>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
> >>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
> >>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
> >>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
> >>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
> >>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
> >>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
> >>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
> >>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
> >>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
> >>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
> >>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
> >>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
> >>>>>>>
> >>>>>>> And yet we have a third process which freed it, PID 19708. So
> there is
> >>>>>>> some dance happening with this fd, being allocated by one process,
> >>>>>>> handed over to 2 more, which are freeing it. Is this a valid usage
> >>>>>>> scenario of inotify descriptors?
> >>>>>>
> >>>>>> They are file descriptors so passing them around is valid.  That is
> >>>>>> something unix domain sockets have allowed since the dawn of linux.
> >>>>>>
> >>>>>> The dance would need to be the fd being passed to the addtional
> >>>>>> processes and then closed in the original before being closed
> >>>>>> in the processes the fd was passed to.
> >>>>>>
> >>>>>> If those additional processes last longer than the original process
> this
> >>>>>> is easy to achieve.
> >>>>>>
> >>>>>> My guess is that someone just taught syskallzer to pass file
> descriptors
> >>>>>> around.  So this may be an old bug.  Either that or syskallzer
> hasn't
> >>>>>> been looking at linux-next with KASAN enabled in the kernel.
> >>>>>
> >>>>> Dmitry, can you tell if syzkaller tests sending file descriptors
> across
> >>>>> sockets? Since the calltraces here show multiple processes being
> >>>>> involved in different operations on the exact same file descriptor.
> >>>>>
> >>>>> Also JongHwan, can you provide the full, compilable reproducer to try
> >>>>> and track this issue down?
> >>>>
> >>>>
> >>>> syzkaller can pass descriptors across sockets, but currently only
> >>>> within a single multi-threaded process.
> >>>>
> >>>> Are you sure it's the same descriptor? It seems to me that it's struct
> >>>> ucounts, which is shared via the global ucounts_hashtable, so no
> >>>> sharing is required in user processes.
> >>>>
> >>>> Unless I am missing something, we want:
> >>>>
> >>>> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
> >>>> user_namespace *ns, kuid_t uid)
> >>>>                         ucounts = new;
> >>>>                 }
> >>>>         }
> >>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
> >>>> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
> >>>>                 ucounts = NULL;
> >>>>         spin_unlock_irq(&ucounts_lock);
> >>>>         return ucounts;
> >>>>
> >>>> no?
> >>>>
> >>>> put_ucounts drops the last reference, then get_ucounts finds the
> >>>> ucounts and successfully increments refcount as it's not INT_MAX (it's
> >>>> 0) and starts using it, meanwhile put_ucounts proceeds to
> >>>> unconditionally deleting the ucounts.
> >>>
> >>>
> >>> It also seems that a concurrent put_ucounts can make get_ucounts fail
> >>> _spuriously_, which does not look good.
> >>> Don't we want something along the following lines?
> >>>
> >>> diff --git a/kernel/ucount.c b/kernel/ucount.c
> >>> index 8a11fc0cb459..233c8e46acd5 100644
> >>> --- a/kernel/ucount.c
> >>> +++ b/kernel/ucount.c
> >>> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
> >>> user_namespace *ns, kuid_t uid)
> >>>
> >>>                 new->ns = ns;
> >>>                 new->uid = uid;
> >>> -               atomic_set(&new->count, 0);
> >>> +               atomic_set(&new->count, 1);
> >>>
> >>>                 spin_lock_irq(&ucounts_lock);
> >>>                 ucounts = find_ucounts(ns, uid, hashent);
> >>>                 if (ucounts) {
> >>> +                       atomic_inc(&ucounts->count);
> >>>                         kfree(new);
> >>>                 } else {
> >>>                         hlist_add_head(&new->node, hashent);
> >>>                         ucounts = new;
> >>>                 }
> >>>         }
> >>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
> >>> -               ucounts = NULL;
> >>>         spin_unlock_irq(&ucounts_lock);
> >>>         return ucounts;
> >>>  }
> >>> @@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)
> >>>
> >>>         if (atomic_dec_and_test(&ucounts->count)) {
> >>>                 spin_lock_irqsave(&ucounts_lock, flags);
> >>> -               hlist_del_init(&ucounts->node);
> >>> +               if (atomic_read(&ucounts->count) == 0)
> >>> +                       hlist_del_init(&ucounts->node);
> >>> +               else
> >>> +                       ucounts = NULL;
> >>>                 spin_unlock_irqrestore(&ucounts_lock, flags);
> >>>
> >>>                 kfree(ucounts);
> >>
> >>
> >> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
> >>
> >> This is broken per se. Need something more elaborate.
> >>
> >
> >
> > How about this :
> >
> > diff --git a/kernel/ucount.c b/kernel/ucount.c
> > index 8a11fc0cb459..b817ac0e587c 100644
> > --- a/kernel/ucount.c
> > +++ b/kernel/ucount.c
> > @@ -166,11 +166,15 @@ static void put_ucounts(struct ucounts *ucounts)
> >
> >         if (atomic_dec_and_test(&ucounts->count)) {
> >                 spin_lock_irqsave(&ucounts_lock, flags);
> > -               hlist_del_init(&ucounts->node);
> > -               spin_unlock_irqrestore(&ucounts_lock, flags);
> > -
> > -               kfree(ucounts);
> > +               if (!atomic_read(&ucounts->count)) {
> > +                       hlist_del_init(&ucounts->node);
> > +                       spin_unlock_irqrestore(&ucounts_lock, flags);
> > +                       kfree(ucounts);
> > +                       return;
> > +               }
> >         }
> > +
> > +       spin_unlock_irqrestore(&ucounts_lock, flags);
> >  }
> >
> >
> >
> > This makes the atomic_dec_and_test and hashtable removal atomic in
> essence.
>
>
> This won't work.
> Consider the following scenario:
> Thread 0 decrements count to 0 here:
>   if (atomic_dec_and_test(&ucounts->count)) {
> Then thread 1 calls get_ucounts, increments count to 1, then calls
> put_ucounts, decrements count to 0 and unhashes and frees ucounts.
> Not thread 0 does:
>   if (!atomic_read(&ucounts->count)) {
> but ucounts is already freed!
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                 ` <CACT4Y+avCy8Susvb09DLHkGym0K_15+EeZ8yBH7tcgjvnb3Yhw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-03-04 12:35                                   ` 쪼르
@ 2017-03-04 12:38                                   ` Dmitry Vyukov via Containers
       [not found]                                     ` <CACT4Y+YWEq66QXbTMR4yGtgc0ULN+TurAnRzcDASaja1z5XNjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-04 12:38 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, Eric W. Biederman, JongHwan Kim

On Sat, Mar 4, 2017 at 1:15 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> On Sat, Mar 4, 2017 at 1:10 PM, Nikolay Borisov
> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>
>>
>> On  4.03.2017 14:01, Dmitry Vyukov wrote:
>>> On Sat, Mar 4, 2017 at 12:50 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>>>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>>>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>>>>>>
>>>>>> On  3.03.2017 18:30, Eric W. Biederman wrote:
>>>>>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>>>>>
>>>>>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>>>>>>> next time don't add random people but take the time to see who touched
>>>>>>>> the code.
>>>>>>>
>>>>>>> Comments below.
>>>>>>>
>>>>>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>>>>>>> I've got the following report with syzkaller fuzzer
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>>>>>>
>>>>>>>>> ==================================================================
>>>>>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>>>>>>> BUG: KASAN: use-after-free in atomic_read
>>>>>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>>>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>>>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>>>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>>> at addr ffff88006d399bc4
>>>>>>>>> Read of size 4 by task syz-executor3/19713
>>>>>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>>>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>>>>>>> Call Trace:
>>>>>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>>>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>>>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>>>
>>>>>>>> So PID 19713 is exitting and as part of it it's freeing its file
>>>>>>>> descriptors, one of which is apparently an inotify fd. And this has
>>>>>>>> already been freed.
>>>>>>>>
>>>>>>>>
>>>>>>>>> RIP: 0033:0x44fb79
>>>>>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>>>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>>>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>>>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>>>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>>>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>>>>>>> Allocated:
>>>>>>>>> PID = 19691
>>>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>>>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>>>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>>>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>>>>
>>>>>>>> However, it has been actually allocated by a different process with pid
>>>>>>>> 19691.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Freed:
>>>>>>>>> PID = 19708
>>>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>>>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>>>>>>  slab_free mm/slub.c:2961 [inline]
>>>>>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>>>
>>>>>>>> And yet we have a third process which freed it, PID 19708. So there is
>>>>>>>> some dance happening with this fd, being allocated by one process,
>>>>>>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>>>>>>> scenario of inotify descriptors?
>>>>>>>
>>>>>>> They are file descriptors so passing them around is valid.  That is
>>>>>>> something unix domain sockets have allowed since the dawn of linux.
>>>>>>>
>>>>>>> The dance would need to be the fd being passed to the addtional
>>>>>>> processes and then closed in the original before being closed
>>>>>>> in the processes the fd was passed to.
>>>>>>>
>>>>>>> If those additional processes last longer than the original process this
>>>>>>> is easy to achieve.
>>>>>>>
>>>>>>> My guess is that someone just taught syskallzer to pass file descriptors
>>>>>>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>>>>>>> been looking at linux-next with KASAN enabled in the kernel.
>>>>>>
>>>>>> Dmitry, can you tell if syzkaller tests sending file descriptors across
>>>>>> sockets? Since the calltraces here show multiple processes being
>>>>>> involved in different operations on the exact same file descriptor.
>>>>>>
>>>>>> Also JongHwan, can you provide the full, compilable reproducer to try
>>>>>> and track this issue down?
>>>>>
>>>>>
>>>>> syzkaller can pass descriptors across sockets, but currently only
>>>>> within a single multi-threaded process.
>>>>>
>>>>> Are you sure it's the same descriptor? It seems to me that it's struct
>>>>> ucounts, which is shared via the global ucounts_hashtable, so no
>>>>> sharing is required in user processes.
>>>>>
>>>>> Unless I am missing something, we want:
>>>>>
>>>>> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
>>>>> user_namespace *ns, kuid_t uid)
>>>>>                         ucounts = new;
>>>>>                 }
>>>>>         }
>>>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>>>> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
>>>>>                 ucounts = NULL;
>>>>>         spin_unlock_irq(&ucounts_lock);
>>>>>         return ucounts;
>>>>>
>>>>> no?
>>>>>
>>>>> put_ucounts drops the last reference, then get_ucounts finds the
>>>>> ucounts and successfully increments refcount as it's not INT_MAX (it's
>>>>> 0) and starts using it, meanwhile put_ucounts proceeds to
>>>>> unconditionally deleting the ucounts.
>>>>
>>>>
>>>> It also seems that a concurrent put_ucounts can make get_ucounts fail
>>>> _spuriously_, which does not look good.
>>>> Don't we want something along the following lines?
>>>>
>>>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>>>> index 8a11fc0cb459..233c8e46acd5 100644
>>>> --- a/kernel/ucount.c
>>>> +++ b/kernel/ucount.c
>>>> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
>>>> user_namespace *ns, kuid_t uid)
>>>>
>>>>                 new->ns = ns;
>>>>                 new->uid = uid;
>>>> -               atomic_set(&new->count, 0);
>>>> +               atomic_set(&new->count, 1);
>>>>
>>>>                 spin_lock_irq(&ucounts_lock);
>>>>                 ucounts = find_ucounts(ns, uid, hashent);
>>>>                 if (ucounts) {
>>>> +                       atomic_inc(&ucounts->count);
>>>>                         kfree(new);
>>>>                 } else {
>>>>                         hlist_add_head(&new->node, hashent);
>>>>                         ucounts = new;
>>>>                 }
>>>>         }
>>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>>> -               ucounts = NULL;
>>>>         spin_unlock_irq(&ucounts_lock);
>>>>         return ucounts;
>>>>  }
>>>> @@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)
>>>>
>>>>         if (atomic_dec_and_test(&ucounts->count)) {
>>>>                 spin_lock_irqsave(&ucounts_lock, flags);
>>>> -               hlist_del_init(&ucounts->node);
>>>> +               if (atomic_read(&ucounts->count) == 0)
>>>> +                       hlist_del_init(&ucounts->node);
>>>> +               else
>>>> +                       ucounts = NULL;
>>>>                 spin_unlock_irqrestore(&ucounts_lock, flags);
>>>>
>>>>                 kfree(ucounts);
>>>
>>>
>>> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
>>>
>>> This is broken per se. Need something more elaborate.
>>>
>>
>>
>> How about this :
>>
>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>> index 8a11fc0cb459..b817ac0e587c 100644
>> --- a/kernel/ucount.c
>> +++ b/kernel/ucount.c
>> @@ -166,11 +166,15 @@ static void put_ucounts(struct ucounts *ucounts)
>>
>>         if (atomic_dec_and_test(&ucounts->count)) {
>>                 spin_lock_irqsave(&ucounts_lock, flags);
>> -               hlist_del_init(&ucounts->node);
>> -               spin_unlock_irqrestore(&ucounts_lock, flags);
>> -
>> -               kfree(ucounts);
>> +               if (!atomic_read(&ucounts->count)) {
>> +                       hlist_del_init(&ucounts->node);
>> +                       spin_unlock_irqrestore(&ucounts_lock, flags);
>> +                       kfree(ucounts);
>> +                       return;
>> +               }
>>         }
>> +
>> +       spin_unlock_irqrestore(&ucounts_lock, flags);
>>  }
>>
>>
>>
>> This makes the atomic_dec_and_test and hashtable removal atomic in essence.
>
>
> This won't work.
> Consider the following scenario:
> Thread 0 decrements count to 0 here:
>   if (atomic_dec_and_test(&ucounts->count)) {
> Then thread 1 calls get_ucounts, increments count to 1, then calls
> put_ucounts, decrements count to 0 and unhashes and frees ucounts.
> Not thread 0 does:
>   if (!atomic_read(&ucounts->count)) {
> but ucounts is already freed!


What may work is if put_ucounts re-lookups the ucounts. If it can
still find it and count==0, then it is the right time to delete it. If
it can't find the ucounts, then somebody else has beaten it.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                     ` <CACT4Y+YWEq66QXbTMR4yGtgc0ULN+TurAnRzcDASaja1z5XNjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-04 23:58                                       ` Eric W. Biederman
       [not found]                                         ` <87shmsoar7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Eric W. Biederman @ 2017-03-04 23:58 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzkaller,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	JongHwan Kim

Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> writes:

> On Sat, Mar 4, 2017 at 1:15 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>> On Sat, Mar 4, 2017 at 1:10 PM, Nikolay Borisov
>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>
>>>
>>> On  4.03.2017 14:01, Dmitry Vyukov wrote:
>>>> On Sat, Mar 4, 2017 at 12:50 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>>>>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>>>>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>>>>>>>
>>>>>>> On  3.03.2017 18:30, Eric W. Biederman wrote:
>>>>>>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>>>>>>
>>>>>>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>>>>>>>> next time don't add random people but take the time to see who touched
>>>>>>>>> the code.
>>>>>>>>
>>>>>>>> Comments below.
>>>>>>>>
>>>>>>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>>>>>>>> I've got the following report with syzkaller fuzzer
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>>>>>>>
>>>>>>>>>> ==================================================================
>>>>>>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>>>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>>>>>>>> BUG: KASAN: use-after-free in atomic_read
>>>>>>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>>>>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>>>>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>>>>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>>>> at addr ffff88006d399bc4
>>>>>>>>>> Read of size 4 by task syz-executor3/19713
>>>>>>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>>>>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>>>>>>>> Call Trace:
>>>>>>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>>>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>>>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>>>>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>>>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>>>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>>>>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>>>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>>>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>>>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>>>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>>>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>>>>
>>>>>>>>> So PID 19713 is exitting and as part of it it's freeing its file
>>>>>>>>> descriptors, one of which is apparently an inotify fd. And this has
>>>>>>>>> already been freed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> RIP: 0033:0x44fb79
>>>>>>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>>>>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>>>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>>>>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>>>>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>>>>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>>>>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>>>>>>>> Allocated:
>>>>>>>>>> PID = 19691
>>>>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>>>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>>>>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>>>>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>>>>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>>>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>>>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>>>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>>>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>>>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>>>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>>>>>
>>>>>>>>> However, it has been actually allocated by a different process with pid
>>>>>>>>> 19691.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Freed:
>>>>>>>>>> PID = 19708
>>>>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>>>>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>>>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>>>>>>>  slab_free mm/slub.c:2961 [inline]
>>>>>>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>>>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>>>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>>>>
>>>>>>>>> And yet we have a third process which freed it, PID 19708. So there is
>>>>>>>>> some dance happening with this fd, being allocated by one process,
>>>>>>>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>>>>>>>> scenario of inotify descriptors?
>>>>>>>>
>>>>>>>> They are file descriptors so passing them around is valid.  That is
>>>>>>>> something unix domain sockets have allowed since the dawn of linux.
>>>>>>>>
>>>>>>>> The dance would need to be the fd being passed to the addtional
>>>>>>>> processes and then closed in the original before being closed
>>>>>>>> in the processes the fd was passed to.
>>>>>>>>
>>>>>>>> If those additional processes last longer than the original process this
>>>>>>>> is easy to achieve.
>>>>>>>>
>>>>>>>> My guess is that someone just taught syskallzer to pass file descriptors
>>>>>>>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>>>>>>>> been looking at linux-next with KASAN enabled in the kernel.
>>>>>>>
>>>>>>> Dmitry, can you tell if syzkaller tests sending file descriptors across
>>>>>>> sockets? Since the calltraces here show multiple processes being
>>>>>>> involved in different operations on the exact same file descriptor.
>>>>>>>
>>>>>>> Also JongHwan, can you provide the full, compilable reproducer to try
>>>>>>> and track this issue down?
>>>>>>
>>>>>>
>>>>>> syzkaller can pass descriptors across sockets, but currently only
>>>>>> within a single multi-threaded process.
>>>>>>
>>>>>> Are you sure it's the same descriptor? It seems to me that it's struct
>>>>>> ucounts, which is shared via the global ucounts_hashtable, so no
>>>>>> sharing is required in user processes.
>>>>>>
>>>>>> Unless I am missing something, we want:
>>>>>>
>>>>>> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
>>>>>> user_namespace *ns, kuid_t uid)
>>>>>>                         ucounts = new;
>>>>>>                 }
>>>>>>         }
>>>>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>>>>> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
>>>>>>                 ucounts = NULL;
>>>>>>         spin_unlock_irq(&ucounts_lock);
>>>>>>         return ucounts;
>>>>>>
>>>>>> no?
>>>>>>
>>>>>> put_ucounts drops the last reference, then get_ucounts finds the
>>>>>> ucounts and successfully increments refcount as it's not INT_MAX (it's
>>>>>> 0) and starts using it, meanwhile put_ucounts proceeds to
>>>>>> unconditionally deleting the ucounts.
>>>>>
>>>>>
>>>>> It also seems that a concurrent put_ucounts can make get_ucounts fail
>>>>> _spuriously_, which does not look good.
>>>>> Don't we want something along the following lines?
>>>>>
>>>>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>>>>> index 8a11fc0cb459..233c8e46acd5 100644
>>>>> --- a/kernel/ucount.c
>>>>> +++ b/kernel/ucount.c
>>>>> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
>>>>> user_namespace *ns, kuid_t uid)
>>>>>
>>>>>                 new->ns = ns;
>>>>>                 new->uid = uid;
>>>>> -               atomic_set(&new->count, 0);
>>>>> +               atomic_set(&new->count, 1);
>>>>>
>>>>>                 spin_lock_irq(&ucounts_lock);
>>>>>                 ucounts = find_ucounts(ns, uid, hashent);
>>>>>                 if (ucounts) {
>>>>> +                       atomic_inc(&ucounts->count);
>>>>>                         kfree(new);
>>>>>                 } else {
>>>>>                         hlist_add_head(&new->node, hashent);
>>>>>                         ucounts = new;
>>>>>                 }
>>>>>         }
>>>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>>>> -               ucounts = NULL;
>>>>>         spin_unlock_irq(&ucounts_lock);
>>>>>         return ucounts;
>>>>>  }
>>>>> @@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)
>>>>>
>>>>>         if (atomic_dec_and_test(&ucounts->count)) {
>>>>>                 spin_lock_irqsave(&ucounts_lock, flags);
>>>>> -               hlist_del_init(&ucounts->node);
>>>>> +               if (atomic_read(&ucounts->count) == 0)
>>>>> +                       hlist_del_init(&ucounts->node);
>>>>> +               else
>>>>> +                       ucounts = NULL;
>>>>>                 spin_unlock_irqrestore(&ucounts_lock, flags);
>>>>>
>>>>>                 kfree(ucounts);
>>>>
>>>>
>>>> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
>>>>
>>>> This is broken per se. Need something more elaborate.
>>>>
>>>
>>>
>>> How about this :
>>>
>>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>>> index 8a11fc0cb459..b817ac0e587c 100644
>>> --- a/kernel/ucount.c
>>> +++ b/kernel/ucount.c
>>> @@ -166,11 +166,15 @@ static void put_ucounts(struct ucounts *ucounts)
>>>
>>>         if (atomic_dec_and_test(&ucounts->count)) {
>>>                 spin_lock_irqsave(&ucounts_lock, flags);
>>> -               hlist_del_init(&ucounts->node);
>>> -               spin_unlock_irqrestore(&ucounts_lock, flags);
>>> -
>>> -               kfree(ucounts);
>>> +               if (!atomic_read(&ucounts->count)) {
>>> +                       hlist_del_init(&ucounts->node);
>>> +                       spin_unlock_irqrestore(&ucounts_lock, flags);
>>> +                       kfree(ucounts);
>>> +                       return;
>>> +               }
>>>         }
>>> +
>>> +       spin_unlock_irqrestore(&ucounts_lock, flags);
>>>  }
>>>
>>>
>>>
>>> This makes the atomic_dec_and_test and hashtable removal atomic in essence.
>>
>>
>> This won't work.
>> Consider the following scenario:
>> Thread 0 decrements count to 0 here:
>>   if (atomic_dec_and_test(&ucounts->count)) {
>> Then thread 1 calls get_ucounts, increments count to 1, then calls
>> put_ucounts, decrements count to 0 and unhashes and frees ucounts.
>> Not thread 0 does:
>>   if (!atomic_read(&ucounts->count)) {
>> but ucounts is already freed!
>
>
> What may work is if put_ucounts re-lookups the ucounts. If it can
> still find it and count==0, then it is the right time to delete it. If
> it can't find the ucounts, then somebody else has beaten it.

I believe what we want is atomic_dec_and_lock_irqsave.

As that does not exist we can just do:

@@ -164,13 +164,16 @@ static void put_ucounts(struct ucounts *ucounts)
 {
        unsigned long flags;
 
+       /* Unless the count is 1 decrement the quick way */
+       if (atomic_add_unless(&ucounts->count, -1, 1))
+               return;
+
+       spin_lock_irqsave(&ucounts_lock, flags);
        if (atomic_dec_and_test(&ucounts->count)) {
-               spin_lock_irqsave(&ucounts_lock, flags);
                hlist_del_init(&ucounts->node);
-               spin_unlock_irqrestore(&ucounts_lock, flags);
-
                kfree(ucounts);
        }
+       spin_unlock_irqrestore(&ucounts_lock, flags);
 }
 
 static inline bool atomic_inc_below(atomic_t *v, int u)


AKA take the spin_lock around the dec_and_test.

Arguably always decrementing under the ucounts_lock that means we can
stop reduce ucounts->count to a simple integer and just always take the
lock.  Thus reducing our number of atomic operations and speeding up the
code a little.

But that might be a bit much for a simple bug fix.

If you folks can verify the fix above closes the race and stops the
problems I would appreciate it.

Eric

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                         ` <87shmsoar7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-03-05 10:53                                           ` Dmitry Vyukov via Containers
       [not found]                                             ` <CACT4Y+bYAbsxsRWA2o+c7x25f1JPSqZKX-8a8NJq4nab-PyYig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-05 10:53 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: syzkaller,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	JongHwan Kim

On Sun, Mar 5, 2017 at 12:58 AM, Eric W. Biederman
<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> wrote:
> Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> writes:
>
>> On Sat, Mar 4, 2017 at 1:15 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>> On Sat, Mar 4, 2017 at 1:10 PM, Nikolay Borisov
>>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>>
>>>>
>>>> On  4.03.2017 14:01, Dmitry Vyukov wrote:
>>>>> On Sat, Mar 4, 2017 at 12:50 PM, Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>>>>>> <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>>>>>> [Addressing Dmitry Vyukov to ask for syzkaller clarification]
>>>>>>>>
>>>>>>>> On  3.03.2017 18:30, Eric W. Biederman wrote:
>>>>>>>>> Nikolay Borisov <n.borisov.lkml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>>>>>>>
>>>>>>>>>> [Added containers ml, Eric Biederman and Jan Kara]. Please,
>>>>>>>>>> next time don't add random people but take the time to see who touched
>>>>>>>>>> the code.
>>>>>>>>>
>>>>>>>>> Comments below.
>>>>>>>>>
>>>>>>>>>> On  3.03.2017 14:16, JongHwan Kim wrote:
>>>>>>>>>>> I've got the following report with syzkaller fuzzer
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit .
>>>>>>>>>>>
>>>>>>>>>>> ==================================================================
>>>>>>>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>>>>>>>> include/linux/compiler.h:254 [inline] at addr ffff88006d399bc4
>>>>>>>>>>> BUG: KASAN: use-after-free in atomic_read
>>>>>>>>>>> arch/x86/include/asm/atomic.h:26 [inline] at addr ffff88006d399bc4
>>>>>>>>>>> BUG: KASAN: use-after-free in atomic_dec_if_positive
>>>>>>>>>>> include/linux/atomic.h:616 [inline] at addr ffff88006d399bc4
>>>>>>>>>>> BUG: KASAN: use-after-free in dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>>>>> at addr ffff88006d399bc4
>>>>>>>>>>> Read of size 4 by task syz-executor3/19713
>>>>>>>>>>> CPU: 1 PID: 19713 Comm: syz-executor3 Not tainted 4.10.0+ #4
>>>>>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>>>>>>>>> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>>>>>>>>> Call Trace:
>>>>>>>>>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>>>>>>>>>  dump_stack+0x115/0x1cf lib/dump_stack.c:51
>>>>>>>>>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>>>>>>>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>>>>>>>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>>>>>>>>>  kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
>>>>>>>>>>>  kasan_report mm/kasan/report.c:331 [inline]
>>>>>>>>>>>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:331
>>>>>>>>>>>  __read_once_size include/linux/compiler.h:254 [inline]
>>>>>>>>>>>  atomic_read arch/x86/include/asm/atomic.h:26 [inline]
>>>>>>>>>>>  atomic_dec_if_positive include/linux/atomic.h:616 [inline]
>>>>>>>>>>>  dec_ucount+0x1e5/0x210 kernel/ucount.c:217
>>>>>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>>>>>
>>>>>>>>>> So PID 19713 is exitting and as part of it it's freeing its file
>>>>>>>>>> descriptors, one of which is apparently an inotify fd. And this has
>>>>>>>>>> already been freed.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> RIP: 0033:0x44fb79
>>>>>>>>>>> RSP: 002b:00007ffd0f00f6d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
>>>>>>>>>>> RAX: fffffffffffffdfc RBX: 0000000000708024 RCX: 000000000044fb79
>>>>>>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708024
>>>>>>>>>>> RBP: 00000000000ae8e6 R08: 0000000000708000 R09: 000000160000000d
>>>>>>>>>>> R10: 00007ffd0f00f710 R11: 0000000000000206 R12: 0000000000708000
>>>>>>>>>>> R13: 0000000000708024 R14: 00000000000ae8a1 R15: 0000000000000016
>>>>>>>>>>> Object at ffff88006d399b88, in cache kmalloc-96 size: 96
>>>>>>>>>>> Allocated:
>>>>>>>>>>> PID = 19691
>>>>>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>>>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>>>>>>>>>>>  kmem_cache_alloc_trace+0xfb/0x280 mm/slub.c:2745
>>>>>>>>>>>  kmalloc include/linux/slab.h:490 [inline]
>>>>>>>>>>>  kzalloc include/linux/slab.h:663 [inline]
>>>>>>>>>>>  get_ucounts kernel/ucount.c:140 [inline]
>>>>>>>>>>>  inc_ucount+0x538/0xa70 kernel/ucount.c:195
>>>>>>>>>>>  inotify_new_group+0x309/0x410 fs/notify/inotify/inotify_user.c:655
>>>>>>>>>>>  SYSC_inotify_init1 fs/notify/inotify/inotify_user.c:682 [inline]
>>>>>>>>>>>  SyS_inotify_init1 fs/notify/inotify/inotify_user.c:669 [inline]
>>>>>>>>>>>  sys_inotify_init+0x17/0x80 fs/notify/inotify/inotify_user.c:696
>>>>>>>>>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>>>>>>
>>>>>>>>>> However, it has been actually allocated by a different process with pid
>>>>>>>>>> 19691.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Freed:
>>>>>>>>>>> PID = 19708
>>>>>>>>>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>>>>>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>>>>>>>>>>>  set_track mm/kasan/kasan.c:514 [inline]
>>>>>>>>>>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>>>>>>>>>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>>>>>>>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>>>>>>>>>  slab_free mm/slub.c:2961 [inline]
>>>>>>>>>>>  kfree+0xe8/0x2c0 mm/slub.c:3882
>>>>>>>>>>>  put_ucounts+0x1dd/0x270 kernel/ucount.c:172
>>>>>>>>>>>  dec_ucount+0x172/0x210 kernel/ucount.c:220
>>>>>>>>>>>  dec_inotify_instances fs/notify/inotify/inotify.h:37 [inline]
>>>>>>>>>>>  inotify_free_group_priv+0x6c/0x80 fs/notify/inotify/inotify_fsnotify.c:169
>>>>>>>>>>>  fsnotify_final_destroy_group fs/notify/group.c:37 [inline]
>>>>>>>>>>>  fsnotify_put_group+0x73/0xa0 fs/notify/group.c:110
>>>>>>>>>>>  fsnotify_destroy_group+0xec/0x120 fs/notify/group.c:93
>>>>>>>>>>>  inotify_release+0x37/0x50 fs/notify/inotify/inotify_user.c:280
>>>>>>>>>>>  __fput+0x327/0x7e0 fs/file_table.c:208
>>>>>>>>>>>  ____fput+0x15/0x20 fs/file_table.c:244
>>>>>>>>>>>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>>>>>>>>>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>>>>>>>>>  do_exit+0xa45/0x1b20 kernel/exit.c:873
>>>>>>>>>>>  do_group_exit+0x149/0x400 kernel/exit.c:977
>>>>>>>>>>>  get_signal+0x7d5/0x1810 kernel/signal.c:2313
>>>>>>>>>>>  do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:807
>>>>>>>>>>>  exit_to_usermode_loop+0x162/0x1e0 arch/x86/entry/common.c:156
>>>>>>>>>>>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>>>>>>>>>>  syscall_return_slowpath+0x2b6/0x310 arch/x86/entry/common.c:259
>>>>>>>>>>>  entry_SYSCALL_64_fastpath+0xc0/0xc2
>>>>>>>>>>
>>>>>>>>>> And yet we have a third process which freed it, PID 19708. So there is
>>>>>>>>>> some dance happening with this fd, being allocated by one process,
>>>>>>>>>> handed over to 2 more, which are freeing it. Is this a valid usage
>>>>>>>>>> scenario of inotify descriptors?
>>>>>>>>>
>>>>>>>>> They are file descriptors so passing them around is valid.  That is
>>>>>>>>> something unix domain sockets have allowed since the dawn of linux.
>>>>>>>>>
>>>>>>>>> The dance would need to be the fd being passed to the addtional
>>>>>>>>> processes and then closed in the original before being closed
>>>>>>>>> in the processes the fd was passed to.
>>>>>>>>>
>>>>>>>>> If those additional processes last longer than the original process this
>>>>>>>>> is easy to achieve.
>>>>>>>>>
>>>>>>>>> My guess is that someone just taught syskallzer to pass file descriptors
>>>>>>>>> around.  So this may be an old bug.  Either that or syskallzer hasn't
>>>>>>>>> been looking at linux-next with KASAN enabled in the kernel.
>>>>>>>>
>>>>>>>> Dmitry, can you tell if syzkaller tests sending file descriptors across
>>>>>>>> sockets? Since the calltraces here show multiple processes being
>>>>>>>> involved in different operations on the exact same file descriptor.
>>>>>>>>
>>>>>>>> Also JongHwan, can you provide the full, compilable reproducer to try
>>>>>>>> and track this issue down?
>>>>>>>
>>>>>>>
>>>>>>> syzkaller can pass descriptors across sockets, but currently only
>>>>>>> within a single multi-threaded process.
>>>>>>>
>>>>>>> Are you sure it's the same descriptor? It seems to me that it's struct
>>>>>>> ucounts, which is shared via the global ucounts_hashtable, so no
>>>>>>> sharing is required in user processes.
>>>>>>>
>>>>>>> Unless I am missing something, we want:
>>>>>>>
>>>>>>> @@ -154,7 +155,7 @@ static struct ucounts *get_ucounts(struct
>>>>>>> user_namespace *ns, kuid_t uid)
>>>>>>>                         ucounts = new;
>>>>>>>                 }
>>>>>>>         }
>>>>>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>>>>>> +       if (!atomic_add_unless(&ucounts->count, 1, 0))
>>>>>>>                 ucounts = NULL;
>>>>>>>         spin_unlock_irq(&ucounts_lock);
>>>>>>>         return ucounts;
>>>>>>>
>>>>>>> no?
>>>>>>>
>>>>>>> put_ucounts drops the last reference, then get_ucounts finds the
>>>>>>> ucounts and successfully increments refcount as it's not INT_MAX (it's
>>>>>>> 0) and starts using it, meanwhile put_ucounts proceeds to
>>>>>>> unconditionally deleting the ucounts.
>>>>>>
>>>>>>
>>>>>> It also seems that a concurrent put_ucounts can make get_ucounts fail
>>>>>> _spuriously_, which does not look good.
>>>>>> Don't we want something along the following lines?
>>>>>>
>>>>>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>>>>>> index 8a11fc0cb459..233c8e46acd5 100644
>>>>>> --- a/kernel/ucount.c
>>>>>> +++ b/kernel/ucount.c
>>>>>> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
>>>>>> user_namespace *ns, kuid_t uid)
>>>>>>
>>>>>>                 new->ns = ns;
>>>>>>                 new->uid = uid;
>>>>>> -               atomic_set(&new->count, 0);
>>>>>> +               atomic_set(&new->count, 1);
>>>>>>
>>>>>>                 spin_lock_irq(&ucounts_lock);
>>>>>>                 ucounts = find_ucounts(ns, uid, hashent);
>>>>>>                 if (ucounts) {
>>>>>> +                       atomic_inc(&ucounts->count);
>>>>>>                         kfree(new);
>>>>>>                 } else {
>>>>>>                         hlist_add_head(&new->node, hashent);
>>>>>>                         ucounts = new;
>>>>>>                 }
>>>>>>         }
>>>>>> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
>>>>>> -               ucounts = NULL;
>>>>>>         spin_unlock_irq(&ucounts_lock);
>>>>>>         return ucounts;
>>>>>>  }
>>>>>> @@ -166,7 +165,10 @@ static void put_ucounts(struct ucounts *ucounts)
>>>>>>
>>>>>>         if (atomic_dec_and_test(&ucounts->count)) {
>>>>>>                 spin_lock_irqsave(&ucounts_lock, flags);
>>>>>> -               hlist_del_init(&ucounts->node);
>>>>>> +               if (atomic_read(&ucounts->count) == 0)
>>>>>> +                       hlist_del_init(&ucounts->node);
>>>>>> +               else
>>>>>> +                       ucounts = NULL;
>>>>>>                 spin_unlock_irqrestore(&ucounts_lock, flags);
>>>>>>
>>>>>>                 kfree(ucounts);
>>>>>
>>>>>
>>>>> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
>>>>>
>>>>> This is broken per se. Need something more elaborate.
>>>>>
>>>>
>>>>
>>>> How about this :
>>>>
>>>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>>>> index 8a11fc0cb459..b817ac0e587c 100644
>>>> --- a/kernel/ucount.c
>>>> +++ b/kernel/ucount.c
>>>> @@ -166,11 +166,15 @@ static void put_ucounts(struct ucounts *ucounts)
>>>>
>>>>         if (atomic_dec_and_test(&ucounts->count)) {
>>>>                 spin_lock_irqsave(&ucounts_lock, flags);
>>>> -               hlist_del_init(&ucounts->node);
>>>> -               spin_unlock_irqrestore(&ucounts_lock, flags);
>>>> -
>>>> -               kfree(ucounts);
>>>> +               if (!atomic_read(&ucounts->count)) {
>>>> +                       hlist_del_init(&ucounts->node);
>>>> +                       spin_unlock_irqrestore(&ucounts_lock, flags);
>>>> +                       kfree(ucounts);
>>>> +                       return;
>>>> +               }
>>>>         }
>>>> +
>>>> +       spin_unlock_irqrestore(&ucounts_lock, flags);
>>>>  }
>>>>
>>>>
>>>>
>>>> This makes the atomic_dec_and_test and hashtable removal atomic in essence.
>>>
>>>
>>> This won't work.
>>> Consider the following scenario:
>>> Thread 0 decrements count to 0 here:
>>>   if (atomic_dec_and_test(&ucounts->count)) {
>>> Then thread 1 calls get_ucounts, increments count to 1, then calls
>>> put_ucounts, decrements count to 0 and unhashes and frees ucounts.
>>> Not thread 0 does:
>>>   if (!atomic_read(&ucounts->count)) {
>>> but ucounts is already freed!
>>
>>
>> What may work is if put_ucounts re-lookups the ucounts. If it can
>> still find it and count==0, then it is the right time to delete it. If
>> it can't find the ucounts, then somebody else has beaten it.
>
> I believe what we want is atomic_dec_and_lock_irqsave.
>
> As that does not exist we can just do:
>
> @@ -164,13 +164,16 @@ static void put_ucounts(struct ucounts *ucounts)
>  {
>         unsigned long flags;
>
> +       /* Unless the count is 1 decrement the quick way */
> +       if (atomic_add_unless(&ucounts->count, -1, 1))
> +               return;
> +
> +       spin_lock_irqsave(&ucounts_lock, flags);
>         if (atomic_dec_and_test(&ucounts->count)) {
> -               spin_lock_irqsave(&ucounts_lock, flags);
>                 hlist_del_init(&ucounts->node);
> -               spin_unlock_irqrestore(&ucounts_lock, flags);
> -
>                 kfree(ucounts);
>         }
> +       spin_unlock_irqrestore(&ucounts_lock, flags);
>  }

Nice. I think it should work.

I would also do:

diff --git a/kernel/ucount.c b/kernel/ucount.c
index 8a11fc0cb459..233c8e46acd5 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
user_namespace *ns, kuid_t uid)

                new->ns = ns;
                new->uid = uid;
-               atomic_set(&new->count, 0);
+               atomic_set(&new->count, 1);

                spin_lock_irq(&ucounts_lock);
                ucounts = find_ucounts(ns, uid, hashent);
                if (ucounts) {
+                       atomic_inc(&ucounts->count);
                        kfree(new);
                } else {
                        hlist_add_head(&new->node, hashent);
                        ucounts = new;
                }
        }
-       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
-               ucounts = NULL;
        spin_unlock_irq(&ucounts_lock);
        return ucounts;
 }



>  static inline bool atomic_inc_below(atomic_t *v, int u)
>
>
> AKA take the spin_lock around the dec_and_test.
>
> Arguably always decrementing under the ucounts_lock that means we can
> stop reduce ucounts->count to a simple integer and just always take the
> lock.  Thus reducing our number of atomic operations and speeding up the
> code a little.
>
> But that might be a bit much for a simple bug fix.
>
> If you folks can verify the fix above closes the race and stops the
> problems I would appreciate it.

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                             ` <CACT4Y+bYAbsxsRWA2o+c7x25f1JPSqZKX-8a8NJq4nab-PyYig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-05 18:41                                               ` Eric W. Biederman
       [not found]                                                 ` <87y3wjlg6r.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Eric W. Biederman @ 2017-03-05 18:41 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzkaller,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	JongHwan Kim

Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> writes:
>
> Nice. I think it should work.
>
> I would also do:
>
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index 8a11fc0cb459..233c8e46acd5 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -143,19 +143,18 @@ static struct ucounts *get_ucounts(struct
> user_namespace *ns, kuid_t uid)
>
>                 new->ns = ns;
>                 new->uid = uid;
> -               atomic_set(&new->count, 0);
> +               atomic_set(&new->count, 1);
>
>                 spin_lock_irq(&ucounts_lock);
>                 ucounts = find_ucounts(ns, uid, hashent);
>                 if (ucounts) {
> +                       atomic_inc(&ucounts->count);
>                         kfree(new);
>                 } else {
>                         hlist_add_head(&new->node, hashent);
>                         ucounts = new;
>                 }
>         }
> -       if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
> -               ucounts = NULL;
>         spin_unlock_irq(&ucounts_lock);
>         return ucounts;
>  }

No.  As that allows ucounts->count to be incremented to the point
it goes negative.  Counter wrap-around is just as bad as imperfect
locking if you can trigger it, and has been a cause of use-after-free
errors etc.

So it is a feature that if the count is maxed out for a given kuid that
get_ucounts will fail.

Eric

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                     ` <CALRZ7Ut=bAgX+XwS3h8-V-zGqiUFkuSCi_C=j6uN_=OptU-+RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-05 21:00                                       ` Eric W. Biederman
       [not found]                                         ` <87efybfnh2.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Eric W. Biederman @ 2017-03-05 21:00 UTC (permalink / raw)
  To: 쪼르
  Cc: syzkaller,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	Dmitry Vyukov

쪼르 <zzoru007@gmail.com> writes:

> Hi, This is my new one report about dec_ucount:
> ps.Sorry for my uncomfortable report. This is my first usage of lkml. 
> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit
> .

You are doing well.  Thank you very much for the report.

Thank you for the reproducer.  Unfortunately I am not able to reproduce
the bug with what the code you have posted here.

From the initial mailing the code said:

> Syzkaller reproducer:
> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
> Repro:false}
> inotify_init()

The code you posted says:

> Syzkaller reproducer:
> # {Threaded:false Collide:false Repeat:true Procs:1 Sandbox:setuid Repro:false}
> semget$private(0x0, 0x400001003, 0x181)

So I expect syzkaller did not create the same code when you ran it
again.  Something easy to miss if you haven't run used a tool like that
much.

If someone knows how to get the code that syzkaller would generate that
matches the original reproducer I would very much appreciate it so that
we can confirm the bug we have spotted in the code is the bug syzkaller
found.

Until that point I am going to fix the obvious bug in the code and hope
that fixes the problem.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [REVIEW][PATCH] ucount: Remove the atomicity from ucount->count
       [not found]                                                 ` <87y3wjlg6r.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-03-05 21:41                                                   ` Eric W. Biederman
       [not found]                                                     ` <87tw77csgd.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Eric W. Biederman @ 2017-03-05 21:41 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzkaller,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	JongHwan Kim


Always increment/decrement ucount->count under the ucounts_lock.  The
increments are there already and moving the decrements there means the
locking logic of the code is simpler.  This simplification in the
locking logic fixes a race between put_ucounts and get_ucounts that
could result in a use-after-free because the count could go zero then
be found by get_ucounts and then be freed by put_ucounts.

A bug presumably this one was found by a combination of syzkaller and
KASAN.  JongWhan Kim reported the syzkaller failure and Dmitry Vyukov
spotted the race in the code.

Reported-by: JongHwan Kim <zzoru007-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reported-by: Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
 include/linux/user_namespace.h |  2 +-
 kernel/ucount.c                | 18 +++++++++++-------
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 363e0e8082a9..61071b6d2d12 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -69,7 +69,7 @@ struct ucounts {
 	struct hlist_node node;
 	struct user_namespace *ns;
 	kuid_t uid;
-	atomic_t count;
+	int count;
 	atomic_t ucount[UCOUNT_COUNTS];
 };
 
diff --git a/kernel/ucount.c b/kernel/ucount.c
index 68716403b261..73696faa80dd 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -143,7 +143,7 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 
 		new->ns = ns;
 		new->uid = uid;
-		atomic_set(&new->count, 0);
+		new->count = 0;
 
 		spin_lock_irq(&ucounts_lock);
 		ucounts = find_ucounts(ns, uid, hashent);
@@ -154,8 +154,10 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 			ucounts = new;
 		}
 	}
-	if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
+	if (ucounts->count == INT_MAX)
 		ucounts = NULL;
+	else
+		ucounts->count += 1;
 	spin_unlock_irq(&ucounts_lock);
 	return ucounts;
 }
@@ -164,13 +166,15 @@ static void put_ucounts(struct ucounts *ucounts)
 {
 	unsigned long flags;
 
-	if (atomic_dec_and_test(&ucounts->count)) {
-		spin_lock_irqsave(&ucounts_lock, flags);
+	spin_lock_irqsave(&ucounts_lock, flags);
+	ucounts->count -= 1;
+	if (!ucounts->count)
 		hlist_del_init(&ucounts->node);
-		spin_unlock_irqrestore(&ucounts_lock, flags);
+	else
+		ucounts = NULL;
+	spin_unlock_irqrestore(&ucounts_lock, flags);
 
-		kfree(ucounts);
-	}
+	kfree(ucounts);
 }
 
 static inline bool atomic_inc_below(atomic_t *v, int u)
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                         ` <87efybfnh2.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-03-06  9:13                                           ` Dmitry Vyukov via Containers
       [not found]                                             ` <CACT4Y+aZmSY3rmV7+iC7i6Ov2Of4N5YTpjz5Tk8XCki-CyAY5w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-06  9:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: syzkaller,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	쪼르

On Sun, Mar 5, 2017 at 10:00 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> 쪼르 <zzoru007@gmail.com> writes:
>
>> Hi, This is my new one report about dec_ucount:
>> ps.Sorry for my uncomfortable report. This is my first usage of lkml.
>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit
>> .
>
> You are doing well.  Thank you very much for the report.
>
> Thank you for the reproducer.  Unfortunately I am not able to reproduce
> the bug with what the code you have posted here.
>
> From the initial mailing the code said:
>
>> Syzkaller reproducer:
>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>> Repro:false}
>> inotify_init()
>
> The code you posted says:
>
>> Syzkaller reproducer:
>> # {Threaded:false Collide:false Repeat:true Procs:1 Sandbox:setuid Repro:false}
>> semget$private(0x0, 0x400001003, 0x181)
>
> So I expect syzkaller did not create the same code when you ran it
> again.  Something easy to miss if you haven't run used a tool like that
> much.
>
> If someone knows how to get the code that syzkaller would generate that
> matches the original reproducer I would very much appreciate it so that
> we can confirm the bug we have spotted in the code is the bug syzkaller
> found.
>
> Until that point I am going to fix the obvious bug in the code and hope
> that fixes the problem.


Reliably reproducing such bugs is not possible (how would you expect
it to look like?). Your best bet is to write a stress test that
provokes the bug, add some sleeps into kernel code and run it for a
while with KASAN. Should be reproducible within minutes.
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                             ` <CACT4Y+aZmSY3rmV7+iC7i6Ov2Of4N5YTpjz5Tk8XCki-CyAY5w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-06 16:33                                               ` Eric W. Biederman
       [not found]                                                 ` <87a88y4b78.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Eric W. Biederman @ 2017-03-06 16:33 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzkaller,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	쪼르

Dmitry Vyukov <dvyukov@google.com> writes:

> On Sun, Mar 5, 2017 at 10:00 PM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> 쪼르 <zzoru007@gmail.com> writes:
>>
>>> Hi, This is my new one report about dec_ucount:
>>> ps.Sorry for my uncomfortable report. This is my first usage of lkml.
>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit
>>> .
>>
>> You are doing well.  Thank you very much for the report.
>>
>> Thank you for the reproducer.  Unfortunately I am not able to reproduce
>> the bug with what the code you have posted here.
>>
>> From the initial mailing the code said:
>>
>>> Syzkaller reproducer:
>>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>>> Repro:false}
>>> inotify_init()
>>
>> The code you posted says:
>>
>>> Syzkaller reproducer:
>>> # {Threaded:false Collide:false Repeat:true Procs:1 Sandbox:setuid Repro:false}
>>> semget$private(0x0, 0x400001003, 0x181)
>>
>> So I expect syzkaller did not create the same code when you ran it
>> again.  Something easy to miss if you haven't run used a tool like that
>> much.
>>
>> If someone knows how to get the code that syzkaller would generate that
>> matches the original reproducer I would very much appreciate it so that
>> we can confirm the bug we have spotted in the code is the bug syzkaller
>> found.
>>
>> Until that point I am going to fix the obvious bug in the code and hope
>> that fixes the problem.
>
>
> Reliably reproducing such bugs is not possible (how would you expect
> it to look like?). Your best bet is to write a stress test that
> provokes the bug, add some sleeps into kernel code and run it for a
> while with KASAN. Should be reproducible within minutes.

I was not asking for a reliable reproducer.  I was asking what code was
run that triggered the error.

I don't have a clue what the randomly generated code that prompted the
original kernel error is and it doesn't appear anyone else does either.

The only hint I have is:
>>> Syzkaller reproducer:
>>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>>> Repro:false}
>>> inotify_init()

The code that was posted did not call inotify_init and so I believe that
was a completely different random piece of code, that has nothing to do
with this issue.

I don't know syzkaller and it looks non-trivial to install on my system
and play around with.  So I am going to leave futzing with syzkaller to
people who have been able to figure it out.

Until I have a reasonable understanding of what the code was doing that
triggered the error I can't say with any certainty that the reported bug
was fixed.

I would love to be able to say that it looks like the bug that caused
the error report was fixed.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ucount: use-after-free read in inc_ucount & dec_ucount
       [not found]                                                 ` <87a88y4b78.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-03-06 18:43                                                   ` Dmitry Vyukov via Containers
  0 siblings, 0 replies; 23+ messages in thread
From: Dmitry Vyukov via Containers @ 2017-03-06 18:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: syzkaller,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Jan Kara,
	쪼르

On Mon, Mar 6, 2017 at 5:33 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Dmitry Vyukov <dvyukov@google.com> writes:
>
>> On Sun, Mar 5, 2017 at 10:00 PM, Eric W. Biederman
>> <ebiederm@xmission.com> wrote:
>>> 쪼르 <zzoru007@gmail.com> writes:
>>>
>>>> Hi, This is my new one report about dec_ucount:
>>>> ps.Sorry for my uncomfortable report. This is my first usage of lkml.
>>>> Syzkaller hit 'KASAN: use-after-free Read in dec_ucount' bug on commit
>>>> .
>>>
>>> You are doing well.  Thank you very much for the report.
>>>
>>> Thank you for the reproducer.  Unfortunately I am not able to reproduce
>>> the bug with what the code you have posted here.
>>>
>>> From the initial mailing the code said:
>>>
>>>> Syzkaller reproducer:
>>>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>>>> Repro:false}
>>>> inotify_init()
>>>
>>> The code you posted says:
>>>
>>>> Syzkaller reproducer:
>>>> # {Threaded:false Collide:false Repeat:true Procs:1 Sandbox:setuid Repro:false}
>>>> semget$private(0x0, 0x400001003, 0x181)
>>>
>>> So I expect syzkaller did not create the same code when you ran it
>>> again.  Something easy to miss if you haven't run used a tool like that
>>> much.
>>>
>>> If someone knows how to get the code that syzkaller would generate that
>>> matches the original reproducer I would very much appreciate it so that
>>> we can confirm the bug we have spotted in the code is the bug syzkaller
>>> found.
>>>
>>> Until that point I am going to fix the obvious bug in the code and hope
>>> that fixes the problem.
>>
>>
>> Reliably reproducing such bugs is not possible (how would you expect
>> it to look like?). Your best bet is to write a stress test that
>> provokes the bug, add some sleeps into kernel code and run it for a
>> while with KASAN. Should be reproducible within minutes.
>
> I was not asking for a reliable reproducer.  I was asking what code was
> run that triggered the error.
>
> I don't have a clue what the randomly generated code that prompted the
> original kernel error is and it doesn't appear anyone else does either.
>
> The only hint I have is:
>>>> Syzkaller reproducer:
>>>> # {Threaded:false Collide:false Repeat:true Procs:4 Sandbox:setuid
>>>> Repro:false}
>>>> inotify_init()
>
> The code that was posted did not call inotify_init and so I believe that
> was a completely different random piece of code, that has nothing to do
> with this issue.
>
> I don't know syzkaller and it looks non-trivial to install on my system
> and play around with.  So I am going to leave futzing with syzkaller to
> people who have been able to figure it out.
>
> Until I have a reasonable understanding of what the code was doing that
> triggered the error I can't say with any certainty that the reported bug
> was fixed.
>
> I would love to be able to say that it looks like the bug that caused
> the error report was fixed.


Here is a C analog of the reproducer:
https://gist.githubusercontent.com/dvyukov/80293e434b1273c6a1cdce5c6b979ab6/raw/a1f5ff3cab17b5e3882b56060efb795b6355bdf2/gistfile1.txt
that JongHwan provided here:
https://groups.google.com/d/msg/syzkaller/crHVWf1QDkU/DrA9uVquDwAJ

It basically just calls inotify_init in parallel processes.
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [REVIEW][PATCH] ucount: Remove the atomicity from ucount->count
       [not found]                                                     ` <87tw77csgd.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-03-06 20:39                                                       ` Andrei Vagin
       [not found]                                                         ` <20170306203919.GA17874-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Andrei Vagin @ 2017-03-06 20:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, JongHwan Kim, Dmitry Vyukov

On Sun, Mar 05, 2017 at 03:41:06PM -0600, Eric W. Biederman wrote:
> 
> Always increment/decrement ucount->count under the ucounts_lock.  The
> increments are there already and moving the decrements there means the
> locking logic of the code is simpler.  This simplification in the
> locking logic fixes a race between put_ucounts and get_ucounts that
> could result in a use-after-free because the count could go zero then
> be found by get_ucounts and then be freed by put_ucounts.
> 
> A bug presumably this one was found by a combination of syzkaller and
> KASAN.  JongWhan Kim reported the syzkaller failure and Dmitry Vyukov
> spotted the race in the code.
>

Reviewed-by: Andrei Vagin <avagin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

I think we can rework this in a future so that ucount will be rcu
protected.


> Reported-by: JongHwan Kim <zzoru007-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reported-by: Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
>  include/linux/user_namespace.h |  2 +-
>  kernel/ucount.c                | 18 +++++++++++-------
>  2 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index 363e0e8082a9..61071b6d2d12 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -69,7 +69,7 @@ struct ucounts {
>  	struct hlist_node node;
>  	struct user_namespace *ns;
>  	kuid_t uid;
> -	atomic_t count;
> +	int count;
>  	atomic_t ucount[UCOUNT_COUNTS];
>  };
>  
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index 68716403b261..73696faa80dd 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -143,7 +143,7 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
>  
>  		new->ns = ns;
>  		new->uid = uid;
> -		atomic_set(&new->count, 0);
> +		new->count = 0;
>  
>  		spin_lock_irq(&ucounts_lock);
>  		ucounts = find_ucounts(ns, uid, hashent);
> @@ -154,8 +154,10 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
>  			ucounts = new;
>  		}
>  	}
> -	if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
> +	if (ucounts->count == INT_MAX)
>  		ucounts = NULL;
> +	else
> +		ucounts->count += 1;
>  	spin_unlock_irq(&ucounts_lock);
>  	return ucounts;
>  }
> @@ -164,13 +166,15 @@ static void put_ucounts(struct ucounts *ucounts)
>  {
>  	unsigned long flags;
>  
> -	if (atomic_dec_and_test(&ucounts->count)) {
> -		spin_lock_irqsave(&ucounts_lock, flags);
> +	spin_lock_irqsave(&ucounts_lock, flags);
> +	ucounts->count -= 1;
> +	if (!ucounts->count)
>  		hlist_del_init(&ucounts->node);
> -		spin_unlock_irqrestore(&ucounts_lock, flags);
> +	else
> +		ucounts = NULL;
> +	spin_unlock_irqrestore(&ucounts_lock, flags);
>  
> -		kfree(ucounts);
> -	}
> +	kfree(ucounts);
>  }
>  
>  static inline bool atomic_inc_below(atomic_t *v, int u)
> -- 
> 2.10.1
> 
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [REVIEW][PATCH] ucount: Remove the atomicity from ucount->count
       [not found]                                                         ` <20170306203919.GA17874-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-03-06 21:26                                                           ` Eric W. Biederman
  0 siblings, 0 replies; 23+ messages in thread
From: Eric W. Biederman @ 2017-03-06 21:26 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	syzkaller, Jan Kara, JongHwan Kim, Dmitry Vyukov

Andrei Vagin <avagin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> On Sun, Mar 05, 2017 at 03:41:06PM -0600, Eric W. Biederman wrote:
>> 
>> Always increment/decrement ucount->count under the ucounts_lock.  The
>> increments are there already and moving the decrements there means the
>> locking logic of the code is simpler.  This simplification in the
>> locking logic fixes a race between put_ucounts and get_ucounts that
>> could result in a use-after-free because the count could go zero then
>> be found by get_ucounts and then be freed by put_ucounts.
>> 
>> A bug presumably this one was found by a combination of syzkaller and
>> KASAN.  JongWhan Kim reported the syzkaller failure and Dmitry Vyukov
>> spotted the race in the code.
>>
>
> Reviewed-by: Andrei Vagin <avagin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>
> I think we can rework this in a future so that ucount will be rcu
> protected.

Agreed. Although I would like to see a benchmark that motivated that.
So far my impression is that all of these counts are in the noise.
Which is why I have aimed more at simplicity than the fastest possible
data structures.


Eric

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-03-06 21:26 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <C4CB953C-4712-43F3-9E75-1A12C14B8A4B@gmail.com>
     [not found] ` <C4CB953C-4712-43F3-9E75-1A12C14B8A4B-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-03-03 15:37   ` ucount: use-after-free read in inc_ucount & dec_ucount Nikolay Borisov
     [not found]     ` <180fb7dc-790e-8e82-0cc1-c6e15ddcd20b-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-03-03 16:30       ` Eric W. Biederman
     [not found]         ` <87pohy1fx6.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-03-03 16:45           ` Nikolay Borisov
     [not found]             ` <d0a23a9c-65a6-063f-748b-e399bed6dacd-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-03-03 16:46               ` Eric W. Biederman
2017-03-04 10:58           ` Nikolay Borisov
     [not found]             ` <1aafd5e9-d9de-e5d9-a77d-cf245c5a5c6a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-03-04 11:44               ` Dmitry Vyukov via Containers
     [not found]                 ` <CACT4Y+a043_qFN=2uzvYhDFwF4JS9_p3jnOykpMk=RcpqQfKGQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-04 11:50                   ` Dmitry Vyukov via Containers
     [not found]                     ` <CACT4Y+Yev63VXYm+kZdii5kheV_ACBn2cehFcFdz6LBVam3Q2g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-04 11:57                       ` Dmitry Vyukov via Containers
2017-03-04 12:01                       ` Dmitry Vyukov via Containers
     [not found]                         ` <CACT4Y+b7YXnoOkdYTjp6D6XH-zuYRx063hMayOVMyJVc735dtQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-04 12:10                           ` Nikolay Borisov
     [not found]                             ` <c69f8f03-4324-f934-eed2-643c91d703c0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-03-04 12:15                               ` Dmitry Vyukov via Containers
     [not found]                                 ` <CACT4Y+avCy8Susvb09DLHkGym0K_15+EeZ8yBH7tcgjvnb3Yhw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-04 12:35                                   ` 쪼르
     [not found]                                     ` <CALRZ7Ut=bAgX+XwS3h8-V-zGqiUFkuSCi_C=j6uN_=OptU-+RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-05 21:00                                       ` Eric W. Biederman
     [not found]                                         ` <87efybfnh2.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-03-06  9:13                                           ` Dmitry Vyukov via Containers
     [not found]                                             ` <CACT4Y+aZmSY3rmV7+iC7i6Ov2Of4N5YTpjz5Tk8XCki-CyAY5w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-06 16:33                                               ` Eric W. Biederman
     [not found]                                                 ` <87a88y4b78.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-03-06 18:43                                                   ` Dmitry Vyukov via Containers
2017-03-04 12:38                                   ` Dmitry Vyukov via Containers
     [not found]                                     ` <CACT4Y+YWEq66QXbTMR4yGtgc0ULN+TurAnRzcDASaja1z5XNjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-04 23:58                                       ` Eric W. Biederman
     [not found]                                         ` <87shmsoar7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-03-05 10:53                                           ` Dmitry Vyukov via Containers
     [not found]                                             ` <CACT4Y+bYAbsxsRWA2o+c7x25f1JPSqZKX-8a8NJq4nab-PyYig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-05 18:41                                               ` Eric W. Biederman
     [not found]                                                 ` <87y3wjlg6r.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-03-05 21:41                                                   ` [REVIEW][PATCH] ucount: Remove the atomicity from ucount->count Eric W. Biederman
     [not found]                                                     ` <87tw77csgd.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-03-06 20:39                                                       ` Andrei Vagin
     [not found]                                                         ` <20170306203919.GA17874-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-03-06 21:26                                                           ` Eric W. Biederman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.