* KASAN: use-after-free Read in task_is_descendant
@ 2018-10-21 7:10 syzbot
2018-10-21 7:12 ` Tetsuo Handa
` (2 more replies)
0 siblings, 3 replies; 31+ messages in thread
From: syzbot @ 2018-10-21 7:10 UTC (permalink / raw)
To: jmorris, keescook, linux-kernel, linux-security-module, serge,
syzkaller-bugs
Hello,
syzbot found the following crash on:
HEAD commit: 270b77a0f30e Merge tag 'drm-fixes-2018-10-20-1' of git://a..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=116f4ad9400000
kernel config: https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33
dashboard link: https://syzkaller.appspot.com/bug?extid=a9ac39bf55329e206219
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
Unfortunately, I don't have any reproducer for this crash yet.
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+a9ac39bf55329e206219@syzkaller.appspotmail.com
binder: 12719:12721 got transaction to context manager from process owning
it
binder: BINDER_SET_CONTEXT_MGR already set
binder: 12719:12724 ioctl 40046207 0 returned -16
binder: 12719:12721 BC_ACQUIRE_DONE u0000000000000000 no match
==================================================================
BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:188
[inline]
BUG: KASAN: use-after-free in task_is_descendant.part.2+0x610/0x670
security/yama/yama_lsm.c:295
Read of size 8 at addr ffff8801c4666b20 by task syz-executor3/12722
CPU: 1 PID: 12722 Comm: syz-executor3 Not tainted 4.19.0-rc8+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
__read_once_size include/linux/compiler.h:188 [inline]
task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295
task_is_descendant security/yama/yama_lsm.c:282 [inline]
yama_ptrace_access_check+0x215/0x10fc security/yama/yama_lsm.c:371
security_ptrace_access_check+0x54/0xb0 security/security.c:260
__ptrace_may_access+0x564/0x950 kernel/ptrace.c:331
ptrace_attach+0x1fa/0x640 kernel/ptrace.c:379
__do_sys_ptrace kernel/ptrace.c:1130 [inline]
__se_sys_ptrace kernel/ptrace.c:1110 [inline]
__x64_sys_ptrace+0x229/0x260 kernel/ptrace.c:1110
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f7df12c4c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000065
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 0000000000457569
RDX: 0000000000000000 RSI: 0000000000000265 RDI: 0000000000000010
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f7df12c56d4
R13: 00000000004c30c5 R14: 00000000004d4ab8 R15: 00000000ffffffff
Allocated by task 8920:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
alloc_task_struct_node kernel/fork.c:157 [inline]
dup_task_struct kernel/fork.c:802 [inline]
copy_process+0x1ff4/0x8780 kernel/fork.c:1707
_do_fork+0x1cb/0x11d0 kernel/fork.c:2166
__do_sys_clone kernel/fork.c:2273 [inline]
__se_sys_clone kernel/fork.c:2267 [inline]
__x64_sys_clone+0xbf/0x150 kernel/fork.c:2267
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 18:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kmem_cache_free+0x83/0x290 mm/slab.c:3756
free_task_struct kernel/fork.c:162 [inline]
free_task+0x16e/0x1f0 kernel/fork.c:416
__put_task_struct+0x2e6/0x620 kernel/fork.c:689
put_task_struct include/linux/sched/task.h:96 [inline]
delayed_put_task_struct+0x2ff/0x4c0 kernel/exit.c:181
__rcu_reclaim kernel/rcu/rcu.h:236 [inline]
rcu_do_batch kernel/rcu/tree.c:2576 [inline]
invoke_rcu_callbacks kernel/rcu/tree.c:2880 [inline]
__rcu_process_callbacks kernel/rcu/tree.c:2847 [inline]
rcu_process_callbacks+0xf23/0x2670 kernel/rcu/tree.c:2864
__do_softirq+0x30b/0xad8 kernel/softirq.c:292
The buggy address belongs to the object at ffff8801c4666640
which belongs to the cache task_struct(65:syz3) of size 6080
The buggy address is located 1248 bytes inside of
6080-byte region [ffff8801c4666640, ffff8801c4667e00)
The buggy address belongs to the page:
page:ffffea0007119980 count:1 mapcount:0 mapping:ffff8801c2336800 index:0x0
compound_mapcount: 0
flags: 0x2fffc0000008100(slab|head)
raw: 02fffc0000008100 ffffea00070a0008 ffffea0006080688 ffff8801c2336800
raw: 0000000000000000 ffff8801c4666640 0000000100000001 ffff8801bdb40c00
page dumped because: kasan: bad access detected
page->mem_cgroup:ffff8801bdb40c00
Memory state around the buggy address:
ffff8801c4666a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801c4666a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff8801c4666b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8801c4666b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801c4666c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-21 7:10 KASAN: use-after-free Read in task_is_descendant syzbot @ 2018-10-21 7:12 ` Tetsuo Handa 2018-10-22 9:54 ` Oleg Nesterov 2018-11-10 3:25 ` syzbot 2018-11-10 11:46 ` syzbot 2 siblings, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2018-10-21 7:12 UTC (permalink / raw) To: serge, Oleg Nesterov Cc: syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 2018/10/21 16:10, syzbot wrote: > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:188 [inline] > BUG: KASAN: use-after-free in task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295 > Read of size 8 at addr ffff8801c4666b20 by task syz-executor3/12722 > > CPU: 1 PID: 12722 Comm: syz-executor3 Not tainted 4.19.0-rc8+ #70 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 > print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 > kasan_report_error mm/kasan/report.c:354 [inline] > kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 > __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 > __read_once_size include/linux/compiler.h:188 [inline] > task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295 Do we need to hold write_lock_irq(&tasklist_lock); rather than rcu_read_lock(); when accessing "struct task_struct"->real_parent ? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-21 7:12 ` Tetsuo Handa @ 2018-10-22 9:54 ` Oleg Nesterov 2018-10-22 10:06 ` Tetsuo Handa 2018-10-25 8:19 ` Kees Cook 0 siblings, 2 replies; 31+ messages in thread From: Oleg Nesterov @ 2018-10-22 9:54 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/21, Tetsuo Handa wrote: > > On 2018/10/21 16:10, syzbot wrote: > > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:188 [inline] > > BUG: KASAN: use-after-free in task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295 > > Read of size 8 at addr ffff8801c4666b20 by task syz-executor3/12722 > > > > CPU: 1 PID: 12722 Comm: syz-executor3 Not tainted 4.19.0-rc8+ #70 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:77 [inline] > > dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 > > print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 > > kasan_report_error mm/kasan/report.c:354 [inline] > > kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 > > __read_once_size include/linux/compiler.h:188 [inline] > > task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295 > > Do we need to hold > > write_lock_irq(&tasklist_lock); > > rather than > > rcu_read_lock(); > > when accessing > > "struct task_struct"->real_parent Well, if "task" is stable (can't exit), then I think rcu_dereference(task->real_parent) is fine, we know that ->real_parent did not pass exit_notif() yet. However, task_is_descendant() looks unnecessarily complicated, it could be static int task_is_descendant(struct task_struct *parent, struct task_struct *child) { int rc = 0; struct task_struct *walker; if (!parent || !child) return 0; rcu_read_lock(); for (walker = child; walker->pid; walker = rcu_dereference(walker->real_parent)) if (same_thread_group(parent, walker)) { rc = 1; break; } rcu_read_unlock(); return rc; } And again, I do not know how/if yama ensures that child is rcu-protected, perhaps task_is_descendant() needs to check pid_alive(child) right after rcu_read_lock() ? Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-22 9:54 ` Oleg Nesterov @ 2018-10-22 10:06 ` Tetsuo Handa 2018-10-22 13:46 ` Oleg Nesterov 2018-10-25 8:19 ` Kees Cook 1 sibling, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2018-10-22 10:06 UTC (permalink / raw) To: Oleg Nesterov Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 2018/10/22 18:54, Oleg Nesterov wrote: > On 10/21, Tetsuo Handa wrote: >> >> On 2018/10/21 16:10, syzbot wrote: >>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:188 [inline] >>> BUG: KASAN: use-after-free in task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295 >>> Read of size 8 at addr ffff8801c4666b20 by task syz-executor3/12722 >>> >>> CPU: 1 PID: 12722 Comm: syz-executor3 Not tainted 4.19.0-rc8+ #70 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >>> Call Trace: >>> __dump_stack lib/dump_stack.c:77 [inline] >>> dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 >>> print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 >>> kasan_report_error mm/kasan/report.c:354 [inline] >>> kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 >>> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 >>> __read_once_size include/linux/compiler.h:188 [inline] >>> task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295 >> >> Do we need to hold >> >> write_lock_irq(&tasklist_lock); >> >> rather than >> >> rcu_read_lock(); >> >> when accessing >> >> "struct task_struct"->real_parent > > Well, if "task" is stable (can't exit), then I think > > rcu_dereference(task->real_parent) > > is fine, we know that ->real_parent did not pass exit_notif() yet. OK. > > However, task_is_descendant() looks unnecessarily complicated, it could be > > static int task_is_descendant(struct task_struct *parent, > struct task_struct *child) > { > int rc = 0; > struct task_struct *walker; > > if (!parent || !child) > return 0; > > rcu_read_lock(); > for (walker = child; walker->pid; walker = rcu_dereference(walker->real_parent)) > if (same_thread_group(parent, walker)) { > rc = 1; > break; > } > rcu_read_unlock(); > > return rc; > } > > And again, I do not know how/if yama ensures that child is rcu-protected, perhaps > task_is_descendant() needs to check pid_alive(child) right after rcu_read_lock() ? Since the caller (ptrace() path) called get_task_struct(child), child itself can't be released. Do we still need pid_alive(child) ? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-22 10:06 ` Tetsuo Handa @ 2018-10-22 13:46 ` Oleg Nesterov 2018-10-25 2:15 ` Tetsuo Handa 0 siblings, 1 reply; 31+ messages in thread From: Oleg Nesterov @ 2018-10-22 13:46 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/22, Tetsuo Handa wrote: > > > However, task_is_descendant() looks unnecessarily complicated, it could be > > > > static int task_is_descendant(struct task_struct *parent, > > struct task_struct *child) > > { > > int rc = 0; > > struct task_struct *walker; > > > > if (!parent || !child) > > return 0; > > > > rcu_read_lock(); > > for (walker = child; walker->pid; walker = rcu_dereference(walker->real_parent)) > > if (same_thread_group(parent, walker)) { > > rc = 1; > > break; > > } > > rcu_read_unlock(); > > > > return rc; > > } > > > > And again, I do not know how/if yama ensures that child is rcu-protected, perhaps > > task_is_descendant() needs to check pid_alive(child) right after rcu_read_lock() ? > > Since the caller (ptrace() path) called get_task_struct(child), child itself can't be > released. Do we still need pid_alive(child) ? get_task_struct(child) can only ensure that this task_struct can't be freed. Suppose that this child exits after get_task_struct(), then its real_parent exits too and calls call_rcu(delayed_put_task_struct). Now, when task_is_descendant() is called, rcu_read_lock() can happen after rcu gp, iow child->parent can be already freed/reused/unmapped. We need to ensure that child is still protected by RCU. Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-22 13:46 ` Oleg Nesterov @ 2018-10-25 2:15 ` Tetsuo Handa 2018-10-25 11:13 ` Oleg Nesterov 0 siblings, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2018-10-25 2:15 UTC (permalink / raw) To: Oleg Nesterov Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs Oleg Nesterov wrote: > On 10/22, Tetsuo Handa wrote: > > > And again, I do not know how/if yama ensures that child is rcu-protected, perhaps > > > task_is_descendant() needs to check pid_alive(child) right after rcu_read_lock() ? > > > > Since the caller (ptrace() path) called get_task_struct(child), child itself can't be > > released. Do we still need pid_alive(child) ? > > get_task_struct(child) can only ensure that this task_struct can't be freed. The report says that it is a use-after-free read at walker = rcu_dereference(walker->real_parent); which means that walker was already released. > > Suppose that this child exits after get_task_struct(), then its real_parent exits > too and calls call_rcu(delayed_put_task_struct). > > Now, when task_is_descendant() is called, rcu_read_lock() can happen after rcu gp, > iow child->parent can be already freed/reused/unmapped. > > We need to ensure that child is still protected by RCU. I wonder whether pid_alive() test helps. We can get [ 40.620318] parent or walker is dead. [ 40.624146] tracee is dead. messages using below patch and reproducer. ---------- diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 99cfddd..0d9d786 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -385,6 +385,7 @@ static int ptrace_attach(struct task_struct *task, long request, if (mutex_lock_interruptible(&task->signal->cred_guard_mutex)) goto out; + schedule_timeout_killable(HZ); task_lock(task); retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS); task_unlock(task); diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c index ffda91a..a231ec6 100644 --- a/security/yama/yama_lsm.c +++ b/security/yama/yama_lsm.c @@ -283,6 +283,11 @@ static int task_is_descendant(struct task_struct *parent, return 0; rcu_read_lock(); + if (!pid_alive(parent) || !pid_alive(walker)) { + rcu_read_unlock(); + printk("parent or walker is dead.\n"); + return 0; + } if (!thread_group_leader(parent)) parent = rcu_dereference(parent->group_leader); while (walker->pid > 0) { @@ -315,6 +320,10 @@ static int ptracer_exception_found(struct task_struct *tracer, bool found = false; rcu_read_lock(); + if (!pid_alive(tracee)) { + printk("tracee is dead.\n"); + goto unlock; + } /* * If there's already an active tracing relationship, then make an ---------- ---------- #include <unistd.h> #include <sys/ptrace.h> #include <poll.h> int main(int argc, char *argv[]) { if (fork() == 0) { ptrace(PTRACE_ATTACH, getppid(), NULL, NULL); _exit(0); } poll(NULL, 0, 100); return 0; } ---------- But since "child" has at least one reference, reading "child->real_parent" should be safe. Therefore, I think that bailing out due to pid_is_alive(child) == false (like above patch does) cannot avoid this problem... ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 2:15 ` Tetsuo Handa @ 2018-10-25 11:13 ` Oleg Nesterov 2018-10-25 11:36 ` Kees Cook 2018-10-25 11:47 ` Tetsuo Handa 0 siblings, 2 replies; 31+ messages in thread From: Oleg Nesterov @ 2018-10-25 11:13 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/25, Tetsuo Handa wrote: > > Oleg Nesterov wrote: > > On 10/22, Tetsuo Handa wrote: > > > > And again, I do not know how/if yama ensures that child is rcu-protected, perhaps > > > > task_is_descendant() needs to check pid_alive(child) right after rcu_read_lock() ? > > > > > > Since the caller (ptrace() path) called get_task_struct(child), child itself can't be > > > released. Do we still need pid_alive(child) ? > > > > get_task_struct(child) can only ensure that this task_struct can't be freed. > > The report says that it is a use-after-free read at > > walker = rcu_dereference(walker->real_parent); > > which means that walker was already released. quite possibly I missed something, but I am not sure I understand your concerns... So again, suppose that "child" is already dead. Its task_struct can't be freed, but child->real_parent can point to the already freed memory. This means that the 1st walker = rcu_dereference(walker->real_parent) is fine, this simply reads the child->real_parent pointer, but on the second iteration walker = rcu_dereference(walker->real_parent); reads the alredy freed memory. > I wonder whether pid_alive() test helps. > > We can get > > [ 40.620318] parent or walker is dead. > [ 40.624146] tracee is dead. > > messages using below patch and reproducer. again, I do not understand, this all looks correct... > ---------- > diff --git a/kernel/ptrace.c b/kernel/ptrace.c > index 99cfddd..0d9d786 100644 > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -385,6 +385,7 @@ static int ptrace_attach(struct task_struct *task, long request, > if (mutex_lock_interruptible(&task->signal->cred_guard_mutex)) > goto out; > > + schedule_timeout_killable(HZ); > task_lock(task); > retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS); > task_unlock(task); > diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c > index ffda91a..a231ec6 100644 > --- a/security/yama/yama_lsm.c > +++ b/security/yama/yama_lsm.c > @@ -283,6 +283,11 @@ static int task_is_descendant(struct task_struct *parent, > return 0; > > rcu_read_lock(); > + if (!pid_alive(parent) || !pid_alive(walker)) { > + rcu_read_unlock(); > + printk("parent or walker is dead.\n"); This is what we need to do, except I think we should change yama_ptrace_access_check(). And iiuc parent == current, pid_alive(parent) looks unnecessary. Although we need to check ptracer_exception_found(), may be it needs some changes too. And yes, task_is_descendant() can hit the dead child, if nothing else it can be killed. This can explain the kasan report. > @@ -315,6 +320,10 @@ static int ptracer_exception_found(struct task_struct *tracer, > bool found = false; > > rcu_read_lock(); > + if (!pid_alive(tracee)) { > + printk("tracee is dead.\n"); > + goto unlock; Sure, this is possible too. > But since "child" has at least one reference, reading "child->real_parent" should > be safe. Therefore, I think that bailing out due to pid_is_alive(child) == false > (like above patch does) cannot avoid this problem... Why? OK. Lets ignore ptracer_exception_found() for the moment. Why do you think the patch below can't help? Oleg. --- x/security/yama/yama_lsm.c +++ x/security/yama/yama_lsm.c @@ -368,7 +368,8 @@ static int yama_ptrace_access_check(stru break; case YAMA_SCOPE_RELATIONAL: rcu_read_lock(); - if (!task_is_descendant(current, child) && + if (!pid_alive(child) || + !task_is_descendant(current, child) && !ptracer_exception_found(current, child) && !ns_capable(__task_cred(child)->user_ns, CAP_SYS_PTRACE)) rc = -EPERM; ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 11:13 ` Oleg Nesterov @ 2018-10-25 11:36 ` Kees Cook 2018-10-25 12:05 ` Oleg Nesterov 2018-10-25 11:47 ` Tetsuo Handa 1 sibling, 1 reply; 31+ messages in thread From: Kees Cook @ 2018-10-25 11:36 UTC (permalink / raw) To: Oleg Nesterov Cc: Tetsuo Handa, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs On Thu, Oct 25, 2018 at 12:13 PM, Oleg Nesterov <oleg@redhat.com> wrote: > So again, suppose that "child" is already dead. Its task_struct can't be freed, > but child->real_parent can point to the already freed memory. I can't find a path for "child" to be released. I see task_lock() always called on it before it ends up in Yama. (Well, I can't find the lock for switch_mm(), but I assume that's safe because it's switching to the task.) > This means that the 1st walker = rcu_dereference(walker->real_parent) is fine, > this simply reads the child->real_parent pointer, but on the second iteration > > walker = rcu_dereference(walker->real_parent); > > reads the alredy freed memory. What does rcu_read_lock() protect actually protect here? I thought none of the task structs would be freed until after all rcu_read_unlock() finished. > OK. Lets ignore ptracer_exception_found() for the moment. Why do you think the > patch below can't help? > > Oleg. > > --- x/security/yama/yama_lsm.c > +++ x/security/yama/yama_lsm.c > @@ -368,7 +368,8 @@ static int yama_ptrace_access_check(stru > break; > case YAMA_SCOPE_RELATIONAL: > rcu_read_lock(); > - if (!task_is_descendant(current, child) && > + if (!pid_alive(child) || > + !task_is_descendant(current, child) && > !ptracer_exception_found(current, child) && > !ns_capable(__task_cred(child)->user_ns, CAP_SYS_PTRACE)) > rc = -EPERM; > Hm, documentation there says: * If pid_alive fails, then pointers within the task structure * can be stale and must not be dereferenced. What is the safe pattern for checking vs rcu? -Kees -- Kees Cook ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 11:36 ` Kees Cook @ 2018-10-25 12:05 ` Oleg Nesterov 0 siblings, 0 replies; 31+ messages in thread From: Oleg Nesterov @ 2018-10-25 12:05 UTC (permalink / raw) To: Kees Cook Cc: Tetsuo Handa, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs On 10/25, Kees Cook wrote: > > On Thu, Oct 25, 2018 at 12:13 PM, Oleg Nesterov <oleg@redhat.com> wrote: > > So again, suppose that "child" is already dead. Its task_struct can't be freed, > > but child->real_parent can point to the already freed memory. > > I can't find a path for "child" to be released. I see task_lock() > always called on it before it ends up in Yama. Are you saying that yama_ptrace_access_check() is always called under task_lock(child) ? Yes, it seems so So what? Say, ptrace_attach() can hit a dead task. It should notice this and fail later, but security_ptrace_access_check() is called before that. > > This means that the 1st walker = rcu_dereference(walker->real_parent) is fine, > > this simply reads the child->real_parent pointer, but on the second iteration > > > > walker = rcu_dereference(walker->real_parent); > > > > reads the alredy freed memory. > > What does rcu_read_lock() protect actually protect here? I thought > none of the task structs would be freed until after all > rcu_read_unlock() finished. See another email I sent you a minute ago. Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 11:13 ` Oleg Nesterov 2018-10-25 11:36 ` Kees Cook @ 2018-10-25 11:47 ` Tetsuo Handa 2018-10-25 12:17 ` Oleg Nesterov 1 sibling, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2018-10-25 11:47 UTC (permalink / raw) To: Oleg Nesterov Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 2018/10/25 20:13, Oleg Nesterov wrote: > On 10/25, Tetsuo Handa wrote: >> >> Oleg Nesterov wrote: >>> On 10/22, Tetsuo Handa wrote: >>>>> And again, I do not know how/if yama ensures that child is rcu-protected, perhaps >>>>> task_is_descendant() needs to check pid_alive(child) right after rcu_read_lock() ? >>>> >>>> Since the caller (ptrace() path) called get_task_struct(child), child itself can't be >>>> released. Do we still need pid_alive(child) ? >>> >>> get_task_struct(child) can only ensure that this task_struct can't be freed. >> >> The report says that it is a use-after-free read at >> >> walker = rcu_dereference(walker->real_parent); >> >> which means that walker was already released. > > quite possibly I missed something, but I am not sure I understand your concerns... > > So again, suppose that "child" is already dead. Its task_struct can't be freed, > but child->real_parent can point to the already freed memory. Yes. But if child->real_parent is pointing to the already freed memory, why does pid_alive(child) == true help? > > This means that the 1st walker = rcu_dereference(walker->real_parent) is fine, > this simply reads the child->real_parent pointer, Yes. > but on the second iteration > > walker = rcu_dereference(walker->real_parent); > > reads the alredy freed memory. Yes. > >> I wonder whether pid_alive() test helps. >> >> We can get >> >> [ 40.620318] parent or walker is dead. >> [ 40.624146] tracee is dead. >> >> messages using below patch and reproducer. > > again, I do not understand, this all looks correct... > >> ---------- >> diff --git a/kernel/ptrace.c b/kernel/ptrace.c >> index 99cfddd..0d9d786 100644 >> --- a/kernel/ptrace.c >> +++ b/kernel/ptrace.c >> @@ -385,6 +385,7 @@ static int ptrace_attach(struct task_struct *task, long request, >> if (mutex_lock_interruptible(&task->signal->cred_guard_mutex)) >> goto out; >> >> + schedule_timeout_killable(HZ); >> task_lock(task); >> retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS); >> task_unlock(task); >> diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c >> index ffda91a..a231ec6 100644 >> --- a/security/yama/yama_lsm.c >> +++ b/security/yama/yama_lsm.c >> @@ -283,6 +283,11 @@ static int task_is_descendant(struct task_struct *parent, >> return 0; >> >> rcu_read_lock(); >> + if (!pid_alive(parent) || !pid_alive(walker)) { >> + rcu_read_unlock(); >> + printk("parent or walker is dead.\n"); > > This is what we need to do, except I think we should change yama_ptrace_access_check(). > And iiuc parent == current, pid_alive(parent) looks unnecessary. Although we need to > check ptracer_exception_found(), may be it needs some changes too. There are two task_is_descendant() callers, and one of them is not passing current. > > And yes, task_is_descendant() can hit the dead child, if nothing else it can > be killed. This can explain the kasan report. The kasan is reporting that child->real_parent (or maybe child->real_parent->real_parent or child->real_parent->real_parent->real_parent ...) was pointing to already freed memory, isn't it? How can we check that that pointer is pointing to already freed memory? As soon as walker = rcu_dereference(walker->real_parent); is executed, task_alive(walker) will try to read from already freed memory... ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 11:47 ` Tetsuo Handa @ 2018-10-25 12:17 ` Oleg Nesterov 2018-10-25 13:01 ` Oleg Nesterov 2018-10-25 13:14 ` KASAN: use-after-free Read in task_is_descendant Tetsuo Handa 0 siblings, 2 replies; 31+ messages in thread From: Oleg Nesterov @ 2018-10-25 12:17 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/25, Tetsuo Handa wrote: > > On 2018/10/25 20:13, Oleg Nesterov wrote: > > > > So again, suppose that "child" is already dead. Its task_struct can't be freed, > > but child->real_parent can point to the already freed memory. > > Yes. > > But if child->real_parent is pointing to the already freed memory, > why does pid_alive(child) == true help? Hmm. Because pid_alive(child) == true && child->real_parent is freed must not be possible? As long as we check pid_alive() under rcu_read_lock(). > >> @@ -283,6 +283,11 @@ static int task_is_descendant(struct task_struct *parent, > >> return 0; > >> > >> rcu_read_lock(); > >> + if (!pid_alive(parent) || !pid_alive(walker)) { > >> + rcu_read_unlock(); > >> + printk("parent or walker is dead.\n"); > > > > This is what we need to do, except I think we should change yama_ptrace_access_check(). > > And iiuc parent == current, pid_alive(parent) looks unnecessary. Although we need to > > check ptracer_exception_found(), may be it needs some changes too. > > There are two task_is_descendant() callers, and one of them is not passing current. As I said below, please ignore ptracer_exception_found(), another caller for now, perhaps it needs some changes too. I even have a vague feeling that I have already blamed this function some time ago... > > And yes, task_is_descendant() can hit the dead child, if nothing else it can > > be killed. This can explain the kasan report. > > The kasan is reporting that child->real_parent (or maybe child->real_parent->real_parent > or child->real_parent->real_parent->real_parent ...) was pointing to already freed memory, > isn't it? Yes. and you know, I am all confused. I no longer can understand you :/ > How can we check that that pointer is pointing to already freed memory? As soon as > > walker = rcu_dereference(walker->real_parent); > > is executed, task_alive(walker) will try to read from already freed memory... Of course we should not do it this way. The patch I sent doesn't... Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 12:17 ` Oleg Nesterov @ 2018-10-25 13:01 ` Oleg Nesterov 2018-10-26 16:09 ` Kees Cook 2018-10-25 13:14 ` KASAN: use-after-free Read in task_is_descendant Tetsuo Handa 1 sibling, 1 reply; 31+ messages in thread From: Oleg Nesterov @ 2018-10-25 13:01 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/25, Oleg Nesterov wrote: > > As I said below, please ignore ptracer_exception_found(), another caller for now, > perhaps it needs some changes too. I even have a vague feeling that I have already > blamed this function some time ago... Heh, yes, 3 years ago ;) https://lore.kernel.org/lkml/20150106184427.GA18153@redhat.com/ I can't understand my email today, but note that I tried to point out that task_is_descendant() can dereference the freed mem. And yes, task_is_descendant() is overcompicated for no reason, afaics. Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 13:01 ` Oleg Nesterov @ 2018-10-26 16:09 ` Kees Cook 2018-10-29 12:23 ` Oleg Nesterov 0 siblings, 1 reply; 31+ messages in thread From: Kees Cook @ 2018-10-26 16:09 UTC (permalink / raw) To: Oleg Nesterov Cc: Tetsuo Handa, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs On Thu, Oct 25, 2018 at 2:01 PM, Oleg Nesterov <oleg@redhat.com> wrote: > On 10/25, Oleg Nesterov wrote: >> perhaps it needs some changes too. I even have a vague feeling that I have already >> blamed this function some time ago... > > Heh, yes, 3 years ago ;) > > https://lore.kernel.org/lkml/20150106184427.GA18153@redhat.com/ > > I can't understand my email today, but note that I tried to point out that > task_is_descendant() can dereference the freed mem. Instead of: while (walker->pid > 0) { should it simply be "while (pid_liave(walker)) {"? And add a pid_alive(parent) after rcu_read_lock()? > And yes, task_is_descendant() is overcompicated for no reason, afaics. Yeah, agreed. I'll fix this up. Just to make sure I'm not crazy: the real_parent of all tasks in a thread group are the same, yes? The trouble I was trying to deal with for the complication was where a non-leader thread would add an exception to the checking, and the tasks wouldn't match. (As far as I can see, though, using same_thread_group() should fix it.) -Kees -- Kees Cook ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-26 16:09 ` Kees Cook @ 2018-10-29 12:23 ` Oleg Nesterov 2018-10-29 15:05 ` yama: unsafe usage of ptrace_relation->tracer Oleg Nesterov 0 siblings, 1 reply; 31+ messages in thread From: Oleg Nesterov @ 2018-10-29 12:23 UTC (permalink / raw) To: Kees Cook Cc: Tetsuo Handa, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs On 10/26, Kees Cook wrote: > > On Thu, Oct 25, 2018 at 2:01 PM, Oleg Nesterov <oleg@redhat.com> wrote: > > On 10/25, Oleg Nesterov wrote: > >> perhaps it needs some changes too. I even have a vague feeling that I have already > >> blamed this function some time ago... > > > > Heh, yes, 3 years ago ;) > > > > https://lore.kernel.org/lkml/20150106184427.GA18153@redhat.com/ > > > > I can't understand my email today, but note that I tried to point out that > > task_is_descendant() can dereference the freed mem. > > Instead of: > > while (walker->pid > 0) { > > should it simply be "while (pid_liave(walker)) {"? No, this would be wrong. Probably walker->pid > 0 is not the best check, but we do not need to change it for correctness. > And add a > pid_alive(parent) after rcu_read_lock()? So you too do not read my emails ;) I still think we need a single pid_alive() check and I even sent the patch. Attached again at the end. To clarify, let me repeat that ptracer_exception_found() may need some fixes too, right now I am only talking about task_is_descendant(). > > And yes, task_is_descendant() is overcompicated for no reason, afaics. > > Yeah, agreed. I'll fix this up. I have already posted this code, this is what I think it should do. static int task_is_descendant(struct task_struct *parent, struct task_struct *child) { struct task_struct *walker; for (walker = child; walker->pid; walker = rcu_dereference(walker->real_parent)) { if (same_thread_group(parent, walker)) return 1; } return 0; } This version differs in that I removed the parent/child != NULL at the start and rcu_read_lock(), it should be held by the caller anyway. > Just to make sure I'm not crazy: the > real_parent of all tasks in a thread group are the same, yes? Well, yes and no. So if same_thread_group(t1, t2) == T then same_thread_group(t1->real_parent, t2->real_parent) == T which means that real_parent of all tasks in a thread group is the same _process_. But t1->real_parent and t2->real_parent are not necessarily the same task. Oleg. --- x/security/yama/yama_lsm.c +++ x/security/yama/yama_lsm.c @@ -368,7 +368,8 @@ static int yama_ptrace_access_check(stru break; case YAMA_SCOPE_RELATIONAL: rcu_read_lock(); - if (!task_is_descendant(current, child) && + if (!pid_alive(child) || + !task_is_descendant(current, child) && !ptracer_exception_found(current, child) && !ns_capable(__task_cred(child)->user_ns, CAP_SYS_PTRACE)) rc = -EPERM; ^ permalink raw reply [flat|nested] 31+ messages in thread
* yama: unsafe usage of ptrace_relation->tracer 2018-10-29 12:23 ` Oleg Nesterov @ 2018-10-29 15:05 ` Oleg Nesterov 2019-01-10 11:05 ` Tetsuo Handa 0 siblings, 1 reply; 31+ messages in thread From: Oleg Nesterov @ 2018-10-29 15:05 UTC (permalink / raw) To: Kees Cook Cc: Tetsuo Handa, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs let me change the subject to avoid the confusion with the already confusing disccussion about task_is_descendant(). On 10/29, Oleg Nesterov wrote: > > I still think we need a single pid_alive() check and I even sent the patch. > Attached again at the end. > > To clarify, let me repeat that ptracer_exception_found() may need some fixes > too, right now I am only talking about task_is_descendant(). so yes, the ptracer_relations code looks very broken to me, but perhaps I misread this code, please correct me. RCU can only protect the ptracer_relations list itself, you can do nothing with (say) relation->tracer. relation->tracer can be already freed when ptracer_exception_found() checks relation->tracee == tracee. Not only pid_alive(parent) can not help in this case, pid_alive(parent) is equally unsafe because, again, this memory can be freed. security_task_free(tsk) is called right before free_task(tsk), there is no a gp pass in between, and of course we can't rely on the ->invalid check. _At first glance_ we can fix this if we simply turn both ->tracer/tracee pointers into "signal_struct *", then we can turn all same_thread_group()'s into walker->signal == parent which doesn't need to dereference the possibly- freed parent. This also allows to remove all thread_group_leader() checks. We need to ensure that false-positive is not possible (if, say, ->tracer was already re-allocated and points to another task->signal), but this doesn't look difficult. Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: yama: unsafe usage of ptrace_relation->tracer 2018-10-29 15:05 ` yama: unsafe usage of ptrace_relation->tracer Oleg Nesterov @ 2019-01-10 11:05 ` Tetsuo Handa 2019-01-10 18:47 ` Kees Cook 2019-01-16 17:40 ` Oleg Nesterov 0 siblings, 2 replies; 31+ messages in thread From: Tetsuo Handa @ 2019-01-10 11:05 UTC (permalink / raw) To: Oleg Nesterov, Kees Cook Cc: Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs Hello, Kees. syzbot is hitting this problem as of linux-next-20190110. When a patch will be proposed? On 2018/10/30 0:05, Oleg Nesterov wrote: > let me change the subject to avoid the confusion with the already confusing > disccussion about task_is_descendant(). > > On 10/29, Oleg Nesterov wrote: >> >> I still think we need a single pid_alive() check and I even sent the patch. >> Attached again at the end. >> >> To clarify, let me repeat that ptracer_exception_found() may need some fixes >> too, right now I am only talking about task_is_descendant(). > > so yes, the ptracer_relations code looks very broken to me, but perhaps I > misread this code, please correct me. > > RCU can only protect the ptracer_relations list itself, you can do nothing > with (say) relation->tracer. relation->tracer can be already freed when > ptracer_exception_found() checks relation->tracee == tracee. > > Not only pid_alive(parent) can not help in this case, pid_alive(parent) is > equally unsafe because, again, this memory can be freed. > > security_task_free(tsk) is called right before free_task(tsk), there is no > a gp pass in between, and of course we can't rely on the ->invalid check. > > _At first glance_ we can fix this if we simply turn both ->tracer/tracee > pointers into "signal_struct *", then we can turn all same_thread_group()'s > into walker->signal == parent which doesn't need to dereference the possibly- > freed parent. This also allows to remove all thread_group_leader() checks. > We need to ensure that false-positive is not possible (if, say, ->tracer > was already re-allocated and points to another task->signal), but this > doesn't look difficult. > > Oleg. > > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: yama: unsafe usage of ptrace_relation->tracer 2019-01-10 11:05 ` Tetsuo Handa @ 2019-01-10 18:47 ` Kees Cook 2019-01-16 17:40 ` Oleg Nesterov 1 sibling, 0 replies; 31+ messages in thread From: Kees Cook @ 2019-01-10 18:47 UTC (permalink / raw) To: Tetsuo Handa Cc: Oleg Nesterov, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs On Thu, Jan 10, 2019 at 3:06 AM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote: > syzbot is hitting this problem as of linux-next-20190110. > When a patch will be proposed? Hi! Sorry, this got delayed over the holidays. Let me finish the patch I was working on and get it published. Thanks! -- Kees Cook ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: yama: unsafe usage of ptrace_relation->tracer 2019-01-10 11:05 ` Tetsuo Handa 2019-01-10 18:47 ` Kees Cook @ 2019-01-16 17:40 ` Oleg Nesterov 1 sibling, 0 replies; 31+ messages in thread From: Oleg Nesterov @ 2019-01-16 17:40 UTC (permalink / raw) To: Tetsuo Handa Cc: Kees Cook, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs On 01/10, Tetsuo Handa wrote: > > syzbot is hitting this problem as of linux-next-20190110. > When a patch will be proposed? Well. I have already suggested the patch below several times. It won't fix all problems in this code (I forgot the details but iirc ptracer_exception_found() is broken too, task_is_descendant() should be cleanuped, may be something else), but probably this trivial fix makes sense for the start? Oleg. --- x/security/yama/yama_lsm.c +++ x/security/yama/yama_lsm.c @@ -368,7 +368,8 @@ static int yama_ptrace_access_check(stru break; case YAMA_SCOPE_RELATIONAL: rcu_read_lock(); - if (!task_is_descendant(current, child) && + if (!pid_alive(child) || + !task_is_descendant(current, child) && !ptracer_exception_found(current, child) && !ns_capable(__task_cred(child)->user_ns, CAP_SYS_PTRACE)) rc = -EPERM; ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 12:17 ` Oleg Nesterov 2018-10-25 13:01 ` Oleg Nesterov @ 2018-10-25 13:14 ` Tetsuo Handa 2018-10-25 15:55 ` Oleg Nesterov 1 sibling, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2018-10-25 13:14 UTC (permalink / raw) To: Oleg Nesterov Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 2018/10/25 21:17, Oleg Nesterov wrote: >>> And yes, task_is_descendant() can hit the dead child, if nothing else it can >>> be killed. This can explain the kasan report. >> >> The kasan is reporting that child->real_parent (or maybe child->real_parent->real_parent >> or child->real_parent->real_parent->real_parent ...) was pointing to already freed memory, >> isn't it? > > Yes. and you know, I am all confused. I no longer can understand you :/ Why don't we need to check every time like shown below? Why checking only once is sufficient? --- a/security/yama/yama_lsm.c +++ b/security/yama/yama_lsm.c @@ -285,7 +285,7 @@ static int task_is_descendant(struct task_struct *parent, rcu_read_lock(); if (!thread_group_leader(parent)) parent = rcu_dereference(parent->group_leader); - while (walker->pid > 0) { + while (pid_alive(walker) && walker->pid > 0) { if (!thread_group_leader(walker)) walker = rcu_dereference(walker->group_leader); if (walker == parent) { ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 13:14 ` KASAN: use-after-free Read in task_is_descendant Tetsuo Handa @ 2018-10-25 15:55 ` Oleg Nesterov 2018-10-25 16:25 ` Oleg Nesterov 2018-10-26 12:23 ` Tetsuo Handa 0 siblings, 2 replies; 31+ messages in thread From: Oleg Nesterov @ 2018-10-25 15:55 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/25, Tetsuo Handa wrote: > > On 2018/10/25 21:17, Oleg Nesterov wrote: > >>> And yes, task_is_descendant() can hit the dead child, if nothing else it can > >>> be killed. This can explain the kasan report. > >> > >> The kasan is reporting that child->real_parent (or maybe child->real_parent->real_parent > >> or child->real_parent->real_parent->real_parent ...) was pointing to already freed memory, > >> isn't it? > > > > Yes. and you know, I am all confused. I no longer can understand you :/ > > Why don't we need to check every time like shown below? > Why checking only once is sufficient? Why do you think it is not sufficient? Again, I can be easily wrong, rcu is not simple, but so far I think we need a single check at the start. > --- a/security/yama/yama_lsm.c > +++ b/security/yama/yama_lsm.c > @@ -285,7 +285,7 @@ static int task_is_descendant(struct task_struct *parent, > rcu_read_lock(); > if (!thread_group_leader(parent)) > parent = rcu_dereference(parent->group_leader); > - while (walker->pid > 0) { > + while (pid_alive(walker) && walker->pid > 0) { OK. To simplify, ets suppose that task_is_descendant() is called with tasklist lock held. And lets suppose that all tasks are single-threaded. Then we obviously need a single check at the start, we need to ensure that the child was not removed from its ->real_parent->children list. The latter means that if ->real_parent exits, the child will be re-parented and its ->real_parent will be updated. So we could do read_lock(tasklist); if (list_empty(child->sibling)) // it is dead, removed from ->children list, we can't trust // child->real_parent return -EWHATEVER; task_is_descendant(current, child); But note that we can safely use pid_alive(child) instead, detach_pid() and list_del_init(&p->sibling) happen "at the same time" since we hold tasklist. (And btw, I suggested several times to rename it, or add another helper with a better name. Note also that we could check, say, ->sighand != NULL with the same effect.) Now. Why do you think rcu_read_lock() differs in that we need to check pid_alive() at every step? Suppose that one of the grand parents exits, and it is going to be freed. Again, to (over)simplify the things, lets suppose that release_task() does synchronize_rcu(); free_task(p); at the end. Now, can rcu_read_lock(); if (pid_alive(child)) { while (child->pid) child = child->real_parent; } rcu_read_unlock(); hit the already freed ->real_parent ? Say, the freed child->real_parent->real_parent. Lets denote P1 = child->real_parent, P2 = P1->real_parent. Can P2 be already freed? This is only possible if synchronize_rcu() above was called before rcu_read_lock(), see the last sentence below. If P1->real_parent is still P2, then P1 has already exited too. And we still observe that child->real_parent == P1, this too is only possible if child has exited, so we must see pid_alive() == F. Why must we see pid_alive() == F without tasklist? It must be true, release_task() is serialized by tasklist_lock, but why we can't get the stale value under rcu_read_lock() ? Because our rcu read-lock critical section extends beyond the return from synchronize_rcu(), and thus we must have a full memory barrier _between_ that synchronize_rcu() and our rcu_read_lock(). We must see all memory updates, including thread_pid = NULL which makes pid_alive() == F. Do you see any hole? Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 15:55 ` Oleg Nesterov @ 2018-10-25 16:25 ` Oleg Nesterov 2018-10-26 12:23 ` Tetsuo Handa 1 sibling, 0 replies; 31+ messages in thread From: Oleg Nesterov @ 2018-10-25 16:25 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/25, Oleg Nesterov wrote: > > Because our rcu read-lock critical section extends beyond the return from > synchronize_rcu(), and thus we must have a full memory barrier _between_ > that synchronize_rcu() and our rcu_read_lock(). We must see all memory updates, > including thread_pid = NULL which makes pid_alive() == F. In case I was not clear.... Suppose we have int X = 0. If some CPU does X = 1; synchronize_rcu(); and another CPU does rcu_read_lock(); x = X; rcu_read_unlock(); then x should be == 1 in case when rcu_read_unlock() happens _after_ return from synchronize_rcu(). Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 15:55 ` Oleg Nesterov 2018-10-25 16:25 ` Oleg Nesterov @ 2018-10-26 12:23 ` Tetsuo Handa 2018-10-26 13:04 ` Oleg Nesterov 1 sibling, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2018-10-26 12:23 UTC (permalink / raw) To: Oleg Nesterov Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 2018/10/26 0:55, Oleg Nesterov wrote: > On 10/25, Tetsuo Handa wrote: >> >> On 2018/10/25 21:17, Oleg Nesterov wrote: >>>>> And yes, task_is_descendant() can hit the dead child, if nothing else it can >>>>> be killed. This can explain the kasan report. >>>> >>>> The kasan is reporting that child->real_parent (or maybe child->real_parent->real_parent >>>> or child->real_parent->real_parent->real_parent ...) was pointing to already freed memory, >>>> isn't it? >>> >>> Yes. and you know, I am all confused. I no longer can understand you :/ >> >> Why don't we need to check every time like shown below? >> Why checking only once is sufficient? > > Why do you think it is not sufficient? > > Again, I can be easily wrong, rcu is not simple, but so far I think we need > a single check at the start. > Hmm, this report is difficult to guess what happened. Since the "child" passed to task_is_descendant() has at least one reference count taken by find_get_task_by_vpid(), rcu_dereference(walker->real_parent) in the first iteration while (child->pid > 0) { if (!thread_group_leader(child)) walker = rcu_dereference(child->group_leader); if (walker == parent) { rc = 1; break; } walker = rcu_dereference(walker->real_parent); } must not trigger use-after-free bug. Thus, when this use-after-free was detected at rcu_dereference(walker->real_parent), the memory pointed by "walker" must have been released between while (walker->pid > 0) { if (!thread_group_leader(walker)) walker = rcu_dereference(walker->group_leader); and walker = rcu_dereference(walker->real_parent); } because otherwise use-after-free would have been reported at walker->pid or thread_group_leader(walker) or rcu_dereference(walker->group_leader). Is my understanding correct? Then, what pid_alive(child) is testing? It is not memory pointed by "child" but memory pointed by "walker" (i.e. parent of "child" or parent of parent of "child" or ... ) which is triggering use-after-free. Suppose p1 == p2->real_parent and p2 == p3->real_parent, and p1 exited when p2 tried to attach on p1, p2->real_parent was pointing to already (or about to be) freed p1. Even if pid_alive(p2) test can guarantee that p1 won't be released, how can pid_alive(p3) test guarantee that p1 won't be released? p1 can be released any moment because it has already waited for RCU grace period, can't it? ptrace(PTRACE_ATTACH, vpid_of_p2) { p2 = find_get_task_by_vpid(vpid_of_p2); ptrace_attach(p2, PTRACE_ATTACH, addr, data) { mutex_lock_interruptible(&p2->signal->cred_guard_mutex); // p1 starts exit()ing here. task_lock(p2); __ptrace_may_access(p2) { // p2->real_parent starts pointing to already freed p1. security_ptrace_access_check(p2, PTRACE_MODE_ATTACH) { yama_ptrace_access_check() { task_is_descendant(current, p2) { walker = p2; rcu_read_lock(); if (pid_alive(p2)) { // If true if (p2->pid > 0) { // will be true p1 = rcu_dereference(p2->real_parent); // might be OK due to pid_alive(p2) == true? } } rcu_read_unlock(); } } } } task_unlock(p2); mutex_unlock(&p2->signal->cred_guard_mutex); } put_task_struct(p2); } ptrace(PTRACE_ATTACH, vpid_of_p3) { p3 = find_get_task_by_vpid(vpid_of_p3); ptrace_attach(p3, PTRACE_ATTACH, addr, data) { mutex_lock_interruptible(&p3->signal->cred_guard_mutex); // p1 starts exit()ing here. task_lock(p3); __ptrace_may_access(p3) { // p2->real_parent starts pointing to already freed p1. security_ptrace_access_check(p3, PTRACE_MODE_ATTACH) { yama_ptrace_access_check() { task_is_descendant(current, p3) { walker = p3; rcu_read_lock(); if (pid_alive(p3)) { // If true if (p3->pid > 0) { // will be true p2 = rcu_dereference(p3->real_parent); // will be OK if above assumption is OK. if (p2->pid > 0) { // will be true p1 = rcu_dereference(p2->real_parent); // will read already (or about to be) freed p1 address if (p1->pid > 0) { // Oops here or if (!thread_group_leader(p1)) // oops here or p1 = rcu_dereference(p1->group_leader); // oops here or p0 = rcu_dereference(p1->real_parent); // oops here, or not oops because releasing after this } } } } rcu_read_unlock(); } } } } task_unlock(p3); mutex_unlock(&p3->signal->cred_guard_mutex); } put_task_struct(p3); } ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-26 12:23 ` Tetsuo Handa @ 2018-10-26 13:04 ` Oleg Nesterov 2018-10-26 13:51 ` Tetsuo Handa 0 siblings, 1 reply; 31+ messages in thread From: Oleg Nesterov @ 2018-10-26 13:04 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/26, Tetsuo Handa wrote: > > Since the "child" passed to task_is_descendant() has at least one reference > count taken by find_get_task_by_vpid(), rcu_dereference(walker->real_parent) > in the first iteration > > while (child->pid > 0) { > if (!thread_group_leader(child)) > walker = rcu_dereference(child->group_leader); > if (walker == parent) { > rc = 1; > break; > } > walker = rcu_dereference(walker->real_parent); > } > > must not trigger use-after-free bug. Yes, > Thus, when this use-after-free was > detected at rcu_dereference(walker->real_parent), the memory pointed by > "walker" must have been released between > > while (walker->pid > 0) { > if (!thread_group_leader(walker)) > walker = rcu_dereference(walker->group_leader); > > and > > walker = rcu_dereference(walker->real_parent); > } this is possible too, rcu callback which frees the task_struct can run int between > because otherwise use-after-free would have been reported at walker->pid > or thread_group_leader(walker) or rcu_dereference(walker->group_leader). Well, I do not know how much kasan is "percise", but if it is "perfect" then you are right, > Then, what pid_alive(child) is testing? I tried to explain this in my previous email. I even mentioned that we could do another check, say, ->sighand != NULL, or list_empty(child->sibling) with the same effect. > Suppose p1 == p2->real_parent and p2 == p3->real_parent, and p1 exited > when p2 tried to attach on p1, p2->real_parent was pointing to already > (or about to be) freed p1. No, p2->real_parent will be updated. If p1 exits it will re-parent its children including p2. Again, did you read my previous email? Let me repeat, of course I can be wrong. But so far I am answering the same questions again and again... Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-26 13:04 ` Oleg Nesterov @ 2018-10-26 13:51 ` Tetsuo Handa 2018-10-26 14:39 ` Oleg Nesterov 0 siblings, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2018-10-26 13:51 UTC (permalink / raw) To: Oleg Nesterov Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 2018/10/26 22:04, Oleg Nesterov wrote: >> Suppose p1 == p2->real_parent and p2 == p3->real_parent, and p1 exited >> when p2 tried to attach on p1, p2->real_parent was pointing to already >> (or about to be) freed p1. > > No, p2->real_parent will be updated. If p1 exits it will re-parent its > children including p2. My error. Suppose p1 == p2->real_parent and p2 == p3->real_parent, and p1 exited when someone tried to attach on p2, p2->real_parent was pointing to already (or about to be) freed p1. So, the puzzle part is why p2->real_parent was still pointing p1 even after p1 was freed... > > Again, did you read my previous email? Yes. But I still can't be convinced that pid_alive() test helps. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-26 13:51 ` Tetsuo Handa @ 2018-10-26 14:39 ` Oleg Nesterov 2018-10-26 15:04 ` Tetsuo Handa 0 siblings, 1 reply; 31+ messages in thread From: Oleg Nesterov @ 2018-10-26 14:39 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/26, Tetsuo Handa wrote: > > On 2018/10/26 22:04, Oleg Nesterov wrote: > >> Suppose p1 == p2->real_parent and p2 == p3->real_parent, and p1 exited > >> when p2 tried to attach on p1, p2->real_parent was pointing to already > >> (or about to be) freed p1. > > > > No, p2->real_parent will be updated. If p1 exits it will re-parent its > > children including p2. > > My error. > > Suppose p1 == p2->real_parent and p2 == p3->real_parent, and p1 exited > when someone tried to attach on p2, p2->real_parent was pointing to already > (or about to be) freed p1. I don't see a difference. If p1 exits it will re-parent p2, p2->real_parent will be updated. > So, the puzzle part is why p2->real_parent was still pointing p1 even after > p1 was freed... I don't understand the question. Once again. TASK->real_parent can point to the freed mem only if a) TASK exits, and b) _after_ that its parent TASK->real_parent exits too. > > Again, did you read my previous email? > > Yes. But I still can't be convinced that pid_alive() test helps. Well, I don't understand which part of my explanations is not clear to you. OR. Perhaps I am wrong and do not understand your concerns. Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-26 14:39 ` Oleg Nesterov @ 2018-10-26 15:04 ` Tetsuo Handa 2018-10-26 15:22 ` Oleg Nesterov 0 siblings, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2018-10-26 15:04 UTC (permalink / raw) To: Oleg Nesterov Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 2018/10/26 23:39, Oleg Nesterov wrote: > On 10/26, Tetsuo Handa wrote: >> Suppose p1 == p2->real_parent and p2 == p3->real_parent, and p1 exited >> when someone tried to attach on p2, p2->real_parent was pointing to already >> (or about to be) freed p1. > > I don't see a difference. > > If p1 exits it will re-parent p2, p2->real_parent will be updated. > >> So, the puzzle part is why p2->real_parent was still pointing p1 even after >> p1 was freed... > > I don't understand the question. > > Once again. TASK->real_parent can point to the freed mem only if a) TASK exits, > and b) _after_ that its parent TASK->real_parent exits too. Oh, p2 exited and then p1 also exited when someone tried to attach on p2. Then, p2->real_parent can point to already (or about to be) freed p1. > >>> Again, did you read my previous email? >> >> Yes. But I still can't be convinced that pid_alive() test helps. > > Well, I don't understand which part of my explanations is not clear to you. OK. Checking pid_alive() should help. (By the way, if p->real_parent were updated to point to init_task when p exits, we could omit pid_alive() check?) ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-26 15:04 ` Tetsuo Handa @ 2018-10-26 15:22 ` Oleg Nesterov 0 siblings, 0 replies; 31+ messages in thread From: Oleg Nesterov @ 2018-10-26 15:22 UTC (permalink / raw) To: Tetsuo Handa Cc: serge, syzbot, jmorris, keescook, linux-kernel, linux-security-module, syzkaller-bugs On 10/27, Tetsuo Handa wrote: > > On 2018/10/26 23:39, Oleg Nesterov wrote: > > On 10/26, Tetsuo Handa wrote: > >> Suppose p1 == p2->real_parent and p2 == p3->real_parent, and p1 exited > >> when someone tried to attach on p2, p2->real_parent was pointing to already > >> (or about to be) freed p1. > > > > I don't see a difference. > > > > If p1 exits it will re-parent p2, p2->real_parent will be updated. > > > >> So, the puzzle part is why p2->real_parent was still pointing p1 even after > >> p1 was freed... > > > > I don't understand the question. > > > > Once again. TASK->real_parent can point to the freed mem only if a) TASK exits, > > and b) _after_ that its parent TASK->real_parent exits too. > > Oh, p2 exited and then p1 also exited when someone tried to attach on p2. > Then, p2->real_parent can point to already (or about to be) freed p1. Then we must see pid_alive(p2) == F as I already explained you in my yesterday's email. Because if p1 exits _after_ p2, then pid_alive(p2) == F must be already completed, p1 can't exit (I mean, release_task(p1) can't be called) until release_task(p2) does detach_pid() and drops tasklist_lock. > (By the way, if p->real_parent were updated to point to init_task when p exits, > we could omit pid_alive() check?) Sorry, I don't understand the question... Ignoring the PR_SET_CHILD_SUBREAPER parents, ignoring the exiting sub-threads which reparent to group leader, ->real_parent is alwasy updated to point to init_task. But I don't see why we could omit pid_alive() check. Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-22 9:54 ` Oleg Nesterov 2018-10-22 10:06 ` Tetsuo Handa @ 2018-10-25 8:19 ` Kees Cook 2018-10-25 11:52 ` Oleg Nesterov 1 sibling, 1 reply; 31+ messages in thread From: Kees Cook @ 2018-10-25 8:19 UTC (permalink / raw) To: Oleg Nesterov Cc: Tetsuo Handa, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs On Mon, Oct 22, 2018 at 10:54 AM, Oleg Nesterov <oleg@redhat.com> wrote: > On 10/21, Tetsuo Handa wrote: >> >> On 2018/10/21 16:10, syzbot wrote: >> > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:188 [inline] >> > BUG: KASAN: use-after-free in task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295 >> > Read of size 8 at addr ffff8801c4666b20 by task syz-executor3/12722 >> > >> > CPU: 1 PID: 12722 Comm: syz-executor3 Not tainted 4.19.0-rc8+ #70 >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >> > Call Trace: >> > __dump_stack lib/dump_stack.c:77 [inline] >> > dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 >> > print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 >> > kasan_report_error mm/kasan/report.c:354 [inline] >> > kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 >> > __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 >> > __read_once_size include/linux/compiler.h:188 [inline] >> > task_is_descendant.part.2+0x610/0x670 security/yama/yama_lsm.c:295 >> >> Do we need to hold >> >> write_lock_irq(&tasklist_lock); >> >> rather than >> >> rcu_read_lock(); >> >> when accessing >> >> "struct task_struct"->real_parent > > Well, if "task" is stable (can't exit), then I think > > rcu_dereference(task->real_parent) > > is fine, we know that ->real_parent did not pass exit_notif() yet. > > However, task_is_descendant() looks unnecessarily complicated, it could be > > static int task_is_descendant(struct task_struct *parent, > struct task_struct *child) > { > int rc = 0; > struct task_struct *walker; > > if (!parent || !child) > return 0; > > rcu_read_lock(); > for (walker = child; walker->pid; walker = rcu_dereference(walker->real_parent)) > if (same_thread_group(parent, walker)) { > rc = 1; > break; > } > rcu_read_unlock(); > > return rc; > } > > And again, I do not know how/if yama ensures that child is rcu-protected, perhaps > task_is_descendant() needs to check pid_alive(child) right after rcu_read_lock() ? task_is_descendant() is called under rcu_read_lock() in both ptracer_exception_found() and yama_ptrace_access_check() so I don't understand how any of the tasks could get freed? This is walking group_leader and real_parent -- are these not stable under rcu_lock()? -- Kees Cook ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-25 8:19 ` Kees Cook @ 2018-10-25 11:52 ` Oleg Nesterov 0 siblings, 0 replies; 31+ messages in thread From: Oleg Nesterov @ 2018-10-25 11:52 UTC (permalink / raw) To: Kees Cook Cc: Tetsuo Handa, Serge E. Hallyn, syzbot, James Morris, LKML, linux-security-module, syzkaller-bugs On 10/25, Kees Cook wrote: > > task_is_descendant() is called under rcu_read_lock() in both > ptracer_exception_found() and yama_ptrace_access_check() so I don't > understand how any of the tasks could get freed? This is walking > group_leader and real_parent -- are these not stable under rcu_lock()? group_leader/real_parent/etc are no longer rcu-protected after the exiting child calls release_task() which in particular removes the child from children/thread_group lists. OK. Suppose you have an rcu-protected list, and each element also has a reference counter so you can do something struct elt { atomic_t ctr; struct list_head list; int pid; }; rcu_read_lock(); list_for_each_entry(elt, &LIST, list) { if (elt->pid == 100) { atomic_inc(&elt->ctr); // get_task_struct() break; } } rcu_read_unlock(); do_something(elt); This code is fine. This elt can't be freed, you have a reference. But once you drop rcu lock you can't trust elt->list.next! So, for example, you can not do rcu_read_lock(); list_for_each_entry_continue_rcu(elt, &LIST, list) { ... } rcu_read_unlock(); too late, elt.list.next can be already freed, or it can be freed while you iterate the list. Another simple example. Suppose you have a global PTR protected by rcu. So ignoring the necessary rcu_dereference this code is fine: rcu_read_lock(); if (ptr = PTR) do_something(ptr); rcu_read_unlcok(); But this is not: ptr = PTR; rcu_read_lock(); if (ptr) do_something(ptr); rcu_read_unlock(); basically the same thing... Oleg. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-21 7:10 KASAN: use-after-free Read in task_is_descendant syzbot 2018-10-21 7:12 ` Tetsuo Handa @ 2018-11-10 3:25 ` syzbot 2018-11-10 11:46 ` syzbot 2 siblings, 0 replies; 31+ messages in thread From: syzbot @ 2018-11-10 3:25 UTC (permalink / raw) To: jmorris, keescook, linux-kernel, linux-security-module, oleg, penguin-kernel, penguin-kernel, serge, syzkaller-bugs syzbot has found a reproducer for the following crash on: HEAD commit: 3541833fd1f2 Merge tag 's390-4.20-2' of git://git.kernel.o.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=16d3cad5400000 kernel config: https://syzkaller.appspot.com/x/.config?x=8f559fee2fc3375a dashboard link: https://syzkaller.appspot.com/bug?extid=a9ac39bf55329e206219 compiler: gcc (GCC) 8.0.1 20180413 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1494bb0b400000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+a9ac39bf55329e206219@syzkaller.appspotmail.com IPv6: ADDRCONF(NETDEV_UP): veth1: link is not ready IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready 8021q: adding VLAN 0 to HW filter on device team0 ================================================================== BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline] BUG: KASAN: use-after-free in task_is_descendant.part.3+0x610/0x670 security/yama/yama_lsm.c:295 Read of size 8 at addr ffff8801c46f24e0 by task syz-executor2/9973 CPU: 0 PID: 9973 Comm: syz-executor2 Not tainted 4.20.0-rc1+ #107 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x39d lib/dump_stack.c:113 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __read_once_size include/linux/compiler.h:182 [inline] task_is_descendant.part.3+0x610/0x670 security/yama/yama_lsm.c:295 task_is_descendant security/yama/yama_lsm.c:282 [inline] yama_ptrace_access_check+0x215/0x10fc security/yama/yama_lsm.c:371 security_ptrace_access_check+0x54/0xb0 security/security.c:271 __ptrace_may_access+0x5c8/0x980 kernel/ptrace.c:336 ptrace_attach+0x1fa/0x640 kernel/ptrace.c:389 __do_sys_ptrace kernel/ptrace.c:1139 [inline] __se_sys_ptrace kernel/ptrace.c:1119 [inline] __x64_sys_ptrace+0x229/0x260 kernel/ptrace.c:1119 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457569 Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2ed4dbfc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000065 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000000457569 RDX: 0000000000000000 RSI: 000000000000021d RDI: 0000000000004206 RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2ed4dc06d4 R13: 00000000004c33bd R14: 00000000004d50e0 R15: 00000000ffffffff Allocated by task 5873: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644 alloc_task_struct_node kernel/fork.c:158 [inline] dup_task_struct kernel/fork.c:843 [inline] copy_process+0x2026/0x87a0 kernel/fork.c:1751 _do_fork+0x1cb/0x11d0 kernel/fork.c:2216 __do_sys_clone kernel/fork.c:2323 [inline] __se_sys_clone kernel/fork.c:2317 [inline] __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 9: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x83/0x290 mm/slab.c:3760 free_task_struct kernel/fork.c:163 [inline] free_task+0x16e/0x1f0 kernel/fork.c:457 __put_task_struct+0x2e6/0x620 kernel/fork.c:730 put_task_struct include/linux/sched/task.h:96 [inline] delayed_put_task_struct+0x2ff/0x4c0 kernel/exit.c:181 __rcu_reclaim kernel/rcu/rcu.h:240 [inline] rcu_do_batch kernel/rcu/tree.c:2437 [inline] invoke_rcu_callbacks kernel/rcu/tree.c:2716 [inline] rcu_process_callbacks+0x100a/0x1ac0 kernel/rcu/tree.c:2697 __do_softirq+0x308/0xb7e kernel/softirq.c:292 The buggy address belongs to the object at ffff8801c46f2000 which belongs to the cache task_struct(65:syz2) of size 6080 The buggy address is located 1248 bytes inside of 6080-byte region [ffff8801c46f2000, ffff8801c46f37c0) The buggy address belongs to the page: page:ffffea000711bc80 count:1 mapcount:0 mapping:ffff8801c204fc80 index:0x0 compound_mapcount: 0 flags: 0x2fffc0000010200(slab|head) raw: 02fffc0000010200 ffffea00073cf808 ffffea000697c908 ffff8801c204fc80 raw: 0000000000000000 ffff8801c46f2000 0000000100000001 ffff8801d8404a80 page dumped because: kasan: bad access detected page->mem_cgroup:ffff8801d8404a80 Memory state around the buggy address: ffff8801c46f2380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801c46f2400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff8801c46f2480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8801c46f2500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801c46f2580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: KASAN: use-after-free Read in task_is_descendant 2018-10-21 7:10 KASAN: use-after-free Read in task_is_descendant syzbot 2018-10-21 7:12 ` Tetsuo Handa 2018-11-10 3:25 ` syzbot @ 2018-11-10 11:46 ` syzbot 2 siblings, 0 replies; 31+ messages in thread From: syzbot @ 2018-11-10 11:46 UTC (permalink / raw) To: jmorris, keescook, linux-kernel, linux-security-module, oleg, penguin-kernel, penguin-kernel, serge, syzkaller-bugs syzbot has found a reproducer for the following crash on: HEAD commit: aa4330e15c26 Merge tag 'devicetree-fixes-for-4.20-2' of gi.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=17abaa47400000 kernel config: https://syzkaller.appspot.com/x/.config?x=8f559fee2fc3375a dashboard link: https://syzkaller.appspot.com/bug?extid=a9ac39bf55329e206219 compiler: gcc (GCC) 8.0.1 20180413 (experimental) userspace arch: i386 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=108ae6d5400000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17bd86a3400000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+a9ac39bf55329e206219@syzkaller.appspotmail.com ================================================================== BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline] BUG: KASAN: use-after-free in task_is_descendant.part.3+0x610/0x670 security/yama/yama_lsm.c:295 Read of size 8 at addr ffff8801ae718560 by task syz-executor198/4607 CPU: 1 PID: 4607 Comm: syz-executor198 Not tainted 4.20.0-rc1+ #232 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x39d lib/dump_stack.c:113 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __read_once_size include/linux/compiler.h:182 [inline] task_is_descendant.part.3+0x610/0x670 security/yama/yama_lsm.c:295 task_is_descendant security/yama/yama_lsm.c:282 [inline] yama_ptrace_access_check+0x215/0x10fc security/yama/yama_lsm.c:371 security_ptrace_access_check+0x54/0xb0 security/security.c:271 __ptrace_may_access+0x5c8/0x980 kernel/ptrace.c:336 ptrace_attach+0x1fa/0x640 kernel/ptrace.c:389 __do_compat_sys_ptrace kernel/ptrace.c:1284 [inline] __se_compat_sys_ptrace kernel/ptrace.c:1266 [inline] __ia32_compat_sys_ptrace+0x1d2/0x260 kernel/ptrace.c:1266 do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline] do_fast_syscall_32+0x34d/0xfb2 arch/x86/entry/common.c:397 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7feba29 Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000f7fe71ec EFLAGS: 00000296 ORIG_RAX: 000000000000001a RAX: ffffffffffffffda RBX: 0000000000004206 RCX: 00000000000014b5 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Allocated by task 5668: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644 alloc_task_struct_node kernel/fork.c:158 [inline] dup_task_struct kernel/fork.c:843 [inline] copy_process+0x2026/0x87a0 kernel/fork.c:1751 _do_fork+0x1cb/0x11d0 kernel/fork.c:2216 __do_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:240 [inline] __se_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:236 [inline] __ia32_compat_sys_x86_clone+0xbc/0x140 arch/x86/ia32/sys_ia32.c:236 do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline] do_fast_syscall_32+0x34d/0xfb2 arch/x86/entry/common.c:397 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 Freed by task 16: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x83/0x290 mm/slab.c:3760 free_task_struct kernel/fork.c:163 [inline] free_task+0x16e/0x1f0 kernel/fork.c:457 __put_task_struct+0x2e6/0x620 kernel/fork.c:730 put_task_struct include/linux/sched/task.h:96 [inline] delayed_put_task_struct+0x2ff/0x4c0 kernel/exit.c:181 __rcu_reclaim kernel/rcu/rcu.h:240 [inline] rcu_do_batch kernel/rcu/tree.c:2437 [inline] invoke_rcu_callbacks kernel/rcu/tree.c:2716 [inline] rcu_process_callbacks+0x100a/0x1ac0 kernel/rcu/tree.c:2697 __do_softirq+0x308/0xb7e kernel/softirq.c:292 The buggy address belongs to the object at ffff8801ae718080 which belongs to the cache task_struct of size 6080 The buggy address is located 1248 bytes inside of 6080-byte region [ffff8801ae718080, ffff8801ae719840) The buggy address belongs to the page: page:ffffea0006b9c600 count:1 mapcount:0 mapping:ffff8801da970180 index:0x0 compound_mapcount: 0 flags: 0x2fffc0000010200(slab|head) raw: 02fffc0000010200 ffffea0006b99488 ffffea0006b9c688 ffff8801da970180 raw: 0000000000000000 ffff8801ae718080 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801ae718400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801ae718480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff8801ae718500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8801ae718580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801ae718600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== ^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2019-01-16 17:40 UTC | newest] Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-10-21 7:10 KASAN: use-after-free Read in task_is_descendant syzbot 2018-10-21 7:12 ` Tetsuo Handa 2018-10-22 9:54 ` Oleg Nesterov 2018-10-22 10:06 ` Tetsuo Handa 2018-10-22 13:46 ` Oleg Nesterov 2018-10-25 2:15 ` Tetsuo Handa 2018-10-25 11:13 ` Oleg Nesterov 2018-10-25 11:36 ` Kees Cook 2018-10-25 12:05 ` Oleg Nesterov 2018-10-25 11:47 ` Tetsuo Handa 2018-10-25 12:17 ` Oleg Nesterov 2018-10-25 13:01 ` Oleg Nesterov 2018-10-26 16:09 ` Kees Cook 2018-10-29 12:23 ` Oleg Nesterov 2018-10-29 15:05 ` yama: unsafe usage of ptrace_relation->tracer Oleg Nesterov 2019-01-10 11:05 ` Tetsuo Handa 2019-01-10 18:47 ` Kees Cook 2019-01-16 17:40 ` Oleg Nesterov 2018-10-25 13:14 ` KASAN: use-after-free Read in task_is_descendant Tetsuo Handa 2018-10-25 15:55 ` Oleg Nesterov 2018-10-25 16:25 ` Oleg Nesterov 2018-10-26 12:23 ` Tetsuo Handa 2018-10-26 13:04 ` Oleg Nesterov 2018-10-26 13:51 ` Tetsuo Handa 2018-10-26 14:39 ` Oleg Nesterov 2018-10-26 15:04 ` Tetsuo Handa 2018-10-26 15:22 ` Oleg Nesterov 2018-10-25 8:19 ` Kees Cook 2018-10-25 11:52 ` Oleg Nesterov 2018-11-10 3:25 ` syzbot 2018-11-10 11:46 ` syzbot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).