* kmemleak splat on copy_process() @ 2017-02-03 21:06 Luis R. Rodriguez 2017-02-06 9:47 ` Michal Hocko 0 siblings, 1 reply; 13+ messages in thread From: Luis R. Rodriguez @ 2017-02-03 21:06 UTC (permalink / raw) To: Andrew Morton, Michal Hocko Cc: Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On next-20170125 running some kselftest not yet upstream I eventually get a kmemleak splat: unreferenced object 0xffffa7b1034b4000 (size 16384): comm "driver_data.sh", pid 6506, jiffies 4295068366 (age 1697.272s) hex dump (first 32 bytes): 9d 6e ac 57 00 00 00 00 74 2d 64 72 69 76 65 72 .n.W....t-driver 5f 64 61 74 61 2e 62 69 6e 0a 00 00 00 00 00 00 _data.bin....... backtrace: [<ffffffff9005f7fa>] kmemleak_alloc+0x4a/0xa0 [<ffffffff8fbe7006>] __vmalloc_node_range+0x206/0x2a0 [<ffffffff8fa7f3e9>] copy_process.part.36+0x609/0x1cc0 [<ffffffff8fa80c77>] _do_fork+0xd7/0x390 [<ffffffff8fa80fd9>] SyS_clone+0x19/0x20 [<ffffffff8fa03b4b>] do_syscall_64+0x5b/0xc0 [<ffffffff9006b3af>] return_from_SYSCALL_64+0x0/0x6a [<ffffffffffffffff>] 0xffffffffffffffff As per gdb: (gdb) l *(copy_process+0x609) 0xffffffff8107f3e9 is in copy_process (kernel/fork.c:204). warning: Source file is more recent than executable. 199 /* 200 * We can't call find_vm_area() in interrupt context, and 201 * free_thread_stack() can be called in interrupt context, 202 * so cache the vm_struct. 203 */ 204 if (stack) { 205 tsk->stack_vm_area = find_vm_area(stack); 206 } 207 return stack; 208 #else So it would seem a complaint about alloc_thread_stack_node() -- I checked but I cannot find the leak so am thinking this is a false positive. Can you confirm? Luis ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-02-03 21:06 kmemleak splat on copy_process() Luis R. Rodriguez @ 2017-02-06 9:47 ` Michal Hocko 2017-02-07 1:37 ` Luis R. Rodriguez 0 siblings, 1 reply; 13+ messages in thread From: Michal Hocko @ 2017-02-06 9:47 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Fri 03-02-17 13:06:04, Luis R. Rodriguez wrote: > On next-20170125 running some kselftest not yet upstream I eventually > get a kmemleak splat: > > unreferenced object 0xffffa7b1034b4000 (size 16384): > comm "driver_data.sh", pid 6506, jiffies 4295068366 (age 1697.272s) > hex dump (first 32 bytes): > 9d 6e ac 57 00 00 00 00 74 2d 64 72 69 76 65 72 .n.W....t-driver > 5f 64 61 74 61 2e 62 69 6e 0a 00 00 00 00 00 00 _data.bin....... > backtrace: > [<ffffffff9005f7fa>] kmemleak_alloc+0x4a/0xa0 > [<ffffffff8fbe7006>] __vmalloc_node_range+0x206/0x2a0 > [<ffffffff8fa7f3e9>] copy_process.part.36+0x609/0x1cc0 > [<ffffffff8fa80c77>] _do_fork+0xd7/0x390 > [<ffffffff8fa80fd9>] SyS_clone+0x19/0x20 > [<ffffffff8fa03b4b>] do_syscall_64+0x5b/0xc0 > [<ffffffff9006b3af>] return_from_SYSCALL_64+0x0/0x6a > [<ffffffffffffffff>] 0xffffffffffffffff > > As per gdb: > > (gdb) l *(copy_process+0x609) > 0xffffffff8107f3e9 is in copy_process (kernel/fork.c:204). > warning: Source file is more recent than executable. > 199 /* > 200 * We can't call find_vm_area() in interrupt context, and > 201 * free_thread_stack() can be called in interrupt context, > 202 * so cache the vm_struct. > 203 */ > 204 if (stack) { > 205 tsk->stack_vm_area = find_vm_area(stack); > 206 } > 207 return stack; > 208 #else Could you check the state of the above process (pid 6506)? Does it still own its stack? From a quick check I do not see any leak there either. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-02-06 9:47 ` Michal Hocko @ 2017-02-07 1:37 ` Luis R. Rodriguez 2017-02-07 8:03 ` Michal Hocko 0 siblings, 1 reply; 13+ messages in thread From: Luis R. Rodriguez @ 2017-02-07 1:37 UTC (permalink / raw) To: Michal Hocko Cc: Luis R. Rodriguez, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Mon, Feb 06, 2017 at 10:47:41AM +0100, Michal Hocko wrote: > On Fri 03-02-17 13:06:04, Luis R. Rodriguez wrote: > > On next-20170125 running some kselftest not yet upstream I eventually > > get a kmemleak splat: > > > > unreferenced object 0xffffa7b1034b4000 (size 16384): > > comm "driver_data.sh", pid 6506, jiffies 4295068366 (age 1697.272s) > > hex dump (first 32 bytes): > > 9d 6e ac 57 00 00 00 00 74 2d 64 72 69 76 65 72 .n.W....t-driver > > 5f 64 61 74 61 2e 62 69 6e 0a 00 00 00 00 00 00 _data.bin....... > > backtrace: > > [<ffffffff9005f7fa>] kmemleak_alloc+0x4a/0xa0 > > [<ffffffff8fbe7006>] __vmalloc_node_range+0x206/0x2a0 > > [<ffffffff8fa7f3e9>] copy_process.part.36+0x609/0x1cc0 > > [<ffffffff8fa80c77>] _do_fork+0xd7/0x390 > > [<ffffffff8fa80fd9>] SyS_clone+0x19/0x20 > > [<ffffffff8fa03b4b>] do_syscall_64+0x5b/0xc0 > > [<ffffffff9006b3af>] return_from_SYSCALL_64+0x0/0x6a > > [<ffffffffffffffff>] 0xffffffffffffffff > > > > As per gdb: > > > > (gdb) l *(copy_process+0x609) > > 0xffffffff8107f3e9 is in copy_process (kernel/fork.c:204). > > warning: Source file is more recent than executable. > > 199 /* > > 200 * We can't call find_vm_area() in interrupt context, and > > 201 * free_thread_stack() can be called in interrupt context, > > 202 * so cache the vm_struct. > > 203 */ > > 204 if (stack) { > > 205 tsk->stack_vm_area = find_vm_area(stack); > > 206 } > > 207 return stack; > > 208 #else > > Could you check the state of the above process (pid 6506)? Does it still > own its stack? Although I can reproduce the splat on kmemleak, getting it to trigger at a point I can stop the kernel and inspect the process seems rather hard. > From a quick check I do not see any leak there either. Then in that case what about: diff --git a/kernel/fork.c b/kernel/fork.c index 937ba59709c9..3c96aafa1f82 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -196,6 +196,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) PAGE_KERNEL, 0, node, __builtin_return_address(0)); + kmemleak_ignore(stack); /* * We can't call find_vm_area() in interrupt context, and * free_thread_stack() can be called in interrupt context, I no longer get the spurious splats from kmemleak after this. Luis ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-02-07 1:37 ` Luis R. Rodriguez @ 2017-02-07 8:03 ` Michal Hocko 2017-02-09 1:37 ` Luis R. Rodriguez 0 siblings, 1 reply; 13+ messages in thread From: Michal Hocko @ 2017-02-07 8:03 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Tue 07-02-17 02:37:02, Luis R. Rodriguez wrote: > On Mon, Feb 06, 2017 at 10:47:41AM +0100, Michal Hocko wrote: > > On Fri 03-02-17 13:06:04, Luis R. Rodriguez wrote: > > > On next-20170125 running some kselftest not yet upstream I eventually > > > get a kmemleak splat: > > > > > > unreferenced object 0xffffa7b1034b4000 (size 16384): > > > comm "driver_data.sh", pid 6506, jiffies 4295068366 (age 1697.272s) > > > hex dump (first 32 bytes): > > > 9d 6e ac 57 00 00 00 00 74 2d 64 72 69 76 65 72 .n.W....t-driver > > > 5f 64 61 74 61 2e 62 69 6e 0a 00 00 00 00 00 00 _data.bin....... > > > backtrace: > > > [<ffffffff9005f7fa>] kmemleak_alloc+0x4a/0xa0 > > > [<ffffffff8fbe7006>] __vmalloc_node_range+0x206/0x2a0 > > > [<ffffffff8fa7f3e9>] copy_process.part.36+0x609/0x1cc0 > > > [<ffffffff8fa80c77>] _do_fork+0xd7/0x390 > > > [<ffffffff8fa80fd9>] SyS_clone+0x19/0x20 > > > [<ffffffff8fa03b4b>] do_syscall_64+0x5b/0xc0 > > > [<ffffffff9006b3af>] return_from_SYSCALL_64+0x0/0x6a > > > [<ffffffffffffffff>] 0xffffffffffffffff > > > > > > As per gdb: > > > > > > (gdb) l *(copy_process+0x609) > > > 0xffffffff8107f3e9 is in copy_process (kernel/fork.c:204). > > > warning: Source file is more recent than executable. > > > 199 /* > > > 200 * We can't call find_vm_area() in interrupt context, and > > > 201 * free_thread_stack() can be called in interrupt context, > > > 202 * so cache the vm_struct. > > > 203 */ > > > 204 if (stack) { > > > 205 tsk->stack_vm_area = find_vm_area(stack); > > > 206 } > > > 207 return stack; > > > 208 #else > > > > Could you check the state of the above process (pid 6506)? Does it still > > own its stack? > > Although I can reproduce the splat on kmemleak, getting it to trigger > at a point I can stop the kernel and inspect the process seems rather hard. Can you make the kernel BUG_ON in this case and check the vmcore? > > From a quick check I do not see any leak there either. > > Then in that case what about: This just disables the kmemleak altogether which doesn't sound like a good idea to me. > diff --git a/kernel/fork.c b/kernel/fork.c > index 937ba59709c9..3c96aafa1f82 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -196,6 +196,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) > PAGE_KERNEL, > 0, node, __builtin_return_address(0)); > > + kmemleak_ignore(stack); > /* > * We can't call find_vm_area() in interrupt context, and > * free_thread_stack() can be called in interrupt context, > > I no longer get the spurious splats from kmemleak after this. > > Luis -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-02-07 8:03 ` Michal Hocko @ 2017-02-09 1:37 ` Luis R. Rodriguez 2017-02-17 17:07 ` Andy Lutomirski 0 siblings, 1 reply; 13+ messages in thread From: Luis R. Rodriguez @ 2017-02-09 1:37 UTC (permalink / raw) To: Michal Hocko Cc: Luis R. Rodriguez, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Tue, Feb 07, 2017 at 09:03:43AM +0100, Michal Hocko wrote: > On Tue 07-02-17 02:37:02, Luis R. Rodriguez wrote: > > > From a quick check I do not see any leak there either. > > > > Then in that case what about: > > This just disables the kmemleak altogether which doesn't sound like a > good idea to me. Only for this case, but if that is also not desirable let us debug further. That or I think we could perhaps massage code to make it clearer to kmemleak things are good. Luis ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-02-09 1:37 ` Luis R. Rodriguez @ 2017-02-17 17:07 ` Andy Lutomirski 2017-02-17 17:23 ` Luis R. Rodriguez 0 siblings, 1 reply; 13+ messages in thread From: Andy Lutomirski @ 2017-02-17 17:07 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Michal Hocko, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Wed, Feb 8, 2017 at 5:37 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > On Tue, Feb 07, 2017 at 09:03:43AM +0100, Michal Hocko wrote: >> On Tue 07-02-17 02:37:02, Luis R. Rodriguez wrote: >> > > From a quick check I do not see any leak there either. >> > >> > Then in that case what about: >> >> This just disables the kmemleak altogether which doesn't sound like a >> good idea to me. > > Only for this case, but if that is also not desirable let us debug further. > That or I think we could perhaps massage code to make it clearer to kmemleak > things are good. > I'm not seeing the issue. There should be a live pointer to stack at all times, either in a local variable or in task->stack. There's a weird window in dup_task_struct in which we're stashing away stack_vm_area, but stack itself should be okay, I think. But maybe there really is a race in which a kmemleak check right in the middle of duplicating the task struct really can't see the stack pointer. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-02-17 17:07 ` Andy Lutomirski @ 2017-02-17 17:23 ` Luis R. Rodriguez 2017-02-17 19:32 ` Andy Lutomirski 0 siblings, 1 reply; 13+ messages in thread From: Luis R. Rodriguez @ 2017-02-17 17:23 UTC (permalink / raw) To: Andy Lutomirski Cc: Michal Hocko, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Fri, Feb 17, 2017 at 9:07 AM, Andy Lutomirski <luto@amacapital.net> wrote: > But maybe > there really is a race in which a kmemleak check right in the middle > of duplicating the task struct really can't see the stack pointer. Funny, but it was actually using kmemleak how I can easily reproduce: To reproduce the kmemleak splat: echo clear > /sys/kernel/debug/kmemleak echo scan > /sys/kernel/debug/kmemleak cat /sys/kernel/debug/kmemleak Try that. Luis ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-02-17 17:23 ` Luis R. Rodriguez @ 2017-02-17 19:32 ` Andy Lutomirski 2017-05-15 21:53 ` Luis R. Rodriguez 0 siblings, 1 reply; 13+ messages in thread From: Andy Lutomirski @ 2017-02-17 19:32 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Michal Hocko, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Fri, Feb 17, 2017 at 9:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > On Fri, Feb 17, 2017 at 9:07 AM, Andy Lutomirski <luto@amacapital.net> wrote: >> But maybe >> there really is a race in which a kmemleak check right in the middle >> of duplicating the task struct really can't see the stack pointer. > > Funny, but it was actually using kmemleak how I can easily reproduce: > > To reproduce the kmemleak splat: > > echo clear > /sys/kernel/debug/kmemleak > echo scan > /sys/kernel/debug/kmemleak > cat /sys/kernel/debug/kmemleak Worked fine for me. Maybe your config is special? --Andy ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-02-17 19:32 ` Andy Lutomirski @ 2017-05-15 21:53 ` Luis R. Rodriguez 2017-05-16 6:28 ` Michal Hocko 0 siblings, 1 reply; 13+ messages in thread From: Luis R. Rodriguez @ 2017-05-15 21:53 UTC (permalink / raw) To: Andy Lutomirski Cc: Luis R. Rodriguez, Michal Hocko, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Fri, Feb 17, 2017 at 11:32:34AM -0800, Andy Lutomirski wrote: > On Fri, Feb 17, 2017 at 9:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > > On Fri, Feb 17, 2017 at 9:07 AM, Andy Lutomirski <luto@amacapital.net> wrote: > >> But maybe > >> there really is a race in which a kmemleak check right in the middle > >> of duplicating the task struct really can't see the stack pointer. > > > > Funny, but it was actually using kmemleak how I can easily reproduce: > > > > To reproduce the kmemleak splat: > > > > echo clear > /sys/kernel/debug/kmemleak > > echo scan > /sys/kernel/debug/kmemleak > > cat /sys/kernel/debug/kmemleak > > Worked fine for me. Maybe your config is special? I don't think my config is special at all, here it is its just what I use for my qemu kvm guest image: http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/configs/piggy-x86_64_qemu_fork_kmemleak.config Another new kernel (next-20170515 based now on v4.12-rc1) and yet the same kmemleeak can be triggered easily, although this time I need to hit "scan" quite a bit more -- try using scan 6 times in a row or so. echo clear > /sys/kernel/debug/kmemleak echo scan > /sys/kernel/debug/kmemleak echo scan > /sys/kernel/debug/kmemleak echo scan > /sys/kernel/debug/kmemleak echo scan > /sys/kernel/debug/kmemleak echo scan > /sys/kernel/debug/kmemleak cat /sys/kernel/debug/kmemleak root@piggy:~# cat /sys/kernel/debug/kmemleak unreferenced object 0xffffa07500d4c000 (size 16384): comm "bash", pid 1349, jiffies 4294895999 (age 263.204s) hex dump (first 32 bytes): 9d 6e ac 57 00 00 00 00 00 00 00 00 00 00 00 00 .n.W............ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffffa07500c30000 (size 16384): comm "bash", pid 1394, jiffies 4294940106 (age 86.780s) hex dump (first 32 bytes): 9d 6e ac 57 00 00 00 00 00 00 00 00 00 00 00 00 .n.W............ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffffa07500c98000 (size 16384): comm "bash", pid 1368, jiffies 4294956480 (age 21.284s) hex dump (first 32 bytes): 9d 6e ac 57 00 00 00 00 00 00 00 00 fe 01 00 00 .n.W............ c0 f4 9d 44 2f e8 ff ff 00 de ac 44 2f e8 ff ff ...D/......D/... backtrace: [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a [<ffffffffffffffff>] 0xffffffffffffffff I confirm that stack is allocated and that a respective tsk->stack_vm_area gets assigned. So neither of these BUG() triggers, for instance but yet the kmemleak does: diff --git a/kernel/fork.c b/kernel/fork.c index 657373b2ddd2..9bd7ccd55b89 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -230,8 +230,12 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) * free_thread_stack() can be called in interrupt context, * so cache the vm_struct. */ - if (stack) + if (stack) { tsk->stack_vm_area = find_vm_area(stack); + if (!tsk->stack_vm_area) + BUG(); + } else + BUG(); return stack; #else struct page *page = alloc_pages_node(node, THREADINFO_GFP, Luis ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-05-15 21:53 ` Luis R. Rodriguez @ 2017-05-16 6:28 ` Michal Hocko 2017-05-16 13:39 ` Catalin Marinas 0 siblings, 1 reply; 13+ messages in thread From: Michal Hocko @ 2017-05-16 6:28 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Andy Lutomirski, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel, Catalin Marinas Let's CC Catalin this smells like a kmemleak bug to me. On Mon 15-05-17 23:53:18, Luis R. Rodriguez wrote: > On Fri, Feb 17, 2017 at 11:32:34AM -0800, Andy Lutomirski wrote: > > On Fri, Feb 17, 2017 at 9:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > > > On Fri, Feb 17, 2017 at 9:07 AM, Andy Lutomirski <luto@amacapital.net> wrote: > > >> But maybe > > >> there really is a race in which a kmemleak check right in the middle > > >> of duplicating the task struct really can't see the stack pointer. > > > > > > Funny, but it was actually using kmemleak how I can easily reproduce: > > > > > > To reproduce the kmemleak splat: > > > > > > echo clear > /sys/kernel/debug/kmemleak > > > echo scan > /sys/kernel/debug/kmemleak > > > cat /sys/kernel/debug/kmemleak > > > > Worked fine for me. Maybe your config is special? > > I don't think my config is special at all, here it is its just > what I use for my qemu kvm guest image: > > http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/configs/piggy-x86_64_qemu_fork_kmemleak.config > > Another new kernel (next-20170515 based now on v4.12-rc1) and yet the same > kmemleeak can be triggered easily, although this time I need to hit "scan" > quite a bit more -- try using scan 6 times in a row or so. > > echo clear > /sys/kernel/debug/kmemleak > echo scan > /sys/kernel/debug/kmemleak > echo scan > /sys/kernel/debug/kmemleak > echo scan > /sys/kernel/debug/kmemleak > echo scan > /sys/kernel/debug/kmemleak > echo scan > /sys/kernel/debug/kmemleak > cat /sys/kernel/debug/kmemleak > > root@piggy:~# cat /sys/kernel/debug/kmemleak > unreferenced object 0xffffa07500d4c000 (size 16384): > comm "bash", pid 1349, jiffies 4294895999 (age 263.204s) > hex dump (first 32 bytes): > 9d 6e ac 57 00 00 00 00 00 00 00 00 00 00 00 00 .n.W............ > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace: > [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 > [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 > [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 > [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 > [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 > [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 > [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a > [<ffffffffffffffff>] 0xffffffffffffffff > unreferenced object 0xffffa07500c30000 (size 16384): > comm "bash", pid 1394, jiffies 4294940106 (age 86.780s) > hex dump (first 32 bytes): > 9d 6e ac 57 00 00 00 00 00 00 00 00 00 00 00 00 .n.W............ > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace: > [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 > [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 > [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 > [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 > [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 > [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 > [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a > [<ffffffffffffffff>] 0xffffffffffffffff > unreferenced object 0xffffa07500c98000 (size 16384): > comm "bash", pid 1368, jiffies 4294956480 (age 21.284s) > hex dump (first 32 bytes): > 9d 6e ac 57 00 00 00 00 00 00 00 00 fe 01 00 00 .n.W............ > c0 f4 9d 44 2f e8 ff ff 00 de ac 44 2f e8 ff ff ...D/......D/... > backtrace: > [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 > [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 > [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 > [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 > [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 > [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 > [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a > [<ffffffffffffffff>] 0xffffffffffffffff > > I confirm that stack is allocated and that a respective tsk->stack_vm_area gets > assigned. So neither of these BUG() triggers, for instance but yet the kmemleak > does: > > diff --git a/kernel/fork.c b/kernel/fork.c > index 657373b2ddd2..9bd7ccd55b89 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -230,8 +230,12 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) > * free_thread_stack() can be called in interrupt context, > * so cache the vm_struct. > */ > - if (stack) > + if (stack) { > tsk->stack_vm_area = find_vm_area(stack); > + if (!tsk->stack_vm_area) > + BUG(); > + } else > + BUG(); > return stack; > #else > struct page *page = alloc_pages_node(node, THREADINFO_GFP, > > > > Luis -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-05-16 6:28 ` Michal Hocko @ 2017-05-16 13:39 ` Catalin Marinas 2017-05-16 23:55 ` Andy Lutomirski 0 siblings, 1 reply; 13+ messages in thread From: Catalin Marinas @ 2017-05-16 13:39 UTC (permalink / raw) To: Michal Hocko Cc: Luis R. Rodriguez, Andy Lutomirski, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel Thanks for cc'ing me. The vmalloc allocations have always been tricky for kmemleak since there are 2-3 other memory locations with the same value as the vmalloc'ed object: vm_struct.addr and vmap_area.va_start; occasionally we have vmap_area.va_end pointing to the next vmap_area.va_start. To have a more reliable object reference counting, kmemleak avoids scanning most of the vmap_area structure (apart from vmap_area.vm) and requires a minimum count of 2 references for a vmalloc'ed object to not be considered a leak (that is vm_struct.addr and the caller's location, in this case task->stack). > On Mon 15-05-17 23:53:18, Luis R. Rodriguez wrote: > > root@piggy:~# cat /sys/kernel/debug/kmemleak > > unreferenced object 0xffffa07500d4c000 (size 16384): > > comm "bash", pid 1349, jiffies 4294895999 (age 263.204s) > > hex dump (first 32 bytes): > > 9d 6e ac 57 00 00 00 00 00 00 00 00 00 00 00 00 .n.W............ > > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > backtrace: > > [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 > > [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 > > [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 > > [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 > > [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 > > [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 > > [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a > > [<ffffffffffffffff>] 0xffffffffffffffff This stack here was allocated by bash for a child process via alloc_thread_stack_node() when CONFIG_VMAP_STACK is enabled. When the child died, free_stack_thread() did not call vfree_atomic() but rather stored the corresponding vm_struct pointer in cached_stacks[i]. However, the original task->stack pointer was lost (child task_struct freed), so when scanning the memory kmemleak can only find a single reference to the original stack (via vm_struct cached_stacks[i]) rather than 2 as required for vmalloc'ed objects. BTW, you can get more info about a specific object, including the number of references found, with: echo dump=0xffffa07500d4c000 > /sys/kernel/debug/kmemleak So basically kernel/fork.c has its own thread stack allocator on top of vmalloc and kmemleak gets confused. A quick workaround: ------------8<------------------------------ diff --git a/kernel/fork.c b/kernel/fork.c index 06d759ab4c62..785907fccf67 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -214,6 +214,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) this_cpu_write(cached_stacks[i], NULL); tsk->stack_vm_area = s; + kmemleak_alloc(s->addr, THREAD_SIZE, 2, GFP_ATOMIC); local_irq_enable(); return s->addr; } @@ -254,6 +255,7 @@ static inline void free_thread_stack(struct task_struct *tsk) continue; this_cpu_write(cached_stacks[i], tsk->stack_vm_area); + kmemleak_free(tsk->stack); local_irq_restore(flags); return; } ------------8<------------------------------ A better solution I think would be something like kmemleak_ignore() on the free path paired with a new kmemleak_unignore() in order to avoid the freeing/re-allocation of the kmemleak metadata. A more complex solution (which I'm not yet convinced it works) may be to pass the number of references of an objects down to the objects it refers. In the vmalloc case we normally have a single reference to vm_struct and two to the vmalloc'ed object. If a reference to the vmalloc'ed object disappeared we could balance it with an increased number of references to the vm_struct object. But this option requires more research. Unless there are better ideas, I'll post a patch adding kmemleak_ignore/unignore annotations to kerne/fork.c (and the corresponding mm/kmemleak.c changes). Thanks. -- Catalin ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-05-16 13:39 ` Catalin Marinas @ 2017-05-16 23:55 ` Andy Lutomirski 2017-05-17 11:09 ` Catalin Marinas 0 siblings, 1 reply; 13+ messages in thread From: Andy Lutomirski @ 2017-05-16 23:55 UTC (permalink / raw) To: Catalin Marinas Cc: Michal Hocko, Luis R. Rodriguez, Andrew Morton, Ingo Molnar, Andy Lutomirski, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Tue, May 16, 2017 at 6:39 AM, Catalin Marinas <catalin.marinas@arm.com> wrote: > Thanks for cc'ing me. The vmalloc allocations have always been tricky > for kmemleak since there are 2-3 other memory locations with the same > value as the vmalloc'ed object: vm_struct.addr and vmap_area.va_start; > occasionally we have vmap_area.va_end pointing to the next > vmap_area.va_start. > > To have a more reliable object reference counting, kmemleak avoids > scanning most of the vmap_area structure (apart from vmap_area.vm) and > requires a minimum count of 2 references for a vmalloc'ed object to not > be considered a leak (that is vm_struct.addr and the caller's location, > in this case task->stack). > >> On Mon 15-05-17 23:53:18, Luis R. Rodriguez wrote: >> > root@piggy:~# cat /sys/kernel/debug/kmemleak >> > unreferenced object 0xffffa07500d4c000 (size 16384): >> > comm "bash", pid 1349, jiffies 4294895999 (age 263.204s) >> > hex dump (first 32 bytes): >> > 9d 6e ac 57 00 00 00 00 00 00 00 00 00 00 00 00 .n.W............ >> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> > backtrace: >> > [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 >> > [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 >> > [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 >> > [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 >> > [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 >> > [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 >> > [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a >> > [<ffffffffffffffff>] 0xffffffffffffffff > > This stack here was allocated by bash for a child process via > alloc_thread_stack_node() when CONFIG_VMAP_STACK is enabled. When the > child died, free_stack_thread() did not call vfree_atomic() but rather > stored the corresponding vm_struct pointer in cached_stacks[i]. However, > the original task->stack pointer was lost (child task_struct freed), so > when scanning the memory kmemleak can only find a single reference to > the original stack (via vm_struct cached_stacks[i]) rather than 2 as > required for vmalloc'ed objects. > > BTW, you can get more info about a specific object, including the number > of references found, with: > > echo dump=0xffffa07500d4c000 > /sys/kernel/debug/kmemleak > > So basically kernel/fork.c has its own thread stack allocator on top of > vmalloc and kmemleak gets confused. A quick workaround: > > ------------8<------------------------------ > diff --git a/kernel/fork.c b/kernel/fork.c > index 06d759ab4c62..785907fccf67 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -214,6 +214,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) > this_cpu_write(cached_stacks[i], NULL); > > tsk->stack_vm_area = s; > + kmemleak_alloc(s->addr, THREAD_SIZE, 2, GFP_ATOMIC); > local_irq_enable(); > return s->addr; > } > @@ -254,6 +255,7 @@ static inline void free_thread_stack(struct task_struct *tsk) > continue; > > this_cpu_write(cached_stacks[i], tsk->stack_vm_area); > + kmemleak_free(tsk->stack); > local_irq_restore(flags); > return; > } > ------------8<------------------------------ I would do this differently. The problem only affects cached stacks, right? How about kmemleal_ignoring when caching a stack and unignoring when uncaching a stack? Also, this needs a serious comment. How about "kmemleak does not presently understand that a reference to a vm_area_struct implies a reference to the vmalloced memory"? > A more complex solution (which I'm not yet convinced it works) may be to > pass the number of references of an objects down to the objects it > refers. In the vmalloc case we normally have a single reference to > vm_struct and two to the vmalloc'ed object. If a reference to the > vmalloc'ed object disappeared we could balance it with an increased > number of references to the vm_struct object. But this option requires > more research. The idea being that, with this change, kmemleak would start to understand that surplus references to vm_area_struct cause it to think the memory itself is reachable? --Andy ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: kmemleak splat on copy_process() 2017-05-16 23:55 ` Andy Lutomirski @ 2017-05-17 11:09 ` Catalin Marinas 0 siblings, 0 replies; 13+ messages in thread From: Catalin Marinas @ 2017-05-17 11:09 UTC (permalink / raw) To: Andy Lutomirski Cc: Michal Hocko, Luis R. Rodriguez, Andrew Morton, Ingo Molnar, Kees Cook, Eric W. Biederman, Mateusz Guzik, linux-kernel On Tue, May 16, 2017 at 04:55:28PM -0700, Andy Lutomirski wrote: > On Tue, May 16, 2017 at 6:39 AM, Catalin Marinas > <catalin.marinas@arm.com> wrote: > > Thanks for cc'ing me. The vmalloc allocations have always been tricky > > for kmemleak since there are 2-3 other memory locations with the same > > value as the vmalloc'ed object: vm_struct.addr and vmap_area.va_start; > > occasionally we have vmap_area.va_end pointing to the next > > vmap_area.va_start. > > > > To have a more reliable object reference counting, kmemleak avoids > > scanning most of the vmap_area structure (apart from vmap_area.vm) and > > requires a minimum count of 2 references for a vmalloc'ed object to not > > be considered a leak (that is vm_struct.addr and the caller's location, > > in this case task->stack). > > > >> On Mon 15-05-17 23:53:18, Luis R. Rodriguez wrote: > >> > root@piggy:~# cat /sys/kernel/debug/kmemleak > >> > unreferenced object 0xffffa07500d4c000 (size 16384): > >> > comm "bash", pid 1349, jiffies 4294895999 (age 263.204s) > >> > hex dump (first 32 bytes): > >> > 9d 6e ac 57 00 00 00 00 00 00 00 00 00 00 00 00 .n.W............ > >> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > >> > backtrace: > >> > [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 > >> > [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 > >> > [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 > >> > [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 > >> > [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 > >> > [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 > >> > [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a > >> > [<ffffffffffffffff>] 0xffffffffffffffff > > > > This stack here was allocated by bash for a child process via > > alloc_thread_stack_node() when CONFIG_VMAP_STACK is enabled. When the > > child died, free_stack_thread() did not call vfree_atomic() but rather > > stored the corresponding vm_struct pointer in cached_stacks[i]. However, > > the original task->stack pointer was lost (child task_struct freed), so > > when scanning the memory kmemleak can only find a single reference to > > the original stack (via vm_struct cached_stacks[i]) rather than 2 as > > required for vmalloc'ed objects. > > > > BTW, you can get more info about a specific object, including the number > > of references found, with: > > > > echo dump=0xffffa07500d4c000 > /sys/kernel/debug/kmemleak > > > > So basically kernel/fork.c has its own thread stack allocator on top of > > vmalloc and kmemleak gets confused. A quick workaround: > > > > ------------8<------------------------------ > > diff --git a/kernel/fork.c b/kernel/fork.c > > index 06d759ab4c62..785907fccf67 100644 > > --- a/kernel/fork.c > > +++ b/kernel/fork.c > > @@ -214,6 +214,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) > > this_cpu_write(cached_stacks[i], NULL); > > > > tsk->stack_vm_area = s; > > + kmemleak_alloc(s->addr, THREAD_SIZE, 2, GFP_ATOMIC); > > local_irq_enable(); > > return s->addr; > > } > > @@ -254,6 +255,7 @@ static inline void free_thread_stack(struct task_struct *tsk) > > continue; > > > > this_cpu_write(cached_stacks[i], tsk->stack_vm_area); > > + kmemleak_free(tsk->stack); > > local_irq_restore(flags); > > return; > > } > > ------------8<------------------------------ > > I would do this differently. The problem only affects cached stacks, > right? How about kmemleal_ignoring when caching a stack and > unignoring when uncaching a stack? Indeed ;), that was my paragraph just below the patch: > > A better solution I think would be something like kmemleak_ignore() on > > the free path paired with a new kmemleak_unignore() in order to avoid > > the freeing/re-allocation of the kmemleak metadata. > Also, this needs a serious comment. How about "kmemleak does not > presently understand that a reference to a vm_area_struct implies a > reference to the vmalloced memory"? Yes, as most of the other kmemleak annotations. > > A more complex solution (which I'm not yet convinced it works) may be to > > pass the number of references of an objects down to the objects it > > refers. In the vmalloc case we normally have a single reference to > > vm_struct and two to the vmalloc'ed object. If a reference to the > > vmalloc'ed object disappeared we could balance it with an increased > > number of references to the vm_struct object. But this option requires > > more research. > > The idea being that, with this change, kmemleak would start to > understand that surplus references to vm_area_struct cause it to think > the memory itself is reachable? That's the idea, possibly with a new kmemleak API specific to vmalloc that would link the vm_struct object with the vmalloc'ed one. The advantage is that we don't need to annotate the stack caching (that's the only place that I'm aware of), though the actual implementation is a bit more complex than kmemleak_ignore/unignore (and assuming the idea is sane). -- Catalin ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2017-05-17 11:09 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-02-03 21:06 kmemleak splat on copy_process() Luis R. Rodriguez 2017-02-06 9:47 ` Michal Hocko 2017-02-07 1:37 ` Luis R. Rodriguez 2017-02-07 8:03 ` Michal Hocko 2017-02-09 1:37 ` Luis R. Rodriguez 2017-02-17 17:07 ` Andy Lutomirski 2017-02-17 17:23 ` Luis R. Rodriguez 2017-02-17 19:32 ` Andy Lutomirski 2017-05-15 21:53 ` Luis R. Rodriguez 2017-05-16 6:28 ` Michal Hocko 2017-05-16 13:39 ` Catalin Marinas 2017-05-16 23:55 ` Andy Lutomirski 2017-05-17 11:09 ` Catalin Marinas
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.