From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756698AbdIHRp0 (ORCPT ); Fri, 8 Sep 2017 13:45:26 -0400 Received: from foss.arm.com ([217.140.101.70]:43552 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753338AbdIHRpY (ORCPT ); Fri, 8 Sep 2017 13:45:24 -0400 Date: Fri, 8 Sep 2017 18:45:20 +0100 From: Catalin Marinas To: Steven Rostedt Cc: LKML , Andrey Ryabinin , kasan-dev@googlegroups.com Subject: Re: kmemleak not always catching stuff Message-ID: <20170908174518.3i74qvdbruf3lynl@armageddon.cambridge.arm.com> References: <20170901183311.3bf3348a@gandalf.local.home> <20170904100904.v5soe2afqebogefv@armageddon.cambridge.arm.com> <20170905101545.1c6459a5@gandalf.local.home> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170905101545.1c6459a5@gandalf.local.home> User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Sep 05, 2017 at 10:15:45AM -0400, Steven Rostedt wrote: > On Mon, 4 Sep 2017 11:09:05 +0100 > Catalin Marinas wrote: > > On Fri, Sep 01, 2017 at 06:33:11PM -0400, Steven Rostedt wrote: > > > Now I was thinking that it may be due to the fact that the trampoline > > > is allocated with module_alloc(), and that has some magic kasan goo in > > > it. But when forcing the issue with adding the following code: > > > > > > void **pblah; > > > void *blah; > > > > > > pblah = kmalloc(sizeof(*pblah), GFP_KERNEL); > > > blah = module_alloc(PAGE_SIZE); > > > *pblah = blah; > > > printk("allocated blah %p\n", blah); > > > kfree(pblah); > > > > > > in a path that I could control, it would catch it only after doing it > > > several times. I was never able to have kmemleak catch the actual bug > > > from user space no matter how many times I triggered it. > > > > module_alloc() uses vmalloc_exec(), so it is tracked by kmemleak but you > > probably hit a false negative with the blah pointer lingering somewhere > > on some stack. > > Hmm, could this also be what causes the miss of catching the lingering > ftrace trampoline? Not sure (not without some additional support in kmemleak to help track down the source of false negatives; I'm on a long flight to LA next week, maybe I manage to hack something up ;)). BTW, I had a quick look at the trace_selftest_ops() function (without pretending I understand the ftrace code) and there is one case where this function can exit without freeing dyn_ops. Is this intentional? diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c index cb917cebae29..b17ec642793b 100644 --- a/kernel/trace/trace_selftest.c +++ b/kernel/trace/trace_selftest.c @@ -273,7 +273,7 @@ static int trace_selftest_ops(struct trace_array *tr, int cnt) goto out_free; if (cnt > 1) { if (trace_selftest_test_global_cnt == 0) - goto out; + goto out_free; } if (trace_selftest_test_dyn_cnt == 0) goto out_free; > > > [do the test] > > > > > > # echo scan > /sys/kernel/debug/kmemleak > > > > Some heuristics in kmemleak cause the first leak of an object not to be > > reported (too many false positives). You'd need to do "echo scan" at > > least twice after an allocation. > > OK, is this documented somewhere? Apparently not. I'll add some notes to the kmemleak documentation. > > > Most of the times it found nothing. Even when I switched the above from > > > module_alloc() to kmalloc(). > > > > > > Is this normal? > > > > In general, a leak would eventually appear after a few scans or in time > > when some memory location is overridden. > > > > Yet another heuristics in kmemleak is to treat pointers at some offset > > inside an object as valid references (because of the container_of > > tricks). However, the downside is that the bigger the object, the > > greater chances of finding some random data that looks like a pointer. > > We could change this logic to require precise pointers above a certain > > size (e.g. PAGE_SIZE) where the use of container_of() is less likely. > > > > Kmemleak doesn't have a way to inspect false negatives but if you are > > interested in digging further, I could add a "find=0x..." command to > > print all references to an object during scanning. I also need to find > > some time to implement a "stopscan" command which uses stop_machine() > > and skips the heuristics for reducing false positives. > > Without the patch in the link above, there's a memory leak with the > ftrace trampoline with the following commands: > > Requires: CONFIG_FUNCTION_TRACER and CONFIG_SCHED_TRACER > > # cd /sys/kernel/debug/tracing > # mkdir instances/foo > # echo wakeup > instances/foo/current_tracer > # echo 0 > /proc/sys/kernel/ftrace_enabled > # echo nop > instances/foo/current_tracer > # rmdir instances/foo > > What the above does, is creates a new (allocated/non-static) buffer in > the instance directory. Then we enable the wakeup tracer which > enables function tracing and also creates a dynamic ftrace trampoline > for it. We disable function tracing for all tracers with the proc > sysctl ftrace_enabled set to zero. The nop removes the wakeup tracer > and unregisters its function tracing handler. This is where the leak > happens. The unregister path sees that function tracing is disabled and > exits out early, without releasing the trampoline. Are the ftrace_ops allocated dynamically in this case (and freed when unregistered)? Otherwise, you may have an ops->trampoline still around that kmemleak finds. -- Catalin