linux-bcache.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@linux.dev>
To: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Suren Baghdasaryan <surenb@google.com>,
	akpm@linux-foundation.org, mhocko@suse.com, vbabka@suse.cz,
	hannes@cmpxchg.org, roman.gushchin@linux.dev, dave@stgolabs.net,
	willy@infradead.org, liam.howlett@oracle.com, void@manifault.com,
	juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com,
	david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org,
	masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com,
	ytcoode@gmail.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, bristot@redhat.com, vschneid@redhat.com,
	cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com,
	42.hyeyoo@gmail.com, glider@google.com, elver@google.com,
	dvyukov@google.com, shakeelb@google.com,
	songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com,
	rientjes@google.com, minchan@google.com, kaleshsingh@google.com,
	kernel-team@android.com, linux-mm@kvack.org,
	iommu@lists.linux.dev, kasan-dev@googlegroups.com,
	io-uring@vger.kernel.org, linux-arch@vger.kernel.org,
	xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org,
	linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/30] Code tagging framework and applications
Date: Thu, 1 Sep 2022 12:31:55 -0400	[thread overview]
Message-ID: <20220901163155.sz4dqtubicdvzmsw@moria.home.lan> (raw)
In-Reply-To: <20220901110501.o5rq5yzltomirxiw@suse.de>

On Thu, Sep 01, 2022 at 12:05:01PM +0100, Mel Gorman wrote:
> As pointed out elsewhere, attaching to the tracepoint and recording relevant
> state is an option other than trying to parse a raw ftrace feed. For memory
> leaks, there are already tracepoints for page allocation and free that could
> be used to track allocations that are not freed at a given point in time.

Page allocation tracepoints are not sufficient for what we're trying to do here,
and a substantial amount of effort in this patchset has gone into just getting
the hooking locations right - our memory allocation interfaces are not trivial.

That's something people should keep in mind when commenting on the size of this
patchset, since that's effort that would have to be spent for /any/ complete
solution, be in tracepoint based or no.

Additionally, we need to be able to write assertions that verify that our hook
locations are correct, that allocations or frees aren't getting double counted
or missed - highly necessary given the maze of nested memory allocation
interfaces we have (i.e. slab.h), and it's something a tracepoint based
implementation would have to account for - otherwise, a tool isn't very useful
if you can't trust the numbers it's giving you.

And then you have to correlate the allocate and free events, so that you know
which allocate callsite to decrement the amount freed from.

How would you plan on doing that with tracepoints?

> There is also the kernel memory leak detector although I never had reason
> to use it (https://www.kernel.org/doc/html/v6.0-rc3/dev-tools/kmemleak.html)
> and it sounds like it would be expensive.

Kmemleak is indeed expensive, and in the past I've had issues with it not
catching everything (I've noticed the kmemleak annotations growing, so maybe
this is less of an issue than it was).

And this is a more complete solution (though not something that could strictly
replace kmemleak): strict memory leaks aren't the only issue, it's also drivers
unexpectedly consuming more memory than expected.

I'll bet you a beer that when people have had this awhile, we're going to have a
bunch of bugs discovered and fixed along the lines of "oh hey, this driver
wasn't supposed to be using this 1 MB of memory, I never noticed that before".

> > > It's also unclear *who* would enable this. It looks like it would mostly
> > > have value during the development stage of an embedded platform to track
> > > kernel memory usage on a per-application basis in an environment where it
> > > may be difficult to setup tracing and tracking. Would it ever be enabled
> > > in production? Would a distribution ever enable this? If it's enabled, any
> > > overhead cannot be disabled/enabled at run or boot time so anyone enabling
> > > this would carry the cost without never necessarily consuming the data.
> > 
> > The whole point of this is to be cheap enough to enable in production -
> > especially the latency tracing infrastructure. There's a lot of value to
> > always-on system visibility infrastructure, so that when a live machine starts
> > to do something wonky the data is already there.
> > 
> 
> Sure, there is value but nothing stops the tracepoints being attached as
> a boot-time service where interested. For latencies, there is already
> bpf examples for tracing individual function latency over time e.g.
> https://github.com/iovisor/bcc/blob/master/tools/funclatency.py although
> I haven't used it recently.

So this is cool, I'll check it out today.

Tracing of /function/ latency is definitely something you'd want tracing/kprobes
for - that's way more practical than any code tagging-based approach. And if the
output is reliable and useful I could definitely see myself using this, thank
you.

But for data collection where it makes sense to annotate in the source code
where the data collection points are, I see the code-tagging based approach as
simpler - it cuts out a whole bunch of indirection. The diffstat on the code
tagging time stats patch is

 8 files changed, 233 insertions(+), 6 deletions(-)

And that includes hooking wait.h - this is really simple, easy stuff.

The memory allocation tracking patches are more complicated because we've got a
ton of memory allocation interfaces and we're aiming for strict correctness
there - because that tool needs strict correctness in order to be useful.

> Live parsing of ftrace is possible, albeit expensive.
> https://github.com/gormanm/mmtests/blob/master/monitors/watch-highorder.pl
> tracks counts of high-order allocations and dumps a report on interrupt as
> an example of live parsing ftrace and only recording interesting state. It's
> not tracking state you are interested in but it demonstrates it is possible
> to rely on ftrace alone and monitor from userspace. It's bit-rotted but
> can be fixed with

Yeah, if this is as far as people have gotten with ftrace on memory allocations
than I don't think tracing is credible here, sorry.

> The ease of use is a criticism as there is effort required to develop
> the state tracking of in-kernel event be it from live parsing ftrace,
> attaching to tracepoints with systemtap/bpf/whatever and the like. The
> main disadvantage with an in-kernel implementation is three-fold. First,
> it doesn't work with older kernels without backports. Second, if something
> slightly different it needed then it's a kernel rebuild.  Third, if the
> option is not enabled in the deployed kernel config then you are relying
> on the end user being willing to deploy a custom kernel.  The initial
> investment in doing memory leak tracking or latency tracking by attaching
> to tracepoints is significant but it works with older kernels up to a point
> and is less sensitive to the kernel config options selected as features
> like ftrace are often selected.

The next version of this patch set is going to use the alternatives mechanism to
add a boot parameter.

I'm not interested in backporting to older kernels - eesh. People on old
enterprise kernels don't always get all the new shiny things :)

  reply	other threads:[~2022-09-01 16:32 UTC|newest]

Thread overview: 138+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-30 21:48 [RFC PATCH 00/30] Code tagging framework and applications Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 01/30] kernel/module: move find_kallsyms_symbol_value declaration Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 02/30] lib/string_helpers: Drop space in string_get_size's output Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 03/30] Lazy percpu counters Suren Baghdasaryan
2022-08-31 10:02   ` Mel Gorman
2022-08-31 15:37     ` Suren Baghdasaryan
2022-08-31 16:20     ` Kent Overstreet
2022-09-01  6:51   ` Peter Zijlstra
2022-09-01 14:32     ` Kent Overstreet
2022-09-01 14:48       ` Steven Rostedt
2022-09-01 15:43         ` Kent Overstreet
2022-09-01 18:59       ` Peter Zijlstra
2022-08-30 21:48 ` [RFC PATCH 04/30] scripts/kallysms: Always include __start and __stop symbols Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 05/30] lib: code tagging framework Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 06/30] lib: code tagging module support Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 07/30] lib: add support for allocation tagging Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 08/30] lib: introduce page " Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 09/30] change alloc_pages name in dma_map_ops to avoid name conflicts Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 10/30] mm: enable page allocation tagging for __get_free_pages and alloc_pages Suren Baghdasaryan
2022-08-31 10:11   ` Mel Gorman
2022-08-31 15:45     ` Suren Baghdasaryan
2022-08-31 15:52       ` Suren Baghdasaryan
2022-08-31 17:46     ` Kent Overstreet
2022-09-01  1:07       ` Suren Baghdasaryan
2022-09-01  7:41       ` Peter Zijlstra
2022-08-30 21:49 ` [RFC PATCH 11/30] mm: introduce slabobj_ext to support slab object extensions Suren Baghdasaryan
2022-09-01 23:35   ` Roman Gushchin
2022-09-02  0:23     ` Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 12/30] mm: introduce __GFP_NO_OBJ_EXT flag to selectively prevent slabobj_ext creation Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 13/30] mm/slab: introduce SLAB_NO_OBJ_EXT to avoid obj_ext creation Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 14/30] mm: prevent slabobj_ext allocations for slabobj_ext and kmem_cache objects Suren Baghdasaryan
2022-09-01 23:40   ` Roman Gushchin
2022-09-02  0:24     ` Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 15/30] lib: introduce slab allocation tagging Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 16/30] mm: enable slab allocation tagging for kmalloc and friends Suren Baghdasaryan
2022-09-01 23:50   ` Roman Gushchin
2022-08-30 21:49 ` [RFC PATCH 17/30] lib/string.c: strsep_no_empty() Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 18/30] codetag: add codetag query helper functions Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 19/30] move stack capture functionality into a separate function for reuse Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 20/30] lib: introduce support for storing code tag context Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 21/30] lib: implement context capture support for page and slab allocators Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 22/30] Code tagging based fault injection Suren Baghdasaryan
2022-08-31  1:51   ` Randy Dunlap
2022-08-31 15:56     ` Suren Baghdasaryan
2022-08-31 10:37   ` Dmitry Vyukov
2022-08-31 15:51     ` Suren Baghdasaryan
2022-08-31 17:30     ` Kent Overstreet
2022-09-01  8:43       ` Dmitry Vyukov
2022-08-30 21:49 ` [RFC PATCH 23/30] timekeeping: Add a missing include Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 24/30] wait: Clean up waitqueue_entry initialization Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 25/30] lib/time_stats: New library for statistics on events Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 26/30] bcache: Convert to lib/time_stats Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 27/30] Code tagging based latency tracking Suren Baghdasaryan
2022-08-31  1:53   ` Randy Dunlap
2022-08-31 15:55     ` Suren Baghdasaryan
2022-09-01  7:11   ` Peter Zijlstra
2022-09-01 14:43     ` Kent Overstreet
2022-09-01 21:38   ` Steven Rostedt
2022-09-01 21:46     ` Steven Rostedt
2022-09-01 21:54     ` Kent Overstreet
2022-09-01 22:34       ` Steven Rostedt
2022-09-01 22:55         ` Kent Overstreet
2022-09-02  0:23           ` Steven Rostedt
2022-09-02  1:35             ` Kent Overstreet
2022-09-02  1:59               ` Steven Rostedt
2022-08-30 21:49 ` [RFC PATCH 28/30] Improved symbolic error names Suren Baghdasaryan
2022-09-01 23:19   ` Joe Perches
2022-09-01 23:26     ` Kent Overstreet
2022-08-30 21:49 ` [RFC PATCH 29/30] dyndbg: Convert to code tagging Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 30/30] MAINTAINERS: Add entries for code tagging & related Suren Baghdasaryan
2022-08-31  7:38 ` [RFC PATCH 00/30] Code tagging framework and applications Peter Zijlstra
2022-08-31  8:42   ` Kent Overstreet
2022-08-31 10:19     ` Mel Gorman
2022-08-31 10:47       ` Michal Hocko
2022-08-31 15:28         ` Suren Baghdasaryan
2022-08-31 16:48           ` Suren Baghdasaryan
2022-08-31 19:01         ` Kent Overstreet
2022-08-31 20:56           ` Yosry Ahmed
2022-08-31 21:38             ` Suren Baghdasaryan
2022-09-01 22:27             ` Roman Gushchin
2022-09-01 22:37               ` Kent Overstreet
2022-09-01 22:53                 ` Roman Gushchin
2022-09-01 23:36                   ` Suren Baghdasaryan
2022-09-02  0:17                   ` Kent Overstreet
2022-09-02  1:04                     ` Roman Gushchin
2022-09-02  1:16                       ` Kent Overstreet
2022-09-02 12:02                       ` Jens Axboe
2022-09-02 19:48                         ` Kent Overstreet
2022-09-02 19:53                           ` Jens Axboe
2022-09-02 20:05                             ` Kent Overstreet
2022-09-02 20:23                               ` Jens Axboe
2022-09-01  7:18           ` Michal Hocko
2022-09-01 15:33             ` Suren Baghdasaryan
2022-09-01 19:15               ` Michal Hocko
2022-09-01 19:39                 ` Suren Baghdasaryan
2022-09-01 20:15                   ` Kent Overstreet
2022-09-05  8:49                     ` Michal Hocko
2022-09-05 23:46                       ` Kent Overstreet
2022-09-06  7:23                         ` Michal Hocko
2022-09-06 18:20                           ` Kent Overstreet
2022-09-07 11:00                             ` Michal Hocko
2022-09-07 13:04                               ` Kent Overstreet
2022-09-07 13:45                                 ` Steven Rostedt
2022-09-08  6:35                                   ` Kent Overstreet
2022-09-08  6:49                                     ` Suren Baghdasaryan
2022-09-08  7:07                                       ` Kent Overstreet
2022-09-08  7:12                                     ` Michal Hocko
2022-09-08  7:29                                       ` Kent Overstreet
2022-09-08  7:47                                         ` Michal Hocko
2022-09-05  1:32                 ` Suren Baghdasaryan
2022-09-05  8:12                   ` Michal Hocko
2022-09-05  8:58                     ` Marco Elver
2022-09-05 18:07                       ` Suren Baghdasaryan
2022-09-05 18:03                     ` Suren Baghdasaryan
2022-09-06  8:01                       ` Michal Hocko
2022-09-06 15:35                         ` Suren Baghdasaryan
2022-09-05 15:07                   ` Steven Rostedt
2022-09-05 18:08                     ` Suren Baghdasaryan
2022-09-05 20:42                       ` Kent Overstreet
2022-09-05 22:16                         ` Steven Rostedt
2022-09-05 23:50                           ` Kent Overstreet
2022-09-01  8:05           ` David Hildenbrand
2022-09-01 14:23             ` Kent Overstreet
2022-09-01 15:07               ` David Hildenbrand
2022-09-01 15:39                 ` Suren Baghdasaryan
2022-09-01 15:48                 ` Kent Overstreet
2022-08-31 15:59       ` Kent Overstreet
2022-09-01  7:05         ` Peter Zijlstra
2022-09-01  7:36           ` Daniel Bristot de Oliveira
2022-09-01  7:42           ` Peter Zijlstra
2022-09-01 11:05         ` Mel Gorman
2022-09-01 16:31           ` Kent Overstreet [this message]
2022-09-01  7:00       ` Peter Zijlstra
2022-09-01 14:29         ` Kent Overstreet
2022-09-05 18:44       ` Nadav Amit
2022-09-05 19:16         ` Steven Rostedt
2022-09-01  4:52 ` Oscar Salvador
2022-09-01  5:05   ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220901163155.sz4dqtubicdvzmsw@moria.home.lan \
    --to=kent.overstreet@linux.dev \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=changbin.du@intel.com \
    --cc=cl@linux.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=io-uring@vger.kernel.org \
    --cc=iommu@lists.linux.dev \
    --cc=jbaron@akamai.com \
    --cc=juri.lelli@redhat.com \
    --cc=kaleshsingh@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kernel-team@android.com \
    --cc=ldufour@linux.ibm.com \
    --cc=liam.howlett@oracle.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=masahiroy@kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=minchan@google.com \
    --cc=nathan@kernel.org \
    --cc=penberg@kernel.org \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=rostedt@goodmis.org \
    --cc=shakeelb@google.com \
    --cc=songmuchun@bytedance.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=vincent.guittot@linaro.org \
    --cc=void@manifault.com \
    --cc=vschneid@redhat.com \
    --cc=willy@infradead.org \
    --cc=xen-devel@lists.xenproject.org \
    --cc=ytcoode@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).