linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>,
	akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org,
	roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net,
	willy@infradead.org, liam.howlett@oracle.com, corbet@lwn.net,
	void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com,
	ldufour@linux.ibm.com, catalin.marinas@arm.com, will@kernel.org,
	arnd@arndb.de, tglx@linutronix.de, mingo@redhat.com,
	dave.hansen@linux.intel.com, x86@kernel.org, peterx@redhat.com,
	david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org,
	masahiroy@kernel.org, nathan@kernel.org, dennis@kernel.org,
	tj@kernel.org, muchun.song@linux.dev, rppt@kernel.org,
	paulmck@kernel.org, pasha.tatashin@soleen.com,
	yosryahmed@google.com, yuzhao@google.com, dhowells@redhat.com,
	hughd@google.com, andreyknvl@gmail.com, keescook@chromium.org,
	ndesaulniers@google.com, gregkh@linuxfoundation.org,
	ebiggers@google.com, ytcoode@gmail.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com,
	vschneid@redhat.com, cl@linux.com, penberg@kernel.org,
	iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com,
	elver@google.com, dvyukov@google.com, shakeelb@google.com,
	songmuchun@bytedance.com, jbaron@akamai.com, rientjes@google.com,
	minchan@google.com, kaleshsingh@google.com,
	kernel-team@android.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-modules@vger.kernel.org,
	kasan-dev@googlegroups.com, cgroups@vger.kernel.org
Subject: Re: [PATCH 00/40] Memory allocation profiling
Date: Wed, 3 May 2023 09:51:49 +0200	[thread overview]
Message-ID: <ZFISlX+mSx4QJDK6@dhcp22.suse.cz> (raw)
In-Reply-To: <ZFIOfb6/jHwLqg6M@moria.home.lan>

On Wed 03-05-23 03:34:21, Kent Overstreet wrote:
> On Wed, May 03, 2023 at 09:25:29AM +0200, Michal Hocko wrote:
> > On Mon 01-05-23 09:54:10, Suren Baghdasaryan wrote:
> > > Memory allocation profiling infrastructure provides a low overhead
> > > mechanism to make all kernel allocations in the system visible. It can be
> > > used to monitor memory usage, track memory hotspots, detect memory leaks,
> > > identify memory regressions.
> > > 
> > > To keep the overhead to the minimum, we record only allocation sizes for
> > > every allocation in the codebase. With that information, if users are
> > > interested in more detailed context for a specific allocation, they can
> > > enable in-depth context tracking, which includes capturing the pid, tgid,
> > > task name, allocation size, timestamp and call stack for every allocation
> > > at the specified code location.
> > [...]
> > > Implementation utilizes a more generic concept of code tagging, introduced
> > > as part of this patchset. Code tag is a structure identifying a specific
> > > location in the source code which is generated at compile time and can be
> > > embedded in an application-specific structure. A number of applications
> > > for code tagging have been presented in the original RFC [1].
> > > Code tagging uses the old trick of "define a special elf section for
> > > objects of a given type so that we can iterate over them at runtime" and
> > > creates a proper library for it. 
> > > 
> > > To profile memory allocations, we instrument page, slab and percpu
> > > allocators to record total memory allocated in the associated code tag at
> > > every allocation in the codebase. Every time an allocation is performed by
> > > an instrumented allocator, the code tag at that location increments its
> > > counter by allocation size. Every time the memory is freed the counter is
> > > decremented. To decrement the counter upon freeing, allocated object needs
> > > a reference to its code tag. Page allocators use page_ext to record this
> > > reference while slab allocators use memcg_data (renamed into more generic
> > > slabobj_ext) of the slab page.
> > [...]
> > > [1] https://lore.kernel.org/all/20220830214919.53220-1-surenb@google.com/
> > [...]
> > >  70 files changed, 2765 insertions(+), 554 deletions(-)
> > 
> > Sorry for cutting the cover considerably but I believe I have quoted the
> > most important/interesting parts here. The approach is not fundamentally
> > different from the previous version [1] and there was a significant
> > discussion around this approach. The cover letter doesn't summarize nor
> > deal with concerns expressed previous AFAICS. So let me bring those up
> > back. At least those I find the most important:
> 
> We covered this previously, I'll just be giving the same answers I did
> before:

Your answers have shown your insight into tracing is very limited. I
have a clear recollection there were many suggestions on how to get what
you need and willingness to help out. Repeating your previous position
will not help much to be honest with you.

> > - This is a big change and it adds a significant maintenance burden
> >   because each allocation entry point needs to be handled specifically.
> >   The cost will grow with the intended coverage especially there when
> >   allocation is hidden in a library code.
> 
> We've made this as clean and simple as posssible: a single new macro
> invocation per allocation function, no calling convention changes (that
> would indeed have been a lot of churn!)

That doesn't really make the concern any less relevant. I believe you
and Suren have made a great effort to reduce the churn as much as
possible but looking at the diffstat the code changes are clearly there
and you have to convince the rest of the community that this maintenance
overhead is really worth it. The above statement hasn't helped to
convinced me to be honest.

> > - It has been brought up that this is duplicating functionality already
> >   available via existing tracing infrastructure. You should make it very
> >   clear why that is not suitable for the job
> 
> Tracing people _claimed_ this, but never demonstrated it.

The burden is on you and Suren. You are proposing the implement an
alternative tracing infrastructure.

> Tracepoints
> exist but the tooling that would consume them to provide this kind of
> information does not exist;

Any reasons why an appropriate tooling cannot be developed?

> it would require maintaining an index of
> _every outstanding allocation_ so that frees could be accounted
> correctly - IOW, it would be _drastically_ higher overhead, so not at
> all comparable.

Do you have any actual data points to prove your claim?

> > - We already have page_owner infrastructure that provides allocation
> >   tracking data. Why it cannot be used/extended?
> 
> Page owner is also very high overhead,

Is there any data to prove that claim? I would be really surprised that
page_owner would give higher overhead than page tagging with profiling
enabled (there is an allocation for each allocation request!!!). We can
discuss the bare bone page tagging comparision to page_owner because of
the full stack unwinding but is that overhead really prohibitively costly?
Can we reduce that by trimming the unwinder information?

> and the output is not very user
> friendly (tracking full call stack means many related overhead gets
> split, not generally what you want), and it doesn't cover slab.

Is this something we cannot do anything about? Have you explored any
potential ways?

> This tracks _all_ memory allocations - slab, page, vmalloc, percpu.

-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2023-05-03  7:52 UTC|newest]

Thread overview: 160+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-01 16:54 [PATCH 00/40] Memory allocation profiling Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 01/40] lib/string_helpers: Drop space in string_get_size's output Suren Baghdasaryan
2023-05-01 18:13   ` Davidlohr Bueso
2023-05-01 19:35     ` Kent Overstreet
2023-05-01 19:57       ` Andy Shevchenko
2023-05-01 21:16         ` Kent Overstreet
2023-05-01 21:33         ` Liam R. Howlett
2023-05-02  0:11           ` Kent Overstreet
2023-05-02  0:53         ` Kent Overstreet
2023-05-02  2:22       ` James Bottomley
2023-05-02  3:17         ` Kent Overstreet
2023-05-02  5:33           ` Andy Shevchenko
2023-05-02  6:21             ` Kent Overstreet
2023-05-02 15:19               ` Andy Shevchenko
2023-05-03  2:07                 ` Kent Overstreet
2023-05-03  6:30                   ` Andy Shevchenko
2023-05-03  7:12                     ` Kent Overstreet
2023-05-03  9:12                       ` Andy Shevchenko
2023-05-03  9:16                         ` Kent Overstreet
2023-05-02 11:42           ` James Bottomley
2023-05-02 22:50             ` Dave Chinner
2023-05-03  9:28               ` Vlastimil Babka
2023-05-03  9:44                 ` Andy Shevchenko
2023-05-03 12:15               ` James Bottomley
2023-05-02  7:55   ` Jani Nikula
2023-05-01 16:54 ` [PATCH 02/40] scripts/kallysms: Always include __start and __stop symbols Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 03/40] fs: Convert alloc_inode_sb() to a macro Suren Baghdasaryan
2023-05-02 12:35   ` Petr Tesařík
2023-05-02 19:57     ` Kent Overstreet
2023-05-02 20:20       ` Petr Tesařík
2023-05-01 16:54 ` [PATCH 04/40] nodemask: Split out include/linux/nodemask_types.h Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 05/40] prandom: Remove unused include Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 06/40] lib/string.c: strsep_no_empty() Suren Baghdasaryan
2023-05-02 12:37   ` Petr Tesařík
2023-05-01 16:54 ` [PATCH 07/40] Lazy percpu counters Suren Baghdasaryan
2023-05-01 19:17   ` Randy Dunlap
2023-05-01 16:54 ` [PATCH 08/40] mm: introduce slabobj_ext to support slab object extensions Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 09/40] mm: introduce __GFP_NO_OBJ_EXT flag to selectively prevent slabobj_ext creation Suren Baghdasaryan
2023-05-02 12:50   ` Petr Tesařík
2023-05-02 18:33     ` Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 10/40] mm/slab: introduce SLAB_NO_OBJ_EXT to avoid obj_ext creation Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 11/40] mm: prevent slabobj_ext allocations for slabobj_ext and kmem_cache objects Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 12/40] slab: objext: introduce objext_flags as extension to page_memcg_data_flags Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 13/40] lib: code tagging framework Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 14/40] lib: code tagging module support Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 15/40] lib: prevent module unloading if memory is not freed Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 16/40] lib: code tagging query helper functions Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 17/40] lib: add allocation tagging support for memory allocation profiling Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 18/40] lib: introduce support for page allocation tagging Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 19/40] change alloc_pages name in dma_map_ops to avoid name conflicts Suren Baghdasaryan
2023-05-02 15:50   ` Petr Tesařík
2023-05-02 18:38     ` Suren Baghdasaryan
2023-05-02 20:09       ` Petr Tesařík
2023-05-02 20:18         ` Kent Overstreet
2023-05-02 20:24         ` Suren Baghdasaryan
2023-05-02 20:39           ` Petr Tesařík
2023-05-02 20:41             ` Suren Baghdasaryan
2023-05-03 16:25   ` Steven Rostedt
2023-05-03 18:03     ` Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 20/40] mm: enable page allocation tagging Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 21/40] mm/page_ext: enable early_page_ext when CONFIG_MEM_ALLOC_PROFILING_DEBUG=y Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 22/40] mm: create new codetag references during page splitting Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 23/40] lib: add codetag reference into slabobj_ext Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 24/40] mm/slab: add allocation accounting into slab allocation and free paths Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 25/40] mm/slab: enable slab allocation tagging for kmalloc and friends Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 26/40] mm/slub: Mark slab_free_freelist_hook() __always_inline Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 27/40] mempool: Hook up to memory allocation profiling Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 28/40] timekeeping: Fix a circular include dependency Suren Baghdasaryan
2023-05-02 15:50   ` Thomas Gleixner
2023-05-01 16:54 ` [PATCH 29/40] mm: percpu: Introduce pcpuobj_ext Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 30/40] mm: percpu: Add codetag reference into pcpuobj_ext Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 31/40] mm: percpu: enable per-cpu allocation tagging Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 32/40] arm64: Fix circular header dependency Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 33/40] move stack capture functionality into a separate function for reuse Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 34/40] lib: code tagging context capture support Suren Baghdasaryan
2023-05-03  7:35   ` Michal Hocko
2023-05-03 15:18     ` Suren Baghdasaryan
2023-05-03 15:26       ` Dave Hansen
2023-05-03 19:45         ` Suren Baghdasaryan
2023-05-04  8:04       ` Michal Hocko
2023-05-04 14:31         ` Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 35/40] lib: implement context capture support for tagged allocations Suren Baghdasaryan
2023-05-03  7:39   ` Michal Hocko
2023-05-03 15:24     ` Suren Baghdasaryan
2023-05-04  8:09       ` Michal Hocko
2023-05-04 16:22         ` Suren Baghdasaryan
2023-05-05  8:40           ` Michal Hocko
2023-05-05 18:10             ` Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 36/40] lib: add memory allocations report in show_mem() Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 37/40] codetag: debug: skip objext checking when it's for objext itself Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 38/40] codetag: debug: mark codetags for reserved pages as empty Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 39/40] codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations Suren Baghdasaryan
2023-05-01 16:54 ` [PATCH 40/40] MAINTAINERS: Add entries for code tagging and memory allocation profiling Suren Baghdasaryan
2023-05-01 17:47 ` [PATCH 00/40] Memory " Roman Gushchin
2023-05-01 18:08   ` Suren Baghdasaryan
2023-05-01 18:14     ` Roman Gushchin
2023-05-01 19:37       ` Kent Overstreet
2023-05-01 21:18         ` Roman Gushchin
2023-05-03  7:25 ` Michal Hocko
2023-05-03  7:34   ` Kent Overstreet
2023-05-03  7:51     ` Michal Hocko [this message]
2023-05-03  8:05       ` Kent Overstreet
2023-05-03 13:21         ` Steven Rostedt
2023-05-03 16:35         ` Tejun Heo
2023-05-03 17:42           ` Suren Baghdasaryan
2023-05-03 18:06             ` Tejun Heo
2023-05-03 17:44           ` Kent Overstreet
2023-05-03 17:51           ` Kent Overstreet
2023-05-03 18:24             ` Tejun Heo
2023-05-03 18:07           ` Johannes Weiner
2023-05-03 18:19             ` Tejun Heo
2023-05-03 18:40               ` Tejun Heo
2023-05-03 18:56                 ` Kent Overstreet
2023-05-03 18:58                   ` Tejun Heo
2023-05-03 19:09                     ` Tejun Heo
2023-05-03 19:41                       ` Suren Baghdasaryan
2023-05-03 19:48                         ` Tejun Heo
2023-05-03 20:00                           ` Tejun Heo
2023-05-03 20:14                             ` Suren Baghdasaryan
2023-05-04  2:25                               ` Tejun Heo
2023-05-04  3:33                                 ` Kent Overstreet
2023-05-04  3:33                                 ` Suren Baghdasaryan
2023-05-04  8:00                               ` Petr Tesařík
2023-05-03 20:08                           ` Suren Baghdasaryan
2023-05-03 20:11                             ` Johannes Weiner
2023-05-04  2:16                             ` Tejun Heo
2023-05-03 20:04           ` Andrey Ryabinin
2023-05-03  9:50       ` Petr Tesařík
2023-05-03  9:54         ` Kent Overstreet
2023-05-03 10:24           ` Petr Tesařík
2023-05-03  9:57         ` Kent Overstreet
2023-05-03 10:26           ` Petr Tesařík
2023-05-03 15:30             ` Kent Overstreet
2023-05-03 12:33           ` James Bottomley
2023-05-03 14:31             ` Suren Baghdasaryan
2023-05-03 15:28             ` Kent Overstreet
2023-05-03 15:37               ` Lorenzo Stoakes
2023-05-03 16:03                 ` Kent Overstreet
2023-05-03 15:49               ` James Bottomley
2023-05-03 15:09   ` Suren Baghdasaryan
2023-05-03 16:28     ` Steven Rostedt
2023-05-03 17:40       ` Suren Baghdasaryan
2023-05-03 18:03         ` Steven Rostedt
2023-05-03 18:07           ` Suren Baghdasaryan
2023-05-03 18:12           ` Kent Overstreet
2023-05-04  9:07     ` Michal Hocko
2023-05-04 15:08       ` Suren Baghdasaryan
2023-05-07 10:27         ` Michal Hocko
2023-05-07 17:01           ` Kent Overstreet
2023-05-07 17:20       ` Kent Overstreet
2023-05-07 20:55         ` Steven Rostedt
2023-05-07 21:53           ` Kent Overstreet
2023-05-07 22:09             ` Steven Rostedt
2023-05-07 22:17               ` Kent Overstreet
2023-05-08 15:52         ` Petr Tesařík
2023-05-08 15:57           ` Kent Overstreet
2023-05-08 16:09             ` Petr Tesařík
2023-05-08 16:28               ` Kent Overstreet
2023-05-08 18:59                 ` Petr Tesařík
2023-05-08 20:48                   ` Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZFISlX+mSx4QJDK6@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=catalin.marinas@arm.com \
    --cc=cgroups@vger.kernel.org \
    --cc=cl@linux.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dennis@kernel.org \
    --cc=dhowells@redhat.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=dvyukov@google.com \
    --cc=ebiggers@google.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=iommu@lists.linux.dev \
    --cc=jbaron@akamai.com \
    --cc=juri.lelli@redhat.com \
    --cc=kaleshsingh@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=keescook@chromium.org \
    --cc=kent.overstreet@linux.dev \
    --cc=kernel-team@android.com \
    --cc=ldufour@linux.ibm.com \
    --cc=liam.howlett@oracle.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=masahiroy@kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mgorman@suse.de \
    --cc=minchan@google.com \
    --cc=mingo@redhat.com \
    --cc=muchun.song@linux.dev \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=paulmck@kernel.org \
    --cc=penberg@kernel.org \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=rostedt@goodmis.org \
    --cc=rppt@kernel.org \
    --cc=shakeelb@google.com \
    --cc=songmuchun@bytedance.com \
    --cc=surenb@google.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=vbabka@suse.cz \
    --cc=vincent.guittot@linaro.org \
    --cc=void@manifault.com \
    --cc=vschneid@redhat.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    --cc=yosryahmed@google.com \
    --cc=ytcoode@gmail.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).