From: Oscar Salvador <osalvador@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
Marco Elver <elver@google.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Alexander Potapenko <glider@google.com>,
Oscar Salvador <osalvador@suse.de>
Subject: [PATCH v9 0/7] page_owner: print stacks and their outstanding allocations
Date: Wed, 14 Feb 2024 18:01:50 +0100
Message-ID: <20240214170157.17530-1-osalvador@suse.de>
Changes v8 -> v9
- Fix handle-0 for the very first stack_record entry
- Collect Acked-by and Reviewed-by from Marco and Vlastimil
- Addressed feedback from Marco and Vlastimil
- stack_print() no longer allocates a memory buffer; it prints directly
using seq_printf (suggested by Vlastimil)
- Added two static struct stack for dummy_handle and failure_handle
- add_stack_record_to_list() now filters out the gfp_mask the same way
stackdepot does, for consistency
- Rename set_threshold to count_threshold
Changes v7 -> v8
- Rebased on top of -next
- page_owner maintains its own stack_records list now
- Kill auxiliary stackdepot function to traverse buckets
- page_owner_stacks is now a directory with 'show_stacks'
and 'set_threshold'
- Update Documentation/mm/page_owner.rst
- Addressed feedback from Marco
Changes v6 -> v7:
- Rebased on top of Andrey Konovalov's libstackdepot patchset
- Reformulated the changelogs
Changes v5 -> v6:
- Rebase on top of v6.7-rc1
- Move stack_record struct to the header
- Addressed feedback from Vlastimil
(some code tweaks and changelogs suggestions)
Changes v4 -> v5:
- Addressed feedback from Alexander Potapenko
Changes v3 -> v4:
- Rebase (a long time has passed)
- Use boolean instead of enum for action by Alexander Potapenko
- (I left some feedback untouched because it has been a long time and
I would like to discuss it here now instead of reviving
an old thread)
Changes v2 -> v3:
- Replace interface in favor of seq operations
(suggested by Vlastimil)
- Use debugfs interface to store/read values (suggested by Ammar)
page_owner is a great debugging tool that lets us know
about all pages that have been allocated/freed and their specific
stacktrace.
This comes in very handy when debugging memory leaks, since with
some scripting we can see the outstanding allocations, which might point
to a memory leak.
In my experience, that is one of the most useful cases, but it can get
really tedious to screen through all pages and try to reconstruct the
stack <-> allocated/freed relationship. Most of the time it becomes a
daunting and slow process when we have tons of allocation/free operations.
This patchset aims to ease that by adding new functionality to
page_owner.
This functionality creates a new directory called 'page_owner_stacks'
under '/sys/kernel/debug' with a read-only file called 'show_stacks',
which prints out all the stacks followed by their outstanding number
of allocations (that is, the number of times the stacktrace has
allocated a page that has not been freed yet).
This gives us a clear and quick overview of stacks <-> allocated/free.
We take advantage of the new refcount_t field that the stack_record struct
gained, and increment/decrement the stack refcount on every
__set_page_owner() (alloc operation) and __reset_page_owner() (free operation)
call.
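The counting scheme described above can be pictured with a small userspace
model. This is only an illustrative sketch in Python, not the kernel code:
the StackCounter class and its method names are hypothetical, and a stack is
represented as a plain tuple of frame strings.

```python
from collections import Counter


class StackCounter:
    """Toy model: one counter per unique stack trace (hypothetical names)."""

    def __init__(self):
        # stack (tuple of frame strings) -> outstanding allocations
        self.counts = Counter()

    def on_alloc(self, stack):
        # Models the increment done on the allocation path.
        self.counts[stack] += 1

    def on_free(self, stack):
        # Models the decrement done on the free path; never goes below 0.
        if self.counts[stack] > 0:
            self.counts[stack] -= 1

    def outstanding(self, stack):
        # Allocations made from this stack that have not been freed yet.
        return self.counts[stack]


if __name__ == "__main__":
    counter = StackCounter()
    stack = ("prep_new_page", "get_page_from_freelist", "__alloc_pages")
    counter.on_alloc(stack)
    counter.on_alloc(stack)
    counter.on_free(stack)
    print(counter.outstanding(stack))
```

The value reported per stack is simply allocations minus frees seen so far
for that stack.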
Unfortunately, we cannot use the new stackdepot API flag
STACK_DEPOT_FLAG_GET because it does not fulfill page_owner's needs,
meaning we would have to special-case things, at which point
it makes more sense for page_owner to do its own incrementing and
decrementing of the stacks.
E.g.: with STACK_DEPOT_FLAG_PUT, once the refcount reaches 0,
the stack gets evicted, so page_owner would lose information.
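One way to picture the difference is the toy model below. It is a sketch
under assumptions, not stackdepot's implementation: EvictingDepot and
KeepingCounter are hypothetical classes that only contrast "drop the record
at refcount 0" with "keep the record and just track the count".

```python
class EvictingDepot:
    """Toy model: a record disappears once its refcount drops to 0."""

    def __init__(self):
        self.records = {}  # stack -> refcount

    def get(self, stack):
        self.records[stack] = self.records.get(stack, 0) + 1

    def put(self, stack):
        self.records[stack] -= 1
        if self.records[stack] == 0:
            # The stack trace itself is gone from now on.
            del self.records[stack]


class KeepingCounter:
    """Toy model: the record stays around, only the count changes."""

    def __init__(self):
        self.records = {}  # stack -> count

    def inc(self, stack):
        self.records[stack] = self.records.get(stack, 0) + 1

    def dec(self, stack):
        # The record remains available even when the count reaches 0.
        self.records[stack] -= 1


if __name__ == "__main__":
    stack = ("prep_new_page", "__alloc_pages")
    evicting, keeping = EvictingDepot(), KeepingCounter()
    evicting.get(stack)
    evicting.put(stack)
    keeping.inc(stack)
    keeping.dec(stack)
    print(stack in evicting.records, stack in keeping.records)
```

In the first model the trace can no longer be looked up after the last put,
which is the kind of information loss the paragraph above refers to.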
This patchset also creates a new file called 'count_threshold' within
the 'page_owner_stacks' directory. By writing a value to it, the stacks
whose refcount is below that value will be filtered out.
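The filtering rule can be sketched as follows. This is an illustrative
Python model rather than the kernel implementation; filter_by_threshold is a
hypothetical helper, and treating a count equal to the threshold as "kept"
is an assumption drawn from the wording "below such value".

```python
def filter_by_threshold(stack_counts, threshold):
    """Keep only stacks whose outstanding count is not below the threshold.

    stack_counts maps a stack identifier to its outstanding allocations.
    """
    return {
        stack: count
        for stack, count in stack_counts.items()
        if count >= threshold
    }


if __name__ == "__main__":
    # Counts taken from the example output in this cover letter; the keys
    # are just short labels for the stacks.
    counts = {"read": 521, "write": 4609, "pwrite64": 6781, "pcpu": 8641}
    print(filter_by_threshold(counts, 5000))
```

With a threshold of 5000, only the two stacks with counts 6781 and 8641
remain.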
A PoC can be found below:
# cat /sys/kernel/debug/page_owner_stacks/show_stacks > page_owner_full_stacks.txt
# head -40 page_owner_full_stacks.txt
prep_new_page+0xa9/0x120
get_page_from_freelist+0x801/0x2210
__alloc_pages+0x18b/0x350
alloc_pages_mpol+0x91/0x1f0
folio_alloc+0x14/0x50
filemap_alloc_folio+0xb2/0x100
page_cache_ra_unbounded+0x96/0x180
filemap_get_pages+0xfd/0x590
filemap_read+0xcc/0x330
blkdev_read_iter+0xb8/0x150
vfs_read+0x285/0x320
ksys_read+0xa5/0xe0
do_syscall_64+0x80/0x160
entry_SYSCALL_64_after_hwframe+0x6e/0x76
stack_count: 521
prep_new_page+0xa9/0x120
get_page_from_freelist+0x801/0x2210
__alloc_pages+0x18b/0x350
alloc_pages_mpol+0x91/0x1f0
folio_alloc+0x14/0x50
filemap_alloc_folio+0xb2/0x100
__filemap_get_folio+0x14a/0x490
ext4_write_begin+0xbd/0x4b0 [ext4]
generic_perform_write+0xc1/0x1e0
ext4_buffered_write_iter+0x68/0xe0 [ext4]
ext4_file_write_iter+0x70/0x740 [ext4]
vfs_write+0x33d/0x420
ksys_write+0xa5/0xe0
do_syscall_64+0x80/0x160
entry_SYSCALL_64_after_hwframe+0x6e/0x76
stack_count: 4609
...
...
# echo 5000 > /sys/kernel/debug/page_owner_stacks/count_threshold
# cat /sys/kernel/debug/page_owner_stacks/show_stacks > page_owner_full_stacks_5000.txt
# head -40 page_owner_full_stacks_5000.txt
prep_new_page+0xa9/0x120
get_page_from_freelist+0x801/0x2210
__alloc_pages+0x18b/0x350
alloc_pages_mpol+0x91/0x1f0
folio_alloc+0x14/0x50
filemap_alloc_folio+0xb2/0x100
__filemap_get_folio+0x14a/0x490
ext4_write_begin+0xbd/0x4b0 [ext4]
generic_perform_write+0xc1/0x1e0
ext4_buffered_write_iter+0x68/0xe0 [ext4]
ext4_file_write_iter+0x70/0x740 [ext4]
vfs_write+0x33d/0x420
ksys_pwrite64+0x75/0x90
do_syscall_64+0x80/0x160
entry_SYSCALL_64_after_hwframe+0x6e/0x76
stack_count: 6781
prep_new_page+0xa9/0x120
get_page_from_freelist+0x801/0x2210
__alloc_pages+0x18b/0x350
pcpu_populate_chunk+0xec/0x350
pcpu_balance_workfn+0x2d1/0x4a0
process_scheduled_works+0x84/0x380
worker_thread+0x12a/0x2a0
kthread+0xe3/0x110
ret_from_fork+0x30/0x50
ret_from_fork_asm+0x1b/0x30
stack_count: 8641
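For post-processing, output in the shape shown above (frame lines followed
by a "stack_count: N" line) can be parsed with a few lines of script. The
sketch below is an assumption-laden example, not part of the patchset:
parse_show_stacks is a hypothetical helper, and it relies only on the
layout visible in the example output.

```python
def parse_show_stacks(text):
    """Parse show_stacks-style text into a list of (frames, count) pairs."""
    stacks = []
    frames = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank separator lines
        if line.startswith("stack_count:"):
            # A count line closes the stack collected so far.
            count = int(line.split(":", 1)[1])
            stacks.append((frames, count))
            frames = []
        else:
            frames.append(line)
    return stacks


if __name__ == "__main__":
    sample = """\
prep_new_page+0xa9/0x120
__alloc_pages+0x18b/0x350
stack_count: 521

prep_new_page+0xa9/0x120
pcpu_populate_chunk+0xec/0x350
stack_count: 8641
"""
    # Print the stacks with the most outstanding allocations first.
    for frames, count in sorted(parse_show_stacks(sample),
                                key=lambda item: item[1], reverse=True):
        print(count, frames[-1])
```

Sorting by count puts the stacks with the most outstanding allocations at
the top, which is usually where one starts when hunting for a leak.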
Oscar Salvador (7):
lib/stackdepot: Fix first entry having a 0-handle
lib/stackdepot: Move stack_record struct definition into the header
mm,page_owner: Maintain own list of stack_records structs
mm,page_owner: Implement the tracking of the stacks count
mm,page_owner: Display all stacks and their count
mm,page_owner: Filter out stacks by a threshold
mm,page_owner: Update Documentation regarding page_owner_stacks
Documentation/mm/page_owner.rst | 45 +++++++
include/linux/stackdepot.h | 58 +++++++++
lib/stackdepot.c | 65 +++--------
mm/page_owner.c | 200 +++++++++++++++++++++++++++++++-
4 files changed, 318 insertions(+), 50 deletions(-)
--
2.43.0