linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 00/22] stackdepot: allow evicting stack traces
@ 2023-11-20 17:46 andrey.konovalov
From: andrey.konovalov @ 2023-11-20 17:46 UTC
  To: Andrew Morton
  Cc: Andrey Konovalov, Marco Elver, Alexander Potapenko,
	Dmitry Vyukov, Vlastimil Babka, kasan-dev, Evgenii Stepanov,
	Oscar Salvador, linux-mm, linux-kernel, Andrey Konovalov

From: Andrey Konovalov <andreyknvl@google.com>

Currently, the stack depot grows indefinitely until it reaches its
capacity. Once that happens, the stack depot stops saving new stack
traces.

This creates a problem for using the stack depot for in-field testing
and in production.

For such uses, an ideal stack trace storage should:

1. Allow saving fresh stack traces on systems with a large uptime while
   limiting the amount of memory used to store the traces;
2. Have a low performance impact.

Implementing #1 in the stack depot is impossible with the current
keep-forever approach. This series aims to address that. Issue #2 is
left to be addressed in a future series.

This series changes the stack depot implementation to allow evicting
unneeded stack traces from the stack depot. The users of the stack depot
can do that via new stack_depot_save_flags(STACK_DEPOT_FLAG_GET) and
stack_depot_put APIs.
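
As a rough illustration of the get/put idea, here is a minimal userspace
model. It is not the kernel implementation: every model_* name, the tiny
slot count, and the linear search are assumptions made for the sketch.
Saving a trace that is already stored takes another reference to the same
record, and a record's slot becomes reusable once the last reference is
put.

```c
#include <stddef.h>
#include <string.h>

/* Userspace model only; model_* names are hypothetical, not kernel APIs. */
#define MODEL_MAX_FRAMES 4
#define MODEL_SLOTS 8

struct model_record {
	unsigned long frames[MODEL_MAX_FRAMES];
	unsigned int nr;       /* number of valid frames */
	unsigned int refcount; /* 0 means the slot is free */
};

static struct model_record model_slots[MODEL_SLOTS];

/*
 * Save a trace and take a reference to it.
 * Returns a handle (slot index + 1), or 0 if no slot is available.
 */
static unsigned int model_save_get(const unsigned long *frames, unsigned int nr)
{
	unsigned int i, free_slot = MODEL_SLOTS;

	if (nr > MODEL_MAX_FRAMES)
		nr = MODEL_MAX_FRAMES; /* the model truncates to the slot size */

	for (i = 0; i < MODEL_SLOTS; i++) {
		struct model_record *r = &model_slots[i];

		if (r->refcount == 0) {
			if (free_slot == MODEL_SLOTS)
				free_slot = i; /* remember the first free slot */
			continue;
		}
		if (r->nr == nr &&
		    memcmp(r->frames, frames, nr * sizeof(*frames)) == 0) {
			r->refcount++; /* same trace: share the record */
			return i + 1;
		}
	}

	if (free_slot == MODEL_SLOTS)
		return 0; /* depot full */

	memcpy(model_slots[free_slot].frames, frames, nr * sizeof(*frames));
	model_slots[free_slot].nr = nr;
	model_slots[free_slot].refcount = 1;
	return free_slot + 1;
}

/* Drop one reference; the slot is reusable once the count hits zero. */
static void model_put(unsigned int handle)
{
	struct model_record *r;

	if (handle == 0 || handle > MODEL_SLOTS)
		return;
	r = &model_slots[handle - 1];
	if (r->refcount > 0)
		r->refcount--;
}

/* Count slots that currently hold a live record. */
static unsigned int model_used_slots(void)
{
	unsigned int i, used = 0;

	for (i = 0; i < MODEL_SLOTS; i++)
		if (model_slots[i].refcount > 0)
			used++;
	return used;
}
```

In this model, a user that never calls model_put() gets the old
keep-forever behavior, which is why the put side has to be opted into.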

Internal changes to the stack depot code include:

1. Storing stack traces in fixed-frame-sized slots (vs precisely-sized
   slots in the current implementation); the slot size is controlled via
   CONFIG_STACKDEPOT_MAX_FRAMES (default: 64 frames);
2. Keeping available slots in a freelist (vs keeping an offset to the next
   free slot);
3. Using a read/write lock for synchronization (vs a lock-free approach
   combined with a spinlock).
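
Items 1 and 2 above can be pictured with the following userspace sketch.
It is only a model under stated assumptions: the sketch_* names, the
4-slot pool, and the singly linked freelist are invented for illustration
(the series itself moves to list_head links), and the locking from item 3
is left out entirely.

```c
#include <stddef.h>

/* Hypothetical names; 64 mirrors the default quoted in the cover letter. */
#define SKETCH_MAX_FRAMES 64
#define SKETCH_POOL_SLOTS 4

struct sketch_slot {
	struct sketch_slot *next_free; /* meaningful only while on the freelist */
	unsigned int nr_frames;
	unsigned long frames[SKETCH_MAX_FRAMES]; /* fixed size, however short the trace */
};

struct sketch_pool {
	struct sketch_slot slots[SKETCH_POOL_SLOTS];
	struct sketch_slot *freelist;
};

/* Put every slot on the freelist, lowest index first. */
static void sketch_pool_init(struct sketch_pool *p)
{
	size_t i;

	p->freelist = NULL;
	for (i = SKETCH_POOL_SLOTS; i > 0; i--) {
		p->slots[i - 1].next_free = p->freelist;
		p->freelist = &p->slots[i - 1];
	}
}

/* Pop a slot off the freelist; returns NULL when the pool is exhausted. */
static struct sketch_slot *sketch_slot_alloc(struct sketch_pool *p)
{
	struct sketch_slot *s = p->freelist;

	if (s) {
		p->freelist = s->next_free;
		s->next_free = NULL;
		s->nr_frames = 0;
	}
	return s;
}

/* Push a slot back; any slot fits any later trace, so there is no fragmentation. */
static void sketch_slot_free(struct sketch_pool *p, struct sketch_slot *s)
{
	s->next_free = p->freelist;
	p->freelist = s;
}
```

With a bump offset to the next free slot, a freed record could not be
handed out again; the freelist is what makes freed slots reusable.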

This series also integrates the eviction functionality into KASAN:
the tag-based modes evict stack traces when the corresponding entry
leaves the stack ring, and Generic KASAN evicts stack traces for objects
once those leave the quarantine.
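
The stack ring case can be modeled in userspace roughly as follows. This
is a sketch, not the KASAN code: the toy_* names, the 4-entry ring, and
the plain refcount array standing in for the stack depot are all
assumptions. When a new entry overwrites an old one, the reference held
by the old entry is dropped.

```c
#include <stddef.h>

/* Hypothetical toy model; handle 0 means "no stack trace". */
#define TOY_RING_SIZE 4
#define TOY_MAX_HANDLES 16

static unsigned int toy_refcount[TOY_MAX_HANDLES]; /* stand-in for per-record refcounts */
static unsigned int toy_ring[TOY_RING_SIZE];
static size_t toy_ring_pos;

static void toy_get(unsigned int h)
{
	if (h && h < TOY_MAX_HANDLES)
		toy_refcount[h]++;
}

static void toy_put(unsigned int h)
{
	if (h && h < TOY_MAX_HANDLES && toy_refcount[h])
		toy_refcount[h]--;
}

/* Record a handle in the ring, releasing whatever the slot held before. */
static void toy_ring_record(unsigned int handle)
{
	unsigned int old = toy_ring[toy_ring_pos];

	toy_get(handle); /* take the new reference first ... */
	toy_ring[toy_ring_pos] = handle;
	toy_ring_pos = (toy_ring_pos + 1) % TOY_RING_SIZE;
	toy_put(old);    /* ... then drop the one leaving the ring */
}
```

Taking the new reference before dropping the old one keeps the count
from touching zero when the same trace is recorded over itself.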

With KASAN, despite wasting some space on rounding up the size of each
stack record, the total memory consumed by stack depot gets saturated due
to the eviction of irrelevant stack traces from the stack depot.

With the tag-based KASAN modes, the average total amount of memory used
for stack traces becomes ~0.5 MB (with the current default stack ring size
of 32k entries and the default CONFIG_STACKDEPOT_MAX_FRAMES of 64). With
Generic KASAN, the stack traces take up ~1 MB per 1 GB of RAM (as the
quarantine's size depends on the amount of RAM).

However, with KMSAN, the stack depot ends up using ~4x more memory per
stack trace than before. Thus, for KMSAN, the stack depot capacity is
increased accordingly. KMSAN uses a lot of RAM for shadow memory anyway,
so the increased stack depot memory usage will not make a significant
difference.
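
To make the rounding-up cost concrete, the arithmetic can be sketched as
below. The ex_* helpers are hypothetical, record headers are ignored, and
the 16-frame trace depth is an assumed value picked only because it makes
the ratio come out to 4; it is not a number reported in this series.

```c
#include <stddef.h>

/* Hypothetical helpers; 64 mirrors the default CONFIG_STACKDEPOT_MAX_FRAMES. */
#define EX_MAX_FRAMES 64

/* Bytes needed for the frames of a precisely-sized record (header ignored). */
static size_t ex_precise_bytes(size_t nr_frames)
{
	return nr_frames * sizeof(unsigned long);
}

/* Bytes taken by the frames of a fixed-size slot, whatever the trace depth. */
static size_t ex_fixed_bytes(void)
{
	return EX_MAX_FRAMES * sizeof(unsigned long);
}

/* Ratio of fixed to precise storage for a trace of nr_frames (nr_frames > 0). */
static size_t ex_overhead_ratio(size_t nr_frames)
{
	return ex_fixed_bytes() / ex_precise_bytes(nr_frames);
}
```

Under these assumptions, a 16-frame trace costs 4 times as much in a
fixed slot as in a precisely-sized one, while a 64-frame trace costs the
same in both.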

Other users of the stack depot do not save stack traces as often as KASAN
and KMSAN. Thus, the increased memory usage is taken as an acceptable
trade-off. In the future, these other users can take advantage of the
eviction API to limit the memory waste.

There is no measurable boot time performance impact of these changes for
KASAN on x86-64. I haven't done any tests for arm64 modes (the stack
depot without performance optimizations is not suitable for the intended
use of those anyway), but I expect a similar result. Obtaining and copying
stack trace frames when saving them into stack depot is what takes the
most time.

This series does not yet provide a way to configure the maximum size of
the stack depot externally (e.g. via a command-line parameter). This will
be added in a separate series, possibly together with the performance
improvement changes.

---

Changes v3->v4:
- Rebase onto 6.7-rc2.
- Fix lockdep annotation in depot_fetch_stack.
- New patch: "kasan: use stack_depot_put for Generic mode" (was sent for
  review separately but now merged into this series).
- New patch: "lib/stackdepot: print disabled message only if truly
  disabled" (was sent for review separately but now merged into this
  series).
- New patch: "lib/stackdepot: adjust DEPOT_POOLS_CAP for KMSAN".

Changes v2->v3:
- Fix null-ptr-deref by using the proper number of entries for
  initializing the stack table when alloc_large_system_hash()
  auto-calculates the number (see patch #12).
- Keep STACKDEPOT/STACKDEPOT_ALWAYS_INIT Kconfig options not configurable
  by users.
- Use lockdep_assert_held_read annotation in depot_fetch_stack.
- WARN_ON invalid flags in stack_depot_save_flags.
- Move "../slab.h" include in mm/kasan/report_tags.c into the right patch.
- Various comment fixes.

Changes v1->v2:
- Rework API to stack_depot_save_flags(STACK_DEPOT_FLAG_GET) +
  stack_depot_put.
- Add CONFIG_STACKDEPOT_MAX_FRAMES Kconfig option.
- Switch stack depot to using list_head's.
- Assorted minor changes, see the commit message for each patch.

Andrey Konovalov (22):
  lib/stackdepot: print disabled message only if truly disabled
  lib/stackdepot: check disabled flag when fetching
  lib/stackdepot: simplify __stack_depot_save
  lib/stackdepot: drop valid bit from handles
  lib/stackdepot: add depot_fetch_stack helper
  lib/stackdepot: use fixed-sized slots for stack records
  lib/stackdepot: fix and clean-up atomic annotations
  lib/stackdepot: rework helpers for depot_alloc_stack
  lib/stackdepot: rename next_pool_required to new_pool_required
  lib/stackdepot: store next pool pointer in new_pool
  lib/stackdepot: store free stack records in a freelist
  lib/stackdepot: use read/write lock
  lib/stackdepot: use list_head for stack record links
  kmsan: use stack_depot_save instead of __stack_depot_save
  lib/stackdepot, kasan: add flags to __stack_depot_save and rename
  lib/stackdepot: add refcount for records
  lib/stackdepot: allow users to evict stack traces
  kasan: remove atomic accesses to stack ring entries
  kasan: check object_size in kasan_complete_mode_report_info
  kasan: use stack_depot_put for tag-based modes
  kasan: use stack_depot_put for Generic mode
  lib/stackdepot: adjust DEPOT_POOLS_CAP for KMSAN

 include/linux/stackdepot.h |  59 ++++-
 lib/Kconfig                |  10 +
 lib/stackdepot.c           | 452 ++++++++++++++++++++++++-------------
 mm/kasan/common.c          |   8 +-
 mm/kasan/generic.c         |  27 ++-
 mm/kasan/kasan.h           |   2 +-
 mm/kasan/quarantine.c      |  26 ++-
 mm/kasan/report_tags.c     |  27 +--
 mm/kasan/tags.c            |  24 +-
 mm/kmsan/core.c            |   7 +-
 10 files changed, 427 insertions(+), 215 deletions(-)

-- 
2.25.1



end of thread, other threads:[~2024-01-24 16:24 UTC | newest]

Thread overview: 56+ messages
2023-11-20 17:46 [PATCH v4 00/22] stackdepot: allow evicting stack traces andrey.konovalov
2023-11-20 17:46 ` [PATCH v4 01/22] lib/stackdepot: print disabled message only if truly disabled andrey.konovalov
2024-01-03  8:14   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 02/22] lib/stackdepot: check disabled flag when fetching andrey.konovalov
2024-01-03  8:18   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 03/22] lib/stackdepot: simplify __stack_depot_save andrey.konovalov
2024-01-03  8:19   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 04/22] lib/stackdepot: drop valid bit from handles andrey.konovalov
2024-01-03  8:21   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 05/22] lib/stackdepot: add depot_fetch_stack helper andrey.konovalov
2024-01-03  8:24   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 06/22] lib/stackdepot: use fixed-sized slots for stack records andrey.konovalov
2024-01-03  8:30   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 07/22] lib/stackdepot: fix and clean-up atomic annotations andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 08/22] lib/stackdepot: rework helpers for depot_alloc_stack andrey.konovalov
2024-01-03  8:42   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 09/22] lib/stackdepot: rename next_pool_required to new_pool_required andrey.konovalov
2024-01-03  8:44   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 10/22] lib/stackdepot: store next pool pointer in new_pool andrey.konovalov
2024-01-03  9:06   ` Oscar Salvador
2023-11-20 17:47 ` [PATCH v4 11/22] lib/stackdepot: store free stack records in a freelist andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 12/22] lib/stackdepot: use read/write lock andrey.konovalov
2024-01-03  9:14   ` Oscar Salvador
2024-01-10 23:01     ` Andi Kleen
2024-01-11  9:48       ` Marco Elver
2024-01-11 12:36         ` Andi Kleen
2024-01-11 19:08           ` Marco Elver
2024-01-12  2:38             ` Andrey Konovalov
2024-01-12  8:24               ` Marco Elver
2024-01-12 22:15                 ` Marco Elver
2024-01-13  1:24                   ` Andi Kleen
2024-01-13  9:12                     ` Marco Elver
2024-01-13  9:19                       ` Andi Kleen
2024-01-13  9:23                         ` Marco Elver
2024-01-13  9:30                           ` Marco Elver
2024-01-13  9:31                           ` Andi Kleen
2024-01-24 14:15   ` Breno Leitao
2024-01-24 14:21     ` Marco Elver
2024-01-24 16:24       ` Breno Leitao
2023-11-20 17:47 ` [PATCH v4 13/22] lib/stackdepot: use list_head for stack record links andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 14/22] kmsan: use stack_depot_save instead of __stack_depot_save andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 15/22] lib/stackdepot, kasan: add flags to __stack_depot_save and rename andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 16/22] lib/stackdepot: add refcount for records andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 17/22] lib/stackdepot: allow users to evict stack traces andrey.konovalov
2024-01-04  8:52   ` Oscar Salvador
2024-01-04  9:25     ` Marco Elver
2024-01-04 10:19       ` Oscar Salvador
2024-01-04 10:42         ` Marco Elver
2023-11-20 17:47 ` [PATCH v4 18/22] kasan: remove atomic accesses to stack ring entries andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 19/22] kasan: check object_size in kasan_complete_mode_report_info andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 20/22] kasan: use stack_depot_put for tag-based modes andrey.konovalov
2023-11-20 17:47 ` [PATCH v4 21/22] kasan: use stack_depot_put for Generic mode andrey.konovalov
2023-11-22  3:17   ` [BISECTED] Boot hangs when SLUB_DEBUG_ON=y Hyeonggon Yoo
2023-11-22 12:37     ` [REGRESSION] Boot hangs when SLUB_DEBUG_ON=y and KASAN_GENERIC=y Hyeonggon Yoo
2023-11-22 23:13     ` [BISECTED] Boot hangs when SLUB_DEBUG_ON=y Andrey Konovalov
2023-11-20 17:47 ` [PATCH v4 22/22] lib/stackdepot: adjust DEPOT_POOLS_CAP for KMSAN andrey.konovalov
