From: Vlastimil Babka
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	kasan-dev@googlegroups.com, Vlastimil Babka, Dmitry Vyukov,
	Marco Elver, Vijayanand Jitta, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Daniel Vetter, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Geert Uytterhoeven,
	Oliver Glitta, Imran Khan
Subject: [PATCH v2] lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
Date: Tue, 12 Oct 2021 11:06:21 +0200
Message-Id: <20211012090621.1357-1-vbabka@suse.cz>
X-Mailer: git-send-email 2.33.0

Currently, enabling CONFIG_STACKDEPOT means its stack_table will be allocated
from memblock, even if stack depot ends up not actually used. The default size
of stack_table is 4MB on 32-bit, 8MB on 64-bit.

This is fine for use-cases such as KASAN which is also a config option and
has overhead on its own. But it's an issue for functionality that has to be
actually enabled on boot (page_owner) or depends on hardware (GPU drivers)
and thus the memory might be wasted.
This was raised as an issue [1] when attempting to add stackdepot
support for SLUB's debug object tracking functionality. It's common to
build kernels with CONFIG_SLUB_DEBUG and enable slub_debug on boot only
when needed, or create only specific kmem caches with debugging for
testing purposes.

It would thus be more efficient if stackdepot's table was allocated only when
actually going to be used. This patch thus makes the allocation (and the whole
stack_depot_init() call) optional:

- Add a CONFIG_STACKDEPOT_ALWAYS_INIT flag to keep using the current
  well-defined point of allocation as part of mm_init(). Make CONFIG_KASAN
  select this flag.
- Other users have to call stack_depot_init() as part of their own init when
  it's determined that stack depot will actually be used. This may depend on
  both config and runtime conditions. Convert current users which are
  page_owner and several in the DRM subsystem. The same will be done for SLUB
  later.
- Because the init might now be called after the boot-time memblock allocator
  has given all memory to the buddy allocator, change stack_depot_init() to
  allocate stack_table with kvmalloc() when memblock is no longer available.
  Also handle allocation failure by disabling stackdepot (could have
  theoretically happened even with memblock allocation previously), and don't
  unnecessarily align the memblock allocation to its own size anymore.

[1] https://lore.kernel.org/all/CAMuHMdW=eoVzM1Re5FVoEN87nKfiLmM2+Ah7eNu2KXEhCvbZyA@mail.gmail.com/

Signed-off-by: Vlastimil Babka
Acked-by: Dmitry Vyukov
Cc: Marco Elver
Cc: Vijayanand Jitta
Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: David Airlie
Cc: Daniel Vetter
Cc: Andrey Ryabinin
Cc: Alexander Potapenko
Cc: Andrey Konovalov
Cc: Dmitry Vyukov
Cc: Geert Uytterhoeven
Cc: Oliver Glitta
Cc: Imran Khan
---
Changes in v2:
- Rebase to v5.15-rc5.
- Stylistic changes suggested by Marco Elver.
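
The init flow described in the changelog above (idempotent init under a
mutex, allocate once, disable on failure) can be sketched as a small
userspace model. This is only an illustration under stated assumptions and
not the kernel code: a pthread mutex stands in for the kernel mutex, malloc()
stands in for both kvmalloc() and memblock_alloc(), and every model_* name is
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: every identifier here is hypothetical, not kernel API. */
#define MODEL_HASH_SIZE 1024

static pthread_mutex_t model_init_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool model_disabled;      /* plays the role of stack_depot_disable */
static void **model_table;       /* plays the role of stack_table */
static bool model_fail_alloc;    /* test hook: make allocations fail */
static int model_alloc_calls;    /* number of successful table allocations */

/* Stand-in for the kvmalloc()/memblock_alloc() choice; here both are malloc(). */
static void *model_alloc(size_t size)
{
	void *p;

	if (model_fail_alloc)
		return NULL;
	p = malloc(size);
	if (p)
		model_alloc_calls++;
	return p;
}

/*
 * Safe to call from several users: only the first successful call allocates.
 * Returns 0 when the table exists or the depot is disabled, -1 when this
 * call's allocation failed (which also disables the depot).
 */
int model_depot_init(void)
{
	int ret = 0;

	pthread_mutex_lock(&model_init_mutex);
	if (!model_disabled && model_table == NULL) {
		model_table = model_alloc(MODEL_HASH_SIZE * sizeof(*model_table));
		if (model_table) {
			for (size_t i = 0; i < MODEL_HASH_SIZE; i++)
				model_table[i] = NULL;
		} else {
			model_disabled = true;
			ret = -1;
		}
	}
	pthread_mutex_unlock(&model_init_mutex);
	return ret;
}

/* Test helper: return the model to its pristine, uninitialised state. */
void model_reset(void)
{
	pthread_mutex_lock(&model_init_mutex);
	free(model_table);
	model_table = NULL;
	model_disabled = false;
	model_fail_alloc = false;
	model_alloc_calls = 0;
	pthread_mutex_unlock(&model_init_mutex);
}

/* Usage example: exercises the success path, idempotence, and the failure path. */
void model_selftest(void)
{
	model_reset();
	assert(model_depot_init() == 0);   /* first caller allocates */
	assert(model_depot_init() == 0);   /* second caller is a no-op */
	assert(model_alloc_calls == 1);

	model_reset();
	model_fail_alloc = true;
	assert(model_depot_init() == -1);  /* failure disables the depot */
	model_fail_alloc = false;
	assert(model_depot_init() == 0);   /* disabled: returns 0, no retry */
	assert(model_table == NULL);
	model_reset();
}
```

In this model, as in the patch, a caller that arrives after a failed
allocation gets 0 back and is expected to find the depot disabled rather
than trigger another allocation attempt.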
 drivers/gpu/drm/drm_dp_mst_topology.c   |  1 +
 drivers/gpu/drm/drm_mm.c                |  4 ++++
 drivers/gpu/drm/i915/intel_runtime_pm.c |  3 +++
 include/linux/stackdepot.h              | 25 ++++++++++++-------
 init/main.c                             |  2 +-
 lib/Kconfig                             |  4 ++++
 lib/Kconfig.kasan                       |  2 +-
 lib/stackdepot.c                        | 32 +++++++++++++++++++++----
 mm/page_owner.c                         |  2 ++
 9 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 86d13d6bc463..b0ebdc843a00 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -5493,6 +5493,7 @@ int drm_dp_mst_topology_mgr_init(struct drm_dp_mst_topology_mgr *mgr,
 	mutex_init(&mgr->probe_lock);
 #if IS_ENABLED(CONFIG_DRM_DEBUG_DP_MST_TOPOLOGY_REFS)
 	mutex_init(&mgr->topology_ref_history_lock);
+	stack_depot_init();
 #endif
 	INIT_LIST_HEAD(&mgr->tx_msg_downq);
 	INIT_LIST_HEAD(&mgr->destroy_port_list);
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 93d48a6f04ab..5916228ea0c9 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -983,6 +983,10 @@ void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)
 	add_hole(&mm->head_node);
 
 	mm->scan_active = 0;
+
+#ifdef CONFIG_DRM_DEBUG_MM
+	stack_depot_init();
+#endif
 }
 EXPORT_SYMBOL(drm_mm_init);
 
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index eaf7688f517d..d083506986e1 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -78,6 +78,9 @@ static void __print_depot_stack(depot_stack_handle_t stack,
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
 	spin_lock_init(&rpm->debug.lock);
+
+	if (rpm->available)
+		stack_depot_init();
 }
 
 static noinline depot_stack_handle_t
diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 6bb4bc1a5f54..40fc5e92194f 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -13,6 +13,22 @@
 
 typedef u32 depot_stack_handle_t;
 
+/*
+ * Every user of stack depot has to call this during its own init when it's
+ * decided that it will be calling stack_depot_save() later.
+ *
+ * The alternative is to select STACKDEPOT_ALWAYS_INIT to have stack depot
+ * enabled as part of mm_init(), for subsystems where it's known at compile time
+ * that stack depot will be used.
+ */
+int stack_depot_init(void);
+
+#ifdef CONFIG_STACKDEPOT_ALWAYS_INIT
+static inline int stack_depot_early_init(void)	{ return stack_depot_init(); }
+#else
+static inline int stack_depot_early_init(void)	{ return 0; }
+#endif
+
 depot_stack_handle_t stack_depot_save(unsigned long *entries,
 				      unsigned int nr_entries, gfp_t gfp_flags);
 
@@ -21,13 +37,4 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 
 unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries);
 
-#ifdef CONFIG_STACKDEPOT
-int stack_depot_init(void);
-#else
-static inline int stack_depot_init(void)
-{
-	return 0;
-}
-#endif /* CONFIG_STACKDEPOT */
-
 #endif
diff --git a/init/main.c b/init/main.c
index 81a79a77db46..ca2765c8e45c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -842,7 +842,7 @@ static void __init mm_init(void)
 	init_mem_debugging_and_hardening();
 	kfence_alloc_pool();
 	report_meminit();
-	stack_depot_init();
+	stack_depot_early_init();
 	mem_init();
 	mem_init_print_info();
 	/* page_owner must be initialized after buddy is ready */
diff --git a/lib/Kconfig b/lib/Kconfig
index 5e7165e6a346..9d0569084152 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -671,6 +671,10 @@ config STACKDEPOT
 	bool
 	select STACKTRACE
 
+config STACKDEPOT_ALWAYS_INIT
+	bool
+	select STACKDEPOT
+
 config STACK_HASH_ORDER
 	int "stack depot hash size (12 => 4KB, 20 => 1024KB)"
 	range 12 20
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index cdc842d090db..879757b6dd14 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -38,7 +38,7 @@ menuconfig KASAN
 		    CC_HAS_WORKING_NOSANITIZE_ADDRESS) || \
 		   HAVE_ARCH_KASAN_HW_TAGS
 	depends on (SLUB && SYSFS) || (SLAB && !DEBUG_SLAB)
-	select STACKDEPOT
+	select STACKDEPOT_ALWAYS_INIT
 	help
 	  Enables KASAN (KernelAddressSANitizer) - runtime memory debugger,
 	  designed to find out-of-bounds accesses and use-after-free bugs.
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 0a2e417f83cb..9bb5333bf02f 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -146,6 +147,7 @@ static struct stack_record *depot_alloc_stack(unsigned long *entries, int size,
 #define STACK_HASH_MASK (STACK_HASH_SIZE - 1)
 #define STACK_HASH_SEED 0x9747b28c
 
+DEFINE_MUTEX(stack_depot_init_mutex);
 static bool stack_depot_disable;
 static struct stack_record **stack_table;
 
@@ -162,18 +164,38 @@ static int __init is_stack_depot_disabled(char *str)
 }
 early_param("stack_depot_disable", is_stack_depot_disabled);
 
-int __init stack_depot_init(void)
+/*
+ * __ref because of memblock_alloc(), which will not be actually called after
+ * the __init code is gone, because at that point slab_is_available() is true
+ */
+__ref int stack_depot_init(void)
 {
-	if (!stack_depot_disable) {
+	mutex_lock(&stack_depot_init_mutex);
+	if (!stack_depot_disable && stack_table == NULL) {
 		size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
 		int i;
 
-		stack_table = memblock_alloc(size, size);
-		for (i = 0; i < STACK_HASH_SIZE; i++)
-			stack_table[i] = NULL;
+		if (slab_is_available()) {
+			pr_info("Stack Depot allocating hash table with kvmalloc\n");
+			stack_table = kvmalloc(size, GFP_KERNEL);
+		} else {
+			pr_info("Stack Depot allocating hash table with memblock_alloc\n");
+			stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
+		}
+		if (stack_table) {
+			for (i = 0; i < STACK_HASH_SIZE; i++)
+				stack_table[i] = NULL;
+		} else {
+			pr_err("Stack Depot failed hash table allocation, disabling\n");
+			stack_depot_disable = true;
+			mutex_unlock(&stack_depot_init_mutex);
+			return -ENOMEM;
+		}
 	}
+	mutex_unlock(&stack_depot_init_mutex);
 	return 0;
 }
+EXPORT_SYMBOL_GPL(stack_depot_init);
 
 /* Calculate hash for a stack */
 static inline u32 hash_stack(unsigned long *entries, unsigned int size)
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 62402d22539b..16a0ef903384 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -80,6 +80,8 @@ static void init_page_owner(void)
 	if (!page_owner_enabled)
 		return;
 
+	stack_depot_init();
+
 	register_dummy_stack();
 	register_failure_stack();
 	register_early_stack();
-- 
2.33.0