All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kasan-dev@googlegroups.com, Qian Cai <cai@lca.pw>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Dmitry Vyukov <dvyukov@google.com>,
	Walter Wu <walter-zh.wu@mediatek.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>
Subject: [PATCH v2 2/3] mm, page_owner: decouple freeing stack trace from debug_pagealloc
Date: Mon, 30 Sep 2019 14:29:15 +0200	[thread overview]
Message-ID: <20190930122916.14969-3-vbabka@suse.cz> (raw)
In-Reply-To: <20190930122916.14969-1-vbabka@suse.cz>

The commit 8974558f49a6 ("mm, page_owner, debug_pagealloc: save and dump
freeing stack trace") enhanced page_owner to also store freeing stack trace,
when debug_pagealloc is also enabled. KASAN would also like to do this [1] to
improve error reports to debug e.g. UAF issues. Kirill has suggested that the
freeing stack trace saving should be also possible to be enabled separately.

This patch therefore introduces a new kernel parameter page_owner_free to
enable the functionality in addition to the existing page_owner parameter.
The free stack saving is thus enabled in these cases:
1) booting with page_owner=on and debug_pagealloc=on
2) booting a KASAN kernel with page_owner=on
3) booting with page_owner=on and page_owner_free=on

To minimize runtime CPU and memory overhead when not boot-time enabled, the
patch introduces a new static key and struct page_ext_operations.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=203967

Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 .../admin-guide/kernel-parameters.txt         |  8 ++
 Documentation/dev-tools/kasan.rst             |  3 +
 include/linux/page_owner.h                    |  1 +
 mm/page_ext.c                                 |  1 +
 mm/page_owner.c                               | 90 +++++++++++++------
 5 files changed, 78 insertions(+), 25 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 944e03e29f65..14dcb66e3457 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3237,6 +3237,14 @@
 			we can turn it on.
 			on: enable the feature
 
+	page_owner_free=
+			[KNL] When enabled together with page_owner, store also
+			the stack of who frees a page, for error page dump
+			purposes. This is also implicitly enabled by
+			debug_pagealloc=on or KASAN, so only page_owner=on is
+			sufficient in those cases.
+			on: enable the feature
+
 	page_poison=	[KNL] Boot-time parameter changing the state of
 			poisoning on the buddy allocator, available with
 			CONFIG_PAGE_POISONING=y.
diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index b72d07d70239..525296121d89 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -41,6 +41,9 @@ smaller binary while the latter is 1.1 - 2 times faster.
 Both KASAN modes work with both SLUB and SLAB memory allocators.
 For better bug detection and nicer reporting, enable CONFIG_STACKTRACE.
 
+To augment reports with last allocation and freeing stack of the physical page,
+it is recommended to enable also CONFIG_PAGE_OWNER and boot with page_owner=on.
+
 To disable instrumentation for specific files or directories, add a line
 similar to the following to the respective kernel Makefile:
 
diff --git a/include/linux/page_owner.h b/include/linux/page_owner.h
index 8679ccd722e8..0888dd70cc61 100644
--- a/include/linux/page_owner.h
+++ b/include/linux/page_owner.h
@@ -7,6 +7,7 @@
 #ifdef CONFIG_PAGE_OWNER
 extern struct static_key_false page_owner_inited;
 extern struct page_ext_operations page_owner_ops;
+extern struct page_ext_operations page_owner_free_ops;
 
 extern void __reset_page_owner(struct page *page, unsigned int order);
 extern void __set_page_owner(struct page *page,
diff --git a/mm/page_ext.c b/mm/page_ext.c
index 4ade843ff588..5724b637939a 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -61,6 +61,7 @@
 static struct page_ext_operations *page_ext_ops[] = {
 #ifdef CONFIG_PAGE_OWNER
 	&page_owner_ops,
+	&page_owner_free_ops,
 #endif
 #if defined(CONFIG_IDLE_PAGE_TRACKING) && !defined(CONFIG_64BIT)
 	&page_idle_ops,
diff --git a/mm/page_owner.c b/mm/page_owner.c
index d3cf5d336ccf..a668a735b9b6 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -24,13 +24,16 @@ struct page_owner {
 	short last_migrate_reason;
 	gfp_t gfp_mask;
 	depot_stack_handle_t handle;
-#ifdef CONFIG_DEBUG_PAGEALLOC
+};
+
+struct page_owner_free {
 	depot_stack_handle_t free_handle;
-#endif
 };
 
-static bool page_owner_disabled = true;
+static bool page_owner_enabled = false;
+static bool page_owner_free_enabled = false;
 DEFINE_STATIC_KEY_FALSE(page_owner_inited);
+static DEFINE_STATIC_KEY_FALSE(page_owner_free_stack);
 
 static depot_stack_handle_t dummy_handle;
 static depot_stack_handle_t failure_handle;
@@ -44,7 +47,7 @@ static int __init early_page_owner_param(char *buf)
 		return -EINVAL;
 
 	if (strcmp(buf, "on") == 0)
-		page_owner_disabled = false;
+		page_owner_enabled = true;
 
 	return 0;
 }
@@ -52,10 +55,30 @@ early_param("page_owner", early_page_owner_param);
 
 static bool need_page_owner(void)
 {
-	if (page_owner_disabled)
+	return page_owner_enabled;
+}
+
+static int __init early_page_owner_free_param(char *buf)
+{
+	if (!buf)
+		return -EINVAL;
+
+	if (strcmp(buf, "on") == 0)
+		page_owner_free_enabled = true;
+
+	return 0;
+}
+early_param("page_owner_free", early_page_owner_free_param);
+
+static bool need_page_owner_free(void) {
+
+	if (!page_owner_enabled)
 		return false;
 
-	return true;
+	if (IS_ENABLED(CONFIG_KASAN) || debug_pagealloc_enabled())
+		page_owner_free_enabled = true;
+
+	return page_owner_free_enabled;
 }
 
 static __always_inline depot_stack_handle_t create_dummy_stack(void)
@@ -84,7 +107,7 @@ static noinline void register_early_stack(void)
 
 static void init_page_owner(void)
 {
-	if (page_owner_disabled)
+	if (!page_owner_enabled)
 		return;
 
 	register_dummy_stack();
@@ -94,17 +117,36 @@ static void init_page_owner(void)
 	init_early_allocated_pages();
 }
 
+static void init_page_owner_free(void)
+{
+	if (!page_owner_enabled || !page_owner_free_enabled)
+		return;
+
+	static_branch_enable(&page_owner_free_stack);
+}
+
 struct page_ext_operations page_owner_ops = {
 	.size = sizeof(struct page_owner),
 	.need = need_page_owner,
 	.init = init_page_owner,
 };
 
+struct page_ext_operations page_owner_free_ops = {
+	.size = sizeof(struct page_owner_free),
+	.need = need_page_owner_free,
+	.init = init_page_owner_free,
+};
+
 static inline struct page_owner *get_page_owner(struct page_ext *page_ext)
 {
 	return (void *)page_ext + page_owner_ops.offset;
 }
 
+static inline struct page_owner_free *get_page_owner_free(struct page_ext *page_ext)
+{
+	return (void *)page_ext + page_owner_free_ops.offset;
+}
+
 static inline bool check_recursive_alloc(unsigned long *entries,
 					 unsigned int nr_entries,
 					 unsigned long ip)
@@ -148,25 +190,21 @@ void __reset_page_owner(struct page *page, unsigned int order)
 {
 	int i;
 	struct page_ext *page_ext;
-#ifdef CONFIG_DEBUG_PAGEALLOC
 	depot_stack_handle_t handle = 0;
-	struct page_owner *page_owner;
+	struct page_owner_free *page_owner_free;
 
-	if (debug_pagealloc_enabled())
+	if (static_branch_unlikely(&page_owner_free_stack))
 		handle = save_stack(GFP_NOWAIT | __GFP_NOWARN);
-#endif
 
 	page_ext = lookup_page_ext(page);
 	if (unlikely(!page_ext))
 		return;
 	for (i = 0; i < (1 << order); i++) {
 		__clear_bit(PAGE_EXT_OWNER_ACTIVE, &page_ext->flags);
-#ifdef CONFIG_DEBUG_PAGEALLOC
-		if (debug_pagealloc_enabled()) {
-			page_owner = get_page_owner(page_ext);
-			page_owner->free_handle = handle;
+		if (static_branch_unlikely(&page_owner_free_stack)) {
+			page_owner_free = get_page_owner_free(page_ext);
+			page_owner_free->free_handle = handle;
 		}
-#endif
 		page_ext = page_ext_next(page_ext);
 	}
 }
@@ -414,6 +452,7 @@ void __dump_page_owner(struct page *page)
 {
 	struct page_ext *page_ext = lookup_page_ext(page);
 	struct page_owner *page_owner;
+	struct page_owner_free *page_owner_free;
 	depot_stack_handle_t handle;
 	unsigned long *entries;
 	unsigned int nr_entries;
@@ -450,16 +489,17 @@ void __dump_page_owner(struct page *page)
 		stack_trace_print(entries, nr_entries, 0);
 	}
 
-#ifdef CONFIG_DEBUG_PAGEALLOC
-	handle = READ_ONCE(page_owner->free_handle);
-	if (!handle) {
-		pr_alert("page_owner free stack trace missing\n");
-	} else {
-		nr_entries = stack_depot_fetch(handle, &entries);
-		pr_alert("page last free stack trace:\n");
-		stack_trace_print(entries, nr_entries, 0);
+	if (static_branch_unlikely(&page_owner_free_stack)) {
+		page_owner_free = get_page_owner_free(page_ext);
+		handle = READ_ONCE(page_owner_free->free_handle);
+		if (!handle) {
+			pr_alert("page_owner free stack trace missing\n");
+		} else {
+			nr_entries = stack_depot_fetch(handle, &entries);
+			pr_alert("page last free stack trace:\n");
+			stack_trace_print(entries, nr_entries, 0);
+		}
 	}
-#endif
 
 	if (page_owner->last_migrate_reason != -1)
 		pr_alert("page has been migrated, last migrate reason: %s\n",
-- 
2.23.0


  parent reply	other threads:[~2019-09-30 12:29 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-30 12:29 [PATCH v2 0/3] followups to debug_pagealloc improvements through page_owner Vlastimil Babka
2019-09-30 12:29 ` [PATCH v2 1/3] mm, page_owner: fix off-by-one error in __set_page_owner_handle() Vlastimil Babka
2019-09-30 12:29 ` Vlastimil Babka [this message]
2019-09-30 12:49   ` [PATCH v2 2/3] mm, page_owner: decouple freeing stack trace from debug_pagealloc Qian Cai
2019-09-30 12:49     ` Qian Cai
2019-09-30 21:39     ` Vlastimil Babka
2019-09-30 23:49       ` Qian Cai
2019-10-01  8:07         ` Vlastimil Babka
2019-10-01 11:51           ` Kirill A. Shutemov
2019-10-01 12:26             ` Qian Cai
2019-10-01 12:26               ` Qian Cai
2019-10-01 12:31               ` Kirill A. Shutemov
2019-10-01 12:32               ` Vlastimil Babka
2019-10-01 12:35                 ` Vlastimil Babka
2019-10-01 13:18                   ` Qian Cai
2019-10-01 13:18                     ` Qian Cai
2019-10-01 14:15                     ` Vlastimil Babka
2019-10-01 11:35       ` Kirill A. Shutemov
2019-09-30 12:29 ` [PATCH v2 3/3] mm, page_owner: rename flag indicating that page is allocated Vlastimil Babka
2019-09-30 12:49   ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190930122916.14969-3-vbabka@suse.cz \
    --to=vbabka@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=aryabinin@virtuozzo.com \
    --cc=cai@lca.pw \
    --cc=dvyukov@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=walter-zh.wu@mediatek.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.