* [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics
@ 2021-01-06  1:16 Paul E. McKenney
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block paulmck
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Paul E. McKenney @ 2021-01-06  1:16 UTC (permalink / raw)
  To: linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team

Hello!

This is v4 of the series that improves diagnostics by providing access
to additional information including the return addresses, slab names,
offsets, and sizes collected by the sl*b allocators and by vmalloc().
If the allocator is not configured to collect this information, the
diagnostics fall back to a reasonable approximation of their earlier
state.

One use case is the queue_rcu_work() function, which might be used
by any number of kernel subsystems.  If the caller does back-to-back
invocations of queue_rcu_work(), this constitutes a double-free bug,
and (if so configured) the debug-objects system will flag this, printing
the callback function.  In most cases, printing this function suffices.
However, for double-free bugs involving queue_rcu_work(), the RCU callback
function will always be rcu_work_rcufn(), which provides almost no help to
the poor person trying to find this double-free bug.  The return address
from the allocator of the memory containing the rcu_work structure can
provide an additional valuable clue.
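
As a minimal sketch of the buggy pattern (the structure and function below
are hypothetical, purely for illustration):

	/* Hypothetical example of the back-to-back enqueue described above. */
	struct foo {
		struct rcu_work rwork;	/* Set up earlier with INIT_RCU_WORK(). */
		/* ... */
	};

	static void foo_teardown(struct foo *fp)
	{
		queue_rcu_work(system_wq, &fp->rwork);
		queue_rcu_work(system_wq, &fp->rwork); /* Bug: re-queued while pending. */
	}

Both invocations end up with the same rcu_work_rcufn() callback, so the
debug-objects splat alone cannot say which queue_rcu_work() user is at
fault, but the allocator's return address for the enclosing structure can.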

Another use case is the percpu_ref_switch_to_atomic_rcu() function,
which detects percpu_ref reference-count underflow.  Unfortunately,
the only data that this function has access to doesn't have much in the
way of identifying characteristics.  Yes, it might be possible to gain
more information from a crash dump, but it is more convenient for the
needed hints to be in the console log.

Unfortunately, printing the return address in this case is of little help
because this object is allocated from percpu_ref_init(), regardless of
what part of the kernel is responsible for the reference-count underflow
(though perhaps the slab and offsets might help in some cases).  However,
CONFIG_STACKTRACE=y kernels (such as those enabling ftrace) using slub
with debugging enabled also collect stack traces.  This series therefore
also provides a way of extracting these stack traces to provide additional
information to those debugging percpu_ref reference-count underflows.
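
In all of these cases the usage pattern is deliberately simple: print
whatever preamble is appropriate, then hand the still-allocated pointer to
mem_dump_obj(), which appends the available provenance information via
pr_cont().  In simplified outline (patch 6 below shows the actual
percpu_ref hookup, including rate limiting):

	if (WARN_ON_ONCE(atomic_long_read(&data->count) <= 0)) {
		pr_err("%s(): percpu_ref underflow", __func__);
		mem_dump_obj(data);	/* Appends slab/vmalloc provenance. */
	}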

The patches are as follows:

1.	Add mem_dump_obj() to print source of memory block.

2.	Make mem_dump_obj() handle NULL and zero-sized pointers.

3.	Make mem_dump_obj() handle vmalloc() memory.

4.	Make mem_dump_obj() vmalloc() dumps include start and length.

5.	Make call_rcu() print mem_dump_obj() info for double-freed
	callback.

6.	percpu_ref: Dump mem_dump_obj() info upon reference-count
	underflow.

						Thanx, Paul

Changes since v3 (https://lore.kernel.org/lkml/20201211011907.GA16110@paulmck-ThinkPad-P72/):

o	Extract more information from CONFIG_SLUB_DEBUG=n builds.

o	Add Joonsoo Kim's Acked-by to 1/6 above.

o	Rebased onto v5.11-rc1.

Changes since v2:

o	Apply more feedback from Joonsoo Kim on naming and code structure.

o	Based on discussions with Vlastimil Babka, added code to print
	offsets and sizes where available.  This can help identify which
	structure is involved.

Changes since v1:

o	Apply feedback from Joonsoo Kim, mostly around naming and
	code structure.

o	Apply fix suggested by Stephen Rothwell for a bug that was
	also located by kbuild test robot.

o	Add support for vmalloc().

o	Add support for special pointers.

o	Additional rework simplifying use of mem_dump_obj(), which
	simplifies both the RCU and the percpu_ref uses.

------------------------------------------------------------------------

 include/linux/mm.h      |    2 +
 include/linux/slab.h    |    2 +
 include/linux/vmalloc.h |    6 +++
 kernel/rcu/tree.c       |    7 +++-
 lib/percpu-refcount.c   |   12 +++++--
 mm/slab.c               |   20 ++++++++++++
 mm/slab.h               |   12 +++++++
 mm/slab_common.c        |   74 ++++++++++++++++++++++++++++++++++++++++++++++++
 mm/slob.c               |    6 +++
 mm/slub.c               |   40 +++++++++++++++++++++++++
 mm/util.c               |   45 ++++++++++++++++++++++++-----
 mm/vmalloc.c            |   15 +++++++++
 12 files changed, 228 insertions(+), 13 deletions(-)


* [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block
  2021-01-06  1:16 [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Paul E. McKenney
@ 2021-01-06  1:17 ` paulmck
  2021-01-08 13:50   ` Vlastimil Babka
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 2/6] mm: Make mem_dump_obj() handle NULL and zero-sized pointers paulmck
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: paulmck @ 2021-01-06  1:17 UTC (permalink / raw)
  To: linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@kernel.org>

There are kernel facilities such as per-CPU reference counts that give
error messages in generic handlers or callbacks, whose messages are
unenlightening.  In the case of per-CPU reference-count underflow, this
is not a problem when creating a new use of this facility because in that
case the bug is almost certainly in the code implementing that new use.
However, trouble arises when deploying across many systems, which might
exercise corner cases that were not seen during development and testing.
Here, it would be really nice to get some kind of hint as to which of
several uses the underflow was caused by.

This commit therefore exposes a mem_dump_obj() function that takes
a pointer to memory (which must still be allocated if it has been
dynamically allocated) and prints available information on where that
memory came from.  This pointer can reference the middle of the block as
well as the beginning of the block, as needed by things like RCU callback
functions and timer handlers that might not know where the beginning of
the memory block is.  These functions and handlers can use mem_dump_obj()
to print out better hints as to where the problem might lie.

The information printed can depend on kernel configuration.  For example,
the allocation return address can be printed only for slab and slub,
and even then only when the necessary debug has been enabled.  For slab,
build with CONFIG_DEBUG_SLAB=y, and either use sizes with ample space
to the next power of two or use the SLAB_STORE_USER flag when creating the
kmem_cache structure.  For slub, build with CONFIG_SLUB_DEBUG=y and
boot with slub_debug=U, or pass SLAB_STORE_USER to kmem_cache_create()
if more focused use is desired.  Also for slub, use CONFIG_STACKTRACE
to enable printing of the allocation-time stack trace.
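
For example, a more focused setup for a single slub cache might look like
the following (the cache name and payload structure are illustrative only):

	/* Illustrative only: request allocation tracking for just this cache. */
	foo_cachep = kmem_cache_create("foo_cache", sizeof(struct foo),
				       0, SLAB_STORE_USER, NULL);

With CONFIG_SLUB_DEBUG=y, objects from such a cache then carry the
allocation return address (and, given CONFIG_STACKTRACE=y, the
allocation-time stack trace) that kmem_dump_obj() prints.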

Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
[ paulmck: Convert to printing and change names per Joonsoo Kim. ]
[ paulmck: Move slab definition per Stephen Rothwell and kbuild test robot. ]
[ paulmck: Handle CONFIG_MMU=n case where vmalloc() is kmalloc(). ]
[ paulmck: Apply Vlastimil Babka feedback on slab.c kmem_provenance(). ]
[ paulmck: Extract more info from !SLUB_DEBUG per Joonsoo Kim. ]
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/mm.h   |  2 ++
 include/linux/slab.h |  2 ++
 mm/slab.c            | 20 ++++++++++++++
 mm/slab.h            | 12 +++++++++
 mm/slab_common.c     | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/slob.c            |  6 +++++
 mm/slub.c            | 40 ++++++++++++++++++++++++++++
 mm/util.c            | 24 +++++++++++++++++
 8 files changed, 180 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5299b90a..af7d050 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3169,5 +3169,7 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
 
 extern int sysctl_nr_trim_pages;
 
+void mem_dump_obj(void *object);
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/include/linux/slab.h b/include/linux/slab.h
index be4ba58..7ae6040 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -186,6 +186,8 @@ void kfree(const void *);
 void kfree_sensitive(const void *);
 size_t __ksize(const void *);
 size_t ksize(const void *);
+bool kmem_valid_obj(void *object);
+void kmem_dump_obj(void *object);
 
 #ifdef CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR
 void __check_heap_object(const void *ptr, unsigned long n, struct page *page,
diff --git a/mm/slab.c b/mm/slab.c
index d7c8da9..dcc55e7 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3635,6 +3635,26 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
 EXPORT_SYMBOL(__kmalloc_node_track_caller);
 #endif /* CONFIG_NUMA */
 
+void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
+{
+	struct kmem_cache *cachep;
+	unsigned int objnr;
+	void *objp;
+
+	kpp->kp_ptr = object;
+	kpp->kp_page = page;
+	cachep = page->slab_cache;
+	kpp->kp_slab_cache = cachep;
+	objp = object - obj_offset(cachep);
+	kpp->kp_data_offset = obj_offset(cachep);
+	page = virt_to_head_page(objp);
+	objnr = obj_to_index(cachep, page, objp);
+	objp = index_to_obj(cachep, page, objnr);
+	kpp->kp_objp = objp;
+	if (DEBUG && cachep->flags & SLAB_STORE_USER)
+		kpp->kp_ret = *dbg_userword(cachep, objp);
+}
+
 /**
  * __do_kmalloc - allocate memory
  * @size: how many bytes of memory are required.
diff --git a/mm/slab.h b/mm/slab.h
index 1a756a3..ecad9b5 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -615,4 +615,16 @@ static inline bool slab_want_init_on_free(struct kmem_cache *c)
 	return false;
 }
 
+#define KS_ADDRS_COUNT 16
+struct kmem_obj_info {
+	void *kp_ptr;
+	struct page *kp_page;
+	void *kp_objp;
+	unsigned long kp_data_offset;
+	struct kmem_cache *kp_slab_cache;
+	void *kp_ret;
+	void *kp_stack[KS_ADDRS_COUNT];
+};
+void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page);
+
 #endif /* MM_SLAB_H */
diff --git a/mm/slab_common.c b/mm/slab_common.c
index e981c80..b594413 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -537,6 +537,80 @@ bool slab_is_available(void)
 	return slab_state >= UP;
 }
 
+/**
+ * kmem_valid_obj - does the pointer reference a valid slab object?
+ * @object: pointer to query.
+ *
+ * Return: %true if the pointer is to a not-yet-freed object from
+ * kmalloc() or kmem_cache_alloc(), either %true or %false if the pointer
+ * is to an already-freed object, and %false otherwise.
+ */
+bool kmem_valid_obj(void *object)
+{
+	struct page *page;
+
+	if (!virt_addr_valid(object))
+		return false;
+	page = virt_to_head_page(object);
+	return PageSlab(page);
+}
+
+/**
+ * kmem_dump_obj - Print available slab provenance information
+ * @object: slab object for which to find provenance information.
+ *
+ * This function uses pr_cont(), so that the caller is expected to have
+ * printed out whatever preamble is appropriate.  The provenance information
+ * depends on the type of object and on how much debugging is enabled.
+ * For a slab-cache object, the fact that it is a slab object is printed,
+ * and, if available, the slab name, return address, and stack trace from
+ * the allocation of that object.
+ *
+ * This function will splat if passed a pointer to a non-slab object.
+ * If you are not sure what type of object you have, you should instead
+ * use mem_dump_obj().
+ */
+void kmem_dump_obj(void *object)
+{
+	char *cp = IS_ENABLED(CONFIG_MMU) ? "" : "/vmalloc";
+	int i;
+	struct page *page;
+	unsigned long ptroffset;
+	struct kmem_obj_info kp = { };
+
+	if (WARN_ON_ONCE(!virt_addr_valid(object)))
+		return;
+	page = virt_to_head_page(object);
+	if (WARN_ON_ONCE(!PageSlab(page))) {
+		pr_cont(" non-slab memory.\n");
+		return;
+	}
+	kmem_obj_info(&kp, object, page);
+	if (kp.kp_slab_cache)
+		pr_cont(" slab%s %s", cp, kp.kp_slab_cache->name);
+	else
+		pr_cont(" slab%s", cp);
+	if (kp.kp_objp)
+		pr_cont(" start %px", kp.kp_objp);
+	if (kp.kp_data_offset)
+		pr_cont(" data offset %lu", kp.kp_data_offset);
+	if (kp.kp_objp) {
+		ptroffset = ((char *)object - (char *)kp.kp_objp) - kp.kp_data_offset;
+		pr_cont(" pointer offset %lu", ptroffset);
+	}
+	if (kp.kp_slab_cache && kp.kp_slab_cache->usersize)
+		pr_cont(" size %u", kp.kp_slab_cache->usersize);
+	if (kp.kp_ret)
+		pr_cont(" allocated at %pS\n", kp.kp_ret);
+	else
+		pr_cont("\n");
+	for (i = 0; i < ARRAY_SIZE(kp.kp_stack); i++) {
+		if (!kp.kp_stack[i])
+			break;
+		pr_info("    %pS\n", kp.kp_stack[i]);
+	}
+}
+
 #ifndef CONFIG_SLOB
 /* Create a cache during boot when no slab services are available yet */
 void __init create_boot_cache(struct kmem_cache *s, const char *name,
diff --git a/mm/slob.c b/mm/slob.c
index 8d4bfa4..ef87ada 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -461,6 +461,12 @@ static void slob_free(void *block, int size)
 	spin_unlock_irqrestore(&slob_lock, flags);
 }
 
+void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
+{
+	kpp->kp_ptr = object;
+	kpp->kp_page = page;
+}
+
 /*
  * End of slob allocator proper. Begin kmem_cache_alloc and kmalloc frontend.
  */
diff --git a/mm/slub.c b/mm/slub.c
index 0c8b43a..3c1a843 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3919,6 +3919,46 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
 	return 0;
 }
 
+void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
+{
+	void *base;
+	int __maybe_unused i;
+	unsigned int objnr;
+	void *objp;
+	void *objp0;
+	struct kmem_cache *s = page->slab_cache;
+	struct track __maybe_unused *trackp;
+
+	kpp->kp_ptr = object;
+	kpp->kp_page = page;
+	kpp->kp_slab_cache = s;
+	base = page_address(page);
+	objp0 = kasan_reset_tag(object);
+#ifdef CONFIG_SLUB_DEBUG
+	objp = restore_red_left(s, objp0);
+#else
+	objp = objp0;
+#endif
+	objnr = obj_to_index(s, page, objp);
+	kpp->kp_data_offset = (unsigned long)((char *)objp0 - (char *)objp);
+	objp = base + s->size * objnr;
+	kpp->kp_objp = objp;
+	if (WARN_ON_ONCE(objp < base || objp >= base + page->objects * s->size || (objp - base) % s->size) ||
+	    !(s->flags & SLAB_STORE_USER))
+		return;
+#ifdef CONFIG_SLUB_DEBUG
+	trackp = get_track(s, objp, TRACK_ALLOC);
+	kpp->kp_ret = (void *)trackp->addr;
+#ifdef CONFIG_STACKTRACE
+	for (i = 0; i < KS_ADDRS_COUNT && i < TRACK_ADDRS_COUNT; i++) {
+		kpp->kp_stack[i] = (void *)trackp->addrs[i];
+		if (!kpp->kp_stack[i])
+			break;
+	}
+#endif
+#endif
+}
+
 /********************************************************************
  *		Kmalloc subsystem
  *******************************************************************/
diff --git a/mm/util.c b/mm/util.c
index 8c9b7d1..da46f9d 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -982,3 +982,27 @@ int __weak memcmp_pages(struct page *page1, struct page *page2)
 	kunmap_atomic(addr1);
 	return ret;
 }
+
+/**
+ * mem_dump_obj - Print available provenance information
+ * @object: object for which to find provenance information.
+ *
+ * This function uses pr_cont(), so that the caller is expected to have
+ * printed out whatever preamble is appropriate.  The provenance information
+ * depends on the type of object and on how much debugging is enabled.
+ * For example, for a slab-cache object, the slab name is printed, and,
+ * if available, the return address and stack trace from the allocation
+ * of that object.
+ */
+void mem_dump_obj(void *object)
+{
+	if (!virt_addr_valid(object)) {
+		pr_cont(" non-paged (local) memory.\n");
+		return;
+	}
+	if (kmem_valid_obj(object)) {
+		kmem_dump_obj(object);
+		return;
+	}
+	pr_cont(" non-slab memory.\n");
+}
-- 
2.9.5



* [PATCH mm,percpu_ref,rcu 2/6] mm: Make mem_dump_obj() handle NULL and zero-sized pointers
  2021-01-06  1:16 [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Paul E. McKenney
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block paulmck
@ 2021-01-06  1:17 ` paulmck
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 3/6] mm: Make mem_dump_obj() handle vmalloc() memory paulmck
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: paulmck @ 2021-01-06  1:17 UTC (permalink / raw)
  To: linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit makes mem_dump_obj() call out NULL and zero-sized pointers
specially instead of classifying them as non-paged memory.

Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 mm/util.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/util.c b/mm/util.c
index da46f9d..92f23d2 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -997,7 +997,12 @@ int __weak memcmp_pages(struct page *page1, struct page *page2)
 void mem_dump_obj(void *object)
 {
 	if (!virt_addr_valid(object)) {
-		pr_cont(" non-paged (local) memory.\n");
+		if (object == NULL)
+			pr_cont(" NULL pointer.\n");
+		else if (object == ZERO_SIZE_PTR)
+			pr_cont(" zero-size pointer.\n");
+		else
+			pr_cont(" non-paged (local) memory.\n");
 		return;
 	}
 	if (kmem_valid_obj(object)) {
-- 
2.9.5



* [PATCH mm,percpu_ref,rcu 3/6] mm: Make mem_dump_obj() handle vmalloc() memory
  2021-01-06  1:16 [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Paul E. McKenney
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block paulmck
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 2/6] mm: Make mem_dump_obj() handle NULL and zero-sized pointers paulmck
@ 2021-01-06  1:17 ` paulmck
  2021-01-08 15:30   ` Vlastimil Babka
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 4/6] mm: Make mem_dump_obj() vmalloc() dumps include start and length paulmck
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: paulmck @ 2021-01-06  1:17 UTC (permalink / raw)
  To: linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit adds vmalloc() support to mem_dump_obj().  Note that the
vmalloc_dump_obj() function combines the checking and dumping, in
contrast with the split between kmem_valid_obj() and kmem_dump_obj().
The reason for the difference is that the checking in the vmalloc()
case involves acquiring a global lock, and redundant acquisitions of
global locks should be avoided, even on not-so-fast paths.

Note that this change causes on-stack variables to be reported as
vmalloc() storage from kernel_clone() or similar, depending on the degree
of inlining that your compiler does.  This is likely more helpful than
the earlier "non-paged (local) memory".

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <linux-mm@kvack.org>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/vmalloc.h |  6 ++++++
 mm/util.c               | 14 ++++++++------
 mm/vmalloc.c            | 12 ++++++++++++
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 80c0181..c18f475 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -246,4 +246,10 @@ pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
 int register_vmap_purge_notifier(struct notifier_block *nb);
 int unregister_vmap_purge_notifier(struct notifier_block *nb);
 
+#ifdef CONFIG_MMU
+bool vmalloc_dump_obj(void *object);
+#else
+static inline bool vmalloc_dump_obj(void *object) { return false; }
+#endif
+
 #endif /* _LINUX_VMALLOC_H */
diff --git a/mm/util.c b/mm/util.c
index 92f23d2..5487022 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -996,18 +996,20 @@ int __weak memcmp_pages(struct page *page1, struct page *page2)
  */
 void mem_dump_obj(void *object)
 {
+	if (kmem_valid_obj(object)) {
+		kmem_dump_obj(object);
+		return;
+	}
+	if (vmalloc_dump_obj(object))
+		return;
 	if (!virt_addr_valid(object)) {
 		if (object == NULL)
 			pr_cont(" NULL pointer.\n");
 		else if (object == ZERO_SIZE_PTR)
 			pr_cont(" zero-size pointer.\n");
 		else
-			pr_cont(" non-paged (local) memory.\n");
-		return;
-	}
-	if (kmem_valid_obj(object)) {
-		kmem_dump_obj(object);
+			pr_cont(" non-paged memory.\n");
 		return;
 	}
-	pr_cont(" non-slab memory.\n");
+	pr_cont(" non-slab/vmalloc memory.\n");
 }
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4d88fe5..c274ea4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3448,6 +3448,18 @@ void pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
 }
 #endif	/* CONFIG_SMP */
 
+bool vmalloc_dump_obj(void *object)
+{
+	struct vm_struct *vm;
+	void *objp = (void *)PAGE_ALIGN((unsigned long)object);
+
+	vm = find_vm_area(objp);
+	if (!vm)
+		return false;
+	pr_cont(" vmalloc allocated at %pS\n", vm->caller);
+	return true;
+}
+
 #ifdef CONFIG_PROC_FS
 static void *s_start(struct seq_file *m, loff_t *pos)
 	__acquires(&vmap_purge_lock)
-- 
2.9.5



* [PATCH mm,percpu_ref,rcu 4/6] mm: Make mem_dump_obj() vmalloc() dumps include start and length
  2021-01-06  1:16 [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Paul E. McKenney
                   ` (2 preceding siblings ...)
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 3/6] mm: Make mem_dump_obj() handle vmalloc() memory paulmck
@ 2021-01-06  1:17 ` paulmck
  2021-01-08 15:30   ` Vlastimil Babka
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 5/6] rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback paulmck
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: paulmck @ 2021-01-06  1:17 UTC (permalink / raw)
  To: linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit adds the starting address and number of pages to the vmalloc()
information dumped by way of vmalloc_dump_obj().

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <linux-mm@kvack.org>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 mm/vmalloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c274ea4..e3229ff 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3456,7 +3456,8 @@ bool vmalloc_dump_obj(void *object)
 	vm = find_vm_area(objp);
 	if (!vm)
 		return false;
-	pr_cont(" vmalloc allocated at %pS\n", vm->caller);
+	pr_cont(" %u-page vmalloc region starting at %#lx allocated at %pS\n",
+		vm->nr_pages, (unsigned long)vm->addr, vm->caller);
 	return true;
 }
 
-- 
2.9.5



* [PATCH mm,percpu_ref,rcu 5/6] rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback
  2021-01-06  1:16 [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Paul E. McKenney
                   ` (3 preceding siblings ...)
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 4/6] mm: Make mem_dump_obj() vmalloc() dumps include start and length paulmck
@ 2021-01-06  1:17 ` paulmck
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 6/6] percpu_ref: Dump mem_dump_obj() info upon reference-count underflow paulmck
  2021-01-06 21:48 ` [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Andrew Morton
  6 siblings, 0 replies; 17+ messages in thread
From: paulmck @ 2021-01-06  1:17 UTC (permalink / raw)
  To: linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@kernel.org>

The debug-object double-free checks in __call_rcu() print out the
RCU callback function, which is usually sufficient to track down the
double free.  However, all uses of things like queue_rcu_work() will
have the same RCU callback function (rcu_work_rcufn() in this case),
so a diagnostic message for a double queue_rcu_work() needs more than
just the callback function.

This commit therefore calls mem_dump_obj() to dump out any additional
available information on the double-freed callback.

Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 40e5e3d..84513c5 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2941,6 +2941,7 @@ static void check_cb_ovld(struct rcu_data *rdp)
 static void
 __call_rcu(struct rcu_head *head, rcu_callback_t func)
 {
+	static atomic_t doublefrees;
 	unsigned long flags;
 	struct rcu_data *rdp;
 	bool was_alldone;
@@ -2954,8 +2955,10 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
 		 * Use rcu:rcu_callback trace event to find the previous
 		 * time callback was passed to __call_rcu().
 		 */
-		WARN_ONCE(1, "__call_rcu(): Double-freed CB %p->%pS()!!!\n",
-			  head, head->func);
+		if (atomic_inc_return(&doublefrees) < 4) {
+			pr_err("%s(): Double-freed CB %p->%pS()!!!  ", __func__, head, head->func);
+			mem_dump_obj(head);
+		}
 		WRITE_ONCE(head->func, rcu_leak_callback);
 		return;
 	}
-- 
2.9.5



* [PATCH mm,percpu_ref,rcu 6/6] percpu_ref: Dump mem_dump_obj() info upon reference-count underflow
  2021-01-06  1:16 [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Paul E. McKenney
                   ` (4 preceding siblings ...)
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 5/6] rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback paulmck
@ 2021-01-06  1:17 ` paulmck
  2021-01-06 21:48 ` [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Andrew Morton
  6 siblings, 0 replies; 17+ messages in thread
From: paulmck @ 2021-01-06  1:17 UTC (permalink / raw)
  To: linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@kernel.org>

Reference-count underflow for percpu_ref is detected in the RCU callback
percpu_ref_switch_to_atomic_rcu(), and the resulting warning does not
print anything allowing easy identification of which percpu_ref use
case is underflowing.  This is of course not normally a problem when
developing a new percpu_ref use case because it is most likely that
the problem resides in this new use case.  However, when deploying a
new kernel to a large set of servers, the underflow might well be a new
corner case in any of the old percpu_ref use cases.

This commit therefore calls mem_dump_obj() to dump out any additional
available information on the underflowing percpu_ref instance.

Cc: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 lib/percpu-refcount.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index e59eda0..a1071cd 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -5,6 +5,7 @@
 #include <linux/sched.h>
 #include <linux/wait.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 #include <linux/percpu-refcount.h>
 
 /*
@@ -168,6 +169,7 @@ static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
 			struct percpu_ref_data, rcu);
 	struct percpu_ref *ref = data->ref;
 	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
+	static atomic_t underflows;
 	unsigned long count = 0;
 	int cpu;
 
@@ -191,9 +193,13 @@ static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
 	 */
 	atomic_long_add((long)count - PERCPU_COUNT_BIAS, &data->count);
 
-	WARN_ONCE(atomic_long_read(&data->count) <= 0,
-		  "percpu ref (%ps) <= 0 (%ld) after switching to atomic",
-		  data->release, atomic_long_read(&data->count));
+	if (WARN_ONCE(atomic_long_read(&data->count) <= 0,
+		      "percpu ref (%ps) <= 0 (%ld) after switching to atomic",
+		      data->release, atomic_long_read(&data->count)) &&
+	    atomic_inc_return(&underflows) < 4) {
+		pr_err("%s(): percpu_ref underflow", __func__);
+		mem_dump_obj(data);
+	}
 
 	/* @ref is viewed as dead on all CPUs, send out switch confirmation */
 	percpu_ref_call_confirm_rcu(rcu);
-- 
2.9.5



* Re: [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics
  2021-01-06  1:16 [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Paul E. McKenney
                   ` (5 preceding siblings ...)
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 6/6] percpu_ref: Dump mem_dump_obj() info upon reference-count underflow paulmck
@ 2021-01-06 21:48 ` Andrew Morton
  2021-01-06 23:42   ` Paul E. McKenney
  6 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2021-01-06 21:48 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, rcu, linux-mm, cl, penberg, rientjes,
	iamjoonsoo.kim, ming.lei, axboe, kernel-team

On Tue, 5 Jan 2021 17:16:03 -0800 "Paul E. McKenney" <paulmck@kernel.org> wrote:

> This is v4 of the series the improves diagnostics by providing access
> to additional information including the return addresses, slab names,
> offsets, and sizes collected by the sl*b allocators and by vmalloc().

Looks reasonable.  And not as bloaty as I feared, but it does add ~300
bytes to my allnoconfig build.  Is the CONFIG_ coverage as tight as it
could be?



* Re: [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics
  2021-01-06 21:48 ` [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Andrew Morton
@ 2021-01-06 23:42   ` Paul E. McKenney
  2021-01-08  0:26     ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2021-01-06 23:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, rcu, linux-mm, cl, penberg, rientjes,
	iamjoonsoo.kim, ming.lei, axboe, kernel-team

On Wed, Jan 06, 2021 at 01:48:43PM -0800, Andrew Morton wrote:
> On Tue, 5 Jan 2021 17:16:03 -0800 "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > This is v4 of the series the improves diagnostics by providing access
> > to additional information including the return addresses, slab names,
> > offsets, and sizes collected by the sl*b allocators and by vmalloc().
> 
> Looks reasonable.  And not as bloaty as I feared, but it does add ~300
> bytes to my allnoconfig build.  Is the CONFIG_ coverage as tight as it
> could be?

Glad I managed to exceed your expectations.  ;-)

Let's see...  When I do an allnoconfig build, it has CONFIG_PRINTK=n.
Given that this facility is based on printk(), I could create an
additional patch that #ifdef'ed everything out in CONFIG_PRINTK=n kernels.
This would uglify things a bit, but it would save ~300 bytes.

If I don't hear otherwise from you, I will put something together,
test it, and send it along.

							Thanx, Paul


* Re: [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics
  2021-01-06 23:42   ` Paul E. McKenney
@ 2021-01-08  0:26     ` Paul E. McKenney
  2021-01-08 15:35       ` Vlastimil Babka
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2021-01-08  0:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, rcu, linux-mm, cl, penberg, rientjes,
	iamjoonsoo.kim, ming.lei, axboe, kernel-team

On Wed, Jan 06, 2021 at 03:42:12PM -0800, Paul E. McKenney wrote:
> On Wed, Jan 06, 2021 at 01:48:43PM -0800, Andrew Morton wrote:
> > On Tue, 5 Jan 2021 17:16:03 -0800 "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > 
> > > This is v4 of the series the improves diagnostics by providing access
> > > to additional information including the return addresses, slab names,
> > > offsets, and sizes collected by the sl*b allocators and by vmalloc().
> > 
> > Looks reasonable.  And not as bloaty as I feared, but it does add ~300
> > bytes to my allnoconfig build.  Is the CONFIG_ coverage as tight as it
> > could be?
> 
> Glad I managed to exceed your expectations.  ;-)
> 
> Let's see...  When I do an allnoconfig build, it has CONFIG_PRINTK=n.
> Given that this facility is based on printk(), I could create an
> additional patch that #ifdef'ed everything out in CONFIG_PRINTK=n kernels.
> This would uglify things a bit, but it would save ~300 bytes.
> 
> If I don't hear otherwise from you, I will put something together,
> test it, and send it along.

And here is a first draft, very lightly tested.  Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

commit 4164efdca255093a423b55f44bd788b46d9c648f
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Thu Jan 7 13:46:11 2021 -0800

    mm: Don't build mem_dump_obj() on CONFIG_PRINTK=n kernels
    
    The mem_dump_obj() functionality adds a few hundred bytes, which is a
    small price to pay.  Except on kernels built with CONFIG_PRINTK=n, on
    which mem_dump_obj() messages will be suppressed.  This commit therefore
    makes mem_dump_obj() a static inline empty function on kernels built
    with CONFIG_PRINTK=n and excludes all of its support functions as well.
    This avoids kernel bloat on systems that cannot use mem_dump_obj().
    
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: <linux-mm@kvack.org>
    Suggested-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/include/linux/mm.h b/include/linux/mm.h
index af7d050..ed75a3a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3169,7 +3169,11 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
 
 extern int sysctl_nr_trim_pages;
 
+#ifdef CONFIG_PRINTK
 void mem_dump_obj(void *object);
+#else
+static inline void mem_dump_obj(void *object) {}
+#endif
 
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 7ae6040..0c97d78 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -186,8 +186,10 @@ void kfree(const void *);
 void kfree_sensitive(const void *);
 size_t __ksize(const void *);
 size_t ksize(const void *);
+#ifdef CONFIG_PRINTK
 bool kmem_valid_obj(void *object);
 void kmem_dump_obj(void *object);
+#endif
 
 #ifdef CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR
 void __check_heap_object(const void *ptr, unsigned long n, struct page *page,
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index c18f475..3c8a765 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -246,7 +246,7 @@ pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
 int register_vmap_purge_notifier(struct notifier_block *nb);
 int unregister_vmap_purge_notifier(struct notifier_block *nb);
 
-#ifdef CONFIG_MMU
+#if defined(CONFIG_MMU) && defined(CONFIG_PRINTK)
 bool vmalloc_dump_obj(void *object);
 #else
 static inline bool vmalloc_dump_obj(void *object) { return false; }
diff --git a/mm/slab.c b/mm/slab.c
index dcc55e7..965d277 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3635,6 +3635,7 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
 EXPORT_SYMBOL(__kmalloc_node_track_caller);
 #endif /* CONFIG_NUMA */
 
+#ifdef CONFIG_PRINTK
 void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
 {
 	struct kmem_cache *cachep;
@@ -3654,6 +3655,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
 	if (DEBUG && cachep->flags & SLAB_STORE_USER)
 		kpp->kp_ret = *dbg_userword(cachep, objp);
 }
+#endif
 
 /**
  * __do_kmalloc - allocate memory
diff --git a/mm/slab.h b/mm/slab.h
index ecad9b5..d2e39ab 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -615,6 +615,7 @@ static inline bool slab_want_init_on_free(struct kmem_cache *c)
 	return false;
 }
 
+#ifdef CONFIG_PRINTK
 #define KS_ADDRS_COUNT 16
 struct kmem_obj_info {
 	void *kp_ptr;
@@ -626,5 +627,6 @@ struct kmem_obj_info {
 	void *kp_stack[KS_ADDRS_COUNT];
 };
 void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page);
+#endif
 
 #endif /* MM_SLAB_H */
diff --git a/mm/slab_common.c b/mm/slab_common.c
index b594413..20b2cc6 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -537,6 +537,7 @@ bool slab_is_available(void)
 	return slab_state >= UP;
 }
 
+#ifdef CONFIG_PRINTK
 /**
  * kmem_valid_obj - does the pointer reference a valid slab object?
  * @object: pointer to query.
@@ -610,6 +611,7 @@ void kmem_dump_obj(void *object)
 		pr_info("    %pS\n", kp.kp_stack[i]);
 	}
 }
+#endif
 
 #ifndef CONFIG_SLOB
 /* Create a cache during boot when no slab services are available yet */
diff --git a/mm/slob.c b/mm/slob.c
index ef87ada..8726931 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -461,11 +461,13 @@ static void slob_free(void *block, int size)
 	spin_unlock_irqrestore(&slob_lock, flags);
 }
 
+#ifdef CONFIG_PRINTK
 void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
 {
 	kpp->kp_ptr = object;
 	kpp->kp_page = page;
 }
+#endif
 
 /*
  * End of slob allocator proper. Begin kmem_cache_alloc and kmalloc frontend.
diff --git a/mm/slub.c b/mm/slub.c
index 3c1a843..9e205e4 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3919,6 +3919,7 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
 	return 0;
 }
 
+#ifdef CONFIG_PRINTK
 void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
 {
 	void *base;
@@ -3958,6 +3959,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
 #endif
 #endif
 }
+#endif
 
 /********************************************************************
  *		Kmalloc subsystem
diff --git a/mm/util.c b/mm/util.c
index 5487022..2d497fe 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -983,6 +983,7 @@ int __weak memcmp_pages(struct page *page1, struct page *page2)
 	return ret;
 }
 
+#ifdef CONFIG_PRINTK
 /**
  * mem_dump_obj - Print available provenance information
  * @object: object for which to find provenance information.
@@ -1013,3 +1014,4 @@ void mem_dump_obj(void *object)
 	}
 	pr_cont(" non-slab/vmalloc memory.\n");
 }
+#endif
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e3229ff..5002fd6 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3448,6 +3448,7 @@ void pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
 }
 #endif	/* CONFIG_SMP */
 
+#ifdef CONFIG_PRINTK
 bool vmalloc_dump_obj(void *object)
 {
 	struct vm_struct *vm;
@@ -3460,6 +3461,7 @@ bool vmalloc_dump_obj(void *object)
 		vm->nr_pages, (unsigned long)vm->addr, vm->caller);
 	return true;
 }
+#endif
 
 #ifdef CONFIG_PROC_FS
 static void *s_start(struct seq_file *m, loff_t *pos)


* Re: [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block paulmck
@ 2021-01-08 13:50   ` Vlastimil Babka
  2021-01-08 19:01     ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2021-01-08 13:50 UTC (permalink / raw)
  To: paulmck, linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team

On 1/6/21 2:17 AM, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
> 
> There are kernel facilities such as per-CPU reference counts that give
> error messages in generic handlers or callbacks, whose messages are
> unenlightening.  In the case of per-CPU reference-count underflow, this
> is not a problem when creating a new use of this facility because in that
> case the bug is almost certainly in the code implementing that new use.
> However, trouble arises when deploying across many systems, which might
> exercise corner cases that were not seen during development and testing.
> Here, it would be really nice to get some kind of hint as to which of
> several uses the underflow was caused by.
> 
> This commit therefore exposes a mem_dump_obj() function that takes
> a pointer to memory (which must still be allocated if it has been
> dynamically allocated) and prints available information on where that
> memory came from.  This pointer can reference the middle of the block as
> well as the beginning of the block, as needed by things like RCU callback
> functions and timer handlers that might not know where the beginning of
> the memory block is.  These functions and handlers can use mem_dump_obj()
> to print out better hints as to where the problem might lie.
> 
> The information printed can depend on kernel configuration.  For example,
> the allocation return address can be printed only for slab and slub,
> and even then only when the necessary debug has been enabled.  For slab,
> build with CONFIG_DEBUG_SLAB=y, and either use sizes with ample space
> to the next power of two or use the SLAB_STORE_USER when creating the
> kmem_cache structure.  For slub, build with CONFIG_SLUB_DEBUG=y and
> boot with slub_debug=U, or pass SLAB_STORE_USER to kmem_cache_create()
> if more focused use is desired.  Also for slub, use CONFIG_STACKTRACE
> to enable printing of the allocation-time stack trace.
> 
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: <linux-mm@kvack.org>
> Reported-by: Andrii Nakryiko <andrii@kernel.org>
> [ paulmck: Convert to printing and change names per Joonsoo Kim. ]
> [ paulmck: Move slab definition per Stephen Rothwell and kbuild test robot. ]
> [ paulmck: Handle CONFIG_MMU=n case where vmalloc() is kmalloc(). ]
> [ paulmck: Apply Vlastimil Babka feedback on slab.c kmem_provenance(). ]
> [ paulmck: Extract more info from !SLUB_DEBUG per Joonsoo Kim. ]
> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Some nits below:

> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3635,6 +3635,26 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
>  EXPORT_SYMBOL(__kmalloc_node_track_caller);
>  #endif /* CONFIG_NUMA */
>  
> +void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
> +{
> +	struct kmem_cache *cachep;
> +	unsigned int objnr;
> +	void *objp;
> +
> +	kpp->kp_ptr = object;
> +	kpp->kp_page = page;
> +	cachep = page->slab_cache;
> +	kpp->kp_slab_cache = cachep;
> +	objp = object - obj_offset(cachep);
> +	kpp->kp_data_offset = obj_offset(cachep);
> +	page = virt_to_head_page(objp);

Hm, when can this page be different from the "page" we got as a function
parameter? I guess only if "object" was so close to the beginning of the page
that "object - obj_offset(cachep)" underflowed it. So either "object" pointed
to the padding/redzone, or even below page->s_mem. Both situations sound like
we should handle them differently than continuing with an unrelated page
that's below our slab page?

> +	objnr = obj_to_index(cachep, page, objp);

Related, this will return a bogus value for an objp below page->s_mem.
And if our "object" pointer pointed beyond the last valid object, this will
give us too large an index.


> +	objp = index_to_obj(cachep, page, objnr);

A too-large index can cause dbg_userword to be beyond our page.
In the SLUB version you have the WARN_ON_ONCE that catches such invalid
pointers (before the first valid object or after the last valid object) and
skips getting the backtrace for those, so an analogous thing should probably
be done here?
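
Something like the following rough (untested) sketch, mirroring the SLUB
bounds check:

	objnr = obj_to_index(cachep, page, objp);
	if (WARN_ON_ONCE(objnr >= cachep->num))
		return;
	objp = index_to_obj(cachep, page, objnr);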

> +	kpp->kp_objp = objp;
> +	if (DEBUG && cachep->flags & SLAB_STORE_USER)
> +		kpp->kp_ret = *dbg_userword(cachep, objp);
> +}
> +
> diff --git a/mm/slub.c b/mm/slub.c
> index 0c8b43a..3c1a843 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3919,6 +3919,46 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
>  	return 0;
>  }
>  
> +void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
> +{
> +	void *base;
> +	int __maybe_unused i;
> +	unsigned int objnr;
> +	void *objp;
> +	void *objp0;
> +	struct kmem_cache *s = page->slab_cache;
> +	struct track __maybe_unused *trackp;
> +
> +	kpp->kp_ptr = object;
> +	kpp->kp_page = page;
> +	kpp->kp_slab_cache = s;
> +	base = page_address(page);
> +	objp0 = kasan_reset_tag(object);
> +#ifdef CONFIG_SLUB_DEBUG
> +	objp = restore_red_left(s, objp0);
> +#else
> +	objp = objp0;
> +#endif
> +	objnr = obj_to_index(s, page, objp);

It would be safer to use objp0 instead of objp here, I think. In case
"object" was a pointer to the first object's left red zone, we would then not
have "objp" underflow "base" and get a bogus objnr. The WARN_ON_ONCE below
could then be less paranoid? Basically, only the "objp >= base +
page->objects * s->size" case should be possible if "object" points beyond
the last valid object. But otherwise we should get a valid index and thus a
valid "objp = base + s->size * objnr;" below, and "objp < base" and
"(objp - base) % s->size" should be impossible?

Hmm, but since it would then be possible to get a negative pointer offset
(into the left padding/redzone), kmem_dump_obj() should calculate and print
it as signed? But it's not obvious whether a pointer to the left red zone was
an overflow of object N-1 or an underflow of object N, and which one to
report (unless it's the very first object). AFAICS your current code will
report them all as overflows of object N-1, which is problematic with N=0 (as
described above), so changing it to report underflows of object N would make
more sense?
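
For the signed variant, a minimal sketch of what I mean (same calculation as
in kmem_dump_obj(), just done and printed as signed):

	long ptroffset = (char *)object - (char *)kp.kp_objp - (long)kp.kp_data_offset;

	pr_cont(" pointer offset %ld", ptroffset);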

Thanks,
Vlastimil


* Re: [PATCH mm,percpu_ref,rcu 3/6] mm: Make mem_dump_obj() handle vmalloc() memory
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 3/6] mm: Make mem_dump_obj() handle vmalloc() memory paulmck
@ 2021-01-08 15:30   ` Vlastimil Babka
  0 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2021-01-08 15:30 UTC (permalink / raw)
  To: paulmck, linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team

On 1/6/21 2:17 AM, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
> 
> This commit adds vmalloc() support to mem_dump_obj().  Note that the
> vmalloc_dump_obj() function combines the checking and dumping, in
> contrast with the split between kmem_valid_obj() and kmem_dump_obj().
> The reason for the difference is that the checking in the vmalloc()
> case involves acquiring a global lock, and redundant acquisitions of
> global locks should be avoided, even on not-so-fast paths.
> 
> Note that this change causes on-stack variables to be reported as
> vmalloc() storage from kernel_clone() or similar, depending on the degree
> of inlining that your compiler does.  This is likely more helpful than
> the earlier "non-paged (local) memory".
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: <linux-mm@kvack.org>
> Reported-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

Acked-by: Vlastimil Babka <vbabka@suse.cz>


* Re: [PATCH mm,percpu_ref,rcu 4/6] mm: Make mem_dump_obj() vmalloc() dumps include start and length
  2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 4/6] mm: Make mem_dump_obj() vmalloc() dumps include start and length paulmck
@ 2021-01-08 15:30   ` Vlastimil Babka
  0 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2021-01-08 15:30 UTC (permalink / raw)
  To: paulmck, linux-kernel, rcu, linux-mm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, ming.lei, axboe,
	kernel-team

On 1/6/21 2:17 AM, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
> 
> This commit adds the starting address and number of pages to the vmalloc()
> information dumped by way of vmalloc_dump_obj().
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: <linux-mm@kvack.org>
> Reported-by: Andrii Nakryiko <andrii@kernel.org>
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/vmalloc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index c274ea4..e3229ff 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3456,7 +3456,8 @@ bool vmalloc_dump_obj(void *object)
>  	vm = find_vm_area(objp);
>  	if (!vm)
>  		return false;
> -	pr_cont(" vmalloc allocated at %pS\n", vm->caller);
> +	pr_cont(" %u-page vmalloc region starting at %#lx allocated at %pS\n",
> +		vm->nr_pages, (unsigned long)vm->addr, vm->caller);
>  	return true;
>  }
>  
> 



* Re: [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics
  2021-01-08  0:26     ` Paul E. McKenney
@ 2021-01-08 15:35       ` Vlastimil Babka
  2021-01-08 17:36         ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2021-01-08 15:35 UTC (permalink / raw)
  To: paulmck, Andrew Morton
  Cc: linux-kernel, rcu, linux-mm, cl, penberg, rientjes,
	iamjoonsoo.kim, ming.lei, axboe, kernel-team

On 1/8/21 1:26 AM, Paul E. McKenney wrote:
> On Wed, Jan 06, 2021 at 03:42:12PM -0800, Paul E. McKenney wrote:
>> On Wed, Jan 06, 2021 at 01:48:43PM -0800, Andrew Morton wrote:
>> > On Tue, 5 Jan 2021 17:16:03 -0800 "Paul E. McKenney" <paulmck@kernel.org> wrote:
>> > 
>> > > This is v4 of the series the improves diagnostics by providing access
>> > > to additional information including the return addresses, slab names,
>> > > offsets, and sizes collected by the sl*b allocators and by vmalloc().
>> > 
>> > Looks reasonable.  And not as bloaty as I feared, but it does add ~300
>> > bytes to my allnoconfig build.  Is the CONFIG_ coverage as tight as it
>> > could be?
>> 
>> Glad I managed to exceed your expectations.  ;-)
>> 
>> Let's see...  When I do an allnoconfig build, it has CONFIG_PRINTK=n.
>> Given that this facility is based on printk(), I could create an
>> additional patch that #ifdef'ed everything out in CONFIG_PRINTK=n kernels.
>> This would uglify things a bit, but it would save ~300 bytes.
>> 
>> If I don't hear otherwise from you, I will put something together,
>> test it, and send it along.
> 
> And is a first draft, very lightly tested.  Thoughts?

Looks OK, but I'm not sure it's worth the trouble; there's probably tons of
other code that could be treated like this, and nobody is doing that
systematically (at least I haven't heard about the kernel tinification effort
for years)... Up to Andrew, I guess.

Thanks,
Vlastimil

> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit 4164efdca255093a423b55f44bd788b46d9c648f
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Thu Jan 7 13:46:11 2021 -0800
> 
>     mm: Don't build mem_dump_obj() on CONFIG_PRINTK=n kernels
>     
>     The mem_dump_obj() functionality adds a few hundred bytes, which is a
>     small price to pay.  Except on kernels built with CONFIG_PRINTK=n, in
>     which mem_dump_obj() messages will be suppressed.  This commit therefore
>     makes mem_dump_obj() be a static inline empty function on kernels built
>     with CONFIG_PRINTK=n and excludes all of its support functions as well.
>     This avoids kernel bloat on systems that cannot use mem_dump_obj().
>     
>     Cc: Christoph Lameter <cl@linux.com>
>     Cc: Pekka Enberg <penberg@kernel.org>
>     Cc: David Rientjes <rientjes@google.com>
>     Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>     Cc: <linux-mm@kvack.org>
>     Suggested-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index af7d050..ed75a3a 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3169,7 +3169,11 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
>  
>  extern int sysctl_nr_trim_pages;
>  
> +#ifdef CONFIG_PRINTK
>  void mem_dump_obj(void *object);
> +#else
> +static inline void mem_dump_obj(void *object) {}
> +#endif
>  
>  #endif /* __KERNEL__ */
>  #endif /* _LINUX_MM_H */
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 7ae6040..0c97d78 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -186,8 +186,10 @@ void kfree(const void *);
>  void kfree_sensitive(const void *);
>  size_t __ksize(const void *);
>  size_t ksize(const void *);
> +#ifdef CONFIG_PRINTK
>  bool kmem_valid_obj(void *object);
>  void kmem_dump_obj(void *object);
> +#endif
>  
>  #ifdef CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR
>  void __check_heap_object(const void *ptr, unsigned long n, struct page *page,
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index c18f475..3c8a765 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -246,7 +246,7 @@ pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
>  int register_vmap_purge_notifier(struct notifier_block *nb);
>  int unregister_vmap_purge_notifier(struct notifier_block *nb);
>  
> -#ifdef CONFIG_MMU
> +#if defined(CONFIG_MMU) && defined(CONFIG_PRINTK)
>  bool vmalloc_dump_obj(void *object);
>  #else
>  static inline bool vmalloc_dump_obj(void *object) { return false; }
> diff --git a/mm/slab.c b/mm/slab.c
> index dcc55e7..965d277 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3635,6 +3635,7 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
>  EXPORT_SYMBOL(__kmalloc_node_track_caller);
>  #endif /* CONFIG_NUMA */
>  
> +#ifdef CONFIG_PRINTK
>  void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
>  {
>  	struct kmem_cache *cachep;
> @@ -3654,6 +3655,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
>  	if (DEBUG && cachep->flags & SLAB_STORE_USER)
>  		kpp->kp_ret = *dbg_userword(cachep, objp);
>  }
> +#endif
>  
>  /**
>   * __do_kmalloc - allocate memory
> diff --git a/mm/slab.h b/mm/slab.h
> index ecad9b5..d2e39ab 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -615,6 +615,7 @@ static inline bool slab_want_init_on_free(struct kmem_cache *c)
>  	return false;
>  }
>  
> +#ifdef CONFIG_PRINTK
>  #define KS_ADDRS_COUNT 16
>  struct kmem_obj_info {
>  	void *kp_ptr;
> @@ -626,5 +627,6 @@ struct kmem_obj_info {
>  	void *kp_stack[KS_ADDRS_COUNT];
>  };
>  void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page);
> +#endif
>  
>  #endif /* MM_SLAB_H */
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index b594413..20b2cc6 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -537,6 +537,7 @@ bool slab_is_available(void)
>  	return slab_state >= UP;
>  }
>  
> +#ifdef CONFIG_PRINTK
>  /**
>   * kmem_valid_obj - does the pointer reference a valid slab object?
>   * @object: pointer to query.
> @@ -610,6 +611,7 @@ void kmem_dump_obj(void *object)
>  		pr_info("    %pS\n", kp.kp_stack[i]);
>  	}
>  }
> +#endif
>  
>  #ifndef CONFIG_SLOB
>  /* Create a cache during boot when no slab services are available yet */
> diff --git a/mm/slob.c b/mm/slob.c
> index ef87ada..8726931 100644
> --- a/mm/slob.c
> +++ b/mm/slob.c
> @@ -461,11 +461,13 @@ static void slob_free(void *block, int size)
>  	spin_unlock_irqrestore(&slob_lock, flags);
>  }
>  
> +#ifdef CONFIG_PRINTK
>  void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
>  {
>  	kpp->kp_ptr = object;
>  	kpp->kp_page = page;
>  }
> +#endif
>  
>  /*
>   * End of slob allocator proper. Begin kmem_cache_alloc and kmalloc frontend.
> diff --git a/mm/slub.c b/mm/slub.c
> index 3c1a843..9e205e4 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3919,6 +3919,7 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
>  	return 0;
>  }
>  
> +#ifdef CONFIG_PRINTK
>  void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
>  {
>  	void *base;
> @@ -3958,6 +3959,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
>  #endif
>  #endif
>  }
> +#endif
>  
>  /********************************************************************
>   *		Kmalloc subsystem
> diff --git a/mm/util.c b/mm/util.c
> index 5487022..2d497fe 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -983,6 +983,7 @@ int __weak memcmp_pages(struct page *page1, struct page *page2)
>  	return ret;
>  }
>  
> +#ifdef CONFIG_PRINTK
>  /**
>   * mem_dump_obj - Print available provenance information
>   * @object: object for which to find provenance information.
> @@ -1013,3 +1014,4 @@ void mem_dump_obj(void *object)
>  	}
>  	pr_cont(" non-slab/vmalloc memory.\n");
>  }
> +#endif
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index e3229ff..5002fd6 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3448,6 +3448,7 @@ void pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
>  }
>  #endif	/* CONFIG_SMP */
>  
> +#ifdef CONFIG_PRINTK
>  bool vmalloc_dump_obj(void *object)
>  {
>  	struct vm_struct *vm;
> @@ -3460,6 +3461,7 @@ bool vmalloc_dump_obj(void *object)
>  		vm->nr_pages, (unsigned long)vm->addr, vm->caller);
>  	return true;
>  }
> +#endif
>  
>  #ifdef CONFIG_PROC_FS
>  static void *s_start(struct seq_file *m, loff_t *pos)
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics
  2021-01-08 15:35       ` Vlastimil Babka
@ 2021-01-08 17:36         ` Paul E. McKenney
  0 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2021-01-08 17:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-kernel, rcu, linux-mm, cl, penberg,
	rientjes, iamjoonsoo.kim, ming.lei, axboe, kernel-team

On Fri, Jan 08, 2021 at 04:35:57PM +0100, Vlastimil Babka wrote:
> On 1/8/21 1:26 AM, Paul E. McKenney wrote:
> > On Wed, Jan 06, 2021 at 03:42:12PM -0800, Paul E. McKenney wrote:
> >> On Wed, Jan 06, 2021 at 01:48:43PM -0800, Andrew Morton wrote:
> >> > On Tue, 5 Jan 2021 17:16:03 -0800 "Paul E. McKenney" <paulmck@kernel.org> wrote:
> >> > 
> >> > > This is v4 of the series the improves diagnostics by providing access
> >> > > to additional information including the return addresses, slab names,
> >> > > offsets, and sizes collected by the sl*b allocators and by vmalloc().
> >> > 
> >> > Looks reasonable.  And not as bloaty as I feared, but it does add ~300
> >> > bytes to my allnoconfig build.  Is the CONFIG_ coverage as tight as it
> >> > could be?
> >> 
> >> Glad I managed to exceed your expectations.  ;-)
> >> 
> >> Let's see...  When I do an allnoconfig build, it has CONFIG_PRINTK=n.
> >> Given that this facility is based on printk(), I could create an
> >> additional patch that #ifdef'ed everything out in CONFIG_PRINTK=n kernels.
> >> This would uglify things a bit, but it would save ~300 bytes.
> >> 
> >> If I don't hear otherwise from you, I will put something together,
> >> test it, and send it along.
> > 
> > And here is a first draft, very lightly tested.  Thoughts?
> 
> Looks OK, but I'm not sure it's worth the trouble; there's probably a ton of
> other code that could be treated like this, and nobody is doing that
> systematically (at least I haven't heard about a kernel tinification effort
> for years)...  Up to Andrew, I guess.

I am good either way.  ;-)

							Thanx, Paul

> Thanks,
> Vlastimil
> 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > commit 4164efdca255093a423b55f44bd788b46d9c648f
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Thu Jan 7 13:46:11 2021 -0800
> > 
> >     mm: Don't build mm_dump_obj() on CONFIG_PRINTK=n kernels
> >     
> >     The mem_dump_obj() functionality adds a few hundred bytes, which is a
> >     small price to pay.  Except on kernels built with CONFIG_PRINTK=n, in
> >     which mem_dump_obj() messages will be suppressed.  This commit therefore
> >     makes mem_dump_obj() be a static inline empty function on kernels built
> >     with CONFIG_PRINTK=n and excludes all of its support functions as well.
> >     This avoids kernel bloat on systems that cannot use mem_dump_obj().
> >     
> >     Cc: Christoph Lameter <cl@linux.com>
> >     Cc: Pekka Enberg <penberg@kernel.org>
> >     Cc: David Rientjes <rientjes@google.com>
> >     Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >     Cc: <linux-mm@kvack.org>
> >     Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block
  2021-01-08 13:50   ` Vlastimil Babka
@ 2021-01-08 19:01     ` Paul E. McKenney
  2021-01-08 19:41       ` Vlastimil Babka
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2021-01-08 19:01 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-kernel, rcu, linux-mm, cl, penberg, rientjes,
	iamjoonsoo.kim, akpm, ming.lei, axboe, kernel-team

On Fri, Jan 08, 2021 at 02:50:35PM +0100, Vlastimil Babka wrote:
> On 1/6/21 2:17 AM, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> > 
> > There are kernel facilities such as per-CPU reference counts that give
> > error messages in generic handlers or callbacks, whose messages are
> > unenlightening.  In the case of per-CPU reference-count underflow, this
> > is not a problem when creating a new use of this facility because in that
> > case the bug is almost certainly in the code implementing that new use.
> > However, trouble arises when deploying across many systems, which might
> > exercise corner cases that were not seen during development and testing.
> > Here, it would be really nice to get some kind of hint as to which of
> > several uses the underflow was caused by.
> > 
> > This commit therefore exposes a mem_dump_obj() function that takes
> > a pointer to memory (which must still be allocated if it has been
> > dynamically allocated) and prints available information on where that
> > memory came from.  This pointer can reference the middle of the block as
> > well as the beginning of the block, as needed by things like RCU callback
> > functions and timer handlers that might not know where the beginning of
> > the memory block is.  These functions and handlers can use mem_dump_obj()
> > to print out better hints as to where the problem might lie.
> > 
> > The information printed can depend on kernel configuration.  For example,
> > the allocation return address can be printed only for slab and slub,
> > and even then only when the necessary debug has been enabled.  For slab,
> > build with CONFIG_DEBUG_SLAB=y, and either use sizes with ample space
> > to the next power of two or use the SLAB_STORE_USER when creating the
> > kmem_cache structure.  For slub, build with CONFIG_SLUB_DEBUG=y and
> > boot with slub_debug=U, or pass SLAB_STORE_USER to kmem_cache_create()
> > if more focused use is desired.  Also for slub, use CONFIG_STACKTRACE
> > to enable printing of the allocation-time stack trace.
> > 
> > Cc: Christoph Lameter <cl@linux.com>
> > Cc: Pekka Enberg <penberg@kernel.org>
> > Cc: David Rientjes <rientjes@google.com>
> > Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: <linux-mm@kvack.org>
> > Reported-by: Andrii Nakryiko <andrii@kernel.org>
> > [ paulmck: Convert to printing and change names per Joonsoo Kim. ]
> > [ paulmck: Move slab definition per Stephen Rothwell and kbuild test robot. ]
> > [ paulmck: Handle CONFIG_MMU=n case where vmalloc() is kmalloc(). ]
> > [ paulmck: Apply Vlastimil Babka feedback on slab.c kmem_provenance(). ]
> > [ paulmck: Extract more info from !SLUB_DEBUG per Joonsoo Kim. ]
> > Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thank you for the review and comments!

> Some nits below:

Andrew pushed this to an upstream maintainer, but I have not seen these
patches appear anywhere.  So if that upstream maintainer was Linus, I can
send a follow-up patch once we converge.  If the upstream maintainer was
in fact me, I can of course update the commit directly.  If the upstream
maintainer was someone else, please let me know who it is.  ;-)

(Linus does not appear to have pushed anything out since before Andrew's
email, hence my uncertainty.)

> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -3635,6 +3635,26 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
> >  EXPORT_SYMBOL(__kmalloc_node_track_caller);
> >  #endif /* CONFIG_NUMA */
> >  
> > +void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
> > +{
> > +	struct kmem_cache *cachep;
> > +	unsigned int objnr;
> > +	void *objp;
> > +
> > +	kpp->kp_ptr = object;
> > +	kpp->kp_page = page;
> > +	cachep = page->slab_cache;
> > +	kpp->kp_slab_cache = cachep;
> > +	objp = object - obj_offset(cachep);
> > +	kpp->kp_data_offset = obj_offset(cachep);
> > +	page = virt_to_head_page(objp);
> 
> Hm when can this page be different from the "page" we got as function parameter?
> I guess only if "object" was so close to the beginning of page that "object -
> obj_offset(cachep)" underflowed it. So either "object" pointed to the
> padding/redzone, or even below page->s_mem. Both situations sound like we
> should handle them differently than continuing with an unrelated page that's
> below our slab page?

I examined other code to obtain this.  I have been assuming that the
point was to be able to handle multipage slabs, but I freely confess to
having no idea.  But I am reluctant to change this sequence unless the
other code translating from pointer to in-slab object is also changed.
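
For illustration, the kind of pointer-to-page-to-index translation in question
looks roughly like the following (simplified sketch only, not any one specific
upstream function; "ptr" is just a hypothetical interior pointer):

	struct page *page = virt_to_head_page(ptr);	/* head page, even for multi-page slabs */
	struct kmem_cache *cachep = page->slab_cache;
	unsigned int objnr = obj_to_index(cachep, page, ptr);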

> > +	objnr = obj_to_index(cachep, page, objp);
> 
> Related, this will return bogus value for objp below page->s_mem.
> And if our "object" pointer pointed beyond last valid object, this will give us
> too large index.
> 
> 
> > +	objp = index_to_obj(cachep, page, objnr);
> 
> Too large index can cause dbg_userword to be beyond our page.
> In SLUB version you have the WARN_ON_ONCE that catches such invalid pointers
> (before first valid object or after last valid object) and skips getting the
> backtrace for those, so analogical thing should probably be done here?

Like this, just before the "objp =" statement?

	WARN_ON_ONCE(objnr >= cachep->num);

> > +	kpp->kp_objp = objp;
> > +	if (DEBUG && cachep->flags & SLAB_STORE_USER)
> > +		kpp->kp_ret = *dbg_userword(cachep, objp);
> > +}
> > +
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 0c8b43a..3c1a843 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -3919,6 +3919,46 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
> >  	return 0;
> >  }
> >  
> > +void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
> > +{
> > +	void *base;
> > +	int __maybe_unused i;
> > +	unsigned int objnr;
> > +	void *objp;
> > +	void *objp0;
> > +	struct kmem_cache *s = page->slab_cache;
> > +	struct track __maybe_unused *trackp;
> > +
> > +	kpp->kp_ptr = object;
> > +	kpp->kp_page = page;
> > +	kpp->kp_slab_cache = s;
> > +	base = page_address(page);
> > +	objp0 = kasan_reset_tag(object);
> > +#ifdef CONFIG_SLUB_DEBUG
> > +	objp = restore_red_left(s, objp0);
> > +#else
> > +	objp = objp0;
> > +#endif
> > +	objnr = obj_to_index(s, page, objp);
> 
> It would be safer to use objp0 instead of objp here I think. In case "object"
> was pointer to the first object's left red zone, then we would not have "objp"
> underflow "base" and get a bogus objnr. The WARN_ON_ONCE below could then be
> less paranoid? Basically just the "objp >= base + page->objects * s->size"
> should be possible if "object" points beyond the last valid object. But
> otherwise we should get valid index and thus valid "objp = base + s->size *
> objnr;" below, and "objp < base" and "(objp - base) % s->size)" should be
> impossible?
> 
> Hmm but since it would then be possible to get a negative pointer offset (into
> the left padding/redzone), kmem_dump_obj() should calculate and print it as signed?
> But it's not obvious if a pointer to left red zone is a pointer that was an
> overflow of object N-1 or underflow of object N, and which one to report (unless
> it's the very first object). AFAICS your current code will report all as
> overflows of object N-1, which is problematic with N=0 (as described above) so
> changing it to report underflows of object N would make more sense?

Doesn't the "WARN_ON_ONCE(objp < base" further down report underflows?
Or am I missing something subtle here?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block
  2021-01-08 19:01     ` Paul E. McKenney
@ 2021-01-08 19:41       ` Vlastimil Babka
  0 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2021-01-08 19:41 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, rcu, linux-mm, cl, penberg, rientjes,
	iamjoonsoo.kim, akpm, ming.lei, axboe, kernel-team

On 1/8/21 8:01 PM, Paul E. McKenney wrote:
> 
> Andrew pushed this to an upstream maintainer, but I have not seen these
> patches appear anywhere.  So if that upstream maintainer was Linus, I can
> send a follow-up patch once we converge.  If the upstream maintainer was
> in fact me, I can of course update the commit directly.  If the upstream
> maintainer was someone else, please let me know who it is.  ;-)
> 
> (Linus does not appear to have pushed anything out since before Andrew's
> email, hence my uncertainty.)

I've wondered about the mm-commits messages too, and concluded that the most
probable explanation is that Andrew tried to add your series to mmotm, then
tried to merge mmotm into linux-next, and found out that the series is already
there via your rcu tree :)
 
>> > --- a/mm/slab.c
>> > +++ b/mm/slab.c
>> > @@ -3635,6 +3635,26 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
>> >  EXPORT_SYMBOL(__kmalloc_node_track_caller);
>> >  #endif /* CONFIG_NUMA */
>> >  
>> > +void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
>> > +{
>> > +	struct kmem_cache *cachep;
>> > +	unsigned int objnr;
>> > +	void *objp;
>> > +
>> > +	kpp->kp_ptr = object;
>> > +	kpp->kp_page = page;
>> > +	cachep = page->slab_cache;
>> > +	kpp->kp_slab_cache = cachep;
>> > +	objp = object - obj_offset(cachep);
>> > +	kpp->kp_data_offset = obj_offset(cachep);
>> > +	page = virt_to_head_page(objp);
>> 
>> Hm when can this page be different from the "page" we got as function parameter?
>> I guess only if "object" was so close to the beginning of page that "object -
>> obj_offset(cachep)" underflowed it. So either "object" pointed to the
>> padding/redzone, or even below page->s_mem. Both situations sound like we
>> should handle them differently than continuing with an unrelated page that's
>> below our slab page?
> 
> I examined other code to obtain this.  I have been assuming that the
> point was to be able to handle multipage slabs, but I freely confess to
> having no idea.  But I am reluctant to change this sequence unless the
> other code translating from pointer to in-slab object is also changed.

OK, I will check the other code.

>> > +	objnr = obj_to_index(cachep, page, objp);
>> 
>> Related, this will return bogus value for objp below page->s_mem.
>> And if our "object" pointer pointed beyond last valid object, this will give us
>> too large index.
>> 
>> 
>> > +	objp = index_to_obj(cachep, page, objnr);
>> 
>> Too large index can cause dbg_userword to be beyond our page.
>> In SLUB version you have the WARN_ON_ONCE that catches such invalid pointers
>> (before first valid object or after last valid object) and skips getting the
>> backtrace for those, so analogical thing should probably be done here?
> 
> Like this, just before the "objp =" statement?
> 
> 	WARN_ON_ONCE(objnr >= cachep->num);

Yeah, that should do the trick to prevent accessing garbage dbg_userword.

But I wrote the comments about SLAB first, and only in the SLUB part did I
realize the larger picture. So now I would consider something like below,
which should find the closest valid object index; the resulting pointer offset
in kmem_dump_obj() might then become negative. Pointers to padding, below
page->s_mem, or beyond the last object simply become large negative or
positive pointer offsets, respectively, and we probably don't need to warn
for them specially unless we also warn for all other pointers that are not
within the "data area" of the object.

void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
{
	struct kmem_cache *cachep;
	unsigned int objnr;
	void *objp;

	kpp->kp_ptr = object;
	kpp->kp_page = page;
	cachep = page->slab_cache;
	kpp->kp_slab_cache = cachep;
	kpp->kp_data_offset = obj_offset(cachep);
	if (object < page->s_mem)
		objnr = 0;
	else
		objnr = obj_to_index(cachep, page, object);
	if (objnr >= cachep->num)
		objnr = cachep->num - 1;
	objp = index_to_obj(cachep, page, objnr);
	kpp->kp_objp = objp;
	if (DEBUG && cachep->flags & SLAB_STORE_USER)
		kpp->kp_ret = *dbg_userword(cachep, objp);
}
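
Correspondingly, kmem_dump_obj() could then report the offset as a signed
value.  A completely untested sketch (assuming the printed offset is derived
from kp_objp and kp_data_offset, as elsewhere in this series):

	long offset = (long)((char *)kp.kp_ptr - (char *)kp.kp_objp)
		      - (long)kp.kp_data_offset;

	/* Negative means a pointer into the left padding/redzone or
	 * below page->s_mem. */
	pr_cont(" pointer offset %ld", offset);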

 
>> > +	kpp->kp_objp = objp;
>> > +	if (DEBUG && cachep->flags & SLAB_STORE_USER)
>> > +		kpp->kp_ret = *dbg_userword(cachep, objp);
>> > +}
>> > +
>> > diff --git a/mm/slub.c b/mm/slub.c
>> > index 0c8b43a..3c1a843 100644
>> > --- a/mm/slub.c
>> > +++ b/mm/slub.c
>> > @@ -3919,6 +3919,46 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
>> >  	return 0;
>> >  }
>> >  
>> > +void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
>> > +{
>> > +	void *base;
>> > +	int __maybe_unused i;
>> > +	unsigned int objnr;
>> > +	void *objp;
>> > +	void *objp0;
>> > +	struct kmem_cache *s = page->slab_cache;
>> > +	struct track __maybe_unused *trackp;
>> > +
>> > +	kpp->kp_ptr = object;
>> > +	kpp->kp_page = page;
>> > +	kpp->kp_slab_cache = s;
>> > +	base = page_address(page);
>> > +	objp0 = kasan_reset_tag(object);
>> > +#ifdef CONFIG_SLUB_DEBUG
>> > +	objp = restore_red_left(s, objp0);
>> > +#else
>> > +	objp = objp0;
>> > +#endif
>> > +	objnr = obj_to_index(s, page, objp);
>> 
>> It would be safer to use objp0 instead of objp here I think. In case "object"
>> was pointer to the first object's left red zone, then we would not have "objp"
>> underflow "base" and get a bogus objnr. The WARN_ON_ONCE below could then be
>> less paranoid? Basically just the "objp >= base + page->objects * s->size"
>> should be possible if "object" points beyond the last valid object. But
>> otherwise we should get valid index and thus valid "objp = base + s->size *
>> objnr;" below, and "objp < base" and "(objp - base) % s->size)" should be
>> impossible?
>> 
>> Hmm but since it would then be possible to get a negative pointer offset (into
>> the left padding/redzone), kmem_dump_obj() should calculate and print it as signed?
>> But it's not obvious if a pointer to left red zone is a pointer that was an
>> overflow of object N-1 or underflow of object N, and which one to report (unless
>> it's the very first object). AFAICS your current code will report all as
>> overflows of object N-1, which is problematic with N=0 (as described above) so
>> changing it to report underflows of object N would make more sense?
> 
> Doesn't the "WARN_ON_ONCE(objp < base" further down report underflows?

I don't think it could be possible; could you describe the conditions?

> Or am I missing something subtle here?

A version analogical to the SLAB one above could AFAICS look like this:

...
	kpp->kp_ptr = object;
	kpp->kp_page = page;
	kpp->kp_slab_cache = s;
	base = page_address(page);
	objp0 = kasan_reset_tag(object);
#ifdef CONFIG_SLUB_DEBUG
	objp = restore_red_left(s, objp0);
#else
	objp = objp0;
#endif
	kpp->kp_data_offset = (unsigned long)((char *)objp0 - (char *)objp);
	objnr = obj_to_index(s, page, objp0); // unlike SLAB this can't underflow
	if (objnr >= page->objects)
		objnr = page->objects - 1;
	objp = base + s->size * objnr;
	kpp->kp_objp = objp;
	// no WARN_ON_ONCE() needed, objp has to be valid, we just might have negative
	// offset to it, or a larger than s->size positive offset
#ifdef CONFIG_SLUB_DEBUG
	// etc, no changes below

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-01-08 19:42 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-06  1:16 [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Paul E. McKenney
2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 1/6] mm: Add mem_dump_obj() to print source of memory block paulmck
2021-01-08 13:50   ` Vlastimil Babka
2021-01-08 19:01     ` Paul E. McKenney
2021-01-08 19:41       ` Vlastimil Babka
2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 2/6] mm: Make mem_dump_obj() handle NULL and zero-sized pointers paulmck
2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 3/6] mm: Make mem_dump_obj() handle vmalloc() memory paulmck
2021-01-08 15:30   ` Vlastimil Babka
2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 4/6] mm: Make mem_obj_dump() vmalloc() dumps include start and length paulmck
2021-01-08 15:30   ` Vlastimil Babka
2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 5/6] rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback paulmck
2021-01-06  1:17 ` [PATCH mm,percpu_ref,rcu 6/6] percpu_ref: Dump mem_dump_obj() info upon reference-count underflow paulmck
2021-01-06 21:48 ` [PATCH v4 sl-b 0/6] Export return addresses etc. for better diagnostics Andrew Morton
2021-01-06 23:42   ` Paul E. McKenney
2021-01-08  0:26     ` Paul E. McKenney
2021-01-08 15:35       ` Vlastimil Babka
2021-01-08 17:36         ` Paul E. McKenney
