From: "Paul E. McKenney" <paulmck@kernel.org>

There are kernel facilities such as per-CPU reference counts that give
error messages in generic handlers or callbacks, whose messages are
unenlightening. In the case of per-CPU reference-count underflow, this
is not a problem when creating a new use of this facility because in that
case the bug is almost certainly in the code implementing that new use.
However, trouble arises when deploying across many systems, which might
exercise corner cases that were not seen during development and testing.
Here, it would be really nice to get some kind of hint as to which of
several uses the underflow was caused by.

This commit therefore exposes a new kmem_last_alloc() function that
takes a pointer to dynamically allocated memory and returns the return
address of the call that allocated it. This pointer can reference the
middle of the block as well as the beginning of the block, as needed
by things like RCU callback functions and timer handlers that might not
know where the beginning of the memory block is. These functions and
handlers can use the return value from kmem_last_alloc() to give the
kernel hacker a better hint as to where the problem might lie.

This kmem_last_alloc() function returns NULL for slob and when the
necessary debug has not been enabled for slab and slub. For slub,
build with CONFIG_SLUB_DEBUG=y and boot with slub_debug=U, or pass
SLAB_STORE_USER to kmem_cache_create() if more focused use is desired.

Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/slab.h |  2 ++
 mm/slab.c            | 19 +++++++++++++++++++
 mm/slab_common.c     | 20 ++++++++++++++++++++
 mm/slob.c            |  5 +++++
 mm/slub.c            | 26 ++++++++++++++++++++++++++
 5 files changed, 72 insertions(+)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index dd6897f..06dd56b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -186,6 +186,8 @@ void kfree(const void *);
 void kfree_sensitive(const void *);
 size_t __ksize(const void *);
 size_t ksize(const void *);
+void *kmem_cache_last_alloc(struct kmem_cache *s, void *object);
+void *kmem_last_alloc(void *object);

 #ifdef CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR
 void __check_heap_object(const void *ptr, unsigned long n, struct page *page,
diff --git a/mm/slab.c b/mm/slab.c
index b111356..2ab93b8 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3602,6 +3602,25 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *cachep,
 EXPORT_SYMBOL(kmem_cache_alloc_node_trace);
 #endif

+void *kmem_cache_last_alloc(struct kmem_cache *cachep, void *object)
+{
+#ifdef DEBUG
+	unsigned int objnr;
+	void *objp;
+	struct page *page;
+
+	if (!(cachep->flags & SLAB_STORE_USER))
+		return NULL;
+	objp = object - obj_offset(cachep);
+	page = virt_to_head_page(objp);
+	objnr = obj_to_index(cachep, page, objp);
+	objp = index_to_obj(cachep, page, objnr);
+	return *dbg_userword(cachep, objp);
+#else
+	return NULL;
+#endif
+}
+
 static __always_inline void *
 __do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
 {
diff --git a/mm/slab_common.c b/mm/slab_common.c
index f9ccd5d..3f647982 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -536,6 +536,26 @@ bool slab_is_available(void)
 	return slab_state >= UP;
 }

+/*
+ * If the pointer references a slab-allocated object and if sufficient
+ * debugging is enabled, return the returrn address for the corresponding
+ * allocation. Otherwise, return NULL. Note that passing random pointers
+ * to this function (including addresses of on-stack variables) is likely
+ * to result in panics.
+ */
+void *kmem_last_alloc(void *object)
+{
+	struct page *page;
+
+	if (!virt_addr_valid(object))
+		return NULL;
+	page = virt_to_head_page(object);
+	if (!PageSlab(page))
+		return NULL;
+	return kmem_cache_last_alloc(page->slab_cache, object);
+}
+EXPORT_SYMBOL_GPL(kmem_last_alloc);
+
 #ifndef CONFIG_SLOB
 /* Create a cache during boot when no slab services are available yet */
 void __init create_boot_cache(struct kmem_cache *s, const char *name,
diff --git a/mm/slob.c b/mm/slob.c
index 7cc9805..c1f8ed7 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -461,6 +461,11 @@ static void slob_free(void *block, int size)
 	spin_unlock_irqrestore(&slob_lock, flags);
 }

+void *kmem_cache_last_alloc(struct kmem_cache *s, void *object)
+{
+	return NULL;
+}
+
 /*
  * End of slob allocator proper. Begin kmem_cache_alloc and kmalloc frontend.
  */
diff --git a/mm/slub.c b/mm/slub.c
index b30be23..8ed3ba2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3918,6 +3918,32 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
 	return 0;
 }

+void *kmem_cache_last_alloc(struct kmem_cache *s, void *object)
+{
+#ifdef CONFIG_SLUB_DEBUG
+	void *base;
+	unsigned int objnr;
+	void *objp;
+	struct page *page;
+	struct track *trackp;
+
+	if (!(s->flags & SLAB_STORE_USER))
+		return NULL;
+	page = virt_to_head_page(object);
+	base = page_address(page);
+	objp = kasan_reset_tag(object);
+	objp = restore_red_left(s, objp);
+	objnr = obj_to_index(s, page, objp);
+	objp = base + s->size * objnr;
+	if (objp < base || objp >= base + page->objects * s->size || (objp - base) % s->size)
+		return NULL;
+	trackp = get_track(s, objp, TRACK_ALLOC);
+	return (void *)trackp->addr;
+#else
+	return NULL;
+#endif
+}
+
 /********************************************************************
  *		Kmalloc subsystem
  *******************************************************************/
--
2.9.5
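The core of the slub hunk is rounding a pointer that may land in the
middle of an object down to the start of that object's slot, so that the
tracking data for the right object can be found. A minimal userspace
sketch of that arithmetic follows. The struct toy_slab type and
toy_obj_start() are made up for illustration and are not kernel
interfaces; the real code also deals with KASAN tags and red zones,
which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one slab page: `objects` slots of `size` bytes. */
struct toy_slab {
	unsigned char *base;	/* address of the first slot */
	size_t size;		/* bytes per slot */
	size_t objects;		/* number of slots */
};

/*
 * Map a pointer anywhere inside a slot to the start of that slot.
 * Returns NULL if the pointer lies outside the slab.
 */
static void *toy_obj_start(const struct toy_slab *s, const void *ptr)
{
	uintptr_t base = (uintptr_t)s->base;
	uintptr_t p = (uintptr_t)ptr;
	size_t objnr;

	assert(s->size > 0);
	if (p < base || p >= base + s->objects * s->size)
		return NULL;
	objnr = (p - base) / s->size;	/* round down to the slot index */
	return s->base + objnr * s->size;
}
```

With 64-byte slots, a pointer 70 bytes into the slab maps back to the
slot starting at offset 64, which is the property RCU callbacks and timer
handlers rely on when they only hold a pointer to an embedded field.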
From: "Paul E. McKenney" <paulmck@kernel.org>

NULL pointers can be useful, but the NULL pointers from kmem_last_alloc()
might be caused by any number of things: A not-to-a-slab pointer, failure
to enable all the needed debugging, and bogus slob block-address
computations. This commit therefore introduces error codes to the
kmem_last_alloc() function using the ERR_PTR() facility, and also
introduces kmem_last_alloc_errstring(), which translates the error codes
into strings.

Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/slab.h | 10 ++++++++++
 mm/slab.c            |  2 +-
 mm/slab_common.c     | 28 ++++++++++++++++++++++++++--
 mm/slob.c            |  2 +-
 mm/slub.c            |  4 ++--
 5 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 06dd56b..031e630 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -133,6 +133,15 @@
 #define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
 				(unsigned long)ZERO_SIZE_PTR)

+/*
+ * kmem_last_alloc error codes.
+ */
+#define KMEM_LA_NO_PAGE		1 /* No page structure for pointer. */
+#define KMEM_LA_NO_SLAB		2 /* Pointer not from slab allocator. */
+#define KMEM_LA_SLOB		3 /* No debugging info for slob. */
+#define KMEM_LA_NO_DEBUG	4 /* Debugging not enabled for slab/slub. */
+#define KMEM_LA_INCONSISTENT	5 /* Bogus block within slub page. */
+
 #include <linux/kasan.h>

 struct mem_cgroup;
@@ -188,6 +197,7 @@ size_t __ksize(const void *);
 size_t ksize(const void *);
 void *kmem_cache_last_alloc(struct kmem_cache *s, void *object);
 void *kmem_last_alloc(void *object);
+const char *kmem_last_alloc_errstring(void *lastalloc);

 #ifdef CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR
 void __check_heap_object(const void *ptr, unsigned long n, struct page *page,
diff --git a/mm/slab.c b/mm/slab.c
index 2ab93b8..1f3b263 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3610,7 +3610,7 @@ void *kmem_cache_last_alloc(struct kmem_cache *cachep, void *object)
 	struct page *page;

 	if (!(cachep->flags & SLAB_STORE_USER))
-		return NULL;
+		return ERR_PTR(-KMEM_LA_NO_DEBUG);
 	objp = object - obj_offset(cachep);
 	page = virt_to_head_page(objp);
 	objnr = obj_to_index(cachep, page, objp);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 3f647982..8430a14 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -537,6 +537,30 @@ bool slab_is_available(void)
 }

 /*
+ * If the pointer corresponds to a kmem_last_alloc() error, return
+ * a pointer to the corresponding string, otherwise NULL.
+ */
+const char *kmem_last_alloc_errstring(void *lastalloc)
+{
+	long klaerrno;
+	static const char * const es[] = {
+		"local memory",			/* KMEM_LA_NO_PAGE - 1 */
+		"non-slab memory",		/* KMEM_LA_NO_SLAB - 1 */
+		"slob doesn't do debug",	/* KMEM_LA_SLOB - 1 */
+		"debugging disabled",		/* KMEM_LA_NO_DEBUG - 1 */
+		"bogus slub block",		/* KMEM_LA_INCONSISTENT - 1 */
+	};
+
+	if (!IS_ERR(lastalloc))
+		return NULL;
+	klaerrno = -PTR_ERR(lastalloc) - 1;
+	if (WARN_ON_ONCE(klaerrno >= ARRAY_SIZE(es)))
+		return "kmem_last_alloc error out of range";
+	return es[klaerrno];
+}
+EXPORT_SYMBOL_GPL(kmem_last_alloc_errstring);
+
+/*
  * If the pointer references a slab-allocated object and if sufficient
  * debugging is enabled, return the returrn address for the corresponding
  * allocation. Otherwise, return NULL. Note that passing random pointers
@@ -548,10 +572,10 @@ void *kmem_last_alloc(void *object)
 	struct page *page;

 	if (!virt_addr_valid(object))
-		return NULL;
+		return ERR_PTR(-KMEM_LA_NO_PAGE);
 	page = virt_to_head_page(object);
 	if (!PageSlab(page))
-		return NULL;
+		return ERR_PTR(-KMEM_LA_NO_SLAB);
 	return kmem_cache_last_alloc(page->slab_cache, object);
 }
 EXPORT_SYMBOL_GPL(kmem_last_alloc);
diff --git a/mm/slob.c b/mm/slob.c
index c1f8ed7..e7d6b90 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -463,7 +463,7 @@ static void slob_free(void *block, int size)

 void *kmem_cache_last_alloc(struct kmem_cache *s, void *object)
 {
-	return NULL;
+	return ERR_PTR(-KMEM_LA_SLOB);
 }

 /*
diff --git a/mm/slub.c b/mm/slub.c
index 8ed3ba2..3ddf16a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3928,7 +3928,7 @@ void *kmem_cache_last_alloc(struct kmem_cache *s, void *object)
 	struct track *trackp;

 	if (!(s->flags & SLAB_STORE_USER))
-		return NULL;
+		return ERR_PTR(-KMEM_LA_NO_DEBUG);
 	page = virt_to_head_page(object);
 	base = page_address(page);
 	objp = kasan_reset_tag(object);
@@ -3936,7 +3936,7 @@ void *kmem_cache_last_alloc(struct kmem_cache *s, void *object)
 	objnr = obj_to_index(s, page, objp);
 	objp = base + s->size * objnr;
 	if (objp < base || objp >= base + page->objects * s->size || (objp - base) % s->size)
-		return NULL;
+		return ERR_PTR(-KMEM_LA_INCONSISTENT);
 	trackp = get_track(s, objp, TRACK_ALLOC);
 	return (void *)trackp->addr;
 #else
--
2.9.5
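For anyone unfamiliar with the ERR_PTR() convention this patch leans on,
here is a small userspace approximation of how a small negative error
code rides in a pointer value and is later turned back into a table
index. The toy_ helpers below only imitate the kernel's ERR_PTR(),
IS_ERR(), and PTR_ERR(); they are assumptions written for this sketch,
not the kernel definitions, and the table simply mirrors the strings in
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO	4095	/* assumed to match the kernel's MAX_ERRNO */

#define TOY_LA_NO_PAGE		1
#define TOY_LA_NO_SLAB		2
#define TOY_LA_SLOB		3
#define TOY_LA_NO_DEBUG		4
#define TOY_LA_INCONSISTENT	5

/* Encode a negative error code as a pointer value. */
static void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the (negative) error code from such a pointer. */
static long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top TOY_MAX_ERRNO addresses. */
static int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Translate an error pointer to a string, or NULL for a real pointer. */
static const char *toy_errstring(const void *lastalloc)
{
	static const char * const es[] = {
		"local memory",			/* TOY_LA_NO_PAGE - 1 */
		"non-slab memory",		/* TOY_LA_NO_SLAB - 1 */
		"slob doesn't do debug",	/* TOY_LA_SLOB - 1 */
		"debugging disabled",		/* TOY_LA_NO_DEBUG - 1 */
		"bogus slub block",		/* TOY_LA_INCONSISTENT - 1 */
	};
	long idx;

	if (!toy_is_err(lastalloc))
		return NULL;
	idx = -toy_ptr_err(lastalloc) - 1;
	assert(idx >= 0);	/* guaranteed by toy_is_err() */
	if ((size_t)idx >= sizeof(es) / sizeof(es[0]))
		return "kmem_last_alloc error out of range";
	return es[idx];
}
```

The point of the design is that a caller can tell "not a slab pointer"
apart from "debugging disabled" without a second out-parameter, at the
cost of having to check for an error pointer before using the result as
an address.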
From: "Paul E. McKenney" <paulmck@kernel.org>

The debug-object double-free checks in __call_rcu() print out the
RCU callback function, which is usually sufficient to track down the
double free. However, all uses of things like queue_rcu_work() will
have the same RCU callback function (rcu_work_rcufn() in this case),
so a diagnostic message for a double queue_rcu_work() needs more than
just the callback function.

This commit therefore prints the last allocation address of the
double-freed callback when the callback is slab-allocated and sufficient
debugging is enabled. It uses the shiny new kmem_last_alloc() and
kmem_last_alloc_errstring() functions for this purpose.

Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index b6c9c49..788a072 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2957,6 +2957,8 @@ static void check_cb_ovld(struct rcu_data *rdp)
 static void
 __call_rcu(struct rcu_head *head, rcu_callback_t func)
 {
+	void *allocaddr;
+	const char *allocerr;
 	unsigned long flags;
 	struct rcu_data *rdp;
 	bool was_alldone;
@@ -2970,8 +2972,14 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
 		 * Use rcu:rcu_callback trace event to find the previous
 		 * time callback was passed to __call_rcu().
 		 */
-		WARN_ONCE(1, "__call_rcu(): Double-freed CB %p->%pS()!!!\n",
-			  head, head->func);
+		allocaddr = kmem_last_alloc(head);
+		allocerr = kmem_last_alloc_errstring(allocaddr);
+		if (allocerr)
+			WARN_ONCE(1, "__call_rcu(): Double-freed CB %p->%pS()!!! (%s)\n",
+				  head, head->func, allocerr);
+		else
+			WARN_ONCE(1, "__call_rcu(): Double-freed CB %p->%pS()!!! (Allocated at %pS)\n",
+				  head, head->func, allocaddr);
 		WRITE_ONCE(head->func, rcu_leak_callback);
 		return;
 	}
--
2.9.5
From: "Paul E. McKenney" <paulmck@kernel.org>

In some cases, the allocator return address is in a common function, so
that more information is desired. For example, a percpu_ref
reference-count underflow only has access to a data structure that is
allocated in percpu_ref_init(). In this case, the return address from
the allocator provides no additional information.

This commit therefore creates a kmem_cache_last_alloc() function that
can be passed stackp and nstackp parameters, allowing
CONFIG_STACKTRACE=y slub stack traces to be provided to the caller.

Please note that stack traces cannot be provided unless they are
collected. Collecting stack traces requires that the kernel: (1) Use
the slub allocator, (2) Be built with CONFIG_STACKTRACE=y (which is
the case when ftrace is configured), and (3) Have slub debugging
enabled one way or another, for example, by booting with the
"slub_debug=U" kernel boot parameter.

Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Reported-by: Andrii Nakryiko <andrii@kernel.org>
[ paulmck: Move slab definition per Stephen Rothwell and kbuild test robot. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/slab.h |  3 ++-
 mm/slab.c            | 40 +++++++++++++++++++++-------------------
 mm/slab_common.c     | 39 ++++++++++++++++++++++++++++++++-------
 mm/slob.c            |  4 +++-
 mm/slub.c            | 14 +++++++++++++-
 5 files changed, 71 insertions(+), 29 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 031e630..bdedefd 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -195,8 +195,9 @@ void kfree(const void *);
 void kfree_sensitive(const void *);
 size_t __ksize(const void *);
 size_t ksize(const void *);
-void *kmem_cache_last_alloc(struct kmem_cache *s, void *object);
+void *kmem_cache_last_alloc(struct kmem_cache *s, void *object, void **stackp, int nstackp);
 void *kmem_last_alloc(void *object);
+void *kmem_last_alloc_stack(void *object, void **stackp, int nstackp);
 const char *kmem_last_alloc_errstring(void *lastalloc);

 #ifdef CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR
diff --git a/mm/slab.c b/mm/slab.c
index 1f3b263..ae1a74c 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3602,25 +3602,6 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *cachep,
 EXPORT_SYMBOL(kmem_cache_alloc_node_trace);
 #endif

-void *kmem_cache_last_alloc(struct kmem_cache *cachep, void *object)
-{
-#ifdef DEBUG
-	unsigned int objnr;
-	void *objp;
-	struct page *page;
-
-	if (!(cachep->flags & SLAB_STORE_USER))
-		return ERR_PTR(-KMEM_LA_NO_DEBUG);
-	objp = object - obj_offset(cachep);
-	page = virt_to_head_page(objp);
-	objnr = obj_to_index(cachep, page, objp);
-	objp = index_to_obj(cachep, page, objnr);
-	return *dbg_userword(cachep, objp);
-#else
-	return NULL;
-#endif
-}
-
 static __always_inline void *
 __do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
 {
@@ -3652,6 +3633,27 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
 EXPORT_SYMBOL(__kmalloc_node_track_caller);
 #endif /* CONFIG_NUMA */

+void *kmem_cache_last_alloc(struct kmem_cache *cachep, void *object, void **stackp, int nstackp)
+{
+#ifdef DEBUG
+	unsigned int objnr;
+	void *objp;
+	struct page *page;
+
+	if (!(cachep->flags & SLAB_STORE_USER))
+		return ERR_PTR(-KMEM_LA_NO_DEBUG);
+	objp = object - obj_offset(cachep);
+	page = virt_to_head_page(objp);
+	objnr = obj_to_index(cachep, page, objp);
+	objp = index_to_obj(cachep, page, objnr);
+	if (stackp && nstackp)
+		stackp[0] = NULL;
+	return *dbg_userword(cachep, objp);
+#else
+	return NULL;
+#endif
+}
+
 /**
  * __do_kmalloc - allocate memory
  * @size: how many bytes of memory are required.
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8430a14..b70f357 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -560,14 +560,22 @@ const char *kmem_last_alloc_errstring(void *lastalloc)
 }
 EXPORT_SYMBOL_GPL(kmem_last_alloc_errstring);

-/*
+/**
+ * kmem_last_alloc_stack - Get return address and stack for last allocation
+ * @object: object for which to find last-allocation return address.
+ * @stackp: %NULL or pointer to location to place return-address stack.
+ * @nstackp: maximum number of return addresses that may be stored.
+ *
  * If the pointer references a slab-allocated object and if sufficient
- * debugging is enabled, return the returrn address for the corresponding
- * allocation. Otherwise, return NULL. Note that passing random pointers
- * to this function (including addresses of on-stack variables) is likely
- * to result in panics.
+ * debugging is enabled, return the return address for the corresponding
+ * allocation. If stackp is non-%NULL in %CONFIG_STACKTRACE kernels running
+ * the slub allocator, also copy the return-address stack into @stackp,
+ * limited by @nstackp. Otherwise, return %NULL or an appropriate error
+ * code using %ERR_PTR().
+ *
+ * Return: return address from last allocation, %NULL or negative error code.
  */
-void *kmem_last_alloc(void *object)
+void *kmem_last_alloc_stack(void *object, void **stackp, int nstackp)
 {
 	struct page *page;

@@ -576,7 +584,24 @@ void *kmem_last_alloc(void *object)
 	page = virt_to_head_page(object);
 	if (!PageSlab(page))
 		return ERR_PTR(-KMEM_LA_NO_SLAB);
-	return kmem_cache_last_alloc(page->slab_cache, object);
+	return kmem_cache_last_alloc(page->slab_cache, object, stackp, nstackp);
+}
+EXPORT_SYMBOL_GPL(kmem_last_alloc_stack);
+
+/**
+ * kmem_last_alloc - Get return address for last allocation
+ * @object: object for which to find last-allocation return address.
+ *
+ * If the pointer references a slab-allocated object and if sufficient
+ * debugging is enabled, return the return address for the corresponding
+ * allocation. Otherwise, return %NULL or an appropriate error code using
+ * %ERR_PTR().
+ *
+ * Return: return address from last allocation, %NULL or negative error code.
+ */
+void *kmem_last_alloc(void *object)
+{
+	return kmem_last_alloc_stack(object, NULL, 0);
 }
 EXPORT_SYMBOL_GPL(kmem_last_alloc);

diff --git a/mm/slob.c b/mm/slob.c
index e7d6b90..dab7f3b 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -461,8 +461,10 @@ static void slob_free(void *block, int size)
 	spin_unlock_irqrestore(&slob_lock, flags);
 }

-void *kmem_cache_last_alloc(struct kmem_cache *s, void *object)
+void *kmem_cache_last_alloc(struct kmem_cache *s, void *object, void **stackp, int nstackp)
 {
+	if (stackp && nstackp)
+		stackp[0] = NULL;
 	return ERR_PTR(-KMEM_LA_SLOB);
 }

diff --git a/mm/slub.c b/mm/slub.c
index 3ddf16a..a918b1d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3918,10 +3918,11 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
 	return 0;
 }

-void *kmem_cache_last_alloc(struct kmem_cache *s, void *object)
+void *kmem_cache_last_alloc(struct kmem_cache *s, void *object, void **stackp, int nstackp)
 {
 #ifdef CONFIG_SLUB_DEBUG
 	void *base;
+	int i = 0;
 	unsigned int objnr;
 	void *objp;
 	struct page *page;
@@ -3938,6 +3939,17 @@ void *kmem_cache_last_alloc(struct kmem_cache *s, void *object)
 	if (objp < base || objp >= base + page->objects * s->size || (objp - base) % s->size)
 		return ERR_PTR(-KMEM_LA_INCONSISTENT);
 	trackp = get_track(s, objp, TRACK_ALLOC);
+#ifdef CONFIG_STACKTRACE
+	if (stackp) {
+		for (; i < nstackp && i < TRACK_ADDRS_COUNT; i++) {
+			stackp[i] = (void *)trackp->addrs[i];
+			if (!stackp[i])
+				break;
+		}
+	}
+#endif
+	if (stackp && i < nstackp)
+		stackp[i] = NULL;
 	return (void *)trackp->addr;
 #else
 	return NULL;
--
2.9.5
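The stack-copy loop in the slub hunk has a contract that is easy to
miss: the output is NULL-terminated only when there is room left in the
caller's buffer. The userspace sketch below isolates that loop so the
behavior can be seen directly. toy_copy_stack() and TOY_TRACK_ADDRS are
hypothetical names standing in for the kernel code and
TRACK_ADDRS_COUNT, and the count is an assumed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TRACK_ADDRS	16	/* assumed stand-in for TRACK_ADDRS_COUNT */

/*
 * Copy return addresses from `addrs` (zero-terminated unless all
 * TOY_TRACK_ADDRS entries are used) into `stackp`, storing at most
 * `nstackp` entries.  A NULL terminator is written only if room
 * remains.  Returns the number of addresses copied.
 */
static int toy_copy_stack(const unsigned long *addrs, void **stackp, int nstackp)
{
	int i = 0;

	assert(nstackp >= 0);
	if (!stackp)
		return 0;
	for (; i < nstackp && i < TOY_TRACK_ADDRS; i++) {
		if (!addrs[i])
			break;
		stackp[i] = (void *)(uintptr_t)addrs[i];
	}
	if (i < nstackp)
		stackp[i] = NULL;	/* terminate only when there is room */
	return i;
}
```

A caller that passes a buffer smaller than the recorded trace gets a
full buffer with no terminator, so it has to bound its own loop by
nstackp as well as by the NULL check.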
Hello, Paul.
On Fri, Dec 04, 2020 at 04:40:52PM -0800, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> There are kernel facilities such as per-CPU reference counts that give
> error messages in generic handlers or callbacks, whose messages are
> unenlightening. In the case of per-CPU reference-count underflow, this
> is not a problem when creating a new use of this facility because in that
> case the bug is almost certainly in the code implementing that new use.
> However, trouble arises when deploying across many systems, which might
> exercise corner cases that were not seen during development and testing.
> Here, it would be really nice to get some kind of hint as to which of
> several uses the underflow was caused by.
>
> This commit therefore exposes a new kmem_last_alloc() function that
> takes a pointer to dynamically allocated memory and returns the return
> address of the call that allocated it. This pointer can reference the
> middle of the block as well as the beginning of the block, as needed
> by things like RCU callback functions and timer handlers that might not
> know where the beginning of the memory block is. These functions and
> handlers can use the return value from kmem_last_alloc() to give the
> kernel hacker a better hint as to where the problem might lie.

I agree with exposing allocation caller information to the other
subsystem to help the debugging. Some suggestions...

1. It's better to separate a slab object check (validity check) and
retrieving the allocation caller. Someone else would want to check
only a validity. And, it doesn't depend on the debug configuration so
it's not good to bind it to the debug function.

kmem_cache_valid_(obj|ptr)
kmalloc_valid_(obj|ptr)

2. rename kmem_last_alloc to ...

int kmem_cache_debug_alloc_caller(cache, obj, &ret_addr)
int kmalloc_debug_alloc_caller(obj, &ret_addr)

or debug_kmem_cache_alloc_caller()

I think that function name need to include the keyword 'debug' to show
itself as a debugging facility (enabled at the debugging). And, return
errno and get caller address by pointer argument.

3. If concrete error message is needed, please introduce more functions.

void *kmalloc_debug_error(errno)

Thanks.

On Mon, Dec 07, 2020 at 06:02:53PM +0900, Joonsoo Kim wrote:
> Hello, Paul.
>
> On Fri, Dec 04, 2020 at 04:40:52PM -0800, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > There are kernel facilities such as per-CPU reference counts that give
> > error messages in generic handlers or callbacks, whose messages are
> > unenlightening. In the case of per-CPU reference-count underflow, this
> > is not a problem when creating a new use of this facility because in that
> > case the bug is almost certainly in the code implementing that new use.
> > However, trouble arises when deploying across many systems, which might
> > exercise corner cases that were not seen during development and testing.
> > Here, it would be really nice to get some kind of hint as to which of
> > several uses the underflow was caused by.
> >
> > This commit therefore exposes a new kmem_last_alloc() function that
> > takes a pointer to dynamically allocated memory and returns the return
> > address of the call that allocated it. This pointer can reference the
> > middle of the block as well as the beginning of the block, as needed
> > by things like RCU callback functions and timer handlers that might not
> > know where the beginning of the memory block is. These functions and
> > handlers can use the return value from kmem_last_alloc() to give the
> > kernel hacker a better hint as to where the problem might lie.
>
> I agree with exposing allocation caller information to the other
> subsystem to help the debugging. Some suggestions...

Good to hear! ;-)

> 1. It's better to separate a slab object check (validity check) and
> retrieving the allocation caller. Someone else would want to check
> only a validity. And, it doesn't depend on the debug configuration so
> it's not good to bind it to the debug function.
>
> kmem_cache_valid_(obj|ptr)
> kmalloc_valid_(obj|ptr)

Here both functions would say "true" for a pointer from kmalloc()?
Or do I need to add a third function that is happy with a pointer from
either source?

I do understand that people who don't want to distinguish could just do
"kmem_cache_valid_ptr(p) || kmalloc_valid_ptr(p)". However, the two
use cases in the series have no idea whether the pointer they have came
from kmalloc(), kmem_cache_alloc(), or somewhere else entirely, even an
on-stack variable.

Are you asking me to choose between the _obj() and _ptr() suffixes?
If not, please help me understand the distinction.

Do we want "debug" in these names as well?

> 2. rename kmem_last_alloc to ...
>
> int kmem_cache_debug_alloc_caller(cache, obj, &ret_addr)
> int kmalloc_debug_alloc_caller(obj, &ret_addr)
>
> or debug_kmem_cache_alloc_caller()
>
> I think that function name need to include the keyword 'debug' to show
> itself as a debugging facility (enabled at the debugging). And, return
> errno and get caller address by pointer argument.

I am quite happy to add the "debug", but my use cases have no idea
how the pointer was allocated. In fact, the next version of the
patch will also handle allocator return addresses from vmalloc().

And for kernels without sufficient debug enabled, I need to provide
the name of the slab cache, and this also is to be in the next version.

> 3. If concrete error message is needed, please introduce more functions.
>
> void *kmalloc_debug_error(errno)

Agreed, in fact, I was planning to have a function that printed out
a suitable error-message continuation to the console for ease-of-use
reasons. For example, why is the caller deciding how deep the stack
frame is? ;-)

So something like this?

	void kmalloc_debug_print_provenance(void *ptr);

With the understanding that it will print something helpful regardless
of where ptr came from, within the constraints of the kernel build and
boot options?

							Thanx, Paul

On Mon, Dec 07, 2020 at 09:25:54AM -0800, Paul E. McKenney wrote:
> On Mon, Dec 07, 2020 at 06:02:53PM +0900, Joonsoo Kim wrote:
> > Hello, Paul.
> >
> > On Fri, Dec 04, 2020 at 04:40:52PM -0800, paulmck@kernel.org wrote:
> > > From: "Paul E. McKenney" <paulmck@kernel.org>
> > >
> > > There are kernel facilities such as per-CPU reference counts that give
> > > error messages in generic handlers or callbacks, whose messages are
> > > unenlightening. In the case of per-CPU reference-count underflow, this
> > > is not a problem when creating a new use of this facility because in that
> > > case the bug is almost certainly in the code implementing that new use.
> > > However, trouble arises when deploying across many systems, which might
> > > exercise corner cases that were not seen during development and testing.
> > > Here, it would be really nice to get some kind of hint as to which of
> > > several uses the underflow was caused by.
> > >
> > > This commit therefore exposes a new kmem_last_alloc() function that
> > > takes a pointer to dynamically allocated memory and returns the return
> > > address of the call that allocated it. This pointer can reference the
> > > middle of the block as well as the beginning of the block, as needed
> > > by things like RCU callback functions and timer handlers that might not
> > > know where the beginning of the memory block is. These functions and
> > > handlers can use the return value from kmem_last_alloc() to give the
> > > kernel hacker a better hint as to where the problem might lie.
> >
> > I agree with exposing allocation caller information to the other
> > subsystem to help the debugging. Some suggestions...
>
> Good to hear! ;-)
>
> > 1. It's better to separate a slab object check (validity check) and
> > retrieving the allocation caller. Someone else would want to check
> > only a validity. And, it doesn't depend on the debug configuration so
> > it's not good to bind it to the debug function.
> >
> > kmem_cache_valid_(obj|ptr)
> > kmalloc_valid_(obj|ptr)
>
> Here both functions would say "true" for a pointer from kmalloc()?
> Or do I need to add a third function that is happy with a pointer from
> either source?

I focused on separation and missed this case that the user sometimes
cannot know the object source (kmalloc/kmem_cache). At first step,
just checking whether it is a slab-object or not looks enough.

int kmem_valid_obj()

>
> I do understand that people who don't want to distinguish could just do
> "kmem_cache_valid_ptr(p) || kmalloc_valid_ptr(p)". However, the two
> use cases in the series have no idea whether the pointer they have came
> from kmalloc(), kmem_cache_alloc(), or somewhere else entirely, even an
> on-stack variable.
>
> Are you asking me to choose between the _obj() and _ptr() suffixes?

Yes, I prefer _obj().

> If not, please help me understand the distinction.
>
> Do we want "debug" in these names as well?

I don't think so since it can be called without enabling the debug
option.

>
> > 2. rename kmem_last_alloc to ...
> >
> > int kmem_cache_debug_alloc_caller(cache, obj, &ret_addr)
> > int kmalloc_debug_alloc_caller(obj, &ret_addr)
> >
> > or debug_kmem_cache_alloc_caller()
> >
> > I think that function name need to include the keyword 'debug' to show
> > itself as a debugging facility (enabled at the debugging). And, return
> > errno and get caller address by pointer argument.
>
> I am quite happy to add the "debug", but my use cases have no idea
> how the pointer was allocated. In fact, the next version of the
> patch will also handle allocator return addresses from vmalloc().
>
> And for kernels without sufficient debug enabled, I need to provide
> the name of the slab cache, and this also is to be in the next version.

Okay. So, your code would be...

if (kmem_valid_obj(ptr))
	kmalloc_debug_print_provenance(ptr)
else if (vmalloc_valid_obj(ptr))
	....

> > 3. If concrete error message is needed, please introduce more functions.
> >
> > void *kmalloc_debug_error(errno)
>
> Agreed, in fact, I was planning to have a function that printed out
> a suitable error-message continuation to the console for ease-of-use
> reasons. For example, why is the caller deciding how deep the stack
> frame is? ;-)
>
> So something like this?
>
> 	void kmalloc_debug_print_provenance(void *ptr);
>
> With the understanding that it will print something helpful regardless
> of where ptr came from, within the constraints of the kernel build and
> boot options?

Looks good idea. I suggest a name, kmem_dump_obj(), for this function.
In this case, I don't think that "debug" keyword is needed since it shows
something useful (slab cache info) even if debug option isn't enabled.

So, for summary, we need to introduce two functions to accomplish your
purpose. Please correct me if wrong.

int kmem_valid_obj(ptr)
void kmem_dump_obj(ptr)

Thanks.

On Tue, Dec 08, 2020 at 05:57:07PM +0900, Joonsoo Kim wrote: > On Mon, Dec 07, 2020 at 09:25:54AM -0800, Paul E. McKenney wrote: > > On Mon, Dec 07, 2020 at 06:02:53PM +0900, Joonsoo Kim wrote: > > > Hello, Paul. > > > > > > On Fri, Dec 04, 2020 at 04:40:52PM -0800, paulmck@kernel.org wrote: > > > > From: "Paul E. McKenney" <paulmck@kernel.org> > > > > > > > > There are kernel facilities such as per-CPU reference counts that give > > > > error messages in generic handlers or callbacks, whose messages are > > > > unenlightening. In the case of per-CPU reference-count underflow, this > > > > is not a problem when creating a new use of this facility because in that > > > > case the bug is almost certainly in the code implementing that new use. > > > > However, trouble arises when deploying across many systems, which might > > > > exercise corner cases that were not seen during development and testing. > > > > Here, it would be really nice to get some kind of hint as to which of > > > > several uses the underflow was caused by. > > > > > > > > This commit therefore exposes a new kmem_last_alloc() function that > > > > takes a pointer to dynamically allocated memory and returns the return > > > > address of the call that allocated it. This pointer can reference the > > > > middle of the block as well as the beginning of the block, as needed > > > > by things like RCU callback functions and timer handlers that might not > > > > know where the beginning of the memory block is. These functions and > > > > handlers can use the return value from kmem_last_alloc() to give the > > > > kernel hacker a better hint as to where the problem might lie. > > > > > > I agree with exposing allocation caller information to the other > > > subsystem to help the debugging. Some suggestions... > > > > Good to hear! ;-) > > > > > 1. It's better to separate a slab object check (validity check) and > > > retrieving the allocation caller. Someone else would want to check > > > only a validity. 
> > > And, it doesn't depend on the debug configuration so
> > > it's not good to bind it to the debug function.
> > >
> > > kmem_cache_valid_(obj|ptr)
> > > kmalloc_valid_(obj|ptr)
> >
> > Here both functions would say "true" for a pointer from kmalloc()?
> > Or do I need to add a third function that is happy with a pointer from
> > either source?
>
> I focused on separation and missed this case that the user sometimes
> cannot know the object source (kmalloc/kmem_cache). At first step,
> just checking whether it is a slab-object or not looks enough.
>
> int kmem_valid_obj()

OK, I will update my current kmalloc_valid_obj() to kmem_valid_obj(),
thank you!

> > I do understand that people who don't want to distinguish could just do
> > "kmem_cache_valid_ptr(p) || kmalloc_valid_ptr(p)". However, the two
> > use cases in the series have no idea whether the pointer they have came
> > from kmalloc(), kmem_cache_alloc(), or somewhere else entirely, even an
> > on-stack variable.
> >
> > Are you asking me to choose between the _obj() and _ptr() suffixes?
>
> Yes, I prefer _obj().

Then _obj() it is.

> > If not, please help me understand the distinction.
> >
> > Do we want "debug" in these names as well?
>
> I don't think so since it can be called without enabling the debug
> option.

OK, understood.

> > > 2. rename kmem_last_alloc to ...
> > >
> > > int kmem_cache_debug_alloc_caller(cache, obj, &ret_addr)
> > > int kmalloc_debug_alloc_caller(obj, &ret_addr)
> > >
> > > or debug_kmem_cache_alloc_caller()
> > >
> > > I think that function name need to include the keyword 'debug' to show
> > > itself as a debugging facility (enabled at the debugging). And, return
> > > errno and get caller address by pointer argument.
> >
> > I am quite happy to add the "debug", but my use cases have no idea
> > how the pointer was allocated. In fact, the next version of the
> > patch will also handle allocator return addresses from vmalloc().
> >
> > And for kernels without sufficient debug enabled, I need to provide
> > the name of the slab cache, and this also is to be in the next version.
>
> Okay. So, your code would be...
>
> if (kmem_valid_obj(ptr))
> 	kmalloc_debug_print_provenance(ptr)
> else if (vmalloc_valid_obj(ptr))
> 	....

Suggestions on where to put mem_dump_obj(), or whatever the function
that executes this code ends up being named? Left to myself, I will pick
a likely place on the theory that it can always be moved later.

This structuring does cause double work, but this should be OK because
all of the uses I know of are on error paths.

> > > 3. If concrete error message is needed, please introduce more functions.
> > >
> > > void *kmalloc_debug_error(errno)
> >
> > Agreed, in fact, I was planning to have a function that printed out
> > a suitable error-message continuation to the console for ease-of-use
> > reasons. For example, why is the caller deciding how deep the stack
> > frame is? ;-)
> >
> > So something like this?
> >
> > void kmalloc_debug_print_provenance(void *ptr);
> >
> > With the understanding that it will print something helpful regardless
> > of where ptr came from, within the constraints of the kernel build and
> > boot options?
>
> Looks good idea. I suggest a name, kmem_dump_obj(), for this function.
> In this case, I don't think that "debug" keyword is needed since it shows
> something useful (slab cache info) even if debug option isn't enabled.
>
> So, for summary, we need to introduce two functions to accomplish your
> purpose. Please correct me if wrong.
>
> int kmem_valid_obj(ptr)
> void kmem_dump_obj(ptr)

Within slab, agreed.

We of course also need something like mem_dump_obj() to handle a pointer
with unknown provenance, along with the vmalloc_valid_obj() and the
vmalloc_dump_obj(). And similar functions should other allocation
sources become important.

							Thanx, Paul
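
For concreteness, the dispatch discussed above might look roughly like
the userspace sketch below. This is only an illustration under stated
assumptions, not the code from the next version of the patch: the two
static arenas stand in for slab and vmalloc memory, in_arena() is a
hypothetical helper, and the printed strings are made up. Real kernel
implementations would consult the allocators' own metadata instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: two static arenas play the roles of slab and vmalloc memory. */
static char fake_slab_arena[256];
static char fake_vmalloc_arena[256];

/* Hypothetical helper: does ptr fall anywhere inside the given arena? */
static int in_arena(const char *arena, size_t size, const void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	uintptr_t base = (uintptr_t)arena;

	return p >= base && p < base + size;
}

/* Validity checks: usable regardless of debug configuration. */
int kmem_valid_obj(const void *ptr)
{
	return in_arena(fake_slab_arena, sizeof(fake_slab_arena), ptr);
}

int vmalloc_valid_obj(const void *ptr)
{
	return in_arena(fake_vmalloc_arena, sizeof(fake_vmalloc_arena), ptr);
}

/* Dump functions: print whatever is known about the object (mock text). */
void kmem_dump_obj(const void *ptr)
{
	printf(" slab object at %p (mock)\n", ptr);
}

void vmalloc_dump_obj(const void *ptr)
{
	printf(" vmalloc region containing %p (mock)\n", ptr);
}

/*
 * Pointer of unknown provenance: try each allocation source in turn.
 * The real dump functions would repeat the lookup done by the validity
 * check, which is the double work mentioned above.
 */
void mem_dump_obj(const void *ptr)
{
	if (kmem_valid_obj(ptr))
		kmem_dump_obj(ptr);
	else if (vmalloc_valid_obj(ptr))
		vmalloc_dump_obj(ptr);
	else
		printf(" non-slab/vmalloc memory at %p (mock)\n", ptr);
}
```

A caller on an error path would simply invoke mem_dump_obj(ptr) after
its own message, and a pointer into the middle of an object is handled
the same way as a pointer to its beginning.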