linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] slab: fix a crash by reading /proc/slab_allocators
@ 2019-04-06 22:59 Qian Cai
  2019-04-08  1:59 ` Tobin C. Harding
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Qian Cai @ 2019-04-06 22:59 UTC (permalink / raw)
  To: akpm
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, tj, linux-mm,
	linux-kernel, Qian Cai

The commit 510ded33e075 ("slab: implement slab_root_caches list")
changes the name of the list node within "struct kmem_cache" from
"list" to "root_caches_node", but leaks_show() still use the "list"
which causes a crash when reading /proc/slab_allocators.

BUG: unable to handle kernel NULL pointer dereference at
00000000000000aa
PGD 0 P4D 0
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
CPU: 3 PID: 5925 Comm: ldd Not tainted 5.1.0-rc3-mm1+ #6
RIP: 0010:__lock_acquire.isra.14+0x4b4/0xa50
Call Trace:
 <IRQ>
 lock_acquire+0xa3/0x180
 _raw_spin_lock+0x2f/0x40
 do_drain+0x61/0xc0
 flush_smp_call_function_queue+0x3a/0x110
 generic_smp_call_function_single_interrupt+0x13/0x2b
 smp_call_function_interrupt+0x66/0x1a0
 call_function_interrupt+0xf/0x20
 </IRQ>
RIP: 0010:__tlb_remove_page_size+0x8c/0xe0
 zap_pte_range+0x39f/0xc80
 unmap_page_range+0x38a/0x550
 unmap_single_vma+0x7d/0xe0
 unmap_vmas+0xae/0xd0
 exit_mmap+0xae/0x190
 mmput+0x7a/0x150
 do_exit+0x2d9/0xd40
 do_group_exit+0x41/0xd0
 __x64_sys_exit_group+0x18/0x20
 do_syscall_64+0x68/0x381
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 510ded33e075 ("slab: implement slab_root_caches list")
Signed-off-by: Qian Cai <cai@lca.pw>
---
 mm/slab.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/slab.c b/mm/slab.c
index 46a6e084222b..9142ee992493 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -4307,7 +4307,8 @@ static void show_symbol(struct seq_file *m, unsigned long address)
 
 static int leaks_show(struct seq_file *m, void *p)
 {
-	struct kmem_cache *cachep = list_entry(p, struct kmem_cache, list);
+	struct kmem_cache *cachep = list_entry(p, struct kmem_cache,
+					       root_caches_node);
 	struct page *page;
 	struct kmem_cache_node *n;
 	const char *name;
-- 
2.17.2 (Apple Git-113)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
  2019-04-06 22:59 [PATCH] slab: fix a crash by reading /proc/slab_allocators Qian Cai
@ 2019-04-08  1:59 ` Tobin C. Harding
  2019-04-08  2:18   ` Qian Cai
  2019-04-08  5:03 ` Tobin C. Harding
  2019-04-08  5:35 ` Linus Torvalds
  2 siblings, 1 reply; 8+ messages in thread
From: Tobin C. Harding @ 2019-04-08  1:59 UTC (permalink / raw)
  To: Qian Cai
  Cc: akpm, cl, penberg, rientjes, iamjoonsoo.kim, tj, linux-mm, linux-kernel

On Sat, Apr 06, 2019 at 06:59:01PM -0400, Qian Cai wrote:
> The commit 510ded33e075 ("slab: implement slab_root_caches list")
> changes the name of the list node within "struct kmem_cache" from
> "list" to "root_caches_node"

Are you sure? It looks to me like it adds a member to the memcg_cache_array

diff --git a/include/linux/slab.h b/include/linux/slab.h
index a0cc7a77cda2..af1a5bef80f4 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -556,6 +556,8 @@ struct memcg_cache_array {
  *             used to index child cachces during allocation and cleared
  *             early during shutdown.
  * 
+ * @root_caches_node: List node for slab_root_caches list.
+ * 
  * @children:  List of all child caches.  While the child caches are also
  *             reachable through @memcg_caches, a child cache remains on
  *             this list until it is actually destroyed.
@@ -573,6 +575,7 @@ struct memcg_cache_params {
        union { 
                struct {
                        struct memcg_cache_array __rcu *memcg_caches;
+                       struct list_head __root_caches_node;
                        struct list_head children;
                };

And then defines 'root_caches_node' to be 'memcg_params.__root_caches_node'
if we have CONFIG_MEMCG otherwise defines 'root_caches_node' to be 'list'


> but leaks_show() still use the "list"

I believe it should since 'list' is used to add to slab_caches list.

> which causes a crash when reading /proc/slab_allocators.

I was unable to reproduce this crash, I built with

# CONFIG_MEMCG is not set
CONFIG_SLAB=y
CONFIG_SLAB_MERGE_DEFAULT=y
CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y

I then booted in Qemu and successfully ran 
$ cat slab_allocators

Perhaps you could post your config?

Hope this helps,
Tobin.

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
  2019-04-08  1:59 ` Tobin C. Harding
@ 2019-04-08  2:18   ` Qian Cai
  0 siblings, 0 replies; 8+ messages in thread
From: Qian Cai @ 2019-04-08  2:18 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: akpm, cl, penberg, rientjes, iamjoonsoo.kim, tj, linux-mm, linux-kernel



On 4/7/19 9:59 PM, Tobin C. Harding wrote:
> On Sat, Apr 06, 2019 at 06:59:01PM -0400, Qian Cai wrote:
>> The commit 510ded33e075 ("slab: implement slab_root_caches list")
>> changes the name of the list node within "struct kmem_cache" from
>> "list" to "root_caches_node"
> 
> Are you sure? It looks to me like it adds a member to the memcg_cache_array
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index a0cc7a77cda2..af1a5bef80f4 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -556,6 +556,8 @@ struct memcg_cache_array {
>   *             used to index child cachces during allocation and cleared
>   *             early during shutdown.
>   * 
> + * @root_caches_node: List node for slab_root_caches list.
> + * 
>   * @children:  List of all child caches.  While the child caches are also
>   *             reachable through @memcg_caches, a child cache remains on
>   *             this list until it is actually destroyed.
> @@ -573,6 +575,7 @@ struct memcg_cache_params {
>         union { 
>                 struct {
>                         struct memcg_cache_array __rcu *memcg_caches;
> +                       struct list_head __root_caches_node;
>                         struct list_head children;
>                 };
> 
> And then defines 'root_caches_node' to be 'memcg_params.__root_caches_node'
> if we have CONFIG_MEMCG otherwise defines 'root_caches_node' to be 'list'
> 
> 
>> but leaks_show() still use the "list"
> 
> I believe it should since 'list' is used to add to slab_caches list.

See the offensive commit 510ded33e075 which changed those.

@@ -1136,12 +1146,12 @@ static void print_slabinfo_header(struct seq_file *m)
 void *slab_start(struct seq_file *m, loff_t *pos)
 {
        mutex_lock(&slab_mutex);
-       return seq_list_start(&slab_caches, *pos);
+       return seq_list_start(&slab_root_caches, *pos);
 }

 void *slab_next(struct seq_file *m, void *p, loff_t *pos)
 {
-       return seq_list_next(p, &slab_caches, pos);
+       return seq_list_next(p, &slab_root_caches, pos);
 }

and then memcg_link_cache() does,

if (is_root_cache(s)) {
	list_add(&s->root_caches_node, &slab_root_caches);

memcg_unlink_cache() does,

if (is_root_cache(s)) {
	list_del(&s->root_caches_node);

It also changed /proc/slabinfo but forgot to change /proc/slab_allocators.

@@ -1193,12 +1203,11 @@ static void cache_show(struct kmem_cache *s, struct
seq_file *m)

 static int slab_show(struct seq_file *m, void *p)
 {
-       struct kmem_cache *s = list_entry(p, struct kmem_cache, list);
+       struct kmem_cache *s = list_entry(p, struct kmem_cache, root_caches_node);

> 
>> which causes a crash when reading /proc/slab_allocators.
> 
> I was unable to reproduce this crash, I built with
> 
> # CONFIG_MEMCG is not set
> CONFIG_SLAB=y
> CONFIG_SLAB_MERGE_DEFAULT=y
> CONFIG_SLAB_FREELIST_RANDOM=y
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SLAB_LEAK=y
> 
> I then booted in Qemu and successfully ran 
> $ cat slab_allocators
> 
> Perhaps you could post your config?

Yes, it won't be reproducible without CONFIG_MEMCG=y, because it has,

/* If !memcg, all caches are root. */
#define slab_root_caches       slab_caches
#define root_caches_node       list

Anyway,

https://git.sr.ht/~cai/linux-debug/blob/master/config


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
  2019-04-06 22:59 [PATCH] slab: fix a crash by reading /proc/slab_allocators Qian Cai
  2019-04-08  1:59 ` Tobin C. Harding
@ 2019-04-08  5:03 ` Tobin C. Harding
  2019-04-08  5:35 ` Linus Torvalds
  2 siblings, 0 replies; 8+ messages in thread
From: Tobin C. Harding @ 2019-04-08  5:03 UTC (permalink / raw)
  To: Qian Cai
  Cc: akpm, cl, penberg, rientjes, iamjoonsoo.kim, tj, linux-mm, linux-kernel

On Sat, Apr 06, 2019 at 06:59:01PM -0400, Qian Cai wrote:
> The commit 510ded33e075 ("slab: implement slab_root_caches list")
> changes the name of the list node within "struct kmem_cache" from
> "list" to "root_caches_node", but leaks_show() still use the "list"
> which causes a crash when reading /proc/slab_allocators.
> 
> BUG: unable to handle kernel NULL pointer dereference at
> 00000000000000aa
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
> CPU: 3 PID: 5925 Comm: ldd Not tainted 5.1.0-rc3-mm1+ #6
> RIP: 0010:__lock_acquire.isra.14+0x4b4/0xa50
> Call Trace:
>  <IRQ>
>  lock_acquire+0xa3/0x180
>  _raw_spin_lock+0x2f/0x40
>  do_drain+0x61/0xc0
>  flush_smp_call_function_queue+0x3a/0x110
>  generic_smp_call_function_single_interrupt+0x13/0x2b
>  smp_call_function_interrupt+0x66/0x1a0
>  call_function_interrupt+0xf/0x20
>  </IRQ>
> RIP: 0010:__tlb_remove_page_size+0x8c/0xe0
>  zap_pte_range+0x39f/0xc80
>  unmap_page_range+0x38a/0x550
>  unmap_single_vma+0x7d/0xe0
>  unmap_vmas+0xae/0xd0
>  exit_mmap+0xae/0x190
>  mmput+0x7a/0x150
>  do_exit+0x2d9/0xd40
>  do_group_exit+0x41/0xd0
>  __x64_sys_exit_group+0x18/0x20
>  do_syscall_64+0x68/0x381
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Fixes: 510ded33e075 ("slab: implement slab_root_caches list")
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
>  mm/slab.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 46a6e084222b..9142ee992493 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -4307,7 +4307,8 @@ static void show_symbol(struct seq_file *m, unsigned long address)
>  
>  static int leaks_show(struct seq_file *m, void *p)
>  {
> -	struct kmem_cache *cachep = list_entry(p, struct kmem_cache, list);
> +	struct kmem_cache *cachep = list_entry(p, struct kmem_cache,
> +					       root_caches_node);
>  	struct page *page;
>  	struct kmem_cache_node *n;
>  	const char *name;
> -- 
> 2.17.2 (Apple Git-113)
> 

For what its worth

Reviewed-by: Tobin C. Harding <tobin@kernel.org>

thanks,
Tobin.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
  2019-04-06 22:59 [PATCH] slab: fix a crash by reading /proc/slab_allocators Qian Cai
  2019-04-08  1:59 ` Tobin C. Harding
  2019-04-08  5:03 ` Tobin C. Harding
@ 2019-04-08  5:35 ` Linus Torvalds
  2019-04-08 13:17   ` Qian Cai
                     ` (2 more replies)
  2 siblings, 3 replies; 8+ messages in thread
From: Linus Torvalds @ 2019-04-08  5:35 UTC (permalink / raw)
  To: Qian Cai
  Cc: Andrew Morton, Christoph Lameter, penberg, David Rientjes,
	iamjoonsoo.kim, Tejun Heo, Linux-MM, Linux List Kernel Mailing

On Sat, Apr 6, 2019 at 12:59 PM Qian Cai <cai@lca.pw> wrote:
>
> The commit 510ded33e075 ("slab: implement slab_root_caches list")
> changes the name of the list node within "struct kmem_cache" from
> "list" to "root_caches_node", but leaks_show() still use the "list"
> which causes a crash when reading /proc/slab_allocators.

The patch does seem to be correct, and I have applied it.

However, it does strike me that apparently this wasn't caught for two
years. Which makes me wonder whether we should (once again) discuss
just removing SLAB entirely, or at least removing the
/proc/slab_allocators file. Apparently it has never been used in the
last two years. At some point a "this can't have worked if  anybody
ever tried to use it" situation means that the code should likely be
excised.

Qian, how did you end up noticing and debugging this?

                 Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
  2019-04-08  5:35 ` Linus Torvalds
@ 2019-04-08 13:17   ` Qian Cai
  2019-04-08 15:15   ` Christopher Lameter
  2019-04-08 23:41   ` Tobin C. Harding
  2 siblings, 0 replies; 8+ messages in thread
From: Qian Cai @ 2019-04-08 13:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Christoph Lameter, penberg, David Rientjes,
	iamjoonsoo.kim, Tejun Heo, Linux-MM, Linux List Kernel Mailing

On Sun, 2019-04-07 at 19:35 -1000, Linus Torvalds wrote:
> On Sat, Apr 6, 2019 at 12:59 PM Qian Cai <cai@lca.pw> wrote:
> > 
> > The commit 510ded33e075 ("slab: implement slab_root_caches list")
> > changes the name of the list node within "struct kmem_cache" from
> > "list" to "root_caches_node", but leaks_show() still use the "list"
> > which causes a crash when reading /proc/slab_allocators.
> 
> The patch does seem to be correct, and I have applied it.
> 
> However, it does strike me that apparently this wasn't caught for two
> years. Which makes me wonder whether we should (once again) discuss
> just removing SLAB entirely, or at least removing the
> /proc/slab_allocators file. Apparently it has never been used in the
> last two years. At some point a "this can't have worked if  anybody
> ever tried to use it" situation means that the code should likely be
> excised.
> 
> Qian, how did you end up noticing and debugging this?

There are some nice texts for CONFIG_SLAB Kconfig written in 2007,

"The regular slab allocator that is established and known to work well in all
environments."

"tricked" me into enabling it in a debug kernel for running testing where LTP
proc01 test case (read all files in procfs) would usually trigger the crash
(Sometimes, "cat /proc/slab_allocators" would just end up printing nothing).

Normally, all those debug kernels would use CONFIG_KASAN which would set
CONFIG_DEBUG_SLAB=n. However, there is no KASAN for powerpc yet, so it selects
CONFIG_DEBUG_SLAB=y there, and then the testing found the issue.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
  2019-04-08  5:35 ` Linus Torvalds
  2019-04-08 13:17   ` Qian Cai
@ 2019-04-08 15:15   ` Christopher Lameter
  2019-04-08 23:41   ` Tobin C. Harding
  2 siblings, 0 replies; 8+ messages in thread
From: Christopher Lameter @ 2019-04-08 15:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Qian Cai, Andrew Morton, penberg, David Rientjes, iamjoonsoo.kim,
	Tejun Heo, Linux-MM, Linux List Kernel Mailing

On Sun, 7 Apr 2019, Linus Torvalds wrote:

> On Sat, Apr 6, 2019 at 12:59 PM Qian Cai <cai@lca.pw> wrote:
> >
> > The commit 510ded33e075 ("slab: implement slab_root_caches list")
> > changes the name of the list node within "struct kmem_cache" from
> > "list" to "root_caches_node", but leaks_show() still use the "list"
> > which causes a crash when reading /proc/slab_allocators.
>
> The patch does seem to be correct, and I have applied it.
>
> However, it does strike me that apparently this wasn't caught for two
> years. Which makes me wonder whether we should (once again) discuss
> just removing SLAB entirely, or at least removing the
> /proc/slab_allocators file. Apparently it has never been used in the
> last two years. At some point a "this can't have worked if  anybody
> ever tried to use it" situation means that the code should likely be
> excised.

This is only occurring with specially build kernels so that memory leaks
can be investigated. The same is done with other tools (kasan and friends)
today I guess and also the SLUB debugging tools are much more user
friendly. So this means that some esoteric debugging feature of SLAB was
broken.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
  2019-04-08  5:35 ` Linus Torvalds
  2019-04-08 13:17   ` Qian Cai
  2019-04-08 15:15   ` Christopher Lameter
@ 2019-04-08 23:41   ` Tobin C. Harding
  2 siblings, 0 replies; 8+ messages in thread
From: Tobin C. Harding @ 2019-04-08 23:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Qian Cai, Andrew Morton, Christoph Lameter, penberg,
	David Rientjes, iamjoonsoo.kim, Tejun Heo, Linux-MM,
	Linux List Kernel Mailing

On Sun, Apr 07, 2019 at 07:35:34PM -1000, Linus Torvalds wrote:
> On Sat, Apr 6, 2019 at 12:59 PM Qian Cai <cai@lca.pw> wrote:
> >
> > The commit 510ded33e075 ("slab: implement slab_root_caches list")
> > changes the name of the list node within "struct kmem_cache" from
> > "list" to "root_caches_node", but leaks_show() still use the "list"
> > which causes a crash when reading /proc/slab_allocators.
> 
> The patch does seem to be correct, and I have applied it.
> 
> However, it does strike me that apparently this wasn't caught for two
> years. Which makes me wonder whether we should (once again) discuss
> just removing SLAB entirely, or at least removing the
> /proc/slab_allocators file. Apparently it has never been used in the
> last two years. At some point a "this can't have worked if  anybody
> ever tried to use it" situation means that the code should likely be
> excised.

The bug doesn't trigger on every read of /proc/slab_allocators (as noted
later in this thread by Qian).  I tried to repro it with a bunch of
different configs and often `cat /proc/slab_allocators` just returns
empty.

I've got a patchset ready to go sitting in my tree that removes SLAB, I
could just send it to start the conversation :)


	Tobin

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-04-08 23:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-06 22:59 [PATCH] slab: fix a crash by reading /proc/slab_allocators Qian Cai
2019-04-08  1:59 ` Tobin C. Harding
2019-04-08  2:18   ` Qian Cai
2019-04-08  5:03 ` Tobin C. Harding
2019-04-08  5:35 ` Linus Torvalds
2019-04-08 13:17   ` Qian Cai
2019-04-08 15:15   ` Christopher Lameter
2019-04-08 23:41   ` Tobin C. Harding

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).