linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
@ 2018-06-19  2:51 Jason A. Donenfeld
  2018-06-19  3:59 ` Shakeel Butt
  2018-06-19 17:34 ` Andrey Ryabinin
  0 siblings, 2 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-19  2:51 UTC (permalink / raw)
  To: aryabinin, Alexander Potapenko, Dmitry Vyukov, cl, penberg,
	rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM,
	LKML, Shakeel Butt

Hello Shakeel,

It may be the case that f9e13c0a5a33d1eaec374d6d4dab53a4f72756a0 has
introduced a regression. I've bisected a failing test to this commit,
and after staring at the my code for a long time, I'm unable to find a
bug that this commit might have unearthed. Rather, it looks like this
commit introduces a performance optimization, rather than a
correctness fix, so it seems that whatever test case is failing is
likely an incorrect failure. Does that seem like an accurate
possibility to you?

Below is a stack trace when things go south. Let me know if you'd like
to run my test suite, and I can send additional information.

Regards,
Jason


[    1.364686] kasan: GPF could be caused by NULL-ptr deref or user
memory access
[    1.365258] general protection fault: 0000 [#1] PREEMPT SMP
DEBUG_PAGEALLOC KASAN
[    1.365852] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.16.0 #19
[    1.366315] RIP: 0010:___cache_free+0x76/0x1e0
[    1.366667] RSP: 0000:ffff8800003af868 EFLAGS: 00010286
[    1.367079] RAX: ffffea0000cb04a0 RBX: ffff8800351f1958 RCX: ffff880035954900
[    1.367640] RDX: ffffea0000cb049f RSI: ffff8800351f1958 RDI: ffff880035954900
[    1.368014] RBP: ffffea0000d47c40 R08: ffff8800003a0870 R09: 0000000000000006
[    1.368014] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880033314b98
[    1.368014] R13: ffff880035954900 R14: ffffea0000000000 R15: ffffffff826dfae0
[    1.368014] FS:  0000000000000000(0000) GS:ffff880036480000(0000)
knlGS:00000000000
[    1.368014] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.368014] CR2: 00000000ffffffff CR3: 0000000002220001 CR4: 00000000001606a0
[    1.368014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.368014] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.368014] Call Trace:
[    1.368014]  ? qlist_free_all+0x58/0x1c0
[    1.368014]  qlist_free_all+0x70/0x1c0
[    1.368014]  ? trace_hardirqs_on_caller+0x3d0/0x630
[    1.368014]  quarantine_reduce+0x221/0x310
[    1.368014]  kasan_kmalloc+0x95/0xc0
[    1.368014]  kmem_cache_alloc+0x151/0x2b0
[    1.368014]  create_object+0xa7/0xa70
[    1.368014]  ? kmemleak_disable+0x90/0x90
[    1.368014]  ? trace_hardirqs_on_caller+0x3d0/0x630
[    1.368014]  ? fs_reclaim_acquire.part.14+0x30/0x30
[    1.368014]  __kmalloc+0x200/0x340
[    1.368014]  ? do_one_initcall+0x12c/0x212
[    1.368014]  __register_sysctl_table+0xbe/0x11b0
[    1.368014]  ipv4_sysctl_init_net+0x1cf/0x2d0
[    1.368014]  ops_init+0x203/0x510
[    1.368014]  ? proc_sys_setattr+0xe0/0xe0
[    1.368014]  ? __peernet2id_alloc+0x180/0x180
[    1.368014]  ? __rb_erase_color+0x1d90/0x1d90
[    1.368014]  register_pernet_operations+0x38e/0x960
[    1.368014]  ? setup_net+0x8b0/0x8b0
[    1.368014]  ? register_pernet_subsys+0x10/0x40
[    1.368014]  ? down_write+0x96/0x150
[    1.368014]  ? register_pernet_subsys+0x10/0x40
[    1.368014]  ? __register_sysctl_table+0x669/0x11b0
[    1.368014]  ? gre_offload_init+0x44/0x44
[    1.368014]  register_pernet_subsys+0x1f/0x40
[    1.368014]  sysctl_ipv4_init+0x34/0x47
[    1.368014]  do_one_initcall+0x12c/0x212
[    1.368014]  ? start_kernel+0x60e/0x60e
[    1.368014]  ? up_write+0x78/0x220
[    1.368014]  ? up_read+0x130/0x130
[    1.368014]  ? __asan_register_globals+0x53/0x80
[    1.368014]  ? kasan_unpoison_shadow+0x30/0x40
[    1.368014]  kernel_init_freeable+0x3b5/0x459
[    1.368014]  ? rest_init+0x2bf/0x2bf
[    1.368014]  kernel_init+0x7/0x11b
[    1.368014]  ? rest_init+0x2bf/0x2bf
[    1.368014]  ret_from_fork+0x24/0x30
[    1.368014] Code: 83 fd e0 0f 84 62 01 00 00 48 8b 45 20 49 c7 c7
e0 fa 6d 82 48 8
[    1.368014] RIP: ___cache_free+0x76/0x1e0 RSP: ffff8800003af868
[    1.387680] ---[ end trace 975b7b250dd637de ]---
[    1.388098] Kernel panic - not syncing: Fatal exception
[    1.388655] Kernel Offset: disabled

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19  2:51 Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()" Jason A. Donenfeld
@ 2018-06-19  3:59 ` Shakeel Butt
  2018-06-19  4:08   ` Jason A. Donenfeld
  2018-06-19 17:34 ` Andrey Ryabinin
  1 sibling, 1 reply; 16+ messages in thread
From: Shakeel Butt @ 2018-06-19  3:59 UTC (permalink / raw)
  To: Jason
  Cc: Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, kasan-dev, Linux MM, LKML

On Mon, Jun 18, 2018 at 7:51 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hello Shakeel,
>
> It may be the case that f9e13c0a5a33d1eaec374d6d4dab53a4f72756a0 has
> introduced a regression. I've bisected a failing test to this commit,
> and after staring at the my code for a long time, I'm unable to find a
> bug that this commit might have unearthed. Rather, it looks like this
> commit introduces a performance optimization, rather than a
> correctness fix, so it seems that whatever test case is failing is
> likely an incorrect failure. Does that seem like an accurate
> possibility to you?
>
> Below is a stack trace when things go south. Let me know if you'd like
> to run my test suite, and I can send additional information.
>
> Regards,
> Jason
>

Hi Jason, yes please do send me the test suite with the kernel config.

Shakeel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19  3:59 ` Shakeel Butt
@ 2018-06-19  4:08   ` Jason A. Donenfeld
  2018-06-19  4:55     ` Dmitry Vyukov
  2018-06-19  5:06     ` Shakeel Butt
  0 siblings, 2 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-19  4:08 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: aryabinin, Alexander Potapenko, Dmitry Vyukov, cl, penberg,
	rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM,
	LKML

On Tue, Jun 19, 2018 at 5:59 AM Shakeel Butt <shakeelb@google.com> wrote:
> Hi Jason, yes please do send me the test suite with the kernel config.

$ git clone https://git.zx2c4.com/WireGuard
$ cd WireGuard/src
$ [[ $(gcc -v 2>&1) =~ gcc\ version\ 8\.1\.0 ]] || echo crash needs 8.1
$ export DEBUG_KERNEL=yes
$ export KERNEL_VERSION=4.18-rc1
$ make test-qemu -j$(nproc)

This will build a kernel and a minimal userland and load it in qemu,
which must be installed.

This code is what causes the crash:
The self test that's executed:
https://git.zx2c4.com/WireGuard/tree/src/selftest/ratelimiter.h
Which exercises this code:
https://git.zx2c4.com/WireGuard/tree/src/ratelimiter.c

The problem occurs after gc_entries(NULL) frees things (line 124 in
ratelimiter.h above), and then line 133 reallocates those objects.
Sometime after that happens, elsewhere in the kernel invokes this
kasan issue in the kasan cache cleanup.

I realize it's disappointing that the test case here is in WireGuard,
which isn't (yet!) upstream. That's why in my original message I
wrote:

> Rather, it looks like this
> commit introduces a performance optimization, rather than a
> correctness fix, so it seems that whatever test case is failing is
> likely an incorrect failure. Does that seem like an accurate
> possibility to you?

I was hoping to only point you toward my own code after establishing
the possibility that the bug is not my own. If you still think there's
a chance this is due to my own correctness issue, and your commit has
simply unearthed it, let me know and I'll happily keep debugging on my
own before pinging you further.

Regards,
Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19  4:08   ` Jason A. Donenfeld
@ 2018-06-19  4:55     ` Dmitry Vyukov
  2018-06-19 13:04       ` Jason A. Donenfeld
  2018-06-19  5:06     ` Shakeel Butt
  1 sibling, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2018-06-19  4:55 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Shakeel Butt, Andrey Ryabinin, Alexander Potapenko,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, kasan-dev, Linux-MM, LKML

On Tue, Jun 19, 2018 at 6:08 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> On Tue, Jun 19, 2018 at 5:59 AM Shakeel Butt <shakeelb@google.com> wrote:
>> Hi Jason, yes please do send me the test suite with the kernel config.
>
> $ git clone https://git.zx2c4.com/WireGuard
> $ cd WireGuard/src
> $ [[ $(gcc -v 2>&1) =~ gcc\ version\ 8\.1\.0 ]] || echo crash needs 8.1
> $ export DEBUG_KERNEL=yes
> $ export KERNEL_VERSION=4.18-rc1
> $ make test-qemu -j$(nproc)
>
> This will build a kernel and a minimal userland and load it in qemu,
> which must be installed.
>
> This code is what causes the crash:
> The self test that's executed:
> https://git.zx2c4.com/WireGuard/tree/src/selftest/ratelimiter.h
> Which exercises this code:
> https://git.zx2c4.com/WireGuard/tree/src/ratelimiter.c
>
> The problem occurs after gc_entries(NULL) frees things (line 124 in
> ratelimiter.h above), and then line 133 reallocates those objects.
> Sometime after that happens, elsewhere in the kernel invokes this
> kasan issue in the kasan cache cleanup.
>
> I realize it's disappointing that the test case here is in WireGuard,
> which isn't (yet!) upstream. That's why in my original message I
> wrote:
>
>> Rather, it looks like this
>> commit introduces a performance optimization, rather than a
>> correctness fix, so it seems that whatever test case is failing is
>> likely an incorrect failure. Does that seem like an accurate
>> possibility to you?
>
> I was hoping to only point you toward my own code after establishing
> the possibility that the bug is not my own. If you still think there's
> a chance this is due to my own correctness issue, and your commit has
> simply unearthed it, let me know and I'll happily keep debugging on my
> own before pinging you further.


Hi Jason,

Your code frees all entries before freeing the cache, right? If you
add total_entries check before freeing the cache, it does not fire,
right?
Are you using SLAB or SLUB? We stress kernel pretty heavily, but with
SLAB, and I suspect Shakeel may also be using SLAB. So if you are
using SLUB, there is significant chance that it's a bug in the SLUB
part of the change.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19  4:08   ` Jason A. Donenfeld
  2018-06-19  4:55     ` Dmitry Vyukov
@ 2018-06-19  5:06     ` Shakeel Butt
  2018-06-19 13:25       ` Jason A. Donenfeld
  1 sibling, 1 reply; 16+ messages in thread
From: Shakeel Butt @ 2018-06-19  5:06 UTC (permalink / raw)
  To: Jason
  Cc: Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, kasan-dev, Linux MM, LKML

On Mon, Jun 18, 2018 at 9:08 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Tue, Jun 19, 2018 at 5:59 AM Shakeel Butt <shakeelb@google.com> wrote:
> > Hi Jason, yes please do send me the test suite with the kernel config.
>
> $ git clone https://git.zx2c4.com/WireGuard
> $ cd WireGuard/src
> $ [[ $(gcc -v 2>&1) =~ gcc\ version\ 8\.1\.0 ]] || echo crash needs 8.1
> $ export DEBUG_KERNEL=yes
> $ export KERNEL_VERSION=4.18-rc1
> $ make test-qemu -j$(nproc)
>
> This will build a kernel and a minimal userland and load it in qemu,
> which must be installed.
>
> This code is what causes the crash:
> The self test that's executed:
> https://git.zx2c4.com/WireGuard/tree/src/selftest/ratelimiter.h
> Which exercises this code:
> https://git.zx2c4.com/WireGuard/tree/src/ratelimiter.c
>
> The problem occurs after gc_entries(NULL) frees things (line 124 in
> ratelimiter.h above), and then line 133 reallocates those objects.
> Sometime after that happens, elsewhere in the kernel invokes this
> kasan issue in the kasan cache cleanup.
>

I will try to repro with your test suite sometime later this week.
However from high level code inspection, I see that the code is
creating a 'entry_cache' kmem_cache which is destroyed by
ratelimiter_uninit on last reference drop. Currently refcnt in your
code can underflow, through it does not seem like the selftest will
cause the underflow but still you should fix it.

From high level your code seems fine. Does the issue occur on first
try of selftest? Basically I wanted to ask if kmem_cache_destroy of
your entry_cache was ever executed and have you tried to run this
selftest multiple time while the system was up.

As Dmitry already asked, are you using SLAB or SLUB?

> I realize it's disappointing that the test case here is in WireGuard,
> which isn't (yet!) upstream. That's why in my original message I
> wrote:
>
> > Rather, it looks like this
> > commit introduces a performance optimization, rather than a
> > correctness fix, so it seems that whatever test case is failing is
> > likely an incorrect failure. Does that seem like an accurate
> > possibility to you?
>
> I was hoping to only point you toward my own code after establishing
> the possibility that the bug is not my own. If you still think there's
> a chance this is due to my own correctness issue, and your commit has
> simply unearthed it, let me know and I'll happily keep debugging on my
> own before pinging you further.
>

Sorry, I can not really give a definitive answer.

Shakeel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19  4:55     ` Dmitry Vyukov
@ 2018-06-19 13:04       ` Jason A. Donenfeld
  2018-06-19 13:16         ` Dmitry Vyukov
  2018-06-19 15:08         ` Shakeel Butt
  0 siblings, 2 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-19 13:04 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Shakeel Butt, aryabinin, Alexander Potapenko, cl, penberg,
	rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM,
	LKML

HI Dimitry,

On Tue, Jun 19, 2018 at 6:55 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> Your code frees all entries before freeing the cache, right? If you
> add total_entries check before freeing the cache, it does not fire,
> right?

Yes, certainly.

> Are you using SLAB or SLUB? We stress kernel pretty heavily, but with
> SLAB, and I suspect Shakeel may also be using SLAB. So if you are
> using SLUB, there is significant chance that it's a bug in the SLUB
> part of the change.

Nice intuition; I am indeed using SLUB rather than SLAB...

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19 13:04       ` Jason A. Donenfeld
@ 2018-06-19 13:16         ` Dmitry Vyukov
  2018-06-19 15:08         ` Shakeel Butt
  1 sibling, 0 replies; 16+ messages in thread
From: Dmitry Vyukov @ 2018-06-19 13:16 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Shakeel Butt, Andrey Ryabinin, Alexander Potapenko,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, kasan-dev, Linux-MM, LKML

On Tue, Jun 19, 2018 at 3:04 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> HI Dimitry,
>
> On Tue, Jun 19, 2018 at 6:55 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>> Your code frees all entries before freeing the cache, right? If you
>> add total_entries check before freeing the cache, it does not fire,
>> right?
>
> Yes, certainly.
>
>> Are you using SLAB or SLUB? We stress kernel pretty heavily, but with
>> SLAB, and I suspect Shakeel may also be using SLAB. So if you are
>> using SLUB, there is significant chance that it's a bug in the SLUB
>> part of the change.
>
> Nice intuition; I am indeed using SLUB rather than SLAB...

Now the reasonable question is: does SLUB path of
f9e13c0a5a33d1eaec374d6d4dab53a4f72756a0 have a bug?
syzbot has stressed SLAB version to death, and any such issues would
pop up very loudly, but I am not sure what is the amount of testing
for SLUB.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19  5:06     ` Shakeel Butt
@ 2018-06-19 13:25       ` Jason A. Donenfeld
  2018-06-19 13:30         ` Dmitry Vyukov
  0 siblings, 1 reply; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-19 13:25 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: aryabinin, Alexander Potapenko, Dmitry Vyukov, cl, penberg,
	rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM,
	LKML

On Tue, Jun 19, 2018 at 7:06 AM Shakeel Butt <shakeelb@google.com> wrote:
> Currently refcnt in your
> code can underflow, through it does not seem like the selftest will
> cause the underflow but still you should fix it.

Indeed, and if this happened this would be a bug in the caller, not
the ratelimiter itself, kind of like lock imbalances; however it's
easy to mitigate this by just replacing that atomic64_dec_return with
atomic64_dec_if_positive, so I'll do that. Thanks for the suggestion.

> From high level your code seems fine. Does the issue occur on first
> try of selftest? Basically I wanted to ask if kmem_cache_destroy of
> your entry_cache was ever executed

Yes, it is.

> and have you tried to run this
> selftest multiple time while the system was up.

Interesting, it crashes on the second run of it, when executing
`KMEM_CACHE(ratelimiter_entry, 0)`. (OOPS is below.)

> As Dmitry already asked, are you using SLAB or SLUB?

SLUB.

> Sorry, I can not really give a definitive answer.

Alright, we'll poke it from both ends, then.

Crash on second run:
[    1.648240] general protection fault: 0000 [#1] PREEMPT SMP
DEBUG_PAGEALLOC KASAN
[    1.648240] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.17.2 #51
[    1.648240] RIP: 0010:__lock_acquire+0x9a9/0x3430
[    1.648240] RSP: 0000:ffff8800003b76a0 EFLAGS: 00010006
[    1.648240] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    1.648240] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000001
[    1.648240] RBP: ffff8800003a8000 R08: 0000000000000001 R09: 0000000000000000
[    1.648240] R10: 0000000000000001 R11: ffffffff828c69e3 R12: 0000000000000001
[    1.648240] R13: 0000000000000018 R14: 0000000000000000 R15: ffff8800003a8000
[    1.648240] FS:  0000000000000000(0000) GS:ffff880036400000(0000)
knlGS:000000000
[    1.648240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.648240] CR2: 0000000000000000 CR3: 0000000002220001 CR4: 00000000001606b0
[    1.660020] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.660020] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.660020] Call Trace:
[    1.660020]  ? _raw_spin_unlock_irqrestore+0x4a/0x90
[    1.660020]  ? debug_check_no_locks_held+0xb0/0xb0
[    1.660020]  ? depot_save_stack.cold.2+0x1e/0x1e
[    1.660020]  ? handle_null_ptr_deref+0x178/0x1f0
[    1.660020]  ? depot_save_stack.cold.2+0x1e/0x1e
[    1.660020]  ? _raw_spin_unlock+0x32/0x70
[    1.660020]  lock_acquire+0xf4/0x240
[    1.660020]  ? __slab_free.isra.11+0x1db/0x770
[    1.660020]  ? __slab_free.isra.11+0x1db/0x770
[    1.660020]  _raw_spin_lock_irqsave+0x4b/0x70
[    1.660020]  ? __slab_free.isra.11+0x1db/0x770
[    1.660020]  __slab_free.isra.11+0x1db/0x770
[    1.660020]  ? __free_pages_ok+0x49a/0xd10
[    1.660020]  ? mark_held_locks+0xdf/0x1b0
[    1.660020]  ? _raw_spin_unlock_irqrestore+0x61/0x90
[    1.660020]  ? qlist_free_all+0x58/0x1c0
[    1.660020]  qlist_free_all+0x70/0x1c0
[    1.660020]  ? trace_hardirqs_on_caller+0x3d0/0x630
[    1.660020]  quarantine_reduce+0x221/0x310
[    1.660020]  kasan_kmalloc+0x95/0xc0
[    1.660020]  kmem_cache_alloc+0x167/0x2e0
[    1.660020]  ? do_one_initcall+0x104/0x232
[    1.660020]  create_object+0xa7/0xa70
[    1.660020]  ? kmemleak_disable+0x90/0x90
[    1.660020]  ? quarantine_reduce+0x207/0x310
[    1.660020]  ? fs_reclaim_acquire.part.15+0x30/0x30
[    1.660020]  kmem_cache_alloc_node+0x209/0x340
[    1.660020]  __kmem_cache_create+0xe6/0x5c0
[    1.660020]  kmem_cache_create_usercopy+0x1ef/0x380
[    1.660020]  ? length_mt_init+0x11/0x11
[    1.660020]  kmem_cache_create+0xd/0x10
[    1.660020]  ratelimiter_init+0x4a/0x1e0
[    1.660020]  ratelimiter_selftest+0x13/0x9ec
[    1.660020]  ? length_mt_init+0x11/0x11
[    1.660020]  mod_init+0xb/0xa5
[    1.660020]  do_one_initcall+0x104/0x232
[    1.660020]  ? start_kernel+0x62c/0x62c
[    1.660020]  ? up_write+0x78/0x220
[    1.660020]  ? up_read+0x130/0x130
[    1.660020]  ? kasan_unpoison_shadow+0x30/0x40
[    1.660020]  kernel_init_freeable+0x3e8/0x48c
[    1.660020]  ? rest_init+0x2bf/0x2bf
[    1.660020]  kernel_init+0x7/0x121
[    1.660020]  ? rest_init+0x2bf/0x2bf
[    1.660020]  ret_from_fork+0x24/0x30
[    1.660020] Code: 81 c4 40 01 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f
c3 4d 85 ed 0f
[    1.660020] RIP: __lock_acquire+0x9a9/0x3430 RSP: ffff8800003b76a0
[    1.660020] ---[ end trace daba3b506c5594e5 ]---

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19 13:25       ` Jason A. Donenfeld
@ 2018-06-19 13:30         ` Dmitry Vyukov
  2018-06-19 13:40           ` Jason A. Donenfeld
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2018-06-19 13:30 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Shakeel Butt, Andrey Ryabinin, Alexander Potapenko,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, kasan-dev, Linux-MM, LKML

On Tue, Jun 19, 2018 at 3:25 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> On Tue, Jun 19, 2018 at 7:06 AM Shakeel Butt <shakeelb@google.com> wrote:
>> Currently refcnt in your
>> code can underflow, through it does not seem like the selftest will
>> cause the underflow but still you should fix it.
>
> Indeed, and if this happened this would be a bug in the caller, not
> the ratelimiter itself, kind of like lock imbalances; however it's
> easy to mitigate this by just replacing that atomic64_dec_return with
> atomic64_dec_if_positive, so I'll do that. Thanks for the suggestion.

Since I already looked at the code, if init and uninit can be called
concurrently, I think there is a prominent race condition between init
and uninit: a concurrent uninit can run concurrnetly with the next
init and this will totally mess things up.

>> From high level your code seems fine. Does the issue occur on first
>> try of selftest? Basically I wanted to ask if kmem_cache_destroy of
>> your entry_cache was ever executed
>
> Yes, it is.
>
>> and have you tried to run this
>> selftest multiple time while the system was up.
>
> Interesting, it crashes on the second run of it, when executing
> `KMEM_CACHE(ratelimiter_entry, 0)`. (OOPS is below.)
>
>> As Dmitry already asked, are you using SLAB or SLUB?
>
> SLUB.
>
>> Sorry, I can not really give a definitive answer.
>
> Alright, we'll poke it from both ends, then.
>
> Crash on second run:
> [    1.648240] general protection fault: 0000 [#1] PREEMPT SMP
> DEBUG_PAGEALLOC KASAN
> [    1.648240] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.17.2 #51
> [    1.648240] RIP: 0010:__lock_acquire+0x9a9/0x3430
> [    1.648240] RSP: 0000:ffff8800003b76a0 EFLAGS: 00010006
> [    1.648240] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [    1.648240] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000001
> [    1.648240] RBP: ffff8800003a8000 R08: 0000000000000001 R09: 0000000000000000
> [    1.648240] R10: 0000000000000001 R11: ffffffff828c69e3 R12: 0000000000000001
> [    1.648240] R13: 0000000000000018 R14: 0000000000000000 R15: ffff8800003a8000
> [    1.648240] FS:  0000000000000000(0000) GS:ffff880036400000(0000)
> knlGS:000000000
> [    1.648240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.648240] CR2: 0000000000000000 CR3: 0000000002220001 CR4: 00000000001606b0
> [    1.660020] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    1.660020] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [    1.660020] Call Trace:
> [    1.660020]  ? _raw_spin_unlock_irqrestore+0x4a/0x90
> [    1.660020]  ? debug_check_no_locks_held+0xb0/0xb0
> [    1.660020]  ? depot_save_stack.cold.2+0x1e/0x1e
> [    1.660020]  ? handle_null_ptr_deref+0x178/0x1f0
> [    1.660020]  ? depot_save_stack.cold.2+0x1e/0x1e
> [    1.660020]  ? _raw_spin_unlock+0x32/0x70
> [    1.660020]  lock_acquire+0xf4/0x240
> [    1.660020]  ? __slab_free.isra.11+0x1db/0x770
> [    1.660020]  ? __slab_free.isra.11+0x1db/0x770
> [    1.660020]  _raw_spin_lock_irqsave+0x4b/0x70
> [    1.660020]  ? __slab_free.isra.11+0x1db/0x770
> [    1.660020]  __slab_free.isra.11+0x1db/0x770
> [    1.660020]  ? __free_pages_ok+0x49a/0xd10
> [    1.660020]  ? mark_held_locks+0xdf/0x1b0
> [    1.660020]  ? _raw_spin_unlock_irqrestore+0x61/0x90
> [    1.660020]  ? qlist_free_all+0x58/0x1c0
> [    1.660020]  qlist_free_all+0x70/0x1c0
> [    1.660020]  ? trace_hardirqs_on_caller+0x3d0/0x630
> [    1.660020]  quarantine_reduce+0x221/0x310
> [    1.660020]  kasan_kmalloc+0x95/0xc0
> [    1.660020]  kmem_cache_alloc+0x167/0x2e0
> [    1.660020]  ? do_one_initcall+0x104/0x232
> [    1.660020]  create_object+0xa7/0xa70
> [    1.660020]  ? kmemleak_disable+0x90/0x90
> [    1.660020]  ? quarantine_reduce+0x207/0x310
> [    1.660020]  ? fs_reclaim_acquire.part.15+0x30/0x30
> [    1.660020]  kmem_cache_alloc_node+0x209/0x340
> [    1.660020]  __kmem_cache_create+0xe6/0x5c0
> [    1.660020]  kmem_cache_create_usercopy+0x1ef/0x380
> [    1.660020]  ? length_mt_init+0x11/0x11
> [    1.660020]  kmem_cache_create+0xd/0x10
> [    1.660020]  ratelimiter_init+0x4a/0x1e0
> [    1.660020]  ratelimiter_selftest+0x13/0x9ec
> [    1.660020]  ? length_mt_init+0x11/0x11
> [    1.660020]  mod_init+0xb/0xa5
> [    1.660020]  do_one_initcall+0x104/0x232
> [    1.660020]  ? start_kernel+0x62c/0x62c
> [    1.660020]  ? up_write+0x78/0x220
> [    1.660020]  ? up_read+0x130/0x130
> [    1.660020]  ? kasan_unpoison_shadow+0x30/0x40
> [    1.660020]  kernel_init_freeable+0x3e8/0x48c
> [    1.660020]  ? rest_init+0x2bf/0x2bf
> [    1.660020]  kernel_init+0x7/0x121
> [    1.660020]  ? rest_init+0x2bf/0x2bf
> [    1.660020]  ret_from_fork+0x24/0x30
> [    1.660020] Code: 81 c4 40 01 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f
> c3 4d 85 ed 0f
> [    1.660020] RIP: __lock_acquire+0x9a9/0x3430 RSP: ffff8800003b76a0
> [    1.660020] ---[ end trace daba3b506c5594e5 ]---
>
> --
> You received this message because you are subscribed to the Google Groups "kasan-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com.
> To post to this group, send email to kasan-dev@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/CAHmME9r%3D%2B91YtujYqsBwf52VkCdPMD8VXJED_AsR42H5h9--mA%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19 13:30         ` Dmitry Vyukov
@ 2018-06-19 13:40           ` Jason A. Donenfeld
  0 siblings, 0 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-19 13:40 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Shakeel Butt, aryabinin, Alexander Potapenko, cl, penberg,
	rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM,
	LKML

On Tue, Jun 19, 2018 at 3:31 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> Since I already looked at the code, if init and uninit can be called
> concurrently, I think there is a prominent race condition between init
> and uninit: a concurrent uninit can run concurrnetly with the next
> init and this will totally mess things up.

Good point; fixed. Though this doesn't have any effect on the issue here. :)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19 13:04       ` Jason A. Donenfeld
  2018-06-19 13:16         ` Dmitry Vyukov
@ 2018-06-19 15:08         ` Shakeel Butt
  2018-06-19 15:18           ` Jason A. Donenfeld
  1 sibling, 1 reply; 16+ messages in thread
From: Shakeel Butt @ 2018-06-19 15:08 UTC (permalink / raw)
  To: Jason
  Cc: Dmitry Vyukov, Andrey Ryabinin, Alexander Potapenko,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, kasan-dev, Linux MM, LKML

On Tue, Jun 19, 2018 at 6:05 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> HI Dimitry,
>
> On Tue, Jun 19, 2018 at 6:55 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > Your code frees all entries before freeing the cache, right? If you
> > add total_entries check before freeing the cache, it does not fire,
> > right?
>
> Yes, certainly.
>
> > Are you using SLAB or SLUB? We stress kernel pretty heavily, but with
> > SLAB, and I suspect Shakeel may also be using SLAB. So if you are
> > using SLUB, there is significant chance that it's a bug in the SLUB
> > part of the change.
>
> Nice intuition; I am indeed using SLUB rather than SLAB...
>

Can you try once with SLAB? Just to make sure that it is SLUB specific.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19 15:08         ` Shakeel Butt
@ 2018-06-19 15:18           ` Jason A. Donenfeld
  2018-06-19 19:21             ` Shakeel Butt
  0 siblings, 1 reply; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-19 15:18 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Dmitry Vyukov, aryabinin, Alexander Potapenko, cl, penberg,
	rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM,
	LKML

On Tue, Jun 19, 2018 at 5:08 PM Shakeel Butt <shakeelb@google.com> wrote:
> > > Are you using SLAB or SLUB? We stress kernel pretty heavily, but with
> > > SLAB, and I suspect Shakeel may also be using SLAB. So if you are
> > > using SLUB, there is significant chance that it's a bug in the SLUB
> > > part of the change.
> >
> > Nice intuition; I am indeed using SLUB rather than SLAB...
> >
>
> Can you try once with SLAB? Just to make sure that it is SLUB specific.

Sorry, I meant to mention that earlier. I tried with SLAB; the crash
does not occur. This appears to be SLUB-specific.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19  2:51 Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()" Jason A. Donenfeld
  2018-06-19  3:59 ` Shakeel Butt
@ 2018-06-19 17:34 ` Andrey Ryabinin
  2018-06-19 20:17   ` Jason A. Donenfeld
  1 sibling, 1 reply; 16+ messages in thread
From: Andrey Ryabinin @ 2018-06-19 17:34 UTC (permalink / raw)
  To: Jason A. Donenfeld, Alexander Potapenko, Dmitry Vyukov, cl,
	penberg, rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev,
	Linux-MM, LKML, Shakeel Butt



On 06/19/2018 05:51 AM, Jason A. Donenfeld wrote:
> Hello Shakeel,
> 
> It may be the case that f9e13c0a5a33d1eaec374d6d4dab53a4f72756a0 has
> introduced a regression. I've bisected a failing test to this commit,
> and after staring at the my code for a long time, I'm unable to find a
> bug that this commit might have unearthed. Rather, it looks like this
> commit introduces a performance optimization, rather than a
> correctness fix, so it seems that whatever test case is failing is
> likely an incorrect failure. Does that seem like an accurate
> possibility to you?
> 
> Below is a stack trace when things go south. Let me know if you'd like
> to run my test suite, and I can send additional information.
> 
> Regards,
> Jason
> 
> 

What's the status of CONFIG_SLUB_DEBUG in your config?

AFAICS __kmem_cache_empty() is broken for CONFIG_SLUB_DEBUG=n. We use slabs_node() there
which is always 0 for CONFIG_SLUB_DEBUG=n.

The problem seems not limited to __kmem_cache_empty(), __kmem_cache_shutdown() and __kmem_cache_shrink()
are also rely on correctness of the slabs_node(). Presumably this might cause some problems while
destroying memcg kmem caches.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19 15:18           ` Jason A. Donenfeld
@ 2018-06-19 19:21             ` Shakeel Butt
  2018-06-19 20:20               ` Jason A. Donenfeld
  0 siblings, 1 reply; 16+ messages in thread
From: Shakeel Butt @ 2018-06-19 19:21 UTC (permalink / raw)
  To: Jason A . Donenfeld
  Cc: Dmitry Vyukov, aryabinin, Alexander Potapenko, cl, penberg,
	rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM,
	LKML, Shakeel Butt

On Tue, Jun 19, 2018 at 8:19 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Tue, Jun 19, 2018 at 5:08 PM Shakeel Butt <shakeelb@google.com> wrote:
> > > > Are you using SLAB or SLUB? We stress kernel pretty heavily, but with
> > > > SLAB, and I suspect Shakeel may also be using SLAB. So if you are
> > > > using SLUB, there is significant chance that it's a bug in the SLUB
> > > > part of the change.
> > >
> > > Nice intuition; I am indeed using SLUB rather than SLAB...
> > >
> >
> > Can you try once with SLAB? Just to make sure that it is SLUB specific.
>
> Sorry, I meant to mention that earlier. I tried with SLAB; the crash
> does not occur. This appears to be SLUB-specific.

Jason, can you try the following patch?

---
 mm/slub.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index a3b8467c14af..746cfe4515c2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3673,9 +3673,17 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
 
 bool __kmem_cache_empty(struct kmem_cache *s)
 {
+	int cpu;
 	int node;
 	struct kmem_cache_node *n;
 
+	for_each_online_cpu(cpu) {
+		struct kmem_cache_cpu *c = per_cpu_ptr(s->cpu_slab, cpu);
+
+		if (c->page || slub_percpu_partial(c))
+			return false;
+	}
+
 	for_each_kmem_cache_node(s, node, n)
 		if (n->nr_partial || slabs_node(s, node))
 			return false;
-- 
2.18.0.rc1.244.gcf134e6275-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19 17:34 ` Andrey Ryabinin
@ 2018-06-19 20:17   ` Jason A. Donenfeld
  0 siblings, 0 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-19 20:17 UTC (permalink / raw)
  To: aryabinin
  Cc: Alexander Potapenko, Dmitry Vyukov, cl, penberg, rientjes,
	iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM, LKML,
	Shakeel Butt

Hi Andrey,

On Tue, Jun 19, 2018 at 7:33 PM Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
> What's the status of CONFIG_SLUB_DEBUG in your config?
>
> AFAICS __kmem_cache_empty() is broken for CONFIG_SLUB_DEBUG=n. We use slabs_node() there
> which is always 0 for CONFIG_SLUB_DEBUG=n.
>
> The problem seems not limited to __kmem_cache_empty(), __kmem_cache_shutdown() and __kmem_cache_shrink()
> are also rely on correctness of the slabs_node(). Presumably this might cause some problems while
> destroying memcg kmem caches.

CONFIG_SLUB_DEBUG is not set in the crash I sent.

Enabling it "fixes" the problem! This either means that KASAN+SLUB
should enable SLUB_DEBUG, or the extra overhead from SLUB_DEBUG is
just making the bug more rare but not actually eliminating it.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()"
  2018-06-19 19:21             ` Shakeel Butt
@ 2018-06-19 20:20               ` Jason A. Donenfeld
  0 siblings, 0 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-19 20:20 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Dmitry Vyukov, aryabinin, Alexander Potapenko, cl, penberg,
	rientjes, iamjoonsoo.kim, Andrew Morton, kasan-dev, Linux-MM,
	LKML

Hi Shakeel,

On Tue, Jun 19, 2018 at 9:21 PM Shakeel Butt <shakeelb@google.com> wrote:
> Jason, can you try the following patch?

Your patch also fixed the problem, which was also fixed by enabling
CONFIG_SLUB_DEBUG, per the other email. I haven't checked to see if
your patch is simply a subset of what SLUB_DEBUG is doing or what. But
hopefully this points in the right direction now.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-06-19 20:20 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-19  2:51 Possible regression in "slab, slub: skip unnecessary kasan_cache_shutdown()" Jason A. Donenfeld
2018-06-19  3:59 ` Shakeel Butt
2018-06-19  4:08   ` Jason A. Donenfeld
2018-06-19  4:55     ` Dmitry Vyukov
2018-06-19 13:04       ` Jason A. Donenfeld
2018-06-19 13:16         ` Dmitry Vyukov
2018-06-19 15:08         ` Shakeel Butt
2018-06-19 15:18           ` Jason A. Donenfeld
2018-06-19 19:21             ` Shakeel Butt
2018-06-19 20:20               ` Jason A. Donenfeld
2018-06-19  5:06     ` Shakeel Butt
2018-06-19 13:25       ` Jason A. Donenfeld
2018-06-19 13:30         ` Dmitry Vyukov
2018-06-19 13:40           ` Jason A. Donenfeld
2018-06-19 17:34 ` Andrey Ryabinin
2018-06-19 20:17   ` Jason A. Donenfeld

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).