kvm.vger.kernel.org archive mirror
* WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
       [not found] <20190611231813.3148843-9-guro@fb.com>
@ 2019-11-21 11:17 ` Christian Borntraeger
  2019-11-21 13:08   ` Christian Borntraeger
  2019-11-21 16:58   ` Roman Gushchin
  0 siblings, 2 replies; 12+ messages in thread
From: Christian Borntraeger @ 2019-11-21 11:17 UTC (permalink / raw)
  To: guro
  Cc: akpm, hannes, kernel-team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm,
	Christian Borntraeger

Folks,

I do get errors like the following when running a new testcase in our KVM CI.
The test basically unloads kvm, reloads it with hpage=1 (enabling huge page
support for guests on s390), starts a guest with libvirt and hugepages, shuts the
guest down, and unloads the kvm module.

WARNING: CPU: 8 PID: 208 at lib/percpu-refcount.c:108 percpu_ref_exit+0x50/0x58
Modules linked in: kvm(-) xt_CHECKSUM xt_MASQUERADE bonding xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_na>
CPU: 8 PID: 208 Comm: kworker/8:1 Not tainted 5.2.0+ #66
Hardware name: IBM 2964 NC9 712 (LPAR)
Workqueue: events sysfs_slab_remove_workfn
Krnl PSW : 0704e00180000000 0000001529746850 (percpu_ref_exit+0x50/0x58)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 00000000ffff8808 0000001529746740 000003f4e30e8e18 0036008100000000
           0000001f00000000 0035008100000000 0000001fb3573ab8 0000000000000000
           0000001fbdb6de00 0000000000000000 0000001529f01328 0000001fb3573b00
           0000001fbb27e000 0000001fbdb69300 000003e009263d00 000003e009263cd0
Krnl Code: 0000001529746842: f0a0000407fe        srp        4(11,%r0),2046,0
           0000001529746848: 47000700            bc         0,1792
          #000000152974684c: a7f40001            brc        15,152974684e
          >0000001529746850: a7f4fff2            brc        15,1529746834
           0000001529746854: 0707                bcr        0,%r7
           0000001529746856: 0707                bcr        0,%r7
           0000001529746858: eb8ff0580024        stmg       %r8,%r15,88(%r15)
           000000152974685e: a738ffff            lhi        %r3,-1
Call Trace:
([<000003e009263d00>] 0x3e009263d00)
 [<00000015293252ea>] slab_kmem_cache_release+0x3a/0x70 
 [<0000001529b04882>] kobject_put+0xaa/0xe8 
 [<000000152918cf28>] process_one_work+0x1e8/0x428 
 [<000000152918d1b0>] worker_thread+0x48/0x460 
 [<00000015291942c6>] kthread+0x126/0x160 
 [<0000001529b22344>] ret_from_fork+0x28/0x30 
 [<0000001529b2234c>] kernel_thread_starter+0x0/0x10 
Last Breaking-Event-Address:
 [<000000152974684c>] percpu_ref_exit+0x4c/0x58
---[ end trace b035e7da5788eb09 ]---

I have bisected this to
# first bad commit: [f0a3a24b532d9a7e56a33c5112b2a212ed6ec580] mm: memcg/slab: rework non-root kmem_cache lifecycle management

Unmounting /sys/fs/cgroup/memory/ before the test makes the problem go away, so
it really seems to be related to the new percpu refs from this patch.
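
For reference, the pattern that this warning in percpu_ref_exit() usually flags is
tearing a ref down while the RCU-deferred switch to atomic mode started by
percpu_ref_kill() is still in flight. A minimal, hypothetical sketch (the demo_*
names are made up; this is not the kvm or slab code, and details may differ per
kernel version):

#include <linux/gfp.h>
#include <linux/percpu-refcount.h>

static void demo_release(struct percpu_ref *ref)
{
	/* last reference dropped; the owning object could be freed here */
}

static int demo_premature_exit(void)
{
	struct percpu_ref ref;
	int err;

	err = percpu_ref_init(&ref, demo_release, 0, GFP_KERNEL);
	if (err)
		return err;

	/* starts an RCU-deferred switch from percpu to atomic mode */
	percpu_ref_kill(&ref);

	/*
	 * Exiting before that switch (and the final release) has run,
	 * i.e. without waiting for the RCU grace period, is the kind of
	 * premature teardown percpu_ref_exit() complains about.
	 */
	percpu_ref_exit(&ref);
	return 0;
}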
 
Any ideas?

Christian



* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-21 11:17 ` WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management) Christian Borntraeger
@ 2019-11-21 13:08   ` Christian Borntraeger
  2019-11-21 16:58   ` Roman Gushchin
  1 sibling, 0 replies; 12+ messages in thread
From: Christian Borntraeger @ 2019-11-21 13:08 UTC (permalink / raw)
  To: guro
  Cc: akpm, hannes, kernel-team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm



On 21.11.19 12:17, Christian Borntraeger wrote:
> Folks,
> 
> I do get errors like the following when running a new testcase in our KVM CI.
> The test basically unloads kvm, reloads with with hpage=1 (enable huge page
> support for guests on s390) start a guest with libvirt and hugepages, shut the
> guest down and unload the kvm module. 

It also crashes without large pages. The trigger is really that the time between
"guest is going away" and rmmod kvm is very short.

> 
> WARNING: CPU: 8 PID: 208 at lib/percpu-refcount.c:108 percpu_ref_exit+0x50/0x58
> Modules linked in: kvm(-) xt_CHECKSUM xt_MASQUERADE bonding xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_na>
> CPU: 8 PID: 208 Comm: kworker/8:1 Not tainted 5.2.0+ #66
> Hardware name: IBM 2964 NC9 712 (LPAR)
> Workqueue: events sysfs_slab_remove_workfn
> Krnl PSW : 0704e00180000000 0000001529746850 (percpu_ref_exit+0x50/0x58)
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
> Krnl GPRS: 00000000ffff8808 0000001529746740 000003f4e30e8e18 0036008100000000
>            0000001f00000000 0035008100000000 0000001fb3573ab8 0000000000000000
>            0000001fbdb6de00 0000000000000000 0000001529f01328 0000001fb3573b00
>            0000001fbb27e000 0000001fbdb69300 000003e009263d00 000003e009263cd0
> Krnl Code: 0000001529746842: f0a0000407fe        srp        4(11,%r0),2046,0
>            0000001529746848: 47000700            bc         0,1792
>           #000000152974684c: a7f40001            brc        15,152974684e
>           >0000001529746850: a7f4fff2            brc        15,1529746834
>            0000001529746854: 0707                bcr        0,%r7
>            0000001529746856: 0707                bcr        0,%r7
>            0000001529746858: eb8ff0580024        stmg       %r8,%r15,88(%r15)
>            000000152974685e: a738ffff            lhi        %r3,-1
> Call Trace:
> ([<000003e009263d00>] 0x3e009263d00)
>  [<00000015293252ea>] slab_kmem_cache_release+0x3a/0x70 
>  [<0000001529b04882>] kobject_put+0xaa/0xe8 
>  [<000000152918cf28>] process_one_work+0x1e8/0x428 
>  [<000000152918d1b0>] worker_thread+0x48/0x460 
>  [<00000015291942c6>] kthread+0x126/0x160 
>  [<0000001529b22344>] ret_from_fork+0x28/0x30 
>  [<0000001529b2234c>] kernel_thread_starter+0x0/0x10 
> Last Breaking-Event-Address:
>  [<000000152974684c>] percpu_ref_exit+0x4c/0x58
> ---[ end trace b035e7da5788eb09 ]---
> 
> I have bisected this to
> # first bad commit: [f0a3a24b532d9a7e56a33c5112b2a212ed6ec580] mm: memcg/slab: rework non-root kmem_cache lifecycle management
> 
> unmounting /sys/fs/cgroup/memory/ before the test makes the problem go away so
> it really seems to be related to the new percpu-refs from this patch.
>  
> Any ideas?
> 
> Christian
> 



* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-21 11:17 ` WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management) Christian Borntraeger
  2019-11-21 13:08   ` Christian Borntraeger
@ 2019-11-21 16:58   ` Roman Gushchin
  2019-11-21 16:59     ` Christian Borntraeger
  1 sibling, 1 reply; 12+ messages in thread
From: Roman Gushchin @ 2019-11-21 16:58 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: akpm, hannes, Kernel Team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm

On Thu, Nov 21, 2019 at 12:17:39PM +0100, Christian Borntraeger wrote:
> Folks,
> 
> I do get errors like the following when running a new testcase in our KVM CI.
> The test basically unloads kvm, reloads with with hpage=1 (enable huge page
> support for guests on s390) start a guest with libvirt and hugepages, shut the
> guest down and unload the kvm module. 
> 
> WARNING: CPU: 8 PID: 208 at lib/percpu-refcount.c:108 percpu_ref_exit+0x50/0x58
> Modules linked in: kvm(-) xt_CHECKSUM xt_MASQUERADE bonding xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_na>
> CPU: 8 PID: 208 Comm: kworker/8:1 Not tainted 5.2.0+ #66
> Hardware name: IBM 2964 NC9 712 (LPAR)
> Workqueue: events sysfs_slab_remove_workfn
> Krnl PSW : 0704e00180000000 0000001529746850 (percpu_ref_exit+0x50/0x58)
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
> Krnl GPRS: 00000000ffff8808 0000001529746740 000003f4e30e8e18 0036008100000000
>            0000001f00000000 0035008100000000 0000001fb3573ab8 0000000000000000
>            0000001fbdb6de00 0000000000000000 0000001529f01328 0000001fb3573b00
>            0000001fbb27e000 0000001fbdb69300 000003e009263d00 000003e009263cd0
> Krnl Code: 0000001529746842: f0a0000407fe        srp        4(11,%r0),2046,0
>            0000001529746848: 47000700            bc         0,1792
>           #000000152974684c: a7f40001            brc        15,152974684e
>           >0000001529746850: a7f4fff2            brc        15,1529746834
>            0000001529746854: 0707                bcr        0,%r7
>            0000001529746856: 0707                bcr        0,%r7
>            0000001529746858: eb8ff0580024        stmg       %r8,%r15,88(%r15)
>            000000152974685e: a738ffff            lhi        %r3,-1
> Call Trace:
> ([<000003e009263d00>] 0x3e009263d00)
>  [<00000015293252ea>] slab_kmem_cache_release+0x3a/0x70 
>  [<0000001529b04882>] kobject_put+0xaa/0xe8 
>  [<000000152918cf28>] process_one_work+0x1e8/0x428 
>  [<000000152918d1b0>] worker_thread+0x48/0x460 
>  [<00000015291942c6>] kthread+0x126/0x160 
>  [<0000001529b22344>] ret_from_fork+0x28/0x30 
>  [<0000001529b2234c>] kernel_thread_starter+0x0/0x10 
> Last Breaking-Event-Address:
>  [<000000152974684c>] percpu_ref_exit+0x4c/0x58
> ---[ end trace b035e7da5788eb09 ]---
> 
> I have bisected this to
> # first bad commit: [f0a3a24b532d9a7e56a33c5112b2a212ed6ec580] mm: memcg/slab: rework non-root kmem_cache lifecycle management
> 
> unmounting /sys/fs/cgroup/memory/ before the test makes the problem go away so
> it really seems to be related to the new percpu-refs from this patch.
>  
> Any ideas?

Hello, Christian!

It seems to be a race between the release of the root kmem_cache (caused by rmmod)
and that of a memcg kmem_cache. Does delaying rmmod for, say, a minute "resolve" the
issue?

I'll take a look at this code path.

Thanks!


* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-21 16:58   ` Roman Gushchin
@ 2019-11-21 16:59     ` Christian Borntraeger
  2019-11-21 18:45       ` Roman Gushchin
  0 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2019-11-21 16:59 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: akpm, hannes, Kernel Team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm



On 21.11.19 17:58, Roman Gushchin wrote:
> On Thu, Nov 21, 2019 at 12:17:39PM +0100, Christian Borntraeger wrote:
>> Folks,
>>
>> I do get errors like the following when running a new testcase in our KVM CI.
>> The test basically unloads kvm, reloads with with hpage=1 (enable huge page
>> support for guests on s390) start a guest with libvirt and hugepages, shut the
>> guest down and unload the kvm module. 
>>
>> WARNING: CPU: 8 PID: 208 at lib/percpu-refcount.c:108 percpu_ref_exit+0x50/0x58
>> Modules linked in: kvm(-) xt_CHECKSUM xt_MASQUERADE bonding xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_na>
>> CPU: 8 PID: 208 Comm: kworker/8:1 Not tainted 5.2.0+ #66
>> Hardware name: IBM 2964 NC9 712 (LPAR)
>> Workqueue: events sysfs_slab_remove_workfn
>> Krnl PSW : 0704e00180000000 0000001529746850 (percpu_ref_exit+0x50/0x58)
>>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
>> Krnl GPRS: 00000000ffff8808 0000001529746740 000003f4e30e8e18 0036008100000000
>>            0000001f00000000 0035008100000000 0000001fb3573ab8 0000000000000000
>>            0000001fbdb6de00 0000000000000000 0000001529f01328 0000001fb3573b00
>>            0000001fbb27e000 0000001fbdb69300 000003e009263d00 000003e009263cd0
>> Krnl Code: 0000001529746842: f0a0000407fe        srp        4(11,%r0),2046,0
>>            0000001529746848: 47000700            bc         0,1792
>>           #000000152974684c: a7f40001            brc        15,152974684e
>>           >0000001529746850: a7f4fff2            brc        15,1529746834
>>            0000001529746854: 0707                bcr        0,%r7
>>            0000001529746856: 0707                bcr        0,%r7
>>            0000001529746858: eb8ff0580024        stmg       %r8,%r15,88(%r15)
>>            000000152974685e: a738ffff            lhi        %r3,-1
>> Call Trace:
>> ([<000003e009263d00>] 0x3e009263d00)
>>  [<00000015293252ea>] slab_kmem_cache_release+0x3a/0x70 
>>  [<0000001529b04882>] kobject_put+0xaa/0xe8 
>>  [<000000152918cf28>] process_one_work+0x1e8/0x428 
>>  [<000000152918d1b0>] worker_thread+0x48/0x460 
>>  [<00000015291942c6>] kthread+0x126/0x160 
>>  [<0000001529b22344>] ret_from_fork+0x28/0x30 
>>  [<0000001529b2234c>] kernel_thread_starter+0x0/0x10 
>> Last Breaking-Event-Address:
>>  [<000000152974684c>] percpu_ref_exit+0x4c/0x58
>> ---[ end trace b035e7da5788eb09 ]---
>>
>> I have bisected this to
>> # first bad commit: [f0a3a24b532d9a7e56a33c5112b2a212ed6ec580] mm: memcg/slab: rework non-root kmem_cache lifecycle management
>>
>> unmounting /sys/fs/cgroup/memory/ before the test makes the problem go away so
>> it really seems to be related to the new percpu-refs from this patch.
>>  
>> Any ideas?
> 
> Hello, Christian!
> 
> It seems to be a race between releasing of the root kmem_cache (caused by rmmod)
> and a memcg kmem_cache. Does delaying rmmod for say, a minute, "resolve" the
> issue?

Yes, rmmod has to be called directly after the guest shutdown to see the issue.
See my 2nd mail.



* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-21 16:59     ` Christian Borntraeger
@ 2019-11-21 18:45       ` Roman Gushchin
  2019-11-21 20:43         ` Rik van Riel
  2019-11-22 16:28         ` Christian Borntraeger
  0 siblings, 2 replies; 12+ messages in thread
From: Roman Gushchin @ 2019-11-21 18:45 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: akpm, hannes, Kernel Team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm

On Thu, Nov 21, 2019 at 05:59:54PM +0100, Christian Borntraeger wrote:
> 
> 
> On 21.11.19 17:58, Roman Gushchin wrote:
> > On Thu, Nov 21, 2019 at 12:17:39PM +0100, Christian Borntraeger wrote:
> >> Folks,
> >>
> >> I do get errors like the following when running a new testcase in our KVM CI.
> >> The test basically unloads kvm, reloads with with hpage=1 (enable huge page
> >> support for guests on s390) start a guest with libvirt and hugepages, shut the
> >> guest down and unload the kvm module. 
> >>
> >> WARNING: CPU: 8 PID: 208 at lib/percpu-refcount.c:108 percpu_ref_exit+0x50/0x58
> >> Modules linked in: kvm(-) xt_CHECKSUM xt_MASQUERADE bonding xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_na>
> >> CPU: 8 PID: 208 Comm: kworker/8:1 Not tainted 5.2.0+ #66
> >> Hardware name: IBM 2964 NC9 712 (LPAR)
> >> Workqueue: events sysfs_slab_remove_workfn
> >> Krnl PSW : 0704e00180000000 0000001529746850 (percpu_ref_exit+0x50/0x58)
> >>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
> >> Krnl GPRS: 00000000ffff8808 0000001529746740 000003f4e30e8e18 0036008100000000
> >>            0000001f00000000 0035008100000000 0000001fb3573ab8 0000000000000000
> >>            0000001fbdb6de00 0000000000000000 0000001529f01328 0000001fb3573b00
> >>            0000001fbb27e000 0000001fbdb69300 000003e009263d00 000003e009263cd0
> >> Krnl Code: 0000001529746842: f0a0000407fe        srp        4(11,%r0),2046,0
> >>            0000001529746848: 47000700            bc         0,1792
> >>           #000000152974684c: a7f40001            brc        15,152974684e
> >>           >0000001529746850: a7f4fff2            brc        15,1529746834
> >>            0000001529746854: 0707                bcr        0,%r7
> >>            0000001529746856: 0707                bcr        0,%r7
> >>            0000001529746858: eb8ff0580024        stmg       %r8,%r15,88(%r15)
> >>            000000152974685e: a738ffff            lhi        %r3,-1
> >> Call Trace:
> >> ([<000003e009263d00>] 0x3e009263d00)
> >>  [<00000015293252ea>] slab_kmem_cache_release+0x3a/0x70 
> >>  [<0000001529b04882>] kobject_put+0xaa/0xe8 
> >>  [<000000152918cf28>] process_one_work+0x1e8/0x428 
> >>  [<000000152918d1b0>] worker_thread+0x48/0x460 
> >>  [<00000015291942c6>] kthread+0x126/0x160 
> >>  [<0000001529b22344>] ret_from_fork+0x28/0x30 
> >>  [<0000001529b2234c>] kernel_thread_starter+0x0/0x10 
> >> Last Breaking-Event-Address:
> >>  [<000000152974684c>] percpu_ref_exit+0x4c/0x58
> >> ---[ end trace b035e7da5788eb09 ]---
> >>
> >> I have bisected this to
> >> # first bad commit: [f0a3a24b532d9a7e56a33c5112b2a212ed6ec580] mm: memcg/slab: rework non-root kmem_cache lifecycle management
> >>
> >> unmounting /sys/fs/cgroup/memory/ before the test makes the problem go away so
> >> it really seems to be related to the new percpu-refs from this patch.
> >>  
> >> Any ideas?
> > 
> > Hello, Christian!
> > 
> > It seems to be a race between releasing of the root kmem_cache (caused by rmmod)
> > and a memcg kmem_cache. Does delaying rmmod for say, a minute, "resolve" the
> > issue?
> 
> Yes, rmmod has to be called directly after the guest shutdown to see the issue.
> See my 2nd mail.

I see. Do you know which kmem_cache it is? If not, can you please
figure it out?

I tried to reproduce the issue but haven't been successful so far, so I wonder
what makes your case special.

Thanks!


* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-21 18:45       ` Roman Gushchin
@ 2019-11-21 20:43         ` Rik van Riel
  2019-11-21 20:55           ` Roman Gushchin
  2019-11-22 16:28         ` Christian Borntraeger
  1 sibling, 1 reply; 12+ messages in thread
From: Rik van Riel @ 2019-11-21 20:43 UTC (permalink / raw)
  To: Roman Gushchin, Christian Borntraeger
  Cc: akpm, hannes, Kernel Team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm

On Thu, 2019-11-21 at 13:45 -0500, Roman Gushchin wrote:
> On Thu, Nov 21, 2019 at 05:59:54PM +0100, Christian Borntraeger
> wrote:
> > 
> > 
> > Yes, rmmod has to be called directly after the guest shutdown to
> > see the issue.
> > See my 2nd mail.
> 
> I see. Do you know, which kmem_cache it is? If not, can you, please,
> figure it out?
> 
> I tried to reproduce the issue, but wasn't successful so far. So I
> wonder
> what can make your case special.

I do not know either, but have a guess.

My guess would be that either the slab object or the
slab page is RCU freed, and the kmem_cache destruction
is called before that RCU callback has completed.
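
As a hedged illustration of that ordering (hypothetical demo_* names, not a
claim about the actual code path): an object queued for RCU freeing keeps its
cache "busy" until the callback has run, so the cache must not be destroyed
before then, and rcu_barrier() is the usual way to wait.

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct demo_obj {
	struct rcu_head rcu;
};

static struct kmem_cache *demo_cache;

static void demo_free_rcu(struct rcu_head *head)
{
	/* runs after a grace period; returns the object to demo_cache */
	kmem_cache_free(demo_cache, container_of(head, struct demo_obj, rcu));
}

static void demo_teardown(struct demo_obj *obj)
{
	call_rcu(&obj->rcu, demo_free_rcu);

	/*
	 * Destroying demo_cache right here, before the callback above has
	 * run, would be the suspected bug; waiting first closes the window.
	 */
	rcu_barrier();
	kmem_cache_destroy(demo_cache);
}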



* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-21 20:43         ` Rik van Riel
@ 2019-11-21 20:55           ` Roman Gushchin
  2019-11-21 22:09             ` Roman Gushchin
  0 siblings, 1 reply; 12+ messages in thread
From: Roman Gushchin @ 2019-11-21 20:55 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Christian Borntraeger, akpm, hannes, Kernel Team, linux-kernel,
	linux-mm, longman, shakeelb, vdavydov.dev, Heiko Carstens,
	Janosch Frank, kvm

On Thu, Nov 21, 2019 at 12:43:01PM -0800, Rik van Riel wrote:
> On Thu, 2019-11-21 at 13:45 -0500, Roman Gushchin wrote:
> > On Thu, Nov 21, 2019 at 05:59:54PM +0100, Christian Borntraeger
> > wrote:
> > > 
> > > 
> > > Yes, rmmod has to be called directly after the guest shutdown to
> > > see the issue.
> > > See my 2nd mail.
> > 
> > I see. Do you know, which kmem_cache it is? If not, can you, please,
> > figure it out?
> > 
> > I tried to reproduce the issue, but wasn't successful so far. So I
> > wonder
> > what can make your case special.
> 
> I do not know either, but have a guess.
> 
> My guess would be that either the slab object or the
> slab page is RCU freed, and the kmem_cache destruction
> is called before that RCU callback has completed.
> 

I have a reproducer, but it requires SLAB_TYPESAFE_BY_RCU to panic.
The only question is whether it's the same issue or a different one.
As soon as I have a fix, I'll post it here for testing.
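
For context, SLAB_TYPESAFE_BY_RCU marks a cache whose slab pages are only
returned to the system after an RCU grace period (objects may be reused
immediately, but the underlying memory stays type-stable for RCU readers),
so destroying such a cache has to wait for pending RCU callbacks. A
hypothetical example of creating one (demo_* names are made up):

#include <linux/slab.h>

struct demo_obj {
	int val;
};

static struct kmem_cache *demo_cache;

static int demo_cache_setup(void)
{
	/* slab pages of this cache are freed only after an RCU grace period */
	demo_cache = kmem_cache_create("demo_typesafe", sizeof(struct demo_obj),
				       0, SLAB_TYPESAFE_BY_RCU, NULL);
	return demo_cache ? 0 : -ENOMEM;
}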

Thanks!


* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-21 20:55           ` Roman Gushchin
@ 2019-11-21 22:09             ` Roman Gushchin
  0 siblings, 0 replies; 12+ messages in thread
From: Roman Gushchin @ 2019-11-21 22:09 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Christian Borntraeger, akpm, hannes, Kernel Team, linux-kernel,
	linux-mm, longman, shakeelb, vdavydov.dev, Heiko Carstens,
	Janosch Frank, kvm

On Thu, Nov 21, 2019 at 12:55:28PM -0800, Roman Gushchin wrote:
> On Thu, Nov 21, 2019 at 12:43:01PM -0800, Rik van Riel wrote:
> > On Thu, 2019-11-21 at 13:45 -0500, Roman Gushchin wrote:
> > > On Thu, Nov 21, 2019 at 05:59:54PM +0100, Christian Borntraeger
> > > wrote:
> > > > 
> > > > 
> > > > Yes, rmmod has to be called directly after the guest shutdown to
> > > > see the issue.
> > > > See my 2nd mail.
> > > 
> > > I see. Do you know, which kmem_cache it is? If not, can you, please,
> > > figure it out?
> > > 
> > > I tried to reproduce the issue, but wasn't successful so far. So I
> > > wonder
> > > what can make your case special.
> > 
> > I do not know either, but have a guess.
> > 
> > My guess would be that either the slab object or the
> > slab page is RCU freed, and the kmem_cache destruction
> > is called before that RCU callback has completed.
> > 
> 
> I've a reproducer, but it requires SLAB_TYPESAFE_BY_RCU to panic.
> The only question is if it's the same or different issues.
> As soon as I'll have a fix, I'll post it here to test.

Ah, no, the issue I've reproduced is already fixed by commit b749ecfaf6c5
("mm: memcg/slab: fix panic in __free_slab() caused by premature memcg pointer release").

Christian, can you please confirm that you have this one in your tree?

Also, can you please provide your config?
You mentioned some panics but didn't send any dmesg output.
Can you please provide it?

Thanks!


* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-21 18:45       ` Roman Gushchin
  2019-11-21 20:43         ` Rik van Riel
@ 2019-11-22 16:28         ` Christian Borntraeger
  2019-11-24  0:39           ` Roman Gushchin
  1 sibling, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2019-11-22 16:28 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: akpm, hannes, Kernel Team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm

On 21.11.19 19:45, Roman Gushchin wrote:
> I see. Do you know, which kmem_cache it is? If not, can you, please,
> figure it out?

The release function for that ref is kmemcg_cache_shutdown. 



* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-22 16:28         ` Christian Borntraeger
@ 2019-11-24  0:39           ` Roman Gushchin
  2019-11-25  8:00             ` Christian Borntraeger
  0 siblings, 1 reply; 12+ messages in thread
From: Roman Gushchin @ 2019-11-24  0:39 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: akpm, hannes, Kernel Team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm

On Fri, Nov 22, 2019 at 05:28:46PM +0100, Christian Borntraeger wrote:
> On 21.11.19 19:45, Roman Gushchin wrote:
> > I see. Do you know, which kmem_cache it is? If not, can you, please,
> > figure it out?
> 
> The release function for that ref is kmemcg_cache_shutdown. 
> 

Hi Christian!

Can you please test whether the following patch resolves the problem?

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8afa188f6e20..628e5f0ee19e 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -888,6 +888,8 @@ static int shutdown_memcg_caches(struct kmem_cache *s)
 
 static void flush_memcg_workqueue(struct kmem_cache *s)
 {
+	bool wait_for_children;
+
 	spin_lock_irq(&memcg_kmem_wq_lock);
 	s->memcg_params.dying = true;
 	spin_unlock_irq(&memcg_kmem_wq_lock);
@@ -904,6 +906,13 @@ static void flush_memcg_workqueue(struct kmem_cache *s)
 	 * previous workitems on workqueue are processed.
 	 */
 	flush_workqueue(memcg_kmem_cache_wq);
+
+	mutex_lock(&slab_mutex);
+	wait_for_children = !list_empty(&s->memcg_params.children);
+	mutex_unlock(&slab_mutex);
+
+	if (wait_for_children)
+		rcu_barrier();
 }
 #else
 static inline int shutdown_memcg_caches(struct kmem_cache *s)

--

Thanks!


* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-24  0:39           ` Roman Gushchin
@ 2019-11-25  8:00             ` Christian Borntraeger
  2019-11-25 18:07               ` Roman Gushchin
  0 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2019-11-25  8:00 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: akpm, hannes, Kernel Team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm



On 24.11.19 01:39, Roman Gushchin wrote:
> On Fri, Nov 22, 2019 at 05:28:46PM +0100, Christian Borntraeger wrote:
>> On 21.11.19 19:45, Roman Gushchin wrote:
>>> I see. Do you know, which kmem_cache it is? If not, can you, please,
>>> figure it out?
>>
>> The release function for that ref is kmemcg_cache_shutdown. 
>>
> 
> Hi Christian!
> 
> Can you, please, test if the following patch resolves the problem?

Yes, it does.


> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 8afa188f6e20..628e5f0ee19e 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -888,6 +888,8 @@ static int shutdown_memcg_caches(struct kmem_cache *s)
>  
>  static void flush_memcg_workqueue(struct kmem_cache *s)
>  {
> +	bool wait_for_children;
> +
>  	spin_lock_irq(&memcg_kmem_wq_lock);
>  	s->memcg_params.dying = true;
>  	spin_unlock_irq(&memcg_kmem_wq_lock);
> @@ -904,6 +906,13 @@ static void flush_memcg_workqueue(struct kmem_cache *s)
>  	 * previous workitems on workqueue are processed.
>  	 */
>  	flush_workqueue(memcg_kmem_cache_wq);
> +
> +	mutex_lock(&slab_mutex);
> +	wait_for_children = !list_empty(&s->memcg_params.children);
> +	mutex_unlock(&slab_mutex);

I'm not sure we really need the mutex just for reading.
> +
> +	if (wait_for_children)
> +		rcu_barrier();
>  }
>  #else
>  static inline int shutdown_memcg_caches(struct kmem_cache *s)
> 
> --
> 
> Thanks!
> 



* Re: WARNING bisected (was Re: [PATCH v7 08/10] mm: rework non-root kmem_cache lifecycle management)
  2019-11-25  8:00             ` Christian Borntraeger
@ 2019-11-25 18:07               ` Roman Gushchin
  0 siblings, 0 replies; 12+ messages in thread
From: Roman Gushchin @ 2019-11-25 18:07 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: akpm, hannes, Kernel Team, linux-kernel, linux-mm, longman,
	shakeelb, vdavydov.dev, Heiko Carstens, Janosch Frank, kvm

On Mon, Nov 25, 2019 at 09:00:56AM +0100, Christian Borntraeger wrote:
> 
> 
> On 24.11.19 01:39, Roman Gushchin wrote:
> > On Fri, Nov 22, 2019 at 05:28:46PM +0100, Christian Borntraeger wrote:
> >> On 21.11.19 19:45, Roman Gushchin wrote:
> >>> I see. Do you know, which kmem_cache it is? If not, can you, please,
> >>> figure it out?
> >>
> >> The release function for that ref is kmemcg_cache_shutdown. 
> >>
> > 
> > Hi Christian!
> > 
> > Can you, please, test if the following patch resolves the problem?
> 
> Yes, it does.

Thanks for testing it!
I'll send the patch shortly.

> 
> 
> > 
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 8afa188f6e20..628e5f0ee19e 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -888,6 +888,8 @@ static int shutdown_memcg_caches(struct kmem_cache *s)
> >  
> >  static void flush_memcg_workqueue(struct kmem_cache *s)
> >  {
> > +	bool wait_for_children;
> > +
> >  	spin_lock_irq(&memcg_kmem_wq_lock);
> >  	s->memcg_params.dying = true;
> >  	spin_unlock_irq(&memcg_kmem_wq_lock);
> > @@ -904,6 +906,13 @@ static void flush_memcg_workqueue(struct kmem_cache *s)
> >  	 * previous workitems on workqueue are processed.
> >  	 */
> >  	flush_workqueue(memcg_kmem_cache_wq);
> > +
> > +	mutex_lock(&slab_mutex);
> > +	wait_for_children = !list_empty(&s->memcg_params.children);
> > +	mutex_unlock(&slab_mutex);
> 
> Not sure if (for reading) we really need the mutex.

Good point!
At this point the list of children caches can't grow, only shrink.
So if we read it without the slab mutex, the worst that can happen
is an unnecessary rcu_barrier() call, which is fine given that the
resulting code looks much simpler.
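
In other words, the lockless variant being discussed could look roughly like
this (a sketch of the idea only, not necessarily the patch that gets merged):

	flush_workqueue(memcg_kmem_cache_wq);

	/*
	 * The children list can only shrink at this point, so an unlocked
	 * check is safe; at worst we issue one unnecessary rcu_barrier().
	 */
	if (!list_empty(&s->memcg_params.children))
		rcu_barrier();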

Thanks!

