linux-mm.kvack.org archive mirror
* [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
@ 2021-04-01  3:01 Muchun Song
  2021-04-01  3:04 ` Shakeel Butt
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Muchun Song @ 2021-04-01  3:01 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song,
	Christian Borntraeger

Christian Borntraeger reported a warning about "percpu ref
(obj_cgroup_release) <= 0 (-1) after switching to atomic".
This happens because split_page_memcg() forgot to obtain
references to the objcg for kmem pages and wrongly obtained
references to the memcg instead.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/memcontrol.h | 6 ++++++
 mm/memcontrol.c            | 6 +++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 0e8907957227..c960fd49c3e8 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -804,6 +804,12 @@ static inline void obj_cgroup_get(struct obj_cgroup *objcg)
 	percpu_ref_get(&objcg->refcnt);
 }
 
+static inline void obj_cgroup_get_many(struct obj_cgroup *objcg,
+				       unsigned long nr)
+{
+	percpu_ref_get_many(&objcg->refcnt, nr);
+}
+
 static inline void obj_cgroup_put(struct obj_cgroup *objcg)
 {
 	percpu_ref_put(&objcg->refcnt);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c0b83a396299..64ada9e650a5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3133,7 +3133,11 @@ void split_page_memcg(struct page *head, unsigned int nr)
 
 	for (i = 1; i < nr; i++)
 		head[i].memcg_data = head->memcg_data;
-	css_get_many(&memcg->css, nr - 1);
+
+	if (PageMemcgKmem(head))
+		obj_cgroup_get_many(__page_objcg(head), nr - 1);
+	else
+		css_get_many(&memcg->css, nr - 1);
 }
 
 #ifdef CONFIG_MEMCG_SWAP
-- 
2.11.0




* Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-01  3:01 [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg Muchun Song
@ 2021-04-01  3:04 ` Shakeel Butt
  2021-04-01  3:31 ` Miaohe Lin
  2021-04-12 10:41 ` Christian Borntraeger
  2 siblings, 0 replies; 11+ messages in thread
From: Shakeel Butt @ 2021-04-01  3:04 UTC (permalink / raw)
  To: Muchun Song
  Cc: Roman Gushchin, Johannes Weiner, Michal Hocko, Andrew Morton,
	Vladimir Davydov, LKML, Linux MM, Xiongchun duan,
	Christian Borntraeger

On Wed, Mar 31, 2021 at 8:02 PM Muchun Song <songmuchun@bytedance.com> wrote:
>
> Christian Borntraeger reported a warning about "percpu ref
> (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> Because we forgot to obtain the reference to the objcg and
> wrongly obtain the reference of memcg.
>
> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Looks good to me.

Reviewed-by: Shakeel Butt <shakeelb@google.com>



* Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-01  3:01 [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg Muchun Song
  2021-04-01  3:04 ` Shakeel Butt
@ 2021-04-01  3:31 ` Miaohe Lin
  2021-04-01  3:35   ` Roman Gushchin
  2021-04-12 10:41 ` Christian Borntraeger
  2 siblings, 1 reply; 11+ messages in thread
From: Miaohe Lin @ 2021-04-01  3:31 UTC (permalink / raw)
  To: Muchun Song
  Cc: linux-kernel, linux-mm, duanxiongchun, Christian Borntraeger,
	guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev

On 2021/4/1 11:01, Muchun Song wrote:
> Christian Borntraeger reported a warning about "percpu ref
> (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> Because we forgot to obtain the reference to the objcg and
> wrongly obtain the reference of memcg.
> 
> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Thanks for the patch.
Is a Fixes tag needed?





* Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-01  3:31 ` Miaohe Lin
@ 2021-04-01  3:35   ` Roman Gushchin
  2021-04-01  3:38     ` Miaohe Lin
  2021-04-03  1:04     ` Andrew Morton
  0 siblings, 2 replies; 11+ messages in thread
From: Roman Gushchin @ 2021-04-01  3:35 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Muchun Song, linux-kernel, linux-mm, duanxiongchun,
	Christian Borntraeger, hannes, mhocko, akpm, shakeelb,
	vdavydov.dev

On Thu, Apr 01, 2021 at 11:31:16AM +0800, Miaohe Lin wrote:
> On 2021/4/1 11:01, Muchun Song wrote:
> > Christian Borntraeger reported a warning about "percpu ref
> > (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> > Because we forgot to obtain the reference to the objcg and
> > wrongly obtain the reference of memcg.
> > 
> > Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> 
> Thanks for the patch.
> Is a Fixes tag needed?

No, as the original patch hasn't been merged into Linus's tree yet,
so the fix can simply be squashed.

Btw, the fix looks good to me.

Acked-by: Roman Gushchin <guro@fb.com>




* Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-01  3:35   ` Roman Gushchin
@ 2021-04-01  3:38     ` Miaohe Lin
  2021-04-03  1:04     ` Andrew Morton
  1 sibling, 0 replies; 11+ messages in thread
From: Miaohe Lin @ 2021-04-01  3:38 UTC (permalink / raw)
  To: Roman Gushchin, Muchun Song
  Cc: linux-kernel, linux-mm, duanxiongchun, Christian Borntraeger,
	hannes, mhocko, akpm, shakeelb, vdavydov.dev

On 2021/4/1 11:35, Roman Gushchin wrote:
> On Thu, Apr 01, 2021 at 11:31:16AM +0800, Miaohe Lin wrote:
>> On 2021/4/1 11:01, Muchun Song wrote:
>>> Christian Borntraeger reported a warning about "percpu ref
>>> (obj_cgroup_release) <= 0 (-1) after switching to atomic".
>>> Because we forgot to obtain the reference to the objcg and
>>> wrongly obtain the reference of memcg.
>>>
>>> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>
>> Thanks for the patch.
>> Is a Fixes tag needed?
> 
> No, as the original patch hasn't been merged into the Linus's tree yet.
> So the fix can be simply squashed.
> 
> Btw, the fix looks good to me.
> 
> Acked-by: Roman Gushchin <guro@fb.com>
> 

I see. Many thanks for the explanation!

The code looks good to me.
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>





* Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-01  3:35   ` Roman Gushchin
  2021-04-01  3:38     ` Miaohe Lin
@ 2021-04-03  1:04     ` Andrew Morton
  2021-04-03  1:10       ` Shakeel Butt
  2021-04-03  1:12       ` Roman Gushchin
  1 sibling, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2021-04-03  1:04 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Miaohe Lin, Muchun Song, linux-kernel, linux-mm, duanxiongchun,
	Christian Borntraeger, hannes, mhocko, shakeelb, vdavydov.dev

On Wed, 31 Mar 2021 20:35:02 -0700 Roman Gushchin <guro@fb.com> wrote:

> On Thu, Apr 01, 2021 at 11:31:16AM +0800, Miaohe Lin wrote:
> > On 2021/4/1 11:01, Muchun Song wrote:
> > > Christian Borntraeger reported a warning about "percpu ref
> > > (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> > > Because we forgot to obtain the reference to the objcg and
> > > wrongly obtain the reference of memcg.
> > > 
> > > Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > 
> > Thanks for the patch.
> > Is a Fixes tag needed?
> 
> No, as the original patch hasn't been merged into the Linus's tree yet.
> So the fix can be simply squashed.

Help.  Which is "the original patch"?

> Btw, the fix looks good to me.
> 
> Acked-by: Roman Gushchin <guro@fb.com>




* Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-03  1:04     ` Andrew Morton
@ 2021-04-03  1:10       ` Shakeel Butt
  2021-04-03  1:12       ` Roman Gushchin
  1 sibling, 0 replies; 11+ messages in thread
From: Shakeel Butt @ 2021-04-03  1:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Roman Gushchin, Miaohe Lin, Muchun Song, LKML, Linux MM,
	Xiongchun duan, Christian Borntraeger, Johannes Weiner,
	Michal Hocko, Vladimir Davydov

On Fri, Apr 2, 2021 at 6:04 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed, 31 Mar 2021 20:35:02 -0700 Roman Gushchin <guro@fb.com> wrote:
>
> > On Thu, Apr 01, 2021 at 11:31:16AM +0800, Miaohe Lin wrote:
> > > On 2021/4/1 11:01, Muchun Song wrote:
> > > > Christian Borntraeger reported a warning about "percpu ref
> > > > (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> > > > Because we forgot to obtain the reference to the objcg and
> > > > wrongly obtain the reference of memcg.
> > > >
> > > > Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > >
> > > Thanks for the patch.
> > > Is a Fixes tag needed?
> >
> > No, as the original patch hasn't been merged into the Linus's tree yet.
> > So the fix can be simply squashed.
>
> Help.  Which is "the original patch"?

"mm: memcontrol: use obj_cgroup APIs to charge kmem pages"



* Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-03  1:04     ` Andrew Morton
  2021-04-03  1:10       ` Shakeel Butt
@ 2021-04-03  1:12       ` Roman Gushchin
  1 sibling, 0 replies; 11+ messages in thread
From: Roman Gushchin @ 2021-04-03  1:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Miaohe Lin, Muchun Song, linux-kernel, linux-mm, duanxiongchun,
	Christian Borntraeger, hannes, mhocko, shakeelb, vdavydov.dev

On Fri, Apr 02, 2021 at 06:04:54PM -0700, Andrew Morton wrote:
> On Wed, 31 Mar 2021 20:35:02 -0700 Roman Gushchin <guro@fb.com> wrote:
> 
> > On Thu, Apr 01, 2021 at 11:31:16AM +0800, Miaohe Lin wrote:
> > > On 2021/4/1 11:01, Muchun Song wrote:
> > > > Christian Borntraeger reported a warning about "percpu ref
> > > > (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> > > > Because we forgot to obtain the reference to the objcg and
> > > > wrongly obtain the reference of memcg.
> > > > 
> > > > Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > 
> > > Thanks for the patch.
> > > Is a Fixes tag needed?
> > 
> > No, as the original patch hasn't been merged into the Linus's tree yet.
> > So the fix can be simply squashed.
> 
> Help.  Which is "the original patch"?

"mm: memcontrol: use obj_cgroup APIs to charge kmem pages"



* Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-01  3:01 [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg Muchun Song
  2021-04-01  3:04 ` Shakeel Butt
  2021-04-01  3:31 ` Miaohe Lin
@ 2021-04-12 10:41 ` Christian Borntraeger
  2021-04-12 10:53   ` [External] " Muchun Song
  2 siblings, 1 reply; 11+ messages in thread
From: Christian Borntraeger @ 2021-04-12 10:41 UTC (permalink / raw)
  To: Muchun Song, guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, linux-s390

FWIW, I was away last week, and I checked the regression runs for yesterday's next (e99d8a849517).
I still see errors in our CI system:

[ 2263.021681] ------------[ cut here ]------------
[ 2263.021697] percpu ref (obj_cgroup_release) <= 0 (0) after switching to atomic
[ 2263.021748] WARNING: CPU: 4 PID: 0 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x1ea/0x1f8
[ 2263.021756] Modules linked in: scsi_debug vfio_pci irqbypass vfio_virqfd kvm vhost_vsock vmw_vsock_virtio_transport_common vsock vhost vhost_iotlb xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT xt_tcpudp nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp nft_counter bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink dm_service_time zfcp scsi_transport_fc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua rpcrdma sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib dm_mod ib_uverbs ib_core s390_trng vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio eadm_sch zcrypt_cex4 sch_fq_codel configfs ip_tables x_tables ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 mlx5_core sha512_s390 sha256_s390 sha1_s390 sha_common nvme nvme_core pkey zcrypt rng_core autofs4 [last unloaded: vfio_ap]
[ 2263.021820] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.12.0-20210412.rc6.git0.e99d8a849517.300.fc33.s390x+next #1
[ 2263.021823] Hardware name: IBM 8561 T01 703 (LPAR)
[ 2263.021825] Krnl PSW : 0704c00180000000 000000025b234c1e (percpu_ref_switch_to_atomic_rcu+0x1ee/0x1f8)
[ 2263.021829]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 2263.021832] Krnl GPRS: c0000000fffeffff 00000002f7212818 0000000000000042 00000000fffeffff
[ 2263.021834]            00000000ffffffea 0000038000000001 0000000000000000 000003800000017c
[ 2263.021836]            000000025b980988 00000000b774d0e0 000003fee191d5d8 8000000000000000
[ 2263.021838]            000000008034c000 00000002f7227570 000000025b234c1a 00000380000aba28
[ 2263.021849] Krnl Code: 000000025b234c0e: e3309fe8ff04        lg      %r3,-24(%r9)
                           000000025b234c14: c0e5001ebe92        brasl   %r14,000000025b60c938
                          #000000025b234c1a: af000000            mc      0,0
                          >000000025b234c1e: a7f4ffcc            brc     15,000000025b234bb6
                           000000025b234c22: 0707                bcr     0,%r7
                           000000025b234c24: 0707                bcr     0,%r7
                           000000025b234c26: 0707                bcr     0,%r7
                           000000025b234c28: eb6ff0480024        stmg    %r6,%r15,72(%r15)
[ 2263.021912] Call Trace:
[ 2263.021914]  [<000000025b234c1e>] percpu_ref_switch_to_atomic_rcu+0x1ee/0x1f8
[ 2263.021917] ([<000000025b234c1a>] percpu_ref_switch_to_atomic_rcu+0x1ea/0x1f8)
[ 2263.021919]  [<000000025abe16fe>] rcu_do_batch+0x146/0x608
[ 2263.021924]  [<000000025abe5ff4>] rcu_core+0x124/0x1d0
[ 2263.021926]  [<000000025b62a222>] __do_softirq+0x13a/0x3c8
[ 2263.021930]  [<000000025ab5d3f6>] irq_exit+0xce/0xf8
[ 2263.021934]  [<000000025b61a5f6>] do_ext_irq+0xd6/0x160
[ 2263.021937]  [<000000025b627c3c>] ext_int_handler+0xc4/0xf4
[ 2263.021939]  [<0000000000000000>] 0x0
[ 2263.021943]  [<000000025b62775a>] default_idle_call+0x42/0x110
[ 2263.021945]  [<000000025ab99328>] do_idle+0xd8/0x168
[ 2263.021949]  [<000000025ab99576>] cpu_startup_entry+0x36/0x40
[ 2263.021952]  [<000000025ab1f33a>] smp_start_secondary+0x82/0x88
[ 2263.021955] Last Breaking-Event-Address:
[ 2263.021955]  [<000000025abc8828>] vprintk_emit+0xa8/0x110
[ 2263.021961] Kernel panic - not syncing: panic_on_warn set ...
[ 2263.021962] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.12.0-20210412.rc6.git0.e99d8a849517.300.fc33.s390x+next #1
[ 2263.021964] Hardware name: IBM 8561 T01 703 (LPAR)
[ 2263.021965] Call Trace:
[ 2263.021966]  [<000000025b60bc9a>] show_stack+0x92/0xd8
[ 2263.021972]  [<000000025b6161c0>] dump_stack+0x90/0xc0
[ 2263.021975]  [<000000025b60cab2>] panic+0x112/0x308
[ 2263.021977]  [<000000025ab5571a>] __warn+0xc2/0x158
[ 2263.021981]  [<000000025b2a5e4a>] report_bug+0xb2/0x130
[ 2263.021984]  [<000000025ab09ef4>] monitor_event_exception+0x44/0xc0
[ 2263.021986]  [<000000025b61a1e8>] __do_pgm_check+0xe0/0x1f0
[ 2263.021988]  [<000000025b627b30>] pgm_check_handler+0x118/0x160
[ 2263.021990]  [<000000025b234c1e>] percpu_ref_switch_to_atomic_rcu+0x1ee/0x1f8
[ 2263.021992] ([<000000025b234c1a>] percpu_ref_switch_to_atomic_rcu+0x1ea/0x1f8)
[ 2263.021993]  [<000000025abe16fe>] rcu_do_batch+0x146/0x608
[ 2263.021995]  [<000000025abe5ff4>] rcu_core+0x124/0x1d0
[ 2263.021997]  [<000000025b62a222>] __do_softirq+0x13a/0x3c8
[ 2263.021998]  [<000000025ab5d3f6>] irq_exit+0xce/0xf8
[ 2263.022000]  [<000000025b61a5f6>] do_ext_irq+0xd6/0x160
[ 2263.022001]  [<000000025b627c3c>] ext_int_handler+0xc4/0xf4
[ 2263.022003]  [<0000000000000000>] 0x0
[ 2263.022004]  [<000000025b62775a>] default_idle_call+0x42/0x110
[ 2263.022006]  [<000000025ab99328>] do_idle+0xd8/0x168
[ 2263.022008]  [<000000025ab99576>] cpu_startup_entry+0x36/0x40

So either the fix was not complete or it is still missing in next.



* Re: [External] Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-12 10:41 ` Christian Borntraeger
@ 2021-04-12 10:53   ` Muchun Song
  2021-04-12 11:05     ` Christian Borntraeger
  0 siblings, 1 reply; 11+ messages in thread
From: Muchun Song @ 2021-04-12 10:53 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev, linux-kernel,
	linux-mm, duanxiongchun, linux-s390

On Mon, Apr 12, 2021 at 6:42 PM Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
>
> FWIW, I was away the last week, and I checked yesterdays next (e99d8a849517) regression runs.
> I still do see errors in our CI system:
>
> [...]
>
> So either the fix was not complete or it is still missing in next.

The fix is now in the mm tree. I guess the branch you
tested does not contain it. You can check whether
obj_cgroup_get_many() exists there. If it does not,
my guess is correct.

Thanks.



* RE: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg
  2021-04-12 10:53   ` [External] " Muchun Song
@ 2021-04-12 11:05     ` Christian Borntraeger
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Borntraeger @ 2021-04-12 11:05 UTC (permalink / raw)
  To: Muchun Song
  Cc: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev, linux-kernel,
	linux-mm, duanxiongchun, linux-s390



On 12.04.21 12:53, Muchun Song wrote:
> On Mon, Apr 12, 2021 at 6:42 PM Christian Borntraeger
> <borntraeger@de.ibm.com> wrote:
>>
>> FWIW, I was away the last week, and I checked yesterdays next (e99d8a849517) regression runs.
>> I still do see errors in our CI system:
>>
>> [...]
>>
>> So either the fix was not complete or it is still missing in next.
> 
> The fix now is on the mm-tree. I guess the branch you
> tested does not contain this fix patch. You can check if
> the function of obj_cgroup_get_many() exists. If it
> doesn't exist, this means my guess is correct.

Right, the next tree from April 9th does not yet contain obj_cgroup_get_many().


