[PATCH v3] mm, memcg: fix error return value of mem_cgroup_css

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
@ 2020-04-06 16:56 Yafang Shao
  2020-04-06 23:23 ` Andrew Morton
  0 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2020-04-06 16:56 UTC (permalink / raw)
  To: willy, hannes, mhocko, vdavydov.dev, akpm; +Cc: linux-mm, Yafang Shao

When I run my memcg testcase which creates lots of memcgs, I found
there're unexpected out of memory logs while there're still enough
available free memory. The error log is,
mkdir: cannot create directory 'foo.65533': Cannot allocate memory

The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
-ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
should be -EBUSY "Device or resource busy". That is same with
memcg_alloc_cache_id().

As the errno really misled me, we should make it right. After this patch,
the error log will be,
mkdir: cannot create directory 'foo.65533': Device or resource busy

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 mm/memcontrol.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ca194864d802..042af0bc4a05 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2717,6 +2717,8 @@ static int memcg_alloc_cache_id(void)
 
 	id = ida_simple_get(&memcg_cache_ida,
 			    0, MEMCG_CACHES_MAX_SIZE, GFP_KERNEL);
+	if (id == -ENOSPC)
+		return -EBUSY;
 	if (id < 0)
 		return id;
 
@@ -4986,19 +4988,26 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 	unsigned int size;
 	int node;
 	int __maybe_unused i;
+	long error = -ENOMEM;
 
 	size = sizeof(struct mem_cgroup);
 	size += nr_node_ids * sizeof(struct mem_cgroup_per_node *);
 
 	memcg = kzalloc(size, GFP_KERNEL);
 	if (!memcg)
-		return NULL;
+		return ERR_PTR(error);
 
 	memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL,
 				 1, MEM_CGROUP_ID_MAX,
 				 GFP_KERNEL);
-	if (memcg->id.id < 0)
+	if (memcg->id.id == -ENOSPC) {
+		error = -EBUSY;
+		goto fail;
+	}
+	if (memcg->id.id < 0) {
+		error = memcg->id.id;
 		goto fail;
+	}
 
 	memcg->vmstats_local = alloc_percpu(struct memcg_vmstats_percpu);
 	if (!memcg->vmstats_local)
@@ -5042,7 +5051,7 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 fail:
 	mem_cgroup_id_remove(memcg);
 	__mem_cgroup_free(memcg);
-	return NULL;
+	return ERR_PTR(error);
 }
 
 static struct cgroup_subsys_state * __ref
@@ -5053,8 +5062,8 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	long error = -ENOMEM;
 
 	memcg = mem_cgroup_alloc();
-	if (!memcg)
-		return ERR_PTR(error);
+	if (IS_ERR(memcg))
+		return ERR_CAST(memcg);
 
 	WRITE_ONCE(memcg->high, PAGE_COUNTER_MAX);
 	memcg->soft_limit = PAGE_COUNTER_MAX;
@@ -5104,7 +5113,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 fail:
 	mem_cgroup_id_remove(memcg);
 	mem_cgroup_free(memcg);
-	return ERR_PTR(-ENOMEM);
+	return ERR_PTR(error);
 }
 
 static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-06 16:56 [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc() Yafang Shao
@ 2020-04-06 23:23 ` Andrew Morton
  2020-04-07  3:02   ` Yafang Shao
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2020-04-06 23:23 UTC (permalink / raw)
  To: Yafang Shao; +Cc: willy, hannes, mhocko, vdavydov.dev, linux-mm

On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:

> When I run my memcg testcase which creates lots of memcgs, I found
> there're unexpected out of memory logs while there're still enough
> available free memory. The error log is,
> mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> 
> The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> should be -EBUSY "Device or resource busy". That is same with
> memcg_alloc_cache_id().
> 
> As the errno really misled me, we should make it right. After this patch,
> the error log will be,
> mkdir: cannot create directory 'foo.65533': Device or resource busy

Thanks.

Was a -stable backport considered?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-06 23:23 ` Andrew Morton
@ 2020-04-07  3:02   ` Yafang Shao
  2020-04-07  3:09     ` Andrew Morton
  0 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2020-04-07  3:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	Linux MM

On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > When I run my memcg testcase which creates lots of memcgs, I found
> > there're unexpected out of memory logs while there're still enough
> > available free memory. The error log is,
> > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> >
> > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > should be -EBUSY "Device or resource busy". That is same with
> > memcg_alloc_cache_id().
> >
> > As the errno really misled me, we should make it right. After this patch,
> > the error log will be,
> > mkdir: cannot create directory 'foo.65533': Device or resource busy
>
> Thanks.
>
> Was a -stable backport considered?

I only backported to our kernel version 4.18, but I'm not sure whether
it will apply to -stable or not.
Seems this issue is introduced long time ago. I will try to backport
it to -stable.
Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
versions ?

Thanks
Yafang


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-07  3:02   ` Yafang Shao
@ 2020-04-07  3:09     ` Andrew Morton
  2020-04-07  3:11       ` Yafang Shao
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2020-04-07  3:09 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Matthew Wilcox, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	Linux MM

On Tue, 7 Apr 2020 11:02:31 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:

> On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > > When I run my memcg testcase which creates lots of memcgs, I found
> > > there're unexpected out of memory logs while there're still enough
> > > available free memory. The error log is,
> > > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> > >
> > > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > > should be -EBUSY "Device or resource busy". That is same with
> > > memcg_alloc_cache_id().
> > >
> > > As the errno really misled me, we should make it right. After this patch,
> > > the error log will be,
> > > mkdir: cannot create directory 'foo.65533': Device or resource busy
> >
> > Thanks.
> >
> > Was a -stable backport considered?
> 
> I only backported to our kernel version 4.18, but I'm not sure whether
> it will apply to -stable or not.
> Seems this issue is introduced long time ago. I will try to backport
> it to -stable.
> Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
> versions ?

What I'm asking (of you and of reviewers) is whether this issue is
sufficiently serious to require fixing in -stable kernels.  The
patch merging details can be sorted out later.




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-07  3:09     ` Andrew Morton
@ 2020-04-07  3:11       ` Yafang Shao
  2020-04-07  6:43         ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2020-04-07  3:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	Linux MM

On Tue, Apr 7, 2020 at 11:09 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 7 Apr 2020 11:02:31 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > > When I run my memcg testcase which creates lots of memcgs, I found
> > > > there're unexpected out of memory logs while there're still enough
> > > > available free memory. The error log is,
> > > > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> > > >
> > > > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > > > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > > > should be -EBUSY "Device or resource busy". That is same with
> > > > memcg_alloc_cache_id().
> > > >
> > > > As the errno really misled me, we should make it right. After this patch,
> > > > the error log will be,
> > > > mkdir: cannot create directory 'foo.65533': Device or resource busy
> > >
> > > Thanks.
> > >
> > > Was a -stable backport considered?
> >
> > I only backported to our kernel version 4.18, but I'm not sure whether
> > it will apply to -stable or not.
> > Seems this issue is introduced long time ago. I will try to backport
> > it to -stable.
> > Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
> > versions ?
>
> What I'm asking (of you and of reviewers) is whether this issue is
> sufficiently serious to require fixing in -stable kernels.  The
> patch merging details can be sorted out later.
>
>

I think it is worth to cc stable, as this errno will mislead the user.

Thanks
Yafang


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-07  3:11       ` Yafang Shao
@ 2020-04-07  6:43         ` Michal Hocko
  2020-04-07  9:31           ` Yafang Shao
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2020-04-07  6:43 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Vladimir Davydov,
	Linux MM

[I have only now noticed there was another version posted. Please try to
wait a bit longer for other feedback before reposting a newer version]

On Tue 07-04-20 11:11:13, Yafang Shao wrote:
> On Tue, Apr 7, 2020 at 11:09 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Tue, 7 Apr 2020 11:02:31 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > > On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > >
> > > > On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> > > >
> > > > > When I run my memcg testcase which creates lots of memcgs, I found
> > > > > there're unexpected out of memory logs while there're still enough
> > > > > available free memory. The error log is,
> > > > > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> > > > >
> > > > > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > > > > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > > > > should be -EBUSY "Device or resource busy". That is same with
> > > > > memcg_alloc_cache_id().

See my comment about EBUSY in the previous version
http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz

> > > > > As the errno really misled me, we should make it right. After this patch,
> > > > > the error log will be,
> > > > > mkdir: cannot create directory 'foo.65533': Device or resource busy
> > > >
> > > > Thanks.
> > > >
> > > > Was a -stable backport considered?
> > >
> > > I only backported to our kernel version 4.18, but I'm not sure whether
> > > it will apply to -stable or not.
> > > Seems this issue is introduced long time ago. I will try to backport
> > > it to -stable.
> > > Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
> > > versions ?
> >
> > What I'm asking (of you and of reviewers) is whether this issue is
> > sufficiently serious to require fixing in -stable kernels.  The
> > patch merging details can be sorted out later.
> >
> >
> 
> I think it is worth to cc stable, as this errno will mislead the user.

We have idr failure path since 73f576c04b94 ("mm: memcontrol: fix cgroup
creation failure after many small jobs") which is more than four years
ago without anybody ever noticing. While I do agree that the existing
behavior might be confusing I am not really sure this qualifies as
stable material TBH.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-07  6:43         ` Michal Hocko
@ 2020-04-07  9:31           ` Yafang Shao
  2020-04-07 11:10             ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2020-04-07  9:31 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Vladimir Davydov,
	Linux MM

On Tue, Apr 7, 2020 at 2:43 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> [I have only now noticed there was another version posted. Please try to
> wait a bit longer for other feedback before reposting a newer version]
>

Sure I will. sorry about that.
But after we send a patch, we always don't know in which state this
patch is, e.g. Changes Requested, Not Applicable, Rejected, Needs
Review / ACK and etc, as listed in netdev list[1]. That could give us
an explicit instruction on what to do next.

[1]. https://patchwork.ozlabs.org/project/netdev/list/?state=*&page=2&param=3

> On Tue 07-04-20 11:11:13, Yafang Shao wrote:
> > On Tue, Apr 7, 2020 at 11:09 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Tue, 7 Apr 2020 11:02:31 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > > On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > >
> > > > > On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> > > > >
> > > > > > When I run my memcg testcase which creates lots of memcgs, I found
> > > > > > there're unexpected out of memory logs while there're still enough
> > > > > > available free memory. The error log is,
> > > > > > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> > > > > >
> > > > > > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > > > > > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > > > > > should be -EBUSY "Device or resource busy". That is same with
> > > > > > memcg_alloc_cache_id().
>
> See my comment about EBUSY in the previous version
> http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
>

From the perspective of mkdir(2), it is better to use ENOSPC here. But
I'm not sure whether it is better to update the man page of mkdir(2)
or not.

I checked the explaination about ENOSPC in 73f576c04b94 ("mm:
memcontrol: fix cgroup creation failure after many small jobs")
carefully, but I don't a clear idea which one is better now.

Per Matthew, EBUSY is better, while per you, ENOSPC is better.

Matthew, would you mind to give more detailed explanation ?

> > > > > > As the errno really misled me, we should make it right. After this patch,
> > > > > > the error log will be,
> > > > > > mkdir: cannot create directory 'foo.65533': Device or resource busy
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Was a -stable backport considered?
> > > >
> > > > I only backported to our kernel version 4.18, but I'm not sure whether
> > > > it will apply to -stable or not.
> > > > Seems this issue is introduced long time ago. I will try to backport
> > > > it to -stable.
> > > > Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
> > > > versions ?
> > >
> > > What I'm asking (of you and of reviewers) is whether this issue is
> > > sufficiently serious to require fixing in -stable kernels.  The
> > > patch merging details can be sorted out later.
> > >
> > >
> >
> > I think it is worth to cc stable, as this errno will mislead the user.
>
> We have idr failure path since 73f576c04b94 ("mm: memcontrol: fix cgroup
> creation failure after many small jobs") which is more than four years
> ago without anybody ever noticing. While I do agree that the existing
> behavior might be confusing I am not really sure this qualifies as
> stable material TBH.

OK

Thanks
Yafang


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-07  9:31           ` Yafang Shao
@ 2020-04-07 11:10             ` Michal Hocko
  2020-04-07 18:10               ` Johannes Weiner
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2020-04-07 11:10 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Vladimir Davydov,
	Linux MM

On Tue 07-04-20 17:31:33, Yafang Shao wrote:
> On Tue, Apr 7, 2020 at 2:43 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > [I have only now noticed there was another version posted. Please try to
> > wait a bit longer for other feedback before reposting a newer version]
> >
> 
> Sure I will. sorry about that.
> But after we send a patch, we always don't know in which state this
> patch is, e.g. Changes Requested, Not Applicable, Rejected, Needs
> Review / ACK and etc, as listed in netdev list[1]. That could give us
> an explicit instruction on what to do next.

Yes I can see how that can be unclear or consuming at times. But a
general rule of thumb I would suggest is to simply wait few days before
reposting a new version. If you need to discuss an incremental change
based on the review feedback you can always do that in the original
email thread and repost the new version after the review feedback
settles.

> [1]. https://patchwork.ozlabs.org/project/netdev/list/?state=*&page=2&param=3
> 
> > On Tue 07-04-20 11:11:13, Yafang Shao wrote:
[...]
> > See my comment about EBUSY in the previous version
> > http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
> >
> 
> >From the perspective of mkdir(2), it is better to use ENOSPC here. But
> I'm not sure whether it is better to update the man page of mkdir(2)
> or not.

Well, I do not know whether EBUSY missing in the man page is an omission
or an intention. But adding it only for this single operation failure
would sound like an overkill to me.

> I checked the explaination about ENOSPC in 73f576c04b94 ("mm:
> memcontrol: fix cgroup creation failure after many small jobs")
> carefully, but I don't a clear idea which one is better now.

The changelog simply mentioned that without the additional id tracking
the ENOSPC would have been returned from elsewhere. I haven't checked
but I suspect it would be from the cgroup core. This patch just didn't
propagate the idr failure specifically and hid it under the ENOMEM
failure path. I can only speculate why that was the case but I suspect
that Johannes simply didn't consider the distinction important enough.
All I am saying is that preserving the failure mode would be preferable.
Especially unless there a strong reason to use EBUSY which then should
be carefully explained in the changelog.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-07 11:10             ` Michal Hocko
@ 2020-04-07 18:10               ` Johannes Weiner
  2020-04-09  1:29                 ` Andrew Morton
  0 siblings, 1 reply; 15+ messages in thread
From: Johannes Weiner @ 2020-04-07 18:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Yafang Shao, Andrew Morton, Matthew Wilcox, Vladimir Davydov, Linux MM

On Tue, Apr 07, 2020 at 01:10:17PM +0200, Michal Hocko wrote:
> On Tue 07-04-20 17:31:33, Yafang Shao wrote:
> > I checked the explaination about ENOSPC in 73f576c04b94 ("mm:
> > memcontrol: fix cgroup creation failure after many small jobs")
> > carefully, but I don't a clear idea which one is better now.
> 
> The changelog simply mentioned that without the additional id tracking
> the ENOSPC would have been returned from elsewhere. I haven't checked
> but I suspect it would be from the cgroup core. This patch just didn't
> propagate the idr failure specifically and hid it under the ENOMEM
> failure path. I can only speculate why that was the case but I suspect
> that Johannes simply didn't consider the distinction important enough.

The old -ENOSPC came directly from the memcg css online callback:

-       if (css->id > MEM_CGROUP_ID_MAX)
-               return -ENOSPC;

And it only became a problem because, on big memory machines (128+G)
with a high rate of short-lived jobs, lazily freed cgroups piled up
over the course of multiple days and clogged up the ID space. Nobody
actually tried to create 64k user-visible cgroups concurrently. It's
hard to imagine any machine running that many meaningfully distinct
memory consumers in parallel. So I didn't think (and still don't tbh)
it matters all that much in practice what we return here.

I'm not against changing it back to -ENOSPC. And I agree with Michal
it might be better than -EBUSY because of the mkdir() interface.

Given how unlikely this is to affect real setups, I don't think this
patch is stable material.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-07 18:10               ` Johannes Weiner
@ 2020-04-09  1:29                 ` Andrew Morton
  2020-04-09  6:57                   ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2020-04-09  1:29 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Michal Hocko, Yafang Shao, Matthew Wilcox, Vladimir Davydov, Linux MM

On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:

> Given how unlikely this is to affect real setups, I don't think this
> patch is stable material.

OK, thanks, done.

I didn't notice any acks - is this patch otherwise good to go upstream?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-09  1:29                 ` Andrew Morton
@ 2020-04-09  6:57                   ` Michal Hocko
  2020-04-09 13:59                     ` Yafang Shao
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2020-04-09  6:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Yafang Shao, Matthew Wilcox, Vladimir Davydov, Linux MM

On Wed 08-04-20 18:29:56, Andrew Morton wrote:
> On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> 
> > Given how unlikely this is to affect real setups, I don't think this
> > patch is stable material.
> 
> OK, thanks, done.
> 
> I didn't notice any acks - is this patch otherwise good to go upstream?

I will ack it if s@EBUSY@ENOSPC@

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-09  6:57                   ` Michal Hocko
@ 2020-04-09 13:59                     ` Yafang Shao
  2020-04-09 14:07                       ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2020-04-09 13:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Johannes Weiner, Matthew Wilcox, Vladimir Davydov,
	Linux MM

On Thu, Apr 9, 2020 at 2:57 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Wed 08-04-20 18:29:56, Andrew Morton wrote:
> > On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > > Given how unlikely this is to affect real setups, I don't think this
> > > patch is stable material.
> >
> > OK, thanks, done.
> >
> > I didn't notice any acks - is this patch otherwise good to go upstream?
>
> I will ack it if s@EBUSY@ENOSPC@
>

BTW, there's no EBUSY in man page of write(2) neither.
But when we echo some procs into cgroup.subtree_control of a non-root
cgroup, it will return EBUSY. See also cgroup_subtree_control_write().
Pls. correct me if I missed something.

We have to admit that the man page is out of date.

Thanks
Yafang


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-09 13:59                     ` Yafang Shao
@ 2020-04-09 14:07                       ` Michal Hocko
  2020-04-20 23:44                         ` Andrew Morton
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2020-04-09 14:07 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Andrew Morton, Johannes Weiner, Matthew Wilcox, Vladimir Davydov,
	Linux MM

On Thu 09-04-20 21:59:42, Yafang Shao wrote:
> On Thu, Apr 9, 2020 at 2:57 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Wed 08-04-20 18:29:56, Andrew Morton wrote:
> > > On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> > >
> > > > Given how unlikely this is to affect real setups, I don't think this
> > > > patch is stable material.
> > >
> > > OK, thanks, done.
> > >
> > > I didn't notice any acks - is this patch otherwise good to go upstream?
> >
> > I will ack it if s@EBUSY@ENOSPC@
> >
> 
> BTW, there's no EBUSY in man page of write(2) neither.
> But when we echo some procs into cgroup.subtree_control of a non-root
> cgroup, it will return EBUSY. See also cgroup_subtree_control_write().
> Pls. correct me if I missed something.
> 
> We have to admit that the man page is out of date.

Or the code is returning something unexpected but nobody has noticed so
far. Please talk to cgroup maintainers on how to address that.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-09 14:07                       ` Michal Hocko
@ 2020-04-20 23:44                         ` Andrew Morton
  2020-04-21 14:44                           ` Johannes Weiner
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2020-04-20 23:44 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Yafang Shao, Johannes Weiner, Matthew Wilcox, Vladimir Davydov, Linux MM

On Thu, 9 Apr 2020 16:07:29 +0200 Michal Hocko <mhocko@kernel.org> wrote:

> On Thu 09-04-20 21:59:42, Yafang Shao wrote:
> > On Thu, Apr 9, 2020 at 2:57 PM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > On Wed 08-04-20 18:29:56, Andrew Morton wrote:
> > > > On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > >
> > > > > Given how unlikely this is to affect real setups, I don't think this
> > > > > patch is stable material.
> > > >
> > > > OK, thanks, done.
> > > >
> > > > I didn't notice any acks - is this patch otherwise good to go upstream?
> > >
> > > I will ack it if s@EBUSY@ENOSPC@
> > >
> > 
> > BTW, there's no EBUSY in man page of write(2) neither.
> > But when we echo some procs into cgroup.subtree_control of a non-root
> > cgroup, it will return EBUSY. See also cgroup_subtree_control_write().
> > Pls. correct me if I missed something.
> > 
> > We have to admit that the man page is out of date.
> 
> Or the code is returning something unexpected but nobody has noticed so
> far. Please talk to cgroup maintainers on how to address that.

I agree with your ENOSPC reasoning in
http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz

That means this change, methinks:

--- a/mm/memcontrol.c~mm-memcg-fix-error-return-value-of-mem_cgroup_css_alloc-fix
+++ a/mm/memcontrol.c
@@ -2721,8 +2721,6 @@ static int memcg_alloc_cache_id(void)
 
 	id = ida_simple_get(&memcg_cache_ida,
 			    0, MEMCG_CACHES_MAX_SIZE, GFP_KERNEL);
-	if (id == -ENOSPC)
-		return -EBUSY;
 	if (id < 0)
 		return id;
 
@@ -5004,10 +5002,6 @@ static struct mem_cgroup *mem_cgroup_all
 	memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL,
 				 1, MEM_CGROUP_ID_MAX,
 				 GFP_KERNEL);
-	if (memcg->id.id == -ENOSPC) {
-		error = -EBUSY;
-		goto fail;
-	}
 	if (memcg->id.id < 0) {
 		error = memcg->id.id;
 		goto fail;
_


So the final patch will be as below.  Please review my changelog
alterations.


From: Yafang Shao <laoar.shao@gmail.com>
Subject: mm, memcg: fix error return value of mem_cgroup_css_alloc()

When I run my memcg testcase which creates lots of memcgs, I found
there're unexpected out of memory logs while there're still enough
available free memory.  The error log is, mkdir: cannot create directory
'foo.65533': Cannot allocate memory

The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
-ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
should be -ENOSPC "No space left on device", which is an appropriate errno
for userspace's failed mkdir.

As the errno really misled me, we should make it right.  After this patch,
the error log will be "mkdir: cannot create directory 'foo.65533': No
space left on device"

[akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
Link: http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/1586192163-20099-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-fix-error-return-value-of-mem_cgroup_css_alloc
+++ a/mm/memcontrol.c
@@ -4990,19 +4990,22 @@ static struct mem_cgroup *mem_cgroup_all
 	unsigned int size;
 	int node;
 	int __maybe_unused i;
+	long error = -ENOMEM;
 
 	size = sizeof(struct mem_cgroup);
 	size += nr_node_ids * sizeof(struct mem_cgroup_per_node *);
 
 	memcg = kzalloc(size, GFP_KERNEL);
 	if (!memcg)
-		return NULL;
+		return ERR_PTR(error);
 
 	memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL,
 				 1, MEM_CGROUP_ID_MAX,
 				 GFP_KERNEL);
-	if (memcg->id.id < 0)
+	if (memcg->id.id < 0) {
+		error = memcg->id.id;
 		goto fail;
+	}
 
 	memcg->vmstats_local = alloc_percpu(struct memcg_vmstats_percpu);
 	if (!memcg->vmstats_local)
@@ -5046,7 +5049,7 @@ static struct mem_cgroup *mem_cgroup_all
 fail:
 	mem_cgroup_id_remove(memcg);
 	__mem_cgroup_free(memcg);
-	return NULL;
+	return ERR_PTR(error);
 }
 
 static struct cgroup_subsys_state * __ref
@@ -5057,8 +5060,8 @@ mem_cgroup_css_alloc(struct cgroup_subsy
 	long error = -ENOMEM;
 
 	memcg = mem_cgroup_alloc();
-	if (!memcg)
-		return ERR_PTR(error);
+	if (IS_ERR(memcg))
+		return ERR_CAST(memcg);
 
 	WRITE_ONCE(memcg->high, PAGE_COUNTER_MAX);
 	memcg->soft_limit = PAGE_COUNTER_MAX;
@@ -5108,7 +5111,7 @@ mem_cgroup_css_alloc(struct cgroup_subsy
 fail:
 	mem_cgroup_id_remove(memcg);
 	mem_cgroup_free(memcg);
-	return ERR_PTR(-ENOMEM);
+	return ERR_PTR(error);
 }
 
 static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
_



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()
  2020-04-20 23:44                         ` Andrew Morton
@ 2020-04-21 14:44                           ` Johannes Weiner
  0 siblings, 0 replies; 15+ messages in thread
From: Johannes Weiner @ 2020-04-21 14:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Yafang Shao, Matthew Wilcox, Vladimir Davydov, Linux MM

On Mon, Apr 20, 2020 at 04:44:37PM -0700, Andrew Morton wrote:
> From: Yafang Shao <laoar.shao@gmail.com>
> Subject: mm, memcg: fix error return value of mem_cgroup_css_alloc()
> 
> When I run my memcg testcase which creates lots of memcgs, I found
> there're unexpected out of memory logs while there're still enough
> available free memory.  The error log is, mkdir: cannot create directory
> 'foo.65533': Cannot allocate memory
> 
> The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> should be -ENOSPC "No space left on device", which is an appropriate errno
> for userspace's failed mkdir.
> 
> As the errno really misled me, we should make it right.  After this patch,
> the error log will be "mkdir: cannot create directory 'foo.65533': No
> space left on device"
> 
> [akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
> Link: http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
> Link: http://lkml.kernel.org/r/1586192163-20099-1-git-send-email-laoar.shao@gmail.com
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Acked-by: Michal Hocko <mhocko@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Acked-by: Johannes Weiner <hannes@cmpxchg.org>


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-04-21 14:44 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-06 16:56 [PATCH v3] mm, memcg: fix error return value of mem_cgroup_css_alloc() Yafang Shao
2020-04-06 23:23 ` Andrew Morton
2020-04-07  3:02   ` Yafang Shao
2020-04-07  3:09     ` Andrew Morton
2020-04-07  3:11       ` Yafang Shao
2020-04-07  6:43         ` Michal Hocko
2020-04-07  9:31           ` Yafang Shao
2020-04-07 11:10             ` Michal Hocko
2020-04-07 18:10               ` Johannes Weiner
2020-04-09  1:29                 ` Andrew Morton
2020-04-09  6:57                   ` Michal Hocko
2020-04-09 13:59                     ` Yafang Shao
2020-04-09 14:07                       ` Michal Hocko
2020-04-20 23:44                         ` Andrew Morton
2020-04-21 14:44                           ` Johannes Weiner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).