* [PATCH v3] mm, memcg: fix the stupid OOM killer when shrinking memcg hard limit
@ 2019-11-28  4:17 Yafang Shao
From: Yafang Shao @ 2019-11-28  4:17 UTC
  To: mhocko, hannes, vdavydov.dev, akpm
  Cc: linux-mm, Yafang Shao, David Hildenbrand

When there are no more processes in a memcg (e.g., due to OOM
group), we can still have file pages in the page cache.

If these pages are protected by memory.min, they can't be reclaimed.
Especially if there won't be another process in this memcg and the memcg
is kept online, we do want to drop these pages from the page cache.

By dropping these page caches we can keep reclaimers (e.g., kswapd or
direct reclaim) from scanning and reclaiming pages from every memcg in
the system - under memory pressure, the reclaimers try to reclaim
fairly from all memcgs.

By setting the hard limit of such a memcg to 0, we can drop the page
cache of such memcgs. Unfortunately, this may invoke the OOM killer
and generate a lot of output. The OOM output is not expected by an admin
who wants to drop these caches and knows that there are no processes in
this memcg anymore.
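
For illustration, a minimal userspace sketch of the shrink operation
described above; the cgroup path below is hypothetical:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Hypothetical memcg that holds only page cache, no processes. */
	int fd = open("/sys/fs/cgroup/test/memory.max", O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* Writing "0" reclaims everything reclaimable; without this patch
	 * the write may then pointlessly invoke the OOM killer. */
	if (write(fd, "0", 1) != 1)
		perror("write");
	close(fd);
	return 0;
}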

Therefore, if a memcg is not populated, we should not invoke the OOM
killer - there is nothing to kill. The next time a new process is
started in the memcg while "max" is still below usage, the OOM killer
will be invoked and the new process will be killed.

[ The commit log above was contributed by David ]

What makes this issue worse: suppose there are killable tasks and the
OOM killer kills the last one - what happens then? Since nr_reclaims is
already 0 and drained is already true, the OOM killer keeps being
invoked with nothing left to kill (it has already killed the last
task), which is pointless behavior. See the sketch of the loop below.
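
For context, a simplified paraphrase of the reclaim loop in
memory_max_write() (not the verbatim kernel code; error handling and
details are elided):

	xchg(&memcg->memory.max, max);

	for (;;) {
		unsigned long nr_pages = page_counter_read(&memcg->memory);

		if (nr_pages <= max)
			break;		/* usage fits under the new limit */

		if (signal_pending(current))
			break;		/* the writer was interrupted */

		if (!drained) {		/* one shot: flush per-cpu stock */
			drain_all_stock(memcg);
			drained = true;
			continue;
		}

		if (nr_reclaims) {	/* bounded number of reclaim retries */
			if (!try_to_free_mem_cgroup_pages(memcg, nr_pages - max,
							  GFP_KERNEL, true))
				nr_reclaims--;
			continue;
		}

		/* Retries exhausted: invoke the OOM killer, even if a
		 * previous iteration already killed the last task. */
		memcg_memory_event(memcg, MEMCG_OOM);
		if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
			break;
	}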

Someone may worry that admins won't see that the memcg hit OOM due to
the limit change. This is not an issue: an admin who changes the limit
is expected to check the result of that change, and reading
memory.{max, current, stat} provides everything needed.
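
For illustration only, a sketch of that post-change check; the cgroup
path below is hypothetical:

#include <stdio.h>

/* Print the contents of one memcg control file. */
static void dump(const char *name)
{
	char path[256], line[256];
	FILE *f;

	snprintf(path, sizeof(path), "/sys/fs/cgroup/test/%s", name);
	f = fopen(path, "r");
	if (!f)
		return;
	while (fgets(line, sizeof(line), f))
		printf("%s: %s", name, line);
	fclose(f);
}

int main(void)
{
	dump("memory.max");
	dump("memory.current");
	dump("memory.stat");
	return 0;
}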

Cc: David Hildenbrand <david@redhat.com>
Nacked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

---
Changes since v2: refreshed the subject and commit log. The original
subject was "mm, memcg: avoid oom if cgroup is not populated"
---
 mm/memcontrol.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1c4c08b..e936f1b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6139,9 +6139,20 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
 			continue;
 		}
 
-		memcg_memory_event(memcg, MEMCG_OOM);
-		if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
+		/* If there are no processes, we don't need to invoke the
+		 * OOM killer. The next time a process is started in this
+		 * memcg, the max may still be below usage, and then the
+		 * OOM killer will be invoked. This can be considered lazy
+		 * OOM, which is what the kernel has always done, so the
+		 * behavior stays consistent.
+		 */
+		if (cgroup_is_populated(memcg->css.cgroup)) {
+			memcg_memory_event(memcg, MEMCG_OOM);
+			if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
+				break;
+		} else {
 			break;
+		}
 	}
 
 	memcg_wb_domain_size_changed(memcg);
-- 
1.8.3.1




* Re: [PATCH v3] mm, memcg: fix the stupid OOM killer when shrinking memcg hard limit
@ 2019-11-28  4:28 Qian Cai
From: Qian Cai @ 2019-11-28  4:28 UTC
  To: Yafang Shao
  Cc: Michal Hocko, hannes, vdavydov.dev, Andrew Morton, linux-mm,
	David Hildenbrand



> On Nov 27, 2019, at 11:17 PM, Yafang Shao <laoar.shao@gmail.com> wrote:
> 
> [...]
> 
> Cc: David Hildenbrand <david@redhat.com>
> Nacked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Surely this is too big a turkey to swallow? The subject line uses
unprofessional wording, and the patch carries a NACK from one of the
maintainers.





* Re: [PATCH v3] mm, memcg: fix the stupid OOM killer when shrinking memcg hard limit
@ 2019-11-28  4:59 Yafang Shao
From: Yafang Shao @ 2019-11-28  4:59 UTC
  To: Qian Cai
  Cc: Michal Hocko, Johannes Weiner, Vladimir Davydov, Andrew Morton,
	Linux MM, David Hildenbrand

On Thu, Nov 28, 2019 at 12:28 PM Qian Cai <cai@lca.pw> wrote:
>
>
>
> > On Nov 27, 2019, at 11:17 PM, Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > [...]
> > 
> > Cc: David Hildenbrand <david@redhat.com>
> > Nacked-by: Michal Hocko <mhocko@suse.com>
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
>
> Surely this is too big a turkey to swallow? The subject line uses
> unprofessional wording, and the patch carries a NACK from one of the
> maintainers.
>

I'm sorry for any unprofessional wording.
I just wanted to make the kernel more stable.

Thanks
Yafang


