From: Johannes Weiner <hannes@cmpxchg.org>
To: Greg Thelen <gthelen@google.com>
Cc: Michal Hocko <mhocko@suse.cz>, Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Michel Lespinasse <walken@google.com>, Tejun Heo <tj@kernel.org>,
	Roman Gushchin <klamm@yandex-team.ru>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
Date: Tue, 10 Jun 2014 12:57:56 -0400
Message-ID: <20140610165756.GG2878@cmpxchg.org>
In-Reply-To: <xr934mzt4rwc.fsf@gthelen.mtv.corp.google.com>

On Mon, Jun 09, 2014 at 03:52:51PM -0700, Greg Thelen wrote:
> 
> On Fri, Jun 06 2014, Michal Hocko <mhocko@suse.cz> wrote:
> 
> > Some users (e.g. Google) would like to have stronger semantic than low
> > limit offers currently. The fallback mode is not desirable and they
> > prefer hitting OOM killer rather than ignoring low limit for protected
> > groups. There are other possible usecases which can benefit from hard
> > guarantees. I can imagine workloads where setting low_limit to the same
> > value as hard_limit to prevent from any reclaim at all makes a lot of
> > sense because reclaim is much more disrupting than restart of the load.
> >
> > This patch adds a new per memcg memory.reclaim_strategy knob which
> > tells what to do in a situation when memory reclaim cannot do any
> > progress because all groups in the reclaimed hierarchy are within their
> > low_limit. There are two options available:
> > 	- low_limit_best_effort - the current mode when reclaim falls
> > 	  back to the even reclaim of all groups in the reclaimed
> > 	  hierarchy
> > 	- low_limit_guarantee - groups within low_limit are never
> > 	  reclaimed and OOM killer is triggered instead. OOM message
> > 	  will mention the fact that the OOM was triggered due to
> > 	  low_limit reclaim protection.
> 
> To (a) be consistent with existing hard and soft limits APIs and (b)
> allow use of both best effort and guarantee memory limits, I wonder if
> it's best to offer three per memcg limits, rather than two limits (hard,
> low_limit) and a related reclaim_strategy knob.  The three limits I'm
> thinking about are:
> 
> 1) hard_limit (aka the existing limit_in_bytes cgroupfs file).  No
>    change needed here.  This is an upper bound on a memcg hierarchy's
>    memory consumption (assuming use_hierarchy=1).

This creates internal pressure.  Outside reclaim is not affected by
it, but internal charges cannot exceed this limit.  It is set to
hard-limit the maximum memory consumption of a group (max).

> 2) best_effort_limit (aka desired working set).  This allow an
>    application or administrator to provide a hint to the kernel about
>    desired working set size.  Before oom'ing the kernel is allowed to
>    reclaim below this limit.  I think the current soft_limit_in_bytes
>    claims to provide this.  If we prefer to deprecate
>    soft_limit_in_bytes, then a new desired_working_set_in_bytes (or a
>    hopefully better named) API seems reasonable.

This controls how external pressure applies to the group.

But it's conceivable that we'd like to have the equivalent of such a
soft limit for *internal* pressure.  Set below the hard limit, this
internal soft limit would have charges trigger direct reclaim in the
memcg but allow them to continue to the hard limit.  This would create
a situation wherein the allocating tasks are not killed, but throttled
under reclaim, which gives the administrator a window to detect the
situation with vmpressure and possibly intervene.  As it stands, once
the current hard limit is hit, things can go downhill pretty fast and
the window for reacting to vmpressure readings is often too small.
This would offer a more gradual deterioration.  It would be set to
the upper end of the working set size range (high).
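
To make the internal-pressure side concrete, here is a minimal Python
sketch of how a charge might behave against the two upper limits.  It
is a toy model under stated assumptions, not kernel code: the Memcg
class, try_charge() and the returned strings are hypothetical names,
and reclaim itself is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Memcg:
    """Toy memcg with the two internal-pressure limits (hypothetical)."""
    high: int       # internal soft limit: charges above it are throttled
    max: int        # internal hard limit: charges may never exceed it
    usage: int = 0


def try_charge(cg: Memcg, nr_bytes: int) -> str:
    """Charge nr_bytes to cg and report what the charging task would see."""
    new_usage = cg.usage + nr_bytes
    if new_usage > cg.max:
        # Hard limit: the charge is refused.  In the kernel this is where
        # limit reclaim and, failing that, the OOM killer would come in.
        return "oom"
    cg.usage = new_usage
    if new_usage > cg.high:
        # Soft limit: the charge goes through, but the task would first be
        # put through direct reclaim, which throttles it.
        return "throttled"
    return "ok"
```

With high=100 and max=150, charges that keep usage at or below 100
pass untouched, charges landing between 100 and 150 succeed but are
throttled, and a charge that would cross 150 is refused.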

I think for many users such an internal soft limit would actually be
preferred over the current hard limit, as they'd rather have some
reclaim throttling than an OOM kill when the group reaches its upper
bound.  The current hard limit would be reserved for more advanced or
paid cases, where the admin would rather see a memcg get OOM killed
than exceed a certain size.

Then, as you proposed, we'd have the soft limit for external pressure,
where the kernel only reclaims groups within that limit in order to
avoid OOM kills.  It would be set to the estimated lower end of the
working set size range (low).

> 3) low_limit_guarantee which is a lower bound of memory usage.  A memcg
>    would prefer to be oom killed rather than operate below this
>    threshold.  Default value is zero to preserve compatibility with
>    existing apps.

And this would be the external pressure hard limit, which would be set
to the absolute minimum requirement of the group (min).

Either because it would be hopelessly thrashing without it, or because
this guaranteed memory is actually paid for.  Again, I would expect
many users to not even set this minimum guarantee but solely use the
external soft limit (low) instead.
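
The external-pressure side can be sketched the same way.  The Python
below is only an illustration of the low/min semantics described
above, assuming a flat list of groups; Group and
pick_reclaim_targets() are hypothetical names and nothing here
mirrors actual reclaim code.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Toy memcg with the two external-pressure limits (hypothetical)."""
    name: str
    usage: int
    low: int = 0    # external soft limit: best-effort protection
    min: int = 0    # external hard limit: guaranteed memory


def pick_reclaim_targets(groups):
    """Return (targets, oom) for one round of external reclaim."""
    # First pass: only touch groups that exceed their low limit.
    above_low = [g for g in groups if g.usage > g.low]
    if above_low:
        return above_low, False
    # Fallback: reclaim from groups within low, but only to avoid an
    # OOM kill, and never from a group at or below its min.
    above_min = [g for g in groups if g.usage > g.min]
    if above_min:
        return above_min, False
    # Every group is within its guarantee: nothing may be reclaimed.
    return [], True
```

Leaving min at its default of 0 gives the pure best-effort behavior,
which is what I would expect most users to run with.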

> Logically hard_limit >= best_effort_limit >= low_limit_guarantee.

max >= high >= low >= min

I think we should be able to express all desired usecases with these
four limits, including the advanced configurations, while making it
easy for many users to set up groups without being a) dead certain
about their memory consumption or b) prepared for frequent OOM kills,
while still allowing them to properly utilize their machines.
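
As a small illustration of the ordering, the Python sketch below
checks a configuration against max >= high >= low >= min.  The helper
name, the dictionary layout and the example values are assumptions
made for this sketch; whether the kernel should enforce the ordering
at all is a separate question.

```python
def ordering_violations(limits):
    """Return the adjacent pairs that break max >= high >= low >= min.

    `limits` maps the four limit names to byte counts (hypothetical layout).
    """
    order = ["max", "high", "low", "min"]
    return [
        (upper, lower)
        for upper, lower in zip(order, order[1:])
        if limits[upper] < limits[lower]
    ]


GiB = 1 << 30

# Simple setup: only the soft limits are set, min stays 0 and max is a
# large sentinel standing in for "unlimited" in this sketch.
simple = {"max": 1 << 62, "high": 4 * GiB, "low": 2 * GiB, "min": 0}

# Misconfigured setup: the guarantee exceeds the best-effort protection.
broken = {"max": 8 * GiB, "high": 4 * GiB, "low": 1 * GiB, "min": 2 * GiB}
```

Here ordering_violations(simple) comes back empty, while the broken
configuration reports the (low, min) pair.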

What do you think?
