[PATCH] doc: cgroup: update note about conditions when oom killer is invoked

* [PATCH] doc: cgroup: update note about conditions when oom killer is invoked
@ 2020-05-08 14:16 Konstantin Khlebnikov
  2020-05-08 16:00 ` Randy Dunlap
  2020-05-11  8:39 ` Michal Hocko
  0 siblings, 2 replies; 5+ messages in thread
From: Konstantin Khlebnikov @ 2020-05-08 14:16 UTC (permalink / raw)
  To: linux-kernel, linux-mm, Andrew Morton
  Cc: cgroups, Roman Gushchin, Michal Hocko

Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
back to the charge path") cgroup oom killer is no longer invoked only from
page faults. Now it implements the same semantics as global OOM killer:
allocation context invokes OOM killer and keeps retrying until success.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 Documentation/admin-guide/cgroup-v2.rst |   17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index bcc80269bb6a..1bb9a8f6ebe1 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1172,6 +1172,13 @@ PAGE_SIZE multiple when read back.
 	Under certain circumstances, the usage may go over the limit
 	temporarily.
 
+	In default configuration regular 0-order allocation always
+	succeed unless OOM killer choose current task as a victim.
+
+	Some kinds of allocations don't invoke the OOM killer.
+	Caller could retry them differently, return into userspace
+	as -ENOMEM or silently ignore in cases like disk readahead.
+
 	This is the ultimate protection mechanism.  As long as the
 	high limit is used and monitored properly, this limit's
 	utility is limited to providing the final safety net.
@@ -1228,17 +1235,9 @@ PAGE_SIZE multiple when read back.
 		The number of time the cgroup's memory usage was
 		reached the limit and allocation was about to fail.
 
-		Depending on context result could be invocation of OOM
-		killer and retrying allocation or failing allocation.
-
-		Failed allocation in its turn could be returned into
-		userspace as -ENOMEM or silently ignored in cases like
-		disk readahead.  For now OOM in memory cgroup kills
-		tasks iff shortage has happened inside page fault.
-
 		This event is not raised if the OOM killer is not
 		considered as an option, e.g. for failed high-order
-		allocations.
+		allocations or if caller asked to not retry attempts.
 
 	  oom_kill
 		The number of processes belonging to this cgroup
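For readers who want to watch the counters this hunk describes, here is a minimal sketch (not part of the patch) that parses the flat "name value" lines of a cgroup v2 memory.events file and pulls out `oom` and `oom_kill`. The helper names and the sample contents are illustrative assumptions; on a real system the file sits in the cgroup's own directory under the cgroup2 mount.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse 'name value' lines (as found in memory.events) into a dict."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            events[parts[0]] = int(parts[1])
    return events


def read_memory_events(cgroup_dir):
    """Read memory.events from a cgroup directory.

    cgroup_dir is supplied by the caller; no particular mount point or
    cgroup name is assumed here.
    """
    return parse_memory_events((Path(cgroup_dir) / "memory.events").read_text())


# Made-up sample contents, for illustration only.
sample = "low 0\nhigh 12\nmax 7\noom 3\noom_kill 2\n"
counts = parse_memory_events(sample)

# 'oom' counts limit hits where the OOM killer was considered;
# 'oom_kill' counts processes in the cgroup actually killed.
print(counts["oom"], counts["oom_kill"])
```

Per the text above, `oom` stays unchanged for allocations where the OOM killer is not an option (failed high-order allocations, or callers that asked not to retry), so `oom` and `oom_kill` need not move together.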




* Re: [PATCH] doc: cgroup: update note about conditions when oom killer is invoked
  2020-05-08 14:16 [PATCH] doc: cgroup: update note about conditions when oom killer is invoked Konstantin Khlebnikov
@ 2020-05-08 16:00 ` Randy Dunlap
  2020-05-11  8:39 ` Michal Hocko
  1 sibling, 0 replies; 5+ messages in thread
From: Randy Dunlap @ 2020-05-08 16:00 UTC (permalink / raw)
  To: Konstantin Khlebnikov, linux-kernel, linux-mm, Andrew Morton
  Cc: cgroups, Roman Gushchin, Michal Hocko

Hi,

On 5/8/20 7:16 AM, Konstantin Khlebnikov wrote:
> Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
> back to the charge path") cgroup oom killer is no longer invoked only from
> page faults. Now it implements the same semantics as global OOM killer:
> allocation context invokes OOM killer and keeps retrying until success.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> ---
>  Documentation/admin-guide/cgroup-v2.rst |   17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index bcc80269bb6a..1bb9a8f6ebe1 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1172,6 +1172,13 @@ PAGE_SIZE multiple when read back.
>  	Under certain circumstances, the usage may go over the limit
>  	temporarily.
>  
> +	In default configuration regular 0-order allocation always

	                                         allocations

> +	succeed unless OOM killer choose current task as a victim.

	                          chooses

> +
> +	Some kinds of allocations don't invoke the OOM killer.
> +	Caller could retry them differently, return into userspace
> +	as -ENOMEM or silently ignore in cases like disk readahead.
> +
>  	This is the ultimate protection mechanism.  As long as the
>  	high limit is used and monitored properly, this limit's
>  	utility is limited to providing the final safety net.
> @@ -1228,17 +1235,9 @@ PAGE_SIZE multiple when read back.
>  		The number of time the cgroup's memory usage was
>  		reached the limit and allocation was about to fail.
>  
> -		Depending on context result could be invocation of OOM
> -		killer and retrying allocation or failing allocation.
> -
> -		Failed allocation in its turn could be returned into
> -		userspace as -ENOMEM or silently ignored in cases like
> -		disk readahead.  For now OOM in memory cgroup kills
> -		tasks iff shortage has happened inside page fault.
> -
>  		This event is not raised if the OOM killer is not
>  		considered as an option, e.g. for failed high-order
> -		allocations.
> +		allocations or if caller asked to not retry attempts.
>  
>  	  oom_kill
>  		The number of processes belonging to this cgroup
> 


thanks for updating the docs.
-- 
~Randy




* Re: [PATCH] doc: cgroup: update note about conditions when oom killer is invoked
  2020-05-08 14:16 [PATCH] doc: cgroup: update note about conditions when oom killer is invoked Konstantin Khlebnikov
  2020-05-08 16:00 ` Randy Dunlap
@ 2020-05-11  8:39 ` Michal Hocko
  2020-05-11  9:34   ` Konstantin Khlebnikov
  1 sibling, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2020-05-11  8:39 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-kernel, linux-mm, Andrew Morton, cgroups, Roman Gushchin

On Fri 08-05-20 17:16:29, Konstantin Khlebnikov wrote:
> Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
> back to the charge path") cgroup oom killer is no longer invoked only from
> page faults. Now it implements the same semantics as global OOM killer:
> allocation context invokes OOM killer and keeps retrying until success.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  Documentation/admin-guide/cgroup-v2.rst |   17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index bcc80269bb6a..1bb9a8f6ebe1 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1172,6 +1172,13 @@ PAGE_SIZE multiple when read back.
>  	Under certain circumstances, the usage may go over the limit
>  	temporarily.
>  
> +	In default configuration regular 0-order allocation always
> +	succeed unless OOM killer choose current task as a victim.
> +
> +	Some kinds of allocations don't invoke the OOM killer.
> +	Caller could retry them differently, return into userspace
> +	as -ENOMEM or silently ignore in cases like disk readahead.

I would probably add -EFAULT, but the fewer error codes we document the
better.

> +
>  	This is the ultimate protection mechanism.  As long as the
>  	high limit is used and monitored properly, this limit's
>  	utility is limited to providing the final safety net.
> @@ -1228,17 +1235,9 @@ PAGE_SIZE multiple when read back.
>  		The number of time the cgroup's memory usage was
>  		reached the limit and allocation was about to fail.
>  
> -		Depending on context result could be invocation of OOM
> -		killer and retrying allocation or failing allocation.
> -
> -		Failed allocation in its turn could be returned into
> -		userspace as -ENOMEM or silently ignored in cases like
> -		disk readahead.  For now OOM in memory cgroup kills
> -		tasks iff shortage has happened inside page fault.
> -
>  		This event is not raised if the OOM killer is not
>  		considered as an option, e.g. for failed high-order
> -		allocations.
> +		allocations or if caller asked to not retry attempts.
>  
>  	  oom_kill
>  		The number of processes belonging to this cgroup

-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] doc: cgroup: update note about conditions when oom killer is invoked
  2020-05-11  8:39 ` Michal Hocko
@ 2020-05-11  9:34   ` Konstantin Khlebnikov
  2020-05-11 10:13     ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Konstantin Khlebnikov @ 2020-05-11  9:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, cgroups, Roman Gushchin



On 11/05/2020 11.39, Michal Hocko wrote:
> On Fri 08-05-20 17:16:29, Konstantin Khlebnikov wrote:
>> Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
>> back to the charge path") cgroup oom killer is no longer invoked only from
>> page faults. Now it implements the same semantics as global OOM killer:
>> allocation context invokes OOM killer and keeps retrying until success.
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
>> ---
>>   Documentation/admin-guide/cgroup-v2.rst |   17 ++++++++---------
>>   1 file changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>> index bcc80269bb6a..1bb9a8f6ebe1 100644
>> --- a/Documentation/admin-guide/cgroup-v2.rst
>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>> @@ -1172,6 +1172,13 @@ PAGE_SIZE multiple when read back.
>>   	Under certain circumstances, the usage may go over the limit
>>   	temporarily.
>>   
>> +	In default configuration regular 0-order allocation always
>> +	succeed unless OOM killer choose current task as a victim.
>> +
>> +	Some kinds of allocations don't invoke the OOM killer.
>> +	Caller could retry them differently, return into userspace
>> +	as -ENOMEM or silently ignore in cases like disk readahead.
> 
> I would probably add -EFAULT, but the fewer error codes we document the
> better.

Yeah, EFAULT was the most obscure result of a memory shortage.
Fortunately, with the new behaviour this shouldn't happen a lot.

Actually, where is it still possible? THP always falls back to 0-order.
I mean EFAULT could appear inside the kernel only if the task is killed,
so nobody would see it.

> 
>> +
>>   	This is the ultimate protection mechanism.  As long as the
>>   	high limit is used and monitored properly, this limit's
>>   	utility is limited to providing the final safety net.
>> @@ -1228,17 +1235,9 @@ PAGE_SIZE multiple when read back.
>>   		The number of time the cgroup's memory usage was
>>   		reached the limit and allocation was about to fail.
>>   
>> -		Depending on context result could be invocation of OOM
>> -		killer and retrying allocation or failing allocation.
>> -
>> -		Failed allocation in its turn could be returned into
>> -		userspace as -ENOMEM or silently ignored in cases like
>> -		disk readahead.  For now OOM in memory cgroup kills
>> -		tasks iff shortage has happened inside page fault.
>> -
>>   		This event is not raised if the OOM killer is not
>>   		considered as an option, e.g. for failed high-order
>> -		allocations.
>> +		allocations or if caller asked to not retry attempts.
>>   
>>   	  oom_kill
>>   		The number of processes belonging to this cgroup
> 



* Re: [PATCH] doc: cgroup: update note about conditions when oom killer is invoked
  2020-05-11  9:34   ` Konstantin Khlebnikov
@ 2020-05-11 10:13     ` Michal Hocko
  0 siblings, 0 replies; 5+ messages in thread
From: Michal Hocko @ 2020-05-11 10:13 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-kernel, linux-mm, Andrew Morton, cgroups, Roman Gushchin

On Mon 11-05-20 12:34:00, Konstantin Khlebnikov wrote:
> 
> 
> On 11/05/2020 11.39, Michal Hocko wrote:
> > On Fri 08-05-20 17:16:29, Konstantin Khlebnikov wrote:
> > > Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
> > > back to the charge path") cgroup oom killer is no longer invoked only from
> > > page faults. Now it implements the same semantics as global OOM killer:
> > > allocation context invokes OOM killer and keeps retrying until success.
> > > 
> > > Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> > 
> > Acked-by: Michal Hocko <mhocko@suse.com>
> > 
> > > ---
> > >   Documentation/admin-guide/cgroup-v2.rst |   17 ++++++++---------
> > >   1 file changed, 8 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > > index bcc80269bb6a..1bb9a8f6ebe1 100644
> > > --- a/Documentation/admin-guide/cgroup-v2.rst
> > > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > > @@ -1172,6 +1172,13 @@ PAGE_SIZE multiple when read back.
> > >   	Under certain circumstances, the usage may go over the limit
> > >   	temporarily.
> > > +	In default configuration regular 0-order allocation always
> > > +	succeed unless OOM killer choose current task as a victim.
> > > +
> > > +	Some kinds of allocations don't invoke the OOM killer.
> > > +	Caller could retry them differently, return into userspace
> > > +	as -ENOMEM or silently ignore in cases like disk readahead.
> > 
> > I would probably add -EFAULT, but the fewer error codes we document the
> > better.
> 
> Yeah, EFAULT was the most obscure result of a memory shortage.
> Fortunately, with the new behaviour this shouldn't happen a lot.

Yes, it shouldn't really happen very often. gup was the most prominent
example but this one should be taken care of by triggering the OOM
killer. But I wouldn't bet my hat there are no potential cases anymore.

> Actually, where is it still possible? THP always falls back to 0-order.
> I mean EFAULT could appear inside the kernel only if the task is killed,
> so nobody would see it.

Yes, fatal_signal_pending paths are OK. And no, I do not have any specific
examples. But as you've said, EFAULT was a real surprise, so I thought it
would be nice to still keep a reference to it around, even if it is
unlikely.

-- 
Michal Hocko
SUSE Labs
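The outcomes discussed in this thread can be summarized as a toy decision table. This is only a sketch of the documented behaviour, not the kernel's charge path: the function, its boolean parameters, and the outcome names are all hypothetical, and real kernels take more conditions into account (including the rare -EFAULT cases mentioned above).

```python
from enum import Enum


class Outcome(Enum):
    # OOM killer is invoked and the allocation keeps retrying until success.
    OOM_KILL_THEN_RETRY = "oom kill, then retry"
    # OOM killer picked the allocating task itself, so the allocation fails.
    CURRENT_TASK_KILLED = "current task chosen as victim"
    # OOM killer is not an option: the caller may retry differently,
    # return -ENOMEM to userspace, or silently ignore (e.g. readahead).
    FAIL_WITHOUT_OOM = "fail without invoking OOM killer"


def on_limit_hit(high_order, no_retry, current_is_victim):
    """Toy model: return (outcome, oom_event_raised) when memory.max is hit.

    high_order        -- allocation is high-order (not a regular 0-order one)
    no_retry          -- caller asked not to retry attempts
    current_is_victim -- the OOM killer would choose the allocating task
    """
    if high_order or no_retry:
        # Per the updated doc text, the 'oom' event is not raised here.
        return Outcome.FAIL_WITHOUT_OOM, False
    if current_is_victim:
        return Outcome.CURRENT_TASK_KILLED, True
    return Outcome.OOM_KILL_THEN_RETRY, True


# A regular 0-order allocation with another task available as the victim.
print(on_limit_hit(high_order=False, no_retry=False, current_is_victim=False))
```

The model deliberately keeps "high-order" as a boolean rather than a numeric threshold, since the documentation text does not name one.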


