From: Vasily Averin <vvs@virtuozzo.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>, Uladzislau Rezki <urezki@gmail.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Shakeel Butt <shakeelb@google.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel@openvz.org
Subject: Re: [PATCH memcg 0/1] false global OOM triggered by memcg-limited task
Date: Tue, 19 Oct 2021 22:09:19 +0300
Message-ID: <3c76e2d7-e545-ef34-b2c3-a5f63b1eff51@virtuozzo.com>
In-Reply-To: <YW7SfkZR/ZsabkXV@dhcp22.suse.cz>

On 19.10.2021 17:13, Michal Hocko wrote:
> On Tue 19-10-21 16:26:50, Vasily Averin wrote:
>> On 19.10.2021 15:04, Michal Hocko wrote:
>>> On Tue 19-10-21 13:54:42, Michal Hocko wrote:
>>>> On Tue 19-10-21 13:30:06, Vasily Averin wrote:
>>>>> On 19.10.2021 11:49, Michal Hocko wrote:
>>>>>> On Tue 19-10-21 09:30:18, Vasily Averin wrote:
>>>>>> [...]
>>>>>>> With my patch ("memcg: prohibit unconditional exceeding the limit of dying tasks") try_charge_memcg() can fail:
>>>>>>> a) due to fatal signal
>>>>>>> b) when mem_cgroup_oom -> mem_cgroup_out_of_memory -> out_of_memory() returns false (when select_bad_process() found nothing)
>>>>>>>
>>>>>>> To handle a) we can follow your suggestion and skip execution of out_of_memory() in pagefault_out_of_memory()
>>>>>>> To handle b) we can go to retry: if mem_cgroup_oom() returns OOM_FAILED.
>>>>>
>>>>>> How is b) possible without current being killed? Do we allow remote
>>>>>> charging?
>>>>>
>>>>> out_of_memory for memcg_oom
>>>>>  select_bad_process
>>>>>   mem_cgroup_scan_tasks
>>>>>    oom_evaluate_task
>>>>>     oom_badness
>>>>>
>>>>>         /*
>>>>>          * Do not even consider tasks which are explicitly marked oom
>>>>>          * unkillable or have been already oom reaped or the are in
>>>>>          * the middle of vfork
>>>>>          */
>>>>>         adj = (long)p->signal->oom_score_adj;
>>>>>         if (adj == OOM_SCORE_ADJ_MIN ||
>>>>>                         test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
>>>>>                         in_vfork(p)) {
>>>>>                 task_unlock(p);
>>>>>                 return LONG_MIN;
>>>>>         }
>>>>>
>>>>> This time we handle a userspace page fault, so we cannot be a kernel thread,
>>>>> and cannot be in_vfork().
>>>>> However, the task can be marked as oom unkillable,
>>>>> i.e. have p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN
>>>>
>>>> You are right. I am not sure there is a way out of this though. The task
>>>> can only retry for ever in this case. There is nothing actionable here.
>>>> We cannot kill the task and there is no other way to release the memory.
>>>
>>> Btw. don't we force the charge in that case?
>>
>> We should force the charge for allocations made from inside the page fault handler,
>> to prevent an endless cycle of retried page faults.
>> However, we should not do it for allocations from task context,
>> to prevent memcg-limited vmalloc-eaters from consuming all host memory.
> 
> I don't see a big difference between those two. Because the #PF could
> result in the very same situation, depleting all the memory by
> overcharging. A different behavior just leads to confusion and
> unexpected behavior. E.g. in the past we only triggered memcg OOM killer
> from the #PF path and failed the charge otherwise. That is something
> different but it shows problems we haven't anticipated and had user
> visible problems. See 29ef680ae7c2 ("memcg, oom: move out_of_memory back
> to the charge path").

In this case I think we should fail this allocation.
It is better not to allow overcharge, neither in #PF nor in regular allocations.

However, this failure will trigger a false global OOM in pagefault_out_of_memory(),
and we need to find some way to prevent it.
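
The case quoted above, where select_bad_process() finds nothing because
every task in the memcg is oom unkillable, can be sketched in user space.
This is only a simplified model, not kernel code: struct model_task, the
model_* functions and the enum are hypothetical names, the badness score
is reduced to an RSS page count, and only the eligibility check mirrors
the quoted oom_badness() fragment.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MIN (-1000)   /* same value as in uapi/linux/oom.h */

/* Hypothetical, heavily simplified stand-in for a task. */
struct model_task {
	int  oom_score_adj;
	bool oom_skip;      /* models MMF_OOM_SKIP being set on the mm */
	bool in_vfork;
	long rss_pages;     /* assumption: the score is just the RSS */
};

/* Models the eligibility check from the quoted oom_badness() fragment. */
static long model_oom_badness(const struct model_task *p)
{
	if (p->oom_score_adj == OOM_SCORE_ADJ_MIN || p->oom_skip || p->in_vfork)
		return LONG_MIN;
	return p->rss_pages;
}

/* Models select_bad_process(): NULL means no eligible victim was found. */
static const struct model_task *
model_select_bad_process(const struct model_task *tasks, size_t n)
{
	const struct model_task *chosen = NULL;
	long best = LONG_MIN;

	for (size_t i = 0; i < n; i++) {
		long points = model_oom_badness(&tasks[i]);

		if (points == LONG_MIN)
			continue;   /* task is not a candidate at all */
		if (!chosen || points > best) {
			chosen = &tasks[i];
			best = points;
		}
	}
	return chosen;
}

/* Hypothetical outcome of a charge that hit the memcg limit. */
enum model_charge_result { MODEL_CHARGE_RETRY, MODEL_CHARGE_FAIL };

/*
 * Models the position taken in this reply: retry if a victim was found,
 * otherwise fail the charge instead of forcing it over the limit.
 */
static enum model_charge_result
model_charge_over_limit(const struct model_task *tasks, size_t n)
{
	return model_select_bad_process(tasks, n) ? MODEL_CHARGE_RETRY
						  : MODEL_CHARGE_FAIL;
}
```

With a memcg that holds only tasks whose oom_score_adj is
OOM_SCORE_ADJ_MIN, model_charge_over_limit() returns MODEL_CHARGE_FAIL,
which is the failure that then reaches pagefault_out_of_memory().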

Thank you,
	Vasily Averin
