From: Shakeel Butt <shakeelb@google.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>, Johannes Weiner <hannes@cmpxchg.org>,
	Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Cgroups <cgroups@vger.kernel.org>,
	David Rientjes <rientjes@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Greg Thelen <gthelen@google.com>,
	Dragos Sbirlea <dragoss@google.com>,
	Priya Duraisamy <padmapriyad@google.com>
Subject: Re: [RFC] memory reserve for userspace oom-killer
Date: Wed, 21 Apr 2021 07:13:45 -0700
Message-ID: <CALvZod6oCBB6tDh5wABSwdHfcDzLX7S7cOTLp_4Qk4DCi50X_A@mail.gmail.com>
In-Reply-To: <YH/S2dVxk2le8SMw@dhcp22.suse.cz>

On Wed, Apr 21, 2021 at 12:23 AM Michal Hocko <mhocko@suse.com> wrote:
>
[...]
> > In our observation the global reclaim is very non-deterministic at the
> > tail and dramatically impacts the reliability of the system. We are
> > looking for a solution which is independent of the global reclaim.
>
> I believe it is worth pursuing a solution that would make the memory
> reclaim more predictable. I have seen direct reclaim memory throttling
> in the past. For some reason which I haven't tried to examine this has
> become less of a problem with newer kernels. Maybe the memory access
> patterns have changed or those problems got replaced by other issues but
> an excessive throttling is definitely something that we want to address
> rather than work around by some user visible APIs.
>

I agree we want to address the excessive throttling, but that has to be
done for everyone on the machine and, most importantly, it is a moving
target. The reclaim code continues to evolve and, in addition, it has
callbacks into a diverse set of subsystems.

The user-visible API is for one specific use case, i.e. the oom-killer,
which will indirectly help in reducing the excessive throttling.

[...]
> > So, the suggestion is to have a per-task flag to (1) indicate to not
> > throttle and (2) fail allocations easily on significant memory
> > pressure.
> >
> > For (1), the challenge I see is that there are a lot of places in the
> > reclaim code paths where a task can get throttled. There are
> > filesystems that block/throttle in slab shrinking. Any process can get
> > blocked on an unrelated page or inode writeback within reclaim.
> >
> > For (2), I am not sure how to deterministically define "significant
> > memory pressure". One idea is to follow the __GFP_NORETRY semantics
> > and along with (1) the userspace oom-killer will see ENOMEM more
> > reliably than getting stuck in the reclaim.
>
> Some of the interfaces (e.g. seq_file uses GFP_KERNEL reclaim strength)
> could be more relaxed and rather fail than OOM kill but wouldn't your
> OOM handler be effectively dysfunctional when not able to collect data
> to make a decision?
>

Yes it would be. Roman is suggesting to have a precomputed kill-list
(pidfds ready to send SIGKILL) and whenever the oom-killer gets ENOMEM,
it would go with the kill-list. We are still weighing the ways and
side effects of preferentially returning ENOMEM in the slowpath for the
oom-killer, as well as the complexity of maintaining the kill-list and
keeping it up to date.

thanks,
Shakeel
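
For illustration only, the precomputed kill-list idea mentioned above could
be sketched in userspace roughly as follows. This is a minimal sketch under
stated assumptions, not code from this thread: the names struct kill_list,
kill_list_add(), kill_list_fire(), handle_collect_result() and KILL_LIST_MAX
are hypothetical, and the raw syscall numbers are fallbacks for older
headers. The pidfds are opened ahead of time into a fixed-size array, so the
fallback path does no allocation in userspace. The sketch says nothing about
whether the kernel side of the signal path avoids reclaim.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback syscall numbers for older headers (assumption: an architecture
 * using the common numbering, e.g. x86_64 or arm64). */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif

#define KILL_LIST_MAX 16 /* hypothetical fixed capacity */

/* Hypothetical kill-list: pidfds opened in advance, stored inline. */
struct kill_list {
	int pidfds[KILL_LIST_MAX];
	size_t count;
};

/* Open a pidfd for a victim while memory is still available. */
static int kill_list_add(struct kill_list *kl, pid_t pid)
{
	int fd;

	assert(kl != NULL);
	if (kl->count >= KILL_LIST_MAX)
		return -1;
	fd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (fd < 0)
		return -1;
	kl->pidfds[kl->count++] = fd;
	return 0;
}

/* Send SIGKILL through every stored pidfd, close the descriptors and
 * empty the list. Returns the number of signals sent successfully. */
static size_t kill_list_fire(struct kill_list *kl)
{
	size_t sent = 0;

	assert(kl != NULL);
	for (size_t i = 0; i < kl->count; i++) {
		if (syscall(SYS_pidfd_send_signal, kl->pidfds[i],
			    SIGKILL, NULL, 0) == 0)
			sent++;
		close(kl->pidfds[i]);
	}
	kl->count = 0;
	return sent;
}

/* Hypothetical policy hook: if data collection failed with ENOMEM,
 * skip the normal decision logic and use the precomputed list. */
static size_t handle_collect_result(struct kill_list *kl, int err)
{
	return err == ENOMEM ? kill_list_fire(kl) : 0;
}
```

In such a scheme the oom-killer would refresh the list periodically while
memory is plentiful, which is where the maintenance cost mentioned above
comes from; a stale list may name victims that are no longer the best choice.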