From: Suren Baghdasaryan <surenb@google.com>
To: Gang Li <ligang.bdlg@bytedance.com>, Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <songmuchun@bytedance.com>,
	hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com,
	borntraeger@linux.ibm.com, svens@linux.ibm.com,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	acme@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	namhyung@kernel.org, David Hildenbrand <david@redhat.com>,
	imbrenda@linux.ibm.com, apopple@nvidia.com,
	Alexey Dobriyan <adobriyan@gmail.com>,
	stephen.s.brennan@oracle.com, ohoono.kwon@samsung.com,
	haolee.swjtu@gmail.com, Kalesh Singh <kaleshsingh@google.com>,
	zhengqi.arch@bytedance.com, Peter Xu <peterx@redhat.com>,
	Yang Shi <shy828301@gmail.com>, Colin Cross <ccross@google.com>,
	vincent.whitchurch@axis.com, Thomas Gleixner <tglx@linutronix.de>,
	bigeasy@linutronix.de, fenghua.yu@intel.com,
	linux-s390@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 0/5 v1] mm, oom: Introduce per numa node oom for CONSTRAINT_MEMORY_POLICY
Date: Thu, 12 May 2022 15:31:34 -0700
Message-ID: <CAJuCfpGDamD6P6Tgz=Y59fpj1NgFL0wjKe+y42-mCQ2x-asx3A@mail.gmail.com>
In-Reply-To: <20220512044634.63586-1-ligang.bdlg@bytedance.com>

On Wed, May 11, 2022 at 9:47 PM Gang Li <ligang.bdlg@bytedance.com> wrote:
>
> TLDR:
> If a mempolicy is in effect (oc->constraint == CONSTRAINT_MEMORY_POLICY), out_of_memory() will
> select a victim on the specific node to kill, so that the kernel can avoid accidental kills on NUMA systems.
>
> Problem:
> Before this patch series, the OOM killer only kills the process with the highest memory usage,
> selecting the process with the highest oom_badness across the entire system.
>
> This works fine on UMA systems, but may cause accidental kills on NUMA systems.
>
> As shown below, if process c.out is bound to Node1 and keeps allocating pages from Node1,
> a.out will be killed first. But killing a.out doesn't free any memory on Node1, so c.out
> will be killed next.
>
> A lot of our AMD machines have 8 NUMA nodes. On these systems, there is a greater chance
> of triggering this problem.
>
> OOM before patches:
> ```
> Per-node process memory usage (in MBs)
> PID             Node 0        Node 1      Total
> ----------- ---------- ------------- ----------
> 3095 a.out     3073.34          0.11    3073.45 (killed first: maximum memory consumption)
> 3199 b.out      501.35       1500.00    2001.35
> 3805 c.out        1.52 (grow)2248.00    2249.52 (killed next: Node1 is full)
> ----------- ---------- ------------- ----------
> Total          3576.21       3748.11    7324.31
> ```
>
> Solution:
> We store per-node RSS in mm_rss_stat for each process.
>
> If a page allocation with a mempolicy in effect (oc->constraint == CONSTRAINT_MEMORY_POLICY)
> triggers an OOM, we calculate oom_badness using the RSS counter for the corresponding node, then
> select the process with the highest oom_badness on that node to kill.
>
> OOM after patches:
> ```
> Per-node process memory usage (in MBs)
> PID             Node 0        Node 1     Total
> ----------- ---------- ------------- ----------
> 3095 a.out     3073.34          0.11    3073.45
> 3199 b.out      501.35       1500.00    2001.35
> 3805 c.out        1.52 (grow)2248.00    2249.52(killed)
> ----------- ---------- ------------- ----------
> Total          3576.21       3748.11    7324.31
> ```
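
For anyone skimming the thread, the difference between the two tables
quoted above can be reproduced with a toy userspace model. This is only a
sketch: the names `Proc` and `pick_victim` are made up for illustration
and are not from the patches, and the score here is plain RSS, whereas
the real oom_badness() also accounts for swap, page tables and
oom_score_adj.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    name: str
    rss_mb: tuple  # per-node RSS in MB, indexed by node id


# Figures copied from the tables in the cover letter.
PROCS = [
    Proc(3095, "a.out", (3073.34, 0.11)),
    Proc(3199, "b.out", (501.35, 1500.00)),
    Proc(3805, "c.out", (1.52, 2248.00)),
]


def pick_victim(procs, node=None):
    """Return the process with the highest (simplified) badness score.

    node=None scores by total RSS, i.e. system-wide selection.
    node=N scores by RSS on node N only, i.e. per-node selection.
    """
    def score(p):
        return sum(p.rss_mb) if node is None else p.rss_mb[node]

    return max(procs, key=score)


print(pick_victim(PROCS).name)          # system-wide selection
print(pick_victim(PROCS, node=1).name)  # selection constrained to Node 1
```

With the quoted figures, system-wide selection picks a.out, which frees
nothing on Node 1, while selection restricted to Node 1 picks c.out
directly.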

You included lots of people but missed Michal Hocko. CC'ing him;
please include him in future postings.

>
> Gang Li (5):
>   mm: add a new parameter `node` to `get/add/inc/dec_mm_counter`
>   mm: add numa_count field for rss_stat
>   mm: add numa fields for tracepoint rss_stat
>   mm: enable per numa node rss_stat count
>   mm, oom: enable per numa node oom for CONSTRAINT_MEMORY_POLICY
>
>  arch/s390/mm/pgtable.c        |   4 +-
>  fs/exec.c                     |   2 +-
>  fs/proc/base.c                |   6 +-
>  fs/proc/task_mmu.c            |  14 ++--
>  include/linux/mm.h            |  59 ++++++++++++-----
>  include/linux/mm_types_task.h |  16 +++++
>  include/linux/oom.h           |   2 +-
>  include/trace/events/kmem.h   |  27 ++++++--
>  kernel/events/uprobes.c       |   6 +-
>  kernel/fork.c                 |  70 +++++++++++++++++++-
>  mm/huge_memory.c              |  13 ++--
>  mm/khugepaged.c               |   4 +-
>  mm/ksm.c                      |   2 +-
>  mm/madvise.c                  |   2 +-
>  mm/memory.c                   | 116 ++++++++++++++++++++++++----------
>  mm/migrate.c                  |   2 +
>  mm/migrate_device.c           |   2 +-
>  mm/oom_kill.c                 |  59 ++++++++++++-----
>  mm/rmap.c                     |  16 ++---
>  mm/swapfile.c                 |   4 +-
>  mm/userfaultfd.c              |   2 +-
>  21 files changed, 317 insertions(+), 111 deletions(-)
>
> --
> 2.20.1
>


Thread overview: 12+ messages
2022-05-12  4:46 [PATCH 0/5 v1] mm, oom: Introduce per numa node oom for CONSTRAINT_MEMORY_POLICY Gang Li
2022-05-12  4:46 ` [PATCH 1/5 v1] mm: add a new parameter `node` to `get/add/inc/dec_mm_counter` Gang Li
2022-05-12  4:46 ` [PATCH 2/5 v1] mm: add numa_count field for rss_stat Gang Li
2022-05-12 16:31   ` kernel test robot
2022-05-12  4:46 ` [PATCH 3/5 v1] mm: add numa fields for tracepoint rss_stat Gang Li
2022-05-12  4:46 ` [PATCH 4/5 v1] mm: enable per numa node rss_stat count Gang Li
2022-05-17  2:28   ` [mm] c9dc81ef10: BUG:Bad_rss-counter_state_mm:#node:#val kernel test robot
2022-05-17  2:28     ` kernel test robot
2022-05-12  4:46 ` [PATCH 5/5 v1] mm, oom: enable per numa node oom for CONSTRAINT_MEMORY_POLICY Gang Li
2022-05-12 22:31 ` Suren Baghdasaryan [this message]
2022-05-16 16:44 ` [PATCH 0/5 v1] mm, oom: Introduce " Michal Hocko
2022-06-15 10:13   ` Gang Li
