All of lore.kernel.org
 help / color / mirror / Atom feed
From: Abel Wu <wuyun.abel@bytedance.com>
To: Michal Hocko <mhocko@suse.com>, Gang Li <ligang.bdlg@bytedance.com>
Cc: akpm@linux-foundation.org, surenb@google.com, hca@linux.ibm.com,
	gor@linux.ibm.com, agordeev@linux.ibm.com,
	borntraeger@linux.ibm.com, svens@linux.ibm.com,
	viro@zeniv.linux.org.uk, ebiederm@xmission.com,
	keescook@chromium.org, rostedt@goodmis.org, mingo@redhat.com,
	peterz@infradead.org, acme@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	namhyung@kernel.org, david@redhat.com, imbrenda@linux.ibm.com,
	adobriyan@gmail.com, yang.yang29@zte.com.cn, brauner@kernel.org,
	stephen.s.brennan@oracle.com, zhengqi.arch@bytedance.com,
	haolee.swjtu@gmail.com, xu.xin16@zte.com.cn,
	Liam.Howlett@oracle.com, ohoono.kwon@samsung.com,
	peterx@redhat.com, arnd@arndb.de, shy828301@gmail.com,
	alex.sierra@amd.com, xianting.tian@linux.alibaba.com,
	willy@infradead.org, ccross@google.com, vbabka@suse.cz,
	sujiaxun@uniontech.com, sfr@canb.auug.org.au,
	vasily.averin@linux.dev, mgorman@suse.de, vvghjk1234@gmail.com,
	tglx@linutronix.de, luto@kernel.org, bigeasy@linutronix.de,
	fenghua.yu@intel.com, linux-s390@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-perf-users@vger.kernel.org,
	hezhongkun.hzk@bytedance.com
Subject: Re: [PATCH v2 0/5] mm, oom: Introduce per numa node oom for CONSTRAINT_{MEMORY_POLICY,CPUSET}
Date: Tue, 12 Jul 2022 19:12:18 +0800	[thread overview]
Message-ID: <41ae31a7-6998-be88-858c-744e31a76b2a@bytedance.com> (raw)
In-Reply-To: <YsfwyTHE/5py1kHC@dhcp22.suse.cz>

Hi Michal,

On 7/8/22 4:54 PM, Michal Hocko Wrote:
> On Fri 08-07-22 16:21:24, Gang Li wrote:
>> TLDR
>> ----
>> If a mempolicy or cpuset is in effect, out_of_memory() will select victim
>> on specific node to kill. So that kernel can avoid accidental killing on
>> NUMA system.
> 
> We have discussed this in your previous posting and an alternative
> proposal was to use cpusets to partition NUMA aware workloads and
> enhance the oom killer to be cpuset aware instead which should be a much
> easier solution.
> 
>> Problem
>> -------
>> Before this patch series, oom will only kill the process with the highest
>> memory usage by selecting process with the highest oom_badness on the
>> entire system.
>>
>> This works fine on UMA system, but may have some accidental killing on NUMA
>> system.
>>
>> As shown below, if process c.out is bind to Node1 and keep allocating pages
>> from Node1, a.out will be killed first. But killing a.out did't free any
>> mem on Node1, so c.out will be killed then.
>>
>> A lot of AMD machines have 8 numa nodes. In these systems, there is a
>> greater chance of triggering this problem.
> 
> Please be more specific about existing usecases which suffer from the
> current OOM handling limitations.

I was just going through the mail list and happen to see this. There
is another usecase for us about per-numa memory usage.

Say we have several important latency-critical services sitting inside
different NUMA nodes without intersection. The need for memory of these
LC services varies, so the free memory of each node is also different.
Then we launch several background containers without cpuset constrains
to eat the left resources. Now the problem is that there doesn't seem
like a proper memory policy available to balance the usage between the
nodes, which could lead to memory-heavy LC services suffer from high
memory pressure and fails to meet the SLOs.

It's quite appreciated if you can shed some light on this!

Thanks & BR,
Abel

  parent reply	other threads:[~2022-07-12 11:13 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-08  8:21 [PATCH v2 0/5] mm, oom: Introduce per numa node oom for CONSTRAINT_{MEMORY_POLICY,CPUSET} Gang Li
2022-07-08  8:21 ` [PATCH v2 1/5] mm: add a new parameter `node` to `get/add/inc/dec_mm_counter` Gang Li
2022-07-12  6:33   ` [mm] c20f7bacef: WARNING:possible_circular_locking_dependency_detected kernel test robot
2022-07-12  6:33     ` kernel test robot
2022-07-08  8:21 ` [PATCH v2 2/5] mm: add numa_count field for rss_stat Gang Li
2022-07-08 12:22   ` kernel test robot
2022-07-08  8:21 ` [PATCH v2 3/5] mm: add numa fields for tracepoint rss_stat Gang Li
2022-07-08 17:31   ` Steven Rostedt
2022-07-08  8:21 ` [PATCH v2 4/5] mm: enable per numa node rss_stat count Gang Li
2022-07-08  8:21 ` [PATCH v2 5/5] mm, oom: enable per numa node oom for CONSTRAINT_{MEMORY_POLICY,CPUSET} Gang Li
2022-07-08  8:54 ` [PATCH v2 0/5] mm, oom: Introduce " Michal Hocko
2022-07-08  9:25   ` Gang Li
2022-07-08  9:37     ` Michal Hocko
2022-07-12 11:12   ` Abel Wu [this message]
2022-07-12 13:35     ` Michal Hocko
2022-07-12 15:00       ` Abel Wu
2022-07-18 12:11         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=41ae31a7-6998-be88-858c-744e31a76b2a@bytedance.com \
    --to=wuyun.abel@bytedance.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=acme@kernel.org \
    --cc=adobriyan@gmail.com \
    --cc=agordeev@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.sierra@amd.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=arnd@arndb.de \
    --cc=bigeasy@linutronix.de \
    --cc=borntraeger@linux.ibm.com \
    --cc=brauner@kernel.org \
    --cc=ccross@google.com \
    --cc=david@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=fenghua.yu@intel.com \
    --cc=gor@linux.ibm.com \
    --cc=haolee.swjtu@gmail.com \
    --cc=hca@linux.ibm.com \
    --cc=hezhongkun.hzk@bytedance.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=jolsa@kernel.org \
    --cc=keescook@chromium.org \
    --cc=ligang.bdlg@bytedance.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=ohoono.kwon@samsung.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sfr@canb.auug.org.au \
    --cc=shy828301@gmail.com \
    --cc=stephen.s.brennan@oracle.com \
    --cc=sujiaxun@uniontech.com \
    --cc=surenb@google.com \
    --cc=svens@linux.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=vasily.averin@linux.dev \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=vvghjk1234@gmail.com \
    --cc=willy@infradead.org \
    --cc=xianting.tian@linux.alibaba.com \
    --cc=xu.xin16@zte.com.cn \
    --cc=yang.yang29@zte.com.cn \
    --cc=zhengqi.arch@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.