From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932419AbdIGQ1f (ORCPT <rfc822;w@1wt.eu>);
        Thu, 7 Sep 2017 12:27:35 -0400
Received: from resqmta-ch2-02v.sys.comcast.net ([69.252.207.34]:58738 "EHLO
        resqmta-ch2-02v.sys.comcast.net" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S932105AbdIGQ1d (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 7 Sep 2017 12:27:33 -0400
Date: Thu, 7 Sep 2017 11:27:30 -0500 (CDT)
From: Christopher Lameter <cl@linux.com>
X-X-Sender: cl@nuc-kabylake
To: Michal Hocko <mhocko@kernel.org>
cc: Johannes Weiner <hannes@cmpxchg.org>, Roman Gushchin <guro@fb.com>,
        linux-mm@kvack.org, Vladimir Davydov <vdavydov.dev@gmail.com>,
        Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
        David Rientjes <rientjes@google.com>,
        Andrew Morton <akpm@linux-foundation.org>, Tejun Heo <tj@kernel.org>,
        kernel-team@fb.com, cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
        linux-kernel@vger.kernel.org
Subject: Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware
 OOM killer
In-Reply-To: <20170906082859.qlqenftxuib64j35@dhcp22.suse.cz>
Message-ID: <alpine.DEB.2.20.1709071122360.20082@nuc-kabylake>
References: <20170904142108.7165-1-guro@fb.com> <20170904142108.7165-6-guro@fb.com> <20170905134412.qdvqcfhvbdzmarna@dhcp22.suse.cz> <20170905215344.GA27427@cmpxchg.org> <20170906082859.qlqenftxuib64j35@dhcp22.suse.cz>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 6 Sep 2017, Michal Hocko wrote:

> I am not sure this is how things evolved actually. This is way before
> my time so my git log interpretation might be imprecise. We do have
> oom_badness heuristic since out_of_memory has been introduced and
> oom_kill_allocating_task has been introduced much later because of large
> boxes with zillions of tasks (SGI I suspect) which took too long to
> select a victim so David has added this heuristic.

Nope. The logic was required for tasks that run out of memory when the
restriction on the allocation did not allow the use of all of memory.
cpuset restrictions and memory policy restrictions were the prime
considerations at the time.

It has *nothing* to do with zillions of tasks. It's amusing that the SGI
ghost is still haunting the discussion here. The company finally died a
couple of years ago (ok, somehow HP has an "SGI" brand now, I believe). But
there are multiple companies that have large NUMA configurations and they
all have configurations where they want to restrict allocations of a
process to a subset of system memory. This is even more important now that
we get new forms of memory (NVDIMM, PCI-E device memory, etc.). You need to
figure out what to do with allocations that fail because the *allowed*
memory pools are empty.