From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michal Hocko
Subject: Re: [RFC 1/6] mm, page_alloc: fix more premature OOM due to race with cpuset update
Date: Thu, 18 May 2017 19:24:24 +0200
Message-ID: <20170518172424.GB30148@dhcp22.suse.cz>
References: <20170517092042.GH18247@dhcp22.suse.cz> <20170517140501.GM18247@dhcp22.suse.cz> <20170517145645.GO18247@dhcp22.suse.cz> <20170518090846.GD25462@dhcp22.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Christoph Lameter
Cc: Vlastimil Babka, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Li Zefan, Mel Gorman, David Rientjes, Hugh Dickins, Andrea Arcangeli, Anshuman Khandual, "Kirill A. Shutemov", linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-api@vger.kernel.org

On Thu 18-05-17 11:57:55, Christoph Lameter wrote:
> On Thu, 18 May 2017, Michal Hocko wrote:
>
> > > Nope. The OOM in a cpuset gets the process doing the alloc killed. Or has
> > > that changed?
>
> !!!!!
>
> > >
> > > At this point you have messed up royally and nothing is going to rescue
> > > you anyways. OOM or not does not matter anymore. The app will fail.
> >
> > Not really. If you can trick the system to _think_ that the intersection
> > between mempolicy and the cpuset is empty then the OOM killer might
> > trigger an innocent task rather than the one which tricked it into that
> > situation.
>
> See above. OOM Kill in a cpuset does not kill an innocent task but a task
> that does an allocation in that specific context meaning a task in that
> cpuset that also has a memory policy.

No, the oom killer will choose the largest task in the specific NUMA domain.
If you just fail such an allocation then a page fault would get
VM_FAULT_OOM and pagefault_out_of_memory would kill a task regardless of
the cpusets.

> Regardless of that the point earlier was that the moving logic can avoid
> creating temporary situations of empty sets of nodes by analysing the
> memory policies etc and only performing moves when doing so is safe.

How are you going to do that in a raceless way? Moreover the whole
discussion is about _failing_ allocations on an empty cpuset and
mempolicy intersection.
-- 
Michal Hocko
SUSE Labs