From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755168AbdESNCt (ORCPT ); Fri, 19 May 2017 09:02:49 -0400
Received: from www262.sakura.ne.jp ([202.181.97.72]:30667 "EHLO
	www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751598AbdESNCq (ORCPT );
	Fri, 19 May 2017 09:02:46 -0400
To: mhocko@kernel.org, akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, guro@fb.com, vdavydov.dev@gmail.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, mhocko@suse.com
Subject: Re: [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF
From: Tetsuo Handa
References: <20170519112604.29090-1-mhocko@kernel.org>
	<20170519112604.29090-3-mhocko@kernel.org>
In-Reply-To: <20170519112604.29090-3-mhocko@kernel.org>
Message-Id: <201705192202.EDD30719.OSLJHFMOFtFVOQ@I-love.SAKURA.ne.jp>
X-Mailer: Winbiff [Version 2.51 PL2]
X-Accept-Language: ja,en,zh
Date: Fri, 19 May 2017 22:02:44 +0900
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Michal Hocko wrote:
> Any allocation failure during the #PF path will return with VM_FAULT_OOM
> which in turn results in pagefault_out_of_memory. This can happen for
> 2 different reasons. a) Memcg is out of memory and we rely on
> mem_cgroup_oom_synchronize to perform the memcg OOM handling or b)
> normal allocation fails.
>
> The latter is quite problematic because allocation paths already trigger
> out_of_memory and the page allocator tries really hard to not fail

We made many memory allocation requests in the page fault path (e.g. XFS)
use __GFP_FS some time ago, didn't we? But if I recall correctly (I couldn't
find the message), there are some allocation requests in the page fault path
which cannot use __GFP_FS. Then, not all allocation requests can call
oom_kill_process(), and reaching pagefault_out_of_memory() will be inevitable.

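To make that reasoning concrete, here is a toy userspace model. The flag
value MODEL_GFP_FS and both functions are hypothetical names invented for
illustration, not the kernel's definitions. It only illustrates the claim
that a request without the FS bit cannot reach the OOM killer from the
allocator, so its failure propagates back to the fault handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_GFP_FS 0x1u /* hypothetical stand-in for __GFP_FS */

enum fault_result_model { FAULT_OK, FAULT_OOM };

/* In this model only requests carrying the FS bit may invoke the OOM killer. */
bool may_oom_kill_model(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_FS) != 0;
}

/*
 * Toy fault-path allocation. `have_memory` simulates whether the request
 * can be satisfied; `oom_kills` counts OOM killer invocations made from
 * the allocator itself.
 */
enum fault_result_model fault_alloc_model(unsigned int gfp_mask,
					  bool have_memory, int *oom_kills)
{
	assert(oom_kills != NULL);
	if (have_memory)
		return FAULT_OK;
	if (may_oom_kill_model(gfp_mask)) {
		(*oom_kills)++;
		/* Model assumption: the kill freed memory and the retry succeeded. */
		return FAULT_OK;
	}
	/* No OOM kill was possible, so the failure reaches the fault handler. */
	return FAULT_OOM;
}
```

In this model, fault_alloc_model(0u, false, &kills) returns FAULT_OOM and
leaves the counter untouched, which corresponds to the case where
pagefault_out_of_memory() is still reached.
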
> allocations. Anyway, if the OOM killer has been already invoked there
> is no reason to invoke it again from the #PF path. Especially when the
> OOM condition might be gone by that time and we have no way to find out
> other than allocate.
>
> Moreover if the allocation failed and the OOM killer hasn't been
> invoked then we are unlikely to do the right thing from the #PF context
> because we have already lost the allocation context and restrictions and
> therefore might oom kill a task from a different NUMA domain.

If we carry a flag via task_struct that indicates whether it is a memory
allocation request from a page fault and allocation failure is not
acceptable, we can call out_of_memory() from the page allocator path.

> -	if (!mutex_trylock(&oom_lock))
> +	if (fatal_signal_pending)

fatal_signal_pending(current)

By the way, can a page fault occur after reaching do_exit()? When a thread
has reached do_exit(), fatal_signal_pending(current) becomes false, doesn't it?
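
The task_struct flag idea above can be sketched in userspace C. This is
only a toy model under stated assumptions: struct task_model, the
in_page_fault_nofail field, alloc_page_model() and out_of_memory_model()
are hypothetical names invented for illustration, not kernel APIs, and no
such field is claimed to exist in task_struct. It shows the point of the
suggestion: OOM handling runs in the allocator while the allocation
context (here a node mask) is still known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; the flag is the proposed addition. */
struct task_model {
	bool in_page_fault_nofail; /* set by the fault path: failure not acceptable */
};

/* Allocation context that the #PF path would otherwise have lost. */
struct alloc_ctx_model {
	unsigned int nodemask; /* NUMA nodes the request may use */
};

unsigned int last_oom_nodemask; /* context seen by the last OOM call */
int oom_calls;                  /* how many times OOM handling ran */

/* Toy OOM handler: records the context it was given. */
void out_of_memory_model(const struct alloc_ctx_model *ctx)
{
	assert(ctx != NULL);
	last_oom_nodemask = ctx->nodemask;
	oom_calls++;
}

/*
 * Toy allocator. `have_memory` simulates whether the request can be
 * satisfied. Returns true on success. On failure, if the task is flagged,
 * OOM handling is invoked here, with the allocation context still at hand.
 */
bool alloc_page_model(const struct task_model *task,
		      const struct alloc_ctx_model *ctx, bool have_memory)
{
	assert(task != NULL && ctx != NULL);
	if (have_memory)
		return true;
	if (task->in_page_fault_nofail)
		out_of_memory_model(ctx);
	return false;
}
```

A fault handler in this model would set in_page_fault_nofail before
allocating and clear it afterwards; an unflagged task that fails to
allocate simply gets false back and no OOM handling runs.
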