From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753341AbbJFWtZ (ORCPT ); Tue, 6 Oct 2015 18:49:25 -0400
Received: from mail-pa0-f54.google.com ([209.85.220.54]:32841 "EHLO
	mail-pa0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752736AbbJFWtY (ORCPT );
	Tue, 6 Oct 2015 18:49:24 -0400
Date: Tue, 6 Oct 2015 15:49:16 -0700 (PDT)
From: Hugh Dickins
X-X-Sender: hugh@eggly.anvils
To: Michal Hocko
cc: Oleg Nesterov, Sasha Levin, Andrew Morton, David Rientjes,
	Kyle Walker, Stanislav Kozina, Tetsuo Handa,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH -mm] mmoom-fix-potentially-killing-unrelated-process-fix
In-Reply-To: <20151006165612.GA2752@dhcp22.suse.cz>
Message-ID:
References: <20151005163427.GA20595@redhat.com> <20151006162804.GB9570@redhat.com> <20151006165612.GA2752@dhcp22.suse.cz>
User-Agent: Alpine 2.11 (LSU 23 2013-08-11)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 6 Oct 2015, Michal Hocko wrote:
> On Tue 06-10-15 18:28:04, Oleg Nesterov wrote:
> > oom_kill_process() does atomic_inc(&mm->mm_users) to ensure that
> > this ->mm can't go away and this is wrong, change it to rely on
> > ->mm_count and mmdrop().
> >
> > Firstly, we do not want to delay exit_mmap/etc if the victim exits
> > before we do mmput(), but this is minor.
> >
> > More importantly, we simply can not do mmput() in oom_kill_process(),
> > this can deadlock if (for example) the caller holds i_mmap_rwsem and
> > mmput() actually leads to exit_mmap(); the victim can have this file
> > mmaped and in this case unmap_vmas/free_pgtables paths will take the
> > same lock for writing. And at least huge_pmd_share() does pmd_alloc()
> > under i_mmap_rwsem because VM_HUGETLB memory is not reclaimable.
>
> Ouch, I have completely missed this during review! Thanks for catching
> this. On the second thought it is clear now. We really want to pin the
> mm_struct not the address space.
>
> > Signed-off-by: Oleg Nesterov
>
> Acked-by: Michal Hocko

Acked-by: Hugh Dickins

Thanks: looks like this is what was behind recent trinity/KSM deadlock,
https://lkml.org/lkml/2015/10/1/563

>
> > ---
> >  mm/oom_kill.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 034d219..52abb78 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -571,7 +571,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> >
> >  	/* Get a reference to safely compare mm after task_unlock(victim) */
> >  	mm = victim->mm;
> > -	atomic_inc(&mm->mm_users);
> > +	atomic_inc(&mm->mm_count);
> >  	/*
> >  	 * We should send SIGKILL before setting TIF_MEMDIE in order to prevent
> >  	 * the OOM victim from depleting the memory reserves from the user
> > @@ -609,7 +609,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> >  	}
> >  	rcu_read_unlock();
> >
> > -	mmput(mm);
> > +	mmdrop(mm);
> >  	put_task_struct(victim);
> >  }
> >  #undef K
> > --
> > 2.4.3
> >
>
> --
> Michal Hocko
> SUSE Labs
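
For illustration only, the difference between pinning mm_users and pinning
mm_count can be modelled in user space. This is a toy sketch, not kernel
code: struct toy_mm, enum toy_ctx, toy_mmput(), toy_mmdrop() and
toy_oom_kill() are hypothetical names made up for the example, the counters
are plain ints rather than atomics, and the address-space teardown that
exit_mmap() would do is reduced to recording which context triggered it.
The one kernel property it relies on is that all mm_users together hold a
single mm_count reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Which context ended up running the address-space teardown. */
enum toy_ctx { CTX_NONE, CTX_VICTIM, CTX_OOM_KILLER };

/* Toy stand-in for struct mm_struct (hypothetical, illustration only). */
struct toy_mm {
	int mm_users;              /* users of the address space */
	int mm_count;              /* references to the struct itself */
	enum toy_ctx teardown_ctx; /* who ran the exit_mmap() analogue */
	bool freed;                /* struct released */
};

/* Drop a struct reference; the last one frees the struct. */
static void toy_mmdrop(struct toy_mm *mm)
{
	assert(mm->mm_count > 0);
	if (--mm->mm_count == 0)
		mm->freed = true;
}

/* Drop an address-space user; the last one tears the address space down. */
static void toy_mmput(struct toy_mm *mm, enum toy_ctx caller)
{
	assert(mm->mm_users > 0);
	if (--mm->mm_users == 0) {
		/* exit_mmap() analogue: runs in the caller's context */
		mm->teardown_ctx = caller;
		/* release the single reference held on behalf of mm_users */
		toy_mmdrop(mm);
	}
}

/*
 * Model the OOM killer pinning the victim's mm, the victim exiting first,
 * and the OOM killer then dropping its pin.
 * pin_users == true  models the old code (mm_users + mmput()).
 * pin_users == false models the patched code (mm_count + mmdrop()).
 */
static struct toy_mm toy_oom_kill(bool pin_users)
{
	struct toy_mm mm = {
		.mm_users = 1, .mm_count = 1,
		.teardown_ctx = CTX_NONE, .freed = false,
	};

	if (pin_users)
		mm.mm_users++;
	else
		mm.mm_count++;

	/* The victim exits before the OOM killer drops its reference. */
	toy_mmput(&mm, CTX_VICTIM);

	if (pin_users)
		toy_mmput(&mm, CTX_OOM_KILLER);
	else
		toy_mmdrop(&mm);

	return mm;
}
```

In this model toy_oom_kill(true) leaves the teardown to the OOM killer's
context, which is where a held i_mmap_rwsem would make it deadlock, while
toy_oom_kill(false) lets the exiting victim do the teardown and leaves the
OOM killer only the final free of the struct.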