All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nico Pache <npache@redhat.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Rafael Aquini <aquini@redhat.com>,
	Waiman Long <longman@redhat.com>, Baoquan He <bhe@redhat.com>,
	Christoph von Recklinghausen <crecklin@redhat.com>,
	Don Dutile <ddutile@redhat.com>,
	"Herton R . Krzesinski" <herton@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Darren Hart <dvhart@infradead.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Andre Almeida <andrealmeid@collabora.com>,
	David Rientjes <rientjes@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Joel Savitz <jsavitz@redhat.com>
Subject: Re: [PATCH v5] mm/oom_kill.c: futex: Close a race between do_exit and the oom_reaper
Date: Mon, 21 Mar 2022 16:45:37 -0600	[thread overview]
Message-ID: <9fb05fa3-4474-5a49-9f1c-67c31bf96c94@redhat.com> (raw)
In-Reply-To: <Yjg9ncgep58gFLiN@dhcp22.suse.cz>



On 3/21/22 02:55, Michal Hocko wrote:
> On Thu 17-03-22 21:36:21, Nico Pache wrote:
>> The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which can
>> be targeted by the oom reaper. This mapping is used to store the futex
>> robust list; the kernel does not keep a copy of the robust list and instead
>> references a userspace address to maintain the robustness during a process
>> death. A race can occur between exit_mm and the oom reaper that allows
>> the oom reaper to free the memory of the futex robust list before the
>> exit path has handled the futex death:
>>
>>     CPU1                               CPU2
>> ------------------------------------------------------------------------
>>     page_fault
>>     out_of_memory
>>     do_exit "signal"
>>     wake_oom_reaper
>> 					oom_reaper
>>                         		oom_reap_task_mm (invalidates mm)
>>     exit_mm
>>     exit_mm_release
>>     futex_exit_release
>>     futex_cleanup
>>     exit_robust_list
>>     get_user (EFAULT- can't access memory)
> 
> I still think it is useful to be explicit about the consequences of the
> EFAULT here. Did you want to mention that a failing get_user in this
> path would result in a hang because nobody is woken up when the current
> holder of the lock terminates.

Sounds good! You make a good point-- We had that in all the other versions, but
I forgot to include it in this commit log.
> 
>> While in the oom reaper thread, we must handle the futex cleanup without
>> sleeping. To achieve this, add the boolean `try` to futex_exit_begin().
>> This will control weather or not we use a trylock. Also introduce
>> try_futex_exit_release() which will utilize the trylock version of the
>> futex_cleanup_begin(). Also call kthread_use_mm in this context to assure
>> the get_user call in futex_cleanup() does not return with EFAULT.
> 
> This alone is not sufficient. get_user can sleep in the #PF handler path
> (e.g. by waiting for swap in). Or is there any guarantee that the page
> is never swapped out? If we cannot rule out #PF then this is not a
> viable way to address the problem I am afraid.>
> Please also note that this all is done after mmap_lock has been already
> taken so a page fault could deadlock on the mmap_lock.
>
I don't think we can guarantee that page is not swapped out. Good catch, I was
concerned when I saw the 'might_fault' in get_user, but I wasn't fully sure of
its consequences. I'm still learning the labyrinth that is the MM space, so
thanks for the context!

> The more I am thinking about this the more I am getting convinced that
> we should rather approach this differently and skip over vmas which can
> be holding the list. Have you considered this option?

We've discussed it and it seems very doable, but we haven't attempted to
implement it yet. I'll give it a shot and see what I can come up with!

Thank you for your feedback and reviews :)
-- Nico


  reply	other threads:[~2022-03-21 22:59 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-18  3:36 [PATCH v5] mm/oom_kill.c: futex: Close a race between do_exit and the oom_reaper Nico Pache
2022-03-21  8:55 ` Michal Hocko
2022-03-21 22:45   ` Nico Pache [this message]
2022-03-22  0:42   ` Davidlohr Bueso
2022-03-22  1:53     ` Nico Pache
2022-03-22  2:57       ` Davidlohr Bueso
2022-03-22  3:09         ` Nico Pache
2022-03-22  8:26         ` Michal Hocko
2022-03-22 15:17           ` Thomas Gleixner
2022-03-22 16:36             ` Michal Hocko
2022-03-22 22:43               ` Thomas Gleixner
2022-03-23  9:17                 ` Michal Hocko
2022-03-23 10:30                   ` Thomas Gleixner
2022-03-23 11:11                   ` Peter Zijlstra
2022-03-30  9:18                   ` Michal Hocko
2022-03-30 18:18                     ` Nico Pache
2022-03-30 21:36                       ` Nico Pache
2022-04-06 17:22             ` Nico Pache
2022-04-06 17:36               ` Nico Pache

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9fb05fa3-4474-5a49-9f1c-67c31bf96c94@redhat.com \
    --to=npache@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrealmeid@collabora.com \
    --cc=aquini@redhat.com \
    --cc=bhe@redhat.com \
    --cc=crecklin@redhat.com \
    --cc=dave@stgolabs.net \
    --cc=ddutile@redhat.com \
    --cc=dvhart@infradead.org \
    --cc=herton@redhat.com \
    --cc=jsavitz@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=longman@redhat.com \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.