All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: David Hildenbrand <david@redhat.com>
Cc: mingo@redhat.com, tglx@linutronix.de, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-api@vger.kernel.org, x86@kernel.org,
	pjt@google.com, posk@google.com, avagin@google.com,
	jannh@google.com, tdelisle@uwaterloo.ca, mark.rutland@arm.com,
	posk@posk.io
Subject: Re: [RFC][PATCH v2 1/5] mm: Avoid unmapping pinned pages
Date: Fri, 21 Jan 2022 09:59:17 +0100	[thread overview]
Message-ID: <20220121085917.GA22849@worktop.programming.kicks-ass.net> (raw)
In-Reply-To: <20220121075157.GA20638@worktop.programming.kicks-ass.net>

On Fri, Jan 21, 2022 at 08:51:57AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 20, 2022 at 07:25:08PM +0100, David Hildenbrand wrote:
> > On 20.01.22 16:55, Peter Zijlstra wrote:
> > > Add a guarantee for Anon pages that pin_user_page*() ensures the
> > > user-mapping of these pages stay preserved. In order to ensure this
> > > all rmap users have been audited:
> > > 
> > >  vmscan:	already fails eviction due to page_maybe_dma_pinned()
> > > 
> > >  migrate:	migration will fail on pinned pages due to
> > > 		expected_page_refs() not matching, however that is
> > > 		*after* try_to_migrate() has already destroyed the
> > > 		user mapping of these pages. Add an early exit for
> > > 		this case.
> > > 
> > >  numa-balance:	as per the above, pinned pages cannot be migrated,
> > > 		however numa balancing scanning will happily PROT_NONE
> > > 		them to get usage information on these pages. Avoid
> > > 		this for pinned pages.
> > 
> > page_maybe_dma_pinned() can race with GUP-fast without
> > mm->write_protect_seq. This is a real problem for vmscan() with
> > concurrent GUP-fast as it can result in R/O mappings of pinned pages and
> > GUP will lose synchronicity to the page table on write faults due to
> > wrong COW.
> 
> Urgh, so yeah, that might be a problem. Follow up code uses it like
> this:
> 
> +/*
> + * Pinning a page inhibits rmap based unmap for Anon pages. Doing a load
> + * through the user mapping ensures the user mapping exists.
> + */
> +#define umcg_pin_and_load(_self, _pagep, _member)                              \
> +({                                                                             \
> +       __label__ __out;                                                        \
> +       int __ret = -EFAULT;                                                    \
> +                                                                               \
> +       if (pin_user_pages_fast((unsigned long)(_self), 1, 0, &(_pagep)) != 1)  \
> +               goto __out;                                                     \
> +                                                                               \
> +       if (!PageAnon(_pagep) ||                                                \
> +           get_user(_member, &(_self)->_member)) {                             \
> +               unpin_user_page(_pagep);                                        \
> +               goto __out;                                                     \
> +       }                                                                       \
> +       __ret = 0;                                                              \
> +__out: __ret;                                                                  \
> +})
> 
> And after that hard assumes (on the penalty of SIGKILL) that direct user
> access works. Specifically it does RmW ops on it. So I suppose I'd
> better upgrade that load to a RmW at the very least.
> 
> But is that sufficient? Let me go find that race you mention...

OK, so copy_page_range() vs lockless_pages_from_mm(). Since I use
FOLL_PIN that should be sorted, it'll fall back the slow path and use
mmap_sem and serialize against the fork().

(Also, can I express my hate for __gup_longterm_unlocked(), that
function name is utter garbage)

However, I'm not quite sure what fork() does with pages that have a pin.
There's been a number of GUP vs fork() problems over the years, but I'm
afraid I have lost track of that and I can't quickly find anything in
the code..

Naively, a page that has async DMA activity should not be CoW'ed, or if
it is, care must be taken to ensure the original pages stays in the
original process, but I realize that's somewhat hard.

Let me dig in a bit more.

  parent reply	other threads:[~2022-01-21  8:59 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-20 15:55 [RFC][PATCH v2 0/5] sched: User Managed Concurrency Groups Peter Zijlstra
2022-01-20 15:55 ` [RFC][PATCH v2 1/5] mm: Avoid unmapping pinned pages Peter Zijlstra
2022-01-20 18:03   ` Nadav Amit
2022-01-21  7:59     ` Peter Zijlstra
2022-01-20 18:25   ` David Hildenbrand
2022-01-21  7:51     ` Peter Zijlstra
2022-01-21  8:22       ` David Hildenbrand
2022-01-21  8:59       ` Peter Zijlstra [this message]
2022-01-21  9:04         ` David Hildenbrand
2022-01-21 11:40           ` Peter Zijlstra
2022-01-21 12:04             ` David Hildenbrand
2022-01-20 15:55 ` [RFC][PATCH v2 2/5] entry,x86: Create common IRQ operations for exceptions Peter Zijlstra
2022-01-21 16:34   ` Mark Rutland
2022-01-20 15:55 ` [RFC][PATCH v2 3/5] sched/umcg: add WF_CURRENT_CPU and externise ttwu Peter Zijlstra
2022-01-20 15:55 ` [RFC][PATCH v2 4/5] x86/uaccess: Implement unsafe_try_cmpxchg_user() Peter Zijlstra
2022-01-27  2:17   ` Sean Christopherson
2022-01-27  6:36     ` Sean Christopherson
2022-01-27  9:56       ` Peter Zijlstra
2022-01-27 23:33         ` Sean Christopherson
2022-01-28  0:17           ` Nick Desaulniers
2022-01-28 16:29             ` Sean Christopherson
2022-01-27  9:55     ` Peter Zijlstra
2022-01-20 15:55 ` [RFC][PATCH v2 5/5] sched: User Mode Concurency Groups Peter Zijlstra
2022-01-21 11:47   ` Peter Zijlstra
2022-01-21 15:18     ` Peter Zijlstra
2022-01-24 14:29       ` Peter Zijlstra
2022-01-24 16:44         ` Peter Zijlstra
2022-01-24 17:06           ` Peter Oskolkov
2022-01-25 14:59         ` Peter Zijlstra
2022-01-24 13:59     ` Peter Zijlstra
2022-01-21 12:26   ` Peter Zijlstra
2022-01-21 16:57   ` Mark Rutland
2022-01-24  9:48     ` Peter Zijlstra
2022-01-24 10:03     ` Peter Zijlstra
2022-01-24 10:07       ` Peter Zijlstra
2022-01-24 10:27         ` Mark Rutland
2022-01-24 14:46   ` Tao Zhou
2022-01-27 12:19     ` Peter Zijlstra
2022-01-27 18:33       ` Tao Zhou
2022-01-27 12:25     ` Peter Zijlstra
2022-01-27 18:47       ` Tao Zhou
2022-01-27 12:26     ` Peter Zijlstra
2022-01-27 18:31   ` Tao Zhou
2022-01-20 17:28 ` [RFC][PATCH v2 0/5] sched: User Managed Concurrency Groups Peter Oskolkov
2022-01-21  8:01   ` Peter Zijlstra
2022-01-21 18:01 ` Steven Rostedt
2022-01-24  8:20   ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220121085917.GA22849@worktop.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=avagin@google.com \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=david@redhat.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=jannh@google.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=pjt@google.com \
    --cc=posk@google.com \
    --cc=posk@posk.io \
    --cc=rostedt@goodmis.org \
    --cc=tdelisle@uwaterloo.ca \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.