All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <lstoakes@gmail.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Jason Gunthorpe <jgg@ziepe.ca>, Jens Axboe <axboe@kernel.dk>,
	Matthew Wilcox <willy@infradead.org>,
	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>,
	Leon Romanovsky <leon@kernel.org>,
	Christian Benvenuti <benve@cisco.com>,
	Nelson Escobar <neescoba@cisco.com>,
	Bernard Metzler <bmt@zurich.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Bjorn Topel <bjorn@kernel.org>,
	Magnus Karlsson <magnus.karlsson@intel.com>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Jonathan Lemon <jonathan.lemon@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Christian Brauner <brauner@kernel.org>,
	Richard Cochran <richardcochran@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	linux-fsdevel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	Oleg Nesterov <oleg@redhat.com>, Jason Gunthorpe <jgg@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>, Jan Kara <jack@suse.cz>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Mika Penttila <mpenttil@redhat.com>,
	Dave Chinner <david@fromorbit.com>, Theodore Ts'o <tytso@mit.edu>,
	Peter Xu <peterx@redhat.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Christian Borntraeger <borntraeger@linux.ibm.com>
Subject: Re: [PATCH v7 1/3] mm/mmap: separate writenotify and dirty tracking logic
Date: Tue, 2 May 2023 18:09:14 +0100	[thread overview]
Message-ID: <bf04a98a-9de6-4532-a36c-59572d22dd7c@lucifer.local> (raw)
In-Reply-To: <f777a151-edfc-4882-8aca-9a926179c5bb@lucifer.local>

On Tue, May 02, 2023 at 05:53:46PM +0100, Lorenzo Stoakes wrote:
> On Tue, May 02, 2023 at 06:38:53PM +0200, David Hildenbrand wrote:
> > On 02.05.23 18:34, Lorenzo Stoakes wrote:
> > > vma_wants_writenotify() is specifically intended for setting PTE page table
> > > flags, accounting for existing PTE flag state and whether that might
> > > already be read-only while mixing this check with a check whether the
> > > filesystem performs dirty tracking.
> > >
> > > Separate out the notions of dirty tracking and a PTE write notify checking
> > > in order that we can invoke the dirty tracking check from elsewhere.
> > >
> > > Note that this change introduces a very small duplicate check of the
> > > separated out vm_ops_needs_writenotify(). This is necessary to avoid making
> > > vma_needs_dirty_tracking() needlessly complicated (e.g. passing a
> > > check_writenotify flag or having it assume this check was already
> > > performed). This is such a small check that it doesn't seem too egregious
> > > to do this.
> > >
> > > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > > Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> > > Reviewed-by: Mika Penttilä <mpenttil@redhat.com>
> > > Reviewed-by: Jan Kara <jack@suse.cz>
> > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > > ---
> > >   include/linux/mm.h |  1 +
> > >   mm/mmap.c          | 36 +++++++++++++++++++++++++++---------
> > >   2 files changed, 28 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > index 27ce77080c79..7b1d4e7393ef 100644
> > > --- a/include/linux/mm.h
> > > +++ b/include/linux/mm.h
> > > @@ -2422,6 +2422,7 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
> > >   #define  MM_CP_UFFD_WP_ALL                 (MM_CP_UFFD_WP | \
> > >   					    MM_CP_UFFD_WP_RESOLVE)
> > > +bool vma_needs_dirty_tracking(struct vm_area_struct *vma);
> > >   int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot);
> > >   static inline bool vma_wants_manual_pte_write_upgrade(struct vm_area_struct *vma)
> > >   {
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 5522130ae606..295c5f2e9bd9 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -1475,6 +1475,31 @@ SYSCALL_DEFINE1(old_mmap, struct mmap_arg_struct __user *, arg)
> > >   }
> > >   #endif /* __ARCH_WANT_SYS_OLD_MMAP */
> > > +/* Do VMA operations imply write notify is required? */
> > > +static bool vm_ops_needs_writenotify(const struct vm_operations_struct *vm_ops)
> > > +{
> > > +	return vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite);
> > > +}
> > > +
> > > +/*
> > > + * Does this VMA require the underlying folios to have their dirty state
> > > + * tracked?
> > > + */
> > > +bool vma_needs_dirty_tracking(struct vm_area_struct *vma)
> > > +{
> >
> > Sorry for not noticing this earlier, but ...
>
> pints_owed++
>
> >
> > what about MAP_PRIVATE mappings? When we write, we populate an anon page,
> > which will work as expected ... because we don't have to notify the fs?
> >
> > I think you really also want the "If it was private or non-writable, the
> > write bit is already clear */" part as well and remove "false" in that case.
> >
>
> Not sure a 'write bit is already clear' case is relevant to checking
> whether a filesystem dirty tracks? That seems specific entirely to the page
> table bits.
>
> That's why I didn't include it,
>
> A !VM_WRITE shouldn't be GUP-writable except for FOLL_FORCE, and that
> surely could be problematic if VM_MAYWRITE later?
>
> Thinking about it though a !VM_SHARE should probably can be safely assumed
> to not be dirty-trackable, so we probably do need to add a check for
> !VM_SHARED -> !vma_needs_dirty_tracking
>

On second thoughts, we explicitly check FOLL_FORCE && !is_cow_mapping() in
check_vma_flags() so that case cannot occur.

So actually yes we should probably include this on the basis of that and
the fact that a FOLL_WRITE operation will CoW the MAP_PRIVATE mapping.

This was an (over)abundance of caution.

Will fix on respin.

> > Or was there a good reason to disallow private mappings as well?
> >
>
> Until the page is CoW'd walking the page tables will get you to the page
> cache page right? This was the reason I (perhaps rather too quickly) felt
> MAP_PRIVATE should be excluded.
>
> However a FOLL_WRITE would trigger CoW... and then we'd be trivially OK.
>
> So yeah, ok perhaps I dismissed that a little too soon. I was concerned
> about some sort of egregious FOLL_FORCE case where somehow we'd end up with
> the page cache folio. But actually, that probably can't happen...
>
> > --
> > Thanks,
> >
> > David / dhildenb
> >

  reply	other threads:[~2023-05-02 17:09 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-02 16:34 [PATCH v7 0/3] mm/gup: disallow GUP writing to file-backed mappings by default Lorenzo Stoakes
2023-05-02 16:34 ` [PATCH v7 1/3] mm/mmap: separate writenotify and dirty tracking logic Lorenzo Stoakes
2023-05-02 16:38   ` David Hildenbrand
2023-05-02 16:53     ` Lorenzo Stoakes
2023-05-02 17:09       ` Lorenzo Stoakes [this message]
2023-05-02 17:16         ` David Hildenbrand
2023-05-02 16:34 ` [PATCH v7 2/3] mm/gup: disallow FOLL_LONGTERM GUP-nonfast writing to file-backed mappings Lorenzo Stoakes
2023-05-02 16:42   ` David Hildenbrand
2023-05-02 16:56     ` Lorenzo Stoakes
2023-05-02 16:34 ` [PATCH v7 3/3] mm/gup: disallow FOLL_LONGTERM GUP-fast " Lorenzo Stoakes
2023-05-02 17:13   ` David Hildenbrand
2023-05-02 17:22     ` Peter Zijlstra
2023-05-02 17:34       ` David Hildenbrand
2023-05-02 18:17         ` Lorenzo Stoakes
2023-05-02 19:07           ` David Hildenbrand
2023-05-02 19:07           ` Jason Gunthorpe
2023-05-02 19:25             ` Lorenzo Stoakes
2023-05-02 19:33               ` David Hildenbrand
2023-05-02 19:37                 ` Lorenzo Stoakes
2023-05-02 23:51                 ` Jason Gunthorpe
2023-05-03  0:22               ` Jason Gunthorpe
2023-05-02 18:59         ` Peter Zijlstra
2023-05-02 19:05           ` David Hildenbrand
2023-05-02 17:34       ` Lorenzo Stoakes
2023-05-02 17:31     ` Lorenzo Stoakes
2023-05-02 17:38       ` David Hildenbrand
2023-05-02 17:45         ` Lorenzo Stoakes
2023-05-02 19:17   ` David Hildenbrand
2023-05-02 19:45     ` Lorenzo Stoakes
2023-05-02 18:45 ` [PATCH v7 0/3] mm/gup: disallow GUP writing to file-backed mappings by default Matthew Rosato
2023-05-02 18:53   ` Lorenzo Stoakes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bf04a98a-9de6-4532-a36c-59572d22dd7c@lucifer.local \
    --to=lstoakes@gmail.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=asml.silence@gmail.com \
    --cc=ast@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=benve@cisco.com \
    --cc=bjorn@kernel.org \
    --cc=bmt@zurich.ibm.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=david@fromorbit.com \
    --cc=david@redhat.com \
    --cc=dennis.dalessandro@cornelisnetworks.com \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=irogers@google.com \
    --cc=jack@suse.cz \
    --cc=jgg@nvidia.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=jonathan.lemon@gmail.com \
    --cc=kirill@shutemov.name \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=mpenttil@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=neescoba@cisco.com \
    --cc=netdev@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=pabeni@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=richardcochran@gmail.com \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.