From: Lorenzo Stoakes <lstoakes@gmail.com>
To: Matthew Rosato <mjrosato@linux.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Jason Gunthorpe <jgg@ziepe.ca>, Jens Axboe <axboe@kernel.dk>,
	Matthew Wilcox <willy@infradead.org>,
	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>,
	Leon Romanovsky <leon@kernel.org>,
	Christian Benvenuti <benve@cisco.com>,
	Nelson Escobar <neescoba@cisco.com>,
	Bernard Metzler <bmt@zurich.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Bjorn Topel <bjorn@kernel.org>,
	Magnus Karlsson <magnus.karlsson@intel.com>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Jonathan Lemon <jonathan.lemon@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Christian Brauner <brauner@kernel.org>,
	Richard Cochran <richardcochran@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	linux-fsdevel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	Oleg Nesterov <oleg@redhat.com>, Jason Gunthorpe <jgg@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>, Jan Kara <jack@suse.cz>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Mika Penttila <mpenttil@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	Dave Chinner <david@fromorbit.com>, Theodore Ts'o <tytso@mit.edu>,
	Peter Xu <peterx@redhat.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Christian Borntraeger <borntraeger@linux.ibm.com>
Subject: Re: [PATCH v7 0/3] mm/gup: disallow GUP writing to file-backed mappings by default
Date: Tue, 2 May 2023 19:53:30 +0100
Message-ID: <92fd5d71-ef9b-4971-944a-2a7bd74b5970@lucifer.local>
In-Reply-To: <ce86e956-173f-848a-a1f3-f102134ccd94@linux.ibm.com>

On Tue, May 02, 2023 at 02:45:01PM -0400, Matthew Rosato wrote:
> On 5/2/23 12:34 PM, Lorenzo Stoakes wrote:
> > Writing to file-backed mappings which require folio dirty tracking using
> > GUP is a fundamentally broken operation, as kernel write access to GUP
> > mappings do not adhere to the semantics expected by a file system.
> >
> > A GUP caller uses the direct mapping to access the folio, which does not
> > cause write notify to trigger, nor does it enforce that the caller marks
> > the folio dirty.
> >
> > The problem arises when, after an initial write to the folio, writeback
> > results in the folio being cleaned and then the caller, via the GUP
> > interface, writes to the folio again.
> >
> > As a result of the use of this secondary, direct, mapping to the folio no
> > write notify will occur, and if the caller does mark the folio dirty, this
> > will be done so unexpectedly.
> >
> > For example, consider the following scenario:-
> >
> > 1. A folio is written to via GUP which write-faults the memory, notifying
> >    the file system and dirtying the folio.
> > 2. Later, writeback is triggered, resulting in the folio being cleaned and
> >    the PTE being marked read-only.
> > 3. The GUP caller writes to the folio, as it is mapped read/write via the
> >    direct mapping.
> > 4. The GUP caller, now done with the page, unpins it and sets it dirty
> >    (though it does not have to).
> >
> > This change updates both the PUP FOLL_LONGTERM slow and fast APIs. As
> > pin_user_pages_fast_only() does not exist, we can rely on a slightly
> > imperfect whitelisting in the PUP-fast case and fall back to the slow case
> > should this fail.
> >
> > v7:
> > - Fixed very silly bug in writeable_file_mapping_allowed() inverting the
> >   logic.
> > - Removed unnecessary RCU lock code and replaced with adaptation of Peter's
> >   idea.
> > - Removed unnecessary open-coded folio_test_anon() in
> >   folio_longterm_write_pin_allowed() and restructured to generally permit
> >   NULL folio_mapping().
> >
>
> FWIW, I realize you are planning another respin, but I went and tried this version out on s390 -- Now when using a memory backend file and vfio-pci on s390 I see vfio_pin_pages_remote failing consistently.  However, the pin_user_pages_fast(FOLL_WRITE | FOLL_LONGTERM) in kvm_s390_pci_aif_enable will still return positive.
>

Hey, thanks very much for checking that :)

This version will unconditionally apply the restriction to non-FOLL_LONGTERM
callers by mistake (ugh), but vfio_pin_pages_remote() does seem to be setting
FOLL_LONGTERM anyway, so this seems a legitimate test.
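
For illustration only, here is a minimal sketch of the gating the series is
aiming for, as opposed to what v7 does by mistake. This is not the patch
code: the helper names restriction_applies() and restriction_self_check()
are hypothetical, and the flag values are stand-ins (the kernel's real
FOLL_* values differ).

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in values for illustration; NOT the kernel's real FOLL_* values. */
#define FOLL_WRITE    0x1u
#define FOLL_LONGTERM 0x2u

/*
 * Hypothetical helper: the file-backed write restriction should only be
 * considered when the caller asks for BOTH a writable and a long-term pin.
 */
static bool restriction_applies(unsigned int gup_flags)
{
	const unsigned int mask = FOLL_WRITE | FOLL_LONGTERM;

	return (gup_flags & mask) == mask;
}

/* Small self-check covering the four flag combinations. */
static void restriction_self_check(void)
{
	assert(restriction_applies(FOLL_WRITE | FOLL_LONGTERM));
	assert(!restriction_applies(FOLL_WRITE));    /* short-term write pin */
	assert(!restriction_applies(FOLL_LONGTERM)); /* long-term read pin */
	assert(!restriction_applies(0));
}
```

Under this sketch a FOLL_WRITE-only caller is left alone, which is the case
v7 gets wrong, while a FOLL_WRITE | FOLL_LONGTERM caller such as the VFIO
path above would still hit the check.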

Interesting that the _fast() variant succeeds...

David, Jason et al. can speak more to the ins and outs of these
virtualisation cases which I am not so familiar with, but I wonder if we do
need a flag to provide an exception for VFIO.


Thread overview: 31+ messages
2023-05-02 16:34 [PATCH v7 0/3] mm/gup: disallow GUP writing to file-backed mappings by default Lorenzo Stoakes
2023-05-02 16:34 ` [PATCH v7 1/3] mm/mmap: separate writenotify and dirty tracking logic Lorenzo Stoakes
2023-05-02 16:38   ` David Hildenbrand
2023-05-02 16:53     ` Lorenzo Stoakes
2023-05-02 17:09       ` Lorenzo Stoakes
2023-05-02 17:16         ` David Hildenbrand
2023-05-02 16:34 ` [PATCH v7 2/3] mm/gup: disallow FOLL_LONGTERM GUP-nonfast writing to file-backed mappings Lorenzo Stoakes
2023-05-02 16:42   ` David Hildenbrand
2023-05-02 16:56     ` Lorenzo Stoakes
2023-05-02 16:34 ` [PATCH v7 3/3] mm/gup: disallow FOLL_LONGTERM GUP-fast " Lorenzo Stoakes
2023-05-02 17:13   ` David Hildenbrand
2023-05-02 17:22     ` Peter Zijlstra
2023-05-02 17:34       ` David Hildenbrand
2023-05-02 18:17         ` Lorenzo Stoakes
2023-05-02 19:07           ` David Hildenbrand
2023-05-02 19:07           ` Jason Gunthorpe
2023-05-02 19:25             ` Lorenzo Stoakes
2023-05-02 19:33               ` David Hildenbrand
2023-05-02 19:37                 ` Lorenzo Stoakes
2023-05-02 23:51                 ` Jason Gunthorpe
2023-05-03  0:22               ` Jason Gunthorpe
2023-05-02 18:59         ` Peter Zijlstra
2023-05-02 19:05           ` David Hildenbrand
2023-05-02 17:34       ` Lorenzo Stoakes
2023-05-02 17:31     ` Lorenzo Stoakes
2023-05-02 17:38       ` David Hildenbrand
2023-05-02 17:45         ` Lorenzo Stoakes
2023-05-02 19:17   ` David Hildenbrand
2023-05-02 19:45     ` Lorenzo Stoakes
2023-05-02 18:45 ` [PATCH v7 0/3] mm/gup: disallow GUP writing to file-backed mappings by default Matthew Rosato
2023-05-02 18:53   ` Lorenzo Stoakes [this message]
