From: Jason Gunthorpe <jgg@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: Peter Xu <peterx@redhat.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>,
	Matthew Wilcox <willy@infradead.org>,
	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>,
	Leon Romanovsky <leon@kernel.org>,
	Christian Benvenuti <benve@cisco.com>,
	Nelson Escobar <neescoba@cisco.com>,
	Bernard Metzler <bmt@zurich.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Bjorn Topel <bjorn@kernel.org>,
	Magnus Karlsson <magnus.karlsson@intel.com>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Jonathan Lemon <jonathan.lemon@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Christian Brauner <brauner@kernel.org>,
	Richard Cochran <richardcochran@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	linux-fsdevel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	Oleg Nesterov <oleg@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>, Jan Kara <jack@suse.cz>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Mika Penttila <mpenttil@redhat.com>,
	Dave Chinner <david@fromorbit.com>, Theodore Ts'o <tytso@mit.edu>
Subject: Re: [PATCH v6 3/3] mm/gup: disallow FOLL_LONGTERM GUP-fast writing to file-backed mappings
Date: Tue, 2 May 2023 14:46:06 -0300
Message-ID: <ZFFMXswUwsQ6lRi5@nvidia.com>
In-Reply-To: <6681789f-f70e-820d-a185-a17e638dfa53@redhat.com>

On Tue, May 02, 2023 at 06:32:23PM +0200, David Hildenbrand wrote:
> On 02.05.23 18:19, Jason Gunthorpe wrote:
> > On Tue, May 02, 2023 at 06:12:39PM +0200, David Hildenbrand wrote:
> > 
> > > > It misses the general architectural point why we have all these
> > > > shootdown mechanisms in other places - places are not supposed to make
> > > > these kinds of assumptions. When the userspace unplugs the memory from
> > > > KVM or unmaps it from VFIO it is not still being accessed by the
> > > > kernel.
> > > 
> > > Yes. Like having memory in a vfio iommu v1 and doing the same (mremap,
> > > munmap, MADV_DONTNEED, ...). Which is why we disable MADV_DONTNEED (e.g.,
> > > virtio-balloon) in QEMU with vfio.
> > 
> > That is different, VFIO has its own contract how it consumes the
> > memory from the MM and VFIO breaks all this stuff.
> > 
> > But when you tell VFIO to unmap the memory it doesn't keep accessing
> > it in the background like this does.
> 
> To me, this is similar to when QEMU (user space) triggers
> KVM_S390_ZPCIOP_DEREG_AEN, to tell KVM to disable AIF and stop using the
> page (1) When triggered by the guest explicitly (2) when resetting the VM
> (3) when resetting the virtual PCI device / configuration.
> 
> The interrupt gets unregistered from HW (which stops using the page), and
> the pages get unpinned. The pages are no longer used.
> 
> I guess I am still missing (a) how this is fundamentally different (b) how
> it could be done differently.

It uses an address that is already scoped within the KVM memory map
and uses KVM's gpa_to_gfn() to translate it to some pinnable page.

It is not some independent thing like VFIO, it is explicitly scoped
within the existing KVM structure and it does not follow any mutations
that are done to the gpa map through the usual KVM APIs.

> I'd really be happy to learn what a better approach that does not use
> longterm pinnings would look like.

Sounds like the FW sadly needs pinnings. This is why I said it looks
like DMA. If possible it would be better to get the pinning through
VFIO, e.g. as an mdev.

Otherwise, it would have been cleaner if this was divorced from KVM
and took in a direct user pointer, then maybe you could make the
argument that it is its own thing with its own lifetime rules. (then you
are kind of making your own mdev)

Or, perhaps, this is really part of some radical "irqfd" that we've
been on and off talking about specifically to get this area of
interrupt bypass uAPI'd properly.

Jason
