linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: linux-doc@vger.kernel.org, "John Hubbard" <jhubbard@nvidia.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Dave Chinner" <david@fromorbit.com>,
	"Ira Weiny" <ira.weiny@intel.com>, "Jan Kara" <jack@suse.cz>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	"Michal Hocko" <mhocko@suse.com>,
	"Mike Kravetz" <mike.kravetz@oracle.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Matthew Wilcox" <willy@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-mm@kvack.org,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [regression?] Re: [PATCH v6 06/12] mm/gup: track FOLL_PIN pages
Date: Wed, 29 Apr 2020 20:03:01 -0300	[thread overview]
Message-ID: <20200429230301.GL26002@ziepe.ca> (raw)
In-Reply-To: <20200429135633.626a8411@w520.home>

On Wed, Apr 29, 2020 at 01:56:33PM -0600, Alex Williamson wrote:
> On Tue, 28 Apr 2020 21:29:03 -0300
> Jason Gunthorpe <jgg@ziepe.ca> wrote:
> 
> > On Tue, Apr 28, 2020 at 02:12:23PM -0600, Alex Williamson wrote:
> > 
> > > > > Maybe I was just getting lucky before this commit.  For a
> > > > > VM_PFNMAP, vaddr_get_pfn() only needs pin_user_pages_remote() to return
> > > > > error and the vma information that we setup in vfio_pci_mmap().    
> > > > 
> > > > I've written on this before, vfio should not be passing pages to the
> > > > iommu that it cannot pin eg it should not touch VM_PFNMAP vma's in the
> > > > first place.
> > > > 
> > > > It is a use-after-free security issue the way it is..  
> > > 
> > > Where is the user after free?  Here I'm trying to map device mmio space
> > > through the iommu, which we need to enable p2p when the user owns
> > > multiple devices.  
> > 
> > Yes, I gathered what the intent was..
> > 
> > > The device is owned by the user, bound to vfio-pci, and can't be
> > > unbound while the user has it open.  The iommu mappings are torn
> > > down on release.  I guess I don't understand the problem.  
> > 
> > For PFNMAP VMAs the lifecycle rule is basically that the PFN inside
> > the VMA can only be used inside the mmap_sem that read it. Ie you
> > cannot take a PFN outside the mmap_sem and continue to use it.
> > 
> > This is because the owner of the VMA owns the lifetime of that PFN,
> > and under the write side of the mmap_sem it can zap the PFN, or close
> > the VMA. Afterwards the VMA owner knows that there are no active
> > reference to the PFN in the system and can reclaim the PFN
> > 
> > ie the PFNMAP has no per-page pin counter. All lifetime revolves around
> > the mmap_sem and the vma.
> > 
> > What vfio does is take the PFN out of the mmap_sem and program it into
> > the iommu.
> > 
> > So when the VMA owner decides the PFN has no references, it actually
> > doesn't: vfio continues to access it beyond its permitted lifetime.
> > 
> > HW like mlx5 and GPUs have BAR pages which have security
> > properties. Once the PFN is returned to the driver the security
> > context of the PFN can be reset and re-assigned to another
> > process. Using VFIO a hostile user space can retain access to the BAR
> > page and upon its reassignment access a security context they were not
> > permitted to access.
> > 
> > This is why GUP does not return PFNMAP pages and vfio should not carry
> > a reference outside the mmap_sem. It breaks all the lifetime rules.
> 
> Thanks for the explanation.  I'm inferring that there is no solution to
> this, 

Not a particularly good one unfortunately. I've been wanting to use
P2P_DMA pages to solve these kinds of things but they are kind of
expensive.

I have a copy of some draft patches trying to do this

> but why can't we use mmu notifiers to invalidate the iommu on zap or
> close?

Hum.. I think with the new mmu interval notifiers vfio might be able
to manage that without a huge amount of trouble. But the iommu
invalidation needs to be synchronous from a mmu notifier callback - is
that feasible?

But even so, we have all this stuff now for authorizing PCI P2P which
this design completely ignores as well. :(

> I know that at least QEMU won't consider these sorts of mapping
> fatal, so we could possibly change the default and make support for
> such mappings opt-in, but I don't know if I'd break DPDK, or
> potentially users within QEMU that make use of p2p between devices.

I'd heard this was mostly for GPU device assignment? I'd be surprised
if DPDK used this..

Jason

  reply	other threads:[~2020-04-29 23:03 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-11  0:15 [PATCH v6 00/12] mm/gup: track FOLL_PIN pages John Hubbard
2020-02-11  0:15 ` [PATCH v6 01/12] mm/gup: split get_user_pages_remote() into two routines John Hubbard
2020-02-11  0:15 ` [PATCH v6 02/12] mm/gup: pass a flags arg to __gup_device_* functions John Hubbard
2020-02-11  0:15 ` [PATCH v6 03/12] mm: introduce page_ref_sub_return() John Hubbard
2020-02-11  0:15 ` [PATCH v6 04/12] mm/gup: pass gup flags to two more routines John Hubbard
2020-02-11  0:15 ` [PATCH v6 05/12] mm/gup: require FOLL_GET for get_user_pages_fast() John Hubbard
2020-02-11  0:15 ` [PATCH v6 06/12] mm/gup: track FOLL_PIN pages John Hubbard
2020-04-24 18:18   ` [regression] " Alex Williamson
2020-04-24 19:20     ` John Hubbard
2020-04-24 20:15       ` Alex Williamson
2020-04-24 22:58         ` John Hubbard
2020-04-28 16:54           ` [regression?] " Alex Williamson
2020-04-28 17:49             ` Jason Gunthorpe
2020-04-28 19:07               ` Alex Williamson
2020-04-28 19:22                 ` Jason Gunthorpe
2020-04-28 20:12                   ` Alex Williamson
2020-04-29  0:29                     ` Jason Gunthorpe
2020-04-29 19:56                       ` Alex Williamson
2020-04-29 23:03                         ` Jason Gunthorpe [this message]
2020-02-11  0:15 ` [PATCH v6 07/12] mm/gup: page->hpage_pinned_refcount: exact pin counts for huge pages John Hubbard
2020-02-11  0:15 ` [PATCH v6 08/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting John Hubbard
2020-02-12  9:17   ` Jan Kara
2020-02-11  0:15 ` [PATCH v6 09/12] mm/gup_benchmark: support pin_user_pages() and related calls John Hubbard
2020-02-11  0:15 ` [PATCH v6 10/12] selftests/vm: run_vmtests: invoke gup_benchmark with basic FOLL_PIN coverage John Hubbard
2020-02-11  0:15 ` [PATCH v6 11/12] mm: Improve dump_page() for compound pages John Hubbard
2020-02-11  0:15 ` [PATCH v6 12/12] mm: dump_page(): additional diagnostics for huge pinned pages John Hubbard
2020-02-11 13:21   ` Kirill A. Shutemov
2020-02-12  2:10     ` John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200429230301.GL26002@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=ira.weiny@intel.com \
    --cc=jack@suse.cz \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=shuah@kernel.org \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).