From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@ziepe.ca>, linux-doc@vger.kernel.org
Cc: "John Hubbard" <jhubbard@nvidia.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Dave Chinner" <david@fromorbit.com>,
	"Ira Weiny" <ira.weiny@intel.com>, "Jan Kara" <jack@suse.cz>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	"Michal Hocko" <mhocko@suse.com>,
	"Mike Kravetz" <mike.kravetz@oracle.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Matthew Wilcox" <willy@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-mm@kvack.org,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [regression?] Re: [PATCH v6 06/12] mm/gup: track FOLL_PIN pages
Date: Tue, 28 Apr 2020 14:12:23 -0600
Message-ID: <20200428141223.5b1653db@w520.home>
In-Reply-To: <20200428192251.GW26002@ziepe.ca>

On Tue, 28 Apr 2020 16:22:51 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Tue, Apr 28, 2020 at 01:07:52PM -0600, Alex Williamson wrote:
> > On Tue, 28 Apr 2020 14:49:57 -0300
> > Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >   
> > > On Tue, Apr 28, 2020 at 10:54:55AM -0600, Alex Williamson wrote:  
> > > >  static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
> > > >  {
> > > >  	struct vfio_pci_device *vdev = device_data;
> > > > @@ -1253,8 +1323,14 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
> > > >  	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> > > >  	vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
> > > >  
> > > > +	vma->vm_ops = &vfio_pci_mmap_ops;
> > > > +
> > > > +#if 1
> > > > +	return 0;
> > > > +#else
> > > >  	return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> > > > -			       req_len, vma->vm_page_prot);
> > > > +			       vma->vm_end - vma->vm_start, vma->vm_page_prot);    
> > > 
> > > The remap_pfn_range here is what tells get_user_pages this is a
> > > non-struct page mapping:
> > > 
> > > 	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
> > > 
> > > Which has to be set when the VMA is created, they shouldn't be
> > > modified during fault.  
> > 
> > Aha, thanks Jason!  So fundamentally, pin_user_pages_remote() should
> > never have been faulting in this vma since the pages are non-struct
> > page backed.   
> 
> gup should not try to pin them.. I think the VM will still call fault
> though, not sure from memory?

Hmm, at commit 3faa52c03f44 the behavior is that I don't see a fault on
pin, maybe that's a bug.  But trying to rebase to current top of tree,
now my DMA mapping gets an -EFAULT, so something is still funky :-\
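
To make the expected behavior concrete, here is a minimal userspace
model of what I understand the flow to be: once remap_pfn_range() has
marked the vma, gup refuses it with -EFAULT, and the caller falls back
to computing the pfn from the vma itself.  This is only a sketch, not
kernel code.  Every model_* name, the MODEL_* flag values, and the
4 KiB page shift are assumptions made up for illustration; the real
logic lives in mm/gup.c and drivers/vfio/vfio_iommu_type1.c.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel's vm_flags bits (values assumed). */
#define MODEL_VM_PFNMAP     0x00000400UL
#define MODEL_VM_IO         0x00004000UL
#define MODEL_VM_DONTEXPAND 0x00040000UL
#define MODEL_VM_DONTDUMP   0x04000000UL
#define MODEL_PAGE_SHIFT    12 /* assumes 4 KiB pages */

/* Hypothetical, heavily reduced stand-in for struct vm_area_struct. */
struct model_vma {
	uint64_t vm_start;      /* first virtual address of the mapping */
	uint64_t vm_end;        /* one past the last virtual address */
	uint64_t vm_pgoff;      /* for a PFNMAP vma: pfn of the first page */
	unsigned long vm_flags;
};

/* Models the flag update remap_pfn_range() performs at mmap time. */
static void model_remap_pfn_range_flags(struct model_vma *vma)
{
	vma->vm_flags |= MODEL_VM_IO | MODEL_VM_PFNMAP |
			 MODEL_VM_DONTEXPAND | MODEL_VM_DONTDUMP;
}

/* Models the gup-side check: non-struct-page vmas are rejected. */
static int model_gup_check(const struct model_vma *vma)
{
	if (vma->vm_flags & (MODEL_VM_IO | MODEL_VM_PFNMAP))
		return -EFAULT;
	return 0;
}

/*
 * Models the fallback described for vaddr_get_pfn(): if pinning fails
 * and the vma is VM_PFNMAP, derive the pfn from vm_start and vm_pgoff.
 */
static int model_vaddr_get_pfn(const struct model_vma *vma, uint64_t vaddr,
			       uint64_t *pfn)
{
	int ret = model_gup_check(vma);

	if (ret == 0)
		return -ENOSYS; /* struct-page pin path is not modelled here */

	if (vaddr < vma->vm_start || vaddr >= vma->vm_end)
		return ret;     /* address is outside this vma */

	if (vma->vm_flags & MODEL_VM_PFNMAP) {
		*pfn = ((vaddr - vma->vm_start) >> MODEL_PAGE_SHIFT) +
		       vma->vm_pgoff;
		return 0;
	}
	return ret;
}
```

In this model, skipping model_remap_pfn_range_flags() at mmap time
leaves the vma looking like ordinary memory, so the gup-side check no
longer rejects it and the fallback is never reached.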

> > Maybe I was just getting lucky before this commit.  For a
> > VM_PFNMAP, vaddr_get_pfn() only needs pin_user_pages_remote() to return
> > error and the vma information that we set up in vfio_pci_mmap().  
> 
> I've written on this before, vfio should not be passing pages to the
> iommu that it cannot pin eg it should not touch VM_PFNMAP vma's in the
> first place.
> 
> It is a use-after-free security issue the way it is..

Where is the use-after-free?  Here I'm trying to map device mmio space
through the iommu, which we need to enable p2p when the user owns
multiple devices.  The device is owned by the user, bound to vfio-pci,
and can't be unbound while the user has it open.  The iommu mappings
are torn down on release.  I guess I don't understand the problem.

> > only need the fault handler to trigger for user access, which is what I
> > see with this change.  That should work for me.
> >   
> > > Also the vma code above looked a little strange to me, if you do send
> > > something like this cc me and I can look at it. I did some work like
> > > this for rdma a while ago..  
> > 
> > Cool, I'll do that.  I'd like to be able to zap the vmas from user
> > access at a later point and I have doubts that I'm holding the
> > refs/locks that I need to for that.  Thanks,  
> 
> Check rdma_umap_ops, it does what you described (actually it replaces
> them with 0 page, but along the way it zaps too).

Ok, thanks,

Alex


