kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: <cohuck@redhat.com>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <peterx@redhat.com>
Subject: Re: [RFC PATCH 10/10] vfio/type1: Register device notifier
Date: Wed, 24 Feb 2021 20:22:16 -0400	[thread overview]
Message-ID: <20210225002216.GQ4247@nvidia.com> (raw)
In-Reply-To: <20210224145508.1f0edb06@omen.home.shazbot.org>

On Wed, Feb 24, 2021 at 02:55:08PM -0700, Alex Williamson wrote:

> > > +static bool strict_mmio_maps = true;
> > > +module_param_named(strict_mmio_maps, strict_mmio_maps, bool, 0644);
> > > +MODULE_PARM_DESC(strict_mmio_maps,
> > > +		 "Restrict to safe DMA mappings of device memory (true).");  
> > 
> > I think this should be a kconfig, historically we've required kconfig
> > to opt-in to unsafe things that could violate kernel security. Someone
> > building a secure boot trusted kernel system should not have an
> > options for userspace to just turn off protections.
> 
> It could certainly be further protected that this option might not
> exist based on a Kconfig, but I think we're already risking breaking
> some existing users and I'd rather allow it with an opt-in (like we
> already do for lack of interrupt isolation), possibly even with a
> kernel taint if used, if necessary.

Makes me nervous, security should not be optional.

> > I'd prefer this was written a bit differently, I would like it very
> > much if this doesn't mis-use follow_pte() by returning pfn outside
> > the lock.
> > 
> > vaddr_get_bar_pfn(..)
> > {
> >         vma = find_vma_intersection(mm, vaddr, vaddr + 1);
> > 	if (!vma)
> >            return -ENOENT;
> >         if ((vma->vm_flags & VM_DENYWRITE) && (prot & PROT_WRITE)) // Check me
> >            return -EFAULT;
> >         device = vfio_device_get_from_vma(vma);
> > 	if (!device)
> >            return -ENOENT;
> > 
> > 	/*
> >          * Now do the same as vfio_pci_mmap_fault() - the vm_pgoff must
> > 	 * be the physical pfn when using this mechanism. Delete follow_pte entirely()
> >          */
> >         pfn = (vaddr - vma->vm_start)/PAGE_SIZE + vma->vm_pgoff
> > 	
> >         /* de-dup device and record that we are using device's pages in the
> > 	   pfnmap */
> >         ...
> > }
> 
> 
> This seems to undo both:
> 
> 5cbf3264bc71 ("vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()")

No, the bug this commit described is fixed by calling
vfio_device_get_from_vma() which excludes all non-VFIO VMAs already.

We can assert that the vm_pgoff is in a specific format because it is
a VFIO owned VMA and must follow the rules to be part of the address
space. See my last email

Here I was suggesting to use the vm_pgoff == PFN rule, but since
you've clarified that doesn't work we'd have to determine the PFN from
the region number through the vfio_device instead.

> (which also suggests we are going to break users without the module
> option opt-in above)

Not necessarily, this is complaining vfio crashes, it doesn't say they
actually needed the IOMMU to work on those VMAs because they are doing
P2P DMA.

I think, if this does break someone, they are on a real fringe and
must have already modified their kernel, so a kconfig is the right
approach. It is pretty hard to get non-GUP'able DMA'able memory into a
process with the stock kernel.

Generally speaking, I think Linus has felt security bug fixes like
this are more on the OK side of things to break fringe users.

> And:
> 
> 41311242221e ("vfio/type1: Support faulting PFNMAP vmas")
> 
> So we'd have an alternate path in the un-safe mode and we'd lose the
> ability to fault in mappings.

As above we already exclude VMAs that are not from VFIO, and VFIO
sourced VMA's do not meaningfully implement fault for this use
case. So calling fixup_user_fault() is pointless.

Peter just did this so we could ask him what it was for..

I feel pretty strongly that removing the call to follow_pte is
important here. Even if we do cover all the issues with mis-using the
API it just makes a maintenance problem to leave it in.

Jason

  reply	other threads:[~2021-02-25  0:23 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-22 16:50 [RFC PATCH 00/10] vfio: Device memory DMA mapping improvements Alex Williamson
2021-02-22 16:50 ` [RFC PATCH 01/10] vfio: Create vfio_fs_type with inode per device Alex Williamson
2021-02-26  5:38   ` Christoph Hellwig
2021-02-26 13:15     ` Jason Gunthorpe
2021-02-22 16:50 ` [RFC PATCH 02/10] vfio: Update vfio_add_group_dev() API Alex Williamson
2021-02-22 17:01   ` Jason Gunthorpe
2021-02-22 16:50 ` [RFC PATCH 03/10] vfio: Export unmap_mapping_range() wrapper Alex Williamson
2021-02-22 16:51 ` [RFC PATCH 04/10] vfio/pci: Use vfio_device_unmap_mapping_range() Alex Williamson
2021-02-22 17:22   ` Jason Gunthorpe
2021-02-24 21:55     ` Alex Williamson
2021-02-25  0:57       ` Jason Gunthorpe
2021-02-22 16:51 ` [RFC PATCH 05/10] vfio: Create a vfio_device from vma lookup Alex Williamson
2021-02-22 17:29   ` Jason Gunthorpe
2021-02-24 21:55     ` Alex Williamson
2021-02-25  0:06       ` Jason Gunthorpe
2021-02-25 22:21         ` Alex Williamson
2021-02-25 23:49           ` Jason Gunthorpe
2021-03-04 21:37             ` Alex Williamson
2021-03-04 23:16               ` Jason Gunthorpe
2021-03-05  0:07                 ` Alex Williamson
2021-03-05  0:36                   ` Jason Gunthorpe
2021-02-22 16:51 ` [RFC PATCH 06/10] vfio: Add a device notifier interface Alex Williamson
2021-02-22 16:51 ` [RFC PATCH 07/10] vfio/pci: Notify on device release Alex Williamson
2021-02-22 16:52 ` [RFC PATCH 08/10] vfio/type1: Refactor pfn_list clearing Alex Williamson
2021-02-22 16:52 ` [RFC PATCH 09/10] vfio/type1: Pass iommu and dma objects through to vaddr_get_pfn Alex Williamson
2021-02-22 16:52 ` [RFC PATCH 10/10] vfio/type1: Register device notifier Alex Williamson
2021-02-22 17:55   ` Jason Gunthorpe
2021-02-24 21:55     ` Alex Williamson
2021-02-25  0:22       ` Jason Gunthorpe [this message]
2021-02-25 17:54         ` Peter Xu
2021-02-25 18:19           ` Jason Gunthorpe
2021-02-25 19:06             ` Peter Xu
2021-02-25 19:17               ` Jason Gunthorpe
2021-02-25 19:54                 ` Peter Xu
2021-02-26  5:47     ` Christoph Hellwig
2021-02-22 18:00 ` [RFC PATCH 00/10] vfio: Device memory DMA mapping improvements Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210225002216.GQ4247@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterx@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).