kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Kevin Tian <kevin.tian@intel.com>,
	Chaitanya Kulkarni <kch@nvidia.com>,
	Ashok Raj <ashok.raj@intel.com>,
	kvm@vger.kernel.org, rafael@kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Cornelia Huck <cohuck@redhat.com>, Will Deacon <will@kernel.org>,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Jacob jun Pan <jacob.jun.pan@intel.com>,
	linux-pci@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	Diana Craciun <diana.craciun@oss.nxp.com>
Subject: Re: [PATCH 03/11] PCI: pci_stub: Suppress kernel DMA ownership auto-claiming
Date: Mon, 15 Nov 2021 17:19:52 -0400	[thread overview]
Message-ID: <20211115211952.GA2534655@nvidia.com> (raw)
In-Reply-To: <ab3ae3d1-c4d7-251a-fecc-d21f6b9d87a5@arm.com>

On Mon, Nov 15, 2021 at 08:58:19PM +0000, Robin Murphy wrote:
> > The above scenarios are already blocked by the kernel with
> > LOCKDOWN_DEV_MEM - yes there are historical ways to violate kernel
> > integrity, and these days they almost all have mitigation. I would
> > consider any kernel integrity violation to be a bug today if
> > LOCKDOWN_INTEGRITY_MAX is enabled.
> > 
> > I don't know why you bring this up?
> 
> Because as soon as anyone brings up "security" I'm not going to blindly
> accept the implicit assumption that VFIO is the only possible way to get one
> device to mess with another. That was just a silly example in the most basic
> terms, and obviously I don't expect well-worn generic sysfs interfaces to be
> a genuine threat, but how confident are you that no other subsystem- or
> driver-level interfaces past present and future can ever be tricked into p2p
> DMA?

Given the definition of LOCKDOWN_INTEGRITY_MAX I will consider any
past/present/future p2p attacks as definitive kernel bugs.

Generally, allowing a device to do arbitary DMA to a userspace
controlled address is a pretty serious bug, and directly attacking the
kernel memory is a much more interesting and serious threat vector.

> > What is the preference then? This is the only working API today,
> > right?
> 
> I believe the intent was that everyone should move to
> iommu_group_get()/iommu_attach_group() - precisely *because*
> iommu_attach_device() can't work sensibly for multi-device groups.

And iommu_attach_group() can't work sensibly for anything except VFIO
today, so hum :)

> > > Indeed I wasn't imagining it changing any ownership, just preventing a group
> > > from being attached to a non-default domain if it contains devices bound to
> > > different incompatible drivers.
> > 
> > So this could solve just the domain/DMA API problem, but it leaves the
> > MMIO peer-to-peer issue unsolved, and it gives no tools to solve it in
> > a layered way.
> > 
> > This seems like half an idea, do you have a solution for the rest?
> 
> Tell me how the p2p DMA issue can manifest if device A is prohibited from
> attaching to VFIO's unmanaged domain while device B still has a driver
> bound, and thus would fail to be assigned to userspace in the first place.
> And conversely if non-VFIO drivers are still prevented from binding to
> device B while device A remains attached to the VFIO domain.

You've assumed that a group is continuously attached to the same
domain during the entire period that userspace has MMIO.

Any domain detatch creates a race where a kernel driver can jump in
and bind, while user space continues to have MMIO control over a
device. That violates the security invariant.

Many new flows, like PASID support, are requiring dynamically changing
the domain bound to a group.

If you want to go in this direction then we also need to have some
kind of compare and swap operation for the domain currently bound to a
group.

From a security perspective I disliek this idea a lot. Instead of
having nice clear barriers indicating the security domain we have a
very subtle 'so long as a domain is attached' idea, which is going to
get broken.

> > The concept of DMA USER is important here, and it is more than just
> > which domain is attached.
> 
> Tell me how a device would be assigned to userspace while its group is still
> attached to a kernel-managed default domain.
> 
> As soon as anyone calls iommu_attach_group() - or indeed
> iommu_attach_device() if more devices may end up hotplugged into the same
> group later - *that's* when the door opens for potential subversion of any
> kind, without ever having to leave kernel space.

The approach in this series can solve both, attach_device could switch
the device to user mode and it will block future hot plugged kernel
drivers.

> > VFIO also has logic related to the file
> 
> Yes, because unsurprisingly VFIO code is tailored for the specific case of
> VFIO usage rather than anything more general.

VFIO represents this class of users exposing the IOMMU to userspace,
I say it is general of that use class.

> > It isn't muddying the water, it is providing common security code that
> > is easy to undertand.
> > 
> > *Which* userspace FD/process owns the iommu_group is important
> > security information because we can't have process A do DMA attacks on
> > some other process B.
> 
> Tell me how a single group could be attached to two domains representing two
> different process address spaces at once.

Again you are focused on domains and ignoring MMIO.

Requiring the users of the API to continuously assert a non-default
domain is a non-trivial ask.

> In case this concept wasn't as clear as I thought, which seems to be so:
>
>                  | dev->iommu_group->domain | dev->driver
> DMA_OWNER_NONE   |          default         |   unbound
> DMA_OWNER_KERNEL |          default         |    bound
> DMA_OWNER_USER   |        non-default       |    bound

Unfortunately this can't use dev->driver. Reading dev->driver of every
device in the group requires holding the device_lock. really_probe()
already holds the device lock so this becomes a user triggerable ABBA
deadlock when scaled up to N devices.

This is why this series uses only the group mutex and tracks if
drivers are bound inside the group. I view that as unavoidable.

Jason

  reply	other threads:[~2021-11-15 21:28 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-15  2:05 [PATCH 00/11] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
2021-11-15  2:05 ` [PATCH 01/11] iommu: Add device dma ownership set/release interfaces Lu Baolu
2021-11-15 13:14   ` Christoph Hellwig
2021-11-16  1:57     ` Lu Baolu
2021-11-16 13:46       ` Jason Gunthorpe
2021-11-17  5:22         ` Lu Baolu
2021-11-17 13:35           ` Jason Gunthorpe
2021-11-18  1:12             ` Lu Baolu
2021-11-18 14:10               ` Jason Gunthorpe
2021-11-18  2:39         ` Tian, Kevin
2021-11-18 13:33           ` Jason Gunthorpe
2021-11-19  5:44             ` Tian, Kevin
2021-11-19 11:14               ` Lu Baolu
2021-11-19 15:06                 ` Jörg Rödel
2021-11-19 15:43                   ` Jason Gunthorpe
2021-11-20 11:16                   ` Lu Baolu
2021-11-19 12:56               ` Jason Gunthorpe
2021-11-15 20:38   ` Bjorn Helgaas
2021-11-16  1:52     ` Lu Baolu
2021-11-15  2:05 ` [PATCH 02/11] driver core: Set DMA ownership during driver bind/unbind Lu Baolu
2021-11-15  6:59   ` Greg Kroah-Hartman
2021-11-15 13:20     ` Christoph Hellwig
2021-11-15 13:38     ` Jason Gunthorpe
2021-11-15 13:19   ` Christoph Hellwig
2021-11-15 13:24     ` Jason Gunthorpe
2021-11-15 15:37       ` Robin Murphy
2021-11-15 15:56         ` Jason Gunthorpe
2021-11-15 18:15           ` Christoph Hellwig
2021-11-15 18:35           ` Robin Murphy
2021-11-15 19:39             ` Jason Gunthorpe
2021-11-15  2:05 ` [PATCH 03/11] PCI: pci_stub: Suppress kernel DMA ownership auto-claiming Lu Baolu
2021-11-15 13:21   ` Christoph Hellwig
2021-11-15 13:31     ` Jason Gunthorpe
2021-11-15 15:14       ` Robin Murphy
2021-11-15 16:17         ` Jason Gunthorpe
2021-11-15 17:54           ` Robin Murphy
2021-11-15 18:19             ` Christoph Hellwig
2021-11-15 18:44               ` Robin Murphy
2021-11-15 19:22             ` Jason Gunthorpe
2021-11-15 20:58               ` Robin Murphy
2021-11-15 21:19                 ` Jason Gunthorpe [this message]
2021-11-15 20:48   ` Bjorn Helgaas
2021-11-15 22:17   ` Bjorn Helgaas
2021-11-16  6:05     ` Lu Baolu
2021-11-15  2:05 ` [PATCH 04/11] PCI: portdrv: " Lu Baolu
2021-11-15 20:44   ` Bjorn Helgaas
2021-11-16  7:24     ` Lu Baolu
2021-11-16 20:22       ` Bjorn Helgaas
2021-11-16 20:48         ` Jason Gunthorpe
2021-11-15  2:05 ` [PATCH 05/11] iommu: Add security context management for assigned devices Lu Baolu
2021-11-15 13:22   ` Christoph Hellwig
2021-11-16  7:25     ` Lu Baolu
2021-11-15  2:05 ` [PATCH 06/11] iommu: Expose group variants of dma ownership interfaces Lu Baolu
2021-11-15 13:27   ` Christoph Hellwig
2021-11-16  9:42     ` Lu Baolu
2021-11-15  2:05 ` [PATCH 07/11] vfio: Use DMA_OWNER_USER to declaim passthrough devices Lu Baolu
2021-11-15  2:05 ` [PATCH 08/11] vfio: Remove use of vfio_group_viable() Lu Baolu
2021-11-15  2:05 ` [PATCH 09/11] vfio: Delete the unbound_list Lu Baolu
2021-11-15  2:05 ` [PATCH 10/11] vfio: Remove iommu group notifier Lu Baolu
2021-11-15  2:05 ` [PATCH 11/11] iommu: Remove iommu group changes notifier Lu Baolu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211115211952.GA2534655@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=bhelgaas@google.com \
    --cc=cohuck@redhat.com \
    --cc=diana.craciun@oss.nxp.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@intel.com \
    --cc=kch@nvidia.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).