From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>,
Kevin Tian <kevin.tian@intel.com>,
Chaitanya Kulkarni <kch@nvidia.com>,
Ashok Raj <ashok.raj@intel.com>,
kvm@vger.kernel.org, rafael@kernel.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Cornelia Huck <cohuck@redhat.com>, Will Deacon <will@kernel.org>,
linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
Alex Williamson <alex.williamson@redhat.com>,
Jacob jun Pan <jacob.jun.pan@intel.com>,
linux-pci@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
Diana Craciun <diana.craciun@oss.nxp.com>
Subject: Re: [PATCH 03/11] PCI: pci_stub: Suppress kernel DMA ownership auto-claiming
Date: Mon, 15 Nov 2021 17:19:52 -0400 [thread overview]
Message-ID: <20211115211952.GA2534655@nvidia.com> (raw)
In-Reply-To: <ab3ae3d1-c4d7-251a-fecc-d21f6b9d87a5@arm.com>
On Mon, Nov 15, 2021 at 08:58:19PM +0000, Robin Murphy wrote:
> > The above scenarios are already blocked by the kernel with
> > LOCKDOWN_DEV_MEM - yes there are historical ways to violate kernel
> > integrity, and these days they almost all have mitigation. I would
> > consider any kernel integrity violation to be a bug today if
> > LOCKDOWN_INTEGRITY_MAX is enabled.
> >
> > I don't know why you bring this up?
>
> Because as soon as anyone brings up "security" I'm not going to blindly
> accept the implicit assumption that VFIO is the only possible way to get one
> device to mess with another. That was just a silly example in the most basic
> terms, and obviously I don't expect well-worn generic sysfs interfaces to be
> a genuine threat, but how confident are you that no other subsystem- or
> driver-level interfaces past present and future can ever be tricked into p2p
> DMA?
Given the definition of LOCKDOWN_INTEGRITY_MAX I will consider any
past/present/future p2p attacks as definitive kernel bugs.
Generally, allowing a device to do arbitary DMA to a userspace
controlled address is a pretty serious bug, and directly attacking the
kernel memory is a much more interesting and serious threat vector.
> > What is the preference then? This is the only working API today,
> > right?
>
> I believe the intent was that everyone should move to
> iommu_group_get()/iommu_attach_group() - precisely *because*
> iommu_attach_device() can't work sensibly for multi-device groups.
And iommu_attach_group() can't work sensibly for anything except VFIO
today, so hum :)
> > > Indeed I wasn't imagining it changing any ownership, just preventing a group
> > > from being attached to a non-default domain if it contains devices bound to
> > > different incompatible drivers.
> >
> > So this could solve just the domain/DMA API problem, but it leaves the
> > MMIO peer-to-peer issue unsolved, and it gives no tools to solve it in
> > a layered way.
> >
> > This seems like half an idea, do you have a solution for the rest?
>
> Tell me how the p2p DMA issue can manifest if device A is prohibited from
> attaching to VFIO's unmanaged domain while device B still has a driver
> bound, and thus would fail to be assigned to userspace in the first place.
> And conversely if non-VFIO drivers are still prevented from binding to
> device B while device A remains attached to the VFIO domain.
You've assumed that a group is continuously attached to the same
domain during the entire period that userspace has MMIO.
Any domain detatch creates a race where a kernel driver can jump in
and bind, while user space continues to have MMIO control over a
device. That violates the security invariant.
Many new flows, like PASID support, are requiring dynamically changing
the domain bound to a group.
If you want to go in this direction then we also need to have some
kind of compare and swap operation for the domain currently bound to a
group.
From a security perspective I disliek this idea a lot. Instead of
having nice clear barriers indicating the security domain we have a
very subtle 'so long as a domain is attached' idea, which is going to
get broken.
> > The concept of DMA USER is important here, and it is more than just
> > which domain is attached.
>
> Tell me how a device would be assigned to userspace while its group is still
> attached to a kernel-managed default domain.
>
> As soon as anyone calls iommu_attach_group() - or indeed
> iommu_attach_device() if more devices may end up hotplugged into the same
> group later - *that's* when the door opens for potential subversion of any
> kind, without ever having to leave kernel space.
The approach in this series can solve both, attach_device could switch
the device to user mode and it will block future hot plugged kernel
drivers.
> > VFIO also has logic related to the file
>
> Yes, because unsurprisingly VFIO code is tailored for the specific case of
> VFIO usage rather than anything more general.
VFIO represents this class of users exposing the IOMMU to userspace,
I say it is general of that use class.
> > It isn't muddying the water, it is providing common security code that
> > is easy to undertand.
> >
> > *Which* userspace FD/process owns the iommu_group is important
> > security information because we can't have process A do DMA attacks on
> > some other process B.
>
> Tell me how a single group could be attached to two domains representing two
> different process address spaces at once.
Again you are focused on domains and ignoring MMIO.
Requiring the users of the API to continuously assert a non-default
domain is a non-trivial ask.
> In case this concept wasn't as clear as I thought, which seems to be so:
>
> | dev->iommu_group->domain | dev->driver
> DMA_OWNER_NONE | default | unbound
> DMA_OWNER_KERNEL | default | bound
> DMA_OWNER_USER | non-default | bound
Unfortunately this can't use dev->driver. Reading dev->driver of every
device in the group requires holding the device_lock. really_probe()
already holds the device lock so this becomes a user triggerable ABBA
deadlock when scaled up to N devices.
This is why this series uses only the group mutex and tracks if
drivers are bound inside the group. I view that as unavoidable.
Jason
next prev parent reply other threads:[~2021-11-15 21:28 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-15 2:05 [PATCH 00/11] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
2021-11-15 2:05 ` [PATCH 01/11] iommu: Add device dma ownership set/release interfaces Lu Baolu
2021-11-15 13:14 ` Christoph Hellwig
2021-11-16 1:57 ` Lu Baolu
2021-11-16 13:46 ` Jason Gunthorpe
2021-11-17 5:22 ` Lu Baolu
2021-11-17 13:35 ` Jason Gunthorpe
2021-11-18 1:12 ` Lu Baolu
2021-11-18 14:10 ` Jason Gunthorpe
2021-11-18 2:39 ` Tian, Kevin
2021-11-18 13:33 ` Jason Gunthorpe
2021-11-19 5:44 ` Tian, Kevin
2021-11-19 11:14 ` Lu Baolu
2021-11-19 15:06 ` Jörg Rödel
2021-11-19 15:43 ` Jason Gunthorpe
2021-11-20 11:16 ` Lu Baolu
2021-11-19 12:56 ` Jason Gunthorpe
2021-11-15 20:38 ` Bjorn Helgaas
2021-11-16 1:52 ` Lu Baolu
2021-11-15 2:05 ` [PATCH 02/11] driver core: Set DMA ownership during driver bind/unbind Lu Baolu
2021-11-15 6:59 ` Greg Kroah-Hartman
2021-11-15 13:20 ` Christoph Hellwig
2021-11-15 13:38 ` Jason Gunthorpe
2021-11-15 13:19 ` Christoph Hellwig
2021-11-15 13:24 ` Jason Gunthorpe
2021-11-15 15:37 ` Robin Murphy
2021-11-15 15:56 ` Jason Gunthorpe
2021-11-15 18:15 ` Christoph Hellwig
2021-11-15 18:35 ` Robin Murphy
2021-11-15 19:39 ` Jason Gunthorpe
2021-11-15 2:05 ` [PATCH 03/11] PCI: pci_stub: Suppress kernel DMA ownership auto-claiming Lu Baolu
2021-11-15 13:21 ` Christoph Hellwig
2021-11-15 13:31 ` Jason Gunthorpe
2021-11-15 15:14 ` Robin Murphy
2021-11-15 16:17 ` Jason Gunthorpe
2021-11-15 17:54 ` Robin Murphy
2021-11-15 18:19 ` Christoph Hellwig
2021-11-15 18:44 ` Robin Murphy
2021-11-15 19:22 ` Jason Gunthorpe
2021-11-15 20:58 ` Robin Murphy
2021-11-15 21:19 ` Jason Gunthorpe [this message]
2021-11-15 20:48 ` Bjorn Helgaas
2021-11-15 22:17 ` Bjorn Helgaas
2021-11-16 6:05 ` Lu Baolu
2021-11-15 2:05 ` [PATCH 04/11] PCI: portdrv: " Lu Baolu
2021-11-15 20:44 ` Bjorn Helgaas
2021-11-16 7:24 ` Lu Baolu
2021-11-16 20:22 ` Bjorn Helgaas
2021-11-16 20:48 ` Jason Gunthorpe
2021-11-15 2:05 ` [PATCH 05/11] iommu: Add security context management for assigned devices Lu Baolu
2021-11-15 13:22 ` Christoph Hellwig
2021-11-16 7:25 ` Lu Baolu
2021-11-15 2:05 ` [PATCH 06/11] iommu: Expose group variants of dma ownership interfaces Lu Baolu
2021-11-15 13:27 ` Christoph Hellwig
2021-11-16 9:42 ` Lu Baolu
2021-11-15 2:05 ` [PATCH 07/11] vfio: Use DMA_OWNER_USER to declaim passthrough devices Lu Baolu
2021-11-15 2:05 ` [PATCH 08/11] vfio: Remove use of vfio_group_viable() Lu Baolu
2021-11-15 2:05 ` [PATCH 09/11] vfio: Delete the unbound_list Lu Baolu
2021-11-15 2:05 ` [PATCH 10/11] vfio: Remove iommu group notifier Lu Baolu
2021-11-15 2:05 ` [PATCH 11/11] iommu: Remove iommu group changes notifier Lu Baolu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211115211952.GA2534655@nvidia.com \
--to=jgg@nvidia.com \
--cc=alex.williamson@redhat.com \
--cc=ashok.raj@intel.com \
--cc=bhelgaas@google.com \
--cc=cohuck@redhat.com \
--cc=diana.craciun@oss.nxp.com \
--cc=gregkh@linuxfoundation.org \
--cc=hch@infradead.org \
--cc=iommu@lists.linux-foundation.org \
--cc=jacob.jun.pan@intel.com \
--cc=kch@nvidia.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=rafael@kernel.org \
--cc=robin.murphy@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).