* [RFC] Independent use of IOMMU groups
@ 2015-11-05 17:54 Alex Williamson
       [not found] ` <1446746079.8831.82.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2015-11-05 17:54 UTC (permalink / raw)
  To: iommu; +Cc: Paolo Bonzini

Hi,

We have a couple of things in-flight that are trying to make use of
IOMMU groups, independent of the rest of the IOMMU API.  One is the
proposed VFIO No-IOMMU hack that will create an IOMMU group for a
non-IOMMU backed device so that it can operate within vfio and be
exposed via vfio-pci:

https://lkml.org/lkml/2015/11/4/437

This comes with the caveats that DMA for the device is unsafe and that
we taint the kernel.  But it at least gives users who are already doing
this sort of thing a more featureful interface without duplicating tons
of code into UIO, and it gives them a consistent device model to easily
move to when an IOMMU is supported.

When we do this, vfio creates the IOMMU group for the device when it
binds to the vfio bus driver (vfio-pci) and removes it when unbound.
For that period, we own the device and don't interact with the IOMMU API
for any sort of mapping.

Another idea that's floating around is that vfio could actually expose
virtual devices to a user, think for instance vGPUs in a non-SR-IOV
scenario.  A struct device is created where portions of the device are
backed directly by some subset of a physical device while other parts
may be emulated by the vfio bus driver.  The virtual device needs an
IOMMU group to participate in the vfio framework, but the platform
itself doesn't necessarily need an IOMMU.  In this case isolation of the
virtual device might be provided by policing of the user programming of
the device and MMU control on the physical device itself.  The vfio
IOMMU backend for such a device would use device specific programming
rather than making use of the IOMMU API.
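
To make the "policing" idea concrete, here is a toy model in Python,
nothing more than a sketch under stated assumptions.  The class name
PolicedBackend and its methods are hypothetical and do not correspond to
any existing vfio interface.  The backend simply remembers which ranges
the user has mapped and refuses any DMA programming that falls outside
them, which is the role an IOMMU would otherwise play in hardware.

```python
class PolicedBackend:
    """Toy stand-in for a device-specific vfio IOMMU backend (hypothetical)."""

    def __init__(self):
        # Half-open (start, end) intervals the user has asked to map.
        self._ranges = []

    def map(self, iova, size):
        """Record a user mapping; no IOMMU API call is involved."""
        if size <= 0:
            raise ValueError("size must be positive")
        self._ranges.append((iova, iova + size))

    def unmap(self, iova, size):
        """Drop a mapping that was previously recorded with map()."""
        self._ranges.remove((iova, iova + size))

    def allows(self, addr, length):
        """True if [addr, addr + length) lies entirely inside one mapping."""
        return length > 0 and any(
            start <= addr and addr + length <= end
            for start, end in self._ranges
        )
```

A bus driver emulating part of the device would call something like
allows() before letting a user supplied address reach the physical
device.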

The IOMMU group ID space is global, so creating these groups will
necessarily cause them to appear in /sys/kernel/iommu_groups/, and I
want to make sure there are no objections to these sorts of uses.  In
one scenario the device is real, but the IOMMU group is only present
while the device is bound to a driver that knows the usage restrictions;
in the other the device is virtual, created by the driver itself, which
is therefore aware of its restrictions.  Comments?
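
For reference, this is roughly what a user would see.  A minimal Python
sketch, assuming the usual sysfs layout in which each group is a
numbered directory containing a devices/ subdirectory of device links.
The function name list_iommu_groups and the root parameter are
illustrative only.

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_name: sorted device names} for every group under root."""
    groups = {}
    if not os.path.isdir(root):
        # No groups registered, or sysfs is not mounted at the usual place.
        return groups
    for name in sorted(os.listdir(root)):
        devdir = os.path.join(root, name, "devices")
        if os.path.isdir(devdir):
            groups[name] = sorted(os.listdir(devdir))
    return groups
```

A group created for a no-iommu or virtual device would show up in this
listing just like any other group, which is why the question of
objections matters.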

A question I'd expect to be asked is why not create a new bus_type and
register an IOMMU for it?  In the no-iommu case, the bus for the device
already exists and does not have an IOMMU present.  Registering an IOMMU
for that bus_type risks other devices on that bus_type attempting to do
real IOMMU API tasks.  In the virtual case, this is more reasonable
since all the virtual devices could be children on a new bus_type, but
trying to mate a general purpose API to a device that really only
requires very special purpose mappings seems like unnecessary overhead.

If there are any gotchas that I'm missing, please let me know.  Thanks,

Alex


* Re: [RFC] Independent use of IOMMU groups
       [not found] ` <1446746079.8831.82.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-11-06 12:29   ` Joerg Roedel
       [not found]     ` <20151106122939.GA13027-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Joerg Roedel @ 2015-11-06 12:29 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Paolo Bonzini, iommu

Hi Alex,

On Thu, Nov 05, 2015 at 10:54:39AM -0700, Alex Williamson wrote:
> We have a couple things in-flight that are trying to make use of IOMMU
> groups, independent of the rest of the IOMMU API.  One is the proposed
> VFIO No-IOMMU hack that will create an IOMMU group for a non-IOMMU
> backed device in order to make it operate within vfio and exposed via
> vfio-pci:
> 
> https://lkml.org/lkml/2015/11/4/437

Do you really need iommu-groups for a non-IOMMU vfio backend? VFIO has
its own representation of groups (iirc they map 1-1 to iommu-groups).
Can this concept in VFIO not be made more independent of iommu-groups?

I think having iommu-groups in sysfs without an iommu in the system is
pretty confusing for the user. Not to mention that the usual iommu
grouping code makes no sense anymore, as there is no isolation at all :)


	Joerg


* Re: [RFC] Independent use of IOMMU groups
       [not found]     ` <20151106122939.GA13027-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2015-11-06 15:35       ` Alex Williamson
       [not found]         ` <1446824140.8831.168.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2015-11-06 15:35 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Paolo Bonzini, iommu

On Fri, 2015-11-06 at 13:29 +0100, Joerg Roedel wrote:
> Hi Alex,
> 
> On Thu, Nov 05, 2015 at 10:54:39AM -0700, Alex Williamson wrote:
> > We have a couple things in-flight that are trying to make use of IOMMU
> > groups, independent of the rest of the IOMMU API.  One is the proposed
> > VFIO No-IOMMU hack that will create an IOMMU group for a non-IOMMU
> > backed device in order to make it operate within vfio and exposed via
> > vfio-pci:
> > 
> > https://lkml.org/lkml/2015/11/4/437
> 
> Do you really need iommu-groups for non-IOMMU vfio backend? VFIO has its
> own representation of groups (iirc they map 1-1 to iommu-groups). Can
> this concept in VFIO not be made more independent of iommu-groups?
> 
> I think having iommu-groups in sysfs without an iommu in the system is
> pretty confusing for the user. Not to say that the usual iommu grouping
> code makes no sense anymore, as there is no isolation at all :)

Hi Joerg,

VFIO is really built on iommu groups, so making a vfio group independent
of iommu groups is a difficult proposition.  In introducing the no-iommu
vfio code, I accept that people are going to run userspace drivers
without iommu protection, regardless of whether it's supportable.  By
using the vfio device interface, we're at least pushing them towards
code that does have a supported use case.  So my goal there is to enable
no-iommu mode in a way that is compact (I'm only willing to invest
limited lines of code to enable this) and does not undermine the
foundation of vfio.

I also do everything I can to make it clear that this is unsafe, from
the naming of the opt-in module parameter, to the tainting of the kernel
when a no-iommu group is created, to the dev_warn with that group
creation and later when the device is opened, to using a differently
named vfio device node for the group and allowing only a no-iommu IOMMU
backend for the group.  There is no chance that a user can accidentally
operate on a no-iommu vfio group and there are breadcrumbs left behind
even in the normal process of using them.

Also, as I mentioned previously, the lifetime of this no-iommu group is
tied to the device being bound to the vfio driver, so no other drivers
would have access to the iommu group.  The user has already had to opt
their system in and generate a dmesg log and kernel taint before they
even get the chance to be confused by that iommu group.  Thanks,

Alex
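
A minimal Python sketch of how those breadcrumbs could be checked from
userspace.  The specific names are assumptions taken from how no-iommu
mode appears in later mainline kernels rather than from this thread: the
vfio module parameter enable_unsafe_noiommu_mode, the TAINT_USER flag
(bit 6 of /proc/sys/kernel/tainted), and device nodes named
/dev/vfio/noiommu-$GROUP.  The function name noiommu_breadcrumbs is
purely illustrative.

```python
import os

# Assumption: no-iommu group creation sets TAINT_USER, which is bit 6.
TAINT_USER_BIT = 6


def _read(path):
    """Return the stripped contents of path, or None if it is unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def noiommu_breadcrumbs(
    param="/sys/module/vfio/parameters/enable_unsafe_noiommu_mode",
    tainted="/proc/sys/kernel/tainted",
    devdir="/dev/vfio",
):
    """Collect the traces that no-iommu use is expected to leave behind."""
    opt_in = _read(param)
    taint = _read(tainted)
    try:
        nodes = sorted(n for n in os.listdir(devdir) if n.startswith("noiommu-"))
    except OSError:
        nodes = []
    return {
        # Boolean module parameters read back as "Y" or "N".
        "opted_in": opt_in == "Y",
        "tainted": bool(taint) and taint.isdigit()
        and bool((int(taint) >> TAINT_USER_BIT) & 1),
        "nodes": nodes,
    }
```

Note that TAINT_USER can be set for other reasons too, so the taint bit
on its own is only a hint; the differently named device node is the
stronger signal.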


* Re: [RFC] Independent use of IOMMU groups
       [not found]         ` <1446824140.8831.168.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-11-27 15:39           ` Joerg Roedel
       [not found]             ` <20151127153910.GL2064-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Joerg Roedel @ 2015-11-27 15:39 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Paolo Bonzini, iommu

Hi Alex,

On Fri, Nov 06, 2015 at 08:35:40AM -0700, Alex Williamson wrote:
> VFIO is really built on iommu groups, so making a vfio group independent
> of iommu groups is a difficult proposition.

I have been thinking about the relation between vfio device groups and
iommu-groups lately, because at least for PCI the iommu-grouping is too
coarse grained. I ran into this with the default-domain approach I am
working on.

Grouping devices together that have different request-ids (multifunction
and acs based grouping) only makes sense when the device is controlled
by an untrusted piece of software, in our case userspace or a KVM guest.
The device drivers in Linux are trusted, and this coarse grained
grouping becomes problematic, because it forces more devices into a
single domain, which can become a bottleneck for DMA-API allocations.

I have been thinking about moving the multi-function and acs grouping
into vfio code, meaning that a vfio-group contains more than one
iommu-group. The problem with this is that iommu-groups are exposed
in sysfs and thus became a userspace ABI.

So the vfio-group code might need changes anyway which could solve the
above problem too, no? I am just not sure yet what the best way is to
solve it.


	Joerg


* Re: [RFC] Independent use of IOMMU groups
       [not found]             ` <20151127153910.GL2064-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2015-12-02 15:58               ` Alex Williamson
  0 siblings, 0 replies; 5+ messages in thread
From: Alex Williamson @ 2015-12-02 15:58 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Paolo Bonzini, iommu

On Fri, 2015-11-27 at 16:39 +0100, Joerg Roedel wrote:
> Hi Alex,
> 
> On Fri, Nov 06, 2015 at 08:35:40AM -0700, Alex Williamson wrote:
> > VFIO is really built on iommu groups, so making a vfio group independent
> > of iommu groups is a difficult proposition.
> 
> I have been thinking about the relation between vfio device groups and
> iommu-groups lately, because at least for PCI the iommu-grouping is too
> coarse grained. I ran into this with the default-domain approach I am
> working on.
> 
> Grouping devices together that have different request-ids (multifunction
> and acs based grouping) only makes sense when the device is controlled
> by an untrusted piece of software, in our case userspace or a KVM guest.
> The device drivers in Linux are trusted, and this coarse grained
> grouping becomes problematic, because it forces more devices into a
> single domain, which can become a bottleneck for DMA-API allocations.
> 
> I have been thinking about moving the multi-function and acs grouping
> into vfio code, meaning that a vfio-group contains more than one
> iommu-group. The problem with this is that iommu-groups are exposed
> in sysfs and thus became a userspace ABI.
> 
> So the vfio-group code might need changes anyway which could solve the
> above problem too, no? I am just not sure yet what the best way is to
> solve it.

Hi Joerg,

That's a hard one.  As you say, iommu groups are really a userspace ABI
and tightly integrated into the mapping of vfio groups, so I don't
really think we have much flexibility in re-defining an iommu group.
The original intent with putting the grouping logic in the iommu drivers
and core code was that vfio isn't smart enough to be able to determine
both the iommu visibility and topology based isolation for any given
architecture.  I think that's still true.  We don't really want to
enable vfio on platforms that haven't given the issue sufficient
consideration to enable the iommu API for devices.

On the other hand, it doesn't make a whole lot of sense for native
kernel drivers to care about topology based isolation.  It would be
preferable to fully isolate a device, but we do manage the IOVA address
space for native drivers, so unintentional peer-to-peer shouldn't really
be a possibility.

It makes sense to me to think about an iommu group as a set of one or
more iommu granules, where each granule is the granularity of the iommu
visibility.  A granule probably also has a relation to DMA aliases.  So
perhaps the granule encompasses all DMA related visibility issues and
the group is an overlay which takes topology into account when the IOVA
space is defined by the user (and malicious DMA needs to be considered
as well).
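
As a rough illustration of that split, here is a small Python model.
It is only a sketch: the names build_groups, granules and not_isolated
are hypothetical, and how real granules or topology information would be
derived is left open.  Each granule is a set of devices the iommu cannot
tell apart, and granules that the topology cannot isolate from each
other are merged into one group with a simple union-find.

```python
def build_groups(granules, not_isolated):
    """Overlay topology-based groups on top of iommu granules.

    granules:     dict mapping granule id -> set of device names that
                  share iommu visibility (for example DMA aliases).
    not_isolated: iterable of (granule_a, granule_b) pairs that the
                  topology cannot isolate from each other.
    Returns a sorted list of groups, each a sorted list of granule ids.
    """
    parent = {g: g for g in granules}

    def find(g):
        # Follow parent links to the representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in not_isolated:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = {}
    for g in granules:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())
```

In this picture a native kernel driver would only ever deal with a
single granule, while a user-owned device would be handed the whole
group that build_groups() places its granule in.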

So maybe the first step is to create that dividing line and figure out
what granules look like and how we can more explicitly expose groups on
top of them.  Easier said than done, I'm sure.  Thanks,

Alex

PS - sorry for the delay, was off on holiday.
