* [RFC PATCH 0/3] IOMMU groups
From: Alex Williamson @ 2012-04-02 21:14 UTC
To: david, dwmw2
Cc: alex.williamson, iommu, qemu-devel, joerg.roedel, kvm, linux-kernel

This series attempts to make IOMMU device grouping a slightly more
integral part of the device model.  iommu_device_groups were originally
introduced to support the VFIO user space driver interface which needs
to understand the granularity of device isolation in order to ensure
security of devices when assigned for user access.  This information
was provided via a simple group identifier from the IOMMU driver,
allowing VFIO to walk devices and assemble groups itself.

The feedback received from this was that groups should be the effective
unit of work for the IOMMU API.  The existing model of allowing domains
to be created and individual devices attached ignores many of the
restrictions of the IOMMU, whether by design, by topology or by
defective devices.  Additionally we should be able to use the grouping
information at the dma ops layer for managing domains and quirking
devices.

This series is a sketch at implementing only those aspects and leaving
everything else about the multifaceted hairball of Isolation groups for
another API.  Please comment and let me know if this seems like the
direction we should be headed.
Thanks,

Alex

---

Alex Williamson (3):
      iommu: Create attach/detach group interface
      iommu: Create basic group infrastructure and update AMD-Vi & Intel VT-d
      iommu: Introduce iommu_group

 drivers/iommu/amd_iommu.c   |   50 ++++++----
 drivers/iommu/intel-iommu.c |   76 ++++++++--------
 drivers/iommu/iommu.c       |  210 ++++++++++++++++++++++++++++++++++++++-----
 include/linux/device.h      |    2
 include/linux/iommu.h       |   43 +++++++++
 5 files changed, 301 insertions(+), 80 deletions(-)
* [RFC PATCH 1/3] iommu: Introduce iommu_group
From: Alex Williamson @ 2012-04-02 21:14 UTC
To: david, dwmw2
Cc: alex.williamson, iommu, qemu-devel, joerg.roedel, kvm, linux-kernel

IOMMUs often do not have visibility of individual devices in the
system.  Due to IOMMU design, bus topology, or device quirks, we can
often only identify groups of devices.  Examples include Intel VT-d &
AMD-Vi, which often have function-level visibility, compared to POWER
partitionable endpoints, which have bridge-level granularity.
PCIe-to-PCI bridges also often cloud the IOMMU's visibility, as it
cannot distinguish devices behind the bridge.  Devices can also
sometimes hurt themselves by initiating DMA using the wrong source ID
on a multifunction PCI device.

IOMMU groups are meant to help solve these problems and hopefully
become the working unit of the IOMMU API.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 include/linux/device.h |    2 ++
 include/linux/iommu.h  |    5 +++++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index b63fb39..6acab1c 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -35,6 +35,7 @@ struct subsys_private;
 struct bus_type;
 struct device_node;
 struct iommu_ops;
+struct iommu_group;

 struct bus_attribute {
 	struct attribute	attr;
@@ -683,6 +684,7 @@ struct device {
 	const struct attribute_group **groups;	/* optional groups */

 	void	(*release)(struct device *dev);
+	struct iommu_group	*iommu_group;
 };

 /* Get the wakeup routines, which depend on struct device */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d937580..2ee375c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -26,6 +26,7 @@
 #define IOMMU_CACHE	(4) /* DMA cache coherency */

 struct iommu_ops;
+struct iommu_group;
 struct bus_type;
 struct device;
 struct iommu_domain;
@@ -78,6 +79,9 @@ struct iommu_ops {
 	unsigned long pgsize_bitmap;
 };

+struct iommu_group {
+};
+
 extern int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops);
 extern bool iommu_present(struct bus_type *bus);
 extern struct iommu_domain *iommu_domain_alloc(struct bus_type *bus);
@@ -140,6 +144,7 @@ static inline int report_iommu_fault(struct iommu_domain *domain,
 #else /* CONFIG_IOMMU_API */

 struct iommu_ops {};
+struct iommu_group {};

 static inline bool iommu_present(struct bus_type *bus)
 {
* Re: [RFC PATCH 1/3] iommu: Introduce iommu_group
From: David Gibson @ 2012-04-18  9:58 UTC
To: Alex Williamson
Cc: dwmw2, iommu, qemu-devel, joerg.roedel, kvm, linux-kernel

On Mon, Apr 02, 2012 at 03:14:40PM -0600, Alex Williamson wrote:
> IOMMUs often do not have visibility of individual devices in the
> system.  Due to IOMMU design, bus topology, or device quirks, we
> can often only identify groups of devices.  Examples include
> Intel VT-d & AMD-Vi which often have function level visibility
> compared to POWER partitionable endpoints which have bridge level
> granularity.

That's a significant oversimplification of the situation on POWER,
although it doesn't really matter in this context.  On older (i.e.
pre-PCI-E) hardware, PEs have either host bridge (i.e. domain)
granularity, or, IIUC, in some cases p2p bridge granularity, using
special p2p bridges, since that's the only real way to do iommu
differentiation without the PCI-E requestor IDs.  This isn't as coarse
as it seems in practice, because the hardware is usually built with a
bridge per physical PCI slot.

On newer PCI-E hardware, the PE granularity is basically a firmware
decision, and can go down to function level.  I believe pHyp puts the
granularity at the bridge level.  Our non-virtualized Linux "firmware"
currently does put it at the function level, but Ben is thinking about
changing that to bridge level: again, because of the hardware design
that isn't as coarse as it seems, and at this level we can guarantee
isolation in hardware to a degree that's not possible at the function
level.

> PCIe-to-PCI bridges also often cloud the IOMMU
> visibility as it cannot distinguish devices behind the bridge.
> Devices can also sometimes hurt themselves by initiating DMA using
> the wrong source ID on a multifunction PCI device.
>
> IOMMU groups are meant to help solve these problems and hopefully
> become the working unit of the IOMMU API.

So far, so simple.  No objections here.  I am trying to work out what
the real difference in approach is in this series from either your or
my earlier isolation group series.  AFAICT it's just that this
approach is explicitly only about IOMMU identity, ignoring (here) any
other factors which might affect isolation.  Or am I missing
something?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: [RFC PATCH 1/3] iommu: Introduce iommu_group
From: Alex Williamson @ 2012-04-18 20:07 UTC
To: David Gibson
Cc: dwmw2, iommu, qemu-devel, joerg.roedel, kvm, linux-kernel

On Wed, 2012-04-18 at 19:58 +1000, David Gibson wrote:
> On Mon, Apr 02, 2012 at 03:14:40PM -0600, Alex Williamson wrote:
> > IOMMUs often do not have visibility of individual devices in the
> > system.  Due to IOMMU design, bus topology, or device quirks, we
> > can often only identify groups of devices.  Examples include
> > Intel VT-d & AMD-Vi which often have function level visibility
> > compared to POWER partitionable endpoints which have bridge level
> > granularity.
>
> That's a significant oversimplification of the situation on POWER,
> although it doesn't really matter in this context.  On older (i.e.
> pre-PCI-E) hardware, PEs have either host bridge (i.e. domain)
> granularity, or, IIUC, in some cases p2p bridge granularity, using
> special p2p bridges, since that's the only real way to do iommu
> differentiation without the PCI-E requestor IDs.  This isn't as coarse
> as it seems in practice, because the hardware is usually built with a
> bridge per physical PCI slot.
>
> On newer PCI-E hardware, the PE granularity is basically a firmware
> decision, and can go down to function level.  I believe pHyp puts the
> granularity at the bridge level.  Our non-virtualized Linux "firmware"
> currently does put it at the function level, but Ben is thinking about
> changing that to bridge level: again, because of the hardware design
> that isn't as coarse as it seems, and at this level we can guarantee
> isolation in hardware to a degree that's not possible at the function
> level.

Ok, thanks for the clarification.  This should support either model,
and it will be up to the iommu driver to fill the groups with the
right devices.
> > PCIe-to-PCI bridges also often cloud the IOMMU
> > visibility as it cannot distinguish devices behind the bridge.
> > Devices can also sometimes hurt themselves by initiating DMA using
> > the wrong source ID on a multifunction PCI device.
> >
> > IOMMU groups are meant to help solve these problems and hopefully
> > become the working unit of the IOMMU API.
>
> So far, so simple.  No objections here.  I am trying to work out what
> the real difference in approach is in this series from either your or
> my earlier isolation group series.  AFAICT it's just that this
> approach is explicitly only about IOMMU identity, ignoring (here) any
> other factors which might affect isolation.  Or am I missing
> something?

Yes, they are very similar, and actually also similar to how VFIO
manages groups.  It's easy to start some kind of group structure; the
hard part is in the details, particularly where to stop.  My attempt
to figure out where isolation groups stop went quite poorly, ending up
with a ridiculously complicated mess of hierarchical groups that got
worse as I tried to fill in the gaps.

With iommu groups I try to take a step back and simplify.  I initially
had a goal of describing only the minimum iommu granularity sets; this
is where the dma_dev idea came from.  But the iommu granularity doesn't
really guarantee all that much or allow any ability for the iommu
driver to add additional policies that would be useful for userspace
drivers (e.g. multifunction devices and peer-to-peer isolation).  So
again I've had to allow that a group might have multiple visible
requestor IDs within the group.  This time though I'm trying to
disallow hierarchies, which means that even kernel (dma_ops) usage of
groups is restricted to a single, platform-defined, level of isolation.

I'm also trying to stay out of the business of providing a group
management interface.  I only want to describe groups.  Things like
stopping driver probes should be device-level problems.
In effect, this level should not provide enforcement and ownership;
something like VFIO will do that.  So the differences are subtle, but
important.  Thanks,

Alex
* [RFC PATCH 2/3] iommu: Create basic group infrastructure and update AMD-Vi & Intel VT-d
From: Alex Williamson @ 2012-04-02 21:14 UTC
To: david, dwmw2
Cc: alex.williamson, iommu, qemu-devel, joerg.roedel, kvm, linux-kernel

IOMMU groups define the minimum granularity of the IOMMU.  We therefore
create groups using a dma_dev which is the effective requestor ID for
the entire group.  Additional devices can be added to groups based on
system topology, IOMMU limitations, or device quirks.

This implementation also includes a simple idr database for creating
a flat address space of group IDs across multiple IOMMUs.  Updates
included for Intel VT-d, using example iommu callbacks for adding and
removing devices, as well as AMD-Vi, which tracks devices on its own.
We should be able to better integrate the iommu_group within existing
AMD-Vi structs or provide a private data location within the iommu_group
where we can eventually reduce redundancy.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 drivers/iommu/amd_iommu.c   |   50 ++++++-----
 drivers/iommu/intel-iommu.c |   76 +++++++-------
 drivers/iommu/iommu.c       |  198 ++++++++++++++++++++++++++++++++++++++-----
 include/linux/iommu.h       |   23 +++++
 4 files changed, 267 insertions(+), 80 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f75e060..876db28 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -256,9 +256,10 @@ static bool check_device(struct device *dev)

 static int iommu_init_device(struct device *dev)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
+	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
 	struct iommu_dev_data *dev_data;
 	u16 alias;
+	int ret;

 	if (dev->archdata.iommu)
 		return 0;
@@ -279,6 +280,26 @@ static int iommu_init_device(struct device *dev)
 			return -ENOTSUPP;
 		}
 		dev_data->alias_data = alias_data;
+
+		dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
+	} else
+		dma_pdev = pdev;
+
+	/* dma_pdev = iommu_pci_quirk(dma_pdev) */
+
+	if (!dma_pdev->dev.iommu_group) {
+		struct iommu_group *group;
+
+		group = iommu_group_alloc(&dma_pdev->dev);
+		if (IS_ERR(group))
+			return PTR_ERR(group);
+	}
+
+	ret = iommu_group_add_device(dma_pdev->dev.iommu_group, dev);
+	if (ret) {
+		if (iommu_group_empty(dma_pdev->dev.iommu_group))
+			iommu_group_free(dma_pdev->dev.iommu_group);
+		return ret;
 	}

 	if (pci_iommuv2_capable(pdev)) {
@@ -309,6 +330,12 @@ static void iommu_ignore_device(struct device *dev)

 static void iommu_uninit_device(struct device *dev)
 {
+	struct iommu_group *group = dev->iommu_group;
+
+	iommu_group_remove_device(dev);
+	if (iommu_group_empty(group))
+		iommu_group_free(group);
+
 	/*
 	 * Nothing to do here - we keep dev_data around for unplugged devices
 	 * and reuse it when the device is re-plugged - not doing so would
@@ -3191,26 +3218,6 @@ static int amd_iommu_domain_has_cap(struct iommu_domain *domain,
 	return 0;
 }

-static int amd_iommu_device_group(struct device *dev, unsigned int *groupid)
-{
-	struct iommu_dev_data *dev_data = dev->archdata.iommu;
-	struct pci_dev *pdev = to_pci_dev(dev);
-	u16 devid;
-
-	if (!dev_data)
-		return -ENODEV;
-
-	if (pdev->is_virtfn || !iommu_group_mf)
-		devid = dev_data->devid;
-	else
-		devid = calc_devid(pdev->bus->number,
-				   PCI_DEVFN(PCI_SLOT(pdev->devfn), 0));
-
-	*groupid = amd_iommu_alias_table[devid];
-
-	return 0;
-}
-
 static struct iommu_ops amd_iommu_ops = {
 	.domain_init = amd_iommu_domain_init,
 	.domain_destroy = amd_iommu_domain_destroy,
@@ -3220,7 +3227,6 @@ static struct iommu_ops amd_iommu_ops = {
 	.unmap = amd_iommu_unmap,
 	.iova_to_phys = amd_iommu_iova_to_phys,
 	.domain_has_cap = amd_iommu_domain_has_cap,
-	.device_group = amd_iommu_device_group,
 	.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
 };

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index c9c6053..41ab7d0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4075,54 +4075,59 @@ static int intel_iommu_domain_has_cap(struct iommu_domain *domain,
 	return 0;
 }

-/*
- * Group numbers are arbitrary.  Device with the same group number
- * indicate the iommu cannot differentiate between them.  To avoid
- * tracking used groups we just use the seg|bus|devfn of the lowest
- * level we're able to differentiate devices
- */
-static int intel_iommu_device_group(struct device *dev, unsigned int *groupid)
+static int intel_iommu_add_device(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
-	struct pci_dev *bridge;
-	union {
-		struct {
-			u8 devfn;
-			u8 bus;
-			u16 segment;
-		} pci;
-		u32 group;
-	} id;
-
-	if (iommu_no_mapping(dev))
-		return -ENODEV;
-
-	id.pci.segment = pci_domain_nr(pdev->bus);
-	id.pci.bus = pdev->bus->number;
-	id.pci.devfn = pdev->devfn;
+	struct pci_dev *bridge, *dma_pdev = pdev;
+	int ret;

-	if (!device_to_iommu(id.pci.segment, id.pci.bus, id.pci.devfn))
+	if (!device_to_iommu(pci_domain_nr(pdev->bus),
+			     pdev->bus->number, pdev->devfn))
 		return -ENODEV;

 	bridge = pci_find_upstream_pcie_bridge(pdev);
 	if (bridge) {
-		if (pci_is_pcie(bridge)) {
-			id.pci.bus = bridge->subordinate->number;
-			id.pci.devfn = 0;
-		} else {
-			id.pci.bus = bridge->bus->number;
-			id.pci.devfn = bridge->devfn;
-		}
+		if (pci_is_pcie(bridge))
+			dma_pdev = pci_get_domain_bus_and_slot(
+						pci_domain_nr(pdev->bus),
+						bridge->subordinate->number, 0);
+		else
+			dma_pdev = bridge;
+	}
+
+	if (!bridge && !pdev->is_virtfn && iommu_group_mf)
+		dma_pdev = pci_get_slot(dma_pdev->bus, 0);
+
+	/* dma_pdev = iommu_pci_quirk(dma_pdev); */
+
+	if (!dma_pdev->dev.iommu_group) {
+		struct iommu_group *group;
+
+		group = iommu_group_alloc(&dma_pdev->dev);
+		if (IS_ERR(group))
+			return PTR_ERR(group);
 	}

-	if (!pdev->is_virtfn && iommu_group_mf)
-		id.pci.devfn = PCI_DEVFN(PCI_SLOT(id.pci.devfn), 0);
+	ret = iommu_group_add_device(dma_pdev->dev.iommu_group, dev);
+	if (ret) {
+		if (iommu_group_empty(dma_pdev->dev.iommu_group))
+			iommu_group_free(dma_pdev->dev.iommu_group);

-	*groupid = id.group;
+		return ret;
+	}

 	return 0;
 }

+static void intel_iommu_remove_device(struct device *dev)
+{
+	struct iommu_group *group = dev->iommu_group;
+
+	iommu_group_remove_device(dev);
+	if (iommu_group_empty(group))
+		iommu_group_free(group);
+}
+
 static struct iommu_ops intel_iommu_ops = {
 	.domain_init	= intel_iommu_domain_init,
 	.domain_destroy = intel_iommu_domain_destroy,
@@ -4132,7 +4137,8 @@ static struct iommu_ops intel_iommu_ops = {
 	.unmap		= intel_iommu_unmap,
 	.iova_to_phys	= intel_iommu_iova_to_phys,
 	.domain_has_cap = intel_iommu_domain_has_cap,
-	.device_group	= intel_iommu_device_group,
+	.add_device	= intel_iommu_add_device,
+	.remove_device	= intel_iommu_remove_device,
 	.pgsize_bitmap	= INTEL_IOMMU_PGSIZES,
 };

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2198b2d..6a51ed9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -26,48 +26,96 @@
 #include <linux/slab.h>
 #include <linux/errno.h>
 #include <linux/iommu.h>
+#include <linux/idr.h>

-static ssize_t show_iommu_group(struct device *dev,
-				struct device_attribute *attr, char *buf)
+static struct kset *iommu_group_kset;
+static struct idr iommu_group_idr;
+static struct mutex iommu_group_mutex;
+
+struct iommu_group_attribute {
+	struct attribute attr;
+	ssize_t (*show)(struct iommu_group *group, char *buf);
+	ssize_t (*store)(struct iommu_group *group,
+			 const char *buf, size_t count);
+};
+
+#define to_iommu_group_attr(_attr)	\
+	container_of(_attr, struct iommu_group_attribute, attr)
+#define to_iommu_group(_kobj)		\
+	container_of(_kobj, struct iommu_group, kobj)
+
+static ssize_t iommu_group_attr_show(struct kobject *kobj,
+				     struct attribute *__attr, char *buf)
 {
-	unsigned int groupid;
+	struct iommu_group_attribute *attr = to_iommu_group_attr(__attr);
+	struct iommu_group *group = to_iommu_group(kobj);
+	ssize_t ret = -EIO;

-	if (iommu_device_group(dev, &groupid))
-		return 0;
+	if (attr->show)
+		ret = attr->show(group, buf);
+	return ret;
+}
+
+static ssize_t iommu_group_attr_store(struct kobject *kobj,
+				      struct attribute *__attr,
+				      const char *buf, size_t count)
+{
+	struct iommu_group_attribute *attr = to_iommu_group_attr(__attr);
+	struct iommu_group *group = to_iommu_group(kobj);
+	ssize_t ret = -EIO;

-	return sprintf(buf, "%u", groupid);
+	if (attr->store)
+		ret = attr->store(group, buf, count);
+	return ret;
 }
-static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);

-static int add_iommu_group(struct device *dev, void *data)
+static void iommu_group_release(struct kobject *kobj)
 {
-	unsigned int groupid;
+	struct iommu_group *group = to_iommu_group(kobj);

-	if (iommu_device_group(dev, &groupid) == 0)
-		return device_create_file(dev, &dev_attr_iommu_group);
+	mutex_lock(&iommu_group_mutex);
+	idr_remove(&iommu_group_idr, group->id);
+	mutex_unlock(&iommu_group_mutex);

-	return 0;
+	kfree(group);
 }

-static int remove_iommu_group(struct device *dev)
+static const struct sysfs_ops iommu_group_sysfs_ops = {
+	.show = iommu_group_attr_show,
+	.store = iommu_group_attr_store,
+};
+
+static struct kobj_type iommu_group_ktype = {
+	.sysfs_ops = &iommu_group_sysfs_ops,
+	.release = iommu_group_release,
+};
+
+static int add_iommu_group(struct device *dev, void *data)
 {
-	unsigned int groupid;
+	struct iommu_ops *ops = data;

-	if (iommu_device_group(dev, &groupid) == 0)
-		device_remove_file(dev, &dev_attr_iommu_group);
+	if (!ops->add_device)
+		return -ENODEV;

-	return 0;
+	WARN_ON(dev->iommu_group);
+
+	return ops->add_device(dev);
 }

 static int iommu_device_notifier(struct notifier_block *nb,
 				 unsigned long action, void *data)
 {
 	struct device *dev = data;
+	struct iommu_ops *ops = dev->bus->iommu_ops;

 	if (action == BUS_NOTIFY_ADD_DEVICE)
-		return add_iommu_group(dev, NULL);
-	else if (action == BUS_NOTIFY_DEL_DEVICE)
-		return remove_iommu_group(dev);
+		return add_iommu_group(dev, ops);
+	else if (action == BUS_NOTIFY_DEL_DEVICE) {
+		if (ops->remove_device && dev->iommu_group) {
+			ops->remove_device(dev);
+			return 0;
+		}
+	}

 	return 0;
 }
@@ -79,7 +127,97 @@ static struct notifier_block iommu_device_nb = {
 static void iommu_bus_init(struct bus_type *bus, struct iommu_ops *ops)
 {
 	bus_register_notifier(bus, &iommu_device_nb);
-	bus_for_each_dev(bus, NULL, NULL, add_iommu_group);
+	bus_for_each_dev(bus, NULL, ops, add_iommu_group);
+}
+
+struct iommu_group *iommu_group_alloc(struct device *dma_dev)
+{
+	struct iommu_group *group;
+	int ret;
+
+	group = kzalloc(sizeof(*group), GFP_KERNEL);
+	if (!group)
+		return ERR_PTR(-ENOMEM);
+
+	group->kobj.kset = iommu_group_kset;
+	group->dma_dev = dma_dev;
+	atomic_set(&group->refcnt, 0);
+
+	mutex_lock(&iommu_group_mutex);
+
+again:
+	if (unlikely(0 == idr_pre_get(&iommu_group_idr, GFP_KERNEL))) {
+		kfree(group);
+		mutex_unlock(&iommu_group_mutex);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	if (-EAGAIN == idr_get_new(&iommu_group_idr, group, &group->id))
+		goto again;
+
+	mutex_unlock(&iommu_group_mutex);
+
+	ret = kobject_init_and_add(&group->kobj, &iommu_group_ktype,
+				   NULL, "%d", group->id);
+	if (ret) {
+		iommu_group_release(&group->kobj);
+		return ERR_PTR(ret);
+	}
+
+	group->devices_kobj = kobject_create_and_add("devices", &group->kobj);
+	if (!group->devices_kobj) {
+		kobject_put(&group->kobj);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	dma_dev->iommu_group = group;
+
+	return group;
+}
+
+void iommu_group_free(struct iommu_group *group)
+{
+	group->dma_dev->iommu_group = NULL;
+	WARN_ON(atomic_read(&group->refcnt));
+	kobject_put(group->devices_kobj);
+	kobject_put(&group->kobj);
+}
+
+bool iommu_group_empty(struct iommu_group *group)
+{
+	return (0 == atomic_read(&group->refcnt));
+}
+
+int iommu_group_add_device(struct iommu_group *group, struct device *dev)
+{
+	int ret;
+
+	atomic_inc(&group->refcnt);
+
+	ret = sysfs_create_link(&dev->kobj, &group->kobj, "iommu_group");
+	if (ret)
+		return ret;
+
+	ret = sysfs_create_link(group->devices_kobj, &dev->kobj,
+				kobject_name(&dev->kobj));
+	if (ret) {
+		sysfs_remove_link(&dev->kobj, "iommu_group");
+		return ret;
+	}
+
+	dev->iommu_group = group;
+
+	return 0;
+}
+
+void iommu_group_remove_device(struct device *dev)
+{
+	sysfs_remove_link(dev->iommu_group->devices_kobj,
+			  kobject_name(&dev->kobj));
+	sysfs_remove_link(&dev->kobj, "iommu_group");
+
+	atomic_dec(&dev->iommu_group->refcnt);
+	dev->iommu_group = NULL;
 }

 /**
@@ -335,9 +473,23 @@ EXPORT_SYMBOL_GPL(iommu_unmap);

 int iommu_device_group(struct device *dev, unsigned int *groupid)
 {
-	if (iommu_present(dev->bus) && dev->bus->iommu_ops->device_group)
-		return dev->bus->iommu_ops->device_group(dev, groupid);
+	if (dev->iommu_group) {
+		*groupid = dev->iommu_group->id;
+		return 0;
+	}

 	return -ENODEV;
 }
 EXPORT_SYMBOL_GPL(iommu_device_group);
+
+static int __init iommu_init(void)
+{
+	iommu_group_kset = kset_create_and_add("iommu_groups", NULL, NULL);
+	idr_init(&iommu_group_idr);
+	mutex_init(&iommu_group_mutex);
+
+	BUG_ON(!iommu_group_kset);
+
+	return 0;
+}
+subsys_initcall(iommu_init);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2ee375c..24004d6 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -60,6 +60,8 @@ struct iommu_domain {
  * @iova_to_phys: translate iova to physical address
  * @domain_has_cap: domain capabilities query
  * @commit: commit iommu domain
+ * @add_device: add device to iommu grouping
+ * @remove_device: remove device from iommu grouping
  * @pgsize_bitmap: bitmap of supported page sizes
  */
 struct iommu_ops {
@@ -76,10 +78,25 @@ struct iommu_ops {
 	int (*domain_has_cap)(struct iommu_domain *domain,
 			      unsigned long cap);
 	int (*device_group)(struct device *dev, unsigned int *groupid);
+	int (*add_device)(struct device *dev);
+	void (*remove_device)(struct device *dev);
 	unsigned long pgsize_bitmap;
 };

+/**
+ * struct iommu_group - groups of devices representing iommu visibility
+ * @dma_dev: all dma from the group appears to the iommu using this source id
+ * @kobj: iommu group node in sysfs
+ * @devices_kobj: sysfs subdir node for linking devices
+ * @refcnt: number of devices in group
+ * @id: unique id number for the group (for easy sysfs listing)
+ */
 struct iommu_group {
+	struct device *dma_dev;
+	struct kobject kobj;
+	struct kobject *devices_kobj;
+	atomic_t refcnt;
+	int id;
 };

 extern int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops);
@@ -101,6 +118,12 @@ extern int iommu_domain_has_cap(struct iommu_domain *domain,
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
 					iommu_fault_handler_t handler);
 extern int iommu_device_group(struct device *dev, unsigned int *groupid);
+extern struct iommu_group *iommu_group_alloc(struct device *dev);
+extern void iommu_group_free(struct iommu_group *group);
+extern bool iommu_group_empty(struct iommu_group *group);
+extern int iommu_group_add_device(struct iommu_group *group,
+				  struct device *dev);
+extern void iommu_group_remove_device(struct device *dev);

 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
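iommu_group_add_device() above creates two sysfs symlinks: an "iommu_group" link under the device pointing at the group, and a back-link under the group's "devices" subdirectory named after the device.  The sketch below exercises that layout from userspace (as something like VFIO might) against a mock directory tree; the temp-dir paths and the example device/group names are invented for the demo, and the real location of the group kset depends on where the kernel places it:

```shell
# Build a mock of the link layout the patch creates
root=$(mktemp -d)
mkdir -p "$root/iommu_groups/3/devices" "$root/devices/0000:00:19.0"

# device -> group link, named "iommu_group"
ln -s "$root/iommu_groups/3" "$root/devices/0000:00:19.0/iommu_group"
# group -> device back-link, named after the device
ln -s "$root/devices/0000:00:19.0" "$root/iommu_groups/3/devices/0000:00:19.0"

# A device's group ID is the basename of its iommu_group link target
basename "$(readlink "$root/devices/0000:00:19.0/iommu_group")"

# Listing the group's members is just listing its devices subdirectory
ls "$root/iommu_groups/3/devices"

rm -rf "$root"
```

This is the whole userspace contract of the patch: group membership is discoverable by following symlinks, with no new syscalls or ioctls.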
* Re: [RFC PATCH 2/3] iommu: Create basic group infrastructure and update AMD-Vi & Intel VT-d
From: David Gibson @ 2012-04-18 11:55 UTC
To: Alex Williamson
Cc: dwmw2, iommu, qemu-devel, joerg.roedel, kvm, linux-kernel

On Mon, Apr 02, 2012 at 03:14:46PM -0600, Alex Williamson wrote:
> IOMMU groups define the minimum granularity of the IOMMU.  We therefore
> create groups using a dma_dev which is the effective requestor ID for
> the entire group.  Additional devices can be added to groups based on
> system topology, IOMMU limitations, or device quirks.
>
> This implementation also includes a simple idr database for creating
> a flat address space of group IDs across multiple IOMMUs.  Updates
> included for Intel VT-d, using example iommu callbacks for adding and
> removing devices, as well as AMD-Vi, which tracks devices on its own.
> We should be able to better integrate the iommu_group within existing
> AMD-Vi structs or provide a private data location within the iommu_group
> where we can eventually reduce redundancy.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Looks reasonable as far as it goes.  This still lacks an obvious means
for doing group assignment of devices on busses subordinate to devices
that are on iommu-managed busses.  Which, as we discussed earlier, is a
bit of a can of worms, but necessary.
> ---
>
>  drivers/iommu/amd_iommu.c   |   50 ++++++-----
>  drivers/iommu/intel-iommu.c |   76 +++++++-------
>  drivers/iommu/iommu.c       |  198 ++++++++++++++++++++++++++++++++++++++-----
>  include/linux/iommu.h       |   23 +++++
>  4 files changed, 267 insertions(+), 80 deletions(-)
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index f75e060..876db28 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -256,9 +256,10 @@ static bool check_device(struct device *dev)
>
>  static int iommu_init_device(struct device *dev)
>  {
> -	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
>  	struct iommu_dev_data *dev_data;
>  	u16 alias;
> +	int ret;
>
>  	if (dev->archdata.iommu)
>  		return 0;
> @@ -279,6 +280,26 @@ static int iommu_init_device(struct device *dev)
>  			return -ENOTSUPP;
>  		}
>  		dev_data->alias_data = alias_data;
> +
> +		dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
> +	} else
> +		dma_pdev = pdev;
> +
> +	/* dma_pdev = iommu_pci_quirk(dma_pdev) */

Presumably an actual implementation of the quirk is coming?  It might
be an idea for it to take both the individual and representative
devices, in case that information is useful.

> +	if (!dma_pdev->dev.iommu_group) {
> +		struct iommu_group *group;
> +
> +		group = iommu_group_alloc(&dma_pdev->dev);
> +		if (IS_ERR(group))
> +			return PTR_ERR(group);
> +	}
> +
> +	ret = iommu_group_add_device(dma_pdev->dev.iommu_group, dev);
> +	if (ret) {
> +		if (iommu_group_empty(dma_pdev->dev.iommu_group))
> +			iommu_group_free(dma_pdev->dev.iommu_group);
> +		return ret;

It might be nice to have generic helpers that do this kind of lifetime
handling of the groups, but that's a detail.
[remainder of quoted patch snipped; no further comments]
> + ret = attr->show(group, buf); > + return ret; > +} > + > +static ssize_t iommu_group_attr_store(struct kobject *kobj, > + struct attribute *__attr, > + const char *buf, size_t count) > +{ > + struct iommu_group_attribute *attr = to_iommu_group_attr(__attr); > + struct iommu_group *group = to_iommu_group(kobj); > + ssize_t ret = -EIO; > > - return sprintf(buf, "%u", groupid); > + if (attr->store) > + ret = attr->store(group, buf, count); > + return ret; > } > -static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL); > > -static int add_iommu_group(struct device *dev, void *data) > +static void iommu_group_release(struct kobject *kobj) > { > - unsigned int groupid; > + struct iommu_group *group = to_iommu_group(kobj); > > - if (iommu_device_group(dev, &groupid) == 0) > - return device_create_file(dev, &dev_attr_iommu_group); > + mutex_lock(&iommu_group_mutex); > + idr_remove(&iommu_group_idr, group->id); > + mutex_unlock(&iommu_group_mutex); > > - return 0; > + kfree(group); > } > > -static int remove_iommu_group(struct device *dev) > +static const struct sysfs_ops iommu_group_sysfs_ops = { > + .show = iommu_group_attr_show, > + .store = iommu_group_attr_store, > +}; > + > +static struct kobj_type iommu_group_ktype = { > + .sysfs_ops = &iommu_group_sysfs_ops, > + .release = iommu_group_release, > +}; > + > +static int add_iommu_group(struct device *dev, void *data) > { > - unsigned int groupid; > + struct iommu_ops *ops = data; > > - if (iommu_device_group(dev, &groupid) == 0) > - device_remove_file(dev, &dev_attr_iommu_group); > + if (!ops->add_device) > + return -ENODEV; > > - return 0; > + WARN_ON(dev->iommu_group); > + > + return ops->add_device(dev); > } > > static int iommu_device_notifier(struct notifier_block *nb, > unsigned long action, void *data) > { > struct device *dev = data; > + struct iommu_ops *ops = dev->bus->iommu_ops; > > if (action == BUS_NOTIFY_ADD_DEVICE) > - return add_iommu_group(dev, NULL); > - else if (action == 
BUS_NOTIFY_DEL_DEVICE) > - return remove_iommu_group(dev); > + return add_iommu_group(dev, ops); > + else if (action == BUS_NOTIFY_DEL_DEVICE) { > + if (ops->remove_device && dev->iommu_group) { > + ops->remove_device(dev); > + return 0; > + } > + } > > return 0; > } > @@ -79,7 +127,97 @@ static struct notifier_block iommu_device_nb = { > static void iommu_bus_init(struct bus_type *bus, struct iommu_ops *ops) > { > bus_register_notifier(bus, &iommu_device_nb); > - bus_for_each_dev(bus, NULL, NULL, add_iommu_group); > + bus_for_each_dev(bus, NULL, ops, add_iommu_group); > +} > + > +struct iommu_group *iommu_group_alloc(struct device *dma_dev) > +{ > + struct iommu_group *group; > + int ret; > + > + group = kzalloc(sizeof(*group), GFP_KERNEL); > + if (!group) > + return ERR_PTR(-ENOMEM); > + > + group->kobj.kset = iommu_group_kset; > + group->dma_dev = dma_dev; > + atomic_set(&group->refcnt, 0); > + > + mutex_lock(&iommu_group_mutex); > + > +again: > + if (unlikely(0 == idr_pre_get(&iommu_group_idr, GFP_KERNEL))) { > + kfree(group); > + mutex_unlock(&iommu_group_mutex); > + return ERR_PTR(-ENOMEM); > + } > + > + if (-EAGAIN == idr_get_new(&iommu_group_idr, group, &group->id)) > + goto again; > + > + mutex_unlock(&iommu_group_mutex); > + > + ret = kobject_init_and_add(&group->kobj, &iommu_group_ktype, > + NULL, "%d", group->id); > + if (ret) { > + iommu_group_release(&group->kobj); > + return ERR_PTR(ret); > + } > + > + group->devices_kobj = kobject_create_and_add("devices", &group->kobj); > + if (!group->devices_kobj) { > + kobject_put(&group->kobj); > + return ERR_PTR(-ENOMEM); > + } > + > + dma_dev->iommu_group = group; > + > + return group; > +} > + > +void iommu_group_free(struct iommu_group *group) > +{ > + group->dma_dev->iommu_group = NULL; > + WARN_ON(atomic_read(&group->refcnt)); This kind of implies that the representative device doesn't count against the refcount, which seems wrong. 
> + kobject_put(group->devices_kobj); > + kobject_put(&group->kobj); > +} > + > +bool iommu_group_empty(struct iommu_group *group) > +{ > + return (0 == atomic_read(&group->refcnt)); > +} > + > +int iommu_group_add_device(struct iommu_group *group, struct device *dev) > +{ > + int ret; > + > + atomic_inc(&group->refcnt); > + > + ret = sysfs_create_link(&dev->kobj, &group->kobj, "iommu_group"); > + if (ret) > + return ret; > + > + ret = sysfs_create_link(group->devices_kobj, &dev->kobj, > + kobject_name(&dev->kobj)); > + if (ret) { > + sysfs_remove_link(&dev->kobj, "iommu_group"); > + return ret; > + } > + > + dev->iommu_group = group; > + > + return 0; > +} > + > +void iommu_group_remove_device(struct device *dev) > +{ > + sysfs_remove_link(dev->iommu_group->devices_kobj, > + kobject_name(&dev->kobj)); > + sysfs_remove_link(&dev->kobj, "iommu_group"); > + > + atomic_dec(&dev->iommu_group->refcnt); > + dev->iommu_group = NULL; > } > > /** > @@ -335,9 +473,23 @@ EXPORT_SYMBOL_GPL(iommu_unmap); > > int iommu_device_group(struct device *dev, unsigned int *groupid) > { > - if (iommu_present(dev->bus) && dev->bus->iommu_ops->device_group) > - return dev->bus->iommu_ops->device_group(dev, groupid); > + if (dev->iommu_group) { > + *groupid = dev->iommu_group->id; > + return 0; > + } > > return -ENODEV; > } > EXPORT_SYMBOL_GPL(iommu_device_group); > + > +static int __init iommu_init(void) > +{ > + iommu_group_kset = kset_create_and_add("iommu_groups", NULL, NULL); > + idr_init(&iommu_group_idr); > + mutex_init(&iommu_group_mutex); > + > + BUG_ON(!iommu_group_kset); > + > + return 0; > +} > +subsys_initcall(iommu_init); > diff --git a/include/linux/iommu.h b/include/linux/iommu.h > index 2ee375c..24004d6 100644 > --- a/include/linux/iommu.h > +++ b/include/linux/iommu.h > @@ -60,6 +60,8 @@ struct iommu_domain { > * @iova_to_phys: translate iova to physical address > * @domain_has_cap: domain capabilities query > * @commit: commit iommu domain > + * @add_device: add device to 
iommu grouping > + * @remove_device: remove device from iommu grouping > * @pgsize_bitmap: bitmap of supported page sizes > */ > struct iommu_ops { > @@ -76,10 +78,25 @@ struct iommu_ops { > int (*domain_has_cap)(struct iommu_domain *domain, > unsigned long cap); > int (*device_group)(struct device *dev, unsigned int *groupid); > + int (*add_device)(struct device *dev); > + void (*remove_device)(struct device *dev); > unsigned long pgsize_bitmap; > }; > > +/** > + * struct iommu_group - groups of devices representing iommu visibility > + * @dma_dev: all dma from the group appears to the iommu using this source id > + * @kobj: iommu group node in sysfs > + * @devices_kobj: sysfs subdir node for linking devices > + * @refcnt: number of devices in group > + * @id: unique id number for the group (for easy sysfs listing) > + */ > struct iommu_group { > + struct device *dma_dev; So, a "representative" device works very nicely for the AMD and Intel IOMMUs. But I'm not sure it makes sense for any IOMMU - we could point it to the bridge in the Power case, but I'm not sure if that really makes sense (particularly when it's a host bridge, not a p2p bridge). In embedded cases it may make even less sense. I think it would be better to move this into iommu driver specific data (which I guess would mean adding a facility for iommu driver specific data, either with a private pointer or using container_of). > + struct kobject kobj; > + struct kobject *devices_kobj; The devices_kobj has the same lifetime as the group, so it might as well be a static member. 
> + atomic_t refcnt; > + int id; > }; > > extern int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops); > @@ -101,6 +118,12 @@ extern int iommu_domain_has_cap(struct iommu_domain *domain, > extern void iommu_set_fault_handler(struct iommu_domain *domain, > iommu_fault_handler_t handler); > extern int iommu_device_group(struct device *dev, unsigned int *groupid); > +extern struct iommu_group *iommu_group_alloc(struct device *dev); > +extern void iommu_group_free(struct iommu_group *group); > +extern bool iommu_group_empty(struct iommu_group *group); > +extern int iommu_group_add_device(struct iommu_group *group, > + struct device *dev); > +extern void iommu_group_remove_device(struct device *dev); > > /** > * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
* Re: [RFC PATCH 2/3] iommu: Create basic group infrastructure and update AMD-Vi & Intel VT-d 2012-04-18 11:55 ` David Gibson @ 2012-04-18 20:28 ` Alex Williamson 0 siblings, 0 replies; 9+ messages in thread From: Alex Williamson @ 2012-04-18 20:28 UTC (permalink / raw) To: David Gibson; +Cc: dwmw2, iommu, qemu-devel, joerg.roedel, kvm, linux-kernel On Wed, 2012-04-18 at 21:55 +1000, David Gibson wrote: > On Mon, Apr 02, 2012 at 03:14:46PM -0600, Alex Williamson wrote: > > IOMMU groups define the minimum granularity of the IOMMU. We therefore > > create groups using a dma_dev which is the effective requestor ID for > > the entire group. Additional devices can be added to groups based on > > system topology, IOMMU limitations, or device quirks. > > > > This implementation also includes a simple idr database for creating > > a flat address space of group IDs across multiple IOMMUs. Updates > > included for Intel VT-d, using example iommu callbacks for adding and > > removing devices, as well as AMD-Vi, which tracks devices on it's own. > > We should be able to better integrate the iommu_group within existing > > AMD-Vi structs or provide a private data location within the iommu_group > > where we can eventually reduce redundancy. > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > Looks reasonable as far as it goes. This still lacks an obvious means > for doing group assignment of devices on busses subordinate to devices > that are on iommu managed busses. Which as we discussed earlier is a > bit of a can of worms, but necessary. Yes, this was a big can of worms in my isolation group attempt. I'm hoping it's easier here. Perhaps the iommu core needs to register add-device notifiers for every bus and, for each new device without a direct iommu, walk up the device tree and add it as a hidden device in the first iommu group it finds. Nested iommus also throw a serious wrench into the problem, but as we've discussed, are probably safe to ignore.
> > --- > > > > drivers/iommu/amd_iommu.c | 50 ++++++----- > > drivers/iommu/intel-iommu.c | 76 +++++++++-------- > > drivers/iommu/iommu.c | 198 ++++++++++++++++++++++++++++++++++++++----- > > include/linux/iommu.h | 23 +++++ > > 4 files changed, 267 insertions(+), 80 deletions(-) > > > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > > index f75e060..876db28 100644 > > --- a/drivers/iommu/amd_iommu.c > > +++ b/drivers/iommu/amd_iommu.c > > @@ -256,9 +256,10 @@ static bool check_device(struct device *dev) > > > > static int iommu_init_device(struct device *dev) > > { > > - struct pci_dev *pdev = to_pci_dev(dev); > > + struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev); > > struct iommu_dev_data *dev_data; > > u16 alias; > > + int ret; > > > > if (dev->archdata.iommu) > > return 0; > > @@ -279,6 +280,26 @@ static int iommu_init_device(struct device *dev) > > return -ENOTSUPP; > > } > > dev_data->alias_data = alias_data; > > + > > + dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff); > > + } else > > + dma_pdev = pdev; > > + > > + /* dma_pdev = iommu_pci_quirk(dma_pdev) */ > > Presumably an actual implementation of the quirk is coming? It might Yeah, I can add a simple one. It may also be a good place to implement any kind of global policy, like grouping multifunction devices or enforcing PCI ACS. Once we have that, I hope we could just add some kind of get/set iommudata to the group that might allow dma_ops to start making use of groups. > be an idea for it to take both the individual and representative > devices, in case that information is useful. Ok. 
> > + if (!dma_pdev->dev.iommu_group) { > > + struct iommu_group *group; > > + > > + group = iommu_group_alloc(&dma_pdev->dev); > > + if (IS_ERR(group)) > > + return PTR_ERR(group); > > + } > > + > > + ret = iommu_group_add_device(dma_pdev->dev.iommu_group, dev); > > + if (ret) { > > + if (iommu_group_empty(dma_pdev->dev.iommu_group)) > > + iommu_group_free(dma_pdev->dev.iommu_group); > > + return ret; > > It might be nice to have generic helpers that do this kind of lifetime > handling of the groups, but that's a detail. I've cleaned this up a lot in my working version, using kobject references makes this much better. > > } > > > > if (pci_iommuv2_capable(pdev)) { > > @@ -309,6 +330,12 @@ static void iommu_ignore_device(struct device *dev) > > > > static void iommu_uninit_device(struct device *dev) > > { > > + struct iommu_group *group = dev->iommu_group; > > + > > + iommu_group_remove_device(dev); > > + if (iommu_group_empty(group)) > > + iommu_group_free(group); > > + > > /* > > * Nothing to do here - we keep dev_data around for unplugged devices > > * and reuse it when the device is re-plugged - not doing so would > > @@ -3191,26 +3218,6 @@ static int amd_iommu_domain_has_cap(struct iommu_domain *domain, > > return 0; > > } > > > > -static int amd_iommu_device_group(struct device *dev, unsigned int *groupid) > > -{ > > - struct iommu_dev_data *dev_data = dev->archdata.iommu; > > - struct pci_dev *pdev = to_pci_dev(dev); > > - u16 devid; > > - > > - if (!dev_data) > > - return -ENODEV; > > - > > - if (pdev->is_virtfn || !iommu_group_mf) > > - devid = dev_data->devid; > > - else > > - devid = calc_devid(pdev->bus->number, > > - PCI_DEVFN(PCI_SLOT(pdev->devfn), 0)); > > - > > - *groupid = amd_iommu_alias_table[devid]; > > - > > - return 0; > > -} > > - > > static struct iommu_ops amd_iommu_ops = { > > .domain_init = amd_iommu_domain_init, > > .domain_destroy = amd_iommu_domain_destroy, > > @@ -3220,7 +3227,6 @@ static struct iommu_ops amd_iommu_ops = { > > .unmap 
= amd_iommu_unmap, > > .iova_to_phys = amd_iommu_iova_to_phys, > > .domain_has_cap = amd_iommu_domain_has_cap, > > - .device_group = amd_iommu_device_group, > > .pgsize_bitmap = AMD_IOMMU_PGSIZES, > > }; > > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > > index c9c6053..41ab7d0 100644 > > --- a/drivers/iommu/intel-iommu.c > > +++ b/drivers/iommu/intel-iommu.c > > @@ -4075,54 +4075,59 @@ static int intel_iommu_domain_has_cap(struct iommu_domain *domain, > > return 0; > > } > > > > -/* > > - * Group numbers are arbitrary. Device with the same group number > > - * indicate the iommu cannot differentiate between them. To avoid > > - * tracking used groups we just use the seg|bus|devfn of the lowest > > - * level we're able to differentiate devices > > - */ > > -static int intel_iommu_device_group(struct device *dev, unsigned int *groupid) > > +static int intel_iommu_add_device(struct device *dev) > > { > > struct pci_dev *pdev = to_pci_dev(dev); > > - struct pci_dev *bridge; > > - union { > > - struct { > > - u8 devfn; > > - u8 bus; > > - u16 segment; > > - } pci; > > - u32 group; > > - } id; > > - > > - if (iommu_no_mapping(dev)) > > - return -ENODEV; > > - > > - id.pci.segment = pci_domain_nr(pdev->bus); > > - id.pci.bus = pdev->bus->number; > > - id.pci.devfn = pdev->devfn; > > + struct pci_dev *bridge, *dma_pdev = pdev; > > + int ret; > > > > - if (!device_to_iommu(id.pci.segment, id.pci.bus, id.pci.devfn)) > > + if (!device_to_iommu(pci_domain_nr(pdev->bus), > > + pdev->bus->number, pdev->devfn)) > > return -ENODEV; > > > > bridge = pci_find_upstream_pcie_bridge(pdev); > > if (bridge) { > > - if (pci_is_pcie(bridge)) { > > - id.pci.bus = bridge->subordinate->number; > > - id.pci.devfn = 0; > > - } else { > > - id.pci.bus = bridge->bus->number; > > - id.pci.devfn = bridge->devfn; > > - } > > + if (pci_is_pcie(bridge)) > > + dma_pdev = pci_get_domain_bus_and_slot( > > + pci_domain_nr(pdev->bus), > > + bridge->subordinate->number, 0); > 
> + else > > + dma_pdev = bridge; > > + } > > + > > + if (!bridge && !pdev->is_virtfn && iommu_group_mf) > > + dma_pdev = pci_get_slot(dma_pdev->bus, 0); > > + > > + /* dma_pdev = iommu_pci_quirk(dma_pdev); */ > > + > > + if (!dma_pdev->dev.iommu_group) { > > + struct iommu_group *group; > > + > > + group = iommu_group_alloc(&dma_pdev->dev); > > + if (IS_ERR(group)) > > + return PTR_ERR(group); > > } > > > > - if (!pdev->is_virtfn && iommu_group_mf) > > - id.pci.devfn = PCI_DEVFN(PCI_SLOT(id.pci.devfn), 0); > > + ret = iommu_group_add_device(dma_pdev->dev.iommu_group, dev); > > + if (ret) { > > + if (iommu_group_empty(dma_pdev->dev.iommu_group)) > > + iommu_group_free(dma_pdev->dev.iommu_group); > > > > - *groupid = id.group; > > + return ret; > > + } > > > > return 0; > > } > > > > +static void intel_iommu_remove_device(struct device *dev) > > +{ > > + struct iommu_group *group = dev->iommu_group; > > + > > + iommu_group_remove_device(dev); > > + if (iommu_group_empty(group)) > > + iommu_group_free(group); > > +} > > + > > static struct iommu_ops intel_iommu_ops = { > > .domain_init = intel_iommu_domain_init, > > .domain_destroy = intel_iommu_domain_destroy, > > @@ -4132,7 +4137,8 @@ static struct iommu_ops intel_iommu_ops = { > > .unmap = intel_iommu_unmap, > > .iova_to_phys = intel_iommu_iova_to_phys, > > .domain_has_cap = intel_iommu_domain_has_cap, > > - .device_group = intel_iommu_device_group, > > + .add_device = intel_iommu_add_device, > > + .remove_device = intel_iommu_remove_device, > > .pgsize_bitmap = INTEL_IOMMU_PGSIZES, > > }; > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > > index 2198b2d..6a51ed9 100644 > > --- a/drivers/iommu/iommu.c > > +++ b/drivers/iommu/iommu.c > > @@ -26,48 +26,96 @@ > > #include <linux/slab.h> > > #include <linux/errno.h> > > #include <linux/iommu.h> > > +#include <linux/idr.h> > > > > -static ssize_t show_iommu_group(struct device *dev, > > - struct device_attribute *attr, char *buf) > > +static struct 
kset *iommu_group_kset; > > +static struct idr iommu_group_idr; > > +static struct mutex iommu_group_mutex; > > + > > +struct iommu_group_attribute { > > + struct attribute attr; > > + ssize_t (*show)(struct iommu_group *group, char *buf); > > + ssize_t (*store)(struct iommu_group *group, > > + const char *buf, size_t count); > > +}; > > + > > +#define to_iommu_group_attr(_attr) \ > > + container_of(_attr, struct iommu_group_attribute, attr) > > +#define to_iommu_group(_kobj) \ > > + container_of(_kobj, struct iommu_group, kobj) > > + > > +static ssize_t iommu_group_attr_show(struct kobject *kobj, > > + struct attribute *__attr, char *buf) > > { > > - unsigned int groupid; > > + struct iommu_group_attribute *attr = to_iommu_group_attr(__attr); > > + struct iommu_group *group = to_iommu_group(kobj); > > + ssize_t ret = -EIO; > > > > - if (iommu_device_group(dev, &groupid)) > > - return 0; > > + if (attr->show) > > + ret = attr->show(group, buf); > > + return ret; > > +} > > + > > +static ssize_t iommu_group_attr_store(struct kobject *kobj, > > + struct attribute *__attr, > > + const char *buf, size_t count) > > +{ > > + struct iommu_group_attribute *attr = to_iommu_group_attr(__attr); > > + struct iommu_group *group = to_iommu_group(kobj); > > + ssize_t ret = -EIO; > > > > - return sprintf(buf, "%u", groupid); > > + if (attr->store) > > + ret = attr->store(group, buf, count); > > + return ret; > > } > > -static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL); > > > > -static int add_iommu_group(struct device *dev, void *data) > > +static void iommu_group_release(struct kobject *kobj) > > { > > - unsigned int groupid; > > + struct iommu_group *group = to_iommu_group(kobj); > > > > - if (iommu_device_group(dev, &groupid) == 0) > > - return device_create_file(dev, &dev_attr_iommu_group); > > + mutex_lock(&iommu_group_mutex); > > + idr_remove(&iommu_group_idr, group->id); > > + mutex_unlock(&iommu_group_mutex); > > > > - return 0; > > + kfree(group); > > } > > 
> > -static int remove_iommu_group(struct device *dev) > > +static const struct sysfs_ops iommu_group_sysfs_ops = { > > + .show = iommu_group_attr_show, > > + .store = iommu_group_attr_store, > > +}; > > + > > +static struct kobj_type iommu_group_ktype = { > > + .sysfs_ops = &iommu_group_sysfs_ops, > > + .release = iommu_group_release, > > +}; > > + > > +static int add_iommu_group(struct device *dev, void *data) > > { > > - unsigned int groupid; > > + struct iommu_ops *ops = data; > > > > - if (iommu_device_group(dev, &groupid) == 0) > > - device_remove_file(dev, &dev_attr_iommu_group); > > + if (!ops->add_device) > > + return -ENODEV; > > > > - return 0; > > + WARN_ON(dev->iommu_group); > > + > > + return ops->add_device(dev); > > } > > > > static int iommu_device_notifier(struct notifier_block *nb, > > unsigned long action, void *data) > > { > > struct device *dev = data; > > + struct iommu_ops *ops = dev->bus->iommu_ops; > > > > if (action == BUS_NOTIFY_ADD_DEVICE) > > - return add_iommu_group(dev, NULL); > > - else if (action == BUS_NOTIFY_DEL_DEVICE) > > - return remove_iommu_group(dev); > > + return add_iommu_group(dev, ops); > > + else if (action == BUS_NOTIFY_DEL_DEVICE) { > > + if (ops->remove_device && dev->iommu_group) { > > + ops->remove_device(dev); > > + return 0; > > + } > > + } > > > > return 0; > > } > > @@ -79,7 +127,97 @@ static struct notifier_block iommu_device_nb = { > > static void iommu_bus_init(struct bus_type *bus, struct iommu_ops *ops) > > { > > bus_register_notifier(bus, &iommu_device_nb); > > - bus_for_each_dev(bus, NULL, NULL, add_iommu_group); > > + bus_for_each_dev(bus, NULL, ops, add_iommu_group); > > +} > > + > > +struct iommu_group *iommu_group_alloc(struct device *dma_dev) > > +{ > > + struct iommu_group *group; > > + int ret; > > + > > + group = kzalloc(sizeof(*group), GFP_KERNEL); > > + if (!group) > > + return ERR_PTR(-ENOMEM); > > + > > + group->kobj.kset = iommu_group_kset; > > + group->dma_dev = dma_dev; > > + 
atomic_set(&group->refcnt, 0); > > + > > + mutex_lock(&iommu_group_mutex); > > + > > +again: > > + if (unlikely(0 == idr_pre_get(&iommu_group_idr, GFP_KERNEL))) { > > + kfree(group); > > + mutex_unlock(&iommu_group_mutex); > > + return ERR_PTR(-ENOMEM); > > + } > > + > > + if (-EAGAIN == idr_get_new(&iommu_group_idr, group, &group->id)) > > + goto again; > > + > > + mutex_unlock(&iommu_group_mutex); > > + > > + ret = kobject_init_and_add(&group->kobj, &iommu_group_ktype, > > + NULL, "%d", group->id); > > + if (ret) { > > + iommu_group_release(&group->kobj); > > + return ERR_PTR(ret); > > + } > > + > > + group->devices_kobj = kobject_create_and_add("devices", &group->kobj); > > + if (!group->devices_kobj) { > > + kobject_put(&group->kobj); > > + return ERR_PTR(-ENOMEM); > > + } > > + > > + dma_dev->iommu_group = group; > > + > > + return group; > > +} > > + > > +void iommu_group_free(struct iommu_group *group) > > +{ > > + group->dma_dev->iommu_group = NULL; > > + WARN_ON(atomic_read(&group->refcnt)); > > This kind of implies that the representative device doesn't count > against the refcount, which seems wrong. It is, fixed in upcoming revision. 
> > + kobject_put(group->devices_kobj); > > + kobject_put(&group->kobj); > > +} > > + > > +bool iommu_group_empty(struct iommu_group *group) > > +{ > > + return (0 == atomic_read(&group->refcnt)); > > +} > > + > > +int iommu_group_add_device(struct iommu_group *group, struct device *dev) > > +{ > > + int ret; > > + > > + atomic_inc(&group->refcnt); > > + > > + ret = sysfs_create_link(&dev->kobj, &group->kobj, "iommu_group"); > > + if (ret) > > + return ret; > > + > > + ret = sysfs_create_link(group->devices_kobj, &dev->kobj, > > + kobject_name(&dev->kobj)); > > + if (ret) { > > + sysfs_remove_link(&dev->kobj, "iommu_group"); > > + return ret; > > + } > > + > > + dev->iommu_group = group; > > + > > + return 0; > > +} > > + > > +void iommu_group_remove_device(struct device *dev) > > +{ > > + sysfs_remove_link(dev->iommu_group->devices_kobj, > > + kobject_name(&dev->kobj)); > > + sysfs_remove_link(&dev->kobj, "iommu_group"); > > + > > + atomic_dec(&dev->iommu_group->refcnt); > > + dev->iommu_group = NULL; > > } > > > > /** > > @@ -335,9 +473,23 @@ EXPORT_SYMBOL_GPL(iommu_unmap); > > > > int iommu_device_group(struct device *dev, unsigned int *groupid) > > { > > - if (iommu_present(dev->bus) && dev->bus->iommu_ops->device_group) > > - return dev->bus->iommu_ops->device_group(dev, groupid); > > + if (dev->iommu_group) { > > + *groupid = dev->iommu_group->id; > > + return 0; > > + } > > > > return -ENODEV; > > } > > EXPORT_SYMBOL_GPL(iommu_device_group); > > + > > +static int __init iommu_init(void) > > +{ > > + iommu_group_kset = kset_create_and_add("iommu_groups", NULL, NULL); > > + idr_init(&iommu_group_idr); > > + mutex_init(&iommu_group_mutex); > > + > > + BUG_ON(!iommu_group_kset); > > + > > + return 0; > > +} > > +subsys_initcall(iommu_init); > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h > > index 2ee375c..24004d6 100644 > > --- a/include/linux/iommu.h > > +++ b/include/linux/iommu.h > > @@ -60,6 +60,8 @@ struct iommu_domain { > > * @iova_to_phys: 
translate iova to physical address > > * @domain_has_cap: domain capabilities query > > * @commit: commit iommu domain > > + * @add_device: add device to iommu grouping > > + * @remove_device: remove device from iommu grouping > > * @pgsize_bitmap: bitmap of supported page sizes > > */ > > struct iommu_ops { > > @@ -76,10 +78,25 @@ struct iommu_ops { > > int (*domain_has_cap)(struct iommu_domain *domain, > > unsigned long cap); > > int (*device_group)(struct device *dev, unsigned int *groupid); > > + int (*add_device)(struct device *dev); > > + void (*remove_device)(struct device *dev); > > unsigned long pgsize_bitmap; > > }; > > > > +/** > > + * struct iommu_group - groups of devices representing iommu visibility > > + * @dma_dev: all dma from the group appears to the iommu using this source id > > + * @kobj: iommu group node in sysfs > > + * @devices_kobj: sysfs subdir node for linking devices > > + * @refcnt: number of devices in group > > + * @id: unique id number for the group (for easy sysfs listing) > > + */ > > struct iommu_group { > > + struct device *dma_dev; > > So, a "representative" device works very nicely for the AMD and Intel > IOMMUs. But I'm not sure it makes sense for any IOMMU - we could > point it to the bridge in the Power case, but I'm not sure if that > really makes sense (particularly when it's a host bridge, not a p2p > bridge). In embedded cases it may make even less sense. Yep, as noted in the other reply, I've dropped this. > I think it would be better to move this into iommu driver specific > data (which I guess would mean adding a facility for iommu driver > specific data, either with a private pointer or using container_of). I was thinking a private pointer. Embedding the iommu group into an iommu structure probably wouldn't work well with the kobjects used to represent the group.
> > + struct kobject kobj; > > + struct kobject *devices_kobj; > > The devices_kobj has the same lifetime as the group, so it might as > well be a static member. In my working version, devices_kobj is what holds the reference for the group. Documentation/kobject.txt also expressly forbids more than a single embedded kobject. Thanks, Alex > > + atomic_t refcnt; > > + int id; > > }; > > > > extern int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops); > > @@ -101,6 +118,12 @@ extern int iommu_domain_has_cap(struct iommu_domain *domain, > > extern void iommu_set_fault_handler(struct iommu_domain *domain, > > iommu_fault_handler_t handler); > > extern int iommu_device_group(struct device *dev, unsigned int *groupid); > > +extern struct iommu_group *iommu_group_alloc(struct device *dev); > > +extern void iommu_group_free(struct iommu_group *group); > > +extern bool iommu_group_empty(struct iommu_group *group); > > +extern int iommu_group_add_device(struct iommu_group *group, > > + struct device *dev); > > +extern void iommu_group_remove_device(struct device *dev); > > > > /** > > * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework > > >
* [RFC PATCH 3/3] iommu: Create attach/detach group interface 2012-04-02 21:14 [RFC PATCH 0/3] IOMMU groups Alex Williamson 2012-04-02 21:14 ` [RFC PATCH 1/3] iommu: Introduce iommu_group Alex Williamson 2012-04-02 21:14 ` [RFC PATCH 2/3] iommu: Create basic group infrastructure and update AMD-Vi & Intel VT-d Alex Williamson @ 2012-04-02 21:14 ` Alex Williamson 2012-04-10 21:03 ` [RFC PATCH 0/3] IOMMU groups Alex Williamson 3 siblings, 0 replies; 9+ messages in thread From: Alex Williamson @ 2012-04-02 21:14 UTC (permalink / raw) To: david, dwmw2 Cc: alex.williamson, iommu, qemu-devel, joerg.roedel, kvm, linux-kernel IOMMU ops should be working at a group level rather than a device level as we cannot arbitrarily assign devices from the same group to different domains. For now this is just a simple wrapper that makes use of the dma_dev within a group. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> --- drivers/iommu/iommu.c | 12 ++++++++++++ include/linux/iommu.h | 15 +++++++++++++++ 2 files changed, 27 insertions(+), 0 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 6a51ed9..af77da1 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -327,6 +327,18 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev) } EXPORT_SYMBOL_GPL(iommu_detach_device); +int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group) +{ + return iommu_attach_device(domain, group->dma_dev); +} +EXPORT_SYMBOL_GPL(iommu_attach_group); + +void iommu_detach_group(struct iommu_domain *domain, struct iommu_group *group) +{ + iommu_detach_device(domain, group->dma_dev); +} +EXPORT_SYMBOL_GPL(iommu_detach_group); + phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, unsigned long iova) { diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 24004d6..d4e9a7a 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -107,6 +107,10 @@ extern int iommu_attach_device(struct iommu_domain 
*domain, struct device *dev); extern void iommu_detach_device(struct iommu_domain *domain, struct device *dev); +extern int iommu_attach_group(struct iommu_domain *domain, + struct iommu_group *group); +extern void iommu_detach_group(struct iommu_domain *domain, + struct iommu_group *group); extern int iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot); extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, @@ -194,6 +198,17 @@ static inline void iommu_detach_device(struct iommu_domain *domain, { } +static inline int iommu_attach_group(struct iommu_domain *domain, + struct iommu_group *group) +{ + return -ENODEV; +} + +static inline void iommu_detach_group(struct iommu_domain *domain, + struct iommu_group *group) +{ +} + static inline int iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, int gfp_order, int prot) {
* Re: [RFC PATCH 0/3] IOMMU groups 2012-04-02 21:14 [RFC PATCH 0/3] IOMMU groups Alex Williamson ` (2 preceding siblings ...) 2012-04-02 21:14 ` [RFC PATCH 3/3] iommu: Create attach/detach group interface Alex Williamson @ 2012-04-10 21:03 ` Alex Williamson 3 siblings, 0 replies; 9+ messages in thread From: Alex Williamson @ 2012-04-10 21:03 UTC (permalink / raw) To: david, dwmw2; +Cc: iommu, qemu-devel, joerg.roedel, kvm, linux-kernel Ping. Does this approach look like it could satisfy your desire for a more integrated group layer? I'd really like to move VFIO forward; we've been stalled on this long enough. David Woodhouse, I think this provides the quirking you're looking for for devices like the Ricoh; do you have any other requirements for a group layer? Thanks, Alex On Mon, 2012-04-02 at 15:14 -0600, Alex Williamson wrote: > This series attempts to make IOMMU device grouping a slightly more > integral part of the device model. iommu_device_groups were originally > introduced to support the VFIO user space driver interface which needs > to understand the granularity of device isolation in order to ensure > security of devices when assigned for user access. This information > was provided via a simple group identifier from the IOMMU driver allowing > VFIO to walk devices and assemble groups itself. > > The feedback received from this was that groups should be the effective > unit of work for the IOMMU API. The existing model of allowing domains > to be created and individual devices attached ignores many of the > restrictions of the IOMMU, whether by design, by topology or by defective > devices. Additionally we should be able to use the grouping information > at the dma ops layer for managing domains and quirking devices. > > This series is a sketch at implementing only those aspects and leaving > everything else about the multifaceted hairball of Isolation groups for > another API.
Please comment and let me know if this seems like the > direction we should be headed. Thanks, > > Alex > > > --- > > Alex Williamson (3): > iommu: Create attach/detach group interface > iommu: Create basic group infrastructure and update AMD-Vi & Intel VT-d > iommu: Introduce iommu_group > > > drivers/iommu/amd_iommu.c | 50 ++++++---- > drivers/iommu/intel-iommu.c | 76 ++++++++-------- > drivers/iommu/iommu.c | 210 ++++++++++++++++++++++++++++++++++++++----- > include/linux/device.h | 2 > include/linux/iommu.h | 43 +++++++++ > 5 files changed, 301 insertions(+), 80 deletions(-)
end of thread, other threads:[~2012-04-18 20:28 UTC | newest] Thread overview: 9+ messages -- 2012-04-02 21:14 [RFC PATCH 0/3] IOMMU groups Alex Williamson 2012-04-02 21:14 ` [RFC PATCH 1/3] iommu: Introduce iommu_group Alex Williamson 2012-04-18 9:58 ` David Gibson 2012-04-18 20:07 ` Alex Williamson 2012-04-02 21:14 ` [RFC PATCH 2/3] iommu: Create basic group infrastructure and update AMD-Vi & Intel VT-d Alex Williamson 2012-04-18 11:55 ` David Gibson 2012-04-18 20:28 ` Alex Williamson 2012-04-02 21:14 ` [RFC PATCH 3/3] iommu: Create attach/detach group interface Alex Williamson 2012-04-10 21:03 ` [RFC PATCH 0/3] IOMMU groups Alex Williamson