linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device
@ 2019-01-10  3:00 Lu Baolu
  2019-01-10  3:00 ` [PATCH v5 1/8] iommu: Add APIs for multiple domains per device Lu Baolu
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu

Hi,

The Mediate Device is a framework for fine-grained physical device
sharing across the isolated domains. Currently the mdev framework
is designed to be independent of the platform IOMMU support. As the
result, the DMA isolation relies on the mdev parent device in a
vendor specific way.

There are several cases where a mediated device could be protected
and isolated by the platform IOMMU. For example, Intel vt-d rev3.0
[1] introduces a new translation mode called 'scalable mode', which
enables PASID-granular translations. The vt-d scalable mode is the
key ingredient for Scalable I/O Virtualization [2] [3] which allows
sharing a device in minimal possible granularity (ADI - Assignable
Device Interface).

A mediated device backed by an ADI could be protected and isolated
by the IOMMU since 1) the parent device supports tagging an unique
PASID to all DMA traffic out of the mediated device; and 2) the DMA
translation unit (IOMMU) supports the PASID granular translation.
We can apply IOMMU protection and isolation to this kind of devices
just as what we are doing with an assignable PCI device.

In order to distinguish the IOMMU-capable mediated devices from those
which still need to rely on parent devices, this patch set adds one
new member in struct mdev_device.

* iommu_device
  - This, if set, indicates that the mediated device could
    be fully isolated and protected by IOMMU via attaching
    an iommu domain to this device. If empty, it indicates
    using vendor defined isolation.

Below helpers are added to set and get above iommu device in mdev core
implementation.

* mdev_set/get_iommu_device(dev, iommu_device)
  - Set or get the iommu device which represents this mdev
    in IOMMU's device scope. Drivers don't need to set the
    iommu device if it uses vendor defined isolation.

The mdev parent device driver could opt-in that the mdev could be
fully isolated and protected by the IOMMU when the mdev is being
created by invoking mdev_set_iommu_device() in its @create().

In the vfio_iommu_type1_attach_group(), a domain allocated through
iommu_domain_alloc() will be attached to the mdev iommu device if
an iommu device has been set. Otherwise, the dummy external domain
will be used and all the DMA isolation and protection are routed to
parent driver as the result.

On IOMMU side, a basic requirement is allowing to attach multiple
domains to a PCI device if the device advertises the capability
and the IOMMU hardware supports finer granularity translations than
the normal PCI Source ID based translation.

As the result, a PCI device could work in two modes: normal mode
and auxiliary mode. In the normal mode, a pci device could be
isolated in the Source ID granularity; the pci device itself could
be assigned to a user application by attaching a single domain
to it. In the auxiliary mode, a pci device could be isolated in
finer granularity, hence subsets of the device could be assigned
to different user level application by attaching a different domain
to each subset.

Below APIs are introduced in iommu generic layer for aux-domain
purpose:

* iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Check whether both IOMMU and device support IOMMU aux
    domain feature. Below aux-domain specific interfaces
    are available only after this returns true.

* iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Enable/disable device specific aux-domain feature.

* iommu_aux_attach_device(domain, dev)
  - Attaches @domain to @dev in the auxiliary mode. Multiple
    domains could be attached to a single device in the
    auxiliary mode with each domain representing an isolated
    address space for an assignable subset of the device.

* iommu_aux_detach_device(domain, dev)
  - Detach @domain which has been attached to @dev in the
    auxiliary mode.

* iommu_aux_get_pasid(domain, dev)
  - Return ID used for finer-granularity DMA translation.
    For the Intel Scalable IOV usage model, this will be
    a PASID. The device which supports Scalable IOV needs
    to write this ID to the device register so that DMA
    requests could be tagged with a right PASID prefix.

In order for the ease of discussion, sometimes we call "a domain in
auxiliary mode' or simply 'an auxiliary domain' when a domain is
attached to a device for finer granularity translations. But we need
to keep in mind that this doesn't mean there is a differnt domain
type. A same domain could be bound to a device for Source ID based
translation, and bound to another device for finer granularity
translation at the same time.

This patch series extends both IOMMU and vfio components to support
mdev device passing through when it could be isolated and protected
by the IOMMU units. The first part of this series (PATCH 1/08~5/08)
adds the interfaces and implementation of the multiple domains per
device. The second part (PATCH 6/08~8/08) adds the iommu device
attribute to each mdev, determines isolation type according to the
existence of an iommu device when attaching group in vfio type1 iommu
module, and attaches the domain to iommu aware mediated devices.

References:
[1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf

Best regards,
Lu Baolu

Change log:
  v4->v5:
  - The iommu APIs have been updated with Joerg's proposal posted
    here https://www.spinics.net/lists/iommu/msg31874.html.
  - Some typos in commit message and comments have been fixed.
  - PATCH 3/8 was split from 4/8 to ease code review.
  - mdev->domain was removed and could bring back when there's a
    real consumer.
  - Other code review comments I received during v4 review period
    except the EXPORT_SYMBOL vs. EXPORT_SYMBOL_GPL in PATCH 6/8.
  - Rebase all patches to 5.0-rc1.

  v3->v4:
  - Use aux domain specific interfaces for domain attach and detach.
  - Rebase all patches to 4.20-rc1.

  v2->v3:
  - Remove domain type enum and use a pointer on mdev_device instead.
  - Add a generic interface for getting/setting per device iommu
    attributions. And use it for query aux domain capability, enable
    aux domain and disable aux domain purpose.
  - Reuse iommu_domain_get_attr() to retrieve the id in a aux domain.
  - We discussed the impact of the default domain implementation
    on reusing iommu_at(de)tach_device() interfaces. We agreed
    that reusing iommu_at(de)tach_device() interfaces is the right
    direction and we could tweak the code to remove the impact.
    https://www.spinics.net/lists/kvm/msg175285.html  
  - Removed the RFC tag since no objections received.
  - This patch has been submitted separately.
    https://www.spinics.net/lists/kvm/msg173936.html

  v1->v2:
  - Rewrite the patches with the concept of auxiliary domains.

Lu Baolu (8):
  iommu: Add APIs for multiple domains per device
  iommu/vt-d: Add per-device IOMMU feature ops entries
  iommu/vt-d: Move common code out of iommu_attch_device()
  iommu/vt-d: Aux-domain specific domain attach/detach
  iommu/vt-d: Return ID associated with an auxiliary domain
  vfio/mdev: Add iommu related member in mdev_device
  vfio/type1: Add domain at(de)taching group helpers
  vfio/type1: Handle different mdev isolation type

 drivers/iommu/intel-iommu.c      | 304 ++++++++++++++++++++++++++++---
 drivers/iommu/iommu.c            |  80 ++++++++
 drivers/vfio/mdev/mdev_core.c    |  18 ++
 drivers/vfio/mdev/mdev_private.h |   1 +
 drivers/vfio/vfio_iommu_type1.c  | 132 ++++++++++++--
 include/linux/intel-iommu.h      |  11 ++
 include/linux/iommu.h            |  61 +++++++
 include/linux/mdev.h             |  14 ++
 8 files changed, 585 insertions(+), 36 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 1/8] iommu: Add APIs for multiple domains per device
  2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
@ 2019-01-10  3:00 ` Lu Baolu
  2019-01-14 11:22   ` Jonathan Cameron
  2019-01-10  3:00 ` [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries Lu Baolu
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu, Jacob Pan

Sharing a physical PCI device in a finer-granularity way
is becoming a consensus in the industry. IOMMU vendors
are also engaging efforts to support such sharing as well
as possible. Among the efforts, the capability of support
finer-granularity DMA isolation is a common requirement
due to the security consideration. With finer-granularity
DMA isolation, all DMA requests out of or to a subset of
a physical PCI device can be protected by the IOMMU. As a
result, there is a request in software to attach multiple
domains to a physical PCI device. One example of such use
model is the Intel Scalable IOV [1] [2]. The Intel vt-d
3.0 spec [3] introduces the scalable mode which enables
PASID granularity DMA isolation.

This adds the APIs to support multiple domains per device.
In order to ease the discussions, we call it 'a domain in
auxiliary mode' or simply 'auxiliary domain' when multiple
domains are attached to a physical device.

The APIs include:

* iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Check whether both IOMMU and device support IOMMU aux
    domain feature. Below aux-domain specific interfaces
    are available only after this returns true.

* iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Enable/disable device specific aux-domain feature.

* iommu_aux_attach_device(domain, dev)
  - Attaches @domain to @dev in the auxiliary mode. Multiple
    domains could be attached to a single device in the
    auxiliary mode with each domain representing an isolated
    address space for an assignable subset of the device.

* iommu_aux_detach_device(domain, dev)
  - Detach @domain which has been attached to @dev in the
    auxiliary mode.

* iommu_aux_get_pasid(domain, dev)
  - Return ID used for finer-granularity DMA translation.
    For the Intel Scalable IOV usage model, this will be
    a PASID. The device which supports Scalable IOV needs
    to write this ID to the device register so that DMA
    requests could be tagged with a right PASID prefix.

This has been updated with the latest proposal from Joerg
posted here [5].

Many people involved in discussions of this design.

Kevin Tian <kevin.tian@intel.com>
Liu Yi L <yi.l.liu@intel.com>
Ashok Raj <ashok.raj@intel.com>
Sanjay Kumar <sanjay.k.kumar@intel.com>
Jacob Pan <jacob.jun.pan@linux.intel.com>
Alex Williamson <alex.williamson@redhat.com>
Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Joerg Roedel <joro@8bytes.org>

and some discussions can be found here [4] [5].

[1] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
[3] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[4] https://lkml.org/lkml/2018/7/26/4
[5] https://www.spinics.net/lists/iommu/msg31874.html

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/iommu.c | 80 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h | 61 +++++++++++++++++++++++++++++++++
 2 files changed, 141 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db334341..9166b6145409 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2033,3 +2033,83 @@ int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
+
+/*
+ * Per device IOMMU features.
+ */
+bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
+{
+	const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+	if (ops && ops->dev_has_feat)
+		return ops->dev_has_feat(dev, feat);
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(iommu_dev_has_feature);
+
+int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
+{
+	const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+	if (ops && ops->dev_enable_feat)
+		return ops->dev_enable_feat(dev, feat);
+
+	return -ENODEV;
+}
+EXPORT_SYMBOL_GPL(iommu_dev_enable_feature);
+
+int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
+{
+	const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+	if (ops && ops->dev_disable_feat)
+		return ops->dev_disable_feat(dev, feat);
+
+	return -ENODEV;
+}
+EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
+
+/*
+ * Aux-domain specific attach/detach.
+ *
+ * Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX) returns true.
+ * Also, as long as domains are attached to a device through this interface,
+ * any tries to call iommu_attach_device() should fail (iommu_detach_device()
+ * can't fail, so we fail on the tryint to re-attach). This should make us safe
+ * against a device being attached to a guest as a whole while there are still
+ * pasid users on it (aux and sva).
+ */
+int iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev)
+{
+	int ret = -ENODEV;
+
+	if (domain->ops->aux_attach_dev)
+		ret = domain->ops->aux_attach_dev(domain, dev);
+
+	if (!ret)
+		trace_attach_device_to_domain(dev);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_aux_attach_device);
+
+void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev)
+{
+	if (domain->ops->aux_detach_dev) {
+		domain->ops->aux_detach_dev(domain, dev);
+		trace_detach_device_from_domain(dev);
+	}
+}
+EXPORT_SYMBOL_GPL(iommu_aux_detach_device);
+
+int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
+{
+	int ret = -ENODEV;
+
+	if (domain->ops->aux_get_pasid)
+		ret = domain->ops->aux_get_pasid(domain, dev);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e90da6b6f3d1..f4c3d2a2cc87 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -156,6 +156,11 @@ struct iommu_resv_region {
 	enum iommu_resv_type	type;
 };
 
+/* Per device IOMMU features */
+enum iommu_dev_features {
+	IOMMU_DEV_FEAT_AUX,	/* Aux-domain feature */
+};
+
 #ifdef CONFIG_IOMMU_API
 
 /**
@@ -183,6 +188,10 @@ struct iommu_resv_region {
  * @domain_window_enable: Configure and enable a particular window for a domain
  * @domain_window_disable: Disable a particular window for a domain
  * @of_xlate: add OF master IDs to iommu grouping
+ * @dev_has/enable/disable_feat: per device entries to check/enable/disable
+ *                               iommu specific features.
+ * @aux_attach/detach_dev: aux-domain specific attach/detach entries.
+ * @aux_get_pasid: get the pasid given an aux-domain
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  */
 struct iommu_ops {
@@ -226,6 +235,16 @@ struct iommu_ops {
 	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
 	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
 
+	/* Per device IOMMU features */
+	bool (*dev_has_feat)(struct device *dev, enum iommu_dev_features f);
+	int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
+	int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
+
+	/* Aux-domain specific attach/detach entries */
+	int (*aux_attach_dev)(struct iommu_domain *domain, struct device *dev);
+	void (*aux_detach_dev)(struct iommu_domain *domain, struct device *dev);
+	int (*aux_get_pasid)(struct iommu_domain *domain, struct device *dev);
+
 	unsigned long pgsize_bitmap;
 };
 
@@ -412,6 +431,13 @@ static inline void dev_iommu_fwspec_set(struct device *dev,
 int iommu_probe_device(struct device *dev);
 void iommu_release_device(struct device *dev);
 
+bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat);
+int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat);
+int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat);
+int iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev);
+void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev);
+int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -696,6 +722,41 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
 	return NULL;
 }
 
+static inline bool
+iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
+{
+	return false;
+}
+
+static inline int
+iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
+{
+	return -ENODEV;
+}
+
+static inline int
+iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
+{
+	return -ENODEV;
+}
+
+static inline int
+iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev)
+{
+	return -ENODEV;
+}
+
+static inline void
+iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev)
+{
+}
+
+static inline int
+iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
+{
+	return -ENODEV;
+}
+
 #endif /* CONFIG_IOMMU_API */
 
 #ifdef CONFIG_IOMMU_DEBUGFS
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries
  2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
  2019-01-10  3:00 ` [PATCH v5 1/8] iommu: Add APIs for multiple domains per device Lu Baolu
@ 2019-01-10  3:00 ` Lu Baolu
  2019-01-11 11:16   ` Joerg Roedel
  2019-01-10  3:00 ` [PATCH v5 3/8] iommu/vt-d: Move common code out of iommu_attch_device() Lu Baolu
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu, Jacob Pan

This adds the iommu ops entries for aux-domain per-device
feature query and enable/disable.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 86 +++++++++++++++++++++++++++++++++++++
 include/linux/intel-iommu.h |  1 +
 2 files changed, 87 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2bd9ac285c0d..ee8832d26f7e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2481,6 +2481,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	info->domain = domain;
 	info->iommu = iommu;
 	info->pasid_table = NULL;
+	info->auxd_enabled = 0;
 
 	if (dev && dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(info->dev);
@@ -5215,6 +5216,24 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
 	return phys;
 }
 
+static inline bool scalable_mode_support(void)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+	bool ret = true;
+
+	rcu_read_lock();
+	for_each_active_iommu(iommu, drhd) {
+		if (!sm_supported(iommu)) {
+			ret = false;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
@@ -5379,6 +5398,70 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
 }
 #endif /* CONFIG_INTEL_IOMMU_SVM */
 
+static int intel_iommu_enable_auxd(struct device *dev)
+{
+	struct device_domain_info *info;
+	struct dmar_domain *domain;
+	unsigned long flags;
+
+	if (!scalable_mode_support())
+		return -ENODEV;
+
+	domain = get_valid_domain_for_dev(dev);
+	if (!domain)
+		return -ENODEV;
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	info = dev->archdata.iommu;
+	info->auxd_enabled = 1;
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+
+	return 0;
+}
+
+static int intel_iommu_disable_auxd(struct device *dev)
+{
+	struct device_domain_info *info;
+	unsigned long flags;
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	info = dev->archdata.iommu;
+	if (!WARN_ON(!info))
+		info->auxd_enabled = 0;
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+
+	return 0;
+}
+
+static bool
+intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
+{
+	struct device_domain_info *info = dev->archdata.iommu;
+
+	if (feat == IOMMU_DEV_FEAT_AUX)
+		return scalable_mode_support() && info && info->auxd_enabled;
+
+	return false;
+}
+
+static int
+intel_iommu_dev_enable_feat(struct device *dev, enum iommu_dev_features feat)
+{
+	if (feat == IOMMU_DEV_FEAT_AUX)
+		return intel_iommu_enable_auxd(dev);
+
+	return -ENODEV;
+}
+
+static int
+intel_iommu_dev_disable_feat(struct device *dev, enum iommu_dev_features feat)
+{
+	if (feat == IOMMU_DEV_FEAT_AUX)
+		return intel_iommu_disable_auxd(dev);
+
+	return -ENODEV;
+}
+
 const struct iommu_ops intel_iommu_ops = {
 	.capable		= intel_iommu_capable,
 	.domain_alloc		= intel_iommu_domain_alloc,
@@ -5393,6 +5476,9 @@ const struct iommu_ops intel_iommu_ops = {
 	.get_resv_regions	= intel_iommu_get_resv_regions,
 	.put_resv_regions	= intel_iommu_put_resv_regions,
 	.device_group		= pci_device_group,
+	.dev_has_feat		= intel_iommu_dev_has_feat,
+	.dev_enable_feat	= intel_iommu_dev_enable_feat,
+	.dev_disable_feat	= intel_iommu_dev_disable_feat,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 0605f3bf6e79..7cf9f7f3724a 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -571,6 +571,7 @@ struct device_domain_info {
 	u8 pri_enabled:1;
 	u8 ats_supported:1;
 	u8 ats_enabled:1;
+	u8 auxd_enabled:1;	/* Multiple domains per device */
 	u8 ats_qdep;
 	struct device *dev; /* it's NULL for PCIe-to-PCI bridge */
 	struct intel_iommu *iommu; /* IOMMU used by this device */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 3/8] iommu/vt-d: Move common code out of iommu_attch_device()
  2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
  2019-01-10  3:00 ` [PATCH v5 1/8] iommu: Add APIs for multiple domains per device Lu Baolu
  2019-01-10  3:00 ` [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries Lu Baolu
@ 2019-01-10  3:00 ` Lu Baolu
  2019-01-14 11:45   ` Jonathan Cameron
  2019-01-10  3:00 ` [PATCH v5 4/8] iommu/vt-d: Aux-domain specific domain attach/detach Lu Baolu
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu, Jacob Pan

This part of code could be used by both normal and aux
domain specific attach entries. Hence move them into a
common function to avoid duplication.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 60 ++++++++++++++++++++++---------------
 1 file changed, 36 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index ee8832d26f7e..e9119d45a29d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5058,35 +5058,14 @@ static void intel_iommu_domain_free(struct iommu_domain *domain)
 	domain_exit(to_dmar_domain(domain));
 }
 
-static int intel_iommu_attach_device(struct iommu_domain *domain,
-				     struct device *dev)
+static int prepare_domain_attach_device(struct iommu_domain *domain,
+					struct device *dev)
 {
 	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
 	struct intel_iommu *iommu;
 	int addr_width;
 	u8 bus, devfn;
 
-	if (device_is_rmrr_locked(dev)) {
-		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
-		return -EPERM;
-	}
-
-	/* normally dev is not mapped */
-	if (unlikely(domain_context_mapped(dev))) {
-		struct dmar_domain *old_domain;
-
-		old_domain = find_domain(dev);
-		if (old_domain) {
-			rcu_read_lock();
-			dmar_remove_one_dev_info(old_domain, dev);
-			rcu_read_unlock();
-
-			if (!domain_type_is_vm_or_si(old_domain) &&
-			     list_empty(&old_domain->devices))
-				domain_exit(old_domain);
-		}
-	}
-
 	iommu = device_to_iommu(dev, &bus, &devfn);
 	if (!iommu)
 		return -ENODEV;
@@ -5119,7 +5098,40 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 		dmar_domain->agaw--;
 	}
 
-	return domain_add_dev_info(dmar_domain, dev);
+	return 0;
+}
+
+static int intel_iommu_attach_device(struct iommu_domain *domain,
+				     struct device *dev)
+{
+	int ret;
+
+	if (device_is_rmrr_locked(dev)) {
+		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
+		return -EPERM;
+	}
+
+	/* normally dev is not mapped */
+	if (unlikely(domain_context_mapped(dev))) {
+		struct dmar_domain *old_domain;
+
+		old_domain = find_domain(dev);
+		if (old_domain) {
+			rcu_read_lock();
+			dmar_remove_one_dev_info(old_domain, dev);
+			rcu_read_unlock();
+
+			if (!domain_type_is_vm_or_si(old_domain) &&
+			    list_empty(&old_domain->devices))
+				domain_exit(old_domain);
+		}
+	}
+
+	ret = prepare_domain_attach_device(domain, dev);
+	if (ret)
+		return ret;
+
+	return domain_add_dev_info(to_dmar_domain(domain), dev);
 }
 
 static void intel_iommu_detach_device(struct iommu_domain *domain,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 4/8] iommu/vt-d: Aux-domain specific domain attach/detach
  2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
                   ` (2 preceding siblings ...)
  2019-01-10  3:00 ` [PATCH v5 3/8] iommu/vt-d: Move common code out of iommu_attch_device() Lu Baolu
@ 2019-01-10  3:00 ` Lu Baolu
  2019-01-14 12:26   ` Jonathan Cameron
  2019-01-10  3:00 ` [PATCH v5 5/8] iommu/vt-d: Return ID associated with an auxiliary domain Lu Baolu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu, Jacob Pan

When multiple domains per device has been enabled by the
device driver, the device will tag the default PASID for
the domain to all DMA traffics out of the subset of this
device; and the IOMMU should translate the DMA requests
in PASID granularity.

This adds the intel_iommu_aux_attach/detach_device() ops
to support managing PASID granular translation structures
when the device driver has enabled multiple domains per
device.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 152 ++++++++++++++++++++++++++++++++++++
 include/linux/intel-iommu.h |  10 +++
 2 files changed, 162 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index e9119d45a29d..b8fb6a4bd447 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2482,6 +2482,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	info->iommu = iommu;
 	info->pasid_table = NULL;
 	info->auxd_enabled = 0;
+	INIT_LIST_HEAD(&info->auxiliary_domains);
 
 	if (dev && dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(info->dev);
@@ -5058,6 +5059,131 @@ static void intel_iommu_domain_free(struct iommu_domain *domain)
 	domain_exit(to_dmar_domain(domain));
 }
 
+/*
+ * Check whether a @domain could be attached to the @dev through the
+ * aux-domain attach/detach APIs.
+ */
+static inline bool
+is_aux_domain(struct device *dev, struct iommu_domain *domain)
+{
+	struct device_domain_info *info = dev->archdata.iommu;
+
+	return info && info->auxd_enabled &&
+			domain->type == IOMMU_DOMAIN_UNMANAGED;
+}
+
+static void auxiliary_link_device(struct dmar_domain *domain,
+				  struct device *dev)
+{
+	struct device_domain_info *info = dev->archdata.iommu;
+
+	assert_spin_locked(&device_domain_lock);
+	if (WARN_ON(!info))
+		return;
+
+	domain->auxd_refcnt++;
+	list_add(&domain->auxd, &info->auxiliary_domains);
+}
+
+static void auxiliary_unlink_device(struct dmar_domain *domain,
+				    struct device *dev)
+{
+	struct device_domain_info *info = dev->archdata.iommu;
+
+	assert_spin_locked(&device_domain_lock);
+	if (WARN_ON(!info))
+		return;
+
+	list_del(&domain->auxd);
+	domain->auxd_refcnt--;
+
+	if (!domain->auxd_refcnt && domain->default_pasid > 0)
+		intel_pasid_free_id(domain->default_pasid);
+}
+
+static int aux_domain_add_dev(struct dmar_domain *domain,
+			      struct device *dev)
+{
+	int ret;
+	u8 bus, devfn;
+	unsigned long flags;
+	struct intel_iommu *iommu;
+
+	iommu = device_to_iommu(dev, &bus, &devfn);
+	if (!iommu)
+		return -ENODEV;
+
+	if (domain->default_pasid <= 0) {
+		int pasid;
+
+		pasid = intel_pasid_alloc_id(domain, PASID_MIN,
+					     pci_max_pasids(to_pci_dev(dev)),
+					     GFP_KERNEL);
+		if (pasid <= 0) {
+			pr_err("Can't allocate default pasid\n");
+			return -ENODEV;
+		}
+		domain->default_pasid = pasid;
+	}
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	/*
+	 * iommu->lock must be held to attach domain to iommu and setup the
+	 * pasid entry for second level translation.
+	 */
+	spin_lock(&iommu->lock);
+	ret = domain_attach_iommu(domain, iommu);
+	if (ret)
+		goto attach_failed;
+
+	/* Setup the PASID entry for mediated devices: */
+	ret = intel_pasid_setup_second_level(iommu, domain, dev,
+					     domain->default_pasid);
+	if (ret)
+		goto table_failed;
+	spin_unlock(&iommu->lock);
+
+	auxiliary_link_device(domain, dev);
+
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+
+	return 0;
+
+table_failed:
+	domain_detach_iommu(domain, iommu);
+attach_failed:
+	spin_unlock(&iommu->lock);
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+	if (!domain->auxd_refcnt && domain->default_pasid > 0)
+		intel_pasid_free_id(domain->default_pasid);
+
+	return ret;
+}
+
+static void aux_domain_remove_dev(struct dmar_domain *domain,
+				  struct device *dev)
+{
+	struct device_domain_info *info;
+	struct intel_iommu *iommu;
+	unsigned long flags;
+
+	if (!is_aux_domain(dev, &domain->domain))
+		return;
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	info = dev->archdata.iommu;
+	iommu = info->iommu;
+
+	auxiliary_unlink_device(domain, dev);
+
+	spin_lock(&iommu->lock);
+	intel_pasid_tear_down_entry(iommu, dev, domain->default_pasid);
+	domain_detach_iommu(domain, iommu);
+	spin_unlock(&iommu->lock);
+
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+}
+
 static int prepare_domain_attach_device(struct iommu_domain *domain,
 					struct device *dev)
 {
@@ -5111,6 +5237,9 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 		return -EPERM;
 	}
 
+	if (is_aux_domain(dev, domain))
+		return -EPERM;
+
 	/* normally dev is not mapped */
 	if (unlikely(domain_context_mapped(dev))) {
 		struct dmar_domain *old_domain;
@@ -5134,12 +5263,33 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 	return domain_add_dev_info(to_dmar_domain(domain), dev);
 }
 
+static int intel_iommu_aux_attach_device(struct iommu_domain *domain,
+					 struct device *dev)
+{
+	int ret;
+
+	if (!is_aux_domain(dev, domain))
+		return -EPERM;
+
+	ret = prepare_domain_attach_device(domain, dev);
+	if (ret)
+		return ret;
+
+	return aux_domain_add_dev(to_dmar_domain(domain), dev);
+}
+
 static void intel_iommu_detach_device(struct iommu_domain *domain,
 				      struct device *dev)
 {
 	dmar_remove_one_dev_info(to_dmar_domain(domain), dev);
 }
 
+static void intel_iommu_aux_detach_device(struct iommu_domain *domain,
+					  struct device *dev)
+{
+	aux_domain_remove_dev(to_dmar_domain(domain), dev);
+}
+
 static int intel_iommu_map(struct iommu_domain *domain,
 			   unsigned long iova, phys_addr_t hpa,
 			   size_t size, int iommu_prot)
@@ -5480,6 +5630,8 @@ const struct iommu_ops intel_iommu_ops = {
 	.domain_free		= intel_iommu_domain_free,
 	.attach_dev		= intel_iommu_attach_device,
 	.detach_dev		= intel_iommu_detach_device,
+	.aux_attach_dev		= intel_iommu_aux_attach_device,
+	.aux_detach_dev		= intel_iommu_aux_detach_device,
 	.map			= intel_iommu_map,
 	.unmap			= intel_iommu_unmap,
 	.iova_to_phys		= intel_iommu_iova_to_phys,
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 7cf9f7f3724a..b563a61a6c39 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -492,9 +492,11 @@ struct dmar_domain {
 					/* Domain ids per IOMMU. Use u16 since
 					 * domain ids are 16 bit wide according
 					 * to VT-d spec, section 9.3 */
+	unsigned int	auxd_refcnt;	/* Refcount of auxiliary attaching */
 
 	bool has_iotlb_device;
 	struct list_head devices;	/* all devices' list */
+	struct list_head auxd;		/* link to device's auxiliary list */
 	struct iova_domain iovad;	/* iova's that belong to this domain */
 
 	struct dma_pte	*pgd;		/* virtual address */
@@ -513,6 +515,11 @@ struct dmar_domain {
 					   2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
 	u64		max_addr;	/* maximum mapped address */
 
+	int		default_pasid;	/*
+					 * The default pasid used for non-SVM
+					 * traffic on mediated devices.
+					 */
+
 	struct iommu_domain domain;	/* generic domain data structure for
 					   iommu core */
 };
@@ -562,6 +569,9 @@ struct device_domain_info {
 	struct list_head link;	/* link to domain siblings */
 	struct list_head global; /* link to global list */
 	struct list_head table;	/* link to pasid table */
+	struct list_head auxiliary_domains; /* auxiliary domains
+					     * attached to this device
+					     */
 	u8 bus;			/* PCI bus number */
 	u8 devfn;		/* PCI devfn number */
 	u16 pfsid;		/* SRIOV physical function source ID */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 5/8] iommu/vt-d: Return ID associated with an auxiliary domain
  2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
                   ` (3 preceding siblings ...)
  2019-01-10  3:00 ` [PATCH v5 4/8] iommu/vt-d: Aux-domain specific domain attach/detach Lu Baolu
@ 2019-01-10  3:00 ` Lu Baolu
  2019-01-10  3:00 ` [PATCH v5 6/8] vfio/mdev: Add iommu related member in mdev_device Lu Baolu
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu, Jacob Pan

This adds support to return the default pasid associated with
an auxiliary domain. The PCI device which is bound with this
domain should use this value as the pasid for all DMA requests
of the subset of device which is isolated and protected with
this domain.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b8fb6a4bd447..614906276bf1 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5624,6 +5624,15 @@ intel_iommu_dev_disable_feat(struct device *dev, enum iommu_dev_features feat)
 	return -ENODEV;
 }
 
+static int
+intel_iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+
+	return dmar_domain->default_pasid > 0 ?
+			dmar_domain->default_pasid : -EINVAL;
+}
+
 const struct iommu_ops intel_iommu_ops = {
 	.capable		= intel_iommu_capable,
 	.domain_alloc		= intel_iommu_domain_alloc,
@@ -5632,6 +5641,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.detach_dev		= intel_iommu_detach_device,
 	.aux_attach_dev		= intel_iommu_aux_attach_device,
 	.aux_detach_dev		= intel_iommu_aux_detach_device,
+	.aux_get_pasid		= intel_iommu_aux_get_pasid,
 	.map			= intel_iommu_map,
 	.unmap			= intel_iommu_unmap,
 	.iova_to_phys		= intel_iommu_iova_to_phys,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 6/8] vfio/mdev: Add iommu related member in mdev_device
  2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
                   ` (4 preceding siblings ...)
  2019-01-10  3:00 ` [PATCH v5 5/8] iommu/vt-d: Return ID associated with an auxiliary domain Lu Baolu
@ 2019-01-10  3:00 ` Lu Baolu
  2019-01-10  3:00 ` [PATCH v5 7/8] vfio/type1: Add domain at(de)taching group helpers Lu Baolu
  2019-01-10  3:00 ` [PATCH v5 8/8] vfio/type1: Handle different mdev isolation type Lu Baolu
  7 siblings, 0 replies; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu, Jacob Pan

A parent device might create different types of mediated
devices. For example, a mediated device could be created
by the parent device with full isolation and protection
provided by the IOMMU. One usage case could be found on
Intel platforms where a mediated device is an assignable
subset of a PCI, the DMA requests on behalf of it are all
tagged with a PASID. Since IOMMU supports PASID-granular
translations (scalable mode in VT-d 3.0), this mediated
device could be individually protected and isolated by an
IOMMU.

This patch adds a new member in the struct mdev_device to
indicate that the mediated device represented by mdev could
be isolated and protected by attaching a domain to a device
represented by mdev->iommu_device. It also adds a helper to
add or set the iommu device.

* mdev_device->iommu_device
  - This, if set, indicates that the mediated device could
    be fully isolated and protected by IOMMU via attaching
    an iommu domain to this device. If empty, it indicates
    using vendor defined isolation, hence bypass IOMMU.

* mdev_set/get_iommu_device(dev, iommu_device)
  - Set or get the iommu device which represents this mdev
    in IOMMU's device scope. Drivers don't need to set the
    iommu device if it uses vendor defined isolation.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/vfio/mdev/mdev_core.c    | 18 ++++++++++++++++++
 drivers/vfio/mdev/mdev_private.h |  1 +
 include/linux/mdev.h             | 14 ++++++++++++++
 3 files changed, 33 insertions(+)

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index 0212f0ee8aea..9be58d392d2b 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -390,6 +390,24 @@ int mdev_device_remove(struct device *dev, bool force_remove)
 	return 0;
 }
 
+int mdev_set_iommu_device(struct device *dev, struct device *iommu_device)
+{
+	struct mdev_device *mdev = to_mdev_device(dev);
+
+	mdev->iommu_device = iommu_device;
+
+	return 0;
+}
+EXPORT_SYMBOL(mdev_set_iommu_device);
+
+struct device *mdev_get_iommu_device(struct device *dev)
+{
+	struct mdev_device *mdev = to_mdev_device(dev);
+
+	return mdev->iommu_device;
+}
+EXPORT_SYMBOL(mdev_get_iommu_device);
+
 static int __init mdev_init(void)
 {
 	return mdev_bus_register();
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index b5819b7d7ef7..891841862ef8 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -34,6 +34,7 @@ struct mdev_device {
 	struct list_head next;
 	struct kobject *type_kobj;
 	bool active;
+	struct device *iommu_device;
 };
 
 #define to_mdev_device(dev)	container_of(dev, struct mdev_device, dev)
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index b6e048e1045f..c3ab8a9cfcc7 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -15,6 +15,20 @@
 
 struct mdev_device;
 
+/*
+ * Called by the parent device driver to set the device which represents
+ * this mdev in iommu protection scope. By default, the iommu device is
+ * NULL, that indicates using vendor defined isolation.
+ *
+ * @dev: the mediated device that iommu will isolate.
+ * @iommu_device: a pci device which represents the iommu for @dev.
+ *
+ * Return 0 for success, otherwise negative error value.
+ */
+int mdev_set_iommu_device(struct device *dev, struct device *iommu_device);
+
+struct device *mdev_get_iommu_device(struct device *dev);
+
 /**
  * struct mdev_parent_ops - Structure to be registered for each parent device to
  * register the device to mdev module.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 7/8] vfio/type1: Add domain at(de)taching group helpers
  2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
                   ` (5 preceding siblings ...)
  2019-01-10  3:00 ` [PATCH v5 6/8] vfio/mdev: Add iommu related member in mdev_device Lu Baolu
@ 2019-01-10  3:00 ` Lu Baolu
  2019-01-10  3:00 ` [PATCH v5 8/8] vfio/type1: Handle different mdev isolation type Lu Baolu
  7 siblings, 0 replies; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu, Jacob Pan

This adds helpers to attach or detach a domain to a
group. This will replace iommu_attach_group() which
only works for non-mdev devices.

If a domain is attaching to a group which includes the
mediated devices, it should attach to the iommu device
(a pci device which represents the mdev in iommu scope)
instead. The added helper supports attaching domain to
groups for both pci and mdev devices.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/vfio/vfio_iommu_type1.c | 84 ++++++++++++++++++++++++++++++---
 1 file changed, 77 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 7651cfb14836..97278ac8da95 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -91,6 +91,7 @@ struct vfio_dma {
 struct vfio_group {
 	struct iommu_group	*iommu_group;
 	struct list_head	next;
+	bool			mdev_group;	/* An mdev group */
 };
 
 /*
@@ -1298,6 +1299,75 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group *group, phys_addr_t *base)
 	return ret;
 }
 
+static struct device *vfio_mdev_get_iommu_device(struct device *dev)
+{
+	struct device *(*fn)(struct device *dev);
+	struct device *iommu_device;
+
+	fn = symbol_get(mdev_get_iommu_device);
+	if (fn) {
+		iommu_device = fn(dev);
+		symbol_put(mdev_get_iommu_device);
+
+		return iommu_device;
+	}
+
+	return NULL;
+}
+
+static int vfio_mdev_attach_domain(struct device *dev, void *data)
+{
+	struct iommu_domain *domain = data;
+	struct device *iommu_device;
+
+	iommu_device = vfio_mdev_get_iommu_device(dev);
+	if (iommu_device) {
+		if (iommu_dev_has_feature(iommu_device, IOMMU_DEV_FEAT_AUX))
+			return iommu_aux_attach_device(domain, iommu_device);
+		else
+			return iommu_attach_device(domain, iommu_device);
+	}
+
+	return -EINVAL;
+}
+
+static int vfio_mdev_detach_domain(struct device *dev, void *data)
+{
+	struct iommu_domain *domain = data;
+	struct device *iommu_device;
+
+	iommu_device = vfio_mdev_get_iommu_device(dev);
+	if (iommu_device) {
+		if (iommu_dev_has_feature(iommu_device, IOMMU_DEV_FEAT_AUX))
+			iommu_aux_detach_device(domain, iommu_device);
+		else
+			iommu_detach_device(domain, iommu_device);
+	}
+
+	return 0;
+}
+
+static int vfio_iommu_attach_group(struct vfio_domain *domain,
+				   struct vfio_group *group)
+{
+	if (group->mdev_group)
+		return iommu_group_for_each_dev(group->iommu_group,
+						domain->domain,
+						vfio_mdev_attach_domain);
+	else
+		return iommu_attach_group(domain->domain, group->iommu_group);
+}
+
+static void vfio_iommu_detach_group(struct vfio_domain *domain,
+				    struct vfio_group *group)
+{
+	if (group->mdev_group)
+		iommu_group_for_each_dev(group->iommu_group, domain->domain,
+					 vfio_mdev_detach_domain);
+	else
+		iommu_detach_group(domain->domain, group->iommu_group);
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 					 struct iommu_group *iommu_group)
 {
@@ -1373,7 +1443,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 			goto out_domain;
 	}
 
-	ret = iommu_attach_group(domain->domain, iommu_group);
+	ret = vfio_iommu_attach_group(domain, group);
 	if (ret)
 		goto out_domain;
 
@@ -1405,8 +1475,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		if (d->domain->ops == domain->domain->ops &&
 		    d->prot == domain->prot) {
-			iommu_detach_group(domain->domain, iommu_group);
-			if (!iommu_attach_group(d->domain, iommu_group)) {
+			vfio_iommu_detach_group(domain, group);
+			if (!vfio_iommu_attach_group(d, group)) {
 				list_add(&group->next, &d->group_list);
 				iommu_domain_free(domain->domain);
 				kfree(domain);
@@ -1414,7 +1484,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 				return 0;
 			}
 
-			ret = iommu_attach_group(domain->domain, iommu_group);
+			ret = vfio_iommu_attach_group(domain, group);
 			if (ret)
 				goto out_domain;
 		}
@@ -1440,7 +1510,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	return 0;
 
 out_detach:
-	iommu_detach_group(domain->domain, iommu_group);
+	vfio_iommu_detach_group(domain, group);
 out_domain:
 	iommu_domain_free(domain->domain);
 out_free:
@@ -1531,7 +1601,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		if (!group)
 			continue;
 
-		iommu_detach_group(domain->domain, iommu_group);
+		vfio_iommu_detach_group(domain, group);
 		list_del(&group->next);
 		kfree(group);
 		/*
@@ -1596,7 +1666,7 @@ static void vfio_release_domain(struct vfio_domain *domain, bool external)
 	list_for_each_entry_safe(group, group_tmp,
 				 &domain->group_list, next) {
 		if (!external)
-			iommu_detach_group(domain->domain, group->iommu_group);
+			vfio_iommu_detach_group(domain, group);
 		list_del(&group->next);
 		kfree(group);
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 8/8] vfio/type1: Handle different mdev isolation type
  2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
                   ` (6 preceding siblings ...)
  2019-01-10  3:00 ` [PATCH v5 7/8] vfio/type1: Add domain at(de)taching group helpers Lu Baolu
@ 2019-01-10  3:00 ` Lu Baolu
  7 siblings, 0 replies; 19+ messages in thread
From: Lu Baolu @ 2019-01-10  3:00 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
  Cc: ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Lu Baolu, Jacob Pan

This adds the support to determine the isolation type
of a mediated device group by checking whether it has
an iommu device. If an iommu device exists, an iommu
domain will be allocated and then attached to the iommu
device. Otherwise, keep the same behavior as it is.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/vfio/vfio_iommu_type1.c | 48 ++++++++++++++++++++++++++++-----
 1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 97278ac8da95..140366014a1b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1368,13 +1368,40 @@ static void vfio_iommu_detach_group(struct vfio_domain *domain,
 		iommu_detach_group(domain->domain, group->iommu_group);
 }
 
+static bool vfio_bus_is_mdev(struct bus_type *bus)
+{
+	struct bus_type *mdev_bus;
+	bool ret = false;
+
+	mdev_bus = symbol_get(mdev_bus_type);
+	if (mdev_bus) {
+		ret = (bus == mdev_bus);
+		symbol_put(mdev_bus_type);
+	}
+
+	return ret;
+}
+
+static int vfio_mdev_iommu_device(struct device *dev, void *data)
+{
+	struct device **old = data, *new;
+
+	new = vfio_mdev_get_iommu_device(dev);
+	if (!new || (*old && *old != new))
+		return -EINVAL;
+
+	*old = new;
+
+	return 0;
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 					 struct iommu_group *iommu_group)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_group *group;
 	struct vfio_domain *domain, *d;
-	struct bus_type *bus = NULL, *mdev_bus;
+	struct bus_type *bus = NULL;
 	int ret;
 	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base;
@@ -1409,23 +1436,30 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_free;
 
-	mdev_bus = symbol_get(mdev_bus_type);
+	if (vfio_bus_is_mdev(bus)) {
+		struct device *iommu_device = NULL;
 
-	if (mdev_bus) {
-		if ((bus == mdev_bus) && !iommu_present(bus)) {
-			symbol_put(mdev_bus_type);
+		group->mdev_group = true;
+
+		/* Determine the isolation type */
+		ret = iommu_group_for_each_dev(iommu_group, &iommu_device,
+					       vfio_mdev_iommu_device);
+		if (ret || !iommu_device) {
 			if (!iommu->external_domain) {
 				INIT_LIST_HEAD(&domain->group_list);
 				iommu->external_domain = domain;
-			} else
+			} else {
 				kfree(domain);
+			}
 
 			list_add(&group->next,
 				 &iommu->external_domain->group_list);
 			mutex_unlock(&iommu->lock);
+
 			return 0;
 		}
-		symbol_put(mdev_bus_type);
+
+		bus = iommu_device->bus;
 	}
 
 	domain->domain = iommu_domain_alloc(bus);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries
  2019-01-10  3:00 ` [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries Lu Baolu
@ 2019-01-11 11:16   ` Joerg Roedel
  2019-01-14  5:30     ` Lu Baolu
  2019-01-24  6:47     ` Lu Baolu
  0 siblings, 2 replies; 19+ messages in thread
From: Joerg Roedel @ 2019-01-11 11:16 UTC (permalink / raw)
  To: Lu Baolu
  Cc: David Woodhouse, Alex Williamson, Kirti Wankhede, ashok.raj,
	sanjay.k.kumar, jacob.jun.pan, kevin.tian, Jean-Philippe Brucker,
	yi.l.liu, yi.y.sun, peterx, tiwei.bie, Zeng, Xin, iommu, kvm,
	linux-kernel, Jacob Pan

Hi,

this looks a bit confusing to me because I can see no checking whether
the device actually supports scalable mode. More below:

On Thu, Jan 10, 2019 at 11:00:21AM +0800, Lu Baolu wrote:
> +static int intel_iommu_enable_auxd(struct device *dev)
> +{
> +	struct device_domain_info *info;
> +	struct dmar_domain *domain;
> +	unsigned long flags;
> +
> +	if (!scalable_mode_support())
> +		return -ENODEV;
> +
> +	domain = get_valid_domain_for_dev(dev);
> +	if (!domain)
> +		return -ENODEV;
> +
> +	spin_lock_irqsave(&device_domain_lock, flags);
> +	info = dev->archdata.iommu;
> +	info->auxd_enabled = 1;
> +	spin_unlock_irqrestore(&device_domain_lock, flags);
> +
> +	return 0;
> +}

This code sets a flag to mark scalable mode enabled. Doesn't the device
need some handling too, like enabling the PASID capability and all?

> +
> +static bool
> +intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
> +{
> +	struct device_domain_info *info = dev->archdata.iommu;
> +
> +	if (feat == IOMMU_DEV_FEAT_AUX)
> +		return scalable_mode_support() && info && info->auxd_enabled;
> +
> +	return false;
> +}

Why is this checking the auxd_enabled flag? The function should just
return whether the device _supports_ scalable mode, not whether it is
enabled.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries
  2019-01-11 11:16   ` Joerg Roedel
@ 2019-01-14  5:30     ` Lu Baolu
  2019-01-24  6:47     ` Lu Baolu
  1 sibling, 0 replies; 19+ messages in thread
From: Lu Baolu @ 2019-01-14  5:30 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: baolu.lu, David Woodhouse, Alex Williamson, Kirti Wankhede,
	ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	xin.zeng, iommu, kvm, linux-kernel, Jacob Pan

Hi Joerg,

Thanks for reviewing my patch.

On 1/11/19 7:16 PM, Joerg Roedel wrote:
> Hi,
> 
> this looks a bit confusing to me because I can see no checking whether
> the device actually supports scalable mode.

Yes. I should put some checking there. Device scalable mode capability
is exposed in PCI extended capability list.

> More below:
> 
> On Thu, Jan 10, 2019 at 11:00:21AM +0800, Lu Baolu wrote:
>> +static int intel_iommu_enable_auxd(struct device *dev)
>> +{
>> +	struct device_domain_info *info;
>> +	struct dmar_domain *domain;
>> +	unsigned long flags;
>> +
>> +	if (!scalable_mode_support())
>> +		return -ENODEV;
>> +
>> +	domain = get_valid_domain_for_dev(dev);
>> +	if (!domain)
>> +		return -ENODEV;
>> +
>> +	spin_lock_irqsave(&device_domain_lock, flags);
>> +	info = dev->archdata.iommu;
>> +	info->auxd_enabled = 1;
>> +	spin_unlock_irqrestore(&device_domain_lock, flags);
>> +
>> +	return 0;
>> +}
> 
> This code sets a flag to mark scalable mode enabled. Doesn't the device
> need some handling too, like enabling the PASID capability and all?

Yes. My design was rough. We should prepare the device for scalable mode
instead of assuming that everything is ready.

> 
>> +
>> +static bool
>> +intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	struct device_domain_info *info = dev->archdata.iommu;
>> +
>> +	if (feat == IOMMU_DEV_FEAT_AUX)
>> +		return scalable_mode_support() && info && info->auxd_enabled;
>> +
>> +	return false;
>> +}
> 
> Why is this checking the auxd_enabled flag?

We need an API to check whether this feature is enabled. In vfio, it
is used like below,

if (iommu_dev_has_feat(dev, FEAT_AUX_DOMAIN))
     iommu_aux_attach_device(dev, domain)
else
     iommu_attach_device(dev, domain)

> The function should just
> return whether the device _supports_ scalable mode, not whether it is
> enabled.

Do we want to have an API to tell whether device has aux-domain feature?
It could be included in the enable API. The enable API returns failure
if device doesn't support aux-domain.

> 
> Regards,
> 
> 	Joerg
> 

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 1/8] iommu: Add APIs for multiple domains per device
  2019-01-10  3:00 ` [PATCH v5 1/8] iommu: Add APIs for multiple domains per device Lu Baolu
@ 2019-01-14 11:22   ` Jonathan Cameron
  2019-01-15  1:33     ` Lu Baolu
  0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2019-01-14 11:22 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede,
	kevin.tian, ashok.raj, tiwei.bie, Jean-Philippe Brucker,
	sanjay.k.kumar, iommu, linux-kernel, Zeng, yi.y.sun,
	jacob.jun.pan, kvm

On Thu, 10 Jan 2019 11:00:20 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> Sharing a physical PCI device in a finer-granularity way
> is becoming a consensus in the industry. IOMMU vendors
> are also engaging efforts to support such sharing as well
> as possible. Among the efforts, the capability of support
> finer-granularity DMA isolation is a common requirement
> due to the security consideration. With finer-granularity
> DMA isolation, all DMA requests out of or to a subset of
> a physical PCI device can be protected by the IOMMU. As a
> result, there is a request in software to attach multiple
> domains to a physical PCI device. One example of such use
> model is the Intel Scalable IOV [1] [2]. The Intel vt-d
> 3.0 spec [3] introduces the scalable mode which enables
> PASID granularity DMA isolation.
> 
> This adds the APIs to support multiple domains per device.
> In order to ease the discussions, we call it 'a domain in
> auxiliary mode' or simply 'auxiliary domain' when multiple
> domains are attached to a physical device.
> 
> The APIs include:
> 
> * iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
>   - Check whether both IOMMU and device support IOMMU aux
>     domain feature. Below aux-domain specific interfaces
>     are available only after this returns true.
> 
> * iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
>   - Enable/disable device specific aux-domain feature.
> 
> * iommu_aux_attach_device(domain, dev)
>   - Attaches @domain to @dev in the auxiliary mode. Multiple
>     domains could be attached to a single device in the
>     auxiliary mode with each domain representing an isolated
>     address space for an assignable subset of the device.
> 
> * iommu_aux_detach_device(domain, dev)
>   - Detach @domain which has been attached to @dev in the
>     auxiliary mode.
> 
> * iommu_aux_get_pasid(domain, dev)
>   - Return ID used for finer-granularity DMA translation.
>     For the Intel Scalable IOV usage model, this will be
>     a PASID. The device which supports Scalable IOV needs
>     to write this ID to the device register so that DMA
>     requests could be tagged with a right PASID prefix.
> 
> This has been updated with the latest proposal from Joerg
> posted here [5].
> 
> Many people involved in discussions of this design.
> 
> Kevin Tian <kevin.tian@intel.com>
> Liu Yi L <yi.l.liu@intel.com>
> Ashok Raj <ashok.raj@intel.com>
> Sanjay Kumar <sanjay.k.kumar@intel.com>
> Jacob Pan <jacob.jun.pan@linux.intel.com>
> Alex Williamson <alex.williamson@redhat.com>
> Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> Joerg Roedel <joro@8bytes.org>
> 
> and some discussions can be found here [4] [5].
> 
> [1] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
> [2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
> [3] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
> [4] https://lkml.org/lkml/2018/7/26/4
> [5] https://www.spinics.net/lists/iommu/msg31874.html
> 
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Liu Yi L <yi.l.liu@intel.com>
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Suggested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> Suggested-by: Joerg Roedel <jroedel@suse.de>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

One trivial comment inline.

> ---
>  drivers/iommu/iommu.c | 80 +++++++++++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h | 61 +++++++++++++++++++++++++++++++++
>  2 files changed, 141 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 3ed4db334341..9166b6145409 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2033,3 +2033,83 @@ int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
> +
> +/*
> + * Per device IOMMU features.
> + */
> +bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
> +{
> +	const struct iommu_ops *ops = dev->bus->iommu_ops;
> +
> +	if (ops && ops->dev_has_feat)
> +		return ops->dev_has_feat(dev, feat);
> +
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(iommu_dev_has_feature);
> +
> +int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
> +{
> +	const struct iommu_ops *ops = dev->bus->iommu_ops;
> +
> +	if (ops && ops->dev_enable_feat)
> +		return ops->dev_enable_feat(dev, feat);
> +
> +	return -ENODEV;
> +}
> +EXPORT_SYMBOL_GPL(iommu_dev_enable_feature);
> +
> +int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
> +{
> +	const struct iommu_ops *ops = dev->bus->iommu_ops;
> +
> +	if (ops && ops->dev_disable_feat)
> +		return ops->dev_disable_feat(dev, feat);
> +
> +	return -ENODEV;
> +}
> +EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
> +
> +/*
> + * Aux-domain specific attach/detach.
> + *
> + * Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX) returns true.
> + * Also, as long as domains are attached to a device through this interface,
> + * any tries to call iommu_attach_device() should fail (iommu_detach_device()
> + * can't fail, so we fail on the tryint to re-attach). This should make us safe

when trying to re-attach. (perhaps?)

> + * against a device being attached to a guest as a whole while there are still
> + * pasid users on it (aux and sva).
> + */
> +int iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev)
> +{
> +	int ret = -ENODEV;
> +
> +	if (domain->ops->aux_attach_dev)
> +		ret = domain->ops->aux_attach_dev(domain, dev);
> +
> +	if (!ret)
> +		trace_attach_device_to_domain(dev);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_aux_attach_device);
> +
> +void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev)
> +{
> +	if (domain->ops->aux_detach_dev) {
> +		domain->ops->aux_detach_dev(domain, dev);
> +		trace_detach_device_from_domain(dev);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(iommu_aux_detach_device);
> +
> +int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
> +{
> +	int ret = -ENODEV;
> +
> +	if (domain->ops->aux_get_pasid)
> +		ret = domain->ops->aux_get_pasid(domain, dev);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index e90da6b6f3d1..f4c3d2a2cc87 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -156,6 +156,11 @@ struct iommu_resv_region {
>  	enum iommu_resv_type	type;
>  };
>  
> +/* Per device IOMMU features */
> +enum iommu_dev_features {
> +	IOMMU_DEV_FEAT_AUX,	/* Aux-domain feature */
> +};
> +
>  #ifdef CONFIG_IOMMU_API
>  
>  /**
> @@ -183,6 +188,10 @@ struct iommu_resv_region {
>   * @domain_window_enable: Configure and enable a particular window for a domain
>   * @domain_window_disable: Disable a particular window for a domain
>   * @of_xlate: add OF master IDs to iommu grouping
> + * @dev_has/enable/disable_feat: per device entries to check/enable/disable
> + *                               iommu specific features.
> + * @aux_attach/detach_dev: aux-domain specific attach/detach entries.
> + * @aux_get_pasid: get the pasid given an aux-domain
>   * @pgsize_bitmap: bitmap of all possible supported page sizes
>   */
>  struct iommu_ops {
> @@ -226,6 +235,16 @@ struct iommu_ops {
>  	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
>  	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
>  
> +	/* Per device IOMMU features */
> +	bool (*dev_has_feat)(struct device *dev, enum iommu_dev_features f);
> +	int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
> +	int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
> +
> +	/* Aux-domain specific attach/detach entries */
> +	int (*aux_attach_dev)(struct iommu_domain *domain, struct device *dev);
> +	void (*aux_detach_dev)(struct iommu_domain *domain, struct device *dev);
> +	int (*aux_get_pasid)(struct iommu_domain *domain, struct device *dev);
> +
>  	unsigned long pgsize_bitmap;
>  };
>  
> @@ -412,6 +431,13 @@ static inline void dev_iommu_fwspec_set(struct device *dev,
>  int iommu_probe_device(struct device *dev);
>  void iommu_release_device(struct device *dev);
>  
> +bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat);
> +int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat);
> +int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat);
> +int iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev);
> +void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev);
> +int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev);
> +
>  #else /* CONFIG_IOMMU_API */
>  
>  struct iommu_ops {};
> @@ -696,6 +722,41 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
>  	return NULL;
>  }
>  
> +static inline bool
> +iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
> +{
> +	return false;
> +}
> +
> +static inline int
> +iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
> +{
> +	return -ENODEV;
> +}
> +
> +static inline int
> +iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
> +{
> +	return -ENODEV;
> +}
> +
> +static inline int
> +iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev)
> +{
> +	return -ENODEV;
> +}
> +
> +static inline void
> +iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev)
> +{
> +}
> +
> +static inline int
> +iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
> +{
> +	return -ENODEV;
> +}
> +
>  #endif /* CONFIG_IOMMU_API */
>  
>  #ifdef CONFIG_IOMMU_DEBUGFS



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 3/8] iommu/vt-d: Move common code out of iommu_attch_device()
  2019-01-10  3:00 ` [PATCH v5 3/8] iommu/vt-d: Move common code out of iommu_attch_device() Lu Baolu
@ 2019-01-14 11:45   ` Jonathan Cameron
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2019-01-14 11:45 UTC (permalink / raw)
  To: Lu Baolu, Kirti Wankhede
  Cc: Joerg Roedel, David Woodhouse, Alex Williamson, kevin.tian,
	ashok.raj, tiwei.bie, Jean-Philippe Brucker, sanjay.k.kumar,
	iommu, linux-kernel, Zeng, yi.y.sun, jacob.jun.pan, kvm

On Thu, 10 Jan 2019 11:00:22 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> This part of code could be used by both normal and aux
> domain specific attach entries. Hence move them into a
> common function to avoid duplication.
> 
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Another trivial one (it's going to be one of those days).
Typo in the patch title.

> ---
>  drivers/iommu/intel-iommu.c | 60 ++++++++++++++++++++++---------------
>  1 file changed, 36 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index ee8832d26f7e..e9119d45a29d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5058,35 +5058,14 @@ static void intel_iommu_domain_free(struct iommu_domain *domain)
>  	domain_exit(to_dmar_domain(domain));
>  }
>  
> -static int intel_iommu_attach_device(struct iommu_domain *domain,
> -				     struct device *dev)
> +static int prepare_domain_attach_device(struct iommu_domain *domain,
> +					struct device *dev)
>  {
>  	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
>  	struct intel_iommu *iommu;
>  	int addr_width;
>  	u8 bus, devfn;
>  
> -	if (device_is_rmrr_locked(dev)) {
> -		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
> -		return -EPERM;
> -	}
> -
> -	/* normally dev is not mapped */
> -	if (unlikely(domain_context_mapped(dev))) {
> -		struct dmar_domain *old_domain;
> -
> -		old_domain = find_domain(dev);
> -		if (old_domain) {
> -			rcu_read_lock();
> -			dmar_remove_one_dev_info(old_domain, dev);
> -			rcu_read_unlock();
> -
> -			if (!domain_type_is_vm_or_si(old_domain) &&
> -			     list_empty(&old_domain->devices))
> -				domain_exit(old_domain);
> -		}
> -	}
> -
>  	iommu = device_to_iommu(dev, &bus, &devfn);
>  	if (!iommu)
>  		return -ENODEV;
> @@ -5119,7 +5098,40 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>  		dmar_domain->agaw--;
>  	}
>  
> -	return domain_add_dev_info(dmar_domain, dev);
> +	return 0;
> +}
> +
> +static int intel_iommu_attach_device(struct iommu_domain *domain,
> +				     struct device *dev)
> +{
> +	int ret;
> +
> +	if (device_is_rmrr_locked(dev)) {
> +		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
> +		return -EPERM;
> +	}
> +
> +	/* normally dev is not mapped */
> +	if (unlikely(domain_context_mapped(dev))) {
> +		struct dmar_domain *old_domain;
> +
> +		old_domain = find_domain(dev);
> +		if (old_domain) {
> +			rcu_read_lock();
> +			dmar_remove_one_dev_info(old_domain, dev);
> +			rcu_read_unlock();
> +
> +			if (!domain_type_is_vm_or_si(old_domain) &&
> +			    list_empty(&old_domain->devices))
> +				domain_exit(old_domain);
> +		}
> +	}
> +
> +	ret = prepare_domain_attach_device(domain, dev);
> +	if (ret)
> +		return ret;
> +
> +	return domain_add_dev_info(to_dmar_domain(domain), dev);
>  }
>  
>  static void intel_iommu_detach_device(struct iommu_domain *domain,



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 4/8] iommu/vt-d: Aux-domain specific domain attach/detach
  2019-01-10  3:00 ` [PATCH v5 4/8] iommu/vt-d: Aux-domain specific domain attach/detach Lu Baolu
@ 2019-01-14 12:26   ` Jonathan Cameron
  2019-01-15  2:10     ` Lu Baolu
  0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2019-01-14 12:26 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede,
	kevin.tian, ashok.raj, tiwei.bie, Jean-Philippe Brucker,
	sanjay.k.kumar, iommu, linux-kernel, Zeng, yi.y.sun,
	jacob.jun.pan, kvm

On Thu, 10 Jan 2019 11:00:23 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> When multiple domains per device has been enabled by the
> device driver, the device will tag the default PASID for
> the domain to all DMA traffics out of the subset of this
> device; and the IOMMU should translate the DMA requests
> in PASID granularity.
> 
> This adds the intel_iommu_aux_attach/detach_device() ops
> to support managing PASID granular translation structures
> when the device driver has enabled multiple domains per
> device.
> 
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

The following is probably a rather naive review given I don't know
the driver or hardware well at all.  Still, it seems like things
are a lot less balanced than I'd expect and isn't totally obvious
to me why that is.

> ---
>  drivers/iommu/intel-iommu.c | 152 ++++++++++++++++++++++++++++++++++++
>  include/linux/intel-iommu.h |  10 +++
>  2 files changed, 162 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index e9119d45a29d..b8fb6a4bd447 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2482,6 +2482,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>  	info->iommu = iommu;
>  	info->pasid_table = NULL;
>  	info->auxd_enabled = 0;
> +	INIT_LIST_HEAD(&info->auxiliary_domains);
>  
>  	if (dev && dev_is_pci(dev)) {
>  		struct pci_dev *pdev = to_pci_dev(info->dev);
> @@ -5058,6 +5059,131 @@ static void intel_iommu_domain_free(struct iommu_domain *domain)
>  	domain_exit(to_dmar_domain(domain));
>  }
>  
> +/*
> + * Check whether a @domain could be attached to the @dev through the
> + * aux-domain attach/detach APIs.
> + */
> +static inline bool
> +is_aux_domain(struct device *dev, struct iommu_domain *domain)

I'm finding the distinction between an aux domain capability on
a given device and whether one is actually in use to be obscured
slightly in the function naming.

This one for example is actually checking if we have a domain
that is capable of being enabled for aux domain use, but not
yet actually in that mode?

Mind you I'm not sure I have a better answer for the naming.
can_aux_domain_be_enabled?  is_unattached_aux_domain?



> +{
> +	struct device_domain_info *info = dev->archdata.iommu;
> +
> +	return info && info->auxd_enabled &&
> +			domain->type == IOMMU_DOMAIN_UNMANAGED;
> +}
> +
> +static void auxiliary_link_device(struct dmar_domain *domain,
> +				  struct device *dev)
> +{
> +	struct device_domain_info *info = dev->archdata.iommu;
> +
> +	assert_spin_locked(&device_domain_lock);
> +	if (WARN_ON(!info))
> +		return;
> +
> +	domain->auxd_refcnt++;
> +	list_add(&domain->auxd, &info->auxiliary_domains);
> +}
> +
> +static void auxiliary_unlink_device(struct dmar_domain *domain,
> +				    struct device *dev)
> +{
> +	struct device_domain_info *info = dev->archdata.iommu;
> +
> +	assert_spin_locked(&device_domain_lock);
> +	if (WARN_ON(!info))
> +		return;
> +
> +	list_del(&domain->auxd);
> +	domain->auxd_refcnt--;
> +
> +	if (!domain->auxd_refcnt && domain->default_pasid > 0)
> +		intel_pasid_free_id(domain->default_pasid);

This seems unbalanced wrt to what is happening in auxiliary_link_device.
If this is necessary then it would be good to have comments saying why.
To my uniformed eye, looks like we could do this at the end of
aux_domain_remove_dev, except that we need to hold the lock.
As such perhaps it makes sense to do the pasid allocation under that
lock in the first place?

I'm not 100% sure, but is there a race if you get the final
remove running against a new add currently?

> +}
> +
> +static int aux_domain_add_dev(struct dmar_domain *domain,
> +			      struct device *dev)
> +{
> +	int ret;
> +	u8 bus, devfn;
> +	unsigned long flags;
> +	struct intel_iommu *iommu;
> +
> +	iommu = device_to_iommu(dev, &bus, &devfn);
> +	if (!iommu)
> +		return -ENODEV;
> +
> +	if (domain->default_pasid <= 0) {

device_domain_lock isn't held, so we might be in process of removing, see
the pasid as set, just as it becomes unset and hence leave here without
one set?

> +		int pasid;
> +
> +		pasid = intel_pasid_alloc_id(domain, PASID_MIN,
> +					     pci_max_pasids(to_pci_dev(dev)),
> +					     GFP_KERNEL);
> +		if (pasid <= 0) {
> +			pr_err("Can't allocate default pasid\n");
> +			return -ENODEV;
> +		}
> +		domain->default_pasid = pasid;
> +	}
> +
> +	spin_lock_irqsave(&device_domain_lock, flags);
> +	/*
> +	 * iommu->lock must be held to attach domain to iommu and setup the
> +	 * pasid entry for second level translation.
> +	 */
> +	spin_lock(&iommu->lock);
> +	ret = domain_attach_iommu(domain, iommu);
> +	if (ret)
> +		goto attach_failed;
> +
> +	/* Setup the PASID entry for mediated devices: */
> +	ret = intel_pasid_setup_second_level(iommu, domain, dev,
> +					     domain->default_pasid);
> +	if (ret)
> +		goto table_failed;
> +	spin_unlock(&iommu->lock);
> +
> +	auxiliary_link_device(domain, dev);
> +
> +	spin_unlock_irqrestore(&device_domain_lock, flags);
> +
> +	return 0;
> +
> +table_failed:
> +	domain_detach_iommu(domain, iommu);
> +attach_failed:
> +	spin_unlock(&iommu->lock);
> +	spin_unlock_irqrestore(&device_domain_lock, flags);
> +	if (!domain->auxd_refcnt && domain->default_pasid > 0)
> +		intel_pasid_free_id(domain->default_pasid);

It would be odd for this to race against a remove, but in
theory it 'might' I think, potentially giving a double free.

> +
> +	return ret;
> +}
> +
> +static void aux_domain_remove_dev(struct dmar_domain *domain,
> +				  struct device *dev)
> +{
> +	struct device_domain_info *info;
> +	struct intel_iommu *iommu;
> +	unsigned long flags;
> +
> +	if (!is_aux_domain(dev, &domain->domain))
> +		return;
> +
> +	spin_lock_irqsave(&device_domain_lock, flags);
> +	info = dev->archdata.iommu;
> +	iommu = info->iommu;
> +
> +	auxiliary_unlink_device(domain, dev);
> +
> +	spin_lock(&iommu->lock);
> +	intel_pasid_tear_down_entry(iommu, dev, domain->default_pasid);
> +	domain_detach_iommu(domain, iommu);
> +	spin_unlock(&iommu->lock);
> +
> +	spin_unlock_irqrestore(&device_domain_lock, flags);
> +}
> +
>  static int prepare_domain_attach_device(struct iommu_domain *domain,
>  					struct device *dev)
>  {
> @@ -5111,6 +5237,9 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>  		return -EPERM;
>  	}
>  
> +	if (is_aux_domain(dev, domain))
> +		return -EPERM;
> +
>  	/* normally dev is not mapped */
>  	if (unlikely(domain_context_mapped(dev))) {
>  		struct dmar_domain *old_domain;
> @@ -5134,12 +5263,33 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>  	return domain_add_dev_info(to_dmar_domain(domain), dev);
>  }
>  
> +static int intel_iommu_aux_attach_device(struct iommu_domain *domain,
> +					 struct device *dev)
> +{
> +	int ret;
> +
> +	if (!is_aux_domain(dev, domain))
> +		return -EPERM;
> +
> +	ret = prepare_domain_attach_device(domain, dev);
> +	if (ret)
> +		return ret;
> +
> +	return aux_domain_add_dev(to_dmar_domain(domain), dev);
> +}
> +
>  static void intel_iommu_detach_device(struct iommu_domain *domain,
>  				      struct device *dev)
>  {
>  	dmar_remove_one_dev_info(to_dmar_domain(domain), dev);
>  }
>  
> +static void intel_iommu_aux_detach_device(struct iommu_domain *domain,
> +					  struct device *dev)
> +{
> +	aux_domain_remove_dev(to_dmar_domain(domain), dev);
> +}
> +
>  static int intel_iommu_map(struct iommu_domain *domain,
>  			   unsigned long iova, phys_addr_t hpa,
>  			   size_t size, int iommu_prot)
> @@ -5480,6 +5630,8 @@ const struct iommu_ops intel_iommu_ops = {
>  	.domain_free		= intel_iommu_domain_free,
>  	.attach_dev		= intel_iommu_attach_device,
>  	.detach_dev		= intel_iommu_detach_device,
> +	.aux_attach_dev		= intel_iommu_aux_attach_device,
> +	.aux_detach_dev		= intel_iommu_aux_detach_device,
>  	.map			= intel_iommu_map,
>  	.unmap			= intel_iommu_unmap,
>  	.iova_to_phys		= intel_iommu_iova_to_phys,
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 7cf9f7f3724a..b563a61a6c39 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -492,9 +492,11 @@ struct dmar_domain {
>  					/* Domain ids per IOMMU. Use u16 since
>  					 * domain ids are 16 bit wide according
>  					 * to VT-d spec, section 9.3 */
> +	unsigned int	auxd_refcnt;	/* Refcount of auxiliary attaching */
>  
>  	bool has_iotlb_device;
>  	struct list_head devices;	/* all devices' list */
> +	struct list_head auxd;		/* link to device's auxiliary list */
>  	struct iova_domain iovad;	/* iova's that belong to this domain */
>  
>  	struct dma_pte	*pgd;		/* virtual address */
> @@ -513,6 +515,11 @@ struct dmar_domain {
>  					   2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
>  	u64		max_addr;	/* maximum mapped address */
>  
> +	int		default_pasid;	/*
> +					 * The default pasid used for non-SVM
> +					 * traffic on mediated devices.
> +					 */
> +
>  	struct iommu_domain domain;	/* generic domain data structure for
>  					   iommu core */
>  };
> @@ -562,6 +569,9 @@ struct device_domain_info {
>  	struct list_head link;	/* link to domain siblings */
>  	struct list_head global; /* link to global list */
>  	struct list_head table;	/* link to pasid table */
> +	struct list_head auxiliary_domains; /* auxiliary domains
> +					     * attached to this device
> +					     */
>  	u8 bus;			/* PCI bus number */
>  	u8 devfn;		/* PCI devfn number */
>  	u16 pfsid;		/* SRIOV physical function source ID */



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 1/8] iommu: Add APIs for multiple domains per device
  2019-01-14 11:22   ` Jonathan Cameron
@ 2019-01-15  1:33     ` Lu Baolu
  0 siblings, 0 replies; 19+ messages in thread
From: Lu Baolu @ 2019-01-15  1:33 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: baolu.lu, Joerg Roedel, David Woodhouse, Alex Williamson,
	Kirti Wankhede, kevin.tian, ashok.raj, tiwei.bie,
	Jean-Philippe Brucker, sanjay.k.kumar, iommu, linux-kernel, Zeng,
	yi.y.sun, jacob.jun.pan, kvm

Hi,

On 1/14/19 7:22 PM, Jonathan Cameron wrote:
> On Thu, 10 Jan 2019 11:00:20 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
>> Sharing a physical PCI device in a finer-granularity way
>> is becoming a consensus in the industry. IOMMU vendors
>> are also engaging efforts to support such sharing as well
>> as possible. Among the efforts, the capability of support
>> finer-granularity DMA isolation is a common requirement
>> due to the security consideration. With finer-granularity
>> DMA isolation, all DMA requests out of or to a subset of
>> a physical PCI device can be protected by the IOMMU. As a
>> result, there is a request in software to attach multiple
>> domains to a physical PCI device. One example of such use
>> model is the Intel Scalable IOV [1] [2]. The Intel vt-d
>> 3.0 spec [3] introduces the scalable mode which enables
>> PASID granularity DMA isolation.
>>
>> This adds the APIs to support multiple domains per device.
>> In order to ease the discussions, we call it 'a domain in
>> auxiliary mode' or simply 'auxiliary domain' when multiple
>> domains are attached to a physical device.
>>
>> The APIs include:
>>
>> * iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
>>    - Check whether both IOMMU and device support IOMMU aux
>>      domain feature. Below aux-domain specific interfaces
>>      are available only after this returns true.
>>
>> * iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
>>    - Enable/disable device specific aux-domain feature.
>>
>> * iommu_aux_attach_device(domain, dev)
>>    - Attaches @domain to @dev in the auxiliary mode. Multiple
>>      domains could be attached to a single device in the
>>      auxiliary mode with each domain representing an isolated
>>      address space for an assignable subset of the device.
>>
>> * iommu_aux_detach_device(domain, dev)
>>    - Detach @domain which has been attached to @dev in the
>>      auxiliary mode.
>>
>> * iommu_aux_get_pasid(domain, dev)
>>    - Return ID used for finer-granularity DMA translation.
>>      For the Intel Scalable IOV usage model, this will be
>>      a PASID. The device which supports Scalable IOV needs
>>      to write this ID to the device register so that DMA
>>      requests could be tagged with a right PASID prefix.
>>
>> This has been updated with the latest proposal from Joerg
>> posted here [5].
>>
>> Many people involved in discussions of this design.
>>
>> Kevin Tian <kevin.tian@intel.com>
>> Liu Yi L <yi.l.liu@intel.com>
>> Ashok Raj <ashok.raj@intel.com>
>> Sanjay Kumar <sanjay.k.kumar@intel.com>
>> Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Alex Williamson <alex.williamson@redhat.com>
>> Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> Joerg Roedel <joro@8bytes.org>
>>
>> and some discussions can be found here [4] [5].
>>
>> [1] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
>> [2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
>> [3] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
>> [4] https://lkml.org/lkml/2018/7/26/4
>> [5] https://www.spinics.net/lists/iommu/msg31874.html
>>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Liu Yi L <yi.l.liu@intel.com>
>> Suggested-by: Kevin Tian <kevin.tian@intel.com>
>> Suggested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> Suggested-by: Joerg Roedel <jroedel@suse.de>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> 
> One trivial comment inline.

Thank you!

> 
>> ---
>>   drivers/iommu/iommu.c | 80 +++++++++++++++++++++++++++++++++++++++++++
>>   include/linux/iommu.h | 61 +++++++++++++++++++++++++++++++++
>>   2 files changed, 141 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 3ed4db334341..9166b6145409 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -2033,3 +2033,83 @@ int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
>>   	return 0;
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
>> +
>> +/*
>> + * Per device IOMMU features.
>> + */
>> +bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	const struct iommu_ops *ops = dev->bus->iommu_ops;
>> +
>> +	if (ops && ops->dev_has_feat)
>> +		return ops->dev_has_feat(dev, feat);
>> +
>> +	return false;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_dev_has_feature);
>> +
>> +int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	const struct iommu_ops *ops = dev->bus->iommu_ops;
>> +
>> +	if (ops && ops->dev_enable_feat)
>> +		return ops->dev_enable_feat(dev, feat);
>> +
>> +	return -ENODEV;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_dev_enable_feature);
>> +
>> +int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	const struct iommu_ops *ops = dev->bus->iommu_ops;
>> +
>> +	if (ops && ops->dev_disable_feat)
>> +		return ops->dev_disable_feat(dev, feat);
>> +
>> +	return -ENODEV;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
>> +
>> +/*
>> + * Aux-domain specific attach/detach.
>> + *
>> + * Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX) returns true.
>> + * Also, as long as domains are attached to a device through this interface,
>> + * any tries to call iommu_attach_device() should fail (iommu_detach_device()
>> + * can't fail, so we fail on the tryint to re-attach). This should make us safe
> 
> when trying to re-attach. (perhaps?)

Yes. I will fix it.

Best regards,
Lu Baolu

> 
>> + * against a device being attached to a guest as a whole while there are still
>> + * pasid users on it (aux and sva).
>> + */
>> +int iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev)
>> +{
>> +	int ret = -ENODEV;
>> +
>> +	if (domain->ops->aux_attach_dev)
>> +		ret = domain->ops->aux_attach_dev(domain, dev);
>> +
>> +	if (!ret)
>> +		trace_attach_device_to_domain(dev);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_aux_attach_device);
>> +
>> +void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev)
>> +{
>> +	if (domain->ops->aux_detach_dev) {
>> +		domain->ops->aux_detach_dev(domain, dev);
>> +		trace_detach_device_from_domain(dev);
>> +	}
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_aux_detach_device);
>> +
>> +int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
>> +{
>> +	int ret = -ENODEV;
>> +
>> +	if (domain->ops->aux_get_pasid)
>> +		ret = domain->ops->aux_get_pasid(domain, dev);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index e90da6b6f3d1..f4c3d2a2cc87 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -156,6 +156,11 @@ struct iommu_resv_region {
>>   	enum iommu_resv_type	type;
>>   };
>>   
>> +/* Per device IOMMU features */
>> +enum iommu_dev_features {
>> +	IOMMU_DEV_FEAT_AUX,	/* Aux-domain feature */
>> +};
>> +
>>   #ifdef CONFIG_IOMMU_API
>>   
>>   /**
>> @@ -183,6 +188,10 @@ struct iommu_resv_region {
>>    * @domain_window_enable: Configure and enable a particular window for a domain
>>    * @domain_window_disable: Disable a particular window for a domain
>>    * @of_xlate: add OF master IDs to iommu grouping
>> + * @dev_has/enable/disable_feat: per device entries to check/enable/disable
>> + *                               iommu specific features.
>> + * @aux_attach/detach_dev: aux-domain specific attach/detach entries.
>> + * @aux_get_pasid: get the pasid given an aux-domain
>>    * @pgsize_bitmap: bitmap of all possible supported page sizes
>>    */
>>   struct iommu_ops {
>> @@ -226,6 +235,16 @@ struct iommu_ops {
>>   	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
>>   	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
>>   
>> +	/* Per device IOMMU features */
>> +	bool (*dev_has_feat)(struct device *dev, enum iommu_dev_features f);
>> +	int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
>> +	int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
>> +
>> +	/* Aux-domain specific attach/detach entries */
>> +	int (*aux_attach_dev)(struct iommu_domain *domain, struct device *dev);
>> +	void (*aux_detach_dev)(struct iommu_domain *domain, struct device *dev);
>> +	int (*aux_get_pasid)(struct iommu_domain *domain, struct device *dev);
>> +
>>   	unsigned long pgsize_bitmap;
>>   };
>>   
>> @@ -412,6 +431,13 @@ static inline void dev_iommu_fwspec_set(struct device *dev,
>>   int iommu_probe_device(struct device *dev);
>>   void iommu_release_device(struct device *dev);
>>   
>> +bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat);
>> +int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat);
>> +int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat);
>> +int iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev);
>> +void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev);
>> +int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev);
>> +
>>   #else /* CONFIG_IOMMU_API */
>>   
>>   struct iommu_ops {};
>> @@ -696,6 +722,41 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
>>   	return NULL;
>>   }
>>   
>> +static inline bool
>> +iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	return false;
>> +}
>> +
>> +static inline int
>> +iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	return -ENODEV;
>> +}
>> +
>> +static inline int
>> +iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	return -ENODEV;
>> +}
>> +
>> +static inline int
>> +iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev)
>> +{
>> +	return -ENODEV;
>> +}
>> +
>> +static inline void
>> +iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev)
>> +{
>> +}
>> +
>> +static inline int
>> +iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
>> +{
>> +	return -ENODEV;
>> +}
>> +
>>   #endif /* CONFIG_IOMMU_API */
>>   
>>   #ifdef CONFIG_IOMMU_DEBUGFS
> 
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 4/8] iommu/vt-d: Aux-domain specific domain attach/detach
  2019-01-14 12:26   ` Jonathan Cameron
@ 2019-01-15  2:10     ` Lu Baolu
  2019-01-15 13:31       ` Jonathan Cameron
  0 siblings, 1 reply; 19+ messages in thread
From: Lu Baolu @ 2019-01-15  2:10 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: baolu.lu, Joerg Roedel, David Woodhouse, Alex Williamson,
	Kirti Wankhede, kevin.tian, ashok.raj, tiwei.bie,
	Jean-Philippe Brucker, sanjay.k.kumar, iommu, linux-kernel, Zeng,
	yi.y.sun, jacob.jun.pan, kvm

Hi,

On 1/14/19 8:26 PM, Jonathan Cameron wrote:
> On Thu, 10 Jan 2019 11:00:23 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
>> When multiple domains per device has been enabled by the
>> device driver, the device will tag the default PASID for
>> the domain to all DMA traffics out of the subset of this
>> device; and the IOMMU should translate the DMA requests
>> in PASID granularity.
>>
>> This adds the intel_iommu_aux_attach/detach_device() ops
>> to support managing PASID granular translation structures
>> when the device driver has enabled multiple domains per
>> device.
>>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
>> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> 
> The following is probably a rather naive review given I don't know
> the driver or hardware well at all.  Still, it seems like things
> are a lot less balanced than I'd expect and isn't totally obvious
> to me why that is.

Thank you!

> 
>> ---
>>   drivers/iommu/intel-iommu.c | 152 ++++++++++++++++++++++++++++++++++++
>>   include/linux/intel-iommu.h |  10 +++
>>   2 files changed, 162 insertions(+)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index e9119d45a29d..b8fb6a4bd447 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -2482,6 +2482,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>>   	info->iommu = iommu;
>>   	info->pasid_table = NULL;
>>   	info->auxd_enabled = 0;
>> +	INIT_LIST_HEAD(&info->auxiliary_domains);
>>   
>>   	if (dev && dev_is_pci(dev)) {
>>   		struct pci_dev *pdev = to_pci_dev(info->dev);
>> @@ -5058,6 +5059,131 @@ static void intel_iommu_domain_free(struct iommu_domain *domain)
>>   	domain_exit(to_dmar_domain(domain));
>>   }
>>   
>> +/*
>> + * Check whether a @domain could be attached to the @dev through the
>> + * aux-domain attach/detach APIs.
>> + */
>> +static inline bool
>> +is_aux_domain(struct device *dev, struct iommu_domain *domain)
> 
> I'm finding the distinction between an aux domain capability on
> a given device and whether one is actually in use to be obscured
> slightly in the function naming.
> 
> This one for example is actually checking if we have a domain
> that is capable of being enabled for aux domain use, but not
> yet actually in that mode?
> 
> Mind you I'm not sure I have a better answer for the naming.
> can_aux_domain_be_enabled?  is_unattached_aux_domain?
> 
> 

device aux mode vs. normal mode
===============================

When we talk about the auxiliary mode (simply aux-mode), it means "the
device works in aux-mode or normal mode". "normal mode" means that the
device (and it's corresponding IOMMU) supports only RID (PCI Request ID)
based DMA translation; while, aux-mode means the the device (and it's
IOMMU) supports fine-grained DMA translation, like PASID based DMA
translation with Intel VT-d scalable mode.

We are adding below APIs to switch a device between these two modes:

int iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)

And this API (still under discussion) to check which mode the device is
working in:

bool iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)

aux-domain
==========

If a device is working in aux-mode and we are going to attach a domain
to this device, we say "this domain will be attached to the device in
aux mode", and simply "aux domain". So a domain is "normal" when it is
going to attach to a device in normal mode; and is "aux-domain" when it
is going to attach to a device in aux mode.

> 
>> +{
>> +	struct device_domain_info *info = dev->archdata.iommu;
>> +
>> +	return info && info->auxd_enabled &&
>> +			domain->type == IOMMU_DOMAIN_UNMANAGED;
>> +}
>> +
>> +static void auxiliary_link_device(struct dmar_domain *domain,
>> +				  struct device *dev)
>> +{
>> +	struct device_domain_info *info = dev->archdata.iommu;
>> +
>> +	assert_spin_locked(&device_domain_lock);
>> +	if (WARN_ON(!info))
>> +		return;
>> +
>> +	domain->auxd_refcnt++;
>> +	list_add(&domain->auxd, &info->auxiliary_domains);
>> +}
>> +
>> +static void auxiliary_unlink_device(struct dmar_domain *domain,
>> +				    struct device *dev)
>> +{
>> +	struct device_domain_info *info = dev->archdata.iommu;
>> +
>> +	assert_spin_locked(&device_domain_lock);
>> +	if (WARN_ON(!info))
>> +		return;
>> +
>> +	list_del(&domain->auxd);
>> +	domain->auxd_refcnt--;
>> +
>> +	if (!domain->auxd_refcnt && domain->default_pasid > 0)
>> +		intel_pasid_free_id(domain->default_pasid);
> 
> This seems unbalanced wrt to what is happening in auxiliary_link_device.
> If this is necessary then it would be good to have comments saying why.
> To my uniformed eye, looks like we could do this at the end of
> aux_domain_remove_dev, except that we need to hold the lock.
> As such perhaps it makes sense to do the pasid allocation under that
> lock in the first place?
> 
> I'm not 100% sure, but is there a race if you get the final
> remove running against a new add currently?

Yes. This is unbalanced.

I will move pasid free out here and it should be done with the lock
held to avoid race.

> 
>> +}
>> +
>> +static int aux_domain_add_dev(struct dmar_domain *domain,
>> +			      struct device *dev)
>> +{
>> +	int ret;
>> +	u8 bus, devfn;
>> +	unsigned long flags;
>> +	struct intel_iommu *iommu;
>> +
>> +	iommu = device_to_iommu(dev, &bus, &devfn);
>> +	if (!iommu)
>> +		return -ENODEV;
>> +
>> +	if (domain->default_pasid <= 0) {
> 
> device_domain_lock isn't held, so we might be in process of removing, see
> the pasid as set, just as it becomes unset and hence leave here without
> one set?

Good catch, the lock should be held.

> 
>> +		int pasid;
>> +
>> +		pasid = intel_pasid_alloc_id(domain, PASID_MIN,
>> +					     pci_max_pasids(to_pci_dev(dev)),
>> +					     GFP_KERNEL);
>> +		if (pasid <= 0) {
>> +			pr_err("Can't allocate default pasid\n");
>> +			return -ENODEV;
>> +		}
>> +		domain->default_pasid = pasid;
>> +	}
>> +
>> +	spin_lock_irqsave(&device_domain_lock, flags);
>> +	/*
>> +	 * iommu->lock must be held to attach domain to iommu and setup the
>> +	 * pasid entry for second level translation.
>> +	 */
>> +	spin_lock(&iommu->lock);
>> +	ret = domain_attach_iommu(domain, iommu);
>> +	if (ret)
>> +		goto attach_failed;
>> +
>> +	/* Setup the PASID entry for mediated devices: */
>> +	ret = intel_pasid_setup_second_level(iommu, domain, dev,
>> +					     domain->default_pasid);
>> +	if (ret)
>> +		goto table_failed;
>> +	spin_unlock(&iommu->lock);
>> +
>> +	auxiliary_link_device(domain, dev);
>> +
>> +	spin_unlock_irqrestore(&device_domain_lock, flags);
>> +
>> +	return 0;
>> +
>> +table_failed:
>> +	domain_detach_iommu(domain, iommu);
>> +attach_failed:
>> +	spin_unlock(&iommu->lock);
>> +	spin_unlock_irqrestore(&device_domain_lock, flags);
>> +	if (!domain->auxd_refcnt && domain->default_pasid > 0)
>> +		intel_pasid_free_id(domain->default_pasid);
> 
> It would be odd for this to race against a remove, but in
> theory it 'might' I think, potentially giving a double free.

Yes. Should be done with the lock held.

Best regards,
Lu Baolu

> 
>> +
>> +	return ret;
>> +}
>> +
>> +static void aux_domain_remove_dev(struct dmar_domain *domain,
>> +				  struct device *dev)
>> +{
>> +	struct device_domain_info *info;
>> +	struct intel_iommu *iommu;
>> +	unsigned long flags;
>> +
>> +	if (!is_aux_domain(dev, &domain->domain))
>> +		return;
>> +
>> +	spin_lock_irqsave(&device_domain_lock, flags);
>> +	info = dev->archdata.iommu;
>> +	iommu = info->iommu;
>> +
>> +	auxiliary_unlink_device(domain, dev);
>> +
>> +	spin_lock(&iommu->lock);
>> +	intel_pasid_tear_down_entry(iommu, dev, domain->default_pasid);
>> +	domain_detach_iommu(domain, iommu);
>> +	spin_unlock(&iommu->lock);
>> +
>> +	spin_unlock_irqrestore(&device_domain_lock, flags);
>> +}
>> +
>>   static int prepare_domain_attach_device(struct iommu_domain *domain,
>>   					struct device *dev)
>>   {
>> @@ -5111,6 +5237,9 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>>   		return -EPERM;
>>   	}
>>   
>> +	if (is_aux_domain(dev, domain))
>> +		return -EPERM;
>> +
>>   	/* normally dev is not mapped */
>>   	if (unlikely(domain_context_mapped(dev))) {
>>   		struct dmar_domain *old_domain;
>> @@ -5134,12 +5263,33 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>>   	return domain_add_dev_info(to_dmar_domain(domain), dev);
>>   }
>>   
>> +static int intel_iommu_aux_attach_device(struct iommu_domain *domain,
>> +					 struct device *dev)
>> +{
>> +	int ret;
>> +
>> +	if (!is_aux_domain(dev, domain))
>> +		return -EPERM;
>> +
>> +	ret = prepare_domain_attach_device(domain, dev);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return aux_domain_add_dev(to_dmar_domain(domain), dev);
>> +}
>> +
>>   static void intel_iommu_detach_device(struct iommu_domain *domain,
>>   				      struct device *dev)
>>   {
>>   	dmar_remove_one_dev_info(to_dmar_domain(domain), dev);
>>   }
>>   
>> +static void intel_iommu_aux_detach_device(struct iommu_domain *domain,
>> +					  struct device *dev)
>> +{
>> +	aux_domain_remove_dev(to_dmar_domain(domain), dev);
>> +}
>> +
>>   static int intel_iommu_map(struct iommu_domain *domain,
>>   			   unsigned long iova, phys_addr_t hpa,
>>   			   size_t size, int iommu_prot)
>> @@ -5480,6 +5630,8 @@ const struct iommu_ops intel_iommu_ops = {
>>   	.domain_free		= intel_iommu_domain_free,
>>   	.attach_dev		= intel_iommu_attach_device,
>>   	.detach_dev		= intel_iommu_detach_device,
>> +	.aux_attach_dev		= intel_iommu_aux_attach_device,
>> +	.aux_detach_dev		= intel_iommu_aux_detach_device,
>>   	.map			= intel_iommu_map,
>>   	.unmap			= intel_iommu_unmap,
>>   	.iova_to_phys		= intel_iommu_iova_to_phys,
>> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
>> index 7cf9f7f3724a..b563a61a6c39 100644
>> --- a/include/linux/intel-iommu.h
>> +++ b/include/linux/intel-iommu.h
>> @@ -492,9 +492,11 @@ struct dmar_domain {
>>   					/* Domain ids per IOMMU. Use u16 since
>>   					 * domain ids are 16 bit wide according
>>   					 * to VT-d spec, section 9.3 */
>> +	unsigned int	auxd_refcnt;	/* Refcount of auxiliary attaching */
>>   
>>   	bool has_iotlb_device;
>>   	struct list_head devices;	/* all devices' list */
>> +	struct list_head auxd;		/* link to device's auxiliary list */
>>   	struct iova_domain iovad;	/* iova's that belong to this domain */
>>   
>>   	struct dma_pte	*pgd;		/* virtual address */
>> @@ -513,6 +515,11 @@ struct dmar_domain {
>>   					   2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
>>   	u64		max_addr;	/* maximum mapped address */
>>   
>> +	int		default_pasid;	/*
>> +					 * The default pasid used for non-SVM
>> +					 * traffic on mediated devices.
>> +					 */
>> +
>>   	struct iommu_domain domain;	/* generic domain data structure for
>>   					   iommu core */
>>   };
>> @@ -562,6 +569,9 @@ struct device_domain_info {
>>   	struct list_head link;	/* link to domain siblings */
>>   	struct list_head global; /* link to global list */
>>   	struct list_head table;	/* link to pasid table */
>> +	struct list_head auxiliary_domains; /* auxiliary domains
>> +					     * attached to this device
>> +					     */
>>   	u8 bus;			/* PCI bus number */
>>   	u8 devfn;		/* PCI devfn number */
>>   	u16 pfsid;		/* SRIOV physical function source ID */
> 
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 4/8] iommu/vt-d: Aux-domain specific domain attach/detach
  2019-01-15  2:10     ` Lu Baolu
@ 2019-01-15 13:31       ` Jonathan Cameron
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2019-01-15 13:31 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede,
	kevin.tian, ashok.raj, tiwei.bie, Jean-Philippe Brucker,
	sanjay.k.kumar, iommu, linux-kernel, Zeng, yi.y.sun,
	jacob.jun.pan, kvm

On Tue, 15 Jan 2019 10:10:21 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> Hi,
> 
> On 1/14/19 8:26 PM, Jonathan Cameron wrote:
> > On Thu, 10 Jan 2019 11:00:23 +0800
> > Lu Baolu <baolu.lu@linux.intel.com> wrote:
> >   
> >> When multiple domains per device has been enabled by the
> >> device driver, the device will tag the default PASID for
> >> the domain to all DMA traffics out of the subset of this
> >> device; and the IOMMU should translate the DMA requests
> >> in PASID granularity.
> >>
> >> This adds the intel_iommu_aux_attach/detach_device() ops
> >> to support managing PASID granular translation structures
> >> when the device driver has enabled multiple domains per
> >> device.
> >>
> >> Cc: Ashok Raj <ashok.raj@intel.com>
> >> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >> Cc: Kevin Tian <kevin.tian@intel.com>
> >> Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
> >> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> >> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>  
> > 
> > The following is probably a rather naive review given I don't know
> > the driver or hardware well at all.  Still, it seems like things
> > are a lot less balanced than I'd expect and isn't totally obvious
> > to me why that is.  
> 
> Thank you!

You are welcome.

...

> >> +/*
> >> + * Check whether a @domain could be attached to the @dev through the
> >> + * aux-domain attach/detach APIs.
> >> + */
> >> +static inline bool
> >> +is_aux_domain(struct device *dev, struct iommu_domain *domain)  
> > 
> > I'm finding the distinction between an aux domain capability on
> > a given device and whether one is actually in use to be obscured
> > slightly in the function naming.
> > 
> > This one for example is actually checking if we have a domain
> > that is capable of being enabled for aux domain use, but not
> > yet actually in that mode?
> > 
> > Mind you I'm not sure I have a better answer for the naming.
> > can_aux_domain_be_enabled?  is_unattached_aux_domain?
> > 
> >   
> 
> device aux mode vs. normal mode
> ===============================
> 
> When we talk about the auxiliary mode (simply aux-mode), it means "the
> device works in aux-mode or normal mode". "normal mode" means that the
> device (and it's corresponding IOMMU) supports only RID (PCI Request ID)
> based DMA translation; while, aux-mode means the the device (and it's
> IOMMU) supports fine-grained DMA translation, like PASID based DMA
> translation with Intel VT-d scalable mode.
> 
> We are adding below APIs to switch a device between these two modes:
> 
> int iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
> 
> And this API (still under discussion) to check which mode the device is
> working in:
> 
> bool iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
> 
> aux-domain
> ==========
> 
> If a device is working in aux-mode and we are going to attach a domain
> to this device, we say "this domain will be attached to the device in
> aux mode", and simply "aux domain". So a domain is "normal" when it is
> going to attach to a device in normal mode; and is "aux-domain" when it
> is going to attach to a device in aux mode.

Hmm.. OK I guess.  It still feels like there is more need to refer to
the docs than there should be.  Still, your code and I may well never
read it again so I don't mind :)

> 
> >   
> >> +{


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries
  2019-01-11 11:16   ` Joerg Roedel
  2019-01-14  5:30     ` Lu Baolu
@ 2019-01-24  6:47     ` Lu Baolu
  2019-01-24 13:20       ` Joerg Roedel
  1 sibling, 1 reply; 19+ messages in thread
From: Lu Baolu @ 2019-01-24  6:47 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: baolu.lu, David Woodhouse, Alex Williamson, Kirti Wankhede,
	ashok.raj, sanjay.k.kumar, jacob.jun.pan, kevin.tian,
	Jean-Philippe Brucker, yi.l.liu, yi.y.sun, peterx, tiwei.bie,
	Zeng, Xin, iommu, kvm, linux-kernel, Jacob Pan

Hi Joerg,

On 1/11/19 7:16 PM, Joerg Roedel wrote:
>> +
>> +static bool
>> +intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	struct device_domain_info *info = dev->archdata.iommu;
>> +
>> +	if (feat == IOMMU_DEV_FEAT_AUX)
>> +		return scalable_mode_support() && info && info->auxd_enabled;
>> +
>> +	return false;
>> +}
> Why is this checking the auxd_enabled flag? The function should just
> return whether the device_supports_  scalable mode, not whether it is
> enabled.

Yes, as the API name implies, it should return the device capability
instead of enable/disable status. I misused this API in the IOMMU
driver.

Since we already have iommu_dev_enable/disable_feature() to enable and
disable an iommu specific feature, is it possible to add another API to
query whether a specific feature has been enabled?

How about

bool iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)?

This is necessary for the third party drivers (like vfio) to determine
which domain attach interface it should use:

if (iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX))
         iommmu_aux_attach_device(domain, dev)
else
         iommu_attach_device(domain, dev)


Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries
  2019-01-24  6:47     ` Lu Baolu
@ 2019-01-24 13:20       ` Joerg Roedel
  0 siblings, 0 replies; 19+ messages in thread
From: Joerg Roedel @ 2019-01-24 13:20 UTC (permalink / raw)
  To: Lu Baolu
  Cc: David Woodhouse, Alex Williamson, Kirti Wankhede, ashok.raj,
	sanjay.k.kumar, jacob.jun.pan, kevin.tian, Jean-Philippe Brucker,
	yi.l.liu, yi.y.sun, peterx, tiwei.bie, Zeng, Xin, iommu, kvm,
	linux-kernel, Jacob Pan

Hi Lu Baolu,

On Thu, Jan 24, 2019 at 02:47:39PM +0800, Lu Baolu wrote:
> bool iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)?

Looks good. Having a function to check for enabled features is certainly
a good thing.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-01-24 13:20 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-10  3:00 [PATCH v5 0/8] vfio/mdev: IOMMU aware mediated device Lu Baolu
2019-01-10  3:00 ` [PATCH v5 1/8] iommu: Add APIs for multiple domains per device Lu Baolu
2019-01-14 11:22   ` Jonathan Cameron
2019-01-15  1:33     ` Lu Baolu
2019-01-10  3:00 ` [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries Lu Baolu
2019-01-11 11:16   ` Joerg Roedel
2019-01-14  5:30     ` Lu Baolu
2019-01-24  6:47     ` Lu Baolu
2019-01-24 13:20       ` Joerg Roedel
2019-01-10  3:00 ` [PATCH v5 3/8] iommu/vt-d: Move common code out of iommu_attch_device() Lu Baolu
2019-01-14 11:45   ` Jonathan Cameron
2019-01-10  3:00 ` [PATCH v5 4/8] iommu/vt-d: Aux-domain specific domain attach/detach Lu Baolu
2019-01-14 12:26   ` Jonathan Cameron
2019-01-15  2:10     ` Lu Baolu
2019-01-15 13:31       ` Jonathan Cameron
2019-01-10  3:00 ` [PATCH v5 5/8] iommu/vt-d: Return ID associated with an auxiliary domain Lu Baolu
2019-01-10  3:00 ` [PATCH v5 6/8] vfio/mdev: Add iommu related member in mdev_device Lu Baolu
2019-01-10  3:00 ` [PATCH v5 7/8] vfio/type1: Add domain at(de)taching group helpers Lu Baolu
2019-01-10  3:00 ` [PATCH v5 8/8] vfio/type1: Handle different mdev isolation type Lu Baolu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).