linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/8] Enable PASID for DMA API users
@ 2022-03-15  5:07 Jacob Pan
  2022-03-15  5:07 ` [PATCH v2 1/8] iommu: Assign per device max PASID Jacob Pan
                   ` (9 more replies)
  0 siblings, 10 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

Some modern accelerators such as Intel's Data Streaming Accelerator (DSA)
require PASID in DMA requests to be operational. Specifically, the work
submissions with ENQCMD on shared work queues require PASIDs. The use cases
include both user DMA with shared virtual addressing (SVA) and in-kernel
DMA similar to legacy DMA w/o PASID. Here we address the latter.

DMA mapping API is the de facto standard for in-kernel DMA. However, it
operates on a per device or Requester ID(RID) basis which is not
PASID-aware. To leverage DMA API for devices relies on PASIDs, this
patchset introduces the following APIs

1. A driver facing API that enables DMA API PASID usage:
iommu_enable_pasid_dma(struct device *dev, ioasid_t &pasid);

2. An IOMMU op that allows attaching device-domain-PASID generically (will
be used beyond DMA API PASID support)

Once PASID DMA is enabled and attached to the appropriate IOMMU domain,
device drivers can continue to use DMA APIs as-is. There is no difference
in terms of mapping in dma_handle between without PASID and with PASID.
The DMA mapping performed by IOMMU will be identical for both requests, let
it be IOVA or PA in case of pass-through.

In addition, this set converts DSA driver in-kernel DMA with PASID from SVA
lib to DMA API. There have been security and functional issues with the
kernel SVA approach:
(https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/)
The highlights are as the following:
 - The lack of IOTLB synchronization upon kernel page table updates.
   (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.)
 - Other than slight more protection, using kernel virtual address (KVA)
has little advantage over physical address. There are also no use cases yet
where DMA engines need kernel virtual addresses for in-kernel DMA.

Subsequently, cleanup is done around the usage of sva_bind_device() for
in-kernel DMA. Removing special casing code in VT-d driver and tightening
SVA lib API.

This work and idea behind it is a collaboration with many people, many
thanks to Baolu Lu, Jason Gunthorpe, Dave Jiang, and others.


ChangeLog:
v2
	- Do not reserve a special PASID for DMA API usage. Use IOASID
	  allocation instead.
	- Introduced a generic device-pasid-domain attachment IOMMU op.
	  Replaced the DMA API only IOMMU op.
	- Removed supervisor SVA support in VT-d
	- Removed unused sva_bind_device parameters
	- Use IOMMU specific data instead of struct device to store PASID
	  info

Jacob Pan (6):
  iommu/vt-d: Implement device_pasid domain attach ops
  iommu/vt-d: Use device_pasid attach op for RID2PASID
  iommu: Add PASID support for DMA mapping API users
  dmaengine: idxd: Use DMA API for in-kernel DMA with PASID
  iommu/vt-d: Delete supervisor/kernel SVA
  iommu: Remove unused driver data in sva_bind_device

Lu Baolu (2):
  iommu: Assign per device max PASID
  iommu: Add attach/detach_dev_pasid domain ops

 drivers/dma/idxd/cdev.c                       |   2 +-
 drivers/dma/idxd/idxd.h                       |   1 -
 drivers/dma/idxd/init.c                       |  34 +--
 drivers/dma/idxd/sysfs.c                      |   7 -
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |   2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   5 +-
 drivers/iommu/dma-iommu.c                     |  65 ++++++
 drivers/iommu/intel/iommu.c                   | 214 ++++++++++++++++--
 drivers/iommu/intel/svm.c                     |  51 +----
 drivers/iommu/iommu.c                         |   4 +-
 drivers/misc/uacce/uacce.c                    |   2 +-
 include/linux/dma-iommu.h                     |   7 +
 include/linux/intel-iommu.h                   |  15 +-
 include/linux/iommu.h                         |  37 ++-
 14 files changed, 338 insertions(+), 108 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v2 1/8] iommu: Assign per device max PASID
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-15  5:07 ` [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops Jacob Pan
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

From: Lu Baolu <baolu.lu@linux.intel.com>

PCIe spec defines Max PASID Width per-device.  Since a PASID is only
used with IOMMU enabled, this patch introduces a PASID max variable on
the per-device IOMMU data. It will be used for limiting PASID allocation
in that PASID table is per-device.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/iommu.c |  4 +++-
 include/linux/iommu.h       | 13 +++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 50666d250b36..881f8361eca2 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2602,8 +2602,10 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 		if (sm_supported(iommu)) {
 			if (pasid_supported(iommu)) {
 				int features = pci_pasid_features(pdev);
-				if (features >= 0)
+				if (features >= 0) {
 					info->pasid_supported = features | 1;
+					iommu_set_dev_pasid_max(&pdev->dev, pci_max_pasids(pdev));
+				}
 			}
 
 			if (info->ats_supported && ecap_prs(iommu->ecap) &&
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index de0c57a567c8..369f05c2a4e2 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -364,6 +364,7 @@ struct iommu_fault_param {
  * @fwspec:	 IOMMU fwspec data
  * @iommu_dev:	 IOMMU device this device is linked to
  * @priv:	 IOMMU Driver private data
+ * @pasid_max	 Max PASID value supported by this device
  *
  * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
  *	struct iommu_group	*iommu_group;
@@ -375,8 +376,20 @@ struct dev_iommu {
 	struct iommu_fwspec		*fwspec;
 	struct iommu_device		*iommu_dev;
 	void				*priv;
+	unsigned int			pasid_max;
 };
 
+static inline void iommu_set_dev_pasid_max(struct device *dev,
+					    unsigned int max)
+{
+	struct dev_iommu *param = dev->iommu;
+
+	if (WARN_ON(!param))
+		return;
+
+	param->pasid_max = max;
+}
+
 int iommu_device_register(struct iommu_device *iommu,
 			  const struct iommu_ops *ops,
 			  struct device *hwdev);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
  2022-03-15  5:07 ` [PATCH v2 1/8] iommu: Assign per device max PASID Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-15 10:24   ` Tian, Kevin
                     ` (2 more replies)
  2022-03-15  5:07 ` [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops Jacob Pan
                   ` (7 subsequent siblings)
  9 siblings, 3 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

From: Lu Baolu <baolu.lu@linux.intel.com>

An IOMMU domain represents an address space which can be attached by
devices that perform DMA within a domain. However, for platforms with
PASID capability the domain attachment needs be handled at device+PASID
level. There can be multiple PASIDs within a device and multiple devices
attached to a given domain.
This patch introduces a new IOMMU op which support device, PASID, and
IOMMU domain attachment. The immediate use case is for PASID capable
devices to perform DMA under DMA APIs.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 include/linux/iommu.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 369f05c2a4e2..fde5b933dbe3 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
  * @aux_get_pasid: get the pasid given an aux-domain
  * @sva_bind: Bind process address space to device
  * @sva_unbind: Unbind process address space from device
+ * @attach_dev_pasid: attach an iommu domain to a pasid of device
+ * @detach_dev_pasid: detach an iommu domain from a pasid of device
  * @sva_get_pasid: Get PASID associated to a SVA handle
  * @page_response: handle page request response
  * @cache_invalidate: invalidate translation caches
@@ -296,6 +298,10 @@ struct iommu_ops {
 	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
 				      void *drvdata);
 	void (*sva_unbind)(struct iommu_sva *handle);
+	int (*attach_dev_pasid)(struct iommu_domain *domain,
+				struct device *dev, ioasid_t id);
+	void (*detach_dev_pasid)(struct iommu_domain *domain,
+				 struct device *dev, ioasid_t id);
 	u32 (*sva_get_pasid)(struct iommu_sva *handle);
 
 	int (*page_response)(struct device *dev,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
  2022-03-15  5:07 ` [PATCH v2 1/8] iommu: Assign per device max PASID Jacob Pan
  2022-03-15  5:07 ` [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-15 10:33   ` Tian, Kevin
                     ` (2 more replies)
  2022-03-15  5:07 ` [PATCH v2 4/8] iommu/vt-d: Use device_pasid attach op for RID2PASID Jacob Pan
                   ` (6 subsequent siblings)
  9 siblings, 3 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

On VT-d platforms with scalable mode enabled, devices issue DMA requests
with PASID need to attach to the correct IOMMU domains.
The attach operation involves the following:
- programming the PASID into device's PASID table
- tracking device domain and the PASID relationship
- managing IOTLB and device TLB invalidations

This patch extends DMAR domain and device domain info with xarrays to
track per domain and per device PASIDs.  It provides the flexibility to
be used beyond DMA API PASID support.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 194 +++++++++++++++++++++++++++++++++++-
 include/linux/intel-iommu.h |  12 ++-
 2 files changed, 202 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 881f8361eca2..9267194eaed3 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1622,20 +1622,48 @@ static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
 			   qdep, addr, mask);
 }
 
+static void __iommu_flush_dev_piotlb(struct device_domain_info *info,
+					u64 address,
+				     ioasid_t pasid, unsigned int mask)
+{
+	u16 sid, qdep;
+
+	if (!info || !info->ats_enabled)
+		return;
+
+	sid = info->bus << 8 | info->devfn;
+	qdep = info->ats_qdep;
+	qi_flush_dev_iotlb_pasid(info->iommu, sid, info->pfsid,
+				 pasid, qdep, address, mask);
+}
+
 static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
 				  u64 addr, unsigned mask)
 {
 	unsigned long flags;
 	struct device_domain_info *info;
 	struct subdev_domain_info *sinfo;
+	unsigned long pasid;
+	struct pasid_info *pinfo;
 
 	if (!domain->has_iotlb_device)
 		return;
 
 	spin_lock_irqsave(&device_domain_lock, flags);
-	list_for_each_entry(info, &domain->devices, link)
-		__iommu_flush_dev_iotlb(info, addr, mask);
-
+	list_for_each_entry(info, &domain->devices, link) {
+		/*
+		 * We cannot use PASID based devTLB invalidation on RID2PASID
+		 * Device does not understand RID2PASID/0. This is different
+		 * than IOTLB invalidation where RID2PASID is also used for
+		 * tagging.
+		 */
+		xa_for_each(&info->pasids, pasid, pinfo) {
+			if (!pasid)
+				__iommu_flush_dev_iotlb(info, addr, mask);
+			else
+				__iommu_flush_dev_piotlb(info, addr, pasid, mask);
+		}
+	}
 	list_for_each_entry(sinfo, &domain->subdevices, link_domain) {
 		info = get_domain_info(sinfo->pdev);
 		__iommu_flush_dev_iotlb(info, addr, mask);
@@ -1648,6 +1676,8 @@ static void domain_flush_piotlb(struct intel_iommu *iommu,
 				u64 addr, unsigned long npages, bool ih)
 {
 	u16 did = domain->iommu_did[iommu->seq_id];
+	struct pasid_info *pinfo;
+	unsigned long pasid;
 
 	if (domain->default_pasid)
 		qi_flush_piotlb(iommu, did, domain->default_pasid,
@@ -1655,6 +1685,21 @@ static void domain_flush_piotlb(struct intel_iommu *iommu,
 
 	if (!list_empty(&domain->devices))
 		qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr, npages, ih);
+
+	if (list_empty(&domain->devices) || xa_empty(&domain->pasids))
+		return;
+
+	/*
+	 * Flush IOTLBs for all the PASIDs attached to this domain, RID2PASID
+	 * included.
+	 * TODO: If there are many PASIDs, we may resort to flush with
+	 * domain ID which may have performance benefits due to fewer
+	 * invalidation descriptors. VM exits may be reduced when running on
+	 * vIOMMU. The current use cases utilize no more than 2 PASIDs per
+	 * device, i.e. RID2PASID and a kernel DMA API PASID.
+	 */
+	xa_for_each(&domain->pasids, pasid, pinfo)
+		qi_flush_piotlb(iommu, did, pasid, addr, npages, ih);
 }
 
 static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
@@ -1902,6 +1947,7 @@ static struct dmar_domain *alloc_domain(unsigned int type)
 	domain->has_iotlb_device = false;
 	INIT_LIST_HEAD(&domain->devices);
 	INIT_LIST_HEAD(&domain->subdevices);
+	xa_init(&domain->pasids);
 
 	return domain;
 }
@@ -2556,6 +2602,144 @@ static bool dev_is_real_dma_subdevice(struct device *dev)
 	       pci_real_dma_dev(to_pci_dev(dev)) != to_pci_dev(dev);
 }
 
+
+static bool is_device_domain_attached(struct dmar_domain *dmar_domain,
+				      struct device *dev)
+{
+	struct device_domain_info *info;
+
+	list_for_each_entry(info, &dmar_domain->devices, link) {
+		if (info->dev == dev)
+			return true;
+	}
+
+	return false;
+}
+
+static int intel_iommu_attach_dev_pasid(struct iommu_domain *domain,
+					struct device *dev, ioasid_t pasid)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	struct device_domain_info *info = get_domain_info(dev);
+	struct intel_iommu *iommu = info->iommu;
+	struct pasid_info *pinfo;
+	unsigned long flags;
+	int ret = 0;
+	void *entry;
+
+	if (!info)
+		return -ENODEV;
+	/*
+	 * Allow attaching kernel PASIDs only after the device is already attached
+	 * with RID2PASID, which is used for legacy DMA.
+	 */
+	if (pasid != PASID_RID2PASID && !is_device_domain_attached(dmar_domain, dev)) {
+		dev_err(dev, "Device not attached, must attach device before PASID!\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * The same PASID from the same device can only attach to this domain
+	 * once. PASID table is per device. NULL entry is not allowed in the
+	 * device-domain xarray.
+	 */
+	entry = xa_load(&info->pasids, pasid);
+	if (entry) {
+		dev_err(dev, "PASID %d already attached!\n", pasid);
+		return -EBUSY;
+	}
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	spin_lock(&iommu->lock);
+	if (hw_pass_through && domain_type_is_si(info->domain))
+		ret = intel_pasid_setup_pass_through(info->iommu, info->domain,
+						     dev, pasid);
+	else if (domain_use_first_level(dmar_domain))
+		ret = domain_setup_first_level(iommu, dmar_domain, dev, pasid);
+	else
+		ret = intel_pasid_setup_second_level(iommu, dmar_domain, dev, pasid);
+	spin_unlock(&iommu->lock);
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+	if (ret)
+		return ret;
+
+	xa_lock(&dmar_domain->pasids);
+	/*
+	 * Each domain could have multiple devices attached with shared or per
+	 * device PASIDs. At the domain level, we keep track of unique PASIDs and
+	 * device user count.
+	 * E.g. If a domain has two devices attached, device A has PASID 0, 1;
+	 * device B has PASID 0, 2. Then the domain would have PASID 0, 1, 2.
+	 */
+	entry = xa_load(&dmar_domain->pasids, pasid);
+	if (entry) {
+		pinfo = entry;
+	} else {
+		pinfo = kzalloc(sizeof(*pinfo), GFP_ATOMIC);
+		if (!pinfo)
+			return -ENOMEM;
+		pinfo->pasid = pasid;
+		/* Store the new PASID info in the per domain array */
+		ret = xa_err(__xa_store(&dmar_domain->pasids, pasid, pinfo,
+			     GFP_ATOMIC));
+		if (ret)
+			goto xa_store_err;
+	}
+	/* Store PASID in per device-domain array, this is for tracking devTLB */
+	ret = xa_err(xa_store(&info->pasids, pasid, pinfo, GFP_ATOMIC));
+	if (ret)
+		goto xa_store_err;
+
+	atomic_inc(&pinfo->users);
+	xa_unlock(&dmar_domain->pasids);
+
+	return 0;
+
+xa_store_err:
+	xa_unlock(&dmar_domain->pasids);
+	spin_lock_irqsave(&iommu->lock, flags);
+	intel_pasid_tear_down_entry(iommu, dev, pasid, false);
+	spin_unlock_irqrestore(&iommu->lock, flags);
+
+	if (!atomic_read(&pinfo->users)) {
+		__xa_erase(&dmar_domain->pasids, pasid);
+		kfree(pinfo);
+	}
+	return ret;
+}
+
+static void intel_iommu_detach_dev_pasid(struct iommu_domain *domain,
+					 struct device *dev, ioasid_t pasid)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	struct device_domain_info *info = get_domain_info(dev);
+	struct intel_iommu *iommu = info->iommu;
+	unsigned long flags;
+	struct pasid_info *pinfo;
+
+	if (WARN_ON(!is_device_domain_attached(dmar_domain, dev)))
+		return;
+
+	spin_lock_irqsave(&iommu->lock, flags);
+	intel_pasid_tear_down_entry(iommu, dev, pasid, false);
+	spin_unlock_irqrestore(&iommu->lock, flags);
+
+	xa_lock(&dmar_domain->pasids);
+	pinfo = xa_load(&dmar_domain->pasids, pasid);
+	if (!pinfo) {
+		dev_err(dev, "PASID %d not attached!\n", pasid);
+		xa_unlock(&dmar_domain->pasids);
+		return;
+	}
+
+	xa_erase(&info->pasids, pasid);
+	if (atomic_dec_and_test(&pinfo->users)) {
+		__xa_erase(&dmar_domain->pasids, pasid);
+		kfree(pinfo);
+	}
+	xa_unlock(&dmar_domain->pasids);
+}
+
 static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 						    int bus, int devfn,
 						    struct device *dev,
@@ -2590,6 +2774,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	info->pasid_table = NULL;
 	info->auxd_enabled = 0;
 	INIT_LIST_HEAD(&info->subdevices);
+	xa_init(&info->pasids);
 
 	if (dev && dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(info->dev);
@@ -5165,6 +5350,7 @@ static void intel_iommu_probe_finalize(struct device *dev)
 	iommu_setup_dma_ops(dev, 0, U64_MAX);
 }
 
+
 static void intel_iommu_get_resv_regions(struct device *device,
 					 struct list_head *head)
 {
@@ -5491,6 +5677,8 @@ const struct iommu_ops intel_iommu_ops = {
 	.enable_nesting		= intel_iommu_enable_nesting,
 	.attach_dev		= intel_iommu_attach_device,
 	.detach_dev		= intel_iommu_detach_device,
+	.attach_dev_pasid       = intel_iommu_attach_dev_pasid,
+	.detach_dev_pasid       = intel_iommu_detach_dev_pasid,
 	.aux_attach_dev		= intel_iommu_aux_attach_device,
 	.aux_detach_dev		= intel_iommu_aux_detach_device,
 	.aux_get_pasid		= intel_iommu_aux_get_pasid,
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 82955524fad8..3f4c98f170ec 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -567,7 +567,7 @@ struct dmar_domain {
 					 * The default pasid used for non-SVM
 					 * traffic on mediated devices.
 					 */
-
+	struct xarray	pasids;		/* Tracks the PASIDs attached to this domain */
 	struct iommu_domain domain;	/* generic domain data structure for
 					   iommu core */
 };
@@ -628,6 +628,15 @@ struct subdev_domain_info {
 	int users;			/* user count */
 };
 
+struct pasid_info {
+	struct device_domain_info *info;
+	ioasid_t pasid;
+	atomic_t users;			/* Count the number of devices attached
+					 * with the PASID
+					 */
+	u32 flags;			/* permission and other attributes */
+};
+
 /* PCI domain-device relationship */
 struct device_domain_info {
 	struct list_head link;	/* link to domain siblings */
@@ -650,6 +659,7 @@ struct device_domain_info {
 	struct intel_iommu *iommu; /* IOMMU used by this device */
 	struct dmar_domain *domain; /* pointer to domain */
 	struct pasid_table *pasid_table; /* pasid table */
+	struct xarray pasids; /* PASIDs attached to this domain-device */
 };
 
 static inline void __iommu_flush_cache(
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 4/8] iommu/vt-d: Use device_pasid attach op for RID2PASID
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
                   ` (2 preceding siblings ...)
  2022-03-15  5:07 ` [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-16  7:54   ` Tian, Kevin
  2022-03-15  5:07 ` [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users Jacob Pan
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

With the availability of a generic device-PASID-domain attachment API,
there's no need to special case RID2PASID.  Use the API to replace
duplicated code.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9267194eaed3..f832b7599d21 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1683,9 +1683,6 @@ static void domain_flush_piotlb(struct intel_iommu *iommu,
 		qi_flush_piotlb(iommu, did, domain->default_pasid,
 				addr, npages, ih);
 
-	if (!list_empty(&domain->devices))
-		qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr, npages, ih);
-
 	if (list_empty(&domain->devices) || xa_empty(&domain->pasids))
 		return;
 
@@ -2826,17 +2823,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 		}
 
 		/* Setup the PASID entry for requests without PASID: */
-		spin_lock_irqsave(&iommu->lock, flags);
-		if (hw_pass_through && domain_type_is_si(domain))
-			ret = intel_pasid_setup_pass_through(iommu, domain,
-					dev, PASID_RID2PASID);
-		else if (domain_use_first_level(domain))
-			ret = domain_setup_first_level(iommu, domain, dev,
-					PASID_RID2PASID);
-		else
-			ret = intel_pasid_setup_second_level(iommu, domain,
-					dev, PASID_RID2PASID);
-		spin_unlock_irqrestore(&iommu->lock, flags);
+		ret = intel_iommu_attach_dev_pasid(&domain->domain, dev, PASID_RID2PASID);
 		if (ret) {
 			dev_err(dev, "Setup RID2PASID failed\n");
 			dmar_remove_one_dev_info(dev);
@@ -4618,8 +4605,7 @@ static void __dmar_remove_one_dev_info(struct device_domain_info *info)
 
 	if (info->dev && !dev_is_real_dma_subdevice(info->dev)) {
 		if (dev_is_pci(info->dev) && sm_supported(iommu))
-			intel_pasid_tear_down_entry(iommu, info->dev,
-					PASID_RID2PASID, false);
+			intel_iommu_detach_dev_pasid(&domain->domain, info->dev, PASID_RID2PASID);
 
 		iommu_disable_dev_iotlb(info);
 		domain_context_clear(info);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
                   ` (3 preceding siblings ...)
  2022-03-15  5:07 ` [PATCH v2 4/8] iommu/vt-d: Use device_pasid attach op for RID2PASID Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-15 11:16   ` Robin Murphy
                     ` (2 more replies)
  2022-03-15  5:07 ` [PATCH v2 6/8] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID Jacob Pan
                   ` (4 subsequent siblings)
  9 siblings, 3 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

DMA mapping API is the de facto standard for in-kernel DMA. It operates
on a per device/RID basis which is not PASID-aware.

Some modern devices such as Intel Data Streaming Accelerator, PASID is
required for certain work submissions. To allow such devices use DMA
mapping API, we need the following functionalities:
1. Provide device a way to retrieve a PASID for work submission within
the kernel
2. Enable the kernel PASID on the IOMMU for the device
3. Attach the kernel PASID to the device's default DMA domain, let it
be IOVA or physical address in case of pass-through.

This patch introduces a driver facing API that enables DMA API
PASID usage. Once enabled, device drivers can continue to use DMA APIs as
is. There is no difference in dma_handle between without PASID and with
PASID.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++++++++++++++++++++
 include/linux/dma-iommu.h |  7 +++++
 include/linux/iommu.h     |  9 ++++++
 3 files changed, 81 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index b22034975301..d0ff1a34b1b6 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -39,6 +39,8 @@ enum iommu_dma_cookie_type {
 	IOMMU_DMA_MSI_COOKIE,
 };
 
+static DECLARE_IOASID_SET(iommu_dma_pasid);
+
 struct iommu_dma_cookie {
 	enum iommu_dma_cookie_type	type;
 	union {
@@ -370,6 +372,69 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 	domain->iova_cookie = NULL;
 }
 
+/**
+ * iommu_enable_pasid_dma --Enable in-kernel DMA request with PASID
+ * @dev:	Device to be enabled
+ *
+ * DMA request with PASID will be mapped the same way as the legacy DMA.
+ * If the device is in pass-through, PASID will also pass-through. If the
+ * device is in IOVA map, the supervisor PASID will point to the same IOVA
+ * page table.
+ *
+ * @return the kernel PASID to be used for DMA or INVALID_IOASID on failure
+ */
+int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid)
+{
+	struct iommu_domain *dom;
+	ioasid_t id, max;
+	int ret;
+
+	dom = iommu_get_domain_for_dev(dev);
+	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
+		return -ENODEV;
+	max = iommu_get_dev_pasid_max(dev);
+	if (!max)
+		return -EINVAL;
+
+	id = ioasid_alloc(&iommu_dma_pasid, 1, max, dev);
+	if (id == INVALID_IOASID)
+		return -ENOMEM;
+
+	ret = dom->ops->attach_dev_pasid(dom, dev, id);
+	if (ret) {
+		ioasid_put(id);
+		return ret;
+	}
+	*pasid = id;
+
+	return ret;
+}
+EXPORT_SYMBOL(iommu_enable_pasid_dma);
+
+/**
+ * iommu_disable_pasid_dma --Disable in-kernel DMA request with PASID
+ * @dev:	Device's PASID DMA to be disabled
+ *
+ * It is the device driver's responsibility to ensure no more incoming DMA
+ * requests with the kernel PASID before calling this function. IOMMU driver
+ * ensures PASID cache, IOTLBs related to the kernel PASID are cleared and
+ * drained.
+ *
+ * @return 0 on success or error code on failure
+ */
+void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid)
+{
+	struct iommu_domain *dom;
+
+	/* TODO: check the given PASID is within the ioasid_set */
+	dom = iommu_get_domain_for_dev(dev);
+	if (!dom->ops->detach_dev_pasid)
+		return;
+	dom->ops->detach_dev_pasid(dom, dev, pasid);
+	ioasid_put(pasid);
+}
+EXPORT_SYMBOL(iommu_disable_pasid_dma);
+
 /**
  * iommu_dma_get_resv_regions - Reserved region driver helper
  * @dev: Device from iommu_get_resv_regions()
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 24607dc3c2ac..e6cb9b52a420 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -18,6 +18,13 @@ int iommu_get_dma_cookie(struct iommu_domain *domain);
 int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
 void iommu_put_dma_cookie(struct iommu_domain *domain);
 
+/*
+ * For devices that can do DMA request with PASID, setup a system PASID.
+ * Address modes (IOVA, PA) are selected by the platform code.
+ */
+int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid);
+void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid);
+
 /* Setup call for arch DMA mapping code */
 void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit);
 int iommu_dma_init_fq(struct iommu_domain *domain);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fde5b933dbe3..fb011722e4f8 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -395,6 +395,15 @@ static inline void iommu_set_dev_pasid_max(struct device *dev,
 
 	param->pasid_max = max;
 }
+static inline ioasid_t iommu_get_dev_pasid_max(struct device *dev)
+{
+	struct dev_iommu *param = dev->iommu;
+
+	if (WARN_ON(!param))
+		return 0;
+
+	return param->pasid_max;
+}
 
 int iommu_device_register(struct iommu_device *iommu,
 			  const struct iommu_ops *ops,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 6/8] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
                   ` (4 preceding siblings ...)
  2022-03-15  5:07 ` [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-18  6:10   ` Tian, Kevin
  2022-03-15  5:07 ` [PATCH v2 7/8] iommu/vt-d: Delete supervisor/kernel SVA Jacob Pan
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

The current in-kernel supervisor PASID support is based on the SVM/SVA
machinery in SVA lib. The binding between a kernel PASID and kernel
mapping has many flaws. See discussions in the link below.

This patch enables in-kernel DMA by switching from SVA lib to the
standard DMA mapping APIs. Since both DMA requests with and without
PASIDs are mapped identically, there is no change to how DMA APIs are
used after the kernel PASID is enabled.

Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/dma/idxd/idxd.h  |  1 -
 drivers/dma/idxd/init.c  | 34 +++++++++-------------------------
 drivers/dma/idxd/sysfs.c |  7 -------
 3 files changed, 9 insertions(+), 33 deletions(-)

diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
index da72eb15f610..a09ab4a6e1c1 100644
--- a/drivers/dma/idxd/idxd.h
+++ b/drivers/dma/idxd/idxd.h
@@ -276,7 +276,6 @@ struct idxd_device {
 	struct idxd_wq **wqs;
 	struct idxd_engine **engines;
 
-	struct iommu_sva *sva;
 	unsigned int pasid;
 
 	int num_groups;
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 08a5f4310188..5d1f8dd4abf6 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -16,6 +16,7 @@
 #include <linux/idr.h>
 #include <linux/intel-svm.h>
 #include <linux/iommu.h>
+#include <linux/dma-iommu.h>
 #include <uapi/linux/idxd.h>
 #include <linux/dmaengine.h>
 #include "../dmaengine.h"
@@ -466,36 +467,22 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
 
 static int idxd_enable_system_pasid(struct idxd_device *idxd)
 {
-	int flags;
-	unsigned int pasid;
-	struct iommu_sva *sva;
+	u32 pasid;
+	int ret;
 
-	flags = SVM_FLAG_SUPERVISOR_MODE;
-
-	sva = iommu_sva_bind_device(&idxd->pdev->dev, NULL, &flags);
-	if (IS_ERR(sva)) {
-		dev_warn(&idxd->pdev->dev,
-			 "iommu sva bind failed: %ld\n", PTR_ERR(sva));
-		return PTR_ERR(sva);
-	}
-
-	pasid = iommu_sva_get_pasid(sva);
-	if (pasid == IOMMU_PASID_INVALID) {
-		iommu_sva_unbind_device(sva);
-		return -ENODEV;
+	ret = iommu_enable_pasid_dma(&idxd->pdev->dev, &pasid);
+	if (ret) {
+		dev_err(&idxd->pdev->dev, "No DMA PASID %d\n", ret);
+		return ret;
 	}
-
-	idxd->sva = sva;
 	idxd->pasid = pasid;
-	dev_dbg(&idxd->pdev->dev, "system pasid: %u\n", pasid);
+
 	return 0;
 }
 
 static void idxd_disable_system_pasid(struct idxd_device *idxd)
 {
-
-	iommu_sva_unbind_device(idxd->sva);
-	idxd->sva = NULL;
+	iommu_disable_pasid_dma(&idxd->pdev->dev, idxd->pasid);
 }
 
 static int idxd_probe(struct idxd_device *idxd)
@@ -524,10 +511,7 @@ static int idxd_probe(struct idxd_device *idxd)
 		} else {
 			dev_warn(dev, "Unable to turn on SVA feature.\n");
 		}
-	} else if (!sva) {
-		dev_warn(dev, "User forced SVA off via module param.\n");
 	}
-
 	idxd_read_caps(idxd);
 	idxd_read_table_offsets(idxd);
 
diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
index 7e19ab92b61a..fde6656695ba 100644
--- a/drivers/dma/idxd/sysfs.c
+++ b/drivers/dma/idxd/sysfs.c
@@ -839,13 +839,6 @@ static ssize_t wq_name_store(struct device *dev,
 	if (strlen(buf) > WQ_NAME_SIZE || strlen(buf) == 0)
 		return -EINVAL;
 
-	/*
-	 * This is temporarily placed here until we have SVM support for
-	 * dmaengine.
-	 */
-	if (wq->type == IDXD_WQT_KERNEL && device_pasid_enabled(wq->idxd))
-		return -EOPNOTSUPP;
-
 	memset(wq->name, 0, WQ_NAME_SIZE + 1);
 	strncpy(wq->name, buf, WQ_NAME_SIZE);
 	strreplace(wq->name, '\n', '\0');
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 7/8] iommu/vt-d: Delete supervisor/kernel SVA
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
                   ` (5 preceding siblings ...)
  2022-03-15  5:07 ` [PATCH v2 6/8] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-18  6:16   ` Tian, Kevin
  2022-03-15  5:07 ` [PATCH v2 8/8] iommu: Remove unused driver data in sva_bind_device Jacob Pan
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

In-kernel DMA with PASID should use DMA API now, remove supervisor PASID
SVA support. Remove special cases in bind mm and page request service.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/svm.c | 42 ++++++++-------------------------------
 1 file changed, 8 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 2c53689da461..37d6218f173b 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -516,11 +516,10 @@ static void intel_svm_free_pasid(struct mm_struct *mm)
 
 static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 					   struct device *dev,
-					   struct mm_struct *mm,
-					   unsigned int flags)
+					   struct mm_struct *mm)
 {
 	struct device_domain_info *info = get_domain_info(dev);
-	unsigned long iflags, sflags;
+	unsigned long iflags, sflags = 0;
 	struct intel_svm_dev *sdev;
 	struct intel_svm *svm;
 	int ret = 0;
@@ -533,16 +532,13 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 
 		svm->pasid = mm->pasid;
 		svm->mm = mm;
-		svm->flags = flags;
 		INIT_LIST_HEAD_RCU(&svm->devs);
 
-		if (!(flags & SVM_FLAG_SUPERVISOR_MODE)) {
-			svm->notifier.ops = &intel_mmuops;
-			ret = mmu_notifier_register(&svm->notifier, mm);
-			if (ret) {
-				kfree(svm);
-				return ERR_PTR(ret);
-			}
+		svm->notifier.ops = &intel_mmuops;
+		ret = mmu_notifier_register(&svm->notifier, mm);
+		if (ret) {
+			kfree(svm);
+			return ERR_PTR(ret);
 		}
 
 		ret = pasid_private_add(svm->pasid, svm);
@@ -583,8 +579,6 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 	}
 
 	/* Setup the pasid table: */
-	sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ?
-			PASID_FLAG_SUPERVISOR_MODE : 0;
 	sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
 	spin_lock_irqsave(&iommu->lock, iflags);
 	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm->pasid,
@@ -957,7 +951,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 			 * to unbind the mm while any page faults are outstanding.
 			 */
 			svm = pasid_private_find(req->pasid);
-			if (IS_ERR_OR_NULL(svm) || (svm->flags & SVM_FLAG_SUPERVISOR_MODE))
+			if (IS_ERR_OR_NULL(svm))
 				goto bad_req;
 		}
 
@@ -1011,29 +1005,9 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
 {
 	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
-	unsigned int flags = 0;
 	struct iommu_sva *sva;
 	int ret;
 
-	if (drvdata)
-		flags = *(unsigned int *)drvdata;
-
-	if (flags & SVM_FLAG_SUPERVISOR_MODE) {
-		if (!ecap_srs(iommu->ecap)) {
-			dev_err(dev, "%s: Supervisor PASID not supported\n",
-				iommu->name);
-			return ERR_PTR(-EOPNOTSUPP);
-		}
-
-		if (mm) {
-			dev_err(dev, "%s: Supervisor PASID with user provided mm\n",
-				iommu->name);
-			return ERR_PTR(-EINVAL);
-		}
-
-		mm = &init_mm;
-	}
-
 	mutex_lock(&pasid_mutex);
 	ret = intel_svm_alloc_pasid(dev, mm, flags);
 	if (ret) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 8/8] iommu: Remove unused driver data in sva_bind_device
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
                   ` (6 preceding siblings ...)
  2022-03-15  5:07 ` [PATCH v2 7/8] iommu/vt-d: Delete supervisor/kernel SVA Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-15 11:37   ` Jean-Philippe Brucker
  2022-03-15  5:07 ` [PATCH v2 9/9] dmaengine: idxd: separate user and kernel pasid enabling Jacob Pan
  2022-03-15  8:16 ` [PATCH v2 0/8] Enable PASID for DMA API users Tian, Kevin
  9 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

No one is using drvdata for sva_bind_device after kernel SVA support is
removed from VT-d driver. Remove the drvdata parameter as well.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/dma/idxd/cdev.c                         | 2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h     | 5 ++---
 drivers/iommu/intel/svm.c                       | 9 ++++-----
 drivers/iommu/iommu.c                           | 4 ++--
 drivers/misc/uacce/uacce.c                      | 2 +-
 include/linux/intel-iommu.h                     | 3 +--
 include/linux/iommu.h                           | 9 +++------
 8 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
index b9b2b4a4124e..312ec37ebf91 100644
--- a/drivers/dma/idxd/cdev.c
+++ b/drivers/dma/idxd/cdev.c
@@ -100,7 +100,7 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp)
 	filp->private_data = ctx;
 
 	if (device_pasid_enabled(idxd)) {
-		sva = iommu_sva_bind_device(dev, current->mm, NULL);
+		sva = iommu_sva_bind_device(dev, current->mm);
 		if (IS_ERR(sva)) {
 			rc = PTR_ERR(sva);
 			dev_err(dev, "pasid allocation failed: %d\n", rc);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index a737ba5f727e..eb2f5cb0701a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -354,7 +354,7 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 }
 
 struct iommu_sva *
-arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
+arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 {
 	struct iommu_sva *handle;
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index cd48590ada30..d2ba86470c42 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -754,8 +754,7 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master);
 int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
 int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
 bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
-struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
-				    void *drvdata);
+struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm);
 void arm_smmu_sva_unbind(struct iommu_sva *handle);
 u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
 void arm_smmu_sva_notifier_synchronize(void);
@@ -791,7 +790,7 @@ static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master
 }
 
 static inline struct iommu_sva *
-arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
+arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 {
 	return ERR_PTR(-ENODEV);
 }
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 37d6218f173b..94deb58375f5 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -500,8 +500,7 @@ int intel_svm_unbind_gpasid(struct device *dev, u32 pasid)
 	return ret;
 }
 
-static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
-				 unsigned int flags)
+static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm)
 {
 	ioasid_t max_pasid = dev_is_pci(dev) ?
 			pci_max_pasids(to_pci_dev(dev)) : intel_pasid_max_id;
@@ -1002,20 +1001,20 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 	return IRQ_RETVAL(handled);
 }
 
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
+struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm)
 {
 	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
 	struct iommu_sva *sva;
 	int ret;
 
 	mutex_lock(&pasid_mutex);
-	ret = intel_svm_alloc_pasid(dev, mm, flags);
+	ret = intel_svm_alloc_pasid(dev, mm);
 	if (ret) {
 		mutex_unlock(&pasid_mutex);
 		return ERR_PTR(ret);
 	}
 
-	sva = intel_svm_bind_mm(iommu, dev, mm, flags);
+	sva = intel_svm_bind_mm(iommu, dev, mm);
 	if (IS_ERR_OR_NULL(sva))
 		intel_svm_free_pasid(mm);
 	mutex_unlock(&pasid_mutex);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 107dcf5938d6..fef34879bc0c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3049,7 +3049,7 @@ EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
  * On error, returns an ERR_PTR value.
  */
 struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
+iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
 {
 	struct iommu_group *group;
 	struct iommu_sva *handle = ERR_PTR(-EINVAL);
@@ -3074,7 +3074,7 @@ iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
 	if (iommu_group_device_count(group) != 1)
 		goto out_unlock;
 
-	handle = ops->sva_bind(dev, mm, drvdata);
+	handle = ops->sva_bind(dev, mm);
 
 out_unlock:
 	mutex_unlock(&group->mutex);
diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index 281c54003edc..3238a867ea51 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -99,7 +99,7 @@ static int uacce_bind_queue(struct uacce_device *uacce, struct uacce_queue *q)
 	if (!(uacce->flags & UACCE_DEV_SVA))
 		return 0;
 
-	handle = iommu_sva_bind_device(uacce->parent, current->mm, NULL);
+	handle = iommu_sva_bind_device(uacce->parent, current->mm);
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 3f4c98f170ec..9dc855d7479d 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -777,8 +777,7 @@ extern int intel_svm_finish_prq(struct intel_iommu *iommu);
 int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 			  struct iommu_gpasid_bind_data *data);
 int intel_svm_unbind_gpasid(struct device *dev, u32 pasid);
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
-				 void *drvdata);
+struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm);
 void intel_svm_unbind(struct iommu_sva *handle);
 u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fb011722e4f8..b570b37181ad 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -294,9 +294,7 @@ struct iommu_ops {
 	int (*aux_attach_dev)(struct iommu_domain *domain, struct device *dev);
 	void (*aux_detach_dev)(struct iommu_domain *domain, struct device *dev);
 	int (*aux_get_pasid)(struct iommu_domain *domain, struct device *dev);
-
-	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
-				      void *drvdata);
+	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm);
 	void (*sva_unbind)(struct iommu_sva *handle);
 	int (*attach_dev_pasid)(struct iommu_domain *domain,
 				struct device *dev, ioasid_t id);
@@ -705,8 +703,7 @@ void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev);
 int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev);
 
 struct iommu_sva *iommu_sva_bind_device(struct device *dev,
-					struct mm_struct *mm,
-					void *drvdata);
+					struct mm_struct *mm);
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 
@@ -1065,7 +1062,7 @@ iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
 }
 
 static inline struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
+iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
 {
 	return NULL;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 9/9] dmaengine: idxd: separate user and kernel pasid enabling
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
                   ` (7 preceding siblings ...)
  2022-03-15  5:07 ` [PATCH v2 8/8] iommu: Remove unused driver data in sva_bind_device Jacob Pan
@ 2022-03-15  5:07 ` Jacob Pan
  2022-03-18  6:28   ` Tian, Kevin
  2022-03-15  8:16 ` [PATCH v2 0/8] Enable PASID for DMA API users Tian, Kevin
  9 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15  5:07 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker
  Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang, Tony Luck,
	Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

From: Dave Jiang <dave.jiang@intel.com>

The idxd driver always gated the pasid enabling under a single knob and
this assumption is incorrect. The pasid used for kernel operation can be
independently toggled and has no dependency on the user pasid (and vice
versa). Split the two so they are independent "enabled" flags.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/dma/idxd/cdev.c |  4 ++--
 drivers/dma/idxd/idxd.h |  6 ++++++
 drivers/dma/idxd/init.c | 30 ++++++++++++++++++------------
 3 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
index 312ec37ebf91..addaebca7683 100644
--- a/drivers/dma/idxd/cdev.c
+++ b/drivers/dma/idxd/cdev.c
@@ -99,7 +99,7 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp)
 	ctx->wq = wq;
 	filp->private_data = ctx;
 
-	if (device_pasid_enabled(idxd)) {
+	if (device_user_pasid_enabled(idxd)) {
 		sva = iommu_sva_bind_device(dev, current->mm);
 		if (IS_ERR(sva)) {
 			rc = PTR_ERR(sva);
@@ -152,7 +152,7 @@ static int idxd_cdev_release(struct inode *node, struct file *filep)
 	if (wq_shared(wq)) {
 		idxd_device_drain_pasid(idxd, ctx->pasid);
 	} else {
-		if (device_pasid_enabled(idxd)) {
+		if (device_user_pasid_enabled(idxd)) {
 			/* The wq disable in the disable pasid function will drain the wq */
 			rc = idxd_wq_disable_pasid(wq);
 			if (rc < 0)
diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
index a09ab4a6e1c1..190b08bd7c08 100644
--- a/drivers/dma/idxd/idxd.h
+++ b/drivers/dma/idxd/idxd.h
@@ -239,6 +239,7 @@ enum idxd_device_flag {
 	IDXD_FLAG_CONFIGURABLE = 0,
 	IDXD_FLAG_CMD_RUNNING,
 	IDXD_FLAG_PASID_ENABLED,
+	IDXD_FLAG_USER_PASID_ENABLED,
 };
 
 struct idxd_dma_dev {
@@ -468,6 +469,11 @@ static inline bool device_pasid_enabled(struct idxd_device *idxd)
 	return test_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
 }
 
+static inline bool device_user_pasid_enabled(struct idxd_device *idxd)
+{
+	return test_bit(IDXD_FLAG_USER_PASID_ENABLED, &idxd->flags);
+}
+
 static inline bool device_swq_supported(struct idxd_device *idxd)
 {
 	return (support_enqcmd && device_pasid_enabled(idxd));
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 5d1f8dd4abf6..981150b7d09b 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -500,16 +500,19 @@ static int idxd_probe(struct idxd_device *idxd)
 
 	if (IS_ENABLED(CONFIG_INTEL_IDXD_SVM) && sva) {
 		rc = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
-		if (rc == 0) {
-			rc = idxd_enable_system_pasid(idxd);
-			if (rc < 0) {
-				iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
-				dev_warn(dev, "Failed to enable PASID. No SVA support: %d\n", rc);
-			} else {
-				set_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
-			}
-		} else {
+		if (rc) {
+			/*
+			 * Do not bail here since legacy DMA is still
+			 * supported, both user and in-kernel DMA with
+			 * PASID rely on SVA feature.
+			 */
 			dev_warn(dev, "Unable to turn on SVA feature.\n");
+		} else {
+			set_bit(IDXD_FLAG_USER_PASID_ENABLED, &idxd->flags);
+			if (idxd_enable_system_pasid(idxd))
+				dev_warn(dev, "No in-kernel DMA with PASID.\n");
+			else
+				set_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
 		}
 	}
 	idxd_read_caps(idxd);
@@ -545,7 +548,8 @@ static int idxd_probe(struct idxd_device *idxd)
  err:
 	if (device_pasid_enabled(idxd))
 		idxd_disable_system_pasid(idxd);
-	iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
+	if (device_user_pasid_enabled(idxd) || device_pasid_enabled(idxd))
+		iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
 	return rc;
 }
 
@@ -558,7 +562,8 @@ static void idxd_cleanup(struct idxd_device *idxd)
 	idxd_cleanup_internals(idxd);
 	if (device_pasid_enabled(idxd))
 		idxd_disable_system_pasid(idxd);
-	iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
+	if (device_user_pasid_enabled(idxd) || device_pasid_enabled(idxd))
+		iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
 }
 
 static int idxd_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
@@ -677,7 +682,8 @@ static void idxd_remove(struct pci_dev *pdev)
 	free_irq(irq_entry->vector, irq_entry);
 	pci_free_irq_vectors(pdev);
 	pci_iounmap(pdev, idxd->reg_base);
-	iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA);
+	if (device_user_pasid_enabled(idxd) || device_pasid_enabled(idxd))
+		iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA);
 	pci_disable_device(pdev);
 	destroy_workqueue(idxd->wq);
 	perfmon_pmu_remove(idxd);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 0/8] Enable PASID for DMA API users
  2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
                   ` (8 preceding siblings ...)
  2022-03-15  5:07 ` [PATCH v2 9/9] dmaengine: idxd: separate user and kernel pasid enabling Jacob Pan
@ 2022-03-15  8:16 ` Tian, Kevin
  2022-03-15 15:49   ` Jacob Pan
  9 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-15  8:16 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Pan, Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck,
	Tony, Zanussi, Tom, Williams, Dan J, Liu, Yi L

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Tuesday, March 15, 2022 1:07 PM
> 
> Some modern accelerators such as Intel's Data Streaming Accelerator (DSA)
> require PASID in DMA requests to be operational. Specifically, the work
> submissions with ENQCMD on shared work queues require PASIDs. The use
> cases
> include both user DMA with shared virtual addressing (SVA) and in-kernel
> DMA similar to legacy DMA w/o PASID. Here we address the latter.
> 
> DMA mapping API is the de facto standard for in-kernel DMA. However, it
> operates on a per device or Requester ID(RID) basis which is not
> PASID-aware. To leverage DMA API for devices relies on PASIDs, this
> patchset introduces the following APIs
> 
> 1. A driver facing API that enables DMA API PASID usage:
> iommu_enable_pasid_dma(struct device *dev, ioasid_t &pasid);

Should this be called dma_enable_pasid() since it's about DMA API? Doing
so also avoids the driver to include iommu.h.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-15  5:07 ` [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops Jacob Pan
@ 2022-03-15 10:24   ` Tian, Kevin
  2022-03-15 11:26   ` Jean-Philippe Brucker
  2022-03-18 11:52   ` Lu Baolu
  2 siblings, 0 replies; 61+ messages in thread
From: Tian, Kevin @ 2022-03-15 10:24 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Pan, Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck,
	Tony, Zanussi, Tom, Williams, Dan J, Liu, Yi L

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Tuesday, March 15, 2022 1:07 PM
> 
> From: Lu Baolu <baolu.lu@linux.intel.com>
> 
> An IOMMU domain represents an address space which can be attached by
> devices that perform DMA within a domain. However, for platforms with
> PASID capability the domain attachment needs be handled at device+PASID
> level. There can be multiple PASIDs within a device and multiple devices
> attached to a given domain.
> This patch introduces a new IOMMU op which support device, PASID, and
> IOMMU domain attachment. The immediate use case is for PASID capable
> devices to perform DMA under DMA APIs.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  include/linux/iommu.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 369f05c2a4e2..fde5b933dbe3 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
>   * @aux_get_pasid: get the pasid given an aux-domain
>   * @sva_bind: Bind process address space to device
>   * @sva_unbind: Unbind process address space from device
> + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> + * @detach_dev_pasid: detach an iommu domain from a pasid of device
>   * @sva_get_pasid: Get PASID associated to a SVA handle
>   * @page_response: handle page request response
>   * @cache_invalidate: invalidate translation caches
> @@ -296,6 +298,10 @@ struct iommu_ops {
>  	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct
> *mm,
>  				      void *drvdata);
>  	void (*sva_unbind)(struct iommu_sva *handle);
> +	int (*attach_dev_pasid)(struct iommu_domain *domain,
> +				struct device *dev, ioasid_t id);
> +	void (*detach_dev_pasid)(struct iommu_domain *domain,
> +				 struct device *dev, ioasid_t id);
>  	u32 (*sva_get_pasid)(struct iommu_sva *handle);
> 
>  	int (*page_response)(struct device *dev,
> --
> 2.25.1

Probably this can be combined with patch05 to demonstrate the usage
together with the definition of the new ops. Then comes the vt-d specific
change.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15  5:07 ` [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops Jacob Pan
@ 2022-03-15 10:33   ` Tian, Kevin
  2022-03-15 22:23     ` Jacob Pan
  2022-03-15 14:33   ` Jason Gunthorpe
  2022-03-16  7:39   ` Tian, Kevin
  2 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-15 10:33 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Pan, Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck,
	Tony, Zanussi, Tom, Williams, Dan J, Liu, Yi L

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Tuesday, March 15, 2022 1:07 PM
> 
> On VT-d platforms with scalable mode enabled, devices issue DMA requests
> with PASID need to attach to the correct IOMMU domains.
> The attach operation involves the following:
> - programming the PASID into device's PASID table
> - tracking device domain and the PASID relationship
> - managing IOTLB and device TLB invalidations
> 
> This patch extends DMAR domain and device domain info with xarrays to
> track per domain and per device PASIDs.  It provides the flexibility to
> be used beyond DMA API PASID support.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  drivers/iommu/intel/iommu.c | 194
> +++++++++++++++++++++++++++++++++++-
>  include/linux/intel-iommu.h |  12 ++-
>  2 files changed, 202 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 881f8361eca2..9267194eaed3 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1622,20 +1622,48 @@ static void __iommu_flush_dev_iotlb(struct
> device_domain_info *info,
>  			   qdep, addr, mask);
>  }
> 
> +static void __iommu_flush_dev_piotlb(struct device_domain_info *info,

piotlb is confusing, better be:

	__iommu_flush_dev_iotlb_pasid()

> +					u64 address,
> +				     ioasid_t pasid, unsigned int mask)
> +{
> +	u16 sid, qdep;
> +
> +	if (!info || !info->ats_enabled)
> +		return;
> +
> +	sid = info->bus << 8 | info->devfn;
> +	qdep = info->ats_qdep;
> +	qi_flush_dev_iotlb_pasid(info->iommu, sid, info->pfsid,
> +				 pasid, qdep, address, mask);
> +}
> +
>  static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
>  				  u64 addr, unsigned mask)
>  {
>  	unsigned long flags;
>  	struct device_domain_info *info;
>  	struct subdev_domain_info *sinfo;
> +	unsigned long pasid;
> +	struct pasid_info *pinfo;
> 
>  	if (!domain->has_iotlb_device)
>  		return;
> 
>  	spin_lock_irqsave(&device_domain_lock, flags);
> -	list_for_each_entry(info, &domain->devices, link)
> -		__iommu_flush_dev_iotlb(info, addr, mask);
> -
> +	list_for_each_entry(info, &domain->devices, link) {
> +		/*
> +		 * We cannot use PASID based devTLB invalidation on
> RID2PASID
> +		 * Device does not understand RID2PASID/0. This is different

Lack of a conjunction word between 'RID2PASID' and 'Device'.

and what is RID2PASID/0? It would be clearer to point out that RID2PASID
is visible only within the iommu to mark out requests without PASID, 
thus this PASID value should never be sent to the device side.

> +		 * than IOTLB invalidation where RID2PASID is also used for
> +		 * tagging.

Then it would be obvious because IOTLB is iommu internal agent thus takes 
use of RID2PASID for tagging.

> +		 */
> +		xa_for_each(&info->pasids, pasid, pinfo) {
> +			if (!pasid)

this should be compared to PASID_RID2PASID (though it's zero today).

> +				__iommu_flush_dev_iotlb(info, addr, mask);
> +			else
> +				__iommu_flush_dev_piotlb(info, addr, pasid,
> mask);
> +		}
> +	}
>  	list_for_each_entry(sinfo, &domain->subdevices, link_domain) {
>  		info = get_domain_info(sinfo->pdev);
>  		__iommu_flush_dev_iotlb(info, addr, mask);

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15  5:07 ` [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users Jacob Pan
@ 2022-03-15 11:16   ` Robin Murphy
  2022-03-15 14:22     ` Jason Gunthorpe
  2022-03-15 14:35   ` Jason Gunthorpe
  2022-03-18 12:43   ` Lu Baolu
  2 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2022-03-15 11:16 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Tian, Kevin, Tony Luck, Dave Jiang, Raj Ashok, Zanussi, Tom,
	Kumar, Sanjay K, Jacob Pan, Dan Williams

On 2022-03-15 05:07, Jacob Pan wrote:
> DMA mapping API is the de facto standard for in-kernel DMA. It operates
> on a per device/RID basis which is not PASID-aware.
> 
> Some modern devices such as Intel Data Streaming Accelerator, PASID is
> required for certain work submissions. To allow such devices use DMA
> mapping API, we need the following functionalities:
> 1. Provide device a way to retrieve a PASID for work submission within
> the kernel
> 2. Enable the kernel PASID on the IOMMU for the device
> 3. Attach the kernel PASID to the device's default DMA domain, let it
> be IOVA or physical address in case of pass-through.
> 
> This patch introduces a driver facing API that enables DMA API
> PASID usage. Once enabled, device drivers can continue to use DMA APIs as
> is. There is no difference in dma_handle between without PASID and with
> PASID.

Surely the main point of PASIDs is to be able to use more than one of 
them? The way I expected this to work is that iommu_enable_pasid_dma() 
returns a *new* struct device representing the dev+PASID combination, 
and the driver can then pass that to subsequent DMA API and/or IOMMU API 
calls as normal, and they know to do the right thing. Automatically 
inferring a PASID for the original physical device clearly can't scale, 
and seems like a dead-end approach that only helps this one niche use-case.

Either way, I think this is also still fundamentally an IOMMU API 
operation that belongs in iommu.[ch] - since the iommu_dma_ops 
consolidation I'd prefer to continue working towards making dma-iommu.h 
a private header just for IOMMU API internal helpers.

Thanks,
Robin.

> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>   drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++++++++++++++++++++
>   include/linux/dma-iommu.h |  7 +++++
>   include/linux/iommu.h     |  9 ++++++
>   3 files changed, 81 insertions(+)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index b22034975301..d0ff1a34b1b6 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -39,6 +39,8 @@ enum iommu_dma_cookie_type {
>   	IOMMU_DMA_MSI_COOKIE,
>   };
>   
> +static DECLARE_IOASID_SET(iommu_dma_pasid);
> +
>   struct iommu_dma_cookie {
>   	enum iommu_dma_cookie_type	type;
>   	union {
> @@ -370,6 +372,69 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
>   	domain->iova_cookie = NULL;
>   }
>   
> +/**
> + * iommu_enable_pasid_dma --Enable in-kernel DMA request with PASID
> + * @dev:	Device to be enabled
> + *
> + * DMA request with PASID will be mapped the same way as the legacy DMA.
> + * If the device is in pass-through, PASID will also pass-through. If the
> + * device is in IOVA map, the supervisor PASID will point to the same IOVA
> + * page table.
> + *
> + * @return the kernel PASID to be used for DMA or INVALID_IOASID on failure
> + */
> +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid)
> +{
> +	struct iommu_domain *dom;
> +	ioasid_t id, max;
> +	int ret;
> +
> +	dom = iommu_get_domain_for_dev(dev);
> +	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
> +		return -ENODEV;
> +	max = iommu_get_dev_pasid_max(dev);
> +	if (!max)
> +		return -EINVAL;
> +
> +	id = ioasid_alloc(&iommu_dma_pasid, 1, max, dev);
> +	if (id == INVALID_IOASID)
> +		return -ENOMEM;
> +
> +	ret = dom->ops->attach_dev_pasid(dom, dev, id);
> +	if (ret) {
> +		ioasid_put(id);
> +		return ret;
> +	}
> +	*pasid = id;
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(iommu_enable_pasid_dma);
> +
> +/**
> + * iommu_disable_pasid_dma --Disable in-kernel DMA request with PASID
> + * @dev:	Device's PASID DMA to be disabled
> + *
> + * It is the device driver's responsibility to ensure no more incoming DMA
> + * requests with the kernel PASID before calling this function. IOMMU driver
> + * ensures PASID cache, IOTLBs related to the kernel PASID are cleared and
> + * drained.
> + *
> + * @return 0 on success or error code on failure
> + */
> +void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid)
> +{
> +	struct iommu_domain *dom;
> +
> +	/* TODO: check the given PASID is within the ioasid_set */
> +	dom = iommu_get_domain_for_dev(dev);
> +	if (!dom->ops->detach_dev_pasid)
> +		return;
> +	dom->ops->detach_dev_pasid(dom, dev, pasid);
> +	ioasid_put(pasid);
> +}
> +EXPORT_SYMBOL(iommu_disable_pasid_dma);
> +
>   /**
>    * iommu_dma_get_resv_regions - Reserved region driver helper
>    * @dev: Device from iommu_get_resv_regions()
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 24607dc3c2ac..e6cb9b52a420 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -18,6 +18,13 @@ int iommu_get_dma_cookie(struct iommu_domain *domain);
>   int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
>   void iommu_put_dma_cookie(struct iommu_domain *domain);
>   
> +/*
> + * For devices that can do DMA request with PASID, setup a system PASID.
> + * Address modes (IOVA, PA) are selected by the platform code.
> + */
> +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid);
> +void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid);
> +
>   /* Setup call for arch DMA mapping code */
>   void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit);
>   int iommu_dma_init_fq(struct iommu_domain *domain);
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index fde5b933dbe3..fb011722e4f8 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -395,6 +395,15 @@ static inline void iommu_set_dev_pasid_max(struct device *dev,
>   
>   	param->pasid_max = max;
>   }
> +static inline ioasid_t iommu_get_dev_pasid_max(struct device *dev)
> +{
> +	struct dev_iommu *param = dev->iommu;
> +
> +	if (WARN_ON(!param))
> +		return 0;
> +
> +	return param->pasid_max;
> +}
>   
>   int iommu_device_register(struct iommu_device *iommu,
>   			  const struct iommu_ops *ops,

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-15  5:07 ` [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops Jacob Pan
  2022-03-15 10:24   ` Tian, Kevin
@ 2022-03-15 11:26   ` Jean-Philippe Brucker
  2022-03-15 11:49     ` Tian, Kevin
  2022-03-18 11:52   ` Lu Baolu
  2 siblings, 1 reply; 61+ messages in thread
From: Jean-Philippe Brucker @ 2022-03-15 11:26 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar,
	Sanjay K, Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams,
	Tian, Kevin, Yi Liu

On Mon, Mar 14, 2022 at 10:07:06PM -0700, Jacob Pan wrote:
> From: Lu Baolu <baolu.lu@linux.intel.com>
> 
> An IOMMU domain represents an address space which can be attached by
> devices that perform DMA within a domain. However, for platforms with
> PASID capability the domain attachment needs be handled at device+PASID
> level. There can be multiple PASIDs within a device and multiple devices
> attached to a given domain.
> This patch introduces a new IOMMU op which support device, PASID, and
> IOMMU domain attachment. The immediate use case is for PASID capable
> devices to perform DMA under DMA APIs.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  include/linux/iommu.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 369f05c2a4e2..fde5b933dbe3 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
>   * @aux_get_pasid: get the pasid given an aux-domain
>   * @sva_bind: Bind process address space to device
>   * @sva_unbind: Unbind process address space from device
> + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> + * @detach_dev_pasid: detach an iommu domain from a pasid of device

Isn't that operation "assign a PASID to a domain" instead?  In patch 5,
the domain is already attached to the device, so set_domain_pasid() might
be clearer and to the point. If the IOMMU driver did the allocation we
could also avoid patch 1.

If I understand correctly this series is not about a generic PASID API
that allows drivers to manage multiple DMA address spaces, because there
still doesn't seem to be any interest in that. It's about the specific
IDXD use-case, so let's focus on that. We can introduce a specialized call
such as (iommu|dma)_set_device_pasid(), which will be easy to consolidate
later into a more generic "dma_enable_pasid()" API if that ever seems
useful.

Thanks,
Jean

>   * @sva_get_pasid: Get PASID associated to a SVA handle
>   * @page_response: handle page request response
>   * @cache_invalidate: invalidate translation caches
> @@ -296,6 +298,10 @@ struct iommu_ops {
>  	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
>  				      void *drvdata);
>  	void (*sva_unbind)(struct iommu_sva *handle);
> +	int (*attach_dev_pasid)(struct iommu_domain *domain,
> +				struct device *dev, ioasid_t id);
> +	void (*detach_dev_pasid)(struct iommu_domain *domain,
> +				 struct device *dev, ioasid_t id);
>  	u32 (*sva_get_pasid)(struct iommu_sva *handle);
>  
>  	int (*page_response)(struct device *dev,
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 8/8] iommu: Remove unused driver data in sva_bind_device
  2022-03-15  5:07 ` [PATCH v2 8/8] iommu: Remove unused driver data in sva_bind_device Jacob Pan
@ 2022-03-15 11:37   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 61+ messages in thread
From: Jean-Philippe Brucker @ 2022-03-15 11:37 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar,
	Sanjay K, Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams,
	Tian, Kevin, Yi Liu

On Mon, Mar 14, 2022 at 10:07:12PM -0700, Jacob Pan wrote:
> No one is using drvdata for sva_bind_device after kernel SVA support is
> removed from VT-d driver. Remove the drvdata parameter as well.
> 
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

> ---
>  drivers/dma/idxd/cdev.c                         | 2 +-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h     | 5 ++---
>  drivers/iommu/intel/svm.c                       | 9 ++++-----
>  drivers/iommu/iommu.c                           | 4 ++--
>  drivers/misc/uacce/uacce.c                      | 2 +-
>  include/linux/intel-iommu.h                     | 3 +--
>  include/linux/iommu.h                           | 9 +++------
>  8 files changed, 15 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
> index b9b2b4a4124e..312ec37ebf91 100644
> --- a/drivers/dma/idxd/cdev.c
> +++ b/drivers/dma/idxd/cdev.c
> @@ -100,7 +100,7 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp)
>  	filp->private_data = ctx;
>  
>  	if (device_pasid_enabled(idxd)) {
> -		sva = iommu_sva_bind_device(dev, current->mm, NULL);
> +		sva = iommu_sva_bind_device(dev, current->mm);
>  		if (IS_ERR(sva)) {
>  			rc = PTR_ERR(sva);
>  			dev_err(dev, "pasid allocation failed: %d\n", rc);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index a737ba5f727e..eb2f5cb0701a 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -354,7 +354,7 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
>  }
>  
>  struct iommu_sva *
> -arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
> +arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
>  {
>  	struct iommu_sva *handle;
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index cd48590ada30..d2ba86470c42 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -754,8 +754,7 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master);
>  int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
>  int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
>  bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
> -struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
> -				    void *drvdata);
> +struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm);
>  void arm_smmu_sva_unbind(struct iommu_sva *handle);
>  u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
>  void arm_smmu_sva_notifier_synchronize(void);
> @@ -791,7 +790,7 @@ static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master
>  }
>  
>  static inline struct iommu_sva *
> -arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
> +arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
>  {
>  	return ERR_PTR(-ENODEV);
>  }
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index 37d6218f173b..94deb58375f5 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -500,8 +500,7 @@ int intel_svm_unbind_gpasid(struct device *dev, u32 pasid)
>  	return ret;
>  }
>  
> -static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
> -				 unsigned int flags)
> +static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm)
>  {
>  	ioasid_t max_pasid = dev_is_pci(dev) ?
>  			pci_max_pasids(to_pci_dev(dev)) : intel_pasid_max_id;
> @@ -1002,20 +1001,20 @@ static irqreturn_t prq_event_thread(int irq, void *d)
>  	return IRQ_RETVAL(handled);
>  }
>  
> -struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
> +struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm)
>  {
>  	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
>  	struct iommu_sva *sva;
>  	int ret;
>  
>  	mutex_lock(&pasid_mutex);
> -	ret = intel_svm_alloc_pasid(dev, mm, flags);
> +	ret = intel_svm_alloc_pasid(dev, mm);
>  	if (ret) {
>  		mutex_unlock(&pasid_mutex);
>  		return ERR_PTR(ret);
>  	}
>  
> -	sva = intel_svm_bind_mm(iommu, dev, mm, flags);
> +	sva = intel_svm_bind_mm(iommu, dev, mm);
>  	if (IS_ERR_OR_NULL(sva))
>  		intel_svm_free_pasid(mm);
>  	mutex_unlock(&pasid_mutex);
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 107dcf5938d6..fef34879bc0c 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -3049,7 +3049,7 @@ EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
>   * On error, returns an ERR_PTR value.
>   */
>  struct iommu_sva *
> -iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
> +iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
>  {
>  	struct iommu_group *group;
>  	struct iommu_sva *handle = ERR_PTR(-EINVAL);
> @@ -3074,7 +3074,7 @@ iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
>  	if (iommu_group_device_count(group) != 1)
>  		goto out_unlock;
>  
> -	handle = ops->sva_bind(dev, mm, drvdata);
> +	handle = ops->sva_bind(dev, mm);
>  
>  out_unlock:
>  	mutex_unlock(&group->mutex);
> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> index 281c54003edc..3238a867ea51 100644
> --- a/drivers/misc/uacce/uacce.c
> +++ b/drivers/misc/uacce/uacce.c
> @@ -99,7 +99,7 @@ static int uacce_bind_queue(struct uacce_device *uacce, struct uacce_queue *q)
>  	if (!(uacce->flags & UACCE_DEV_SVA))
>  		return 0;
>  
> -	handle = iommu_sva_bind_device(uacce->parent, current->mm, NULL);
> +	handle = iommu_sva_bind_device(uacce->parent, current->mm);
>  	if (IS_ERR(handle))
>  		return PTR_ERR(handle);
>  
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 3f4c98f170ec..9dc855d7479d 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -777,8 +777,7 @@ extern int intel_svm_finish_prq(struct intel_iommu *iommu);
>  int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
>  			  struct iommu_gpasid_bind_data *data);
>  int intel_svm_unbind_gpasid(struct device *dev, u32 pasid);
> -struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
> -				 void *drvdata);
> +struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm);
>  void intel_svm_unbind(struct iommu_sva *handle);
>  u32 intel_svm_get_pasid(struct iommu_sva *handle);
>  int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index fb011722e4f8..b570b37181ad 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -294,9 +294,7 @@ struct iommu_ops {
>  	int (*aux_attach_dev)(struct iommu_domain *domain, struct device *dev);
>  	void (*aux_detach_dev)(struct iommu_domain *domain, struct device *dev);
>  	int (*aux_get_pasid)(struct iommu_domain *domain, struct device *dev);
> -
> -	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
> -				      void *drvdata);
> +	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm);
>  	void (*sva_unbind)(struct iommu_sva *handle);
>  	int (*attach_dev_pasid)(struct iommu_domain *domain,
>  				struct device *dev, ioasid_t id);
> @@ -705,8 +703,7 @@ void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev);
>  int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev);
>  
>  struct iommu_sva *iommu_sva_bind_device(struct device *dev,
> -					struct mm_struct *mm,
> -					void *drvdata);
> +					struct mm_struct *mm);
>  void iommu_sva_unbind_device(struct iommu_sva *handle);
>  u32 iommu_sva_get_pasid(struct iommu_sva *handle);
>  
> @@ -1065,7 +1062,7 @@ iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
>  }
>  
>  static inline struct iommu_sva *
> -iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
> +iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
>  {
>  	return NULL;
>  }
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-15 11:26   ` Jean-Philippe Brucker
@ 2022-03-15 11:49     ` Tian, Kevin
  2022-03-15 16:11       ` Jacob Pan
  2022-03-18 12:01       ` Lu Baolu
  0 siblings, 2 replies; 61+ messages in thread
From: Tian, Kevin @ 2022-03-15 11:49 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams,
	Dan J, Liu, Yi L

> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Sent: Tuesday, March 15, 2022 7:27 PM
> 
> On Mon, Mar 14, 2022 at 10:07:06PM -0700, Jacob Pan wrote:
> > From: Lu Baolu <baolu.lu@linux.intel.com>
> >
> > An IOMMU domain represents an address space which can be attached by
> > devices that perform DMA within a domain. However, for platforms with
> > PASID capability the domain attachment needs be handled at device+PASID
> > level. There can be multiple PASIDs within a device and multiple devices
> > attached to a given domain.
> > This patch introduces a new IOMMU op which support device, PASID, and
> > IOMMU domain attachment. The immediate use case is for PASID capable
> > devices to perform DMA under DMA APIs.
> >
> > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> >  include/linux/iommu.h | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 369f05c2a4e2..fde5b933dbe3 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
> >   * @aux_get_pasid: get the pasid given an aux-domain
> >   * @sva_bind: Bind process address space to device
> >   * @sva_unbind: Unbind process address space from device
> > + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> > + * @detach_dev_pasid: detach an iommu domain from a pasid of device
> 
> Isn't that operation "assign a PASID to a domain" instead?  In patch 5,
> the domain is already attached to the device, so set_domain_pasid() might
> be clearer and to the point. If the IOMMU driver did the allocation we
> could also avoid patch 1.

iiuc this API can also work for future SIOV usage where each mdev attached
to the domain has its own pasid. "assigning a PASID to a domain" sounds
like going back to the previous aux domain approach which has one PASID
per domain and that PASID is used on all devices attached to the aux domain...

> 
> If I understand correctly this series is not about a generic PASID API
> that allows drivers to manage multiple DMA address spaces, because there
> still doesn't seem to be any interest in that. It's about the specific
> IDXD use-case, so let's focus on that. We can introduce a specialized call
> such as (iommu|dma)_set_device_pasid(), which will be easy to consolidate
> later into a more generic "dma_enable_pasid()" API if that ever seems
> useful.
> 
> Thanks,
> Jean
> 
> >   * @sva_get_pasid: Get PASID associated to a SVA handle
> >   * @page_response: handle page request response
> >   * @cache_invalidate: invalidate translation caches
> > @@ -296,6 +298,10 @@ struct iommu_ops {
> >  	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct
> *mm,
> >  				      void *drvdata);
> >  	void (*sva_unbind)(struct iommu_sva *handle);
> > +	int (*attach_dev_pasid)(struct iommu_domain *domain,
> > +				struct device *dev, ioasid_t id);
> > +	void (*detach_dev_pasid)(struct iommu_domain *domain,
> > +				 struct device *dev, ioasid_t id);
> >  	u32 (*sva_get_pasid)(struct iommu_sva *handle);
> >
> >  	int (*page_response)(struct device *dev,
> > --
> > 2.25.1
> >

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15 11:16   ` Robin Murphy
@ 2022-03-15 14:22     ` Jason Gunthorpe
  2022-03-15 16:31       ` Jacob Pan
  2022-03-16  8:41       ` Tian, Kevin
  0 siblings, 2 replies; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-15 14:22 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jacob Pan, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Tian, Kevin, Tony Luck,
	Dave Jiang, Raj Ashok, Zanussi, Tom, Kumar, Sanjay K, Jacob Pan,
	Dan Williams

On Tue, Mar 15, 2022 at 11:16:41AM +0000, Robin Murphy wrote:
> On 2022-03-15 05:07, Jacob Pan wrote:
> > DMA mapping API is the de facto standard for in-kernel DMA. It operates
> > on a per device/RID basis which is not PASID-aware.
> > 
> > Some modern devices such as Intel Data Streaming Accelerator, PASID is
> > required for certain work submissions. To allow such devices use DMA
> > mapping API, we need the following functionalities:
> > 1. Provide device a way to retrieve a PASID for work submission within
> > the kernel
> > 2. Enable the kernel PASID on the IOMMU for the device
> > 3. Attach the kernel PASID to the device's default DMA domain, let it
> > be IOVA or physical address in case of pass-through.
> > 
> > This patch introduces a driver facing API that enables DMA API
> > PASID usage. Once enabled, device drivers can continue to use DMA APIs as
> > is. There is no difference in dma_handle between without PASID and with
> > PASID.
> 
> Surely the main point of PASIDs is to be able to use more than one
> of them?

IMHO, not for the DMA API.

I can't think of good reasons why a single in-kernel device should
require more than one iommu_domain for use by the DMA API. Even with
the SIOV cases we have been looking at we don't really see a use case
for more than one DMA API iommu_domain on a single physical device.
Do you know of something on the horizon?

From my view the main point of PASIDs is to assign iommu_domains that
are not used by the DMA API.

IMHO it is a device mis-design of IDXD to require all DMA be PASID
tagged. Devices should be able to do DMA on their RID when the PCI
function is controlled by a kernel driver. I see this driver facing
API as addressing a device quirk by aliasing the DMA API of the RID
into a PASID and that is really all it is good for.

In any case I think we are better to wait for an actual user for multi
DMA API iommu_domains to come forward before we try to build an API
for it.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15  5:07 ` [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops Jacob Pan
  2022-03-15 10:33   ` Tian, Kevin
@ 2022-03-15 14:33   ` Jason Gunthorpe
  2022-03-15 22:36     ` Jacob Pan
  2022-03-16  7:41     ` Tian, Kevin
  2022-03-16  7:39   ` Tian, Kevin
  2 siblings, 2 replies; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-15 14:33 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu

On Mon, Mar 14, 2022 at 10:07:07PM -0700, Jacob Pan wrote:
> +	/*
> +	 * Each domain could have multiple devices attached with shared or per
> +	 * device PASIDs. At the domain level, we keep track of unique PASIDs and
> +	 * device user count.
> +	 * E.g. If a domain has two devices attached, device A has PASID 0, 1;
> +	 * device B has PASID 0, 2. Then the domain would have PASID 0, 1, 2.
> +	 */

A 2d array of xarray's seems like a poor data structure for this task.

AFACIT this wants to store a list of (device, pasid) tuples, so a
simple linked list, 1d xarray vector or a red black tree seems more
appropriate..

> +	if (entry) {
> +		pinfo = entry;
> +	} else {
> +		pinfo = kzalloc(sizeof(*pinfo), GFP_ATOMIC);
> +		if (!pinfo)
> +			return -ENOMEM;
> +		pinfo->pasid = pasid;
> +		/* Store the new PASID info in the per domain array */
> +		ret = xa_err(__xa_store(&dmar_domain->pasids, pasid, pinfo,
> +			     GFP_ATOMIC));
> +		if (ret)
> +			goto xa_store_err;
> +	}
> +	/* Store PASID in per device-domain array, this is for tracking devTLB */
> +	ret = xa_err(xa_store(&info->pasids, pasid, pinfo, GFP_ATOMIC));
> +	if (ret)
> +		goto xa_store_err;
> +
> +	atomic_inc(&pinfo->users);
> +	xa_unlock(&dmar_domain->pasids);
> +
> +	return 0;
> +
> +xa_store_err:
> +	xa_unlock(&dmar_domain->pasids);
> +	spin_lock_irqsave(&iommu->lock, flags);
> +	intel_pasid_tear_down_entry(iommu, dev, pasid, false);
> +	spin_unlock_irqrestore(&iommu->lock, flags);
> +
> +	if (!atomic_read(&pinfo->users)) {
> +		__xa_erase(&dmar_domain->pasids, pasid);

This isn't locked right

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15  5:07 ` [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users Jacob Pan
  2022-03-15 11:16   ` Robin Murphy
@ 2022-03-15 14:35   ` Jason Gunthorpe
  2022-03-15 16:38     ` Jacob Pan
  2022-03-18 12:43   ` Lu Baolu
  2 siblings, 1 reply; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-15 14:35 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu

On Mon, Mar 14, 2022 at 10:07:09PM -0700, Jacob Pan wrote:
> DMA mapping API is the de facto standard for in-kernel DMA. It operates
> on a per device/RID basis which is not PASID-aware.
> 
> Some modern devices such as Intel Data Streaming Accelerator, PASID is
> required for certain work submissions. To allow such devices use DMA
> mapping API, we need the following functionalities:
> 1. Provide device a way to retrieve a PASID for work submission within
> the kernel
> 2. Enable the kernel PASID on the IOMMU for the device
> 3. Attach the kernel PASID to the device's default DMA domain, let it
> be IOVA or physical address in case of pass-through.
> 
> This patch introduces a driver facing API that enables DMA API
> PASID usage. Once enabled, device drivers can continue to use DMA APIs as
> is. There is no difference in dma_handle between without PASID and with
> PASID.
> 
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>  drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++++++++++++++++++++
>  include/linux/dma-iommu.h |  7 +++++
>  include/linux/iommu.h     |  9 ++++++
>  3 files changed, 81 insertions(+)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index b22034975301..d0ff1a34b1b6 100644
> +++ b/drivers/iommu/dma-iommu.c
> @@ -39,6 +39,8 @@ enum iommu_dma_cookie_type {
>  	IOMMU_DMA_MSI_COOKIE,
>  };
>  
> +static DECLARE_IOASID_SET(iommu_dma_pasid);
> +
>  struct iommu_dma_cookie {
>  	enum iommu_dma_cookie_type	type;
>  	union {
> @@ -370,6 +372,69 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
>  	domain->iova_cookie = NULL;
>  }
>  
> +/**
> + * iommu_enable_pasid_dma --Enable in-kernel DMA request with PASID
> + * @dev:	Device to be enabled
> + *
> + * DMA request with PASID will be mapped the same way as the legacy DMA.
> + * If the device is in pass-through, PASID will also pass-through. If the
> + * device is in IOVA map, the supervisor PASID will point to the same IOVA
> + * page table.
> + *
> + * @return the kernel PASID to be used for DMA or INVALID_IOASID on failure
> + */
> +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid)
> +{
> +	struct iommu_domain *dom;
> +	ioasid_t id, max;
> +	int ret;
> +
> +	dom = iommu_get_domain_for_dev(dev);
> +	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
> +		return -ENODEV;

Given the purpose of this API I think it should assert that the device
has the DMA API in-use using the machinery from the other series.

ie this should not be used to clone non-DMA API iommu_domains..

> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 24607dc3c2ac..e6cb9b52a420 100644
> +++ b/include/linux/dma-iommu.h
> @@ -18,6 +18,13 @@ int iommu_get_dma_cookie(struct iommu_domain *domain);
>  int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
>  void iommu_put_dma_cookie(struct iommu_domain *domain);
>  
> +/*
> + * For devices that can do DMA request with PASID, setup a system PASID.
> + * Address modes (IOVA, PA) are selected by the platform code.
> + */
> +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid);
> +void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid);

The functions already have a kdoc, don't need two..

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 0/8] Enable PASID for DMA API users
  2022-03-15  8:16 ` [PATCH v2 0/8] Enable PASID for DMA API users Tian, Kevin
@ 2022-03-15 15:49   ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-15 15:49 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams,
	Dan J, Liu, Yi L, jacob.jun.pan

Hi Kevin,

On Tue, 15 Mar 2022 08:16:59 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Tuesday, March 15, 2022 1:07 PM
> > 
> > Some modern accelerators such as Intel's Data Streaming Accelerator
> > (DSA) require PASID in DMA requests to be operational. Specifically,
> > the work submissions with ENQCMD on shared work queues require PASIDs.
> > The use cases
> > include both user DMA with shared virtual addressing (SVA) and in-kernel
> > DMA similar to legacy DMA w/o PASID. Here we address the latter.
> > 
> > DMA mapping API is the de facto standard for in-kernel DMA. However, it
> > operates on a per device or Requester ID(RID) basis which is not
> > PASID-aware. To leverage DMA API for devices relies on PASIDs, this
> > patchset introduces the following APIs
> > 
> > 1. A driver facing API that enables DMA API PASID usage:
> > iommu_enable_pasid_dma(struct device *dev, ioasid_t &pasid);  
> 
> Should this be called dma_enable_pasid() since it's about DMA API? Doing
> so also avoids the driver to include iommu.h.
> 
PASID is still tied to IOMMU, drivers who wants to use this must explicitly
put dependency on IOMMU. So I prefer not to give that illusion.

> Thanks
> Kevin


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-15 11:49     ` Tian, Kevin
@ 2022-03-15 16:11       ` Jacob Pan
  2022-03-18 12:01       ` Lu Baolu
  1 sibling, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-15 16:11 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Jean-Philippe Brucker, iommu, LKML, Joerg Roedel,
	Jason Gunthorpe, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok, Kumar,
	Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams, Dan J,
	Liu, Yi L, jacob.jun.pan

Hi Kevin,

On Tue, 15 Mar 2022 11:49:57 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Sent: Tuesday, March 15, 2022 7:27 PM
> > 
> > On Mon, Mar 14, 2022 at 10:07:06PM -0700, Jacob Pan wrote:  
> > > From: Lu Baolu <baolu.lu@linux.intel.com>
> > >
> > > An IOMMU domain represents an address space which can be attached by
> > > devices that perform DMA within a domain. However, for platforms with
> > > PASID capability the domain attachment needs be handled at
> > > device+PASID level. There can be multiple PASIDs within a device and
> > > multiple devices attached to a given domain.
> > > This patch introduces a new IOMMU op which support device, PASID, and
> > > IOMMU domain attachment. The immediate use case is for PASID capable
> > > devices to perform DMA under DMA APIs.
> > >
> > > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > ---
> > >  include/linux/iommu.h | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > index 369f05c2a4e2..fde5b933dbe3 100644
> > > --- a/include/linux/iommu.h
> > > +++ b/include/linux/iommu.h
> > > @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
> > >   * @aux_get_pasid: get the pasid given an aux-domain
> > >   * @sva_bind: Bind process address space to device
> > >   * @sva_unbind: Unbind process address space from device
> > > + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> > > + * @detach_dev_pasid: detach an iommu domain from a pasid of device  
> > 
> > Isn't that operation "assign a PASID to a domain" instead?  In patch 5,
> > the domain is already attached to the device, so set_domain_pasid()
> > might be clearer and to the point. If the IOMMU driver did the
> > allocation we could also avoid patch 1.
I agree, we could let vendor driver do the allocation inside this op. On
the other side, we could also keep the flexibility such that this op can be
used for guest PASID bind, with a SVA domain.
> 
> iiuc this API can also work for future SIOV usage where each mdev attached
> to the domain has its own pasid. "assigning a PASID to a domain" sounds
> like going back to the previous aux domain approach which has one PASID
> per domain and that PASID is used on all devices attached to the aux
> domain...
> 
yes, that is the intention. I plan to lift the requirement in patch 5 such
that device attachment will not be a prerequisite. That would be after mdev
adoption.

> > 
> > If I understand correctly this series is not about a generic PASID API
> > that allows drivers to manage multiple DMA address spaces, because there
> > still doesn't seem to be any interest in that. It's about the specific
> > IDXD use-case, so let's focus on that. We can introduce a specialized
> > call such as (iommu|dma)_set_device_pasid(), which will be easy to
> > consolidate later into a more generic "dma_enable_pasid()" API if that
> > ever seems useful.
> > 
Right, at the moment it is still a single address space. i.e. the current
domain of the device/group.

But this limitation is at the driver facing API layer not limited in the
IOMMU ops.

> > Thanks,
> > Jean
> >   
> > >   * @sva_get_pasid: Get PASID associated to a SVA handle
> > >   * @page_response: handle page request response
> > >   * @cache_invalidate: invalidate translation caches
> > > @@ -296,6 +298,10 @@ struct iommu_ops {
> > >  	struct iommu_sva *(*sva_bind)(struct device *dev, struct
> > > mm_struct  
> > *mm,  
> > >  				      void *drvdata);
> > >  	void (*sva_unbind)(struct iommu_sva *handle);
> > > +	int (*attach_dev_pasid)(struct iommu_domain *domain,
> > > +				struct device *dev, ioasid_t id);
> > > +	void (*detach_dev_pasid)(struct iommu_domain *domain,
> > > +				 struct device *dev, ioasid_t id);
> > >  	u32 (*sva_get_pasid)(struct iommu_sva *handle);
> > >
> > >  	int (*page_response)(struct device *dev,
> > > --
> > > 2.25.1
> > >  


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15 14:22     ` Jason Gunthorpe
@ 2022-03-15 16:31       ` Jacob Pan
  2022-03-15 17:05         ` Jason Gunthorpe
  2022-03-16  8:41       ` Tian, Kevin
  1 sibling, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15 16:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Tian, Kevin, Tony Luck,
	Dave Jiang, Raj Ashok, Zanussi, Tom, Kumar, Sanjay K, Jacob Pan,
	Dan Williams, jacob.jun.pan

Hi Jason,

On Tue, 15 Mar 2022 11:22:16 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Mar 15, 2022 at 11:16:41AM +0000, Robin Murphy wrote:
> > On 2022-03-15 05:07, Jacob Pan wrote:  
> > > DMA mapping API is the de facto standard for in-kernel DMA. It
> > > operates on a per device/RID basis which is not PASID-aware.
> > > 
> > > Some modern devices such as Intel Data Streaming Accelerator, PASID is
> > > required for certain work submissions. To allow such devices use DMA
> > > mapping API, we need the following functionalities:
> > > 1. Provide device a way to retrieve a PASID for work submission within
> > > the kernel
> > > 2. Enable the kernel PASID on the IOMMU for the device
> > > 3. Attach the kernel PASID to the device's default DMA domain, let it
> > > be IOVA or physical address in case of pass-through.
> > > 
> > > This patch introduces a driver facing API that enables DMA API
> > > PASID usage. Once enabled, device drivers can continue to use DMA
> > > APIs as is. There is no difference in dma_handle between without
> > > PASID and with PASID.  
> > 
> > Surely the main point of PASIDs is to be able to use more than one
> > of them?  
> 
> IMHO, not for the DMA API.
> 
Right, but we really need two here. One for DMA request w/o PASID (PASID 0)
and a kernel PASID for DMA request tagged w/ PASID.
Since DMA API is not per process, there is no need for more right now.

> I can't think of good reasons why a single in-kernel device should
> require more than one iommu_domain for use by the DMA API. Even with
> the SIOV cases we have been looking at we don't really see a use case
> for more than one DMA API iommu_domain on a single physical device.
> Do you know of something on the horizon?
> 
Not that I know.

> From my view the main point of PASIDs is to assign iommu_domains that
> are not used by the DMA API.
> 
Right, DMA API default to PASID 0. But IDXD device cannot use PASID 0 for
enqcmds.

> IMHO it is a device mis-design of IDXD to require all DMA be PASID
> tagged. Devices should be able to do DMA on their RID when the PCI
IDXD can do DMA w/ RID, the PASID requirement is only for shared WQ where
ENQCMDS is used. ENQCMDS has the benefit of avoiding locking where work
submission is done from multiple CPUs.
Tony, Dave?

> function is controlled by a kernel driver. I see this driver facing
> API as addressing a device quirk by aliasing the DMA API of the RID
> into a PASID and that is really all it is good for.
> 
> In any case I think we are better to wait for an actual user for multi
> DMA API iommu_domains to come forward before we try to build an API
> for it.
> 
What would you recommend in the interim?

Shall we let VT-d driver set up a special global PASID for DMA API? Then
IDXD driver can retrieve it somehow? But that still needs an API similar to
what I did in the previous version where PASID #1 was used.

Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15 14:35   ` Jason Gunthorpe
@ 2022-03-15 16:38     ` Jacob Pan
  2022-03-15 23:05       ` Jason Gunthorpe
  0 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15 16:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu, jacob.jun.pan

Hi Jason,

On Tue, 15 Mar 2022 11:35:35 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Mar 14, 2022 at 10:07:09PM -0700, Jacob Pan wrote:
> > DMA mapping API is the de facto standard for in-kernel DMA. It operates
> > on a per device/RID basis which is not PASID-aware.
> > 
> > Some modern devices such as Intel Data Streaming Accelerator, PASID is
> > required for certain work submissions. To allow such devices use DMA
> > mapping API, we need the following functionalities:
> > 1. Provide device a way to retrieve a PASID for work submission within
> > the kernel
> > 2. Enable the kernel PASID on the IOMMU for the device
> > 3. Attach the kernel PASID to the device's default DMA domain, let it
> > be IOVA or physical address in case of pass-through.
> > 
> > This patch introduces a driver facing API that enables DMA API
> > PASID usage. Once enabled, device drivers can continue to use DMA APIs
> > as is. There is no difference in dma_handle between without PASID and
> > with PASID.
> > 
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >  drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++++++++++++++++++++
> >  include/linux/dma-iommu.h |  7 +++++
> >  include/linux/iommu.h     |  9 ++++++
> >  3 files changed, 81 insertions(+)
> > 
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index b22034975301..d0ff1a34b1b6 100644
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -39,6 +39,8 @@ enum iommu_dma_cookie_type {
> >  	IOMMU_DMA_MSI_COOKIE,
> >  };
> >  
> > +static DECLARE_IOASID_SET(iommu_dma_pasid);
> > +
> >  struct iommu_dma_cookie {
> >  	enum iommu_dma_cookie_type	type;
> >  	union {
> > @@ -370,6 +372,69 @@ void iommu_put_dma_cookie(struct iommu_domain
> > *domain) domain->iova_cookie = NULL;
> >  }
> >  
> > +/**
> > + * iommu_enable_pasid_dma --Enable in-kernel DMA request with PASID
> > + * @dev:	Device to be enabled
> > + *
> > + * DMA request with PASID will be mapped the same way as the legacy
> > DMA.
> > + * If the device is in pass-through, PASID will also pass-through. If
> > the
> > + * device is in IOVA map, the supervisor PASID will point to the same
> > IOVA
> > + * page table.
> > + *
> > + * @return the kernel PASID to be used for DMA or INVALID_IOASID on
> > failure
> > + */
> > +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid)
> > +{
> > +	struct iommu_domain *dom;
> > +	ioasid_t id, max;
> > +	int ret;
> > +
> > +	dom = iommu_get_domain_for_dev(dev);
> > +	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
> > +		return -ENODEV;  
> 
> Given the purpose of this API I think it should assert that the device
> has the DMA API in-use using the machinery from the other series.
> 
> ie this should not be used to clone non-DMA API iommu_domains..
> 
Let me try to confirm the specific here. I should check domain type and
rejects all others except IOMMU_DOMAIN_DMA type, right? Should also allow
IOMMU_DOMAIN_IDENTITY.

That makes sense.

> > diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> > index 24607dc3c2ac..e6cb9b52a420 100644
> > +++ b/include/linux/dma-iommu.h
> > @@ -18,6 +18,13 @@ int iommu_get_dma_cookie(struct iommu_domain
> > *domain); int iommu_get_msi_cookie(struct iommu_domain *domain,
> > dma_addr_t base); void iommu_put_dma_cookie(struct iommu_domain
> > *domain); 
> > +/*
> > + * For devices that can do DMA request with PASID, setup a system
> > PASID.
> > + * Address modes (IOVA, PA) are selected by the platform code.
> > + */
> > +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid);
> > +void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid);  
> 
> The functions already have a kdoc, don't need two..
> 
Good point!

Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15 16:31       ` Jacob Pan
@ 2022-03-15 17:05         ` Jason Gunthorpe
  2022-03-15 21:24           ` Jacob Pan
  0 siblings, 1 reply; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-15 17:05 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Robin Murphy, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Tian, Kevin, Tony Luck,
	Dave Jiang, Raj Ashok, Zanussi, Tom, Kumar, Sanjay K, Jacob Pan,
	Dan Williams

On Tue, Mar 15, 2022 at 09:31:35AM -0700, Jacob Pan wrote:

> > IMHO it is a device mis-design of IDXD to require all DMA be PASID
> > tagged. Devices should be able to do DMA on their RID when the PCI

> IDXD can do DMA w/ RID, the PASID requirement is only for shared WQ where
> ENQCMDS is used. ENQCMDS has the benefit of avoiding locking where work
> submission is done from multiple CPUs.
> Tony, Dave?

This is what I mean, it has an operating mode you want to use from the
kernel driver that cannot do RID DMA. It is a HW mis-design, IMHO.

Something like PASID0 in the ENQCMDS should have triggered RID DMA.

> > In any case I think we are better to wait for an actual user for multi
> > DMA API iommu_domains to come forward before we try to build an API
> > for it.
> 
> What would you recommend in the interim?

Oh, I mean this approach at a high level is fine - I was saying we
shouldn't try to broaden it like Robin was suggesting without a driver
that needs multiple iommu_domains for the DMA API.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15 17:05         ` Jason Gunthorpe
@ 2022-03-15 21:24           ` Jacob Pan
  2022-03-16 10:32             ` Tian, Kevin
  0 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15 21:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Tian, Kevin, Tony Luck,
	Dave Jiang, Raj Ashok, Zanussi, Tom, Kumar, Sanjay K, Jacob Pan,
	Dan Williams, jacob.jun.pan

Hi Jason,

On Tue, 15 Mar 2022 14:05:07 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Mar 15, 2022 at 09:31:35AM -0700, Jacob Pan wrote:
> 
> > > IMHO it is a device mis-design of IDXD to require all DMA be PASID
> > > tagged. Devices should be able to do DMA on their RID when the PCI  
> 
> > IDXD can do DMA w/ RID, the PASID requirement is only for shared WQ
> > where ENQCMDS is used. ENQCMDS has the benefit of avoiding locking
> > where work submission is done from multiple CPUs.
> > Tony, Dave?  
> 
> This is what I mean, it has an operating mode you want to use from the
> kernel driver that cannot do RID DMA. It is a HW mis-design, IMHO.
> 
> Something like PASID0 in the ENQCMDS should have triggered RID DMA.
> 
That would simplify things a lot, it would be just a device change I think.
From IA pov, only ENQCMD will #GP if PASID==0. I will bring this back to HW
team to consider for future generations.

> > > In any case I think we are better to wait for an actual user for multi
> > > DMA API iommu_domains to come forward before we try to build an API
> > > for it.  
> > 
> > What would you recommend in the interim?  
> 
> Oh, I mean this approach at a high level is fine - I was saying we
> shouldn't try to broaden it like Robin was suggesting without a driver
> that needs multiple iommu_domains for the DMA API.
> 
Got it. Thanks for the clarification.

> Jason


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15 10:33   ` Tian, Kevin
@ 2022-03-15 22:23     ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-15 22:23 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams,
	Dan J, Liu, Yi L, jacob.jun.pan

Hi Kevin,

On Tue, 15 Mar 2022 10:33:08 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Tuesday, March 15, 2022 1:07 PM
> > 
> > On VT-d platforms with scalable mode enabled, devices issue DMA requests
> > with PASID need to attach to the correct IOMMU domains.
> > The attach operation involves the following:
> > - programming the PASID into device's PASID table
> > - tracking device domain and the PASID relationship
> > - managing IOTLB and device TLB invalidations
> > 
> > This patch extends DMAR domain and device domain info with xarrays to
> > track per domain and per device PASIDs.  It provides the flexibility to
> > be used beyond DMA API PASID support.
> > 
> > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> >  drivers/iommu/intel/iommu.c | 194
> > +++++++++++++++++++++++++++++++++++-
> >  include/linux/intel-iommu.h |  12 ++-
> >  2 files changed, 202 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 881f8361eca2..9267194eaed3 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1622,20 +1622,48 @@ static void __iommu_flush_dev_iotlb(struct
> > device_domain_info *info,
> >  			   qdep, addr, mask);
> >  }
> > 
> > +static void __iommu_flush_dev_piotlb(struct device_domain_info *info,  
> 
> piotlb is confusing, better be:
> 
> 	__iommu_flush_dev_iotlb_pasid()
> 
yeah, that is more clear.

> > +					u64 address,
> > +				     ioasid_t pasid, unsigned int mask)
> > +{
> > +	u16 sid, qdep;
> > +
> > +	if (!info || !info->ats_enabled)
> > +		return;
> > +
> > +	sid = info->bus << 8 | info->devfn;
> > +	qdep = info->ats_qdep;
> > +	qi_flush_dev_iotlb_pasid(info->iommu, sid, info->pfsid,
> > +				 pasid, qdep, address, mask);
> > +}
> > +
> >  static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
> >  				  u64 addr, unsigned mask)
> >  {
> >  	unsigned long flags;
> >  	struct device_domain_info *info;
> >  	struct subdev_domain_info *sinfo;
> > +	unsigned long pasid;
> > +	struct pasid_info *pinfo;
> > 
> >  	if (!domain->has_iotlb_device)
> >  		return;
> > 
> >  	spin_lock_irqsave(&device_domain_lock, flags);
> > -	list_for_each_entry(info, &domain->devices, link)
> > -		__iommu_flush_dev_iotlb(info, addr, mask);
> > -
> > +	list_for_each_entry(info, &domain->devices, link) {
> > +		/*
> > +		 * We cannot use PASID based devTLB invalidation on
> > RID2PASID
> > +		 * Device does not understand RID2PASID/0. This is
> > different  
> 
> Lack of a conjunction word between 'RID2PASID' and 'Device'.
> 
> and what is RID2PASID/0? It would be clearer to point out that RID2PASID
> is visible only within the iommu to mark out requests without PASID, 
> thus this PASID value should never be sent to the device side.
> 
Good point, will do.

> > +		 * than IOTLB invalidation where RID2PASID is also
> > used for
> > +		 * tagging.  
> 
> Then it would be obvious because IOTLB is iommu internal agent thus takes 
> use of RID2PASID for tagging.
> 
ditto

> > +		 */
> > +		xa_for_each(&info->pasids, pasid, pinfo) {
> > +			if (!pasid)  
> 
> this should be compared to PASID_RID2PASID (though it's zero today).
> 
ditto

> > +				__iommu_flush_dev_iotlb(info, addr,
> > mask);
> > +			else
> > +				__iommu_flush_dev_piotlb(info, addr,
> > pasid, mask);
> > +		}
> > +	}
> >  	list_for_each_entry(sinfo, &domain->subdevices, link_domain) {
> >  		info = get_domain_info(sinfo->pdev);
> >  		__iommu_flush_dev_iotlb(info, addr, mask);  
> 
> Thanks
> Kevin


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15 14:33   ` Jason Gunthorpe
@ 2022-03-15 22:36     ` Jacob Pan
  2022-03-15 23:04       ` Jason Gunthorpe
  2022-03-16  7:41     ` Tian, Kevin
  1 sibling, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-15 22:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu, jacob.jun.pan

Hi Jason,

On Tue, 15 Mar 2022 11:33:22 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Mar 14, 2022 at 10:07:07PM -0700, Jacob Pan wrote:
> > +	/*
> > +	 * Each domain could have multiple devices attached with
> > shared or per
> > +	 * device PASIDs. At the domain level, we keep track of unique
> > PASIDs and
> > +	 * device user count.
> > +	 * E.g. If a domain has two devices attached, device A has
> > PASID 0, 1;
> > +	 * device B has PASID 0, 2. Then the domain would have PASID
> > 0, 1, 2.
> > +	 */  
> 
> A 2d array of xarray's seems like a poor data structure for this task.
> 
> AFACIT this wants to store a list of (device, pasid) tuples, so a
> simple linked list, 1d xarray vector or a red black tree seems more
> appropriate..
> 
Agreed.
It might need some surgery for dmar_domain and device_domain_info, which
already has a simple device list. I am trying to leverage the existing data
struct, let me take a closer look.

> > +	if (entry) {
> > +		pinfo = entry;
> > +	} else {
> > +		pinfo = kzalloc(sizeof(*pinfo), GFP_ATOMIC);
> > +		if (!pinfo)
> > +			return -ENOMEM;
> > +		pinfo->pasid = pasid;
> > +		/* Store the new PASID info in the per domain array */
> > +		ret = xa_err(__xa_store(&dmar_domain->pasids, pasid,
> > pinfo,
> > +			     GFP_ATOMIC));
> > +		if (ret)
> > +			goto xa_store_err;
> > +	}
> > +	/* Store PASID in per device-domain array, this is for
> > tracking devTLB */
> > +	ret = xa_err(xa_store(&info->pasids, pasid, pinfo,
> > GFP_ATOMIC));
> > +	if (ret)
> > +		goto xa_store_err;
> > +
> > +	atomic_inc(&pinfo->users);
> > +	xa_unlock(&dmar_domain->pasids);
> > +
> > +	return 0;
> > +
> > +xa_store_err:
> > +	xa_unlock(&dmar_domain->pasids);
> > +	spin_lock_irqsave(&iommu->lock, flags);
> > +	intel_pasid_tear_down_entry(iommu, dev, pasid, false);
> > +	spin_unlock_irqrestore(&iommu->lock, flags);
> > +
> > +	if (!atomic_read(&pinfo->users)) {
> > +		__xa_erase(&dmar_domain->pasids, pasid);  
> 
> This isn't locked right
> 
good catch! need to move under xa_unlock.

Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15 22:36     ` Jacob Pan
@ 2022-03-15 23:04       ` Jason Gunthorpe
  2022-03-16 20:50         ` Jacob Pan
  0 siblings, 1 reply; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-15 23:04 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu

On Tue, Mar 15, 2022 at 03:36:20PM -0700, Jacob Pan wrote:
> Hi Jason,
> 
> On Tue, 15 Mar 2022 11:33:22 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Mon, Mar 14, 2022 at 10:07:07PM -0700, Jacob Pan wrote:
> > > +	/*
> > > +	 * Each domain could have multiple devices attached with
> > > shared or per
> > > +	 * device PASIDs. At the domain level, we keep track of unique
> > > PASIDs and
> > > +	 * device user count.
> > > +	 * E.g. If a domain has two devices attached, device A has
> > > PASID 0, 1;
> > > +	 * device B has PASID 0, 2. Then the domain would have PASID
> > > 0, 1, 2.
> > > +	 */  
> > 
> > A 2d array of xarray's seems like a poor data structure for this task.
> > 
> > AFACIT this wants to store a list of (device, pasid) tuples, so a
> > simple linked list, 1d xarray vector or a red black tree seems more
> > appropriate..
> > 
> Agreed.
> It might need some surgery for dmar_domain and device_domain_info, which
> already has a simple device list. I am trying to leverage the existing data
> struct, let me take a closer look.

Maybe the core code should provide this data structure in the
iommu_domain.

Figuring out what stuff is attached is something every driver has to
do right?

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15 16:38     ` Jacob Pan
@ 2022-03-15 23:05       ` Jason Gunthorpe
  0 siblings, 0 replies; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-15 23:05 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu

On Tue, Mar 15, 2022 at 09:38:10AM -0700, Jacob Pan wrote:
> > > +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid)
> > > +{
> > > +	struct iommu_domain *dom;
> > > +	ioasid_t id, max;
> > > +	int ret;
> > > +
> > > +	dom = iommu_get_domain_for_dev(dev);
> > > +	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
> > > +		return -ENODEV;  
> > 
> > Given the purpose of this API I think it should assert that the device
> > has the DMA API in-use using the machinery from the other series.
> > 
> > ie this should not be used to clone non-DMA API iommu_domains..
> > 
> Let me try to confirm the specific here. I should check domain type and
> rejects all others except IOMMU_DOMAIN_DMA type, right? Should also allow
> IOMMU_DOMAIN_IDENTITY.
> 
> That makes sense.

That is one way, the other is to check the new group data that Lu's
patch added.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15  5:07 ` [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops Jacob Pan
  2022-03-15 10:33   ` Tian, Kevin
  2022-03-15 14:33   ` Jason Gunthorpe
@ 2022-03-16  7:39   ` Tian, Kevin
  2022-03-16 20:51     ` Jacob Pan
  2 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-16  7:39 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Pan, Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck,
	Tony, Zanussi, Tom, Williams, Dan J, Liu, Yi L

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Tuesday, March 15, 2022 1:07 PM
> +static int intel_iommu_attach_dev_pasid(struct iommu_domain *domain,
> +					struct device *dev, ioasid_t pasid)
> +{
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +	struct device_domain_info *info = get_domain_info(dev);
> +	struct intel_iommu *iommu = info->iommu;
> +	struct pasid_info *pinfo;
> +	unsigned long flags;
> +	int ret = 0;
> +	void *entry;
> +
> +	if (!info)
> +		return -ENODEV;

btw this interface only works in scalable mode. Lack of a check to
return error on legacy mode here.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15 14:33   ` Jason Gunthorpe
  2022-03-15 22:36     ` Jacob Pan
@ 2022-03-16  7:41     ` Tian, Kevin
  2022-03-16 21:01       ` Jacob Pan
  1 sibling, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-16  7:41 UTC (permalink / raw)
  To: Jason Gunthorpe, Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok, Kumar,
	Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams, Dan J,
	Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 15, 2022 10:33 PM
> 
> On Mon, Mar 14, 2022 at 10:07:07PM -0700, Jacob Pan wrote:
> > +	/*
> > +	 * Each domain could have multiple devices attached with shared or
> per
> > +	 * device PASIDs. At the domain level, we keep track of unique PASIDs
> and
> > +	 * device user count.
> > +	 * E.g. If a domain has two devices attached, device A has PASID 0, 1;
> > +	 * device B has PASID 0, 2. Then the domain would have PASID 0, 1, 2.
> > +	 */
> 
> A 2d array of xarray's seems like a poor data structure for this task.

besides that it also doesn't work when we support per-device PASID allocation
in the future. In that case merging device PASIDs together is conceptually
wrong.

> 
> AFACIT this wants to store a list of (device, pasid) tuples, so a
> simple linked list, 1d xarray vector or a red black tree seems more
> appropriate..
> 

this tuple can well serve per-device PASID. 😊

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 4/8] iommu/vt-d: Use device_pasid attach op for RID2PASID
  2022-03-15  5:07 ` [PATCH v2 4/8] iommu/vt-d: Use device_pasid attach op for RID2PASID Jacob Pan
@ 2022-03-16  7:54   ` Tian, Kevin
  2022-03-17 20:45     ` Jacob Pan
  0 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-16  7:54 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Pan, Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck,
	Tony, Zanussi, Tom, Williams, Dan J, Liu, Yi L

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Tuesday, March 15, 2022 1:07 PM
> 
> With the availability of a generic device-PASID-domain attachment API,
> there's no need to special case RID2PASID.  Use the API to replace
> duplicated code.
> 
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  drivers/iommu/intel/iommu.c | 18 ++----------------
>  1 file changed, 2 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 9267194eaed3..f832b7599d21 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1683,9 +1683,6 @@ static void domain_flush_piotlb(struct
> intel_iommu *iommu,
>  		qi_flush_piotlb(iommu, did, domain->default_pasid,
>  				addr, npages, ih);
> 
> -	if (!list_empty(&domain->devices))
> -		qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr, npages,
> ih);
> -

this should be rebased on top of Baolu's "iommu cleanup and refactoring"
series which has removed the entire domain_flush_piotlb().

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15 14:22     ` Jason Gunthorpe
  2022-03-15 16:31       ` Jacob Pan
@ 2022-03-16  8:41       ` Tian, Kevin
  2022-03-16 14:07         ` Jason Gunthorpe
  1 sibling, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-16  8:41 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy
  Cc: Jacob Pan, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Luck, Tony, Jiang, Dave, Raj,
	Ashok, Zanussi, Tom, Kumar, Sanjay K, Pan, Jacob jun, Williams,
	Dan J

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 15, 2022 10:22 PM
> 
> On Tue, Mar 15, 2022 at 11:16:41AM +0000, Robin Murphy wrote:
> > On 2022-03-15 05:07, Jacob Pan wrote:
> > > DMA mapping API is the de facto standard for in-kernel DMA. It operates
> > > on a per device/RID basis which is not PASID-aware.
> > >
> > > Some modern devices such as Intel Data Streaming Accelerator, PASID is
> > > required for certain work submissions. To allow such devices use DMA
> > > mapping API, we need the following functionalities:
> > > 1. Provide device a way to retrieve a PASID for work submission within
> > > the kernel
> > > 2. Enable the kernel PASID on the IOMMU for the device
> > > 3. Attach the kernel PASID to the device's default DMA domain, let it
> > > be IOVA or physical address in case of pass-through.
> > >
> > > This patch introduces a driver facing API that enables DMA API
> > > PASID usage. Once enabled, device drivers can continue to use DMA APIs
> as
> > > is. There is no difference in dma_handle between without PASID and with
> > > PASID.
> >
> > Surely the main point of PASIDs is to be able to use more than one
> > of them?
> 
> IMHO, not for the DMA API.
> 
> I can't think of good reasons why a single in-kernel device should
> require more than one iommu_domain for use by the DMA API. Even with
> the SIOV cases we have been looking at we don't really see a use case
> for more than one DMA API iommu_domain on a single physical device.

This is correct.

> Do you know of something on the horizon?
> 
> From my view the main point of PASIDs is to assign iommu_domains that
> are not used by the DMA API.
> 
> IMHO it is a device mis-design of IDXD to require all DMA be PASID
> tagged. Devices should be able to do DMA on their RID when the PCI
> function is controlled by a kernel driver. I see this driver facing
> API as addressing a device quirk by aliasing the DMA API of the RID
> into a PASID and that is really all it is good for.

One clarification as I don't agree it is a mis-design.

The IDXD device does support RID-based DMA as the base. PASID-based
DMA API comes to play in two scenarios:

1) When the kernel wants a more scalable way of using IDXD e.g. having
multiple CPUs simultaneously submitting works in a lockless way to a 
shared work queue via a new instruction (ENQCMD) which carries PASID.
Having PASID in the instruction is to identify multiple CPU address spaces 
attached to the shared queue (as required by user SVA). For kernel use 
of IDXD (e.g. page zeroing) in this series all CPUs are accessing the same
IOVA space associated with DMA API thus just requires one PASID attached
to DMA API to use ENQCMD in this optimized mode.

2) When the host wants to share a workqueue between multiple VMs.
In that case the virtual IDXD device exposed to each VM will only support
the shared workqueue mode. Only in this case the DMA API in the
guest must be attached by a PASID as ENQCMD is the only way to submit
works.

So it's just a trick to enable a more scalable use of the IDXD device. Other
than that IDXD does support normal RID-based DMA API when its work
queues are configured in dedicated mode. 😊

> 
> In any case I think we are better to wait for an actual user for multi
> DMA API iommu_domains to come forward before we try to build an API
> for it.
> 
> Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15 21:24           ` Jacob Pan
@ 2022-03-16 10:32             ` Tian, Kevin
  0 siblings, 0 replies; 61+ messages in thread
From: Tian, Kevin @ 2022-03-16 10:32 UTC (permalink / raw)
  To: Jacob Pan, Jason Gunthorpe
  Cc: Robin Murphy, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Luck, Tony, Jiang, Dave, Raj,
	Ashok, Zanussi, Tom, Kumar, Sanjay K, Pan, Jacob jun, Williams,
	Dan J

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Wednesday, March 16, 2022 5:24 AM
> 
> Hi Jason,
> 
> On Tue, 15 Mar 2022 14:05:07 -0300, Jason Gunthorpe <jgg@nvidia.com>
> wrote:
> 
> > On Tue, Mar 15, 2022 at 09:31:35AM -0700, Jacob Pan wrote:
> >
> > > > IMHO it is a device mis-design of IDXD to require all DMA be PASID
> > > > tagged. Devices should be able to do DMA on their RID when the PCI
> >
> > > IDXD can do DMA w/ RID, the PASID requirement is only for shared WQ
> > > where ENQCMDS is used. ENQCMDS has the benefit of avoiding locking
> > > where work submission is done from multiple CPUs.
> > > Tony, Dave?
> >
> > This is what I mean, it has an operating mode you want to use from the
> > kernel driver that cannot do RID DMA. It is a HW mis-design, IMHO.

Well, that mode is configured per work queue and the device just provides
flexibility for the software to use it for a potential value (scalability). So in 
the end it is a software question whether we can find a clean way to manage
this mode (fortunately with your proposal it does 😊). From this angle I'd
not call it a mis-design because the software has other options if there is no
willing of using that mode. Even in the virtual IDXD case that I explained in
another thread, the admin should not share work queue between VMs unless
he knows that guest can support it. Otherwise the admin should just dedicate
a workqueue to the guest and then let the guest itself to decide whether to
use the shared mode capability for its own purpose.

Also in concept the IOVA space created with DMA API is not different from
other I/O address spaces. There is no architectural restriction why this space
cannot be attached by PASID.

> >
> > Something like PASID0 in the ENQCMDS should have triggered RID DMA.
> >
> That would simplify things a lot, it would be just a device change I think.
> From IA pov, only ENQCMD will #GP if PASID==0. I will bring this back to HW
> team to consider for future generations.
> 

Maybe you can have a quick try?

Per SDM CPU doesn't #GP on PASID0 for ENQCMDS.

Also I don't think the device should do such check since with RID2PASID
the actual PASID value used to mark RID requests in the IOMMU is 
configured by the iommu driver. In that regard it is not correct for any 
hardware outside of the iommu to assume that PASID0 is for RID.

Then the only uncertain thing is within VT-d. In a quick check I didn't 
find any VT-d fault specifically for a DMA request which contains
a value same as RID2PASID.	

There is one interest paragraph in 6.2.1 (Tagging of cached translations):

  In scalable mode, requests-without-PASID are treated as requests-with-
  PASID when looking up the paging-structure cache, and PASID-cache.
  Such lookups use the PASID value from the RID_PASID field in the
  scalable-mode context-entry used to process the request-withoutPASID.
  Refer to Section 9.4 for more details on scalable-mode context-entry.
  Additionally, after translation process when such requests fill into
  IOTLB, the entries are tagged with PASID value obtained from RID_PASID
  field but are still marked as entries for requests-without-PASID.
  Tagging of such entries with PASID value is required so that 
  PASID-selective P_IOTLB invalidation can correctly remove all stale
  mappings. Implementation may allow requests-with-PASID from a given
  Requester-ID to hit entries brought into IOTLB by requests-without-PASID
  from the same Requester-ID to improve performance.

The last sentence gives me the impression that there is no problem for
VT-d to receive a request which happens to contain RID2PASID except
there might be a performance issue (if duplicated entries are created).

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-16  8:41       ` Tian, Kevin
@ 2022-03-16 14:07         ` Jason Gunthorpe
  0 siblings, 0 replies; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-16 14:07 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Robin Murphy, Jacob Pan, iommu, LKML, Joerg Roedel,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker, Luck, Tony,
	Jiang, Dave, Raj, Ashok, Zanussi, Tom, Kumar, Sanjay K, Pan,
	Jacob jun, Williams, Dan J

On Wed, Mar 16, 2022 at 08:41:27AM +0000, Tian, Kevin wrote:

> 1) When the kernel wants a more scalable way of using IDXD e.g. having
> multiple CPUs simultaneously submitting works in a lockless way to a 
> shared work queue via a new instruction (ENQCMD) which carries
> PASID.

IMHO the misdesign is the CPU can't submit work with ENQCMD from
kernel space that will do DMA on the RID.

> 2) When the host wants to share a workqueue between multiple VMs.
> In that case the virtual IDXD device exposed to each VM will only support
> the shared workqueue mode. Only in this case the DMA API in the
> guest must be attached by a PASID as ENQCMD is the only way to submit
> works.

It is the same issue - if ENQCMD had 'excute on the RID' then the
virtualization layer could translate that to 'execute on this PASID
setup by the hypervisor' and the kernel would not see additional
differences between SIOV and physical devices. IMHO mandatory kernel
PASID support in the guest just to support the kernel doing DMA to a
device is not nice.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-15 23:04       ` Jason Gunthorpe
@ 2022-03-16 20:50         ` Jacob Pan
  2022-03-16 22:15           ` Jason Gunthorpe
  0 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-16 20:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu, jacob.jun.pan

Hi Jason,

On Tue, 15 Mar 2022 20:04:57 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Mar 15, 2022 at 03:36:20PM -0700, Jacob Pan wrote:
> > Hi Jason,
> > 
> > On Tue, 15 Mar 2022 11:33:22 -0300, Jason Gunthorpe <jgg@nvidia.com>
> > wrote: 
> > > On Mon, Mar 14, 2022 at 10:07:07PM -0700, Jacob Pan wrote:  
> > > > +	/*
> > > > +	 * Each domain could have multiple devices attached with
> > > > shared or per
> > > > +	 * device PASIDs. At the domain level, we keep track of
> > > > unique PASIDs and
> > > > +	 * device user count.
> > > > +	 * E.g. If a domain has two devices attached, device A has
> > > > PASID 0, 1;
> > > > +	 * device B has PASID 0, 2. Then the domain would have
> > > > PASID 0, 1, 2.
> > > > +	 */    
> > > 
> > > A 2d array of xarray's seems like a poor data structure for this task.
> > > 
> > > AFACIT this wants to store a list of (device, pasid) tuples, so a
> > > simple linked list, 1d xarray vector or a red black tree seems more
> > > appropriate..
> > >   
> > Agreed.
> > It might need some surgery for dmar_domain and device_domain_info, which
> > already has a simple device list. I am trying to leverage the existing
> > data struct, let me take a closer look.  
> 
> Maybe the core code should provide this data structure in the
> iommu_domain.
> 
> Figuring out what stuff is attached is something every driver has to
> do right?
yeah, seems we already have some duplicated device list in vendor domain
struct, e.g. VT-d's dmar_domain, AMD's protection_domain. Similarly for 
device_domain_info equivalent.

If core code provides domain-device-pasid tracking, we could do device-pasid
tracking in ioasid.c, when we support per device PASID allocation, each
phy device could be an IOASID set, thus its own namespace.

Perhaps, we could do the following: add a device list to struct
iommu_domain, this will replace vender's domain lists. The data would be
something like:
struct iommu_dev_pasid_data {
	struct list_head list;	  /* For iommu_domain->dev_list */
	struct ioasid_set *pasids;  /* For the PASIDs used by the device */
	struct device *dev;
};

I guess a list of (device, pasid) tuples as you suggested could work but it
will have duplicated device entries since each device could have multiple
PASIDs. right?

Have to code this up to see.

Thanks for the pointers,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-16  7:39   ` Tian, Kevin
@ 2022-03-16 20:51     ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-16 20:51 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams,
	Dan J, Liu, Yi L, jacob.jun.pan

Hi Kevin,

On Wed, 16 Mar 2022 07:39:09 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Tuesday, March 15, 2022 1:07 PM
> > +static int intel_iommu_attach_dev_pasid(struct iommu_domain *domain,
> > +					struct device *dev, ioasid_t
> > pasid) +{
> > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +	struct device_domain_info *info = get_domain_info(dev);
> > +	struct intel_iommu *iommu = info->iommu;
> > +	struct pasid_info *pinfo;
> > +	unsigned long flags;
> > +	int ret = 0;
> > +	void *entry;
> > +
> > +	if (!info)
> > +		return -ENODEV;  
> 
> btw this interface only works in scalable mode. Lack of a check to
> return error on legacy mode here.
> 
right, legacy mode has no PASIDs. will check

Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-16  7:41     ` Tian, Kevin
@ 2022-03-16 21:01       ` Jacob Pan
  2022-03-18  5:33         ` Tian, Kevin
  0 siblings, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-16 21:01 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Jason Gunthorpe, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams,
	Dan J, Liu, Yi L, jacob.jun.pan

Hi Kevin,

On Wed, 16 Mar 2022 07:41:34 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, March 15, 2022 10:33 PM
> > 
> > On Mon, Mar 14, 2022 at 10:07:07PM -0700, Jacob Pan wrote:  
> > > +	/*
> > > +	 * Each domain could have multiple devices attached with
> > > shared or  
> > per  
> > > +	 * device PASIDs. At the domain level, we keep track of
> > > unique PASIDs  
> > and  
> > > +	 * device user count.
> > > +	 * E.g. If a domain has two devices attached, device A has
> > > PASID 0, 1;
> > > +	 * device B has PASID 0, 2. Then the domain would have PASID
> > > 0, 1, 2.
> > > +	 */  
> > 
> > A 2d array of xarray's seems like a poor data structure for this task.  
> 
Perhaps i mis-presented here, I am not using 2D array. It is an 1D xarray
for domain PASIDs only. Then I use the existing device list in each domain,
adding another xa to track per-device-domain PASIDs.
> besides that it also doesn't work when we support per-device PASID
> allocation in the future. In that case merging device PASIDs together is
> conceptually wrong.
> 
Sorry, could you elaborate? If we do per-dev PASID allocation, we could use
the ioasid_set for each pdev, right?

> > 
> > AFACIT this wants to store a list of (device, pasid) tuples, so a
> > simple linked list, 1d xarray vector or a red black tree seems more
> > appropriate..
> >   
> 
> this tuple can well serve per-device PASID. 😊
> 
I commented on the other email, but a simple list of tuples could have
duplicated devices since each dev could attach multiple PASIDs, right?
Should we still do two level then?

> Thanks
> Kevin


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-16 20:50         ` Jacob Pan
@ 2022-03-16 22:15           ` Jason Gunthorpe
  2022-03-16 22:23             ` Luck, Tony
  2022-03-17  0:49             ` Jacob Pan
  0 siblings, 2 replies; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-16 22:15 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu

On Wed, Mar 16, 2022 at 01:50:04PM -0700, Jacob Pan wrote:

> I guess a list of (device, pasid) tuples as you suggested could work but it
> will have duplicated device entries since each device could have multiple
> PASIDs. right?

Is assigning the same iommu_domain to multiple PASIDs of the same
device something worth optimizing for?

I would expect real applications will try to use the same PASID for
the same IOVA map to optimize IOTLB caching.

Is there a use case for that I'm missing?

Otherwise your explanation is what I was imagining as well.

I would also think about expanding your struct so that the device
driver can track per-device per-domain data as well, that seems
useful IIRC?

ie put a 'sizeof_iommu_dev_pasid_data' in the domain->ops and
allocate that much memory so the driver can use the trailer space for
its own purpose.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-16 22:15           ` Jason Gunthorpe
@ 2022-03-16 22:23             ` Luck, Tony
  2022-03-17  0:04               ` Jason Gunthorpe
  2022-03-17  0:49             ` Jacob Pan
  1 sibling, 1 reply; 61+ messages in thread
From: Luck, Tony @ 2022-03-16 22:23 UTC (permalink / raw)
  To: Jason Gunthorpe, Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok, Kumar,
	Sanjay K, Jiang, Dave, Zanussi, Tom, Williams, Dan J, Tian,
	Kevin, Liu, Yi L

> I would expect real applications will try to use the same PASID for
> the same IOVA map to optimize IOTLB caching.

On Intel a ring3 application only gets to use one PASID.

The ENQCMD instruction pick up the PASID for the process
from the IA32_PASID MSR (set by OS when first access,
and on context switches thereafter).

Kernel users (ring0) can supply any PASID when they use
the ENQCMDS instruction. Is that what you mean when you
say "real applications"?

-Tony

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-16 22:23             ` Luck, Tony
@ 2022-03-17  0:04               ` Jason Gunthorpe
  2022-03-18  5:47                 ` Tian, Kevin
  0 siblings, 1 reply; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-17  0:04 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Jacob Pan, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Zanussi, Tom, Williams, Dan J,
	Tian, Kevin, Liu, Yi L

On Wed, Mar 16, 2022 at 10:23:26PM +0000, Luck, Tony wrote:

> Kernel users (ring0) can supply any PASID when they use
> the ENQCMDS instruction. Is that what you mean when you
> say "real applications"?

I'm not talking about ENQCMD at all

I'm saying I don't see much utility to have two PASIDs assigned to the
same device that perform the same IOVA translation.

So I would expect the list of attachments in an iommu_domain to have
exactly one device in it and multiple PASIDs to the same device from
the same iommu_domain is a oddball corner case we don't need to
optimize for, beyond being able to allow it to happen.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-16 22:15           ` Jason Gunthorpe
  2022-03-16 22:23             ` Luck, Tony
@ 2022-03-17  0:49             ` Jacob Pan
  2022-03-17 13:23               ` Jason Gunthorpe
  1 sibling, 1 reply; 61+ messages in thread
From: Jacob Pan @ 2022-03-17  0:49 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu, jacob.jun.pan

Hi Jason,

On Wed, 16 Mar 2022 19:15:50 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Mar 16, 2022 at 01:50:04PM -0700, Jacob Pan wrote:
> 
> > I guess a list of (device, pasid) tuples as you suggested could work
> > but it will have duplicated device entries since each device could have
> > multiple PASIDs. right?  
> 
> Is assigning the same iommu_domain to multiple PASIDs of the same
> device something worth optimizing for?
Probably not, the current usage case has only two PASIDs at most (RID2PASID
+ a kernel PASID).

I was just thinking for the generalized case, device TLB flush would be
more efficient if we don't go through the domain list. Use a per-domain-dev
list instead. But it doesn't matter much for DMA domain which has one
device mostly.

> I would expect real applications will try to use the same PASID for
> the same IOVA map to optimize IOTLB caching.
> 
> Is there a use case for that I'm missing?
> 
Yes. it would be more efficient for PASID selective domain TLB flush. But
on VT-d IOTLB is also tagged by domain ID, domain flush can use DID if
there are many PASIDs. Not sure about other archs. Agree that optimizing
PASIDs for TLB flush should be a common goal.

> Otherwise your explanation is what I was imagining as well.
> 
> I would also think about expanding your struct so that the device
> driver can track per-device per-domain data as well, that seems
> useful IIRC?
> 
yes, at least both VT-d and FSL drivers have struct device_domain_info.

> ie put a 'sizeof_iommu_dev_pasid_data' in the domain->ops and
> allocate that much memory so the driver can use the trailer space for
> its own purpose.
> 
That sounds great to have but not sure i understood correctly how to do it.

Do you mean for each vendor driver's struct device_domain_info (or
equivalent), we carve out sizeof_iommu_dev_pasid_data as common data, then
the rest of the space is vendor specific? I don't feel I get your point,
could you elaborate?


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-17  0:49             ` Jacob Pan
@ 2022-03-17 13:23               ` Jason Gunthorpe
  2022-03-17 18:23                 ` Jacob Pan
  0 siblings, 1 reply; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-17 13:23 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu

On Wed, Mar 16, 2022 at 05:49:59PM -0700, Jacob Pan wrote:

> > I would expect real applications will try to use the same PASID for
> > the same IOVA map to optimize IOTLB caching.
> > 
> > Is there a use case for that I'm missing?
> > 
> Yes. it would be more efficient for PASID selective domain TLB flush. But
> on VT-d IOTLB is also tagged by domain ID, domain flush can use DID if
> there are many PASIDs. Not sure about other archs. Agree that optimizing
> PASIDs for TLB flush should be a common goal.

If you sort the list of (device, pasid) tuples can something like VT-d
collapse all the same devices and just issue one DID invalidation:

 list_for_each()
    if (itm->device == last_invalidated_device)
          continue;
    invalidate(itm->device);
    last_invalidated_device = itm->device;

While something that was per-pasid could issue per-pasid invalidations
from the same data structure?

> > Otherwise your explanation is what I was imagining as well.
> > 
> > I would also think about expanding your struct so that the device
> > driver can track per-device per-domain data as well, that seems
> > useful IIRC?
> > 
> yes, at least both VT-d and FSL drivers have struct device_domain_info.
> 
> > ie put a 'sizeof_iommu_dev_pasid_data' in the domain->ops and
> > allocate that much memory so the driver can use the trailer space for
> > its own purpose.
> > 
> That sounds great to have but not sure i understood correctly how to do it.
> 
> Do you mean for each vendor driver's struct device_domain_info (or
> equivalent), we carve out sizeof_iommu_dev_pasid_data as common data, then
> the rest of the space is vendor specific? I don't feel I get your point,
> could you elaborate?

I've seen it done two ways..

With a flex array:

 struct iommu_device_data {
     struct list_head list
     ioasid_t pasid;
     struct device *dev;
     [..]
     u64 device_data[];
 }

 struct intel_device_data {
      [..]
 }
 struct iommu_device_data *dev_data;
 struct intel_device_data *intel_data = (void *)&dev_data->device_data;

Or with container of:

 struct iommu_device_data {
     struct list_head list
     ioasid_t pasid;
     struct device *dev;
     [..]
 }

 struct intel_device_data {
     struct iommu_device_data iommu; // must be first
     [...]
 }
 struct iommu_device_data *dev_data;
 struct intel_device_data *intel_data = container_of(dev_data, struct intel_device_data, iommu);

In either case you'd add a size_t to the domain->ops specifying how
much extra memory for the core code to allocate when it manages the
datastructure. The first case allocates based on struct_size, the
second case allocates what is specified.

Look at INIT_RDMA_OBJ_SIZE() for some more complicated example how the
latter can work. I like it because it has the nice container_of
pattern in drivers, the downside is it requires a BUILD_BUG_ON to
check that the driver ordered its struct properly.

The point is to consolidate all the code for allocating and walking
the data structure without having to force two allocations and extra
pointer indirections on performance paths.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-17 13:23               ` Jason Gunthorpe
@ 2022-03-17 18:23                 ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-17 18:23 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, Joerg Roedel, Christoph Hellwig, Lu Baolu,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu, jacob.jun.pan

Hi Jason,

On Thu, 17 Mar 2022 10:23:08 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Mar 16, 2022 at 05:49:59PM -0700, Jacob Pan wrote:
> 
> > > I would expect real applications will try to use the same PASID for
> > > the same IOVA map to optimize IOTLB caching.
> > > 
> > > Is there a use case for that I'm missing?
> > >   
> > Yes. it would be more efficient for PASID selective domain TLB flush.
> > But on VT-d IOTLB is also tagged by domain ID, domain flush can use DID
> > if there are many PASIDs. Not sure about other archs. Agree that
> > optimizing PASIDs for TLB flush should be a common goal.  
> 
> If you sort the list of (device, pasid) tuples can something like VT-d
> collapse all the same devices and just issue one DID invalidation:
> 
>  list_for_each()
>     if (itm->device == last_invalidated_device)
>           continue;
>     invalidate(itm->device);
>     last_invalidated_device = itm->device;
> 
I assume this is for devTLB since IOMMU's IOTLB flush doesn't care about
device. I think it works for device-wide invalidation.

> While something that was per-pasid could issue per-pasid invalidations
> from the same data structure?
> 
yes. we can use the same data structure for PASID selective devTLB but 
 list_for_each()
     if (itm->pasid == pasid_to_be_invalidated;
	     invalidate(itm->device, pasid);

For IOMMU's IOTLB, we also have two granularities
1. domain-wide
2. pasid-wide
For #1, we just use DID to invalidate w/o traverse the list.
For #2, we just need to sanity check the pasid is indeed attached by going
through the list.

Seems to work!

> > > Otherwise your explanation is what I was imagining as well.
> > > 
> > > I would also think about expanding your struct so that the device
> > > driver can track per-device per-domain data as well, that seems
> > > useful IIRC?
> > >   
> > yes, at least both VT-d and FSL drivers have struct device_domain_info.
> >   
> > > ie put a 'sizeof_iommu_dev_pasid_data' in the domain->ops and
> > > allocate that much memory so the driver can use the trailer space for
> > > its own purpose.
> > >   
> > That sounds great to have but not sure i understood correctly how to do
> > it.
> > 
> > Do you mean for each vendor driver's struct device_domain_info (or
> > equivalent), we carve out sizeof_iommu_dev_pasid_data as common data,
> > then the rest of the space is vendor specific? I don't feel I get your
> > point, could you elaborate?  
> 
> I've seen it done two ways..
> 
> With a flex array:
> 
>  struct iommu_device_data {
>      struct list_head list
>      ioasid_t pasid;
>      struct device *dev;
>      [..]
>      u64 device_data[];
>  }
> 
>  struct intel_device_data {
>       [..]
>  }
>  struct iommu_device_data *dev_data;
>  struct intel_device_data *intel_data = (void *)&dev_data->device_data;
> 
> Or with container of:
> 
>  struct iommu_device_data {
>      struct list_head list
>      ioasid_t pasid;
>      struct device *dev;
>      [..]
>  }
> 
>  struct intel_device_data {
>      struct iommu_device_data iommu; // must be first
>      [...]
>  }
>  struct iommu_device_data *dev_data;
>  struct intel_device_data *intel_data = container_of(dev_data, struct
> intel_device_data, iommu);
> 
> In either case you'd add a size_t to the domain->ops specifying how
> much extra memory for the core code to allocate when it manages the
> datastructure. The first case allocates based on struct_size, the
> second case allocates what is specified.
> 
> Look at INIT_RDMA_OBJ_SIZE() for some more complicated example how the
> latter can work. I like it because it has the nice container_of
> pattern in drivers, the downside is it requires a BUILD_BUG_ON to
> check that the driver ordered its struct properly.
> 
> The point is to consolidate all the code for allocating and walking
> the data structure without having to force two allocations and extra
> pointer indirections on performance paths.
Make sense, very neat. Vendor driver would not need to do allocations. Let
me give that a try. Seems #2 has better type safety.

Thank you so much for the thorough explanation!

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 4/8] iommu/vt-d: Use device_pasid attach op for RID2PASID
  2022-03-16  7:54   ` Tian, Kevin
@ 2022-03-17 20:45     ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-17 20:45 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams,
	Dan J, Liu, Yi L, jacob.jun.pan

Hi Kevin,

On Wed, 16 Mar 2022 07:54:19 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Tuesday, March 15, 2022 1:07 PM
> > 
> > With the availability of a generic device-PASID-domain attachment API,
> > there's no need to special case RID2PASID.  Use the API to replace
> > duplicated code.
> > 
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> >  drivers/iommu/intel/iommu.c | 18 ++----------------
> >  1 file changed, 2 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 9267194eaed3..f832b7599d21 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1683,9 +1683,6 @@ static void domain_flush_piotlb(struct
> > intel_iommu *iommu,
> >  		qi_flush_piotlb(iommu, did, domain->default_pasid,
> >  				addr, npages, ih);
> > 
> > -	if (!list_empty(&domain->devices))
> > -		qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr,
> > npages, ih);
> > -  
> 
> this should be rebased on top of Baolu's "iommu cleanup and refactoring"
> series which has removed the entire domain_flush_piotlb().
> 
Yes, I have been working with Baolu. Some of the refactoring patches were
withdrawn, so there are lots of moving targets. 

Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-16 21:01       ` Jacob Pan
@ 2022-03-18  5:33         ` Tian, Kevin
  2022-03-28 21:41           ` Jacob Pan
  0 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-18  5:33 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Luck, Tony, Jiang, Dave, Raj, Ashok, Zanussi, Tom, Kumar,
	Sanjay K, LKML, Christoph Hellwig, iommu, Pan, Jacob jun,
	Jason Gunthorpe, Williams, Dan J, Jean-Philippe Brucker

> From: Jacob Pan
> Sent: Thursday, March 17, 2022 5:02 AM
> 
> Hi Kevin,
> 
> On Wed, 16 Mar 2022 07:41:34 +0000, "Tian, Kevin" <kevin.tian@intel.com>
> wrote:
> 
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Tuesday, March 15, 2022 10:33 PM
> > >
> > > On Mon, Mar 14, 2022 at 10:07:07PM -0700, Jacob Pan wrote:
> > > > +	/*
> > > > +	 * Each domain could have multiple devices attached with
> > > > shared or
> > > per
> > > > +	 * device PASIDs. At the domain level, we keep track of
> > > > unique PASIDs
> > > and
> > > > +	 * device user count.
> > > > +	 * E.g. If a domain has two devices attached, device A has
> > > > PASID 0, 1;
> > > > +	 * device B has PASID 0, 2. Then the domain would have PASID
> > > > 0, 1, 2.
> > > > +	 */
> > >
> > > A 2d array of xarray's seems like a poor data structure for this task.
> >
> Perhaps i mis-presented here, I am not using 2D array. It is an 1D xarray
> for domain PASIDs only. Then I use the existing device list in each domain,
> adding another xa to track per-device-domain PASIDs.
> > besides that it also doesn't work when we support per-device PASID
> > allocation in the future. In that case merging device PASIDs together is
> > conceptually wrong.
> >
> Sorry, could you elaborate? If we do per-dev PASID allocation, we could use
> the ioasid_set for each pdev, right?

My point is simply about the comment above which says the domain
will have PASID 0, 1, 2 when there is [devA, PASID0] and [devB, PASID0].
You can maintain a single  PASID list only when it's globally allocated cross
devices. otherwise this has to be a tuple including device and PASID.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-17  0:04               ` Jason Gunthorpe
@ 2022-03-18  5:47                 ` Tian, Kevin
  2022-03-18 13:47                   ` Jason Gunthorpe
  0 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-18  5:47 UTC (permalink / raw)
  To: Jason Gunthorpe, Luck, Tony
  Cc: Jacob Pan, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Zanussi, Tom, Williams, Dan J, Liu,
	Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, March 17, 2022 8:04 AM
> 
> On Wed, Mar 16, 2022 at 10:23:26PM +0000, Luck, Tony wrote:
> 
> > Kernel users (ring0) can supply any PASID when they use
> > the ENQCMDS instruction. Is that what you mean when you
> > say "real applications"?
> 
> I'm not talking about ENQCMD at all
> 
> I'm saying I don't see much utility to have two PASIDs assigned to the
> same device that perform the same IOVA translation.
> 
> So I would expect the list of attachments in an iommu_domain to have
> exactly one device in it and multiple PASIDs to the same device from
> the same iommu_domain is a oddball corner case we don't need to
> optimize for, beyond being able to allow it to happen.
> 

In theory we do have one example with SIOV.

Say two mdev's of a same parent device are assigned to a guest and
each mdev gets a unique PASID to tag the vRID requests.

Then the guest attaches both mdev's to a stage-1 IOVA domain via
vIOMMU, e.g. when it further assigns them to an user application.

In this case the host iommu_domain wrapping the stage-1 page
table is attached by two PASIDs of the same device.

However iirc there are other tricky issues blocking this usage (Jacob
may remember more detail here) and we didn't hear strong interest
from customer on it. So this is just FYI for theoretical possibility and 
I'm fine with even disallowing it before those issues are resolved.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 6/8] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID
  2022-03-15  5:07 ` [PATCH v2 6/8] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID Jacob Pan
@ 2022-03-18  6:10   ` Tian, Kevin
  2022-03-29 17:39     ` Jacob Pan
  0 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-18  6:10 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Pan, Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck,
	Tony, Zanussi, Tom, Williams, Dan J, Liu, Yi L

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Tuesday, March 15, 2022 1:07 PM
> 
> The current in-kernel supervisor PASID support is based on the SVM/SVA
> machinery in SVA lib. The binding between a kernel PASID and kernel
> mapping has many flaws. See discussions in the link below.
> 
> This patch enables in-kernel DMA by switching from SVA lib to the
> standard DMA mapping APIs. Since both DMA requests with and without
> PASIDs are mapped identically, there is no change to how DMA APIs are
> used after the kernel PASID is enabled.
> 
> Link: https://lore.kernel.org/linux-
> iommu/20210511194726.GP1002214@nvidia.com/
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  drivers/dma/idxd/idxd.h  |  1 -
>  drivers/dma/idxd/init.c  | 34 +++++++++-------------------------
>  drivers/dma/idxd/sysfs.c |  7 -------
>  3 files changed, 9 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
> index da72eb15f610..a09ab4a6e1c1 100644
> --- a/drivers/dma/idxd/idxd.h
> +++ b/drivers/dma/idxd/idxd.h
> @@ -276,7 +276,6 @@ struct idxd_device {
>  	struct idxd_wq **wqs;
>  	struct idxd_engine **engines;
> 
> -	struct iommu_sva *sva;
>  	unsigned int pasid;
> 
>  	int num_groups;
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 08a5f4310188..5d1f8dd4abf6 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -16,6 +16,7 @@
>  #include <linux/idr.h>
>  #include <linux/intel-svm.h>
>  #include <linux/iommu.h>
> +#include <linux/dma-iommu.h>
>  #include <uapi/linux/idxd.h>
>  #include <linux/dmaengine.h>
>  #include "../dmaengine.h"
> @@ -466,36 +467,22 @@ static struct idxd_device *idxd_alloc(struct pci_dev
> *pdev, struct idxd_driver_d
> 
>  static int idxd_enable_system_pasid(struct idxd_device *idxd)

idxd_enable_pasid_dma() since system pasid is a confusing term now?
Or just remove the idxd specific wrappers and have the caller to call
iommu_enable_pasid_dma() directly given the simple logic here.

>  {
> -	int flags;
> -	unsigned int pasid;
> -	struct iommu_sva *sva;
> +	u32 pasid;
> +	int ret;
> 
> -	flags = SVM_FLAG_SUPERVISOR_MODE;
> -
> -	sva = iommu_sva_bind_device(&idxd->pdev->dev, NULL, &flags);
> -	if (IS_ERR(sva)) {
> -		dev_warn(&idxd->pdev->dev,
> -			 "iommu sva bind failed: %ld\n", PTR_ERR(sva));
> -		return PTR_ERR(sva);
> -	}
> -
> -	pasid = iommu_sva_get_pasid(sva);
> -	if (pasid == IOMMU_PASID_INVALID) {
> -		iommu_sva_unbind_device(sva);
> -		return -ENODEV;
> +	ret = iommu_enable_pasid_dma(&idxd->pdev->dev, &pasid);
> +	if (ret) {
> +		dev_err(&idxd->pdev->dev, "No DMA PASID %d\n", ret);
> +		return ret;
>  	}
> -
> -	idxd->sva = sva;
>  	idxd->pasid = pasid;
> -	dev_dbg(&idxd->pdev->dev, "system pasid: %u\n", pasid);
> +
>  	return 0;
>  }
> 
>  static void idxd_disable_system_pasid(struct idxd_device *idxd)
>  {
> -
> -	iommu_sva_unbind_device(idxd->sva);
> -	idxd->sva = NULL;
> +	iommu_disable_pasid_dma(&idxd->pdev->dev, idxd->pasid);
>  }
> 
>  static int idxd_probe(struct idxd_device *idxd)
> @@ -524,10 +511,7 @@ static int idxd_probe(struct idxd_device *idxd)
>  		} else {
>  			dev_warn(dev, "Unable to turn on SVA feature.\n");
>  		}
> -	} else if (!sva) {
> -		dev_warn(dev, "User forced SVA off via module param.\n");

why removing above 2 lines? they are related to a module param thus
not affected by the logic in this series.

>  	}
> -
>  	idxd_read_caps(idxd);
>  	idxd_read_table_offsets(idxd);
> 
> diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
> index 7e19ab92b61a..fde6656695ba 100644
> --- a/drivers/dma/idxd/sysfs.c
> +++ b/drivers/dma/idxd/sysfs.c
> @@ -839,13 +839,6 @@ static ssize_t wq_name_store(struct device *dev,
>  	if (strlen(buf) > WQ_NAME_SIZE || strlen(buf) == 0)
>  		return -EINVAL;
> 
> -	/*
> -	 * This is temporarily placed here until we have SVM support for
> -	 * dmaengine.
> -	 */
> -	if (wq->type == IDXD_WQT_KERNEL && device_pasid_enabled(wq-
> >idxd))
> -		return -EOPNOTSUPP;
> -
>  	memset(wq->name, 0, WQ_NAME_SIZE + 1);
>  	strncpy(wq->name, buf, WQ_NAME_SIZE);
>  	strreplace(wq->name, '\n', '\0');
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 7/8] iommu/vt-d: Delete supervisor/kernel SVA
  2022-03-15  5:07 ` [PATCH v2 7/8] iommu/vt-d: Delete supervisor/kernel SVA Jacob Pan
@ 2022-03-18  6:16   ` Tian, Kevin
  2022-03-29 17:42     ` Jacob Pan
  0 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2022-03-18  6:16 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Pan, Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck,
	Tony, Zanussi, Tom, Williams, Dan J, Liu, Yi L

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Tuesday, March 15, 2022 1:07 PM
> 
> In-kernel DMA with PASID should use DMA API now, remove supervisor
> PASID
> SVA support. Remove special cases in bind mm and page request service.
> 
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>

so you removed all the references to SVM_FLAG_SUPERVISOR_MODE
but the definition is still kept in include/linux/intel-svm.h...

> ---
>  drivers/iommu/intel/svm.c | 42 ++++++++-------------------------------
>  1 file changed, 8 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index 2c53689da461..37d6218f173b 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -516,11 +516,10 @@ static void intel_svm_free_pasid(struct mm_struct
> *mm)
> 
>  static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
>  					   struct device *dev,
> -					   struct mm_struct *mm,
> -					   unsigned int flags)
> +					   struct mm_struct *mm)
>  {
>  	struct device_domain_info *info = get_domain_info(dev);
> -	unsigned long iflags, sflags;
> +	unsigned long iflags, sflags = 0;
>  	struct intel_svm_dev *sdev;
>  	struct intel_svm *svm;
>  	int ret = 0;
> @@ -533,16 +532,13 @@ static struct iommu_sva
> *intel_svm_bind_mm(struct intel_iommu *iommu,
> 
>  		svm->pasid = mm->pasid;
>  		svm->mm = mm;
> -		svm->flags = flags;
>  		INIT_LIST_HEAD_RCU(&svm->devs);
> 
> -		if (!(flags & SVM_FLAG_SUPERVISOR_MODE)) {
> -			svm->notifier.ops = &intel_mmuops;
> -			ret = mmu_notifier_register(&svm->notifier, mm);
> -			if (ret) {
> -				kfree(svm);
> -				return ERR_PTR(ret);
> -			}
> +		svm->notifier.ops = &intel_mmuops;
> +		ret = mmu_notifier_register(&svm->notifier, mm);
> +		if (ret) {
> +			kfree(svm);
> +			return ERR_PTR(ret);
>  		}
> 
>  		ret = pasid_private_add(svm->pasid, svm);
> @@ -583,8 +579,6 @@ static struct iommu_sva *intel_svm_bind_mm(struct
> intel_iommu *iommu,
>  	}
> 
>  	/* Setup the pasid table: */
> -	sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ?
> -			PASID_FLAG_SUPERVISOR_MODE : 0;
>  	sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ?
> PASID_FLAG_FL5LP : 0;
>  	spin_lock_irqsave(&iommu->lock, iflags);
>  	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm-
> >pasid,
> @@ -957,7 +951,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
>  			 * to unbind the mm while any page faults are
> outstanding.
>  			 */
>  			svm = pasid_private_find(req->pasid);
> -			if (IS_ERR_OR_NULL(svm) || (svm->flags &
> SVM_FLAG_SUPERVISOR_MODE))
> +			if (IS_ERR_OR_NULL(svm))
>  				goto bad_req;
>  		}
> 
> @@ -1011,29 +1005,9 @@ static irqreturn_t prq_event_thread(int irq, void
> *d)
>  struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct
> *mm, void *drvdata)
>  {
>  	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
> -	unsigned int flags = 0;
>  	struct iommu_sva *sva;
>  	int ret;
> 
> -	if (drvdata)
> -		flags = *(unsigned int *)drvdata;
> -
> -	if (flags & SVM_FLAG_SUPERVISOR_MODE) {
> -		if (!ecap_srs(iommu->ecap)) {
> -			dev_err(dev, "%s: Supervisor PASID not supported\n",
> -				iommu->name);
> -			return ERR_PTR(-EOPNOTSUPP);
> -		}
> -
> -		if (mm) {
> -			dev_err(dev, "%s: Supervisor PASID with user
> provided mm\n",
> -				iommu->name);
> -			return ERR_PTR(-EINVAL);
> -		}
> -
> -		mm = &init_mm;
> -	}
> -
>  	mutex_lock(&pasid_mutex);
>  	ret = intel_svm_alloc_pasid(dev, mm, flags);
>  	if (ret) {
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v2 9/9] dmaengine: idxd: separate user and kernel pasid enabling
  2022-03-15  5:07 ` [PATCH v2 9/9] dmaengine: idxd: separate user and kernel pasid enabling Jacob Pan
@ 2022-03-18  6:28   ` Tian, Kevin
  0 siblings, 0 replies; 61+ messages in thread
From: Tian, Kevin @ 2022-03-18  6:28 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker
  Cc: Pan, Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck,
	Tony, Zanussi, Tom, Williams, Dan J, Liu, Yi L

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Tuesday, March 15, 2022 1:07 PM

The coverletter is [0/8] but here you actually have the 9th patch...

> 
> From: Dave Jiang <dave.jiang@intel.com>
> 
> The idxd driver always gated the pasid enabling under a single knob and
> this assumption is incorrect. The pasid used for kernel operation can be
> independently toggled and has no dependency on the user pasid (and vice
> versa). Split the two so they are independent "enabled" flags.

While you said kernel vs. user sva can be independently toggled you still
only have a single option 'sva' to gate both in the end. 

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-15  5:07 ` [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops Jacob Pan
  2022-03-15 10:24   ` Tian, Kevin
  2022-03-15 11:26   ` Jean-Philippe Brucker
@ 2022-03-18 11:52   ` Lu Baolu
  2022-03-18 13:48     ` Jason Gunthorpe
  2 siblings, 1 reply; 61+ messages in thread
From: Lu Baolu @ 2022-03-18 11:52 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Jean-Philippe Brucker
  Cc: baolu.lu, Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang,
	Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

On 2022/3/15 13:07, Jacob Pan wrote:
> From: Lu Baolu <baolu.lu@linux.intel.com>
> 
> An IOMMU domain represents an address space which can be attached by
> devices that perform DMA within a domain. However, for platforms with
> PASID capability the domain attachment needs be handled at device+PASID
> level. There can be multiple PASIDs within a device and multiple devices
> attached to a given domain.
> This patch introduces a new IOMMU op which support device, PASID, and
> IOMMU domain attachment. The immediate use case is for PASID capable
> devices to perform DMA under DMA APIs.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>   include/linux/iommu.h | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 369f05c2a4e2..fde5b933dbe3 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
>    * @aux_get_pasid: get the pasid given an aux-domain
>    * @sva_bind: Bind process address space to device
>    * @sva_unbind: Unbind process address space from device
> + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> + * @detach_dev_pasid: detach an iommu domain from a pasid of device
>    * @sva_get_pasid: Get PASID associated to a SVA handle
>    * @page_response: handle page request response
>    * @cache_invalidate: invalidate translation caches
> @@ -296,6 +298,10 @@ struct iommu_ops {
>   	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
>   				      void *drvdata);
>   	void (*sva_unbind)(struct iommu_sva *handle);
> +	int (*attach_dev_pasid)(struct iommu_domain *domain,
> +				struct device *dev, ioasid_t id);
> +	void (*detach_dev_pasid)(struct iommu_domain *domain,
> +				 struct device *dev, ioasid_t id);

As we have introduced iommu_domain_ops, these callbacks should be part
of the domain ops.

>   	u32 (*sva_get_pasid)(struct iommu_sva *handle);
>   
>   	int (*page_response)(struct device *dev,

Best regards,
baolu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-15 11:49     ` Tian, Kevin
  2022-03-15 16:11       ` Jacob Pan
@ 2022-03-18 12:01       ` Lu Baolu
  2022-03-18 13:50         ` Jason Gunthorpe
  1 sibling, 1 reply; 61+ messages in thread
From: Lu Baolu @ 2022-03-18 12:01 UTC (permalink / raw)
  To: Tian, Kevin, Jean-Philippe Brucker, Jacob Pan
  Cc: baolu.lu, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Jean-Philippe Brucker, Pan, Jacob jun, Raj,
	Ashok, Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom,
	Williams, Dan J, Liu, Yi L

On 2022/3/15 19:49, Tian, Kevin wrote:
>> From: Jean-Philippe Brucker<jean-philippe@linaro.org>
>> Sent: Tuesday, March 15, 2022 7:27 PM
>>
>> On Mon, Mar 14, 2022 at 10:07:06PM -0700, Jacob Pan wrote:
>>> From: Lu Baolu<baolu.lu@linux.intel.com>
>>>
>>> An IOMMU domain represents an address space which can be attached by
>>> devices that perform DMA within a domain. However, for platforms with
>>> PASID capability the domain attachment needs be handled at device+PASID
>>> level. There can be multiple PASIDs within a device and multiple devices
>>> attached to a given domain.
>>> This patch introduces a new IOMMU op which support device, PASID, and
>>> IOMMU domain attachment. The immediate use case is for PASID capable
>>> devices to perform DMA under DMA APIs.
>>>
>>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
>>> ---
>>>   include/linux/iommu.h | 6 ++++++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>> index 369f05c2a4e2..fde5b933dbe3 100644
>>> --- a/include/linux/iommu.h
>>> +++ b/include/linux/iommu.h
>>> @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
>>>    * @aux_get_pasid: get the pasid given an aux-domain
>>>    * @sva_bind: Bind process address space to device
>>>    * @sva_unbind: Unbind process address space from device
>>> + * @attach_dev_pasid: attach an iommu domain to a pasid of device
>>> + * @detach_dev_pasid: detach an iommu domain from a pasid of device
>> Isn't that operation "assign a PASID to a domain" instead?  In patch 5,
>> the domain is already attached to the device, so set_domain_pasid() might
>> be clearer and to the point. If the IOMMU driver did the allocation we
>> could also avoid patch 1.
> iiuc this API can also work for future SIOV usage where each mdev attached
> to the domain has its own pasid. "assigning a PASID to a domain" sounds
> like going back to the previous aux domain approach which has one PASID
> per domain and that PASID is used on all devices attached to the aux domain...
> 

This also works for SVA as far as I can see. The sva_bind essentially is
to  attach an SVA domain to the PASID of a device. The sva_bind/unbind
ops could be removed with these two new callbacks.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-15  5:07 ` [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users Jacob Pan
  2022-03-15 11:16   ` Robin Murphy
  2022-03-15 14:35   ` Jason Gunthorpe
@ 2022-03-18 12:43   ` Lu Baolu
  2022-03-28 21:44     ` Jacob Pan
  2 siblings, 1 reply; 61+ messages in thread
From: Lu Baolu @ 2022-03-18 12:43 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Jean-Philippe Brucker
  Cc: baolu.lu, Jacob Pan, Raj Ashok, Kumar, Sanjay K, Dave Jiang,
	Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin, Yi Liu

On 2022/3/15 13:07, Jacob Pan wrote:
> DMA mapping API is the de facto standard for in-kernel DMA. It operates
> on a per device/RID basis which is not PASID-aware.
> 
> Some modern devices such as Intel Data Streaming Accelerator, PASID is
> required for certain work submissions. To allow such devices use DMA
> mapping API, we need the following functionalities:
> 1. Provide device a way to retrieve a PASID for work submission within
> the kernel
> 2. Enable the kernel PASID on the IOMMU for the device
> 3. Attach the kernel PASID to the device's default DMA domain, let it
> be IOVA or physical address in case of pass-through.
> 
> This patch introduces a driver facing API that enables DMA API
> PASID usage. Once enabled, device drivers can continue to use DMA APIs as
> is. There is no difference in dma_handle between without PASID and with
> PASID.
> 
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>   drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++++++++++++++++++++
>   include/linux/dma-iommu.h |  7 +++++
>   include/linux/iommu.h     |  9 ++++++
>   3 files changed, 81 insertions(+)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index b22034975301..d0ff1a34b1b6 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -39,6 +39,8 @@ enum iommu_dma_cookie_type {
>   	IOMMU_DMA_MSI_COOKIE,
>   };
>   
> +static DECLARE_IOASID_SET(iommu_dma_pasid);
> +
>   struct iommu_dma_cookie {
>   	enum iommu_dma_cookie_type	type;
>   	union {
> @@ -370,6 +372,69 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
>   	domain->iova_cookie = NULL;
>   }
>   
> +/**
> + * iommu_enable_pasid_dma --Enable in-kernel DMA request with PASID
> + * @dev:	Device to be enabled
> + *
> + * DMA request with PASID will be mapped the same way as the legacy DMA.
> + * If the device is in pass-through, PASID will also pass-through. If the
> + * device is in IOVA map, the supervisor PASID will point to the same IOVA
> + * page table.
> + *
> + * @return the kernel PASID to be used for DMA or INVALID_IOASID on failure

The comment on the return value should be rephrased according to the
real code.

> + */
> +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid)
> +{
> +	struct iommu_domain *dom;
> +	ioasid_t id, max;
> +	int ret;
> +
> +	dom = iommu_get_domain_for_dev(dev);
> +	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
> +		return -ENODEV;
> +	max = iommu_get_dev_pasid_max(dev);
> +	if (!max)
> +		return -EINVAL;
> +
> +	id = ioasid_alloc(&iommu_dma_pasid, 1, max, dev);
> +	if (id == INVALID_IOASID)
> +		return -ENOMEM;
> +
> +	ret = dom->ops->attach_dev_pasid(dom, dev, id);
> +	if (ret) {
> +		ioasid_put(id);
> +		return ret;
> +	}
> +	*pasid = id;
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(iommu_enable_pasid_dma);
> +
> +/**
> + * iommu_disable_pasid_dma --Disable in-kernel DMA request with PASID
> + * @dev:	Device's PASID DMA to be disabled
> + *
> + * It is the device driver's responsibility to ensure no more incoming DMA
> + * requests with the kernel PASID before calling this function. IOMMU driver
> + * ensures PASID cache, IOTLBs related to the kernel PASID are cleared and
> + * drained.
> + *
> + * @return 0 on success or error code on failure

Ditto.

> + */
> +void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid)
> +{
> +	struct iommu_domain *dom;
> +
> +	/* TODO: check the given PASID is within the ioasid_set */
> +	dom = iommu_get_domain_for_dev(dev);
> +	if (!dom->ops->detach_dev_pasid)
> +		return;
> +	dom->ops->detach_dev_pasid(dom, dev, pasid);
> +	ioasid_put(pasid);
> +}
> +EXPORT_SYMBOL(iommu_disable_pasid_dma);
> +
>   /**
>    * iommu_dma_get_resv_regions - Reserved region driver helper
>    * @dev: Device from iommu_get_resv_regions()
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 24607dc3c2ac..e6cb9b52a420 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -18,6 +18,13 @@ int iommu_get_dma_cookie(struct iommu_domain *domain);
>   int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
>   void iommu_put_dma_cookie(struct iommu_domain *domain);
>   
> +/*
> + * For devices that can do DMA request with PASID, setup a system PASID.
> + * Address modes (IOVA, PA) are selected by the platform code.
> + */

No need for a comment here. Or move it to the function if need.

> +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid);
> +void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid);
> +
>   /* Setup call for arch DMA mapping code */
>   void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit);
>   int iommu_dma_init_fq(struct iommu_domain *domain);
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index fde5b933dbe3..fb011722e4f8 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -395,6 +395,15 @@ static inline void iommu_set_dev_pasid_max(struct device *dev,
>   
>   	param->pasid_max = max;
>   }
> +static inline ioasid_t iommu_get_dev_pasid_max(struct device *dev)
> +{
> +	struct dev_iommu *param = dev->iommu;
> +
> +	if (WARN_ON(!param))
> +		return 0;
> +
> +	return param->pasid_max;
> +}
>   
>   int iommu_device_register(struct iommu_device *iommu,
>   			  const struct iommu_ops *ops,

Best regards,
baolu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-18  5:47                 ` Tian, Kevin
@ 2022-03-18 13:47                   ` Jason Gunthorpe
  0 siblings, 0 replies; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-18 13:47 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Luck, Tony, Jacob Pan, iommu, LKML, Joerg Roedel,
	Christoph Hellwig, Lu Baolu, Jean-Philippe Brucker, Pan,
	Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Zanussi,
	Tom, Williams, Dan J, Liu, Yi L

On Fri, Mar 18, 2022 at 05:47:23AM +0000, Tian, Kevin wrote:
> may remember more detail here) and we didn't hear strong interest
> from customer on it. So this is just FYI for theoretical possibility and 
> I'm fine with even disallowing it before those issues are resolved.

I didn't mean disallow, I just ment lets not optimize for it.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-18 11:52   ` Lu Baolu
@ 2022-03-18 13:48     ` Jason Gunthorpe
  0 siblings, 0 replies; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-18 13:48 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Jacob Pan, iommu, LKML, Joerg Roedel, Christoph Hellwig,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu

On Fri, Mar 18, 2022 at 07:52:33PM +0800, Lu Baolu wrote:
> On 2022/3/15 13:07, Jacob Pan wrote:
> > From: Lu Baolu <baolu.lu@linux.intel.com>
> > 
> > An IOMMU domain represents an address space which can be attached by
> > devices that perform DMA within a domain. However, for platforms with
> > PASID capability the domain attachment needs be handled at device+PASID
> > level. There can be multiple PASIDs within a device and multiple devices
> > attached to a given domain.
> > This patch introduces a new IOMMU op which support device, PASID, and
> > IOMMU domain attachment. The immediate use case is for PASID capable
> > devices to perform DMA under DMA APIs.
> > 
> > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >   include/linux/iommu.h | 6 ++++++
> >   1 file changed, 6 insertions(+)
> > 
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 369f05c2a4e2..fde5b933dbe3 100644
> > +++ b/include/linux/iommu.h
> > @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
> >    * @aux_get_pasid: get the pasid given an aux-domain
> >    * @sva_bind: Bind process address space to device
> >    * @sva_unbind: Unbind process address space from device
> > + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> > + * @detach_dev_pasid: detach an iommu domain from a pasid of device
> >    * @sva_get_pasid: Get PASID associated to a SVA handle
> >    * @page_response: handle page request response
> >    * @cache_invalidate: invalidate translation caches
> > @@ -296,6 +298,10 @@ struct iommu_ops {
> >   	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
> >   				      void *drvdata);
> >   	void (*sva_unbind)(struct iommu_sva *handle);
> > +	int (*attach_dev_pasid)(struct iommu_domain *domain,
> > +				struct device *dev, ioasid_t id);
> > +	void (*detach_dev_pasid)(struct iommu_domain *domain,
> > +				 struct device *dev, ioasid_t id);
> 
> As we have introduced iommu_domain_ops, these callbacks should be part
> of the domain ops.

+1

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-18 12:01       ` Lu Baolu
@ 2022-03-18 13:50         ` Jason Gunthorpe
  0 siblings, 0 replies; 61+ messages in thread
From: Jason Gunthorpe @ 2022-03-18 13:50 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, Jean-Philippe Brucker, Jacob Pan, iommu, LKML,
	Joerg Roedel, Christoph Hellwig, Jean-Philippe Brucker, Pan,
	Jacob jun, Raj, Ashok, Kumar, Sanjay K, Jiang, Dave, Luck, Tony,
	Zanussi, Tom, Williams, Dan J, Liu, Yi L

On Fri, Mar 18, 2022 at 08:01:04PM +0800, Lu Baolu wrote:
> On 2022/3/15 19:49, Tian, Kevin wrote:
> > > From: Jean-Philippe Brucker<jean-philippe@linaro.org>
> > > Sent: Tuesday, March 15, 2022 7:27 PM
> > > 
> > > On Mon, Mar 14, 2022 at 10:07:06PM -0700, Jacob Pan wrote:
> > > > From: Lu Baolu<baolu.lu@linux.intel.com>
> > > > 
> > > > An IOMMU domain represents an address space which can be attached by
> > > > devices that perform DMA within a domain. However, for platforms with
> > > > PASID capability the domain attachment needs be handled at device+PASID
> > > > level. There can be multiple PASIDs within a device and multiple devices
> > > > attached to a given domain.
> > > > This patch introduces a new IOMMU op which support device, PASID, and
> > > > IOMMU domain attachment. The immediate use case is for PASID capable
> > > > devices to perform DMA under DMA APIs.
> > > > 
> > > > Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
> > > > Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
> > > >   include/linux/iommu.h | 6 ++++++
> > > >   1 file changed, 6 insertions(+)
> > > > 
> > > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > > index 369f05c2a4e2..fde5b933dbe3 100644
> > > > +++ b/include/linux/iommu.h
> > > > @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
> > > >    * @aux_get_pasid: get the pasid given an aux-domain
> > > >    * @sva_bind: Bind process address space to device
> > > >    * @sva_unbind: Unbind process address space from device
> > > > + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> > > > + * @detach_dev_pasid: detach an iommu domain from a pasid of device
> > > Isn't that operation "assign a PASID to a domain" instead?  In patch 5,
> > > the domain is already attached to the device, so set_domain_pasid() might
> > > be clearer and to the point. If the IOMMU driver did the allocation we
> > > could also avoid patch 1.
> > iiuc this API can also work for future SIOV usage where each mdev attached
> > to the domain has its own pasid. "assigning a PASID to a domain" sounds
> > like going back to the previous aux domain approach which has one PASID
> > per domain and that PASID is used on all devices attached to the aux domain...
> > 
> 
> This also works for SVA as far as I can see. The sva_bind essentially is
> to  attach an SVA domain to the PASID of a device. The sva_bind/unbind
> ops could be removed with these two new callbacks.

As we talked before I would like to see 'sva bind' go away and be
replaced with:

  domain = alloc_sva_iommu_domain(device, current)
  attach_dev_pasid_domain(domain, device, pasid)

Which composes better with everything else. SVA is just a special kind
of iommu_domain

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
  2022-03-18  5:33         ` Tian, Kevin
@ 2022-03-28 21:41           ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-28 21:41 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Luck, Tony, Jiang, Dave, Raj, Ashok, Zanussi, Tom, Kumar,
	Sanjay K, LKML, Christoph Hellwig, iommu, Pan, Jacob jun,
	Jason Gunthorpe, Williams, Dan J, Jean-Philippe Brucker,
	jacob.jun.pan

Hi Kevin,

On Fri, 18 Mar 2022 05:33:38 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan
> > Sent: Thursday, March 17, 2022 5:02 AM
> > 
> > Hi Kevin,
> > 
> > On Wed, 16 Mar 2022 07:41:34 +0000, "Tian, Kevin" <kevin.tian@intel.com>
> > wrote:
> >   
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Tuesday, March 15, 2022 10:33 PM
> > > >
> > > > On Mon, Mar 14, 2022 at 10:07:07PM -0700, Jacob Pan wrote:  
> > > > > +	/*
> > > > > +	 * Each domain could have multiple devices attached with
> > > > > shared or  
> > > > per  
> > > > > +	 * device PASIDs. At the domain level, we keep track of
> > > > > unique PASIDs  
> > > > and  
> > > > > +	 * device user count.
> > > > > +	 * E.g. If a domain has two devices attached, device A
> > > > > has PASID 0, 1;
> > > > > +	 * device B has PASID 0, 2. Then the domain would have
> > > > > PASID 0, 1, 2.
> > > > > +	 */  
> > > >
> > > > A 2d array of xarray's seems like a poor data structure for this
> > > > task.  
> > >  
> > Perhaps i mis-presented here, I am not using 2D array. It is an 1D
> > xarray for domain PASIDs only. Then I use the existing device list in
> > each domain, adding another xa to track per-device-domain PASIDs.  
> > > besides that it also doesn't work when we support per-device PASID
> > > allocation in the future. In that case merging device PASIDs together
> > > is conceptually wrong.
> > >  
> > Sorry, could you elaborate? If we do per-dev PASID allocation, we could
> > use the ioasid_set for each pdev, right?  
> 
> My point is simply about the comment above which says the domain
> will have PASID 0, 1, 2 when there is [devA, PASID0] and [devB, PASID0].
> You can maintain a single  PASID list only when it's globally allocated
> cross devices. otherwise this has to be a tuple including device and
> PASID.
> 
Got you, you are right we don't want to limit to globally allocated scheme.

Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users
  2022-03-18 12:43   ` Lu Baolu
@ 2022-03-28 21:44     ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-28 21:44 UTC (permalink / raw)
  To: Lu Baolu
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Jean-Philippe Brucker, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	Dave Jiang, Tony Luck, Zanussi, Tom, Dan Williams, Tian, Kevin,
	Yi Liu, jacob.jun.pan

Hi BaoLu,

On Fri, 18 Mar 2022 20:43:54 +0800, Lu Baolu <baolu.lu@linux.intel.com>
wrote:

> On 2022/3/15 13:07, Jacob Pan wrote:
> > DMA mapping API is the de facto standard for in-kernel DMA. It operates
> > on a per device/RID basis which is not PASID-aware.
> > 
> > Some modern devices such as Intel Data Streaming Accelerator, PASID is
> > required for certain work submissions. To allow such devices use DMA
> > mapping API, we need the following functionalities:
> > 1. Provide device a way to retrieve a PASID for work submission within
> > the kernel
> > 2. Enable the kernel PASID on the IOMMU for the device
> > 3. Attach the kernel PASID to the device's default DMA domain, let it
> > be IOVA or physical address in case of pass-through.
> > 
> > This patch introduces a driver facing API that enables DMA API
> > PASID usage. Once enabled, device drivers can continue to use DMA APIs
> > as is. There is no difference in dma_handle between without PASID and
> > with PASID.
> > 
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> >   drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++++++++++++++++++++
> >   include/linux/dma-iommu.h |  7 +++++
> >   include/linux/iommu.h     |  9 ++++++
> >   3 files changed, 81 insertions(+)
> > 
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index b22034975301..d0ff1a34b1b6 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -39,6 +39,8 @@ enum iommu_dma_cookie_type {
> >   	IOMMU_DMA_MSI_COOKIE,
> >   };
> >   
> > +static DECLARE_IOASID_SET(iommu_dma_pasid);
> > +
> >   struct iommu_dma_cookie {
> >   	enum iommu_dma_cookie_type	type;
> >   	union {
> > @@ -370,6 +372,69 @@ void iommu_put_dma_cookie(struct iommu_domain
> > *domain) domain->iova_cookie = NULL;
> >   }
> >   
> > +/**
> > + * iommu_enable_pasid_dma --Enable in-kernel DMA request with PASID
> > + * @dev:	Device to be enabled
> > + *
> > + * DMA request with PASID will be mapped the same way as the legacy
> > DMA.
> > + * If the device is in pass-through, PASID will also pass-through. If
> > the
> > + * device is in IOVA map, the supervisor PASID will point to the same
> > IOVA
> > + * page table.
> > + *
> > + * @return the kernel PASID to be used for DMA or INVALID_IOASID on
> > failure  
> 
> The comment on the return value should be rephrased according to the
> real code.
> 
yes, will do.

> > + */
> > +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid)
> > +{
> > +	struct iommu_domain *dom;
> > +	ioasid_t id, max;
> > +	int ret;
> > +
> > +	dom = iommu_get_domain_for_dev(dev);
> > +	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
> > +		return -ENODEV;
> > +	max = iommu_get_dev_pasid_max(dev);
> > +	if (!max)
> > +		return -EINVAL;
> > +
> > +	id = ioasid_alloc(&iommu_dma_pasid, 1, max, dev);
> > +	if (id == INVALID_IOASID)
> > +		return -ENOMEM;
> > +
> > +	ret = dom->ops->attach_dev_pasid(dom, dev, id);
> > +	if (ret) {
> > +		ioasid_put(id);
> > +		return ret;
> > +	}
> > +	*pasid = id;
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(iommu_enable_pasid_dma);
> > +
> > +/**
> > + * iommu_disable_pasid_dma --Disable in-kernel DMA request with PASID
> > + * @dev:	Device's PASID DMA to be disabled
> > + *
> > + * It is the device driver's responsibility to ensure no more incoming
> > DMA
> > + * requests with the kernel PASID before calling this function. IOMMU
> > driver
> > + * ensures PASID cache, IOTLBs related to the kernel PASID are cleared
> > and
> > + * drained.
> > + *
> > + * @return 0 on success or error code on failure  
> 
> Ditto.
> 
same

> > + */
> > +void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid)
> > +{
> > +	struct iommu_domain *dom;
> > +
> > +	/* TODO: check the given PASID is within the ioasid_set */
> > +	dom = iommu_get_domain_for_dev(dev);
> > +	if (!dom->ops->detach_dev_pasid)
> > +		return;
> > +	dom->ops->detach_dev_pasid(dom, dev, pasid);
> > +	ioasid_put(pasid);
> > +}
> > +EXPORT_SYMBOL(iommu_disable_pasid_dma);
> > +
> >   /**
> >    * iommu_dma_get_resv_regions - Reserved region driver helper
> >    * @dev: Device from iommu_get_resv_regions()
> > diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> > index 24607dc3c2ac..e6cb9b52a420 100644
> > --- a/include/linux/dma-iommu.h
> > +++ b/include/linux/dma-iommu.h
> > @@ -18,6 +18,13 @@ int iommu_get_dma_cookie(struct iommu_domain
> > *domain); int iommu_get_msi_cookie(struct iommu_domain *domain,
> > dma_addr_t base); void iommu_put_dma_cookie(struct iommu_domain
> > *domain); 
> > +/*
> > + * For devices that can do DMA request with PASID, setup a system
> > PASID.
> > + * Address modes (IOVA, PA) are selected by the platform code.
> > + */  
> 
> No need for a comment here. Or move it to the function if need.
> 
right, this comment is obsolete. will remove.

> > +int iommu_enable_pasid_dma(struct device *dev, ioasid_t *pasid);
> > +void iommu_disable_pasid_dma(struct device *dev, ioasid_t pasid);
> > +
> >   /* Setup call for arch DMA mapping code */
> >   void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64
> > dma_limit); int iommu_dma_init_fq(struct iommu_domain *domain);
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index fde5b933dbe3..fb011722e4f8 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -395,6 +395,15 @@ static inline void iommu_set_dev_pasid_max(struct
> > device *dev, 
> >   	param->pasid_max = max;
> >   }
> > +static inline ioasid_t iommu_get_dev_pasid_max(struct device *dev)
> > +{
> > +	struct dev_iommu *param = dev->iommu;
> > +
> > +	if (WARN_ON(!param))
> > +		return 0;
> > +
> > +	return param->pasid_max;
> > +}
> >   
> >   int iommu_device_register(struct iommu_device *iommu,
> >   			  const struct iommu_ops *ops,  
> 
> Best regards,
> baolu


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 6/8] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID
  2022-03-18  6:10   ` Tian, Kevin
@ 2022-03-29 17:39     ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-29 17:39 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams,
	Dan J, Liu, Yi L, jacob.jun.pan

Hi Kevin,

On Fri, 18 Mar 2022 06:10:40 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Tuesday, March 15, 2022 1:07 PM
> > 
> > The current in-kernel supervisor PASID support is based on the SVM/SVA
> > machinery in SVA lib. The binding between a kernel PASID and kernel
> > mapping has many flaws. See discussions in the link below.
> > 
> > This patch enables in-kernel DMA by switching from SVA lib to the
> > standard DMA mapping APIs. Since both DMA requests with and without
> > PASIDs are mapped identically, there is no change to how DMA APIs are
> > used after the kernel PASID is enabled.
> > 
> > Link: https://lore.kernel.org/linux-
> > iommu/20210511194726.GP1002214@nvidia.com/
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> >  drivers/dma/idxd/idxd.h  |  1 -
> >  drivers/dma/idxd/init.c  | 34 +++++++++-------------------------
> >  drivers/dma/idxd/sysfs.c |  7 -------
> >  3 files changed, 9 insertions(+), 33 deletions(-)
> > 
> > diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
> > index da72eb15f610..a09ab4a6e1c1 100644
> > --- a/drivers/dma/idxd/idxd.h
> > +++ b/drivers/dma/idxd/idxd.h
> > @@ -276,7 +276,6 @@ struct idxd_device {
> >  	struct idxd_wq **wqs;
> >  	struct idxd_engine **engines;
> > 
> > -	struct iommu_sva *sva;
> >  	unsigned int pasid;
> > 
> >  	int num_groups;
> > diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> > index 08a5f4310188..5d1f8dd4abf6 100644
> > --- a/drivers/dma/idxd/init.c
> > +++ b/drivers/dma/idxd/init.c
> > @@ -16,6 +16,7 @@
> >  #include <linux/idr.h>
> >  #include <linux/intel-svm.h>
> >  #include <linux/iommu.h>
> > +#include <linux/dma-iommu.h>
> >  #include <uapi/linux/idxd.h>
> >  #include <linux/dmaengine.h>
> >  #include "../dmaengine.h"
> > @@ -466,36 +467,22 @@ static struct idxd_device *idxd_alloc(struct
> > pci_dev *pdev, struct idxd_driver_d
> > 
> >  static int idxd_enable_system_pasid(struct idxd_device *idxd)  
> 
> idxd_enable_pasid_dma() since system pasid is a confusing term now?
> Or just remove the idxd specific wrappers and have the caller to call
> iommu_enable_pasid_dma() directly given the simple logic here.
> 
agreed, will do.

> >  {
> > -	int flags;
> > -	unsigned int pasid;
> > -	struct iommu_sva *sva;
> > +	u32 pasid;
> > +	int ret;
> > 
> > -	flags = SVM_FLAG_SUPERVISOR_MODE;
> > -
> > -	sva = iommu_sva_bind_device(&idxd->pdev->dev, NULL, &flags);
> > -	if (IS_ERR(sva)) {
> > -		dev_warn(&idxd->pdev->dev,
> > -			 "iommu sva bind failed: %ld\n", PTR_ERR(sva));
> > -		return PTR_ERR(sva);
> > -	}
> > -
> > -	pasid = iommu_sva_get_pasid(sva);
> > -	if (pasid == IOMMU_PASID_INVALID) {
> > -		iommu_sva_unbind_device(sva);
> > -		return -ENODEV;
> > +	ret = iommu_enable_pasid_dma(&idxd->pdev->dev, &pasid);
> > +	if (ret) {
> > +		dev_err(&idxd->pdev->dev, "No DMA PASID %d\n", ret);
> > +		return ret;
> >  	}
> > -
> > -	idxd->sva = sva;
> >  	idxd->pasid = pasid;
> > -	dev_dbg(&idxd->pdev->dev, "system pasid: %u\n", pasid);
> > +
> >  	return 0;
> >  }
> > 
> >  static void idxd_disable_system_pasid(struct idxd_device *idxd)
> >  {
> > -
> > -	iommu_sva_unbind_device(idxd->sva);
> > -	idxd->sva = NULL;
> > +	iommu_disable_pasid_dma(&idxd->pdev->dev, idxd->pasid);
> >  }
> > 
> >  static int idxd_probe(struct idxd_device *idxd)
> > @@ -524,10 +511,7 @@ static int idxd_probe(struct idxd_device *idxd)
> >  		} else {
> >  			dev_warn(dev, "Unable to turn on SVA
> > feature.\n"); }
> > -	} else if (!sva) {
> > -		dev_warn(dev, "User forced SVA off via module
> > param.\n");  
> 
> why removing above 2 lines? they are related to a module param thus
> not affected by the logic in this series.
> 
This should be in a separate patch. I consulted with Dave, sva module param
is not needed anymore.
Thanks for pointing it out.

> >  	}
> > -
> >  	idxd_read_caps(idxd);
> >  	idxd_read_table_offsets(idxd);
> > 
> > diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
> > index 7e19ab92b61a..fde6656695ba 100644
> > --- a/drivers/dma/idxd/sysfs.c
> > +++ b/drivers/dma/idxd/sysfs.c
> > @@ -839,13 +839,6 @@ static ssize_t wq_name_store(struct device *dev,
> >  	if (strlen(buf) > WQ_NAME_SIZE || strlen(buf) == 0)
> >  		return -EINVAL;
> > 
> > -	/*
> > -	 * This is temporarily placed here until we have SVM support
> > for
> > -	 * dmaengine.
> > -	 */
> > -	if (wq->type == IDXD_WQT_KERNEL && device_pasid_enabled(wq-  
> > >idxd))  
> > -		return -EOPNOTSUPP;
> > -
> >  	memset(wq->name, 0, WQ_NAME_SIZE + 1);
> >  	strncpy(wq->name, buf, WQ_NAME_SIZE);
> >  	strreplace(wq->name, '\n', '\0');
> > --
> > 2.25.1  
> 


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 7/8] iommu/vt-d: Delete supervisor/kernel SVA
  2022-03-18  6:16   ` Tian, Kevin
@ 2022-03-29 17:42     ` Jacob Pan
  0 siblings, 0 replies; 61+ messages in thread
From: Jacob Pan @ 2022-03-29 17:42 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: iommu, LKML, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Lu Baolu, Jean-Philippe Brucker, Pan, Jacob jun, Raj, Ashok,
	Kumar, Sanjay K, Jiang, Dave, Luck, Tony, Zanussi, Tom, Williams,
	Dan J, Liu, Yi L, jacob.jun.pan

Hi Kevin,

On Fri, 18 Mar 2022 06:16:58 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Tuesday, March 15, 2022 1:07 PM
> > 
> > In-kernel DMA with PASID should use DMA API now, remove supervisor
> > PASID
> > SVA support. Remove special cases in bind mm and page request service.
> > 
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>  
> 
> so you removed all the references to SVM_FLAG_SUPERVISOR_MODE
> but the definition is still kept in include/linux/intel-svm.h...
> 
Good catch, will remove.

> > ---
> >  drivers/iommu/intel/svm.c | 42 ++++++++-------------------------------
> >  1 file changed, 8 insertions(+), 34 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> > index 2c53689da461..37d6218f173b 100644
> > --- a/drivers/iommu/intel/svm.c
> > +++ b/drivers/iommu/intel/svm.c
> > @@ -516,11 +516,10 @@ static void intel_svm_free_pasid(struct mm_struct
> > *mm)
> > 
> >  static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
> >  					   struct device *dev,
> > -					   struct mm_struct *mm,
> > -					   unsigned int flags)
> > +					   struct mm_struct *mm)
> >  {
> >  	struct device_domain_info *info = get_domain_info(dev);
> > -	unsigned long iflags, sflags;
> > +	unsigned long iflags, sflags = 0;
> >  	struct intel_svm_dev *sdev;
> >  	struct intel_svm *svm;
> >  	int ret = 0;
> > @@ -533,16 +532,13 @@ static struct iommu_sva
> > *intel_svm_bind_mm(struct intel_iommu *iommu,
> > 
> >  		svm->pasid = mm->pasid;
> >  		svm->mm = mm;
> > -		svm->flags = flags;
> >  		INIT_LIST_HEAD_RCU(&svm->devs);
> > 
> > -		if (!(flags & SVM_FLAG_SUPERVISOR_MODE)) {
> > -			svm->notifier.ops = &intel_mmuops;
> > -			ret = mmu_notifier_register(&svm->notifier,
> > mm);
> > -			if (ret) {
> > -				kfree(svm);
> > -				return ERR_PTR(ret);
> > -			}
> > +		svm->notifier.ops = &intel_mmuops;
> > +		ret = mmu_notifier_register(&svm->notifier, mm);
> > +		if (ret) {
> > +			kfree(svm);
> > +			return ERR_PTR(ret);
> >  		}
> > 
> >  		ret = pasid_private_add(svm->pasid, svm);
> > @@ -583,8 +579,6 @@ static struct iommu_sva *intel_svm_bind_mm(struct
> > intel_iommu *iommu,
> >  	}
> > 
> >  	/* Setup the pasid table: */
> > -	sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ?
> > -			PASID_FLAG_SUPERVISOR_MODE : 0;
> >  	sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ?
> > PASID_FLAG_FL5LP : 0;
> >  	spin_lock_irqsave(&iommu->lock, iflags);
> >  	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm-  
> > >pasid,  
> > @@ -957,7 +951,7 @@ static irqreturn_t prq_event_thread(int irq, void
> > *d)
> >  			 * to unbind the mm while any page faults are
> > outstanding.
> >  			 */
> >  			svm = pasid_private_find(req->pasid);
> > -			if (IS_ERR_OR_NULL(svm) || (svm->flags &
> > SVM_FLAG_SUPERVISOR_MODE))
> > +			if (IS_ERR_OR_NULL(svm))
> >  				goto bad_req;
> >  		}
> > 
> > @@ -1011,29 +1005,9 @@ static irqreturn_t prq_event_thread(int irq, void
> > *d)
> >  struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct
> > *mm, void *drvdata)
> >  {
> >  	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
> > -	unsigned int flags = 0;
> >  	struct iommu_sva *sva;
> >  	int ret;
> > 
> > -	if (drvdata)
> > -		flags = *(unsigned int *)drvdata;
> > -
> > -	if (flags & SVM_FLAG_SUPERVISOR_MODE) {
> > -		if (!ecap_srs(iommu->ecap)) {
> > -			dev_err(dev, "%s: Supervisor PASID not
> > supported\n",
> > -				iommu->name);
> > -			return ERR_PTR(-EOPNOTSUPP);
> > -		}
> > -
> > -		if (mm) {
> > -			dev_err(dev, "%s: Supervisor PASID with user
> > provided mm\n",
> > -				iommu->name);
> > -			return ERR_PTR(-EINVAL);
> > -		}
> > -
> > -		mm = &init_mm;
> > -	}
> > -
> >  	mutex_lock(&pasid_mutex);
> >  	ret = intel_svm_alloc_pasid(dev, mm, flags);
> >  	if (ret) {
> > --
> > 2.25.1  
> 


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2022-03-29 17:39 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-15  5:07 [PATCH v2 0/8] Enable PASID for DMA API users Jacob Pan
2022-03-15  5:07 ` [PATCH v2 1/8] iommu: Assign per device max PASID Jacob Pan
2022-03-15  5:07 ` [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops Jacob Pan
2022-03-15 10:24   ` Tian, Kevin
2022-03-15 11:26   ` Jean-Philippe Brucker
2022-03-15 11:49     ` Tian, Kevin
2022-03-15 16:11       ` Jacob Pan
2022-03-18 12:01       ` Lu Baolu
2022-03-18 13:50         ` Jason Gunthorpe
2022-03-18 11:52   ` Lu Baolu
2022-03-18 13:48     ` Jason Gunthorpe
2022-03-15  5:07 ` [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops Jacob Pan
2022-03-15 10:33   ` Tian, Kevin
2022-03-15 22:23     ` Jacob Pan
2022-03-15 14:33   ` Jason Gunthorpe
2022-03-15 22:36     ` Jacob Pan
2022-03-15 23:04       ` Jason Gunthorpe
2022-03-16 20:50         ` Jacob Pan
2022-03-16 22:15           ` Jason Gunthorpe
2022-03-16 22:23             ` Luck, Tony
2022-03-17  0:04               ` Jason Gunthorpe
2022-03-18  5:47                 ` Tian, Kevin
2022-03-18 13:47                   ` Jason Gunthorpe
2022-03-17  0:49             ` Jacob Pan
2022-03-17 13:23               ` Jason Gunthorpe
2022-03-17 18:23                 ` Jacob Pan
2022-03-16  7:41     ` Tian, Kevin
2022-03-16 21:01       ` Jacob Pan
2022-03-18  5:33         ` Tian, Kevin
2022-03-28 21:41           ` Jacob Pan
2022-03-16  7:39   ` Tian, Kevin
2022-03-16 20:51     ` Jacob Pan
2022-03-15  5:07 ` [PATCH v2 4/8] iommu/vt-d: Use device_pasid attach op for RID2PASID Jacob Pan
2022-03-16  7:54   ` Tian, Kevin
2022-03-17 20:45     ` Jacob Pan
2022-03-15  5:07 ` [PATCH v2 5/8] iommu: Add PASID support for DMA mapping API users Jacob Pan
2022-03-15 11:16   ` Robin Murphy
2022-03-15 14:22     ` Jason Gunthorpe
2022-03-15 16:31       ` Jacob Pan
2022-03-15 17:05         ` Jason Gunthorpe
2022-03-15 21:24           ` Jacob Pan
2022-03-16 10:32             ` Tian, Kevin
2022-03-16  8:41       ` Tian, Kevin
2022-03-16 14:07         ` Jason Gunthorpe
2022-03-15 14:35   ` Jason Gunthorpe
2022-03-15 16:38     ` Jacob Pan
2022-03-15 23:05       ` Jason Gunthorpe
2022-03-18 12:43   ` Lu Baolu
2022-03-28 21:44     ` Jacob Pan
2022-03-15  5:07 ` [PATCH v2 6/8] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID Jacob Pan
2022-03-18  6:10   ` Tian, Kevin
2022-03-29 17:39     ` Jacob Pan
2022-03-15  5:07 ` [PATCH v2 7/8] iommu/vt-d: Delete supervisor/kernel SVA Jacob Pan
2022-03-18  6:16   ` Tian, Kevin
2022-03-29 17:42     ` Jacob Pan
2022-03-15  5:07 ` [PATCH v2 8/8] iommu: Remove unused driver data in sva_bind_device Jacob Pan
2022-03-15 11:37   ` Jean-Philippe Brucker
2022-03-15  5:07 ` [PATCH v2 9/9] dmaengine: idxd: separate user and kernel pasid enabling Jacob Pan
2022-03-18  6:28   ` Tian, Kevin
2022-03-15  8:16 ` [PATCH v2 0/8] Enable PASID for DMA API users Tian, Kevin
2022-03-15 15:49   ` Jacob Pan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).