dmaengine.vger.kernel.org archive mirror
* [PATCH v3 0/4] Enable PASID for DMA API users
@ 2022-05-10 21:07 Jacob Pan
  2022-05-10 21:07 ` [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid Jacob Pan
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Jacob Pan @ 2022-05-10 21:07 UTC (permalink / raw)
  To: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, Jason Gunthorpe, vkoul,
	robin.murphy, will
  Cc: Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger, Jacob Pan

Some modern accelerators, such as Intel's Data Streaming Accelerator (DSA),
require a PASID in DMA requests to be operational. Specifically, work
submissions with ENQCMD on shared work queues require PASIDs. The use cases
include both user DMA with shared virtual addressing (SVA) and in-kernel
DMA similar to legacy DMA without PASID. Here we address the latter.

The DMA mapping API is the de facto standard for in-kernel DMA. However, it
operates on a per-device or Requester ID (RID) basis and is not
PASID-aware. To let devices that rely on PASIDs use the DMA API, this
patchset introduces the following APIs:

1. A driver-facing API that enables DMA API PASID usage:
iommu_attach_dma_pasid(struct device *dev, ioasid_t *pasid);

2. An IOMMU op that allows attaching device-domain-PASID generically (will
be used beyond DMA API PASID support)
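
For reference, the op pair as implemented by VT-d in patch 1 has this
shape (the prototypes below are inferred from that implementation):

	int (*attach_dev_pasid)(struct iommu_domain *domain,
				struct device *dev, ioasid_t pasid);
	void (*detach_dev_pasid)(struct iommu_domain *domain,
				 struct device *dev, ioasid_t pasid);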

Once the kernel PASID is enabled and attached to the appropriate IOMMU
domain, device drivers can continue to use the DMA APIs as-is. There is no
difference in the returned dma_handle between DMA with and without PASID.
The DMA mapping performed by the IOMMU is identical for both kinds of
requests, be it IOVA or, in the pass-through case, physical address.
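
For illustration, a device driver would use the new APIs roughly as
follows. This is only a sketch: foo_setup_dma() and its parameters are
made up; only iommu_attach_dma_pasid()/iommu_detach_dma_pasid() from this
set and the existing DMA mapping calls are real.

static int foo_setup_dma(struct device *dev, void *desc, size_t len,
			 ioasid_t *pasid, dma_addr_t *desc_dma)
{
	int ret;

	/* Get a kernel PASID backed by the device's default DMA domain */
	ret = iommu_attach_dma_pasid(dev, pasid);
	if (ret)
		return ret;

	/* DMA mapping is unchanged; the handle is valid with or without PASID */
	*desc_dma = dma_map_single(dev, desc, len, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, *desc_dma)) {
		iommu_detach_dma_pasid(dev);
		return -ENOMEM;
	}

	/* Program *pasid into the device, e.g. for ENQCMD on a shared WQ */
	return 0;
}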

In addition, this set converts the DSA driver's in-kernel DMA with PASID
from the SVA lib to the DMA API. There have been security and functional
issues with the kernel SVA approach:
(https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/)
The highlights are as follows:
 - The lack of IOTLB synchronization upon kernel page table updates.
   (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC, etc.)
 - Other than slightly more protection, using kernel virtual addresses (KVA)
   has little advantage over physical addresses. There are also no use cases
   yet where DMA engines need kernel virtual addresses for in-kernel DMA.

Subsequently, cleanup is done around the usage of sva_bind_device() for
in-kernel DMA: the special-casing code in the VT-d driver is removed and
the SVA lib API is tightened.

This work and the ideas behind it are a collaboration with many people;
many thanks to Baolu Lu, Jason Gunthorpe, Dave Jiang, and others.


ChangeLog:
v3
	- Rebased on "Baolu's SVA and IOPF refactoring" series v5.
	(https://github.com/LuBaolu/intel-iommu/commits/iommu-sva-refactoring-v5)
	This version is significantly simplified by leveraging IOMMU domain
	ops; the attach_dev_pasid() op is implemented differently on a DMA
	domain than on an SVA domain.
	We currently have no need to support multiple PASIDs per DMA domain.
	(https://lore.kernel.org/lkml/20220315142216.GV11336@nvidia.com/)
	Removed the PASID-device list from v2; a PASID field is introduced
	to struct iommu_domain instead. It is intended for DMA requests with
	PASID by all devices attached to the domain.

v2
	- Do not reserve a special PASID for DMA API usage. Use IOASID
	  allocation instead.
	- Introduced a generic device-pasid-domain attachment IOMMU op.
	  Replaced the DMA API only IOMMU op.
	- Removed supervisor SVA support in VT-d
	- Removed unused sva_bind_device parameters
	- Use IOMMU specific data instead of struct device to store PASID
	  info


Jacob Pan (4):
  iommu/vt-d: Implement domain ops for attach_dev_pasid
  iommu: Add PASID support for DMA mapping API users
  dmaengine: idxd: Use DMA API for in-kernel DMA with PASID
  iommu/vt-d: Delete unused SVM flag

 drivers/dma/idxd/idxd.h     |   1 -
 drivers/dma/idxd/init.c     |  34 +++---------
 drivers/dma/idxd/sysfs.c    |   7 ---
 drivers/iommu/dma-iommu.c   | 107 ++++++++++++++++++++++++++++++++++++
 drivers/iommu/intel/iommu.c |  81 ++++++++++++++++++++++++++-
 drivers/iommu/intel/svm.c   |   2 +-
 include/linux/dma-iommu.h   |   3 +
 include/linux/intel-iommu.h |   1 +
 include/linux/intel-svm.h   |  13 -----
 include/linux/iommu.h       |   2 +
 10 files changed, 202 insertions(+), 49 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-10 21:07 [PATCH v3 0/4] Enable PASID for DMA API users Jacob Pan
@ 2022-05-10 21:07 ` Jacob Pan
  2022-05-10 23:21   ` Jason Gunthorpe
  2022-05-10 21:07 ` [PATCH v3 2/4] iommu: Add PASID support for DMA mapping API users Jacob Pan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Jacob Pan @ 2022-05-10 21:07 UTC (permalink / raw)
  To: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, Jason Gunthorpe, vkoul,
	robin.murphy, will
  Cc: Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger, Jacob Pan

On VT-d platforms with scalable mode enabled, devices that issue DMA
requests with PASID need to attach those PASIDs to a given IOMMU domain.
The attach operation involves the following:
- Programming the PASID into the device's PASID table
- Tracking the device, domain, and PASID relationship
- Managing IOTLB and device TLB invalidations

This patch adds attach_dev_pasid functions to the default domain ops, which
are used by the DMA and identity domain types. They could be extended to
support other domain types whenever necessary.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 81 ++++++++++++++++++++++++++++++++++++-
 include/linux/intel-iommu.h |  1 +
 2 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index a51b96fa9b3a..5408418f4f4b 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1562,6 +1562,10 @@ static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
 
 	sid = info->bus << 8 | info->devfn;
 	qdep = info->ats_qdep;
+	if (info->pasid) {
+		qi_flush_dev_iotlb_pasid(info->iommu, sid, info->pfsid,
+					 info->pasid, qdep, addr, mask);
+	}
 	qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
 			   qdep, addr, mask);
 }
@@ -1591,6 +1595,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 	unsigned int mask = ilog2(aligned_pages);
 	uint64_t addr = (uint64_t)pfn << VTD_PAGE_SHIFT;
 	u16 did = domain->iommu_did[iommu->seq_id];
+	struct iommu_domain *iommu_domain = &domain->domain;
 
 	BUG_ON(pages == 0);
 
@@ -1599,6 +1604,9 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 
 	if (domain_use_first_level(domain)) {
 		qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr, pages, ih);
+		/* flush additional kernel DMA PASIDs attached */
+		if (iommu_domain->pasid)
+			qi_flush_piotlb(iommu, did, iommu_domain->pasid, addr, pages, ih);
 	} else {
 		unsigned long bitmask = aligned_pages - 1;
 
@@ -4265,10 +4273,13 @@ static void __dmar_remove_one_dev_info(struct device_domain_info *info)
 	domain = info->domain;
 
 	if (info->dev && !dev_is_real_dma_subdevice(info->dev)) {
-		if (dev_is_pci(info->dev) && sm_supported(iommu))
+		if (dev_is_pci(info->dev) && sm_supported(iommu)) {
 			intel_pasid_tear_down_entry(iommu, info->dev,
 					PASID_RID2PASID, false);
-
+			if (info->pasid)
+				intel_pasid_tear_down_entry(iommu, info->dev,
+							    info->pasid, false);
+		}
 		iommu_disable_dev_iotlb(info);
 		domain_context_clear(info);
 		intel_pasid_free_table(info->dev);
@@ -4912,6 +4923,70 @@ static void intel_iommu_iotlb_sync_map(struct iommu_domain *domain,
 	}
 }
 
+static int intel_iommu_attach_dev_pasid(struct iommu_domain *domain,
+					struct device *dev,
+					ioasid_t pasid)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	struct intel_iommu *iommu = info->iommu;
+	unsigned long flags;
+	int ret = 0;
+
+	if (!sm_supported(iommu) || !info)
+		return -ENODEV;
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	/*
+	 * If the same device already has a PASID attached, just return.
+	 * DMA layer will return the PASID value to the caller.
+	 */
+	if (pasid != PASID_RID2PASID && info->pasid) {
+		if (info->pasid == pasid)
+			ret = 0;
+		else {
+			dev_warn(dev, "Cannot attach PASID %u, %u already attached\n",
+				 pasid, info->pasid);
+			ret = -EBUSY;
+		}
+		goto out_unlock_domain;
+	}
+
+	spin_lock(&iommu->lock);
+	if (hw_pass_through && domain_type_is_si(dmar_domain))
+		ret = intel_pasid_setup_pass_through(iommu, dmar_domain,
+						     dev, pasid);
+	else if (domain_use_first_level(dmar_domain))
+		ret = domain_setup_first_level(iommu, dmar_domain,
+					       dev, pasid);
+	else
+		ret = intel_pasid_setup_second_level(iommu, dmar_domain,
+						     dev, pasid);
+
+	spin_unlock(&iommu->lock);
+out_unlock_domain:
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+	if (!ret)
+		info->pasid = pasid;
+
+	return ret;
+}
+
+static void intel_iommu_detach_dev_pasid(struct iommu_domain *domain,
+					struct device *dev,
+					ioasid_t pasid)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct intel_iommu *iommu = info->iommu;
+	unsigned long flags;
+
+	WARN_ON(info->pasid != pasid);
+	spin_lock_irqsave(&iommu->lock, flags);
+	intel_pasid_tear_down_entry(iommu, dev, pasid, false);
+	info->pasid = 0;
+	spin_unlock_irqrestore(&iommu->lock, flags);
+}
+
 const struct iommu_ops intel_iommu_ops = {
 	.capable		= intel_iommu_capable,
 	.domain_alloc		= intel_iommu_domain_alloc,
@@ -4940,6 +5015,8 @@ const struct iommu_ops intel_iommu_ops = {
 		.iova_to_phys		= intel_iommu_iova_to_phys,
 		.free			= intel_iommu_domain_free,
 		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
+		.attach_dev_pasid	= intel_iommu_attach_dev_pasid,
+		.detach_dev_pasid	= intel_iommu_detach_dev_pasid,
 	}
 };
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 5af24befc9f1..55845a8c4f4d 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -627,6 +627,7 @@ struct device_domain_info {
 	struct intel_iommu *iommu; /* IOMMU used by this device */
 	struct dmar_domain *domain; /* pointer to domain */
 	struct pasid_table *pasid_table; /* pasid table */
+	ioasid_t pasid; /* DMA request with PASID */
 };
 
 static inline void __iommu_flush_cache(
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 2/4] iommu: Add PASID support for DMA mapping API users
  2022-05-10 21:07 [PATCH v3 0/4] Enable PASID for DMA API users Jacob Pan
  2022-05-10 21:07 ` [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid Jacob Pan
@ 2022-05-10 21:07 ` Jacob Pan
  2022-05-10 23:28   ` Jason Gunthorpe
  2022-05-10 21:07 ` [PATCH v3 3/4] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID Jacob Pan
  2022-05-10 21:07 ` [PATCH v3 4/4] iommu/vt-d: Delete unused SVM flag Jacob Pan
  3 siblings, 1 reply; 22+ messages in thread
From: Jacob Pan @ 2022-05-10 21:07 UTC (permalink / raw)
  To: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, Jason Gunthorpe, vkoul,
	robin.murphy, will
  Cc: Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger, Jacob Pan

DMA mapping API is the de facto standard for in-kernel DMA. It operates
on a per device/RID basis which is not PASID-aware.

For some modern devices, such as the Intel Data Streaming Accelerator, a
PASID is required for certain work submissions. To allow such devices to
use the DMA mapping API, we need the following functionalities:
1. Provide the device a way to retrieve a PASID for work submission within
the kernel
2. Enable the kernel PASID on the IOMMU for the device
3. Attach the kernel PASID to the device's default DMA domain, be it IOVA
or physical address in the pass-through case.

This patch introduces a driver-facing API that enables DMA API PASID
usage. Once enabled, device drivers can continue to use the DMA APIs as-is.
There is no difference in the dma_handle between DMA with and without
PASID.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/dma-iommu.c | 107 ++++++++++++++++++++++++++++++++++++++
 include/linux/dma-iommu.h |   3 ++
 include/linux/iommu.h     |   2 +
 3 files changed, 112 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 1ca85d37eeab..5984f3129fa2 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -34,6 +34,8 @@ struct iommu_dma_msi_page {
 	phys_addr_t		phys;
 };
 
+static DECLARE_IOASID_SET(iommu_dma_pasid);
+
 enum iommu_dma_cookie_type {
 	IOMMU_DMA_IOVA_COOKIE,
 	IOMMU_DMA_MSI_COOKIE,
@@ -370,6 +372,111 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 	domain->iova_cookie = NULL;
 }
 
+/**
+ * iommu_attach_dma_pasid --Attach a PASID for in-kernel DMA. Use the device's
+ * DMA domain.
+ * @dev: Device to be enabled
+ * @pasid: The returned kernel PASID to be used for DMA
+ *
+ * DMA request with PASID will be mapped the same way as the legacy DMA.
+ * If the device is in pass-through, PASID will also pass-through. If the
+ * device is in IOVA, the PASID will point to the same IOVA page table.
+ *
+ * @return err code or 0 on success
+ */
+int iommu_attach_dma_pasid(struct device *dev, ioasid_t *pasid)
+{
+	struct iommu_domain *dom;
+	ioasid_t id, max;
+	int ret = 0;
+
+	dom = iommu_get_domain_for_dev(dev);
+	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
+		return -ENODEV;
+
+	/* Only support domain types that DMA API can be used */
+	if (dom->type == IOMMU_DOMAIN_UNMANAGED ||
+	    dom->type == IOMMU_DOMAIN_BLOCKED) {
+		dev_warn(dev, "Invalid domain type %d", dom->type);
+		return -EPERM;
+	}
+
+	id = dom->pasid;
+	if (!id) {
+		/*
+		 * First device to use PASID in its DMA domain, allocate
+		 * a single PASID per DMA domain is all we need, it is also
+		 * good for performance when it comes down to IOTLB flush.
+		 */
+		max = 1U << dev->iommu->pasid_bits;
+		if (!max)
+			return -EINVAL;
+
+		id = ioasid_alloc(&iommu_dma_pasid, 1, max, dev);
+		if (id == INVALID_IOASID)
+			return -ENOMEM;
+
+		dom->pasid = id;
+		atomic_set(&dom->pasid_users, 1);
+	}
+
+	ret = dom->ops->attach_dev_pasid(dom, dev, id);
+	if (!ret) {
+		*pasid = id;
+		atomic_inc(&dom->pasid_users);
+		return 0;
+	}
+
+	if (atomic_dec_and_test(&dom->pasid_users)) {
+		ioasid_free(id);
+		dom->pasid = 0;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(iommu_attach_dma_pasid);
+
+/**
+ * iommu_detach_dma_pasid --Disable in-kernel DMA request with PASID
+ * @dev:	Device's PASID DMA to be disabled
+ *
+ * It is the device driver's responsibility to ensure no more incoming DMA
+ * requests with the kernel PASID before calling this function. IOMMU driver
+ * ensures PASID cache, IOTLBs related to the kernel PASID are cleared and
+ * drained.
+ *
+ */
+void iommu_detach_dma_pasid(struct device *dev)
+{
+	struct iommu_domain *dom;
+	ioasid_t pasid;
+
+	dom = iommu_get_domain_for_dev(dev);
+	if (!dom || !dom->ops || !dom->ops->detach_dev_pasid) {
+		dev_warn(dev, "No ops for detaching PASID %u", pasid);
+		return;
+	}
+	/* Only support DMA API managed domain type */
+	if (dom->type == IOMMU_DOMAIN_UNMANAGED ||
+	    dom->type == IOMMU_DOMAIN_BLOCKED) {
+		dev_err(dev, "Invalid domain type %d to detach DMA PASID %u\n",
+			 dom->type, pasid);
+		return;
+	}
+
+	pasid = dom->pasid;
+	if (!pasid) {
+		dev_err(dev, "No DMA PASID attached\n");
+		return;
+	}
+	dom->ops->detach_dev_pasid(dom, dev, pasid);
+	if (atomic_dec_and_test(&dom->pasid_users)) {
+		ioasid_free(pasid);
+		dom->pasid = 0;
+	}
+}
+EXPORT_SYMBOL(iommu_detach_dma_pasid);
+
 /**
  * iommu_dma_get_resv_regions - Reserved region driver helper
  * @dev: Device from iommu_get_resv_regions()
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 24607dc3c2ac..538650b9cb75 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -18,6 +18,9 @@ int iommu_get_dma_cookie(struct iommu_domain *domain);
 int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
 void iommu_put_dma_cookie(struct iommu_domain *domain);
 
+int iommu_attach_dma_pasid(struct device *dev, ioasid_t *pasid);
+void iommu_detach_dma_pasid(struct device *dev);
+
 /* Setup call for arch DMA mapping code */
 void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit);
 int iommu_dma_init_fq(struct iommu_domain *domain);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 1164524814cb..281a87fdce77 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -105,6 +105,8 @@ struct iommu_domain {
 	enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault,
 						      void *data);
 	void *fault_data;
+	ioasid_t pasid;		/* Used for DMA requests with PASID */
+	atomic_t pasid_users;
 };
 
 static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 3/4] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID
  2022-05-10 21:07 [PATCH v3 0/4] Enable PASID for DMA API users Jacob Pan
  2022-05-10 21:07 ` [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid Jacob Pan
  2022-05-10 21:07 ` [PATCH v3 2/4] iommu: Add PASID support for DMA mapping API users Jacob Pan
@ 2022-05-10 21:07 ` Jacob Pan
  2022-05-10 21:07 ` [PATCH v3 4/4] iommu/vt-d: Delete unused SVM flag Jacob Pan
  3 siblings, 0 replies; 22+ messages in thread
From: Jacob Pan @ 2022-05-10 21:07 UTC (permalink / raw)
  To: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, Jason Gunthorpe, vkoul,
	robin.murphy, will
  Cc: Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger, Jacob Pan

The current in-kernel supervisor PASID support is based on the SVM/SVA
machinery in the SVA lib. The binding between a kernel PASID and the kernel
mapping has many flaws. See the discussion in the link below.

This patch enables in-kernel DMA with PASID by switching from the SVA lib
to the standard DMA mapping APIs. Since DMA requests with and without
PASIDs are mapped identically, there is no change to how the DMA APIs are
used after the kernel PASID is enabled.

Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/dma/idxd/idxd.h  |  1 -
 drivers/dma/idxd/init.c  | 34 +++++++++-------------------------
 drivers/dma/idxd/sysfs.c |  7 -------
 3 files changed, 9 insertions(+), 33 deletions(-)

diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
index ccbefd0be617..190b08bd7c08 100644
--- a/drivers/dma/idxd/idxd.h
+++ b/drivers/dma/idxd/idxd.h
@@ -277,7 +277,6 @@ struct idxd_device {
 	struct idxd_wq **wqs;
 	struct idxd_engine **engines;
 
-	struct iommu_sva *sva;
 	unsigned int pasid;
 
 	int num_groups;
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index e1b5d1e4a949..e2e1c0eae6d6 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -16,6 +16,7 @@
 #include <linux/idr.h>
 #include <linux/intel-svm.h>
 #include <linux/iommu.h>
+#include <linux/dma-iommu.h>
 #include <uapi/linux/idxd.h>
 #include <linux/dmaengine.h>
 #include "../dmaengine.h"
@@ -466,36 +467,22 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
 
 static int idxd_enable_system_pasid(struct idxd_device *idxd)
 {
-	int flags;
-	unsigned int pasid;
-	struct iommu_sva *sva;
+	u32 pasid;
+	int ret;
 
-	flags = SVM_FLAG_SUPERVISOR_MODE;
-
-	sva = iommu_sva_bind_device(&idxd->pdev->dev, NULL, &flags);
-	if (IS_ERR(sva)) {
-		dev_warn(&idxd->pdev->dev,
-			 "iommu sva bind failed: %ld\n", PTR_ERR(sva));
-		return PTR_ERR(sva);
-	}
-
-	pasid = iommu_sva_get_pasid(sva);
-	if (pasid == IOMMU_PASID_INVALID) {
-		iommu_sva_unbind_device(sva);
-		return -ENODEV;
+	ret = iommu_attach_dma_pasid(&idxd->pdev->dev, &pasid);
+	if (ret) {
+		dev_err(&idxd->pdev->dev, "No DMA PASID %d\n", ret);
+		return ret;
 	}
-
-	idxd->sva = sva;
 	idxd->pasid = pasid;
-	dev_dbg(&idxd->pdev->dev, "system pasid: %u\n", pasid);
+
 	return 0;
 }
 
 static void idxd_disable_system_pasid(struct idxd_device *idxd)
 {
-
-	iommu_sva_unbind_device(idxd->sva);
-	idxd->sva = NULL;
+	iommu_detach_dma_pasid(&idxd->pdev->dev);
 }
 
 static int idxd_probe(struct idxd_device *idxd)
@@ -527,10 +514,7 @@ static int idxd_probe(struct idxd_device *idxd)
 			else
 				set_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
 		}
-	} else if (!sva) {
-		dev_warn(dev, "User forced SVA off via module param.\n");
 	}
-
 	idxd_read_caps(idxd);
 	idxd_read_table_offsets(idxd);
 
diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
index dfd549685c46..a48928973bd4 100644
--- a/drivers/dma/idxd/sysfs.c
+++ b/drivers/dma/idxd/sysfs.c
@@ -839,13 +839,6 @@ static ssize_t wq_name_store(struct device *dev,
 	if (strlen(buf) > WQ_NAME_SIZE || strlen(buf) == 0)
 		return -EINVAL;
 
-	/*
-	 * This is temporarily placed here until we have SVM support for
-	 * dmaengine.
-	 */
-	if (wq->type == IDXD_WQT_KERNEL && device_pasid_enabled(wq->idxd))
-		return -EOPNOTSUPP;
-
 	memset(wq->name, 0, WQ_NAME_SIZE + 1);
 	strncpy(wq->name, buf, WQ_NAME_SIZE);
 	strreplace(wq->name, '\n', '\0');
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 4/4] iommu/vt-d: Delete unused SVM flag
  2022-05-10 21:07 [PATCH v3 0/4] Enable PASID for DMA API users Jacob Pan
                   ` (2 preceding siblings ...)
  2022-05-10 21:07 ` [PATCH v3 3/4] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID Jacob Pan
@ 2022-05-10 21:07 ` Jacob Pan
  3 siblings, 0 replies; 22+ messages in thread
From: Jacob Pan @ 2022-05-10 21:07 UTC (permalink / raw)
  To: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, Jason Gunthorpe, vkoul,
	robin.murphy, will
  Cc: Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger, Jacob Pan

Supervisor PASID for SVA/SVM is no longer supported; delete the unused
flag.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/svm.c |  2 +-
 include/linux/intel-svm.h | 13 -------------
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 38c33cde177e..98ec77415770 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -750,7 +750,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 			 * to unbind the mm while any page faults are outstanding.
 			 */
 			svm = pasid_private_find(req->pasid);
-			if (IS_ERR_OR_NULL(svm) || (svm->flags & SVM_FLAG_SUPERVISOR_MODE))
+			if (IS_ERR_OR_NULL(svm))
 				goto bad_req;
 		}
 
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index b3b125b332aa..6835a665c195 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -13,17 +13,4 @@
 #define PRQ_RING_MASK	((0x1000 << PRQ_ORDER) - 0x20)
 #define PRQ_DEPTH	((0x1000 << PRQ_ORDER) >> 5)
 
-/*
- * The SVM_FLAG_SUPERVISOR_MODE flag requests a PASID which can be used only
- * for access to kernel addresses. No IOTLB flushes are automatically done
- * for kernel mappings; it is valid only for access to the kernel's static
- * 1:1 mapping of physical memory — not to vmalloc or even module mappings.
- * A future API addition may permit the use of such ranges, by means of an
- * explicit IOTLB flush call (akin to the DMA API's unmap method).
- *
- * It is unlikely that we will ever hook into flush_tlb_kernel_range() to
- * do such IOTLB flushes automatically.
- */
-#define SVM_FLAG_SUPERVISOR_MODE	BIT(0)
-
 #endif /* __INTEL_SVM_H__ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-10 21:07 ` [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid Jacob Pan
@ 2022-05-10 23:21   ` Jason Gunthorpe
  2022-05-11  0:23     ` Jacob Pan
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2022-05-10 23:21 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On Tue, May 10, 2022 at 02:07:01PM -0700, Jacob Pan wrote:
> +static int intel_iommu_attach_dev_pasid(struct iommu_domain *domain,
> +					struct device *dev,
> +					ioasid_t pasid)
> +{
> +	struct device_domain_info *info = dev_iommu_priv_get(dev);
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +	struct intel_iommu *iommu = info->iommu;
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	if (!sm_supported(iommu) || !info)
> +		return -ENODEV;
> +
> +	spin_lock_irqsave(&device_domain_lock, flags);
> +	/*
> +	 * If the same device already has a PASID attached, just return.
> +	 * DMA layer will return the PASID value to the caller.
> +	 */
> +	if (pasid != PASID_RID2PASID && info->pasid) {

Why check for PASID == 0 like this? Shouldn't pasid == 0 be rejected
as an invalid argument?

> +		if (info->pasid == pasid)
> +			ret = 0;

Doesn't this need to check that the current domain is the requested
domain as well? How can this happen anyhow - isn't it an error to
double attach?

> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 5af24befc9f1..55845a8c4f4d 100644
> +++ b/include/linux/intel-iommu.h
> @@ -627,6 +627,7 @@ struct device_domain_info {
>  	struct intel_iommu *iommu; /* IOMMU used by this device */
>  	struct dmar_domain *domain; /* pointer to domain */
>  	struct pasid_table *pasid_table; /* pasid table */
> +	ioasid_t pasid; /* DMA request with PASID */

And this seems wrong - the DMA API is not the only user of
attach_dev_pasid, so there should not be any global pasid for the
device.

I suspect this should be a counter of # of pasid domains attached so
that the special flush logic triggers

And rely on the core code to worry about assigning only one domain per
pasid - this should really be a 'set' function.

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 2/4] iommu: Add PASID support for DMA mapping API users
  2022-05-10 21:07 ` [PATCH v3 2/4] iommu: Add PASID support for DMA mapping API users Jacob Pan
@ 2022-05-10 23:28   ` Jason Gunthorpe
  2022-05-11  0:43     ` Jacob Pan
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2022-05-10 23:28 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On Tue, May 10, 2022 at 02:07:02PM -0700, Jacob Pan wrote:
> DMA mapping API is the de facto standard for in-kernel DMA. It operates
> on a per device/RID basis which is not PASID-aware.
> 
> Some modern devices such as Intel Data Streaming Accelerator, PASID is
> required for certain work submissions. To allow such devices use DMA
> mapping API, we need the following functionalities:
> 1. Provide device a way to retrieve a PASID for work submission within
> the kernel
> 2. Enable the kernel PASID on the IOMMU for the device
> 3. Attach the kernel PASID to the device's default DMA domain, let it
> be IOVA or physical address in case of pass-through.
> 
> This patch introduces a driver facing API that enables DMA API
> PASID usage. Once enabled, device drivers can continue to use DMA APIs as
> is. There is no difference in dma_handle between without PASID and with
> PASID.
> 
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>  drivers/iommu/dma-iommu.c | 107 ++++++++++++++++++++++++++++++++++++++
>  include/linux/dma-iommu.h |   3 ++
>  include/linux/iommu.h     |   2 +
>  3 files changed, 112 insertions(+)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 1ca85d37eeab..5984f3129fa2 100644
> +++ b/drivers/iommu/dma-iommu.c
> @@ -34,6 +34,8 @@ struct iommu_dma_msi_page {
>  	phys_addr_t		phys;
>  };
>  
> +static DECLARE_IOASID_SET(iommu_dma_pasid);
> +
>  enum iommu_dma_cookie_type {
>  	IOMMU_DMA_IOVA_COOKIE,
>  	IOMMU_DMA_MSI_COOKIE,
> @@ -370,6 +372,111 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
>  	domain->iova_cookie = NULL;
>  }
>  
> +/**
> + * iommu_attach_dma_pasid --Attach a PASID for in-kernel DMA. Use the device's
> + * DMA domain.
> + * @dev: Device to be enabled
> + * @pasid: The returned kernel PASID to be used for DMA
> + *
> + * DMA request with PASID will be mapped the same way as the legacy DMA.
> + * If the device is in pass-through, PASID will also pass-through. If the
> + * device is in IOVA, the PASID will point to the same IOVA page table.
> + *
> + * @return err code or 0 on success
> + */
> +int iommu_attach_dma_pasid(struct device *dev, ioasid_t *pasid)
> +{
> +	struct iommu_domain *dom;
> +	ioasid_t id, max;
> +	int ret = 0;
> +
> +	dom = iommu_get_domain_for_dev(dev);
> +	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
> +		return -ENODEV;
> +
> +	/* Only support domain types that DMA API can be used */
> +	if (dom->type == IOMMU_DOMAIN_UNMANAGED ||
> +	    dom->type == IOMMU_DOMAIN_BLOCKED) {
> +		dev_warn(dev, "Invalid domain type %d", dom->type);

This should be a WARN_ON

> +		return -EPERM;
> +	}
> +
> +	id = dom->pasid;
> +	if (!id) {
> +		/*
> +		 * First device to use PASID in its DMA domain, allocate
> +		 * a single PASID per DMA domain is all we need, it is also
> +		 * good for performance when it comes down to IOTLB flush.
> +		 */
> +		max = 1U << dev->iommu->pasid_bits;
> +		if (!max)
> +			return -EINVAL;
> +
> +		id = ioasid_alloc(&iommu_dma_pasid, 1, max, dev);
> +		if (id == INVALID_IOASID)
> +			return -ENOMEM;
> +
> +		dom->pasid = id;
> +		atomic_set(&dom->pasid_users, 1);

All of this needs proper locking.

> +	}
> +
> +	ret = dom->ops->attach_dev_pasid(dom, dev, id);
> +	if (!ret) {
> +		*pasid = id;
> +		atomic_inc(&dom->pasid_users);
> +		return 0;
> +	}
> +
> +	if (atomic_dec_and_test(&dom->pasid_users)) {
> +		ioasid_free(id);
> +		dom->pasid = 0;
> +	}
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(iommu_attach_dma_pasid);
> +
> +/**
> + * iommu_detach_dma_pasid --Disable in-kernel DMA request with PASID
> + * @dev:	Device's PASID DMA to be disabled
> + *
> + * It is the device driver's responsibility to ensure no more incoming DMA
> + * requests with the kernel PASID before calling this function. IOMMU driver
> + * ensures PASID cache, IOTLBs related to the kernel PASID are cleared and
> + * drained.
> + *
> + */
> +void iommu_detach_dma_pasid(struct device *dev)
> +{
> +	struct iommu_domain *dom;
> +	ioasid_t pasid;
> +
> +	dom = iommu_get_domain_for_dev(dev);
> +	if (!dom || !dom->ops || !dom->ops->detach_dev_pasid) {
> +		dev_warn(dev, "No ops for detaching PASID %u", pasid);
> +		return;
> +	}
> +	/* Only support DMA API managed domain type */
> +	if (dom->type == IOMMU_DOMAIN_UNMANAGED ||
> +	    dom->type == IOMMU_DOMAIN_BLOCKED) {
> +		dev_err(dev, "Invalid domain type %d to detach DMA PASID %u\n",
> +			 dom->type, pasid);
> +		return;
> +	}
> +	pasid = dom->pasid;
> +	if (!pasid) {
> +		dev_err(dev, "No DMA PASID attached\n");
> +		return;
> +	}

All WARN_ON's too

> +	dom->ops->detach_dev_pasid(dom, dev, pasid);
> +	if (atomic_dec_and_test(&dom->pasid_users)) {
> +		ioasid_free(pasid);
> +		dom->pasid = 0;
> +	}
> +}
> +EXPORT_SYMBOL(iommu_detach_dma_pasid);
> +
>  /**
>   * iommu_dma_get_resv_regions - Reserved region driver helper
>   * @dev: Device from iommu_get_resv_regions()
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 24607dc3c2ac..538650b9cb75 100644
> +++ b/include/linux/dma-iommu.h
> @@ -18,6 +18,9 @@ int iommu_get_dma_cookie(struct iommu_domain *domain);
>  int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
>  void iommu_put_dma_cookie(struct iommu_domain *domain);
>  
> +int iommu_attach_dma_pasid(struct device *dev, ioasid_t *pasid);
> +void iommu_detach_dma_pasid(struct device *dev);
> +
>  /* Setup call for arch DMA mapping code */
>  void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit);
>  int iommu_dma_init_fq(struct iommu_domain *domain);
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 1164524814cb..281a87fdce77 100644
> +++ b/include/linux/iommu.h
> @@ -105,6 +105,8 @@ struct iommu_domain {
>  	enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault,
>  						      void *data);
>  	void *fault_data;
> +	ioasid_t pasid;		/* Used for DMA requests with PASID */
> +	atomic_t pasid_users;

These are poorly named, this is really the DMA API global PASID and
shouldn't be used for other things.

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-10 23:21   ` Jason Gunthorpe
@ 2022-05-11  0:23     ` Jacob Pan
  2022-05-11 11:54       ` Jason Gunthorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Jacob Pan @ 2022-05-11  0:23 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger,
	jacob.jun.pan

Hi Jason,

On Tue, 10 May 2022 20:21:21 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, May 10, 2022 at 02:07:01PM -0700, Jacob Pan wrote:
> > +static int intel_iommu_attach_dev_pasid(struct iommu_domain *domain,
> > +					struct device *dev,
> > +					ioasid_t pasid)
> > +{
> > +	struct device_domain_info *info = dev_iommu_priv_get(dev);
> > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +	struct intel_iommu *iommu = info->iommu;
> > +	unsigned long flags;
> > +	int ret = 0;
> > +
> > +	if (!sm_supported(iommu) || !info)
> > +		return -ENODEV;
> > +
> > +	spin_lock_irqsave(&device_domain_lock, flags);
> > +	/*
> > +	 * If the same device already has a PASID attached, just
> > return.
> > +	 * DMA layer will return the PASID value to the caller.
> > +	 */
> > +	if (pasid != PASID_RID2PASID && info->pasid) {  
> 
> Why check for PASID == 0 like this? Shouldn't pasid == 0 be rejected
> as an invalid argument?
Right, I was planning on reusing the attach function for RIDPASID as a
cleanup, but didn't include it here. Will fix.

> 
> > +		if (info->pasid == pasid)
> > +			ret = 0;  
> 
> Doesn't this need to check that the current domain is the requested
> domain as well? How can this happen anyhow - isn't it an error to
> double attach?
> 
> > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > index 5af24befc9f1..55845a8c4f4d 100644
> > +++ b/include/linux/intel-iommu.h
> > @@ -627,6 +627,7 @@ struct device_domain_info {
> >  	struct intel_iommu *iommu; /* IOMMU used by this device */
> >  	struct dmar_domain *domain; /* pointer to domain */
> >  	struct pasid_table *pasid_table; /* pasid table */
> > +	ioasid_t pasid; /* DMA request with PASID */  
> 
> And this seems wrong - the DMA API is not the only user of
> attach_dev_pasid, so there should not be any global pasid for the
> device.
> 
True but the attach_dev_pasid() op is domain type specific. i.e. DMA API
has its own attach_dev_pasid which is different than sva domain
attach_dev_pasid().
device_domain_info is only used by DMA API.

> I suspect this should be a counter of # of pasid domains attached so
> that the special flush logic triggers
> 
This field is only used for devTLB, so it is per domain-device. struct
device_domain_info is allocated per device-domain as well. Sorry, I might
have totally missed your point.

> And rely on the core code to worry about assigning only one domain per
> pasid - this should really be a 'set' function.
> 
Yes, in this set the core code (in dma-iommu.c) only assign one PASID per
DMA domain type.

Are you suggesting the dma-iommu API should be called
iommu_set_dma_pasid instead of iommu_attach_dma_pasid?

Thanks a lot for the quick review!

Jacob

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 2/4] iommu: Add PASID support for DMA mapping API users
  2022-05-10 23:28   ` Jason Gunthorpe
@ 2022-05-11  0:43     ` Jacob Pan
  0 siblings, 0 replies; 22+ messages in thread
From: Jacob Pan @ 2022-05-11  0:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger,
	jacob.jun.pan

Hi Jason,

On Tue, 10 May 2022 20:28:04 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, May 10, 2022 at 02:07:02PM -0700, Jacob Pan wrote:
> > DMA mapping API is the de facto standard for in-kernel DMA. It operates
> > on a per device/RID basis which is not PASID-aware.
> > 
> > Some modern devices such as Intel Data Streaming Accelerator, PASID is
> > required for certain work submissions. To allow such devices use DMA
> > mapping API, we need the following functionalities:
> > 1. Provide device a way to retrieve a PASID for work submission within
> > the kernel
> > 2. Enable the kernel PASID on the IOMMU for the device
> > 3. Attach the kernel PASID to the device's default DMA domain, let it
> > be IOVA or physical address in case of pass-through.
> > 
> > This patch introduces a driver facing API that enables DMA API
> > PASID usage. Once enabled, device drivers can continue to use DMA APIs
> > as is. There is no difference in dma_handle between without PASID and
> > with PASID.
> > 
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >  drivers/iommu/dma-iommu.c | 107 ++++++++++++++++++++++++++++++++++++++
> >  include/linux/dma-iommu.h |   3 ++
> >  include/linux/iommu.h     |   2 +
> >  3 files changed, 112 insertions(+)
> > 
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 1ca85d37eeab..5984f3129fa2 100644
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -34,6 +34,8 @@ struct iommu_dma_msi_page {
> >  	phys_addr_t		phys;
> >  };
> >  
> > +static DECLARE_IOASID_SET(iommu_dma_pasid);
> > +
> >  enum iommu_dma_cookie_type {
> >  	IOMMU_DMA_IOVA_COOKIE,
> >  	IOMMU_DMA_MSI_COOKIE,
> > @@ -370,6 +372,111 @@ void iommu_put_dma_cookie(struct iommu_domain
> > *domain) domain->iova_cookie = NULL;
> >  }
> >  
> > +/**
> > + * iommu_attach_dma_pasid --Attach a PASID for in-kernel DMA. Use the
> > device's
> > + * DMA domain.
> > + * @dev: Device to be enabled
> > + * @pasid: The returned kernel PASID to be used for DMA
> > + *
> > + * DMA request with PASID will be mapped the same way as the legacy
> > DMA.
> > + * If the device is in pass-through, PASID will also pass-through. If
> > the
> > + * device is in IOVA, the PASID will point to the same IOVA page table.
> > + *
> > + * @return err code or 0 on success
> > + */
> > +int iommu_attach_dma_pasid(struct device *dev, ioasid_t *pasid)
> > +{
> > +	struct iommu_domain *dom;
> > +	ioasid_t id, max;
> > +	int ret = 0;
> > +
> > +	dom = iommu_get_domain_for_dev(dev);
> > +	if (!dom || !dom->ops || !dom->ops->attach_dev_pasid)
> > +		return -ENODEV;
> > +
> > +	/* Only support domain types that DMA API can be used */
> > +	if (dom->type == IOMMU_DOMAIN_UNMANAGED ||
> > +	    dom->type == IOMMU_DOMAIN_BLOCKED) {
> > +		dev_warn(dev, "Invalid domain type %d", dom->type);  
> 
> This should be a WARN_ON
> 
will do, thanks

> > +		return -EPERM;
> > +	}
> > +
> > +	id = dom->pasid;
> > +	if (!id) {
> > +		/*
> > +		 * First device to use PASID in its DMA domain,
> > allocate
> > +		 * a single PASID per DMA domain is all we need, it is
> > also
> > +		 * good for performance when it comes down to IOTLB
> > flush.
> > +		 */
> > +		max = 1U << dev->iommu->pasid_bits;
> > +		if (!max)
> > +			return -EINVAL;
> > +
> > +		id = ioasid_alloc(&iommu_dma_pasid, 1, max, dev);
> > +		if (id == INVALID_IOASID)
> > +			return -ENOMEM;
> > +
> > +		dom->pasid = id;
> > +		atomic_set(&dom->pasid_users, 1);  
> 
> All of this needs proper locking.
> 
good catch, will add a mutex for domain updates, detach as well.

> > +	}
> > +
> > +	ret = dom->ops->attach_dev_pasid(dom, dev, id);
> > +	if (!ret) {
> > +		*pasid = id;
> > +		atomic_inc(&dom->pasid_users);
> > +		return 0;
> > +	}
> > +
> > +	if (atomic_dec_and_test(&dom->pasid_users)) {
> > +		ioasid_free(id);
> > +		dom->pasid = 0;
> > +	}
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(iommu_attach_dma_pasid);
> > +
> > +/**
> > + * iommu_detach_dma_pasid --Disable in-kernel DMA request with PASID
> > + * @dev:	Device's PASID DMA to be disabled
> > + *
> > + * It is the device driver's responsibility to ensure no more incoming
> > DMA
> > + * requests with the kernel PASID before calling this function. IOMMU
> > driver
> > + * ensures PASID cache, IOTLBs related to the kernel PASID are cleared
> > and
> > + * drained.
> > + *
> > + */
> > +void iommu_detach_dma_pasid(struct device *dev)
> > +{
> > +	struct iommu_domain *dom;
> > +	ioasid_t pasid;
> > +
> > +	dom = iommu_get_domain_for_dev(dev);
> > +	if (!dom || !dom->ops || !dom->ops->detach_dev_pasid) {
> > +		dev_warn(dev, "No ops for detaching PASID %u", pasid);
> > +		return;
> > +	}
> > +	/* Only support DMA API managed domain type */
> > +	if (dom->type == IOMMU_DOMAIN_UNMANAGED ||
> > +	    dom->type == IOMMU_DOMAIN_BLOCKED) {
> > +		dev_err(dev, "Invalid domain type %d to detach DMA
> > PASID %u\n",
> > +			 dom->type, pasid);
> > +		return;
> > +	}
> > +	pasid = dom->pasid;
> > +	if (!pasid) {
> > +		dev_err(dev, "No DMA PASID attached\n");
> > +		return;
> > +	}  
> 
> All WARN_ON's too
> 
will do.

> > +	dom->ops->detach_dev_pasid(dom, dev, pasid);
> > +	if (atomic_dec_and_test(&dom->pasid_users)) {
> > +		ioasid_free(pasid);
> > +		dom->pasid = 0;
> > +	}
> > +}
> > +EXPORT_SYMBOL(iommu_detach_dma_pasid);
> > +
> >  /**
> >   * iommu_dma_get_resv_regions - Reserved region driver helper
> >   * @dev: Device from iommu_get_resv_regions()
> > diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> > index 24607dc3c2ac..538650b9cb75 100644
> > +++ b/include/linux/dma-iommu.h
> > @@ -18,6 +18,9 @@ int iommu_get_dma_cookie(struct iommu_domain *domain);
> >  int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
> >  void iommu_put_dma_cookie(struct iommu_domain *domain);
> >  
> > +int iommu_attach_dma_pasid(struct device *dev, ioasid_t *pasid);
> > +void iommu_detach_dma_pasid(struct device *dev);
> > +
> >  /* Setup call for arch DMA mapping code */
> >  void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64
> > dma_limit); int iommu_dma_init_fq(struct iommu_domain *domain);
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 1164524814cb..281a87fdce77 100644
> > +++ b/include/linux/iommu.h
> > @@ -105,6 +105,8 @@ struct iommu_domain {
> >  	enum iommu_page_response_code (*iopf_handler)(struct
> > iommu_fault *fault, void *data);
> >  	void *fault_data;
> > +	ioasid_t pasid;		/* Used for DMA requests with
> > PASID */
> > +	atomic_t pasid_users;  
> 
> These are poorly named, this is really the DMA API global PASID and
> shouldn't be used for other things.
> 
I was hoping it could be generic since sva_cookie also has a pasid field,
but it looks like SVA uses mm->pasid now.

Shall we call it dma_api_pasid, dma_pasid, or something else?

> Jason


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11  0:23     ` Jacob Pan
@ 2022-05-11 11:54       ` Jason Gunthorpe
  2022-05-11 15:35         ` Jacob Pan
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2022-05-11 11:54 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On Tue, May 10, 2022 at 05:23:09PM -0700, Jacob Pan wrote:

> > > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > > index 5af24befc9f1..55845a8c4f4d 100644
> > > +++ b/include/linux/intel-iommu.h
> > > @@ -627,6 +627,7 @@ struct device_domain_info {
> > >  	struct intel_iommu *iommu; /* IOMMU used by this device */
> > >  	struct dmar_domain *domain; /* pointer to domain */
> > >  	struct pasid_table *pasid_table; /* pasid table */
> > > +	ioasid_t pasid; /* DMA request with PASID */  
> > 
> > And this seems wrong - the DMA API is not the only user of
> > attach_dev_pasid, so there should not be any global pasid for the
> > device.
> > 
> True but the attach_dev_pasid() op is domain type specific. i.e. DMA API
> has its own attach_dev_pasid which is different than sva domain
> attach_dev_pasid().

Huh? The intel driver shares the same ops between UNMANAGED and DMA -
and in general I do not think we should be putting special knowledge
about the DMA domains in the drivers. Drivers should continue to treat
them identically to UNMANAGED.

> device_domain_info is only used by DMA API.

Huh?
 
> > I suspect this should be a counter of # of pasid domains attached so
> > that the special flush logic triggers
> > 
> This field is only used for devTLB, so it is per domain-device. struct
> device_domain_info is allocated per device-domain as well. Sorry, I might
> have totally missed your point.

You can't store a single pasid in the driver like this, since the only
thing it does is trigger the flush logic just count how many pasids
are used by the device-domain and trigger pasid flush if any pasids
are attached

> > And rely on the core code to worry about assigning only one domain per
> > pasid - this should really be a 'set' function.
>
> Yes, in this set the core code (in dma-iommu.c) only assign one PASID per
> DMA domain type.
> 
> Are you suggesting the dma-iommu API should be called
> iommu_set_dma_pasid instead of iommu_attach_dma_pasid?

No that API is Ok - the driver ops API should be 'set' not attach/detach

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 11:54       ` Jason Gunthorpe
@ 2022-05-11 15:35         ` Jacob Pan
  2022-05-11 16:12           ` Jason Gunthorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Jacob Pan @ 2022-05-11 15:35 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger,
	jacob.jun.pan

Hi Jason,

On Wed, 11 May 2022 08:54:27 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, May 10, 2022 at 05:23:09PM -0700, Jacob Pan wrote:
> 
> > > > diff --git a/include/linux/intel-iommu.h
> > > > b/include/linux/intel-iommu.h index 5af24befc9f1..55845a8c4f4d
> > > > 100644 +++ b/include/linux/intel-iommu.h
> > > > @@ -627,6 +627,7 @@ struct device_domain_info {
> > > >  	struct intel_iommu *iommu; /* IOMMU used by this device */
> > > >  	struct dmar_domain *domain; /* pointer to domain */
> > > >  	struct pasid_table *pasid_table; /* pasid table */
> > > > +	ioasid_t pasid; /* DMA request with PASID */    
> > > 
> > > And this seems wrong - the DMA API is not the only user of
> > > attach_dev_pasid, so there should not be any global pasid for the
> > > device.
> > >   
> > True but the attach_dev_pasid() op is domain type specific. i.e. DMA API
> > has its own attach_dev_pasid which is different than sva domain
> > attach_dev_pasid().  
> 
> Huh? The intel driver shares the same ops between UNMANAGED and DMA -
> and in general I do not think we should be putting special knowledge
> about the DMA domains in the drivers. Drivers should continue to treat
> them identically to UNMANAGED.
> 
OK, other than SVA domain, the rest domain types share the same default ops.
I agree that the default ops should be the same for UNMANAGED, IDENTITY, and
DMA domain types. Minor detail is that we need to treat IDENTITY domain
slightly different when it comes down to PASID entry programming.

If not global, perhaps we could have a list of pasids (e.g. xarray) attached
to the device_domain_info. The TLB flush logic would just go through the
list w/o caring what the PASIDs are for. Does it make sense to you?
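
Roughly something like this (untested sketch; "pasids" would be a new
xarray field in struct device_domain_info, everything else is from this
patch):

static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
				    u64 addr, unsigned int mask)
{
	u16 sid = info->bus << 8 | info->devfn;
	u16 qdep = info->ats_qdep;
	unsigned long pasid;
	void *entry;

	qi_flush_dev_iotlb(info->iommu, sid, info->pfsid, qdep, addr, mask);

	/* Also flush any PASIDs attached to this device/domain */
	xa_for_each(&info->pasids, pasid, entry)
		qi_flush_dev_iotlb_pasid(info->iommu, sid, info->pfsid,
					 pasid, qdep, addr, mask);
}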

> > device_domain_info is only used by DMA API.  
> 
> Huh?
My mistake, I meant that device_domain_info.pasid is only used by the DMA API

>  
> > > I suspect this should be a counter of # of pasid domains attached so
> > > that the special flush logic triggers
> > >   
> > This field is only used for devTLB, so it is per domain-device. struct
> > device_domain_info is allocated per device-domain as well. Sorry, I
> > might have totally missed your point.  
> 
> You can't store a single pasid in the driver like this, since the only
> thing it does is trigger the flush logic just count how many pasids
> are used by the device-domain and trigger pasid flush if any pasids
> are attached
> 
Got it, will put the pasids in an xa as described above.

> > > And rely on the core code to worry about assigning only one domain per
> > > pasid - this should really be a 'set' function.  
> >
> > Yes, in this set the core code (in dma-iommu.c) only assign one PASID
> > per DMA domain type.
> > 
> > Are you suggesting the dma-iommu API should be called
> > iommu_set_dma_pasid instead of iommu_attach_dma_pasid?  
> 
> No that API is Ok - the driver ops API should be 'set' not attach/detach
> 
Sounds good, this operation has little in common with
domain_ops.dev_attach_pasid() used by SVA domain. So I will add a new
domain_ops.dev_set_pasid()


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 15:35         ` Jacob Pan
@ 2022-05-11 16:12           ` Jason Gunthorpe
  2022-05-11 17:02             ` Jacob Pan
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2022-05-11 16:12 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On Wed, May 11, 2022 at 08:35:18AM -0700, Jacob Pan wrote:

> > Huh? The intel driver shares the same ops between UNMANAGED and DMA -
> > and in general I do not think we should be putting special knowledge
> > about the DMA domains in the drivers. Drivers should continue to treat
> > them identically to UNMANAGED.
> > 
> OK, other than SVA domain, the rest domain types share the same default ops.
> I agree that the default ops should be the same for UNMANAGED, IDENTITY, and
> DMA domain types. Minor detail is that we need to treat IDENTITY domain
> slightly different when it comes down to PASID entry programming.

I would be happy if IDENTITY had its own ops, if that makes sense

> If not global, perhaps we could have a list of pasids (e.g. xarray) attached
> to the device_domain_info. The TLB flush logic would just go through the
> list w/o caring what the PASIDs are for. Does it make sense to you?

Sort of, but we shouldn't duplicate xarrays - the group already has
this xarray - need to find some way to allow access to it from the
driver.

> > > Are you suggesting the dma-iommu API should be called
> > > iommu_set_dma_pasid instead of iommu_attach_dma_pasid?  
> > 
> > No that API is Ok - the driver ops API should be 'set' not attach/detach
> > 
> Sounds good, this operation has little in common with
> domain_ops.dev_attach_pasid() used by SVA domain. So I will add a new
> domain_ops.dev_set_pasid()

What? No, their should only be one operation, 'dev_set_pasid' and it
is exactly the same as the SVA operation. It configures things so that
any existing translation on the PASID is removed and the PASID
translates according to the given domain.

SVA given domain or UNMANAGED given domain doesn't matter to the
higher level code. The driver should implement per-domain ops as
required to get the different behaviors.

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 17:02             ` Jacob Pan
@ 2022-05-11 17:00               ` Jason Gunthorpe
  2022-05-11 17:25                 ` Jacob Pan
  2022-05-12  6:22                 ` Baolu Lu
  0 siblings, 2 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2022-05-11 17:00 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On Wed, May 11, 2022 at 10:02:16AM -0700, Jacob Pan wrote:
> > > If not global, perhaps we could have a list of pasids (e.g. xarray)
> > > attached to the device_domain_info. The TLB flush logic would just go
> > > through the list w/o caring what the PASIDs are for. Does it make sense
> > > to you?  
> > 
> > Sort of, but we shouldn't duplicate xarrays - the group already has
> > this xarray - need to find some way to allow access to it from the
> > driver.
> > 
> I am not following,  here are the PASIDs for devTLB flush which is per
> device. Why group?

Because group is where the core code stores it.

> We could retrieve PASIDs from the device PASID table but xa would be more
> efficient.
> 
> > > > > Are you suggesting the dma-iommu API should be called
> > > > > iommu_set_dma_pasid instead of iommu_attach_dma_pasid?    
> > > > 
> > > > No that API is Ok - the driver ops API should be 'set' not
> > > > attach/detach 
> > > Sounds good, this operation has little in common with
> > > domain_ops.dev_attach_pasid() used by SVA domain. So I will add a new
> > > domain_ops.dev_set_pasid()  
> > 
> > What? No, their should only be one operation, 'dev_set_pasid' and it
> > is exactly the same as the SVA operation. It configures things so that
> > any existing translation on the PASID is removed and the PASID
> > translates according to the given domain.
> > 
> > SVA given domain or UNMANAGED given domain doesn't matter to the
> > higher level code. The driver should implement per-domain ops as
> > required to get the different behaviors.
> Perhaps some code to clarify, we have
> sva_domain_ops.dev_attach_pasid() = intel_svm_attach_dev_pasid;
> default_domain_ops.dev_attach_pasid() = intel_iommu_attach_dev_pasid;

Yes, keep that structure
 
> Consolidate pasid programming into dev_set_pasid() then called by both
> intel_svm_attach_dev_pasid() and intel_iommu_attach_dev_pasid(), right?

I was only suggesting that really dev_attach_pasid() op is misnamed,
it should be called set_dev_pasid() and act like a set, not a paired
attach/detach - same as the non-PASID ops.

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 16:12           ` Jason Gunthorpe
@ 2022-05-11 17:02             ` Jacob Pan
  2022-05-11 17:00               ` Jason Gunthorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Jacob Pan @ 2022-05-11 17:02 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger,
	jacob.jun.pan

Hi Jason,

On Wed, 11 May 2022 13:12:37 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, May 11, 2022 at 08:35:18AM -0700, Jacob Pan wrote:
> 
> > > Huh? The intel driver shares the same ops between UNMANAGED and DMA -
> > > and in general I do not think we should be putting special knowledge
> > > about the DMA domains in the drivers. Drivers should continue to treat
> > > them identically to UNMANAGED.
> > >   
> > OK, other than SVA domain, the rest domain types share the same default
> > ops. I agree that the default ops should be the same for UNMANAGED,
> > IDENTITY, and DMA domain types. Minor detail is that we need to treat
> > IDENTITY domain slightly different when it comes down to PASID entry
> > programming.  
> 
> I would be happy if IDENTITY had its own ops, if that makes sense
> 
I have tried giving it its own ops but there are complications around
checking if a domain has ops. It would be a logical thing to clean up next.

> > If not global, perhaps we could have a list of pasids (e.g. xarray)
> > attached to the device_domain_info. The TLB flush logic would just go
> > through the list w/o caring what the PASIDs are for. Does it make sense
> > to you?  
> 
> Sort of, but we shouldn't duplicate xarrays - the group already has
> this xarray - need to find some way to allow access to it from the
> driver.
> 
I am not following; these are the PASIDs for devTLB flush, which is per
device. Why the group?
We could retrieve the PASIDs from the device's PASID table, but an xarray
would be more efficient.

> > > > Are you suggesting the dma-iommu API should be called
> > > > iommu_set_dma_pasid instead of iommu_attach_dma_pasid?    
> > > 
> > > No that API is Ok - the driver ops API should be 'set' not
> > > attach/detach 
> > Sounds good, this operation has little in common with
> > domain_ops.dev_attach_pasid() used by SVA domain. So I will add a new
> > domain_ops.dev_set_pasid()  
> 
> What? No, their should only be one operation, 'dev_set_pasid' and it
> is exactly the same as the SVA operation. It configures things so that
> any existing translation on the PASID is removed and the PASID
> translates according to the given domain.
> 
> SVA given domain or UNMANAGED given domain doesn't matter to the
> higher level code. The driver should implement per-domain ops as
> required to get the different behaviors.
Perhaps some code to clarify; we have:
sva_domain_ops.dev_attach_pasid() = intel_svm_attach_dev_pasid;
default_domain_ops.dev_attach_pasid() = intel_iommu_attach_dev_pasid;

Consolidate the PASID programming into dev_set_pasid(), which is then called
by both intel_svm_attach_dev_pasid() and intel_iommu_attach_dev_pasid(),
right?
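
Roughly like this (a sketch only -- the common helper name is a placeholder,
not actual patch code, and the op name assumes the 'set'-style rename):

#include <linux/iommu.h>

/* Placeholder name for the shared PASID table entry programming. */
static int intel_pasid_program_common(struct iommu_domain *domain,
				      struct device *dev, ioasid_t pasid)
{
	/* Common PASID table entry programming for both domain types. */
	return 0;
}

static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
				   struct device *dev, ioasid_t pasid)
{
	/* SVA domain: the PASID follows the CPU (first-level) page table. */
	return intel_pasid_program_common(domain, dev, pasid);
}

static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
				     struct device *dev, ioasid_t pasid)
{
	/* DMA/UNMANAGED domain: the PASID follows the IOVA page table. */
	return intel_pasid_program_common(domain, dev, pasid);
}

static const struct iommu_domain_ops intel_sva_domain_ops = {
	.set_dev_pasid = intel_svm_set_dev_pasid,
};

static const struct iommu_domain_ops intel_iommu_domain_ops = {
	.set_dev_pasid = intel_iommu_set_dev_pasid,
};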


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 17:00               ` Jason Gunthorpe
@ 2022-05-11 17:25                 ` Jacob Pan
  2022-05-11 18:29                   ` Jason Gunthorpe
  2022-05-12  1:16                   ` Baolu Lu
  2022-05-12  6:22                 ` Baolu Lu
  1 sibling, 2 replies; 22+ messages in thread
From: Jacob Pan @ 2022-05-11 17:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger,
	jacob.jun.pan

Hi Jason,

On Wed, 11 May 2022 14:00:25 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, May 11, 2022 at 10:02:16AM -0700, Jacob Pan wrote:
> > > > If not global, perhaps we could have a list of pasids (e.g. xarray)
> > > > attached to the device_domain_info. The TLB flush logic would just
> > > > go through the list w/o caring what the PASIDs are for. Does it
> > > > make sense to you?    
> > > 
> > > Sort of, but we shouldn't duplicate xarrays - the group already has
> > > this xarray - need to find some way to allow access to it from the
> > > driver.
> > >   
> > I am not following,  here are the PASIDs for devTLB flush which is per
> > device. Why group?  
> 
> Because group is where the core code stores it.
I see, with a singleton group. I guess I can let the dma-iommu code call:

iommu_attach_dma_pasid() {
	iommu_attach_device_pasid();
}

Then the PASID will be stored in the group xa.
The flush code can retrieve the PASIDs via device_domain_info.device -> group
-> pasid_array.
Thanks for pointing it out, I missed the new pasid_array.
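Concretely, maybe something like this (just a sketch; iommu_attach_device_pasid()
and its signature are assumed from the SVA refactoring series):

#include <linux/errno.h>
#include <linux/iommu.h>

int iommu_attach_dma_pasid(struct device *dev, ioasid_t pasid)
{
	/* The DMA API works on the device's default domain. */
	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

	if (!domain || domain->type != IOMMU_DOMAIN_DMA)
		return -EINVAL;

	/*
	 * The core stores the domain in the group's pasid_array, which the
	 * flush path can then walk.
	 */
	return iommu_attach_device_pasid(domain, dev, pasid);
}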
> 
> > We could retrieve PASIDs from the device PASID table but xa would be
> > more efficient.
> >   
> > > > > > Are you suggesting the dma-iommu API should be called
> > > > > > iommu_set_dma_pasid instead of iommu_attach_dma_pasid?      
> > > > > 
> > > > > No that API is Ok - the driver ops API should be 'set' not
> > > > > attach/detach   
> > > > Sounds good, this operation has little in common with
> > > > domain_ops.dev_attach_pasid() used by SVA domain. So I will add a
> > > > new domain_ops.dev_set_pasid()    
> > > 
> > > What? No, their should only be one operation, 'dev_set_pasid' and it
> > > is exactly the same as the SVA operation. It configures things so that
> > > any existing translation on the PASID is removed and the PASID
> > > translates according to the given domain.
> > > 
> > > SVA given domain or UNMANAGED given domain doesn't matter to the
> > > higher level code. The driver should implement per-domain ops as
> > > required to get the different behaviors.  
> > Perhaps some code to clarify, we have
> > sva_domain_ops.dev_attach_pasid() = intel_svm_attach_dev_pasid;
> > default_domain_ops.dev_attach_pasid() = intel_iommu_attach_dev_pasid;  
> 
> Yes, keep that structure
>  
> > Consolidate pasid programming into dev_set_pasid() then called by both
> > intel_svm_attach_dev_pasid() and intel_iommu_attach_dev_pasid(), right?
> >  
> 
> I was only suggesting that really dev_attach_pasid() op is misnamed,
> it should be called set_dev_pasid() and act like a set, not a paired
> attach/detach - same as the non-PASID ops.
> 
Got it. Perhaps another patch to rename, Baolu?


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 17:25                 ` Jacob Pan
@ 2022-05-11 18:29                   ` Jason Gunthorpe
  2022-05-18 18:42                     ` Jacob Pan
  2022-05-12  1:16                   ` Baolu Lu
  1 sibling, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2022-05-11 18:29 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On Wed, May 11, 2022 at 10:25:21AM -0700, Jacob Pan wrote:
> Hi Jason,
> 
> On Wed, 11 May 2022 14:00:25 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Wed, May 11, 2022 at 10:02:16AM -0700, Jacob Pan wrote:
> > > > > If not global, perhaps we could have a list of pasids (e.g. xarray)
> > > > > attached to the device_domain_info. The TLB flush logic would just
> > > > > go through the list w/o caring what the PASIDs are for. Does it
> > > > > make sense to you?    
> > > > 
> > > > Sort of, but we shouldn't duplicate xarrays - the group already has
> > > > this xarray - need to find some way to allow access to it from the
> > > > driver.
> > > >   
> > > I am not following,  here are the PASIDs for devTLB flush which is per
> > > device. Why group?  
> > 
> > Because group is where the core code stores it.
> I see, with singleton group. I guess I can let dma-iommu code call
> 
> iommu_attach_dma_pasid {
> 	iommu_attach_device_pasid();
> Then the PASID will be stored in the group xa.

Yes, again, the dma-iommu should not be any different from the normal
unmanaged path. At this point there is no longer any difference, we
should not invent new ones.

> The flush code can retrieve PASIDs from device_domain_info.device ->
> group -> pasid_array.  Thanks for pointing it out, I missed the new
> pasid_array.

Yes.. It seems inefficient to iterate over that xarray multiple times
on the flush hot path, but maybe there is little choice. Try to use
the xas iterators under the xa_lock spinlock..

The challenge will be accessing the group xa in the first place, but
maybe the core code can gain a function call to return a pointer to
that XA or something..
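
Something like the below, conceptually (iommu_group_pasid_array() is
completely made up here just to show the shape, and the per-PASID flush
helper is a placeholder):

#include <linux/device.h>
#include <linux/iommu.h>
#include <linux/limits.h>
#include <linux/xarray.h>

/* Hypothetical core accessor -- it does not exist, the name is made up. */
struct xarray *iommu_group_pasid_array(struct iommu_group *group);
/* Placeholder for whatever per-PASID devTLB flush the driver does. */
void flush_one_dev_pasid(struct device *dev, unsigned long pasid);

static void flush_dev_pasids(struct device *dev)
{
	struct xarray *xa = iommu_group_pasid_array(dev->iommu_group);
	struct iommu_domain *domain;
	XA_STATE(xas, xa, 0);

	/*
	 * Walk every PASID attached to this device's group under the
	 * xarray spinlock; nothing in the loop may sleep.
	 */
	xa_lock(xa);
	xas_for_each(&xas, domain, ULONG_MAX)
		flush_one_dev_pasid(dev, xas.xa_index);
	xa_unlock(xa);
}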

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 17:25                 ` Jacob Pan
  2022-05-11 18:29                   ` Jason Gunthorpe
@ 2022-05-12  1:16                   ` Baolu Lu
  1 sibling, 0 replies; 22+ messages in thread
From: Baolu Lu @ 2022-05-12  1:16 UTC (permalink / raw)
  To: Jacob Pan, Jason Gunthorpe
  Cc: baolu.lu, iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, vkoul, robin.murphy, will, Yi Liu,
	Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On 2022/5/12 01:25, Jacob Pan wrote:
> Hi Jason,
> 
> On Wed, 11 May 2022 14:00:25 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
>> On Wed, May 11, 2022 at 10:02:16AM -0700, Jacob Pan wrote:
>>>>> If not global, perhaps we could have a list of pasids (e.g. xarray)
>>>>> attached to the device_domain_info. The TLB flush logic would just
>>>>> go through the list w/o caring what the PASIDs are for. Does it
>>>>> make sense to you?
>>>>
>>>> Sort of, but we shouldn't duplicate xarrays - the group already has
>>>> this xarray - need to find some way to allow access to it from the
>>>> driver.
>>>>    
>>> I am not following,  here are the PASIDs for devTLB flush which is per
>>> device. Why group?
>>
>> Because group is where the core code stores it.
> I see, with singleton group. I guess I can let dma-iommu code call
> 
> iommu_attach_dma_pasid {
> 	iommu_attach_device_pasid();
> Then the PASID will be stored in the group xa.
> The flush code can retrieve PASIDs from device_domain_info.device -> group
> -> pasid_array.
> Thanks for pointing it out, I missed the new pasid_array.
>>
>>> We could retrieve PASIDs from the device PASID table but xa would be
>>> more efficient.
>>>    
>>>>>>> Are you suggesting the dma-iommu API should be called
>>>>>>> iommu_set_dma_pasid instead of iommu_attach_dma_pasid?
>>>>>>
>>>>>> No that API is Ok - the driver ops API should be 'set' not
>>>>>> attach/detach
>>>>> Sounds good, this operation has little in common with
>>>>> domain_ops.dev_attach_pasid() used by SVA domain. So I will add a
>>>>> new domain_ops.dev_set_pasid()
>>>>
>>>> What? No, their should only be one operation, 'dev_set_pasid' and it
>>>> is exactly the same as the SVA operation. It configures things so that
>>>> any existing translation on the PASID is removed and the PASID
>>>> translates according to the given domain.
>>>>
>>>> SVA given domain or UNMANAGED given domain doesn't matter to the
>>>> higher level code. The driver should implement per-domain ops as
>>>> required to get the different behaviors.
>>> Perhaps some code to clarify, we have
>>> sva_domain_ops.dev_attach_pasid() = intel_svm_attach_dev_pasid;
>>> default_domain_ops.dev_attach_pasid() = intel_iommu_attach_dev_pasid;
>>
>> Yes, keep that structure
>>   
>>> Consolidate pasid programming into dev_set_pasid() then called by both
>>> intel_svm_attach_dev_pasid() and intel_iommu_attach_dev_pasid(), right?
>>>   
>>
>> I was only suggesting that really dev_attach_pasid() op is misnamed,
>> it should be called set_dev_pasid() and act like a set, not a paired
>> attach/detach - same as the non-PASID ops.
>>
> Got it. Perhaps another patch to rename, Baolu?

Yes. I can rename it in my sva series if others are also happy with this
naming.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 17:00               ` Jason Gunthorpe
  2022-05-11 17:25                 ` Jacob Pan
@ 2022-05-12  6:22                 ` Baolu Lu
  2022-05-12 11:52                   ` Jason Gunthorpe
  1 sibling, 1 reply; 22+ messages in thread
From: Baolu Lu @ 2022-05-12  6:22 UTC (permalink / raw)
  To: Jason Gunthorpe, Jacob Pan
  Cc: baolu.lu, iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, vkoul, robin.murphy, will, Yi Liu,
	Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

Hi Jason,

On 2022/5/12 01:00, Jason Gunthorpe wrote:
>> Consolidate pasid programming into dev_set_pasid() then called by both
>> intel_svm_attach_dev_pasid() and intel_iommu_attach_dev_pasid(), right?
> I was only suggesting that really dev_attach_pasid() op is misnamed,
> it should be called set_dev_pasid() and act like a set, not a paired
> attach/detach - same as the non-PASID ops.

So,

   "set_dev_pasid(domain, device, pasid)" is equivalent to dev_attach_pasid()

and

   "set_dev_pasid(NULL, device, pasid)" is equivalent to dev_detach_pasid()?

Do I understand it right?

Best regards,
baolu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-12  6:22                 ` Baolu Lu
@ 2022-05-12 11:52                   ` Jason Gunthorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2022-05-12 11:52 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Jacob Pan, iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, vkoul, robin.murphy, will, Yi Liu,
	Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On Thu, May 12, 2022 at 02:22:03PM +0800, Baolu Lu wrote:
> Hi Jason,
> 
> On 2022/5/12 01:00, Jason Gunthorpe wrote:
> > > Consolidate pasid programming into dev_set_pasid() then called by both
> > > intel_svm_attach_dev_pasid() and intel_iommu_attach_dev_pasid(), right?
> > I was only suggesting that really dev_attach_pasid() op is misnamed,
> > it should be called set_dev_pasid() and act like a set, not a paired
> > attach/detach - same as the non-PASID ops.
> 
> So,
> 
>   "set_dev_pasid(domain, device, pasid)" equals to dev_attach_pasid()
> 
> and
> 
>   "set_dev_pasid(NULL, device, pasid)" equals to dev_detach_pasid()?
> 
> do I understand it right?

blocking_domain should be passed, not NULL.
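
i.e. roughly (a sketch of the calling convention only; set_dev_pasid is the
assumed per-domain op name, and blocking_domain stands for whatever static
blocking domain the driver provides):

#include <linux/iommu.h>

static int example_set_pasid(struct iommu_domain *domain, struct device *dev,
			     ioasid_t pasid)
{
	/* Whatever (dev, pasid) translated through before is replaced. */
	return domain->ops->set_dev_pasid(domain, dev, pasid);
}

/*
 * "Detach" is then example_set_pasid(blocking_domain, dev, pasid): the
 * PASID ends up pointing at a domain that blocks all DMA, never at NULL.
 */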

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-11 18:29                   ` Jason Gunthorpe
@ 2022-05-18 18:42                     ` Jacob Pan
  2022-05-18 18:52                       ` Jason Gunthorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Jacob Pan @ 2022-05-18 18:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger,
	jacob.jun.pan

Hi Jason,

On Wed, 11 May 2022 15:29:08 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, May 11, 2022 at 10:25:21AM -0700, Jacob Pan wrote:
> > Hi Jason,
> > 
> > On Wed, 11 May 2022 14:00:25 -0300, Jason Gunthorpe <jgg@nvidia.com>
> > wrote: 
> > > On Wed, May 11, 2022 at 10:02:16AM -0700, Jacob Pan wrote:  
> > > > > > If not global, perhaps we could have a list of pasids (e.g.
> > > > > > xarray) attached to the device_domain_info. The TLB flush logic
> > > > > > would just go through the list w/o caring what the PASIDs are
> > > > > > for. Does it make sense to you?      
> > > > > 
> > > > > Sort of, but we shouldn't duplicate xarrays - the group already
> > > > > has this xarray - need to find some way to allow access to it
> > > > > from the driver.
> > > > >     
> > > > I am not following,  here are the PASIDs for devTLB flush which is
> > > > per device. Why group?    
> > > 
> > > Because group is where the core code stores it.  
> > I see, with singleton group. I guess I can let dma-iommu code call
> > 
> > iommu_attach_dma_pasid {
> > 	iommu_attach_device_pasid();
> > Then the PASID will be stored in the group xa.  
> 
> Yes, again, the dma-iommu should not be any different from the normal
> unmanaged path. At this point there is no longer any difference, we
> should not invent new ones.
> 
> > The flush code can retrieve PASIDs from device_domain_info.device ->
> > group -> pasid_array.  Thanks for pointing it out, I missed the new
> > pasid_array.  
> 
> Yes.. It seems inefficient to iterate over that xarray multiple times
> on the flush hot path, but maybe there is little choice. Try to use
> use the xas iterators under the xa_lock spinlock..
> 
xas_for_each() takes a max index; here we don't really have one. So I posted
v4 without using the xas advanced API. Please let me know if you have
suggestions.
xa_for_each() takes the RCU read lock, so it should be fast for the TLB
flush, right? The worst case may be an over-flush when we have stale data,
but that should be very rare.

> The challenge will be accessing the group xa in the first place, but
> maybe the core code can gain a function call to return a pointer to
> that XA or something..
> 
I added a helper function to find the matching DMA API PASID in v4.


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-18 18:42                     ` Jacob Pan
@ 2022-05-18 18:52                       ` Jason Gunthorpe
  2022-05-19 21:05                         ` Jacob Pan
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2022-05-18 18:52 UTC (permalink / raw)
  To: Jacob Pan
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger

On Wed, May 18, 2022 at 11:42:04AM -0700, Jacob Pan wrote:

> > Yes.. It seems inefficient to iterate over that xarray multiple times
> > on the flush hot path, but maybe there is little choice. Try to use
> > use the xas iterators under the xa_lock spinlock..
> > 
> xas_for_each takes a max range, here we don't really have one. So I posted
> v4 w/o using the xas advanced API. Please let me know if you have
> suggestions.

You are supposed to use ULONG_MAX in cases like that.

> xa_for_each takes RCU read lock, it should be fast for tlb flush, right? The
> worst case maybe over flush when we have stale data but should be very rare.

Not really; xa_for_each() walks the tree for every iteration, so it is
slower than a linked list walk in any case where the xarray is
multi-node. xas_for_each() is able to retain a pointer to where it is in
the tree, so each iteration is usually just a pointer increment.

The downside is that you cannot sleep while doing xas_for_each().
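
To make the difference concrete, a generic sketch (not tied to the iommu
code):

#include <linux/limits.h>
#include <linux/xarray.h>

static unsigned long count_entries_xas(struct xarray *xa)
{
	XA_STATE(xas, xa, 0);
	unsigned long n = 0;
	void *entry;

	/*
	 * xas_for_each() keeps its place in the tree, so each step is about
	 * a pointer increment; the caller holds xa_lock (a spinlock), so
	 * nothing inside the loop may sleep.
	 */
	xa_lock(xa);
	xas_for_each(&xas, entry, ULONG_MAX)
		n++;
	xa_unlock(xa);

	return n;
}

static unsigned long count_entries_xa(struct xarray *xa)
{
	unsigned long index, n = 0;
	void *entry;

	/*
	 * xa_for_each() re-walks the tree from the root on every iteration
	 * and takes the RCU read lock internally; simpler, but slower once
	 * the xarray spans multiple nodes.
	 */
	xa_for_each(xa, index, entry)
		n++;

	return n;
}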

> > The challenge will be accessing the group xa in the first place, but
> > maybe the core code can gain a function call to return a pointer to
> > that XA or something..
 
> I added a helper function to find the matching DMA API PASID in v4.

Again, why are we focused on DMA API? Nothing you build here should be
DMA API beyond the fact that the iommu_domain being attached is the
default domain.

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid
  2022-05-18 18:52                       ` Jason Gunthorpe
@ 2022-05-19 21:05                         ` Jacob Pan
  0 siblings, 0 replies; 22+ messages in thread
From: Jacob Pan @ 2022-05-19 21:05 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, LKML, dmaengine, Joerg Roedel, David Woodhouse,
	Jean-Philippe Brucker, Lu Baolu, vkoul, robin.murphy, will,
	Yi Liu, Dave Jiang, Tian, Kevin, Raj Ashok, Eric Auger,
	jacob.jun.pan

Hi Jason,

On Wed, 18 May 2022 15:52:05 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, May 18, 2022 at 11:42:04AM -0700, Jacob Pan wrote:
> 
> > > Yes.. It seems inefficient to iterate over that xarray multiple times
> > > on the flush hot path, but maybe there is little choice. Try to use
> > > use the xas iterators under the xa_lock spinlock..
> > >   
> > xas_for_each takes a max range, here we don't really have one. So I
> > posted v4 w/o using the xas advanced API. Please let me know if you have
> > suggestions.  
> 
> You are supposed to use ULONG_MAX in cases like that.
> 
Got it.
> > xa_for_each takes RCU read lock, it should be fast for tlb flush,
> > right? The worst case maybe over flush when we have stale data but
> > should be very rare.  
> 
> Not really, xa_for_each walks the tree for every iteration, it is
> slower than a linked list walk in any cases where the xarray is
> multi-node. xas_for_each is able to retain a pointer where it is in
> the tree so each iteration is usually just a pointer increment.
> 
Thanks for explaining; yeah, if we have to iterate multiple times,
xas_for_each() is better.

> The downside is you cannot sleep while doing xas_for_each
> 
will do under RCU read lock

> > > The challenge will be accessing the group xa in the first place, but
> > > maybe the core code can gain a function call to return a pointer to
> > > that XA or something..  
>  
> > I added a helper function to find the matching DMA API PASID in v4.  
> 
> Again, why are we focused on DMA API? Nothing you build here should be
> DMA API beyond the fact that the iommu_domain being attached is the
> default domain.
The helper is not DMA API specific; it is just a domain-PASID lookup. Sorry
for the confusion.

Thanks,

Jacob

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-05-19 21:02 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-10 21:07 [PATCH v3 0/4] Enable PASID for DMA API users Jacob Pan
2022-05-10 21:07 ` [PATCH v3 1/4] iommu/vt-d: Implement domain ops for attach_dev_pasid Jacob Pan
2022-05-10 23:21   ` Jason Gunthorpe
2022-05-11  0:23     ` Jacob Pan
2022-05-11 11:54       ` Jason Gunthorpe
2022-05-11 15:35         ` Jacob Pan
2022-05-11 16:12           ` Jason Gunthorpe
2022-05-11 17:02             ` Jacob Pan
2022-05-11 17:00               ` Jason Gunthorpe
2022-05-11 17:25                 ` Jacob Pan
2022-05-11 18:29                   ` Jason Gunthorpe
2022-05-18 18:42                     ` Jacob Pan
2022-05-18 18:52                       ` Jason Gunthorpe
2022-05-19 21:05                         ` Jacob Pan
2022-05-12  1:16                   ` Baolu Lu
2022-05-12  6:22                 ` Baolu Lu
2022-05-12 11:52                   ` Jason Gunthorpe
2022-05-10 21:07 ` [PATCH v3 2/4] iommu: Add PASID support for DMA mapping API users Jacob Pan
2022-05-10 23:28   ` Jason Gunthorpe
2022-05-11  0:43     ` Jacob Pan
2022-05-10 21:07 ` [PATCH v3 3/4] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID Jacob Pan
2022-05-10 21:07 ` [PATCH v3 4/4] iommu/vt-d: Delete unused SVM flag Jacob Pan
