iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support
@ 2019-10-22 23:53 Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation Jacob Pan
                   ` (10 more replies)
  0 siblings, 11 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on Intel
platforms allow address space sharing between device DMA and applications.
SVA can reduce programming complexity and enhance security.
This series is intended to enable SVA virtualization, i.e. shared guest
application address space and physical device DMA address. Only IOMMU portion
of the changes are included in this series. Additional support is needed in
VFIO and QEMU (will be submitted separately) to complete this functionality.

To make incremental changes and reduce the size of each patchset. This series
does not inlcude support for page request services.

In VT-d implementation, PASID table is per device and maintained in the host.
Guest PASID table is shadowed in VMM where virtual IOMMU is emulated.

    .-------------.  .---------------------------.
    |   vIOMMU    |  | Guest process CR3, FL only|
    |             |  '---------------------------'
    .----------------/
    | PASID Entry |--- PASID cache flush -
    '-------------'                       |
    |             |                       V
    |             |                CR3 in GPA
    '-------------'
Guest
------| Shadow |--------------------------|--------
      v        v                          v
Host
    .-------------.  .----------------------.
    |   pIOMMU    |  | Bind FL for GVA-GPA  |
    |             |  '----------------------'
    .----------------/  |
    | PASID Entry |     V (Nested xlate)
    '----------------\.------------------------------.
    |             |   |SL for GPA-HPA, default domain|
    |             |   '------------------------------'
    '-------------'
Where:
 - FL = First level/stage one page tables
 - SL = Second level/stage two page tables

This is the remaining VT-d only portion of V5 since the uAPIs and IOASID common
code have been applied to Joerg's IOMMU core branch.
(https://lkml.org/lkml/2019/10/2/833)

The complete set with VFIO patches are here:
https://github.com/jacobpan/linux.git:siov_sva

The complete nested SVA upstream patches are divided into three phases:
    1. Common APIs and PCI device direct assignment
    2. Page Request Services (PRS) support
    3. Mediated device assignment

With this set and the accompanied VFIO code, we will achieve phase #1.

Thanks,

Jacob

ChangeLog:
	- V6
	  - Rebased on top of Joerg's core branch
	  (git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git core)
	  - Adapt to new uAPIs and IOASID allocators

	- V5
	  Rebased on v5.3-rc4 which has some of the IOMMU fault APIs merged.
 	  Addressed v4 review comments from Eric Auger, Baolu Lu, and
	    Jonathan Cameron. Specific changes are as follows:
	  - Refined custom IOASID allocator to support multiple vIOMMU, hotplug
	    cases.
	  - Extracted vendor data from IOMMU guest PASID bind data, for VT-d
	    will support all necessary guest PASID entry fields for PASID
	    bind.
	  - Support non-identity host-guest PASID mapping
	  - Exception handling in various cases

	- V4
	  - Redesigned IOASID allocator such that it can support custom
	  allocators with shared helper functions. Use separate XArray
	  to store IOASIDs per allocator. Took advice from Eric Auger to
	  have default allocator use the generic allocator structure.
	  Combined into one patch in that the default allocator is just
	  "another" allocator now. Can be built as a module in case of
	  driver use without IOMMU.
	  - Extended bind guest PASID data to support SMMU and non-identity
	  guest to host PASID mapping https://lkml.org/lkml/2019/5/21/802
	  - Rebased on Jean's sva/api common tree, new patches starts with
	   [PATCH v4 10/22]

	- V3
	  - Addressed thorough review comments from Eric Auger (Thank you!)
	  - Moved IOASID allocator from driver core to IOMMU code per
	    suggestion by Christoph Hellwig
	    (https://lkml.org/lkml/2019/4/26/462)
	  - Rebased on top of Jean's SVA API branch and Eric's v7[1]
	    (git://linux-arm.org/linux-jpb.git sva/api)
	  - All IOMMU APIs are unmodified (except the new bind guest PASID
	    call in patch 9/16)

	- V2
	  - Rebased on Joerg's IOMMU x86/vt-d branch v5.1-rc4
	  - Integrated with Eric Auger's new v7 series for common APIs
	  (https://github.com/eauger/linux/tree/v5.1-rc3-2stage-v7)
	  - Addressed review comments from Andy Shevchenko and Alex Williamson on
	    IOASID custom allocator.
	  - Support multiple custom IOASID allocators (vIOMMUs) and dynamic
	    registration.


Jacob Pan (9):
  iommu/vt-d: Add custom allocator for IOASID
  iommu/vt-d: Replace Intel specific PASID allocator with IOASID
  iommu/vt-d: Move domain helper to header
  iommu/vt-d: Avoid duplicated code for PASID setup
  iommu/vt-d: Add nested translation helper function
  iommu/vt-d: Misc macro clean up for SVM
  iommu/vt-d: Add bind guest PASID support
  iommu/vt-d: Support flushing more translation cache types
  iommu/vt-d: Add svm/sva invalidate function

Lu Baolu (1):
  iommu/vt-d: Enlightened PASID allocation

 drivers/iommu/Kconfig       |   1 +
 drivers/iommu/dmar.c        |  46 ++++++
 drivers/iommu/intel-iommu.c | 259 +++++++++++++++++++++++++++++++--
 drivers/iommu/intel-pasid.c | 343 +++++++++++++++++++++++++++++++++++++-------
 drivers/iommu/intel-pasid.h |  25 +++-
 drivers/iommu/intel-svm.c   | 298 ++++++++++++++++++++++++++++++--------
 include/linux/intel-iommu.h |  39 ++++-
 include/linux/intel-svm.h   |  17 +++
 8 files changed, 904 insertions(+), 124 deletions(-)

-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-23  0:45   ` Raj, Ashok
  2019-10-22 23:53 ` [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID Jacob Pan
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

From: Lu Baolu <baolu.lu@linux.intel.com>

If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
IOMMU driver should rely on the emulation software to allocate
and free PASID IDs. The Intel vt-d spec revision 3.0 defines a
register set to support this. This includes a capability register,
a virtual command register and a virtual response register. Refer
to section 10.4.42, 10.4.43, 10.4.44 for more information.

This patch adds the enlightened PASID allocation/free interfaces
via the virtual command register.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/intel-pasid.c | 83 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/intel-pasid.h | 13 ++++++-
 include/linux/intel-iommu.h |  2 ++
 3 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 040a445be300..76bcbb21e112 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -63,6 +63,89 @@ void *intel_pasid_lookup_id(int pasid)
 	return p;
 }
 
+static int check_vcmd_pasid(struct intel_iommu *iommu)
+{
+	u64 cap;
+
+	if (!ecap_vcs(iommu->ecap)) {
+		pr_warn("IOMMU: %s: Hardware doesn't support virtual command\n",
+			iommu->name);
+		return -ENODEV;
+	}
+
+	cap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
+	if (!(cap & DMA_VCS_PAS)) {
+		pr_warn("IOMMU: %s: Emulation software doesn't support PASID allocation\n",
+			iommu->name);
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid)
+{
+	u64 res;
+	u8 status_code;
+	unsigned long flags;
+	int ret = 0;
+
+	ret = check_vcmd_pasid(iommu);
+	if (ret)
+		return ret;
+
+	raw_spin_lock_irqsave(&iommu->register_lock, flags);
+	dmar_writeq(iommu->reg + DMAR_VCMD_REG, VCMD_CMD_ALLOC);
+	IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
+		      !(res & VCMD_VRSP_IP), res);
+	raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
+
+	status_code = VCMD_VRSP_SC(res);
+	switch (status_code) {
+	case VCMD_VRSP_SC_SUCCESS:
+		*pasid = VCMD_VRSP_RESULT(res);
+		break;
+	case VCMD_VRSP_SC_NO_PASID_AVAIL:
+		pr_info("IOMMU: %s: No PASID available\n", iommu->name);
+		ret = -ENOMEM;
+		break;
+	default:
+		ret = -ENODEV;
+		pr_warn("IOMMU: %s: Unexpected error code %d\n",
+			iommu->name, status_code);
+	}
+
+	return ret;
+}
+
+void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int pasid)
+{
+	u64 res;
+	u8 status_code;
+	unsigned long flags;
+
+	if (check_vcmd_pasid(iommu))
+		return;
+
+	raw_spin_lock_irqsave(&iommu->register_lock, flags);
+	dmar_writeq(iommu->reg + DMAR_VCMD_REG, (pasid << 8) | VCMD_CMD_FREE);
+	IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
+		      !(res & VCMD_VRSP_IP), res);
+	raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
+
+	status_code = VCMD_VRSP_SC(res);
+	switch (status_code) {
+	case VCMD_VRSP_SC_SUCCESS:
+		break;
+	case VCMD_VRSP_SC_INVALID_PASID:
+		pr_info("IOMMU: %s: Invalid PASID\n", iommu->name);
+		break;
+	default:
+		pr_warn("IOMMU: %s: Unexpected error code %d\n",
+			iommu->name, status_code);
+	}
+}
+
 /*
  * Per device pasid table management:
  */
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index fc8cd8f17de1..e413e884e685 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -23,6 +23,16 @@
 #define is_pasid_enabled(entry)		(((entry)->lo >> 3) & 0x1)
 #define get_pasid_dir_size(entry)	(1 << ((((entry)->lo >> 9) & 0x7) + 7))
 
+/* Virtual command interface for enlightened pasid management. */
+#define VCMD_CMD_ALLOC			0x1
+#define VCMD_CMD_FREE			0x2
+#define VCMD_VRSP_IP			0x1
+#define VCMD_VRSP_SC(e)			(((e) >> 1) & 0x3)
+#define VCMD_VRSP_SC_SUCCESS		0
+#define VCMD_VRSP_SC_NO_PASID_AVAIL	1
+#define VCMD_VRSP_SC_INVALID_PASID	1
+#define VCMD_VRSP_RESULT(e)		(((e) >> 8) & 0xfffff)
+
 /*
  * Domain ID reserved for pasid entries programmed for first-level
  * only and pass-through transfer modes.
@@ -95,5 +105,6 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 				   struct device *dev, int pasid);
 void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
 				 struct device *dev, int pasid);
-
+int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid);
+void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int pasid);
 #endif /* __INTEL_PASID_H */
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index ed11ef594378..eea7468694a7 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -161,6 +161,7 @@
 #define ecap_smpwc(e)		(((e) >> 48) & 0x1)
 #define ecap_flts(e)		(((e) >> 47) & 0x1)
 #define ecap_slts(e)		(((e) >> 46) & 0x1)
+#define ecap_vcs(e)		(((e) >> 44) & 0x1)
 #define ecap_smts(e)		(((e) >> 43) & 0x1)
 #define ecap_dit(e)		((e >> 41) & 0x1)
 #define ecap_pasid(e)		((e >> 40) & 0x1)
@@ -279,6 +280,7 @@
 
 /* PRS_REG */
 #define DMA_PRS_PPR	((u32)1)
+#define DMA_VCS_PAS	((u64)1)
 
 #define IOMMU_WAIT_OP(iommu, offset, op, cond, sts)			\
 do {									\
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-23  0:51   ` Raj, Ashok
  2019-10-22 23:53 ` [PATCH v6 03/10] iommu/vt-d: Replace Intel specific PASID allocator with IOASID Jacob Pan
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

When VT-d driver runs in the guest, PASID allocation must be
performed via virtual command interface. This patch registers a
custom IOASID allocator which takes precedence over the default
XArray based allocator. The resulting IOASID allocation will always
come from the host. This ensures that PASID namespace is system-
wide.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/Kconfig       |  1 +
 drivers/iommu/intel-iommu.c | 67 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/intel-iommu.h |  2 ++
 3 files changed, 70 insertions(+)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index fd50ddffffbf..961fe5795a90 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -211,6 +211,7 @@ config INTEL_IOMMU_SVM
 	bool "Support for Shared Virtual Memory with Intel IOMMU"
 	depends on INTEL_IOMMU && X86
 	select PCI_PASID
+	select IOASID
 	select MMU_NOTIFIER
 	help
 	  Shared Virtual Memory (SVM) provides a facility for devices
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3f974919d3bd..3aff0141c522 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1706,6 +1706,8 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
 		if (ecap_prs(iommu->ecap))
 			intel_svm_finish_prq(iommu);
 	}
+	ioasid_unregister_allocator(&iommu->pasid_allocator);
+
 #endif
 }
 
@@ -4910,6 +4912,46 @@ static int __init probe_acpi_namespace_devices(void)
 	return 0;
 }
 
+#ifdef CONFIG_INTEL_IOMMU_SVM
+static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
+{
+	struct intel_iommu *iommu = data;
+	ioasid_t ioasid;
+
+	/*
+	 * VT-d virtual command interface always uses the full 20 bit
+	 * PASID range. Host can partition guest PASID range based on
+	 * policies but it is out of guest's control.
+	 */
+	if (min < PASID_MIN || max > PASID_MAX)
+		return INVALID_IOASID;
+
+	if (vcmd_alloc_pasid(iommu, &ioasid))
+		return INVALID_IOASID;
+
+	return ioasid;
+}
+
+static void intel_ioasid_free(ioasid_t ioasid, void *data)
+{
+	struct iommu_pasid_alloc_info *svm;
+	struct intel_iommu *iommu = data;
+
+	if (!iommu)
+		return;
+	/*
+	 * Sanity check the ioasid owner is done at upper layer, e.g. VFIO
+	 * We can only free the PASID when all the devices are unbond.
+	 */
+	svm = ioasid_find(NULL, ioasid, NULL);
+	if (!svm) {
+		pr_warn("Freeing unbond IOASID %d\n", ioasid);
+		return;
+	}
+	vcmd_free_pasid(iommu, ioasid);
+}
+#endif
+
 int __init intel_iommu_init(void)
 {
 	int ret = -ENODEV;
@@ -5020,6 +5062,31 @@ int __init intel_iommu_init(void)
 				       "%s", iommu->name);
 		iommu_device_set_ops(&iommu->iommu, &intel_iommu_ops);
 		iommu_device_register(&iommu->iommu);
+#ifdef CONFIG_INTEL_IOMMU_SVM
+		if (cap_caching_mode(iommu->cap) && sm_supported(iommu)) {
+			/*
+			 * Register a custom ASID allocator if we are running
+			 * in a guest, the purpose is to have a system wide PASID
+			 * namespace among all PASID users.
+			 * There can be multiple vIOMMUs in each guest but only
+			 * one allocator is active. All vIOMMU allocators will
+			 * eventually be calling the same host allocator.
+			 */
+			iommu->pasid_allocator.alloc = intel_ioasid_alloc;
+			iommu->pasid_allocator.free = intel_ioasid_free;
+			iommu->pasid_allocator.pdata = (void *)iommu;
+			ret = ioasid_register_allocator(&iommu->pasid_allocator);
+			if (ret) {
+				pr_warn("Custom PASID allocator registeration failed\n");
+				/*
+				 * Disable scalable mode on this IOMMU if there
+				 * is no custom allocator. Mixing SM capable vIOMMU
+				 * and non-SM vIOMMU are not supported.
+				 */
+				intel_iommu_sm = 0;
+			}
+		}
+#endif
 	}
 
 	bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index eea7468694a7..fb1973a761c1 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -19,6 +19,7 @@
 #include <linux/iommu.h>
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/dmar.h>
+#include <linux/ioasid.h>
 
 #include <asm/cacheflush.h>
 #include <asm/iommu.h>
@@ -542,6 +543,7 @@ struct intel_iommu {
 #ifdef CONFIG_INTEL_IOMMU_SVM
 	struct page_req_dsc *prq;
 	unsigned char prq_name[16];    /* Name for PRQ interrupt */
+	struct ioasid_allocator_ops pasid_allocator; /* Custom allocator for PASIDs */
 #endif
 	struct q_inval  *qi;            /* Queued invalidation info */
 	u32 *iommu_state; /* Store iommu states between suspend and resume.*/
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 03/10] iommu/vt-d: Replace Intel specific PASID allocator with IOASID
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-23  0:58   ` Raj, Ashok
  2019-10-22 23:53 ` [PATCH v6 04/10] iommu/vt-d: Move domain helper to header Jacob Pan
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

Make use of generic IOASID code to manage PASID allocation,
free, and lookup. Replace Intel specific code.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 12 ++++++------
 drivers/iommu/intel-pasid.c | 36 ------------------------------------
 drivers/iommu/intel-svm.c   | 37 +++++++++++++++++++++----------------
 3 files changed, 27 insertions(+), 58 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3aff0141c522..72febcf2c48f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5311,7 +5311,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
 	domain->auxd_refcnt--;
 
 	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		intel_pasid_free_id(domain->default_pasid);
+		ioasid_free(domain->default_pasid);
 }
 
 static int aux_domain_add_dev(struct dmar_domain *domain,
@@ -5329,10 +5329,10 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 	if (domain->default_pasid <= 0) {
 		int pasid;
 
-		pasid = intel_pasid_alloc_id(domain, PASID_MIN,
-					     pci_max_pasids(to_pci_dev(dev)),
-					     GFP_KERNEL);
-		if (pasid <= 0) {
+		/* No private data needed for the default pasid */
+		pasid = ioasid_alloc(NULL, PASID_MIN, pci_max_pasids(to_pci_dev(dev)) - 1,
+				NULL);
+		if (pasid == INVALID_IOASID) {
 			pr_err("Can't allocate default pasid\n");
 			return -ENODEV;
 		}
@@ -5368,7 +5368,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 	spin_unlock(&iommu->lock);
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		intel_pasid_free_id(domain->default_pasid);
+		ioasid_free(domain->default_pasid);
 
 	return ret;
 }
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 76bcbb21e112..c0d1f28d3412 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -26,42 +26,6 @@
  */
 static DEFINE_SPINLOCK(pasid_lock);
 u32 intel_pasid_max_id = PASID_MAX;
-static DEFINE_IDR(pasid_idr);
-
-int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp)
-{
-	int ret, min, max;
-
-	min = max_t(int, start, PASID_MIN);
-	max = min_t(int, end, intel_pasid_max_id);
-
-	WARN_ON(in_interrupt());
-	idr_preload(gfp);
-	spin_lock(&pasid_lock);
-	ret = idr_alloc(&pasid_idr, ptr, min, max, GFP_ATOMIC);
-	spin_unlock(&pasid_lock);
-	idr_preload_end();
-
-	return ret;
-}
-
-void intel_pasid_free_id(int pasid)
-{
-	spin_lock(&pasid_lock);
-	idr_remove(&pasid_idr, pasid);
-	spin_unlock(&pasid_lock);
-}
-
-void *intel_pasid_lookup_id(int pasid)
-{
-	void *p;
-
-	spin_lock(&pasid_lock);
-	p = idr_find(&pasid_idr, pasid);
-	spin_unlock(&pasid_lock);
-
-	return p;
-}
 
 static int check_vcmd_pasid(struct intel_iommu *iommu)
 {
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 9b159132405d..5aef5b7bf561 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -17,6 +17,7 @@
 #include <linux/dmar.h>
 #include <linux/interrupt.h>
 #include <linux/mm_types.h>
+#include <linux/ioasid.h>
 #include <asm/page.h>
 
 #include "intel-pasid.h"
@@ -318,16 +319,15 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (pasid_max > intel_pasid_max_id)
 			pasid_max = intel_pasid_max_id;
 
-		/* Do not use PASID 0 in caching mode (virtualised IOMMU) */
-		ret = intel_pasid_alloc_id(svm,
-					   !!cap_caching_mode(iommu->cap),
-					   pasid_max - 1, GFP_KERNEL);
-		if (ret < 0) {
+		/* Do not use PASID 0, reserved for RID to PASID */
+		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
+					pasid_max - 1, svm);
+		if (svm->pasid == INVALID_IOASID) {
 			kfree(svm);
 			kfree(sdev);
+			ret = ENOSPC;
 			goto out;
 		}
-		svm->pasid = ret;
 		svm->notifier.ops = &intel_mmuops;
 		svm->mm = mm;
 		svm->flags = flags;
@@ -337,7 +337,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (mm) {
 			ret = mmu_notifier_register(&svm->notifier, mm);
 			if (ret) {
-				intel_pasid_free_id(svm->pasid);
+				ioasid_free(svm->pasid);
 				kfree(svm);
 				kfree(sdev);
 				goto out;
@@ -353,7 +353,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (ret) {
 			if (mm)
 				mmu_notifier_unregister(&svm->notifier, mm);
-			intel_pasid_free_id(svm->pasid);
+			ioasid_free(svm->pasid);
 			kfree(svm);
 			kfree(sdev);
 			goto out;
@@ -401,7 +401,12 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
 	if (!iommu)
 		goto out;
 
-	svm = intel_pasid_lookup_id(pasid);
+	svm = ioasid_find(NULL, pasid, NULL);
+	if (IS_ERR(svm)) {
+		ret = PTR_ERR(svm);
+		goto out;
+	}
+
 	if (!svm)
 		goto out;
 
@@ -423,7 +428,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
 				kfree_rcu(sdev, rcu);
 
 				if (list_empty(&svm->devs)) {
-					intel_pasid_free_id(svm->pasid);
+					ioasid_free(svm->pasid);
 					if (svm->mm)
 						mmu_notifier_unregister(&svm->notifier, svm->mm);
 
@@ -458,10 +463,11 @@ int intel_svm_is_pasid_valid(struct device *dev, int pasid)
 	if (!iommu)
 		goto out;
 
-	svm = intel_pasid_lookup_id(pasid);
-	if (!svm)
+	svm = ioasid_find(NULL, pasid, NULL);
+	if (IS_ERR(svm)) {
+		ret = PTR_ERR(svm);
 		goto out;
-
+	}
 	/* init_mm is used in this case */
 	if (!svm->mm)
 		ret = 1;
@@ -568,13 +574,12 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 
 		if (!svm || svm->pasid != req->pasid) {
 			rcu_read_lock();
-			svm = intel_pasid_lookup_id(req->pasid);
+			svm = ioasid_find(NULL, req->pasid, NULL);
 			/* It *can't* go away, because the driver is not permitted
 			 * to unbind the mm while any page faults are outstanding.
 			 * So we only need RCU to protect the internal idr code. */
 			rcu_read_unlock();
-
-			if (!svm) {
+			if (IS_ERR(svm) || !svm) {
 				pr_err("%s: Page request for invalid PASID %d: %08llx %08llx\n",
 				       iommu->name, req->pasid, ((unsigned long long *)req)[0],
 				       ((unsigned long long *)req)[1]);
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 04/10] iommu/vt-d: Move domain helper to header
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
                   ` (2 preceding siblings ...)
  2019-10-22 23:53 ` [PATCH v6 03/10] iommu/vt-d: Replace Intel specific PASID allocator with IOASID Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 05/10] iommu/vt-d: Avoid duplicated code for PASID setup Jacob Pan
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

Move domain helper to header to be used by SVA code.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/intel-iommu.c | 6 ------
 include/linux/intel-iommu.h | 6 ++++++
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 72febcf2c48f..89a976c21f55 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -428,12 +428,6 @@ static void init_translation_status(struct intel_iommu *iommu)
 		iommu->flags |= VTD_FLAG_TRANS_PRE_ENABLED;
 }
 
-/* Convert generic 'struct iommu_domain to private struct dmar_domain */
-static struct dmar_domain *to_dmar_domain(struct iommu_domain *dom)
-{
-	return container_of(dom, struct dmar_domain, domain);
-}
-
 static int __init intel_iommu_setup(char *str)
 {
 	if (!str)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index fb1973a761c1..4ed2ce3341a5 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -590,6 +590,12 @@ static inline void __iommu_flush_cache(
 		clflush_cache_range(addr, size);
 }
 
+/* Convert generic struct iommu_domain to private struct dmar_domain */
+static inline struct dmar_domain *to_dmar_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct dmar_domain, domain);
+}
+
 /*
  * 0: readable
  * 1: writable
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 05/10] iommu/vt-d: Avoid duplicated code for PASID setup
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
                   ` (3 preceding siblings ...)
  2019-10-22 23:53 ` [PATCH v6 04/10] iommu/vt-d: Move domain helper to header Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 06/10] iommu/vt-d: Add nested translation helper function Jacob Pan
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

After each setup for PASID entry, related translation caches must be flushed.
We can combine duplicated code into one function which is less error prone.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel-pasid.c | 48 +++++++++++++++++----------------------------
 1 file changed, 18 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index c0d1f28d3412..9c5affcd28e4 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -512,6 +512,21 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
 		devtlb_invalidation_with_pasid(iommu, dev, pasid);
 }
 
+static void pasid_flush_caches(struct intel_iommu *iommu,
+				struct pasid_entry *pte,
+				int pasid, u16 did)
+{
+	if (!ecap_coherent(iommu->ecap))
+		clflush_cache_range(pte, sizeof(*pte));
+
+	if (cap_caching_mode(iommu->cap)) {
+		pasid_cache_invalidation_with_pasid(iommu, did, pasid);
+		iotlb_invalidation_with_pasid(iommu, did, pasid);
+	} else {
+		iommu_flush_write_buffer(iommu);
+	}
+}
+
 /*
  * Set up the scalable mode pasid table entry for first only
  * translation type.
@@ -557,16 +572,7 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 	/* Setup Present and PASID Granular Transfer Type: */
 	pasid_set_translation_type(pte, 1);
 	pasid_set_present(pte);
-
-	if (!ecap_coherent(iommu->ecap))
-		clflush_cache_range(pte, sizeof(*pte));
-
-	if (cap_caching_mode(iommu->cap)) {
-		pasid_cache_invalidation_with_pasid(iommu, did, pasid);
-		iotlb_invalidation_with_pasid(iommu, did, pasid);
-	} else {
-		iommu_flush_write_buffer(iommu);
-	}
+	pasid_flush_caches(iommu, pte, pasid, did);
 
 	return 0;
 }
@@ -630,16 +636,7 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
 	 */
 	pasid_set_sre(pte);
 	pasid_set_present(pte);
-
-	if (!ecap_coherent(iommu->ecap))
-		clflush_cache_range(pte, sizeof(*pte));
-
-	if (cap_caching_mode(iommu->cap)) {
-		pasid_cache_invalidation_with_pasid(iommu, did, pasid);
-		iotlb_invalidation_with_pasid(iommu, did, pasid);
-	} else {
-		iommu_flush_write_buffer(iommu);
-	}
+	pasid_flush_caches(iommu, pte, pasid, did);
 
 	return 0;
 }
@@ -673,16 +670,7 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 	 */
 	pasid_set_sre(pte);
 	pasid_set_present(pte);
-
-	if (!ecap_coherent(iommu->ecap))
-		clflush_cache_range(pte, sizeof(*pte));
-
-	if (cap_caching_mode(iommu->cap)) {
-		pasid_cache_invalidation_with_pasid(iommu, did, pasid);
-		iotlb_invalidation_with_pasid(iommu, did, pasid);
-	} else {
-		iommu_flush_write_buffer(iommu);
-	}
+	pasid_flush_caches(iommu, pte, pasid, did);
 
 	return 0;
 }
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 06/10] iommu/vt-d: Add nested translation helper function
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
                   ` (4 preceding siblings ...)
  2019-10-22 23:53 ` [PATCH v6 05/10] iommu/vt-d: Avoid duplicated code for PASID setup Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 07/10] iommu/vt-d: Misc macro clean up for SVM Jacob Pan
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

Nested translation mode is supported in VT-d 3.0 Spec.CH 3.8.
With PASID granular translation type set to 0x11b, translation
result from the first level(FL) also subject to a second level(SL)
page table translation. This mode is used for SVA virtualization,
where FL performs guest virtual to guest physical translation and
SL performs guest physical to host physical translation.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
---
 drivers/iommu/intel-pasid.c | 207 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/intel-pasid.h |  12 +++
 2 files changed, 219 insertions(+)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 9c5affcd28e4..fd2c82fd90c1 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -442,6 +442,76 @@ pasid_set_flpm(struct pasid_entry *pe, u64 value)
 	pasid_set_bits(&pe->val[2], GENMASK_ULL(3, 2), value << 2);
 }
 
+/*
+ * Setup the Extended Memory Type(EMT) field (Bits 91-93)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_emt(struct pasid_entry *pe, u64 value)
+{
+	pasid_set_bits(&pe->val[1], GENMASK_ULL(29, 27), value << 27);
+}
+
+/*
+ * Setup the Page Attribute Table (PAT) field (Bits 96-127)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_pat(struct pasid_entry *pe, u64 value)
+{
+	pasid_set_bits(&pe->val[1], GENMASK_ULL(63, 32), value << 27);
+}
+
+/*
+ * Setup the Cache Disable (CD) field (Bit 89)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_cd(struct pasid_entry *pe)
+{
+	pasid_set_bits(&pe->val[1], 1 << 25, 1);
+}
+
+/*
+ * Setup the Extended Memory Type Enable (EMTE) field (Bit 90)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_emte(struct pasid_entry *pe)
+{
+	pasid_set_bits(&pe->val[1], 1 << 26, 1);
+}
+
+/*
+ * Setup the Extended Access Flag Enable (EAFE) field (Bit 135)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_eafe(struct pasid_entry *pe)
+{
+	pasid_set_bits(&pe->val[2], 1 << 7, 1);
+}
+
+/*
+ * Setup the Page-level Cache Disable (PCD) field (Bit 95)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_pcd(struct pasid_entry *pe)
+{
+	pasid_set_bits(&pe->val[1], 1 << 31, 1);
+}
+
+/*
+ * Setup the Page-level Write-Through (PWT)) field (Bit 94)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_pwt(struct pasid_entry *pe)
+{
+	pasid_set_bits(&pe->val[1], 1 << 30, 1);
+}
+
 static void
 pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
 				    u16 did, int pasid)
@@ -674,3 +744,140 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 
 	return 0;
 }
+
+static int intel_pasid_setup_bind_data(struct intel_iommu *iommu,
+				struct pasid_entry *pte,
+				struct iommu_gpasid_bind_data_vtd *pasid_data)
+{
+	/*
+	 * Not all guest PASID table entry fields are passed down during bind,
+	 * here we only set up the ones that are dependent on guest settings.
+	 * Execution related bits such as NXE, SMEP are not meaningful to IOMMU,
+	 * therefore not set. Other fields, such as snoop related, are set based
+	 * on host needs regardless of  guest settings.
+	 */
+	if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_SRE) {
+		if (!ecap_srs(iommu->ecap)) {
+			pr_err("No supervisor request support on %s\n",
+			       iommu->name);
+			return -EINVAL;
+		}
+		pasid_set_sre(pte);
+	}
+
+	if ((pasid_data->flags & IOMMU_SVA_VTD_GPASID_EAFE) && ecap_eafs(iommu->ecap))
+		pasid_set_eafe(pte);
+
+	if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_EMTE) {
+		pasid_set_emte(pte);
+		pasid_set_emt(pte, pasid_data->emt);
+	}
+
+	/*
+	 * Memory type is only applicable to devices inside processor coherent
+	 * domain. PCIe devices are not included. We can skip the rest of the
+	 * flags if IOMMU does not support MTS.
+	 */
+	if (!ecap_mts(iommu->ecap)) {
+		pr_info("%s does not support memory type bind guest PASID\n",
+			iommu->name);
+		return 0;
+	}
+
+	if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_PCD)
+		pasid_set_pcd(pte);
+	if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_PWT)
+		pasid_set_pwt(pte);
+	if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_CD)
+		pasid_set_cd(pte);
+	pasid_set_pat(pte, pasid_data->pat);
+
+	return 0;
+
+}
+
+/**
+ * intel_pasid_setup_nested() - Set up PASID entry for nested translation
+ * which is used for vSVA. The first level page tables are used for
+ * GVA-GPA translation in the guest, second level page tables are used
+ * for GPA to HPA translation.
+ *
+ * @iommu:      Iommu which the device belong to
+ * @dev:        Device to be set up for translation
+ * @gpgd:       FLPTPTR: First Level Page translation pointer in GPA
+ * @pasid:      PASID to be programmed in the device PASID table
+ * @pasid_data: Additional PASID info from the guest bind request
+ * @domain:     Domain info for setting up second level page tables
+ * @addr_width: Address width of the first level (guest)
+ */
+int intel_pasid_setup_nested(struct intel_iommu *iommu,
+			struct device *dev, pgd_t *gpgd,
+			int pasid, struct iommu_gpasid_bind_data_vtd *pasid_data,
+			struct dmar_domain *domain,
+			int addr_width)
+{
+	struct pasid_entry *pte;
+	struct dma_pte *pgd;
+	u64 pgd_val;
+	int agaw;
+	u16 did;
+
+	if (!ecap_nest(iommu->ecap)) {
+		pr_err("IOMMU: %s: No nested translation support\n",
+		       iommu->name);
+		return -EINVAL;
+	}
+
+	pte = intel_pasid_get_entry(dev, pasid);
+	if (WARN_ON(!pte))
+		return -EINVAL;
+
+	pasid_clear_entry(pte);
+
+	/* Sanity checking performed by caller to make sure address
+	 * width matching in two dimensions:
+	 * 1. CPU vs. IOMMU
+	 * 2. Guest vs. Host.
+	 */
+	switch (addr_width) {
+	case 57:
+		pasid_set_flpm(pte, 1);
+		break;
+	case 48:
+		pasid_set_flpm(pte, 0);
+		break;
+	default:
+		dev_err(dev, "Invalid paging mode %d\n", addr_width);
+		return -EINVAL;
+	}
+
+	pasid_set_flptr(pte, (u64)gpgd);
+
+	intel_pasid_setup_bind_data(iommu, pte, pasid_data);
+
+	/* Setup the second level based on the given domain */
+	pgd = domain->pgd;
+
+	for (agaw = domain->agaw; agaw != iommu->agaw; agaw--) {
+		pgd = phys_to_virt(dma_pte_addr(pgd));
+		if (!dma_pte_present(pgd)) {
+			dev_err(dev, "Invalid domain page table\n");
+			return -EINVAL;
+		}
+	}
+	pgd_val = virt_to_phys(pgd);
+	pasid_set_slptr(pte, pgd_val);
+	pasid_set_fault_enable(pte);
+
+	did = domain->iommu_did[iommu->seq_id];
+	pasid_set_domain_id(pte, did);
+
+	pasid_set_address_width(pte, agaw);
+	pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+
+	pasid_set_translation_type(pte, PASID_ENTRY_PGTT_NESTED);
+	pasid_set_present(pte);
+	pasid_flush_caches(iommu, pte, pasid, did);
+
+	return 0;
+}
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index e413e884e685..09c85db73b77 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -46,6 +46,7 @@
  * to vmalloc or even module mappings.
  */
 #define PASID_FLAG_SUPERVISOR_MODE	BIT(0)
+#define PASID_FLAG_NESTED		BIT(1)
 
 struct pasid_dir_entry {
 	u64 val;
@@ -55,6 +56,11 @@ struct pasid_entry {
 	u64 val[8];
 };
 
+#define PASID_ENTRY_PGTT_FL_ONLY	(1)
+#define PASID_ENTRY_PGTT_SL_ONLY	(2)
+#define PASID_ENTRY_PGTT_NESTED		(3)
+#define PASID_ENTRY_PGTT_PT		(4)
+
 /* The representative of a PASID table */
 struct pasid_table {
 	void			*table;		/* pasid table pointer */
@@ -103,6 +109,12 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
 int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 				   struct dmar_domain *domain,
 				   struct device *dev, int pasid);
+int intel_pasid_setup_nested(struct intel_iommu *iommu,
+			struct device *dev, pgd_t *pgd,
+			int pasid,
+			struct iommu_gpasid_bind_data_vtd *pasid_data,
+			struct dmar_domain *domain,
+			int addr_width);
 void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
 				 struct device *dev, int pasid);
 int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid);
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 07/10] iommu/vt-d: Misc macro clean up for SVM
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
                   ` (5 preceding siblings ...)
  2019-10-22 23:53 ` [PATCH v6 06/10] iommu/vt-d: Add nested translation helper function Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 08/10] iommu/vt-d: Add bind guest PASID support Jacob Pan
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

Use combined macros for_each_svm_dev() to simplify SVM device iteration
and error checking.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/intel-svm.c | 85 +++++++++++++++++++++++------------------------
 1 file changed, 41 insertions(+), 44 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 5aef5b7bf561..34323a15849a 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -212,6 +212,10 @@ static const struct mmu_notifier_ops intel_mmuops = {
 static DEFINE_MUTEX(pasid_mutex);
 static LIST_HEAD(global_svm_list);
 
+#define for_each_svm_dev(svm, dev)			\
+	list_for_each_entry(sdev, &svm->devs, list)	\
+	if (dev == sdev->dev)				\
+
 int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops)
 {
 	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
@@ -257,15 +261,13 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 				goto out;
 			}
 
-			list_for_each_entry(sdev, &svm->devs, list) {
-				if (dev == sdev->dev) {
-					if (sdev->ops != ops) {
-						ret = -EBUSY;
-						goto out;
-					}
-					sdev->users++;
-					goto success;
+			for_each_svm_dev(svm, dev) {
+				if (sdev->ops != ops) {
+					ret = -EBUSY;
+					goto out;
 				}
+				sdev->users++;
+				goto success;
 			}
 
 			break;
@@ -402,48 +404,43 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
 		goto out;
 
 	svm = ioasid_find(NULL, pasid, NULL);
-	if (IS_ERR(svm)) {
+	if (IS_ERR_OR_NULL(svm)) {
 		ret = PTR_ERR(svm);
 		goto out;
 	}
 
-	if (!svm)
-		goto out;
+	for_each_svm_dev(svm, dev) {
+		ret = 0;
+		sdev->users--;
+		if (!sdev->users) {
+			list_del_rcu(&sdev->list);
+			/* Flush the PASID cache and IOTLB for this device.
+			 * Note that we do depend on the hardware *not* using
+			 * the PASID any more. Just as we depend on other
+			 * devices never using PASIDs that they have no right
+			 * to use. We have a *shared* PASID table, because it's
+			 * large and has to be physically contiguous. So it's
+			 * hard to be as defensive as we might like. */
+			intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
+			intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
+			kfree_rcu(sdev, rcu);
+
+			if (list_empty(&svm->devs)) {
+				ioasid_free(svm->pasid);
+				if (svm->mm)
+					mmu_notifier_unregister(&svm->notifier, svm->mm);
 
-	list_for_each_entry(sdev, &svm->devs, list) {
-		if (dev == sdev->dev) {
-			ret = 0;
-			sdev->users--;
-			if (!sdev->users) {
-				list_del_rcu(&sdev->list);
-				/* Flush the PASID cache and IOTLB for this device.
-				 * Note that we do depend on the hardware *not* using
-				 * the PASID any more. Just as we depend on other
-				 * devices never using PASIDs that they have no right
-				 * to use. We have a *shared* PASID table, because it's
-				 * large and has to be physically contiguous. So it's
-				 * hard to be as defensive as we might like. */
-				intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
-				intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
-				kfree_rcu(sdev, rcu);
-
-				if (list_empty(&svm->devs)) {
-					ioasid_free(svm->pasid);
-					if (svm->mm)
-						mmu_notifier_unregister(&svm->notifier, svm->mm);
-
-					list_del(&svm->list);
-
-					/* We mandate that no page faults may be outstanding
-					 * for the PASID when intel_svm_unbind_mm() is called.
-					 * If that is not obeyed, subtle errors will happen.
-					 * Let's make them less subtle... */
-					memset(svm, 0x6b, sizeof(*svm));
-					kfree(svm);
-				}
+				list_del(&svm->list);
+
+				/* We mandate that no page faults may be outstanding
+				 * for the PASID when intel_svm_unbind_mm() is called.
+				 * If that is not obeyed, subtle errors will happen.
+				 * Let's make them less subtle... */
+				memset(svm, 0x6b, sizeof(*svm));
+				kfree(svm);
 			}
-			break;
 		}
+		break;
 	}
  out:
 	mutex_unlock(&pasid_mutex);
@@ -579,7 +576,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 			 * to unbind the mm while any page faults are outstanding.
 			 * So we only need RCU to protect the internal idr code. */
 			rcu_read_unlock();
-			if (IS_ERR(svm) || !svm) {
+			if (IS_ERR_OR_NULL(svm)) {
 				pr_err("%s: Page request for invalid PASID %d: %08llx %08llx\n",
 				       iommu->name, req->pasid, ((unsigned long long *)req)[0],
 				       ((unsigned long long *)req)[1]);
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 08/10] iommu/vt-d: Add bind guest PASID support
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
                   ` (6 preceding siblings ...)
  2019-10-22 23:53 ` [PATCH v6 07/10] iommu/vt-d: Misc macro clean up for SVM Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 09/10] iommu/vt-d: Support flushing more translation cache types Jacob Pan
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

When supporting guest SVA with emulated IOMMU, the guest PASID
table is shadowed in VMM. Updates to guest vIOMMU PASID table
will result in PASID cache flush which will be passed down to
the host as bind guest PASID calls.

For the SL page tables, it will be harvested from device's
default domain (request w/o PASID), or aux domain in case of
mediated device.

    .-------------.  .---------------------------.
    |   vIOMMU    |  | Guest process CR3, FL only|
    |             |  '---------------------------'
    .----------------/
    | PASID Entry |--- PASID cache flush -
    '-------------'                       |
    |             |                       V
    |             |                CR3 in GPA
    '-------------'
Guest
------| Shadow |--------------------------|--------
      v        v                          v
Host
    .-------------.  .----------------------.
    |   pIOMMU    |  | Bind FL for GVA-GPA  |
    |             |  '----------------------'
    .----------------/  |
    | PASID Entry |     V (Nested xlate)
    '----------------\.------------------------------.
    |             |   |SL for GPA-HPA, default domain|
    |             |   '------------------------------'
    '-------------'
Where:
 - FL = First level/stage one page tables
 - SL = Second level/stage two page tables

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c |   4 +
 drivers/iommu/intel-svm.c   | 184 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/intel-iommu.h |   8 +-
 include/linux/intel-svm.h   |  17 ++++
 4 files changed, 212 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 89a976c21f55..51e2a23ffd4e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -6026,6 +6026,10 @@ const struct iommu_ops intel_iommu_ops = {
 	.dev_disable_feat	= intel_iommu_dev_disable_feat,
 	.is_attach_deferred	= intel_iommu_is_attach_deferred,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
+#ifdef CONFIG_INTEL_IOMMU_SVM
+	.sva_bind_gpasid	= intel_svm_bind_gpasid,
+	.sva_unbind_gpasid	= intel_svm_unbind_gpasid,
+#endif
 };
 
 static void quirk_iommu_igfx(struct pci_dev *dev)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 34323a15849a..55f4128bd6bf 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -216,6 +216,190 @@ static LIST_HEAD(global_svm_list);
 	list_for_each_entry(sdev, &svm->devs, list)	\
 	if (dev == sdev->dev)				\
 
+int intel_svm_bind_gpasid(struct iommu_domain *domain,
+			struct device *dev,
+			struct iommu_gpasid_bind_data *data)
+{
+	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
+	struct dmar_domain *ddomain;
+	struct intel_svm_dev *sdev;
+	struct intel_svm *svm;
+	int ret = 0;
+
+	if (WARN_ON(!iommu) || !data)
+		return -EINVAL;
+
+	if (data->version != IOMMU_GPASID_BIND_VERSION_1 ||
+		data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
+		return -EINVAL;
+
+	if (dev_is_pci(dev)) {
+		/* VT-d supports devices with full 20 bit PASIDs only */
+		if (pci_max_pasids(to_pci_dev(dev)) != PASID_MAX)
+			return -EINVAL;
+	}
+
+	/*
+	 * We only check host PASID range, we have no knowledge to check
+	 * guest PASID range nor do we use the guest PASID.
+	 */
+	if (data->hpasid <= 0 || data->hpasid >= PASID_MAX)
+		return -EINVAL;
+
+	ddomain = to_dmar_domain(domain);
+	/* REVISIT:
+	 * Sanity check adddress width and paging mode support
+	 * width matching in two dimensions:
+	 * 1. paging mode CPU <= IOMMU
+	 * 2. address width Guest <= Host.
+	 */
+	mutex_lock(&pasid_mutex);
+	svm = ioasid_find(NULL, data->hpasid, NULL);
+	if (IS_ERR(svm)) {
+		ret = PTR_ERR(svm);
+		goto out;
+	}
+	if (svm) {
+		/*
+		 * If we found svm for the PASID, there must be at
+		 * least one device bond, otherwise svm should be freed.
+		 */
+		BUG_ON(list_empty(&svm->devs));
+
+		for_each_svm_dev(svm, dev) {
+			/* In case of multiple sub-devices of the same pdev assigned, we should
+			 * allow multiple bind calls with the same PASID and pdev.
+			 */
+			sdev->users++;
+			goto out;
+		}
+	} else {
+		/* We come here when PASID has never been bond to a device. */
+		svm = kzalloc(sizeof(*svm), GFP_KERNEL);
+		if (!svm) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		/* REVISIT: upper layer/VFIO can track host process that bind the PASID.
+		 * ioasid_set = mm might be sufficient for vfio to check pasid VMM
+		 * ownership.
+		 */
+		svm->mm = get_task_mm(current);
+		svm->pasid = data->hpasid;
+		if (data->flags & IOMMU_SVA_GPASID_VAL) {
+			svm->gpasid = data->gpasid;
+			svm->flags |= SVM_FLAG_GUEST_PASID;
+		}
+		ioasid_set_data(data->hpasid, svm);
+		INIT_LIST_HEAD_RCU(&svm->devs);
+		INIT_LIST_HEAD(&svm->list);
+
+		mmput(svm->mm);
+	}
+	sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
+	if (!sdev) {
+		if (list_empty(&svm->devs))
+			kfree(svm);
+		ret = -ENOMEM;
+		goto out;
+	}
+	sdev->dev = dev;
+	sdev->users = 1;
+
+	/* Set up device context entry for PASID if not enabled already */
+	ret = intel_iommu_enable_pasid(iommu, sdev->dev);
+	if (ret) {
+		dev_err(dev, "Failed to enable PASID capability\n");
+		kfree(sdev);
+		goto out;
+	}
+
+	/*
+	 * For guest bind, we need to set up PASID table entry as follows:
+	 * - FLPM matches guest paging mode
+	 * - turn on nested mode
+	 * - SL guest address width matching
+	 */
+	ret = intel_pasid_setup_nested(iommu,
+				dev,
+				(pgd_t *)data->gpgd,
+				data->hpasid,
+				&data->vtd,
+				ddomain,
+				data->addr_width);
+	if (ret) {
+		dev_err(dev, "Failed to set up PASID %llu in nested mode, Err %d\n",
+			data->hpasid, ret);
+		kfree(sdev);
+		goto out;
+	}
+	svm->flags |= SVM_FLAG_GUEST_MODE;
+
+	init_rcu_head(&sdev->rcu);
+	list_add_rcu(&sdev->list, &svm->devs);
+ out:
+	mutex_unlock(&pasid_mutex);
+	return ret;
+}
+
+int intel_svm_unbind_gpasid(struct device *dev, int pasid)
+{
+	struct intel_svm_dev *sdev;
+	struct intel_iommu *iommu;
+	struct intel_svm *svm;
+	int ret = -EINVAL;
+
+	mutex_lock(&pasid_mutex);
+	iommu = intel_svm_device_to_iommu(dev);
+	if (!iommu)
+		goto out;
+
+	svm = ioasid_find(NULL, pasid, NULL);
+	if (IS_ERR_OR_NULL(svm)) {
+		ret = PTR_ERR(svm);
+		goto out;
+	}
+
+	for_each_svm_dev(svm, dev) {
+		ret = 0;
+		sdev->users--;
+		if (!sdev->users) {
+			list_del_rcu(&sdev->list);
+			intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
+			/* TODO: Drain in flight PRQ for the PASID since it
+			 * may get reused soon, we don't want to
+			 * confuse with its previous life.
+			 * intel_svm_drain_prq(dev, pasid);
+			 */
+			kfree_rcu(sdev, rcu);
+
+			if (list_empty(&svm->devs)) {
+				list_del(&svm->list);
+				kfree(svm);
+				/*
+				 * We do not free PASID here until explicit call
+				 * from VFIO to free. The PASID life cycle
+				 * management is largely tied to VFIO management
+				 * of assigned device life cycles. In case of
+				 * guest exit without a explicit free PASID call,
+				 * the responsibility lies in VFIO layer to free
+				 * the PASIDs allocated for the guest.
+				 * For security reasons, VFIO has to track the
+				 * PASID ownership per guest anyway to ensure
+				 * that PASID allocated by one guest cannot be
+				 * used by another.
+				 */
+				ioasid_set_data(pasid, NULL);
+			}
+		}
+		break;
+	}
+ out:
+	mutex_unlock(&pasid_mutex);
+
+	return ret;
+}
+
 int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops)
 {
 	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 4ed2ce3341a5..0b63ec142bea 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -669,7 +669,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev);
 int intel_svm_init(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
-
+extern int intel_svm_bind_gpasid(struct iommu_domain *domain,
+		struct device *dev, struct iommu_gpasid_bind_data *data);
+extern int intel_svm_unbind_gpasid(struct device *dev, int pasid);
 struct svm_dev_ops;
 
 struct intel_svm_dev {
@@ -686,9 +688,13 @@ struct intel_svm_dev {
 struct intel_svm {
 	struct mmu_notifier notifier;
 	struct mm_struct *mm;
+
 	struct intel_iommu *iommu;
 	int flags;
 	int pasid;
+	int gpasid; /* Guest PASID in case of vSVA bind with non-identity host
+		     * to guest PASID mapping.
+		     */
 	struct list_head devs;
 	struct list_head list;
 };
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index 94f047a8a845..a2c189ad0b01 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -44,6 +44,23 @@ struct svm_dev_ops {
  * do such IOTLB flushes automatically.
  */
 #define SVM_FLAG_SUPERVISOR_MODE	(1<<1)
+/*
+ * The SVM_FLAG_GUEST_MODE flag is used when a guest process bind to a device.
+ * In this case the mm_struct is in the guest kernel or userspace, its life
+ * cycle is managed by VMM and VFIO layer. For IOMMU driver, this API provides
+ * means to bind/unbind guest CR3 with PASIDs allocated for a device.
+ */
+#define SVM_FLAG_GUEST_MODE	(1<<2)
+/*
+ * The SVM_FLAG_GUEST_PASID flag is used when a guest has its own PASID space,
+ * which requires guest and host PASID translation at both directions. We keep
+ * track of guest PASID in order to provide lookup service to device drivers.
+ * One such example is a physical function (PF) driver that supports mediated
+ * device (mdev) assignment. Guest programming of mdev configuration space can
+ * only be done with guest PASID, therefore PF driver needs to find the matching
+ * host PASID to program the real hardware.
+ */
+#define SVM_FLAG_GUEST_PASID	(1<<3)
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
 
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 09/10] iommu/vt-d: Support flushing more translation cache types
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
                   ` (7 preceding siblings ...)
  2019-10-22 23:53 ` [PATCH v6 08/10] iommu/vt-d: Add bind guest PASID support Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-22 23:53 ` [PATCH v6 10/10] iommu/vt-d: Add svm/sva invalidate function Jacob Pan
  2019-10-23  0:27 ` [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Raj, Ashok
  10 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

When Shared Virtual Memory is exposed to a guest via vIOMMU, scalable
IOTLB invalidation may be passed down from outside IOMMU subsystems.
This patch adds invalidation functions that can be used for additional
translation cache types.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/dmar.c        | 46 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/intel-pasid.c |  3 ++-
 include/linux/intel-iommu.h | 21 +++++++++++++++++----
 3 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index eecd6a421667..14604e95e09e 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1345,6 +1345,20 @@ void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 	qi_submit_sync(&desc, iommu);
 }
 
+/* PASID-based IOTLB Invalidate */
+void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u64 addr, u32 pasid,
+		unsigned int size_order, u64 granu, int ih)
+{
+	struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
+
+	desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
+		QI_EIOTLB_GRAN(granu) | QI_EIOTLB_TYPE;
+	desc.qw1 = QI_EIOTLB_ADDR(addr) | QI_EIOTLB_IH(ih) |
+		QI_EIOTLB_AM(size_order);
+
+	qi_submit_sync(&desc, iommu);
+}
+
 void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 			u16 qdep, u64 addr, unsigned mask)
 {
@@ -1368,6 +1382,38 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 	qi_submit_sync(&desc, iommu);
 }
 
+/* PASID-based device IOTLB Invalidate */
+void qi_flush_dev_piotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
+		u32 pasid,  u16 qdep, u64 addr, unsigned size_order, u64 granu)
+{
+	struct qi_desc desc;
+
+	desc.qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) |
+		QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE |
+		QI_DEV_IOTLB_PFSID(pfsid);
+	desc.qw1 = QI_DEV_EIOTLB_GLOB(granu);
+
+	/* If S bit is 0, we only flush a single page. If S bit is set,
+	 * The least significant zero bit indicates the invalidation address
+	 * range. VT-d spec 6.5.2.6.
+	 * e.g. address bit 12[0] indicates 8KB, 13[0] indicates 16KB.
+	 */
+	if (!size_order) {
+		desc.qw0 |= QI_DEV_EIOTLB_ADDR(addr) & ~QI_DEV_EIOTLB_SIZE;
+	} else {
+		unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size_order);
+		desc.qw1 |= QI_DEV_EIOTLB_ADDR(addr & ~mask) | QI_DEV_EIOTLB_SIZE;
+	}
+	qi_submit_sync(&desc, iommu);
+}
+
+void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu, int pasid)
+{
+	struct qi_desc desc = {.qw1 = 0, .qw2 = 0, .qw3 = 0};
+
+	desc.qw0 = QI_PC_PASID(pasid) | QI_PC_DID(did) | QI_PC_GRAN(granu) | QI_PC_TYPE;
+	qi_submit_sync(&desc, iommu);
+}
 /*
  * Disable Queued Invalidation interface.
  */
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index fd2c82fd90c1..ff7e877b7a4d 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -518,7 +518,8 @@ pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
 {
 	struct qi_desc desc;
 
-	desc.qw0 = QI_PC_DID(did) | QI_PC_PASID_SEL | QI_PC_PASID(pasid);
+	desc.qw0 = QI_PC_DID(did) | QI_PC_GRAN(QI_PC_PASID_SEL) |
+		QI_PC_PASID(pasid) | QI_PC_TYPE;
 	desc.qw1 = 0;
 	desc.qw2 = 0;
 	desc.qw3 = 0;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 0b63ec142bea..14b87ae2916a 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -329,7 +329,7 @@ enum {
 #define QI_IOTLB_GRAN(gran) 	(((u64)gran) >> (DMA_TLB_FLUSH_GRANU_OFFSET-4))
 #define QI_IOTLB_ADDR(addr)	(((u64)addr) & VTD_PAGE_MASK)
 #define QI_IOTLB_IH(ih)		(((u64)ih) << 6)
-#define QI_IOTLB_AM(am)		(((u8)am))
+#define QI_IOTLB_AM(am)		(((u8)am) & 0x3f)
 
 #define QI_CC_FM(fm)		(((u64)fm) << 48)
 #define QI_CC_SID(sid)		(((u64)sid) << 32)
@@ -347,16 +347,21 @@ enum {
 #define QI_PC_DID(did)		(((u64)did) << 16)
 #define QI_PC_GRAN(gran)	(((u64)gran) << 4)
 
-#define QI_PC_ALL_PASIDS	(QI_PC_TYPE | QI_PC_GRAN(0))
-#define QI_PC_PASID_SEL		(QI_PC_TYPE | QI_PC_GRAN(1))
+/* PASID cache invalidation granu */
+#define QI_PC_ALL_PASIDS	0
+#define QI_PC_PASID_SEL		1
 
 #define QI_EIOTLB_ADDR(addr)	((u64)(addr) & VTD_PAGE_MASK)
 #define QI_EIOTLB_IH(ih)	(((u64)ih) << 6)
-#define QI_EIOTLB_AM(am)	(((u64)am))
+#define QI_EIOTLB_AM(am)	(((u64)am) & 0x3f)
 #define QI_EIOTLB_PASID(pasid) 	(((u64)pasid) << 32)
 #define QI_EIOTLB_DID(did)	(((u64)did) << 16)
 #define QI_EIOTLB_GRAN(gran) 	(((u64)gran) << 4)
 
+/* QI Dev-IOTLB inv granu */
+#define QI_DEV_IOTLB_GRAN_ALL		1
+#define QI_DEV_IOTLB_GRAN_PASID_SEL	0
+
 #define QI_DEV_EIOTLB_ADDR(a)	((u64)(a) & VTD_PAGE_MASK)
 #define QI_DEV_EIOTLB_SIZE	(((u64)1) << 11)
 #define QI_DEV_EIOTLB_GLOB(g)	((u64)g)
@@ -651,8 +656,16 @@ extern void qi_flush_context(struct intel_iommu *iommu, u16 did, u16 sid,
 			     u8 fm, u64 type);
 extern void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 			  unsigned int size_order, u64 type);
+extern void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u64 addr,
+			u32 pasid, unsigned int size_order, u64 type, int ih);
 extern void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 			u16 qdep, u64 addr, unsigned mask);
+
+extern void qi_flush_dev_piotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
+			u32 pasid, u16 qdep, u64 addr, unsigned size_order, u64 granu);
+
+extern void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu, int pasid);
+
 extern int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu);
 
 extern int dmar_ir_support(void);
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 10/10] iommu/vt-d: Add svm/sva invalidate function
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
                   ` (8 preceding siblings ...)
  2019-10-22 23:53 ` [PATCH v6 09/10] iommu/vt-d: Support flushing more translation cache types Jacob Pan
@ 2019-10-22 23:53 ` Jacob Pan
  2019-10-23  0:27 ` [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Raj, Ashok
  10 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-22 23:53 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, David Woodhouse, Alex Williamson,
	Jean-Philippe Brucker
  Cc: Tian, Kevin, Raj Ashok, Jonathan Cameron

When Shared Virtual Address (SVA) is enabled for a guest OS via
vIOMMU, we need to provide invalidation support at IOMMU API and driver
level. This patch adds Intel VT-d specific function to implement
iommu passdown invalidate API for shared virtual address.

The use case is for supporting caching structure invalidation
of assigned SVM capable devices. Emulated IOMMU exposes queue
invalidation capability and passes down all descriptors from the guest
to the physical IOMMU.

The assumption is that guest to host device ID mapping should be
resolved prior to calling IOMMU driver. Based on the device handle,
host IOMMU driver can replace certain fields before submit to the
invalidation queue.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 170 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 51e2a23ffd4e..4cfff0dd3d3b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5491,6 +5491,175 @@ static void intel_iommu_aux_detach_device(struct iommu_domain *domain,
 	aux_domain_remove_dev(to_dmar_domain(domain), dev);
 }
 
+/*
+ * 2D array for converting and sanitizing IOMMU generic TLB granularity to
+ * VT-d granularity. Invalidation is typically included in the unmap operation
+ * as a result of DMA or VFIO unmap. However, for assigned device where guest
+ * could own the first level page tables without being shadowed by QEMU. In
+ * this case there is no pass down unmap to the host IOMMU as a result of unmap
+ * in the guest. Only invalidations are trapped and passed down.
+ * In all cases, only first level TLB invalidation (request with PASID) can be
+ * passed down, therefore we do not include IOTLB granularity for request
+ * without PASID (second level).
+ *
+ * For an example, to find the VT-d granularity encoding for IOTLB
+ * type and page selective granularity within PASID:
+ * X: indexed by iommu cache type
+ * Y: indexed by enum iommu_inv_granularity
+ * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR]
+ *
+ * Granu_map array indicates validity of the table. 1: valid, 0: invalid
+ *
+ */
+const static int inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
+	/* PASID based IOTLB, support PASID selective and page selective */
+	{0, 1, 1},
+	/* PASID based dev TLBs, only support all PASIDs or single PASID */
+	{1, 1, 0},
+	/* PASID cache */
+	{1, 1, 0}
+};
+
+const static u64 inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
+	/* PASID based IOTLB */
+	{0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
+	/* PASID based dev TLBs */
+	{QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
+	/* PASID cache */
+	{QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
+};
+
+static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu)
+{
+	if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >= IOMMU_INV_GRANU_NR ||
+		!inv_type_granu_map[type][granu])
+		return -EINVAL;
+
+	*vtd_granu = inv_type_granu_table[type][granu];
+
+	return 0;
+}
+
+static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules)
+{
+	u64 nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT;
+
+	/* VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc.
+	 * IOMMU cache invalidate API passes granu_size in bytes, and number of
+	 * granu size in contiguous memory.
+	 */
+	return order_base_2(nr_pages);
+}
+
+#ifdef CONFIG_INTEL_IOMMU_SVM
+static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
+		struct device *dev, struct iommu_cache_invalidate_info *inv_info)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	struct device_domain_info *info;
+	struct intel_iommu *iommu;
+	unsigned long flags;
+	int cache_type;
+	u8 bus, devfn;
+	u16 did, sid;
+	int ret = 0;
+	u64 size;
+
+	if (!inv_info || !dmar_domain ||
+		inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+		return -EINVAL;
+
+	if (!dev || !dev_is_pci(dev))
+		return -ENODEV;
+
+	iommu = device_to_iommu(dev, &bus, &devfn);
+	if (!iommu)
+		return -ENODEV;
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	spin_lock(&iommu->lock);
+	info = iommu_support_dev_iotlb(dmar_domain, iommu, bus, devfn);
+	if (!info) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+	did = dmar_domain->iommu_did[iommu->seq_id];
+	sid = PCI_DEVID(bus, devfn);
+	size = to_vtd_size(inv_info->addr_info.granule_size, inv_info->addr_info.nb_granules);
+
+	for_each_set_bit(cache_type, (unsigned long *)&inv_info->cache, IOMMU_CACHE_INV_TYPE_NR) {
+		u64 granu = 0;
+		u64 pasid = 0;
+
+		ret = to_vtd_granularity(cache_type, inv_info->granularity, &granu);
+		if (ret) {
+			pr_err("Invalid cache type and granu combination %d/%d\n", cache_type,
+				inv_info->granularity);
+			break;
+		}
+
+		/* PASID is stored in different locations based on granularity */
+		if (inv_info->granularity == IOMMU_INV_GRANU_PASID)
+			pasid = inv_info->pasid_info.pasid;
+		else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR)
+			pasid = inv_info->addr_info.pasid;
+		else {
+			pr_err("Cannot find PASID for given cache type and granularity\n");
+			break;
+		}
+
+		switch (BIT(cache_type)) {
+		case IOMMU_CACHE_INV_TYPE_IOTLB:
+			if (size && (inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + size)) - 1))) {
+				pr_err("Address out of range, 0x%llx, size order %llu\n",
+					inv_info->addr_info.addr, size);
+				ret = -ERANGE;
+				goto out_unlock;
+			}
+
+			qi_flush_piotlb(iommu, did, mm_to_dma_pfn(inv_info->addr_info.addr),
+					pasid, size, granu, inv_info->addr_info.flags & IOMMU_INV_ADDR_FLAGS_LEAF);
+
+			/*
+			 * Always flush device IOTLB if ATS is enabled since guest
+			 * vIOMMU exposes CM = 1, no device IOTLB flush will be passed
+			 * down.
+			 */
+			if (info->ats_enabled) {
+				qi_flush_dev_piotlb(iommu, sid, info->pfsid,
+						pasid, info->ats_qdep,
+						inv_info->addr_info.addr, size,
+						granu);
+			}
+			break;
+		case IOMMU_CACHE_INV_TYPE_DEV_IOTLB:
+			if (info->ats_enabled) {
+				qi_flush_dev_piotlb(iommu, sid, info->pfsid,
+						inv_info->addr_info.pasid, info->ats_qdep,
+						inv_info->addr_info.addr, size,
+						granu);
+			} else
+				pr_warn("Passdown device IOTLB flush w/o ATS!\n");
+
+			break;
+		case IOMMU_CACHE_INV_TYPE_PASID:
+			qi_flush_pasid_cache(iommu, did, granu, inv_info->pasid_info.pasid);
+
+			break;
+		default:
+			dev_err(dev, "Unsupported IOMMU invalidation type %d\n",
+				cache_type);
+			ret = -EINVAL;
+		}
+	}
+out_unlock:
+	spin_unlock(&iommu->lock);
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+
+	return ret;
+}
+#endif
+
 static int intel_iommu_map(struct iommu_domain *domain,
 			   unsigned long iova, phys_addr_t hpa,
 			   size_t size, int iommu_prot)
@@ -6027,6 +6196,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.is_attach_deferred	= intel_iommu_is_attach_deferred,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 #ifdef CONFIG_INTEL_IOMMU_SVM
+	.cache_invalidate	= intel_iommu_sva_invalidate,
 	.sva_bind_gpasid	= intel_svm_bind_gpasid,
 	.sva_unbind_gpasid	= intel_svm_unbind_gpasid,
 #endif
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support
  2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
                   ` (9 preceding siblings ...)
  2019-10-22 23:53 ` [PATCH v6 10/10] iommu/vt-d: Add svm/sva invalidate function Jacob Pan
@ 2019-10-23  0:27 ` Raj, Ashok
  2019-10-23 13:46   ` Jacob Pan
  10 siblings, 1 reply; 27+ messages in thread
From: Raj, Ashok @ 2019-10-23  0:27 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Tian, Kevin, Ashok Raj, David Woodhouse, iommu, LKML,
	Alex Williamson, Jean-Philippe Brucker, Jonathan Cameron

Hi Jacob

On Tue, Oct 22, 2019 at 04:53:13PM -0700, Jacob Pan wrote:
> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on Intel
> platforms allow address space sharing between device DMA and applications.
> SVA can reduce programming complexity and enhance security.
> This series is intended to enable SVA virtualization, i.e. shared guest
> application address space and physical device DMA address. Only IOMMU portion

Last line "i.e shared guest application address space and physical device 
DMA address" doesn't read well. 

Simply rephrase as "enable use of SVA within a guest user application"

> of the changes are included in this series. Additional support is needed in
> VFIO and QEMU (will be submitted separately) to complete this functionality.
> 
> To make incremental changes and reduce the size of each patchset. This series
> does not inlcude support for page request services.
> 
> In VT-d implementation, PASID table is per device and maintained in the host.
> Guest PASID table is shadowed in VMM where virtual IOMMU is emulated.
> 
>     .-------------.  .---------------------------.
>     |   vIOMMU    |  | Guest process CR3, FL only|
>     |             |  '---------------------------'
>     .----------------/
>     | PASID Entry |--- PASID cache flush -
>     '-------------'                       |
>     |             |                       V
>     |             |                CR3 in GPA
>     '-------------'
> Guest
> ------| Shadow |--------------------------|--------
>       v        v                          v
> Host
>     .-------------.  .----------------------.
>     |   pIOMMU    |  | Bind FL for GVA-GPA  |
>     |             |  '----------------------'
>     .----------------/  |
>     | PASID Entry |     V (Nested xlate)
>     '----------------\.------------------------------.
>     |             |   |SL for GPA-HPA, default domain|
>     |             |   '------------------------------'
>     '-------------'

is the SL for GPA-HPA default domain? When you assign a device to 
Guest, this isn't default domain right?

> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables
> 
> This is the remaining VT-d only portion of V5 since the uAPIs and IOASID common
> code have been applied to Joerg's IOMMU core branch.
> (https://lkml.org/lkml/2019/10/2/833)
> 
> The complete set with VFIO patches are here:
> https://github.com/jacobpan/linux.git:siov_sva
> 
> The complete nested SVA upstream patches are divided into three phases:
>     1. Common APIs and PCI device direct assignment
>     2. Page Request Services (PRS) support
>     3. Mediated device assignment
> 
> With this set and the accompanied VFIO code, we will achieve phase #1.
> 
> Thanks,
> 
> Jacob
> 
> ChangeLog:
> 	- V6
> 	  - Rebased on top of Joerg's core branch
> 	  (git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git core)
> 	  - Adapt to new uAPIs and IOASID allocators
> 
> 	- V5
> 	  Rebased on v5.3-rc4 which has some of the IOMMU fault APIs merged.
>  	  Addressed v4 review comments from Eric Auger, Baolu Lu, and
> 	    Jonathan Cameron. Specific changes are as follows:
> 	  - Refined custom IOASID allocator to support multiple vIOMMU, hotplug
> 	    cases.
> 	  - Extracted vendor data from IOMMU guest PASID bind data, for VT-d
> 	    will support all necessary guest PASID entry fields for PASID
> 	    bind.
> 	  - Support non-identity host-guest PASID mapping
> 	  - Exception handling in various cases
> 
> 	- V4
> 	  - Redesigned IOASID allocator such that it can support custom
> 	  allocators with shared helper functions. Use separate XArray
> 	  to store IOASIDs per allocator. Took advice from Eric Auger to
> 	  have default allocator use the generic allocator structure.
> 	  Combined into one patch in that the default allocator is just
> 	  "another" allocator now. Can be built as a module in case of
> 	  driver use without IOMMU.
> 	  - Extended bind guest PASID data to support SMMU and non-identity
> 	  guest to host PASID mapping https://lkml.org/lkml/2019/5/21/802
> 	  - Rebased on Jean's sva/api common tree, new patches starts with
> 	   [PATCH v4 10/22]
> 
> 	- V3
> 	  - Addressed thorough review comments from Eric Auger (Thank you!)
> 	  - Moved IOASID allocator from driver core to IOMMU code per
> 	    suggestion by Christoph Hellwig
> 	    (https://lkml.org/lkml/2019/4/26/462)
> 	  - Rebased on top of Jean's SVA API branch and Eric's v7[1]
> 	    (git://linux-arm.org/linux-jpb.git sva/api)
> 	  - All IOMMU APIs are unmodified (except the new bind guest PASID
> 	    call in patch 9/16)
> 
> 	- V2
> 	  - Rebased on Joerg's IOMMU x86/vt-d branch v5.1-rc4
> 	  - Integrated with Eric Auger's new v7 series for common APIs
> 	  (https://github.com/eauger/linux/tree/v5.1-rc3-2stage-v7)
> 	  - Addressed review comments from Andy Shevchenko and Alex Williamson on
> 	    IOASID custom allocator.
> 	  - Support multiple custom IOASID allocators (vIOMMUs) and dynamic
> 	    registration.
> 
> 
> Jacob Pan (9):
>   iommu/vt-d: Add custom allocator for IOASID
>   iommu/vt-d: Replace Intel specific PASID allocator with IOASID
>   iommu/vt-d: Move domain helper to header
>   iommu/vt-d: Avoid duplicated code for PASID setup
>   iommu/vt-d: Add nested translation helper function
>   iommu/vt-d: Misc macro clean up for SVM
>   iommu/vt-d: Add bind guest PASID support
>   iommu/vt-d: Support flushing more translation cache types
>   iommu/vt-d: Add svm/sva invalidate function
> 
> Lu Baolu (1):
>   iommu/vt-d: Enlightened PASID allocation
> 
>  drivers/iommu/Kconfig       |   1 +
>  drivers/iommu/dmar.c        |  46 ++++++
>  drivers/iommu/intel-iommu.c | 259 +++++++++++++++++++++++++++++++--
>  drivers/iommu/intel-pasid.c | 343 +++++++++++++++++++++++++++++++++++++-------
>  drivers/iommu/intel-pasid.h |  25 +++-
>  drivers/iommu/intel-svm.c   | 298 ++++++++++++++++++++++++++++++--------
>  include/linux/intel-iommu.h |  39 ++++-
>  include/linux/intel-svm.h   |  17 +++
>  8 files changed, 904 insertions(+), 124 deletions(-)
> 
> -- 
> 2.7.4
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation
  2019-10-22 23:53 ` [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation Jacob Pan
@ 2019-10-23  0:45   ` Raj, Ashok
  2019-10-23  1:53     ` Lu Baolu
  0 siblings, 1 reply; 27+ messages in thread
From: Raj, Ashok @ 2019-10-23  0:45 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Tian, Kevin, Ashok Raj, David Woodhouse, iommu, LKML,
	Alex Williamson, Jean-Philippe Brucker, Jonathan Cameron

On Tue, Oct 22, 2019 at 04:53:14PM -0700, Jacob Pan wrote:
> From: Lu Baolu <baolu.lu@linux.intel.com>
> 
> If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
> IOMMU driver should rely on the emulation software to allocate
> and free PASID IDs. The Intel vt-d spec revision 3.0 defines a
> register set to support this. This includes a capability register,
> a virtual command register and a virtual response register. Refer
> to section 10.4.42, 10.4.43, 10.4.44 for more information.

The above paragraph seems a bit confusing, there is no reference
to caching mode for for VCMD... some suggestion below.

Enabling IOMMU in a guest requires communication with the host
driver for certain aspects. Use of PASID ID to enable Shared Virtual
Addressing (SVA) requires managing PASID's in the host. VT-d 3.0 spec
provides a Virtual Command Register (VCMD) to facilitate this.
Writes to this register in the guest are trapped by Qemu and 
proxies the call to the host driver.... 


> 
> This patch adds the enlightened PASID allocation/free interfaces
> via the virtual command register.
> 
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> ---
>  drivers/iommu/intel-pasid.c | 83 +++++++++++++++++++++++++++++++++++++++++++++
>  drivers/iommu/intel-pasid.h | 13 ++++++-
>  include/linux/intel-iommu.h |  2 ++
>  3 files changed, 97 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 040a445be300..76bcbb21e112 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -63,6 +63,89 @@ void *intel_pasid_lookup_id(int pasid)
>  	return p;
>  }
>  
> +static int check_vcmd_pasid(struct intel_iommu *iommu)
> +{
> +	u64 cap;
> +
> +	if (!ecap_vcs(iommu->ecap)) {
> +		pr_warn("IOMMU: %s: Hardware doesn't support virtual command\n",
> +			iommu->name);
> +		return -ENODEV;
> +	}
> +
> +	cap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
> +	if (!(cap & DMA_VCS_PAS)) {
> +		pr_warn("IOMMU: %s: Emulation software doesn't support PASID allocation\n",
> +			iommu->name);
> +		return -ENODEV;
> +	}
> +
> +	return 0;
> +}
> +
> +int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid)
> +{
> +	u64 res;
> +	u8 status_code;
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	ret = check_vcmd_pasid(iommu);

Do you have to check this everytime? every dmar_readq is going to trap
to the other side ...

> +	if (ret)
> +		return ret;
> +
> +	raw_spin_lock_irqsave(&iommu->register_lock, flags);
> +	dmar_writeq(iommu->reg + DMAR_VCMD_REG, VCMD_CMD_ALLOC);
> +	IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
> +		      !(res & VCMD_VRSP_IP), res);
> +	raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
> +
> +	status_code = VCMD_VRSP_SC(res);
> +	switch (status_code) {
> +	case VCMD_VRSP_SC_SUCCESS:
> +		*pasid = VCMD_VRSP_RESULT(res);
> +		break;
> +	case VCMD_VRSP_SC_NO_PASID_AVAIL:
> +		pr_info("IOMMU: %s: No PASID available\n", iommu->name);
> +		ret = -ENOMEM;
> +		break;
> +	default:
> +		ret = -ENODEV;
> +		pr_warn("IOMMU: %s: Unexpected error code %d\n",
> +			iommu->name, status_code);
> +	}
> +
> +	return ret;
> +}
> +
> +void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int pasid)
> +{
> +	u64 res;
> +	u8 status_code;
> +	unsigned long flags;
> +
> +	if (check_vcmd_pasid(iommu))
> +		return;
> +
> +	raw_spin_lock_irqsave(&iommu->register_lock, flags);
> +	dmar_writeq(iommu->reg + DMAR_VCMD_REG, (pasid << 8) | VCMD_CMD_FREE);
> +	IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
> +		      !(res & VCMD_VRSP_IP), res);
> +	raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
> +
> +	status_code = VCMD_VRSP_SC(res);
> +	switch (status_code) {
> +	case VCMD_VRSP_SC_SUCCESS:
> +		break;
> +	case VCMD_VRSP_SC_INVALID_PASID:
> +		pr_info("IOMMU: %s: Invalid PASID\n", iommu->name);
> +		break;
> +	default:
> +		pr_warn("IOMMU: %s: Unexpected error code %d\n",
> +			iommu->name, status_code);
> +	}
> +}
> +
>  /*
>   * Per device pasid table management:
>   */
> diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
> index fc8cd8f17de1..e413e884e685 100644
> --- a/drivers/iommu/intel-pasid.h
> +++ b/drivers/iommu/intel-pasid.h
> @@ -23,6 +23,16 @@
>  #define is_pasid_enabled(entry)		(((entry)->lo >> 3) & 0x1)
>  #define get_pasid_dir_size(entry)	(1 << ((((entry)->lo >> 9) & 0x7) + 7))
>  
> +/* Virtual command interface for enlightened pasid management. */
> +#define VCMD_CMD_ALLOC			0x1
> +#define VCMD_CMD_FREE			0x2
> +#define VCMD_VRSP_IP			0x1
> +#define VCMD_VRSP_SC(e)			(((e) >> 1) & 0x3)
> +#define VCMD_VRSP_SC_SUCCESS		0
> +#define VCMD_VRSP_SC_NO_PASID_AVAIL	1
> +#define VCMD_VRSP_SC_INVALID_PASID	1
> +#define VCMD_VRSP_RESULT(e)		(((e) >> 8) & 0xfffff)
> +
>  /*
>   * Domain ID reserved for pasid entries programmed for first-level
>   * only and pass-through transfer modes.
> @@ -95,5 +105,6 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
>  				   struct device *dev, int pasid);
>  void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
>  				 struct device *dev, int pasid);
> -
> +int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid);
> +void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int pasid);
>  #endif /* __INTEL_PASID_H */
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index ed11ef594378..eea7468694a7 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -161,6 +161,7 @@
>  #define ecap_smpwc(e)		(((e) >> 48) & 0x1)
>  #define ecap_flts(e)		(((e) >> 47) & 0x1)
>  #define ecap_slts(e)		(((e) >> 46) & 0x1)
> +#define ecap_vcs(e)		(((e) >> 44) & 0x1)
>  #define ecap_smts(e)		(((e) >> 43) & 0x1)
>  #define ecap_dit(e)		((e >> 41) & 0x1)
>  #define ecap_pasid(e)		((e >> 40) & 0x1)
> @@ -279,6 +280,7 @@
>  
>  /* PRS_REG */
>  #define DMA_PRS_PPR	((u32)1)
> +#define DMA_VCS_PAS	((u64)1)
>  
>  #define IOMMU_WAIT_OP(iommu, offset, op, cond, sts)			\
>  do {									\
> -- 
> 2.7.4
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID
  2019-10-22 23:53 ` [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID Jacob Pan
@ 2019-10-23  0:51   ` Raj, Ashok
  2019-10-23  2:21     ` Lu Baolu
  2019-10-23  4:04     ` Jacob Pan
  0 siblings, 2 replies; 27+ messages in thread
From: Raj, Ashok @ 2019-10-23  0:51 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Tian, Kevin, Ashok Raj, David Woodhouse, iommu, LKML,
	Alex Williamson, Jean-Philippe Brucker, Jonathan Cameron

On Tue, Oct 22, 2019 at 04:53:15PM -0700, Jacob Pan wrote:
> When VT-d driver runs in the guest, PASID allocation must be
> performed via virtual command interface. This patch registers a
> custom IOASID allocator which takes precedence over the default
> XArray based allocator. The resulting IOASID allocation will always
> come from the host. This ensures that PASID namespace is system-
> wide.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  drivers/iommu/Kconfig       |  1 +
>  drivers/iommu/intel-iommu.c | 67 +++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/intel-iommu.h |  2 ++
>  3 files changed, 70 insertions(+)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index fd50ddffffbf..961fe5795a90 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -211,6 +211,7 @@ config INTEL_IOMMU_SVM
>  	bool "Support for Shared Virtual Memory with Intel IOMMU"
>  	depends on INTEL_IOMMU && X86
>  	select PCI_PASID
> +	select IOASID
>  	select MMU_NOTIFIER
>  	help
>  	  Shared Virtual Memory (SVM) provides a facility for devices
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 3f974919d3bd..3aff0141c522 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1706,6 +1706,8 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
>  		if (ecap_prs(iommu->ecap))
>  			intel_svm_finish_prq(iommu);
>  	}
> +	ioasid_unregister_allocator(&iommu->pasid_allocator);
> +
>  #endif
>  }
>  
> @@ -4910,6 +4912,46 @@ static int __init probe_acpi_namespace_devices(void)
>  	return 0;
>  }
>  
> +#ifdef CONFIG_INTEL_IOMMU_SVM

Maybe move them to intel-svm.c instead? that's where the bulk
of the svm support is?

> +static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
> +{
> +	struct intel_iommu *iommu = data;
> +	ioasid_t ioasid;
> +
> +	/*
> +	 * VT-d virtual command interface always uses the full 20 bit
> +	 * PASID range. Host can partition guest PASID range based on
> +	 * policies but it is out of guest's control.
> +	 */
> +	if (min < PASID_MIN || max > PASID_MAX)
> +		return INVALID_IOASID;

What are these PASID_MIN/MAX? Do you check if these are within the 
limits supported by the iommu/vIOMMU as its enumerated?


> +
> +	if (vcmd_alloc_pasid(iommu, &ioasid))
> +		return INVALID_IOASID;
> +
> +	return ioasid;
> +}
> +
> +static void intel_ioasid_free(ioasid_t ioasid, void *data)
> +{
> +	struct iommu_pasid_alloc_info *svm;
> +	struct intel_iommu *iommu = data;
> +
> +	if (!iommu)
> +		return;
> +	/*
> +	 * Sanity check the ioasid owner is done at upper layer, e.g. VFIO
> +	 * We can only free the PASID when all the devices are unbond.
> +	 */
> +	svm = ioasid_find(NULL, ioasid, NULL);
> +	if (!svm) {
> +		pr_warn("Freeing unbond IOASID %d\n", ioasid);
> +		return;
> +	}
> +	vcmd_free_pasid(iommu, ioasid);
> +}
> +#endif
> +
>  int __init intel_iommu_init(void)
>  {
>  	int ret = -ENODEV;
> @@ -5020,6 +5062,31 @@ int __init intel_iommu_init(void)
>  				       "%s", iommu->name);
>  		iommu_device_set_ops(&iommu->iommu, &intel_iommu_ops);
>  		iommu_device_register(&iommu->iommu);
> +#ifdef CONFIG_INTEL_IOMMU_SVM
> +		if (cap_caching_mode(iommu->cap) && sm_supported(iommu)) {

do you need to check against cap_caching_mode() or ecap_vcmd?


> +			/*
> +			 * Register a custom ASID allocator if we are running
> +			 * in a guest, the purpose is to have a system wide PASID
> +			 * namespace among all PASID users.
> +			 * There can be multiple vIOMMUs in each guest but only
> +			 * one allocator is active. All vIOMMU allocators will
> +			 * eventually be calling the same host allocator.
> +			 */
> +			iommu->pasid_allocator.alloc = intel_ioasid_alloc;
> +			iommu->pasid_allocator.free = intel_ioasid_free;
> +			iommu->pasid_allocator.pdata = (void *)iommu;
> +			ret = ioasid_register_allocator(&iommu->pasid_allocator);
> +			if (ret) {
> +				pr_warn("Custom PASID allocator registeration failed\n");
> +				/*
> +				 * Disable scalable mode on this IOMMU if there
> +				 * is no custom allocator. Mixing SM capable vIOMMU
> +				 * and non-SM vIOMMU are not supported.
> +				 */
> +				intel_iommu_sm = 0;
> +			}
> +		}
> +#endif
>  	}
>  
>  	bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index eea7468694a7..fb1973a761c1 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -19,6 +19,7 @@
>  #include <linux/iommu.h>
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/dmar.h>
> +#include <linux/ioasid.h>
>  
>  #include <asm/cacheflush.h>
>  #include <asm/iommu.h>
> @@ -542,6 +543,7 @@ struct intel_iommu {
>  #ifdef CONFIG_INTEL_IOMMU_SVM
>  	struct page_req_dsc *prq;
>  	unsigned char prq_name[16];    /* Name for PRQ interrupt */
> +	struct ioasid_allocator_ops pasid_allocator; /* Custom allocator for PASIDs */
>  #endif
>  	struct q_inval  *qi;            /* Queued invalidation info */
>  	u32 *iommu_state; /* Store iommu states between suspend and resume.*/
> -- 
> 2.7.4
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 03/10] iommu/vt-d: Replace Intel specific PASID allocator with IOASID
  2019-10-22 23:53 ` [PATCH v6 03/10] iommu/vt-d: Replace Intel specific PASID allocator with IOASID Jacob Pan
@ 2019-10-23  0:58   ` Raj, Ashok
  2019-10-23  4:19     ` Jacob Pan
  0 siblings, 1 reply; 27+ messages in thread
From: Raj, Ashok @ 2019-10-23  0:58 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Tian, Kevin, Ashok Raj, David Woodhouse, iommu, LKML,
	Alex Williamson, Jean-Philippe Brucker, Jonathan Cameron

On Tue, Oct 22, 2019 at 04:53:16PM -0700, Jacob Pan wrote:
> Make use of generic IOASID code to manage PASID allocation,
> free, and lookup. Replace Intel specific code.
> 
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  drivers/iommu/intel-iommu.c | 12 ++++++------
>  drivers/iommu/intel-pasid.c | 36 ------------------------------------
>  drivers/iommu/intel-svm.c   | 37 +++++++++++++++++++++----------------
>  3 files changed, 27 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 3aff0141c522..72febcf2c48f 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5311,7 +5311,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
>  	domain->auxd_refcnt--;
>  
>  	if (!domain->auxd_refcnt && domain->default_pasid > 0)
> -		intel_pasid_free_id(domain->default_pasid);
> +		ioasid_free(domain->default_pasid);

if the domain is gauranteed to be torn down, its ok.. but otherwise
do you need to clear domain->default_pasid to avoid accidental reference
in some other code path?

>  }
>  
>  static int aux_domain_add_dev(struct dmar_domain *domain,
> @@ -5329,10 +5329,10 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
>  	if (domain->default_pasid <= 0) {
>  		int pasid;
>  
> -		pasid = intel_pasid_alloc_id(domain, PASID_MIN,
> -					     pci_max_pasids(to_pci_dev(dev)),
> -					     GFP_KERNEL);
> -		if (pasid <= 0) {
> +		/* No private data needed for the default pasid */
> +		pasid = ioasid_alloc(NULL, PASID_MIN, pci_max_pasids(to_pci_dev(dev)) - 1,

If we ensure IOMMU max-pasid is full 20bit width, its good. In a vIOMMU
if iommu max pasid is restricted for some kind of partitioning in future
you want to consider limiting to no more than what's provisioned in the vIOMMU 
right?


> +				NULL);
> +		if (pasid == INVALID_IOASID) {
>  			pr_err("Can't allocate default pasid\n");
>  			return -ENODEV;
>  		}
> @@ -5368,7 +5368,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
>  	spin_unlock(&iommu->lock);
>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>  	if (!domain->auxd_refcnt && domain->default_pasid > 0)
> -		intel_pasid_free_id(domain->default_pasid);
> +		ioasid_free(domain->default_pasid);
>  
>  	return ret;
>  }
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 76bcbb21e112..c0d1f28d3412 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -26,42 +26,6 @@
>   */
>  static DEFINE_SPINLOCK(pasid_lock);
>  u32 intel_pasid_max_id = PASID_MAX;
> -static DEFINE_IDR(pasid_idr);
> -
> -int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp)
> -{
> -	int ret, min, max;
> -
> -	min = max_t(int, start, PASID_MIN);
> -	max = min_t(int, end, intel_pasid_max_id);
> -
> -	WARN_ON(in_interrupt());
> -	idr_preload(gfp);
> -	spin_lock(&pasid_lock);
> -	ret = idr_alloc(&pasid_idr, ptr, min, max, GFP_ATOMIC);
> -	spin_unlock(&pasid_lock);
> -	idr_preload_end();
> -
> -	return ret;
> -}
> -
> -void intel_pasid_free_id(int pasid)
> -{
> -	spin_lock(&pasid_lock);
> -	idr_remove(&pasid_idr, pasid);
> -	spin_unlock(&pasid_lock);
> -}
> -
> -void *intel_pasid_lookup_id(int pasid)
> -{
> -	void *p;
> -
> -	spin_lock(&pasid_lock);
> -	p = idr_find(&pasid_idr, pasid);
> -	spin_unlock(&pasid_lock);
> -
> -	return p;
> -}
>  
>  static int check_vcmd_pasid(struct intel_iommu *iommu)
>  {
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 9b159132405d..5aef5b7bf561 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -17,6 +17,7 @@
>  #include <linux/dmar.h>
>  #include <linux/interrupt.h>
>  #include <linux/mm_types.h>
> +#include <linux/ioasid.h>
>  #include <asm/page.h>
>  
>  #include "intel-pasid.h"
> @@ -318,16 +319,15 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (pasid_max > intel_pasid_max_id)
>  			pasid_max = intel_pasid_max_id;
>  
> -		/* Do not use PASID 0 in caching mode (virtualised IOMMU) */
> -		ret = intel_pasid_alloc_id(svm,
> -					   !!cap_caching_mode(iommu->cap),
> -					   pasid_max - 1, GFP_KERNEL);
> -		if (ret < 0) {
> +		/* Do not use PASID 0, reserved for RID to PASID */
> +		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
> +					pasid_max - 1, svm);
> +		if (svm->pasid == INVALID_IOASID) {
>  			kfree(svm);
>  			kfree(sdev);
> +			ret = ENOSPC;
>  			goto out;
>  		}
> -		svm->pasid = ret;
>  		svm->notifier.ops = &intel_mmuops;
>  		svm->mm = mm;
>  		svm->flags = flags;
> @@ -337,7 +337,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (mm) {
>  			ret = mmu_notifier_register(&svm->notifier, mm);
>  			if (ret) {
> -				intel_pasid_free_id(svm->pasid);
> +				ioasid_free(svm->pasid);
>  				kfree(svm);
>  				kfree(sdev);
>  				goto out;
> @@ -353,7 +353,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (ret) {
>  			if (mm)
>  				mmu_notifier_unregister(&svm->notifier, mm);
> -			intel_pasid_free_id(svm->pasid);
> +			ioasid_free(svm->pasid);
>  			kfree(svm);
>  			kfree(sdev);
>  			goto out;
> @@ -401,7 +401,12 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
>  	if (!iommu)
>  		goto out;
>  
> -	svm = intel_pasid_lookup_id(pasid);
> +	svm = ioasid_find(NULL, pasid, NULL);
> +	if (IS_ERR(svm)) {
> +		ret = PTR_ERR(svm);
> +		goto out;
> +	}
> +
>  	if (!svm)
>  		goto out;
>  
> @@ -423,7 +428,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
>  				kfree_rcu(sdev, rcu);
>  
>  				if (list_empty(&svm->devs)) {
> -					intel_pasid_free_id(svm->pasid);
> +					ioasid_free(svm->pasid);
>  					if (svm->mm)
>  						mmu_notifier_unregister(&svm->notifier, svm->mm);
>  
> @@ -458,10 +463,11 @@ int intel_svm_is_pasid_valid(struct device *dev, int pasid)
>  	if (!iommu)
>  		goto out;
>  
> -	svm = intel_pasid_lookup_id(pasid);
> -	if (!svm)
> +	svm = ioasid_find(NULL, pasid, NULL);
> +	if (IS_ERR(svm)) {
> +		ret = PTR_ERR(svm);
>  		goto out;
> -
> +	}
>  	/* init_mm is used in this case */
>  	if (!svm->mm)
>  		ret = 1;
> @@ -568,13 +574,12 @@ static irqreturn_t prq_event_thread(int irq, void *d)
>  
>  		if (!svm || svm->pasid != req->pasid) {
>  			rcu_read_lock();
> -			svm = intel_pasid_lookup_id(req->pasid);
> +			svm = ioasid_find(NULL, req->pasid, NULL);
>  			/* It *can't* go away, because the driver is not permitted
>  			 * to unbind the mm while any page faults are outstanding.
>  			 * So we only need RCU to protect the internal idr code. */
>  			rcu_read_unlock();
> -
> -			if (!svm) {
> +			if (IS_ERR(svm) || !svm) {
>  				pr_err("%s: Page request for invalid PASID %d: %08llx %08llx\n",
>  				       iommu->name, req->pasid, ((unsigned long long *)req)[0],
>  				       ((unsigned long long *)req)[1]);
> -- 
> 2.7.4
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation
  2019-10-23  0:45   ` Raj, Ashok
@ 2019-10-23  1:53     ` Lu Baolu
  2019-10-23 17:55       ` Jacob Pan
  0 siblings, 1 reply; 27+ messages in thread
From: Lu Baolu @ 2019-10-23  1:53 UTC (permalink / raw)
  To: Raj, Ashok, Jacob Pan
  Cc: Tian, Kevin, Jean-Philippe Brucker, iommu, LKML, Alex Williamson,
	David Woodhouse, Jonathan Cameron

Hi Ashok,

Thanks for reviewing the patch.

On 10/23/19 8:45 AM, Raj, Ashok wrote:
> On Tue, Oct 22, 2019 at 04:53:14PM -0700, Jacob Pan wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>>
>> If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
>> IOMMU driver should rely on the emulation software to allocate
>> and free PASID IDs. The Intel vt-d spec revision 3.0 defines a
>> register set to support this. This includes a capability register,
>> a virtual command register and a virtual response register. Refer
>> to section 10.4.42, 10.4.43, 10.4.44 for more information.
> 
> The above paragraph seems a bit confusing, there is no reference
> to caching mode for for VCMD... some suggestion below.
> 
> Enabling IOMMU in a guest requires communication with the host
> driver for certain aspects. Use of PASID ID to enable Shared Virtual
> Addressing (SVA) requires managing PASID's in the host. VT-d 3.0 spec
> provides a Virtual Command Register (VCMD) to facilitate this.
> Writes to this register in the guest are trapped by Qemu and
> proxies the call to the host driver....

Yours is better. Will use it.

> 
> 
>>
>> This patch adds the enlightened PASID allocation/free interfaces
>> via the virtual command register.
>>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>   drivers/iommu/intel-pasid.c | 83 +++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/iommu/intel-pasid.h | 13 ++++++-
>>   include/linux/intel-iommu.h |  2 ++
>>   3 files changed, 97 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
>> index 040a445be300..76bcbb21e112 100644
>> --- a/drivers/iommu/intel-pasid.c
>> +++ b/drivers/iommu/intel-pasid.c
>> @@ -63,6 +63,89 @@ void *intel_pasid_lookup_id(int pasid)
>>   	return p;
>>   }
>>   
>> +static int check_vcmd_pasid(struct intel_iommu *iommu)
>> +{
>> +	u64 cap;
>> +
>> +	if (!ecap_vcs(iommu->ecap)) {
>> +		pr_warn("IOMMU: %s: Hardware doesn't support virtual command\n",
>> +			iommu->name);
>> +		return -ENODEV;
>> +	}
>> +
>> +	cap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
>> +	if (!(cap & DMA_VCS_PAS)) {
>> +		pr_warn("IOMMU: %s: Emulation software doesn't support PASID allocation\n",
>> +			iommu->name);
>> +		return -ENODEV;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid)
>> +{
>> +	u64 res;
>> +	u8 status_code;
>> +	unsigned long flags;
>> +	int ret = 0;
>> +
>> +	ret = check_vcmd_pasid(iommu);
> 
> Do you have to check this everytime? every dmar_readq is going to trap
> to the other side ...

Yes. We don't need to check it every time. Check once and save the
result during boot is enough.

How about below incremental change?

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index ff7e877b7a4d..c15d9d7e1e73 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -29,22 +29,29 @@ u32 intel_pasid_max_id = PASID_MAX;

  static int check_vcmd_pasid(struct intel_iommu *iommu)
  {
+       static int vcmd_supported = -EINVAL;
         u64 cap;

+       if (vcmd_supported != -EINVAL)
+               return vcmd_supported;
+
         if (!ecap_vcs(iommu->ecap)) {
                 pr_warn("IOMMU: %s: Hardware doesn't support virtual 
command\n",
                         iommu->name);
-               return -ENODEV;
+               vcmd_supported = -ENODEV;
+               return vcmd_supported;
         }

         cap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
         if (!(cap & DMA_VCS_PAS)) {
                 pr_warn("IOMMU: %s: Emulation software doesn't support 
PASID allocation\n",
                         iommu->name);
-               return -ENODEV;
+               vcmd_supported = -ENODEV;
+               return vcmd_supported;
         }

-       return 0;
+       vcmd_supported = 0;
+       return vcmd_supported;
  }

  int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid)

Best regards,
baolu

> 
>> +	if (ret)
>> +		return ret;
>> +
>> +	raw_spin_lock_irqsave(&iommu->register_lock, flags);
>> +	dmar_writeq(iommu->reg + DMAR_VCMD_REG, VCMD_CMD_ALLOC);
>> +	IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
>> +		      !(res & VCMD_VRSP_IP), res);
>> +	raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
>> +
>> +	status_code = VCMD_VRSP_SC(res);
>> +	switch (status_code) {
>> +	case VCMD_VRSP_SC_SUCCESS:
>> +		*pasid = VCMD_VRSP_RESULT(res);
>> +		break;
>> +	case VCMD_VRSP_SC_NO_PASID_AVAIL:
>> +		pr_info("IOMMU: %s: No PASID available\n", iommu->name);
>> +		ret = -ENOMEM;
>> +		break;
>> +	default:
>> +		ret = -ENODEV;
>> +		pr_warn("IOMMU: %s: Unexpected error code %d\n",
>> +			iommu->name, status_code);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int pasid)
>> +{
>> +	u64 res;
>> +	u8 status_code;
>> +	unsigned long flags;
>> +
>> +	if (check_vcmd_pasid(iommu))
>> +		return;
>> +
>> +	raw_spin_lock_irqsave(&iommu->register_lock, flags);
>> +	dmar_writeq(iommu->reg + DMAR_VCMD_REG, (pasid << 8) | VCMD_CMD_FREE);
>> +	IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
>> +		      !(res & VCMD_VRSP_IP), res);
>> +	raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
>> +
>> +	status_code = VCMD_VRSP_SC(res);
>> +	switch (status_code) {
>> +	case VCMD_VRSP_SC_SUCCESS:
>> +		break;
>> +	case VCMD_VRSP_SC_INVALID_PASID:
>> +		pr_info("IOMMU: %s: Invalid PASID\n", iommu->name);
>> +		break;
>> +	default:
>> +		pr_warn("IOMMU: %s: Unexpected error code %d\n",
>> +			iommu->name, status_code);
>> +	}
>> +}
>> +
>>   /*
>>    * Per device pasid table management:
>>    */
>> diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
>> index fc8cd8f17de1..e413e884e685 100644
>> --- a/drivers/iommu/intel-pasid.h
>> +++ b/drivers/iommu/intel-pasid.h
>> @@ -23,6 +23,16 @@
>>   #define is_pasid_enabled(entry)		(((entry)->lo >> 3) & 0x1)
>>   #define get_pasid_dir_size(entry)	(1 << ((((entry)->lo >> 9) & 0x7) + 7))
>>   
>> +/* Virtual command interface for enlightened pasid management. */
>> +#define VCMD_CMD_ALLOC			0x1
>> +#define VCMD_CMD_FREE			0x2
>> +#define VCMD_VRSP_IP			0x1
>> +#define VCMD_VRSP_SC(e)			(((e) >> 1) & 0x3)
>> +#define VCMD_VRSP_SC_SUCCESS		0
>> +#define VCMD_VRSP_SC_NO_PASID_AVAIL	1
>> +#define VCMD_VRSP_SC_INVALID_PASID	1
>> +#define VCMD_VRSP_RESULT(e)		(((e) >> 8) & 0xfffff)
>> +
>>   /*
>>    * Domain ID reserved for pasid entries programmed for first-level
>>    * only and pass-through transfer modes.
>> @@ -95,5 +105,6 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
>>   				   struct device *dev, int pasid);
>>   void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
>>   				 struct device *dev, int pasid);
>> -
>> +int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid);
>> +void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int pasid);
>>   #endif /* __INTEL_PASID_H */
>> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
>> index ed11ef594378..eea7468694a7 100644
>> --- a/include/linux/intel-iommu.h
>> +++ b/include/linux/intel-iommu.h
>> @@ -161,6 +161,7 @@
>>   #define ecap_smpwc(e)		(((e) >> 48) & 0x1)
>>   #define ecap_flts(e)		(((e) >> 47) & 0x1)
>>   #define ecap_slts(e)		(((e) >> 46) & 0x1)
>> +#define ecap_vcs(e)		(((e) >> 44) & 0x1)
>>   #define ecap_smts(e)		(((e) >> 43) & 0x1)
>>   #define ecap_dit(e)		((e >> 41) & 0x1)
>>   #define ecap_pasid(e)		((e >> 40) & 0x1)
>> @@ -279,6 +280,7 @@
>>   
>>   /* PRS_REG */
>>   #define DMA_PRS_PPR	((u32)1)
>> +#define DMA_VCS_PAS	((u64)1)
>>   
>>   #define IOMMU_WAIT_OP(iommu, offset, op, cond, sts)			\
>>   do {									\
>> -- 
>> 2.7.4
>>
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID
  2019-10-23  0:51   ` Raj, Ashok
@ 2019-10-23  2:21     ` Lu Baolu
  2019-10-23 23:01       ` Jacob Pan
  2019-10-23  4:04     ` Jacob Pan
  1 sibling, 1 reply; 27+ messages in thread
From: Lu Baolu @ 2019-10-23  2:21 UTC (permalink / raw)
  To: Raj, Ashok, Jacob Pan
  Cc: Tian, Kevin, Jean-Philippe Brucker, iommu, LKML, Alex Williamson,
	David Woodhouse, Jonathan Cameron

Hi,

On 10/23/19 8:51 AM, Raj, Ashok wrote:
> On Tue, Oct 22, 2019 at 04:53:15PM -0700, Jacob Pan wrote:
>> When VT-d driver runs in the guest, PASID allocation must be
>> performed via virtual command interface. This patch registers a
>> custom IOASID allocator which takes precedence over the default
>> XArray based allocator. The resulting IOASID allocation will always
>> come from the host. This ensures that PASID namespace is system-
>> wide.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> ---
>>   drivers/iommu/Kconfig       |  1 +
>>   drivers/iommu/intel-iommu.c | 67 +++++++++++++++++++++++++++++++++++++++++++++
>>   include/linux/intel-iommu.h |  2 ++
>>   3 files changed, 70 insertions(+)
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index fd50ddffffbf..961fe5795a90 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -211,6 +211,7 @@ config INTEL_IOMMU_SVM
>>   	bool "Support for Shared Virtual Memory with Intel IOMMU"
>>   	depends on INTEL_IOMMU && X86
>>   	select PCI_PASID
>> +	select IOASID
>>   	select MMU_NOTIFIER
>>   	help
>>   	  Shared Virtual Memory (SVM) provides a facility for devices
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 3f974919d3bd..3aff0141c522 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -1706,6 +1706,8 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
>>   		if (ecap_prs(iommu->ecap))
>>   			intel_svm_finish_prq(iommu);
>>   	}
>> +	ioasid_unregister_allocator(&iommu->pasid_allocator);
>> +
>>   #endif
>>   }
>>   
>> @@ -4910,6 +4912,46 @@ static int __init probe_acpi_namespace_devices(void)
>>   	return 0;
>>   }
>>   
>> +#ifdef CONFIG_INTEL_IOMMU_SVM
> 
> Maybe move them to intel-svm.c instead? that's where the bulk
> of the svm support is?

I think this is a generic PASID allocator for guest IOMMU although vSVA
is currently the only consumer. Instead of making it SVM specific, I'd
like to suggest moving it to intel-pasid.c and replace the @svm
parameter with a void * one in intel_ioasid_free().

> 
>> +static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
>> +{
>> +	struct intel_iommu *iommu = data;
>> +	ioasid_t ioasid;
>> +
>> +	/*
>> +	 * VT-d virtual command interface always uses the full 20 bit
>> +	 * PASID range. Host can partition guest PASID range based on
>> +	 * policies but it is out of guest's control.
>> +	 */
>> +	if (min < PASID_MIN || max > PASID_MAX)
>> +		return INVALID_IOASID;
> 
> What are these PASID_MIN/MAX? Do you check if these are within the
> limits supported by the iommu/vIOMMU as its enumerated?
> 

Is it an invalid request when @max is greater than hardware capability?

Say, the consumer is asking for allocate a PASID within [0, 2^20], while
the PASID pool of the allocator is just, say, [4, 2^19] with others
reserved for other special usage or due to iommu capability. Instead
of returning error, why not just allocating a PASID within [2, 2^19]?

In another word, final allocation range should be Range[allocator 
supported] & Range[customer specified]. Please correct me if I missed
anything.

> 
>> +
>> +	if (vcmd_alloc_pasid(iommu, &ioasid))
>> +		return INVALID_IOASID;
>> +
>> +	return ioasid;
>> +}
>> +
>> +static void intel_ioasid_free(ioasid_t ioasid, void *data)
>> +{
>> +	struct iommu_pasid_alloc_info *svm;
>> +	struct intel_iommu *iommu = data;
>> +
>> +	if (!iommu)
>> +		return;
>> +	/*
>> +	 * Sanity check the ioasid owner is done at upper layer, e.g. VFIO
>> +	 * We can only free the PASID when all the devices are unbond.
>> +	 */
>> +	svm = ioasid_find(NULL, ioasid, NULL);
>> +	if (!svm) {
>> +		pr_warn("Freeing unbond IOASID %d\n", ioasid);
>> +		return;
>> +	}
>> +	vcmd_free_pasid(iommu, ioasid);
>> +}
>> +#endif
>> +
>>   int __init intel_iommu_init(void)
>>   {
>>   	int ret = -ENODEV;
>> @@ -5020,6 +5062,31 @@ int __init intel_iommu_init(void)
>>   				       "%s", iommu->name);
>>   		iommu_device_set_ops(&iommu->iommu, &intel_iommu_ops);
>>   		iommu_device_register(&iommu->iommu);
>> +#ifdef CONFIG_INTEL_IOMMU_SVM
>> +		if (cap_caching_mode(iommu->cap) && sm_supported(iommu)) {
> 
> do you need to check against cap_caching_mode() or ecap_vcmd?
> 
> 
>> +			/*
>> +			 * Register a custom ASID allocator if we are running
>> +			 * in a guest, the purpose is to have a system wide PASID
>> +			 * namespace among all PASID users.
>> +			 * There can be multiple vIOMMUs in each guest but only
>> +			 * one allocator is active. All vIOMMU allocators will
>> +			 * eventually be calling the same host allocator.
>> +			 */
>> +			iommu->pasid_allocator.alloc = intel_ioasid_alloc;
>> +			iommu->pasid_allocator.free = intel_ioasid_free;
>> +			iommu->pasid_allocator.pdata = (void *)iommu;
>> +			ret = ioasid_register_allocator(&iommu->pasid_allocator);
>> +			if (ret) {
>> +				pr_warn("Custom PASID allocator registeration failed\n");
>> +				/*
>> +				 * Disable scalable mode on this IOMMU if there
>> +				 * is no custom allocator. Mixing SM capable vIOMMU
>> +				 * and non-SM vIOMMU are not supported.
>> +				 */
>> +				intel_iommu_sm = 0;
>> +			}
>> +		}
>> +#endif

I think this should also be moved to intel-pasid.c and made it svm
independent as a generic pasid allocator for IOMMU running in a vm guest
or user level application.

Best regards,
baolu

>>   	}
>>   
>>   	bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
>> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
>> index eea7468694a7..fb1973a761c1 100644
>> --- a/include/linux/intel-iommu.h
>> +++ b/include/linux/intel-iommu.h
>> @@ -19,6 +19,7 @@
>>   #include <linux/iommu.h>
>>   #include <linux/io-64-nonatomic-lo-hi.h>
>>   #include <linux/dmar.h>
>> +#include <linux/ioasid.h>
>>   
>>   #include <asm/cacheflush.h>
>>   #include <asm/iommu.h>
>> @@ -542,6 +543,7 @@ struct intel_iommu {
>>   #ifdef CONFIG_INTEL_IOMMU_SVM
>>   	struct page_req_dsc *prq;
>>   	unsigned char prq_name[16];    /* Name for PRQ interrupt */
>> +	struct ioasid_allocator_ops pasid_allocator; /* Custom allocator for PASIDs */
>>   #endif
>>   	struct q_inval  *qi;            /* Queued invalidation info */
>>   	u32 *iommu_state; /* Store iommu states between suspend and resume.*/
>> -- 
>> 2.7.4
>>
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID
  2019-10-23  0:51   ` Raj, Ashok
  2019-10-23  2:21     ` Lu Baolu
@ 2019-10-23  4:04     ` Jacob Pan
  2019-10-23 23:04       ` Jacob Pan
  1 sibling, 1 reply; 27+ messages in thread
From: Jacob Pan @ 2019-10-23  4:04 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Tian, Kevin, David Woodhouse, iommu, LKML, Alex Williamson,
	Jean-Philippe Brucker, Jonathan Cameron

On Tue, 22 Oct 2019 17:51:29 -0700
"Raj, Ashok" <ashok.raj@intel.com> wrote:

> On Tue, Oct 22, 2019 at 04:53:15PM -0700, Jacob Pan wrote:
> > When VT-d driver runs in the guest, PASID allocation must be
> > performed via virtual command interface. This patch registers a
> > custom IOASID allocator which takes precedence over the default
> > XArray based allocator. The resulting IOASID allocation will always
> > come from the host. This ensures that PASID namespace is system-
> > wide.
> > 
> > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> >  drivers/iommu/Kconfig       |  1 +
> >  drivers/iommu/intel-iommu.c | 67
> > +++++++++++++++++++++++++++++++++++++++++++++
> > include/linux/intel-iommu.h |  2 ++ 3 files changed, 70
> > insertions(+)
> > 
> > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > index fd50ddffffbf..961fe5795a90 100644
> > --- a/drivers/iommu/Kconfig
> > +++ b/drivers/iommu/Kconfig
> > @@ -211,6 +211,7 @@ config INTEL_IOMMU_SVM
> >  	bool "Support for Shared Virtual Memory with Intel IOMMU"
> >  	depends on INTEL_IOMMU && X86
> >  	select PCI_PASID
> > +	select IOASID
> >  	select MMU_NOTIFIER
> >  	help
> >  	  Shared Virtual Memory (SVM) provides a facility for
> > devices diff --git a/drivers/iommu/intel-iommu.c
> > b/drivers/iommu/intel-iommu.c index 3f974919d3bd..3aff0141c522
> > 100644 --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -1706,6 +1706,8 @@ static void free_dmar_iommu(struct
> > intel_iommu *iommu) if (ecap_prs(iommu->ecap))
> >  			intel_svm_finish_prq(iommu);
> >  	}
> > +	ioasid_unregister_allocator(&iommu->pasid_allocator);
> > +
> >  #endif
> >  }
> >  
> > @@ -4910,6 +4912,46 @@ static int __init
> > probe_acpi_namespace_devices(void) return 0;
> >  }
> >  
> > +#ifdef CONFIG_INTEL_IOMMU_SVM  
> 
> Maybe move them to intel-svm.c instead? that's where the bulk
> of the svm support is?
> 
The reason I put them in intel-iommu.c is that pasid allocators need to
be registered during initialization of Intel iommu. No strong
preference.

> > +static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max,
> > void *data) +{
> > +	struct intel_iommu *iommu = data;
> > +	ioasid_t ioasid;
> > +
> > +	/*
> > +	 * VT-d virtual command interface always uses the full 20
> > bit
> > +	 * PASID range. Host can partition guest PASID range based
> > on
> > +	 * policies but it is out of guest's control.
> > +	 */
> > +	if (min < PASID_MIN || max > PASID_MAX)
> > +		return INVALID_IOASID;  
> 
> What are these PASID_MIN/MAX? Do you check if these are within the 
> limits supported by the iommu/vIOMMU as its enumerated?
> 
PASID_MIN/MAX is the full range, 1-2M.
I do not check range because VCMD interface will always use the
full range. vIOMMU will always support full 20 bit guest PASID.
The host VFIO code calls IOASID allocator which should respect IOMMU
enumerated range.
> 
> > +
> > +	if (vcmd_alloc_pasid(iommu, &ioasid))
> > +		return INVALID_IOASID;
> > +
> > +	return ioasid;
> > +}
> > +
> > +static void intel_ioasid_free(ioasid_t ioasid, void *data)
> > +{
> > +	struct iommu_pasid_alloc_info *svm;
> > +	struct intel_iommu *iommu = data;
> > +
> > +	if (!iommu)
> > +		return;
> > +	/*
> > +	 * Sanity check the ioasid owner is done at upper layer,
> > e.g. VFIO
> > +	 * We can only free the PASID when all the devices are
> > unbond.
> > +	 */
> > +	svm = ioasid_find(NULL, ioasid, NULL);
> > +	if (!svm) {
> > +		pr_warn("Freeing unbond IOASID %d\n", ioasid);
> > +		return;
> > +	}
> > +	vcmd_free_pasid(iommu, ioasid);
> > +}
> > +#endif
> > +
> >  int __init intel_iommu_init(void)
> >  {
> >  	int ret = -ENODEV;
> > @@ -5020,6 +5062,31 @@ int __init intel_iommu_init(void)
> >  				       "%s", iommu->name);
> >  		iommu_device_set_ops(&iommu->iommu,
> > &intel_iommu_ops); iommu_device_register(&iommu->iommu);
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +		if (cap_caching_mode(iommu->cap) &&
> > sm_supported(iommu)) {  
> 
> do you need to check against cap_caching_mode() or ecap_vcmd?
> 
I guess ecap_vcmd() will suffice. Kind of redundant.
> 
> > +			/*
> > +			 * Register a custom ASID allocator if we
> > are running
> > +			 * in a guest, the purpose is to have a
> > system wide PASID
> > +			 * namespace among all PASID users.
> > +			 * There can be multiple vIOMMUs in each
> > guest but only
> > +			 * one allocator is active. All vIOMMU
> > allocators will
> > +			 * eventually be calling the same host
> > allocator.
> > +			 */
> > +			iommu->pasid_allocator.alloc =
> > intel_ioasid_alloc;
> > +			iommu->pasid_allocator.free =
> > intel_ioasid_free;
> > +			iommu->pasid_allocator.pdata = (void
> > *)iommu;
> > +			ret =
> > ioasid_register_allocator(&iommu->pasid_allocator);
> > +			if (ret) {
> > +				pr_warn("Custom PASID allocator
> > registeration failed\n");
> > +				/*
> > +				 * Disable scalable mode on this
> > IOMMU if there
> > +				 * is no custom allocator. Mixing
> > SM capable vIOMMU
> > +				 * and non-SM vIOMMU are not
> > supported.
> > +				 */
> > +				intel_iommu_sm = 0;
> > +			}
> > +		}
> > +#endif
> >  	}
> >  
> >  	bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
> > diff --git a/include/linux/intel-iommu.h
> > b/include/linux/intel-iommu.h index eea7468694a7..fb1973a761c1
> > 100644 --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -19,6 +19,7 @@
> >  #include <linux/iommu.h>
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >  #include <linux/dmar.h>
> > +#include <linux/ioasid.h>
> >  
> >  #include <asm/cacheflush.h>
> >  #include <asm/iommu.h>
> > @@ -542,6 +543,7 @@ struct intel_iommu {
> >  #ifdef CONFIG_INTEL_IOMMU_SVM
> >  	struct page_req_dsc *prq;
> >  	unsigned char prq_name[16];    /* Name for PRQ interrupt */
> > +	struct ioasid_allocator_ops pasid_allocator; /* Custom
> > allocator for PASIDs */ #endif
> >  	struct q_inval  *qi;            /* Queued invalidation
> > info */ u32 *iommu_state; /* Store iommu states between suspend and
> > resume.*/ -- 
> > 2.7.4
> >   

[Jacob Pan]
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 03/10] iommu/vt-d: Replace Intel specific PASID allocator with IOASID
  2019-10-23  0:58   ` Raj, Ashok
@ 2019-10-23  4:19     ` Jacob Pan
  0 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-23  4:19 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Tian, Kevin, David Woodhouse, iommu, LKML, Alex Williamson,
	Jean-Philippe Brucker, Jonathan Cameron

On Tue, 22 Oct 2019 17:58:59 -0700
"Raj, Ashok" <ashok.raj@intel.com> wrote:

> On Tue, Oct 22, 2019 at 04:53:16PM -0700, Jacob Pan wrote:
> > Make use of generic IOASID code to manage PASID allocation,
> > free, and lookup. Replace Intel specific code.
> > 
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> >  drivers/iommu/intel-iommu.c | 12 ++++++------
> >  drivers/iommu/intel-pasid.c | 36
> > ------------------------------------ drivers/iommu/intel-svm.c   |
> > 37 +++++++++++++++++++++---------------- 3 files changed, 27
> > insertions(+), 58 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c
> > b/drivers/iommu/intel-iommu.c index 3aff0141c522..72febcf2c48f
> > 100644 --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5311,7 +5311,7 @@ static void auxiliary_unlink_device(struct
> > dmar_domain *domain, domain->auxd_refcnt--;
> >  
> >  	if (!domain->auxd_refcnt && domain->default_pasid > 0)
> > -		intel_pasid_free_id(domain->default_pasid);
> > +		ioasid_free(domain->default_pasid);  
> 
> if the domain is gauranteed to be torn down, its ok.. but otherwise
> do you need to clear domain->default_pasid to avoid accidental
> reference in some other code path?
> 
Good point, but i think it is out of the scope of this patch. Perhaps
another patch to fix that.
> >  }
> >  
> >  static int aux_domain_add_dev(struct dmar_domain *domain,
> > @@ -5329,10 +5329,10 @@ static int aux_domain_add_dev(struct
> > dmar_domain *domain, if (domain->default_pasid <= 0) {
> >  		int pasid;
> >  
> > -		pasid = intel_pasid_alloc_id(domain, PASID_MIN,
> > -
> > pci_max_pasids(to_pci_dev(dev)),
> > -					     GFP_KERNEL);
> > -		if (pasid <= 0) {
> > +		/* No private data needed for the default pasid */
> > +		pasid = ioasid_alloc(NULL, PASID_MIN,
> > pci_max_pasids(to_pci_dev(dev)) - 1,  
> 
> If we ensure IOMMU max-pasid is full 20bit width, its good. In a
> vIOMMU if iommu max pasid is restricted for some kind of partitioning
> in future you want to consider limiting to no more than what's
> provisioned in the vIOMMU right?
> 
Yes, I agree we should double check IOMMU ecap PSS.
But it is outside the scope of this patch, which is merely replacing the
current logic with a different API.
Also, unless we do nested mdev, this code should run in host only,
no vIOMMU, right?
> 
> > +				NULL);
> > +		if (pasid == INVALID_IOASID) {
> >  			pr_err("Can't allocate default pasid\n");
> >  			return -ENODEV;
> >  		}
> > @@ -5368,7 +5368,7 @@ static int aux_domain_add_dev(struct
> > dmar_domain *domain, spin_unlock(&iommu->lock);
> >  	spin_unlock_irqrestore(&device_domain_lock, flags);
> >  	if (!domain->auxd_refcnt && domain->default_pasid > 0)
> > -		intel_pasid_free_id(domain->default_pasid);
> > +		ioasid_free(domain->default_pasid);
> >  
> >  	return ret;
> >  }
> > diff --git a/drivers/iommu/intel-pasid.c
> > b/drivers/iommu/intel-pasid.c index 76bcbb21e112..c0d1f28d3412
> > 100644 --- a/drivers/iommu/intel-pasid.c
> > +++ b/drivers/iommu/intel-pasid.c
> > @@ -26,42 +26,6 @@
> >   */
> >  static DEFINE_SPINLOCK(pasid_lock);
> >  u32 intel_pasid_max_id = PASID_MAX;
> > -static DEFINE_IDR(pasid_idr);
> > -
> > -int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp)
> > -{
> > -	int ret, min, max;
> > -
> > -	min = max_t(int, start, PASID_MIN);
> > -	max = min_t(int, end, intel_pasid_max_id);
> > -
> > -	WARN_ON(in_interrupt());
> > -	idr_preload(gfp);
> > -	spin_lock(&pasid_lock);
> > -	ret = idr_alloc(&pasid_idr, ptr, min, max, GFP_ATOMIC);
> > -	spin_unlock(&pasid_lock);
> > -	idr_preload_end();
> > -
> > -	return ret;
> > -}
> > -
> > -void intel_pasid_free_id(int pasid)
> > -{
> > -	spin_lock(&pasid_lock);
> > -	idr_remove(&pasid_idr, pasid);
> > -	spin_unlock(&pasid_lock);
> > -}
> > -
> > -void *intel_pasid_lookup_id(int pasid)
> > -{
> > -	void *p;
> > -
> > -	spin_lock(&pasid_lock);
> > -	p = idr_find(&pasid_idr, pasid);
> > -	spin_unlock(&pasid_lock);
> > -
> > -	return p;
> > -}
> >  
> >  static int check_vcmd_pasid(struct intel_iommu *iommu)
> >  {
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > index 9b159132405d..5aef5b7bf561 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -17,6 +17,7 @@
> >  #include <linux/dmar.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/mm_types.h>
> > +#include <linux/ioasid.h>
> >  #include <asm/page.h>
> >  
> >  #include "intel-pasid.h"
> > @@ -318,16 +319,15 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (pasid_max >
> > intel_pasid_max_id) pasid_max = intel_pasid_max_id;
> >  
> > -		/* Do not use PASID 0 in caching mode (virtualised
> > IOMMU) */
> > -		ret = intel_pasid_alloc_id(svm,
> > -					   !!cap_caching_mode(iommu->cap),
> > -					   pasid_max - 1,
> > GFP_KERNEL);
> > -		if (ret < 0) {
> > +		/* Do not use PASID 0, reserved for RID to PASID */
> > +		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
> > +					pasid_max - 1, svm);
> > +		if (svm->pasid == INVALID_IOASID) {
> >  			kfree(svm);
> >  			kfree(sdev);
> > +			ret = ENOSPC;
> >  			goto out;
> >  		}
> > -		svm->pasid = ret;
> >  		svm->notifier.ops = &intel_mmuops;
> >  		svm->mm = mm;
> >  		svm->flags = flags;
> > @@ -337,7 +337,7 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (mm) {
> >  			ret =
> > mmu_notifier_register(&svm->notifier, mm); if (ret) {
> > -				intel_pasid_free_id(svm->pasid);
> > +				ioasid_free(svm->pasid);
> >  				kfree(svm);
> >  				kfree(sdev);
> >  				goto out;
> > @@ -353,7 +353,7 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (ret) {
> >  			if (mm)
> >  				mmu_notifier_unregister(&svm->notifier,
> > mm);
> > -			intel_pasid_free_id(svm->pasid);
> > +			ioasid_free(svm->pasid);
> >  			kfree(svm);
> >  			kfree(sdev);
> >  			goto out;
> > @@ -401,7 +401,12 @@ int intel_svm_unbind_mm(struct device *dev,
> > int pasid) if (!iommu)
> >  		goto out;
> >  
> > -	svm = intel_pasid_lookup_id(pasid);
> > +	svm = ioasid_find(NULL, pasid, NULL);
> > +	if (IS_ERR(svm)) {
> > +		ret = PTR_ERR(svm);
> > +		goto out;
> > +	}
> > +
> >  	if (!svm)
> >  		goto out;
> >  
> > @@ -423,7 +428,7 @@ int intel_svm_unbind_mm(struct device *dev, int
> > pasid) kfree_rcu(sdev, rcu);
> >  
> >  				if (list_empty(&svm->devs)) {
> > -
> > intel_pasid_free_id(svm->pasid);
> > +					ioasid_free(svm->pasid);
> >  					if (svm->mm)
> >  						mmu_notifier_unregister(&svm->notifier,
> > svm->mm); 
> > @@ -458,10 +463,11 @@ int intel_svm_is_pasid_valid(struct device
> > *dev, int pasid) if (!iommu)
> >  		goto out;
> >  
> > -	svm = intel_pasid_lookup_id(pasid);
> > -	if (!svm)
> > +	svm = ioasid_find(NULL, pasid, NULL);
> > +	if (IS_ERR(svm)) {
> > +		ret = PTR_ERR(svm);
> >  		goto out;
> > -
> > +	}
> >  	/* init_mm is used in this case */
> >  	if (!svm->mm)
> >  		ret = 1;
> > @@ -568,13 +574,12 @@ static irqreturn_t prq_event_thread(int irq,
> > void *d) 
> >  		if (!svm || svm->pasid != req->pasid) {
> >  			rcu_read_lock();
> > -			svm = intel_pasid_lookup_id(req->pasid);
> > +			svm = ioasid_find(NULL, req->pasid, NULL);
> >  			/* It *can't* go away, because the driver
> > is not permitted
> >  			 * to unbind the mm while any page faults
> > are outstanding.
> >  			 * So we only need RCU to protect the
> > internal idr code. */ rcu_read_unlock();
> > -
> > -			if (!svm) {
> > +			if (IS_ERR(svm) || !svm) {
> >  				pr_err("%s: Page request for
> > invalid PASID %d: %08llx %08llx\n", iommu->name, req->pasid,
> > ((unsigned long long *)req)[0], ((unsigned long long *)req)[1]);
> > -- 
> > 2.7.4
> >   
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support
  2019-10-23  0:27 ` [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Raj, Ashok
@ 2019-10-23 13:46   ` Jacob Pan
  0 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-23 13:46 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Tian, Kevin, David Woodhouse, iommu, LKML, Alex Williamson,
	Jean-Philippe Brucker, Jonathan Cameron

On Tue, 22 Oct 2019 17:27:51 -0700
"Raj, Ashok" <ashok.raj@intel.com> wrote:

> Hi Jacob
> 
> On Tue, Oct 22, 2019 at 04:53:13PM -0700, Jacob Pan wrote:
> > Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
> > Intel platforms allow address space sharing between device DMA and
> > applications. SVA can reduce programming complexity and enhance
> > security. This series is intended to enable SVA virtualization,
> > i.e. shared guest application address space and physical device DMA
> > address. Only IOMMU portion  
> 
> Last line "i.e shared guest application address space and physical
> device DMA address" doesn't read well. 
> 
> Simply rephrase as "enable use of SVA within a guest user application"
> 
Much better, thanks!
> > of the changes are included in this series. Additional support is
> > needed in VFIO and QEMU (will be submitted separately) to complete
> > this functionality.
> > 
> > To make incremental changes and reduce the size of each patchset.
> > This series does not inlcude support for page request services.
> > 
> > In VT-d implementation, PASID table is per device and maintained in
> > the host. Guest PASID table is shadowed in VMM where virtual IOMMU
> > is emulated.
> > 
> >     .-------------.  .---------------------------.
> >     |   vIOMMU    |  | Guest process CR3, FL only|
> >     |             |  '---------------------------'
> >     .----------------/
> >     | PASID Entry |--- PASID cache flush -
> >     '-------------'                       |
> >     |             |                       V
> >     |             |                CR3 in GPA
> >     '-------------'
> > Guest
> > ------| Shadow |--------------------------|--------
> >       v        v                          v
> > Host
> >     .-------------.  .----------------------.
> >     |   pIOMMU    |  | Bind FL for GVA-GPA  |
> >     |             |  '----------------------'
> >     .----------------/  |
> >     | PASID Entry |     V (Nested xlate)
> >     '----------------\.------------------------------.
> >     |             |   |SL for GPA-HPA, default domain|
> >     |             |   '------------------------------'
> >     '-------------'  
> 
> is the SL for GPA-HPA default domain? When you assign a device to 
> Guest, this isn't default domain right?
> 
SL is harvested from default domain for PCI device assignment. Once we
add mdev support, you are right that SL is not default domain, but this
patchset is not dealing with mdev yet. Jumping ahead :)

> > Where:
> >  - FL = First level/stage one page tables
> >  - SL = Second level/stage two page tables
> > 
> > This is the remaining VT-d only portion of V5 since the uAPIs and
> > IOASID common code have been applied to Joerg's IOMMU core branch.
> > (https://lkml.org/lkml/2019/10/2/833)
> > 
> > The complete set with VFIO patches are here:
> > https://github.com/jacobpan/linux.git:siov_sva
> > 
> > The complete nested SVA upstream patches are divided into three
> > phases: 1. Common APIs and PCI device direct assignment
> >     2. Page Request Services (PRS) support
> >     3. Mediated device assignment
> > 
> > With this set and the accompanied VFIO code, we will achieve phase
> > #1.
> > 
> > Thanks,
> > 
> > Jacob
> > 
> > ChangeLog:
> > 	- V6
> > 	  - Rebased on top of Joerg's core branch
> > 	  (git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
> > core)
> > 	  - Adapt to new uAPIs and IOASID allocators
> > 
> > 	- V5
> > 	  Rebased on v5.3-rc4 which has some of the IOMMU fault
> > APIs merged. Addressed v4 review comments from Eric Auger, Baolu
> > Lu, and Jonathan Cameron. Specific changes are as follows:
> > 	  - Refined custom IOASID allocator to support multiple
> > vIOMMU, hotplug cases.
> > 	  - Extracted vendor data from IOMMU guest PASID bind data,
> > for VT-d will support all necessary guest PASID entry fields for
> > PASID bind.
> > 	  - Support non-identity host-guest PASID mapping
> > 	  - Exception handling in various cases
> > 
> > 	- V4
> > 	  - Redesigned IOASID allocator such that it can support
> > custom allocators with shared helper functions. Use separate XArray
> > 	  to store IOASIDs per allocator. Took advice from Eric
> > Auger to have default allocator use the generic allocator structure.
> > 	  Combined into one patch in that the default allocator is
> > just "another" allocator now. Can be built as a module in case of
> > 	  driver use without IOMMU.
> > 	  - Extended bind guest PASID data to support SMMU and
> > non-identity guest to host PASID mapping
> > https://lkml.org/lkml/2019/5/21/802
> > 	  - Rebased on Jean's sva/api common tree, new patches
> > starts with [PATCH v4 10/22]
> > 
> > 	- V3
> > 	  - Addressed thorough review comments from Eric Auger
> > (Thank you!)
> > 	  - Moved IOASID allocator from driver core to IOMMU code
> > per suggestion by Christoph Hellwig
> > 	    (https://lkml.org/lkml/2019/4/26/462)
> > 	  - Rebased on top of Jean's SVA API branch and Eric's v7[1]
> > 	    (git://linux-arm.org/linux-jpb.git sva/api)
> > 	  - All IOMMU APIs are unmodified (except the new bind
> > guest PASID call in patch 9/16)
> > 
> > 	- V2
> > 	  - Rebased on Joerg's IOMMU x86/vt-d branch v5.1-rc4
> > 	  - Integrated with Eric Auger's new v7 series for common
> > APIs (https://github.com/eauger/linux/tree/v5.1-rc3-2stage-v7)
> > 	  - Addressed review comments from Andy Shevchenko and Alex
> > Williamson on IOASID custom allocator.
> > 	  - Support multiple custom IOASID allocators (vIOMMUs) and
> > dynamic registration.
> > 
> > 
> > Jacob Pan (9):
> >   iommu/vt-d: Add custom allocator for IOASID
> >   iommu/vt-d: Replace Intel specific PASID allocator with IOASID
> >   iommu/vt-d: Move domain helper to header
> >   iommu/vt-d: Avoid duplicated code for PASID setup
> >   iommu/vt-d: Add nested translation helper function
> >   iommu/vt-d: Misc macro clean up for SVM
> >   iommu/vt-d: Add bind guest PASID support
> >   iommu/vt-d: Support flushing more translation cache types
> >   iommu/vt-d: Add svm/sva invalidate function
> > 
> > Lu Baolu (1):
> >   iommu/vt-d: Enlightened PASID allocation
> > 
> >  drivers/iommu/Kconfig       |   1 +
> >  drivers/iommu/dmar.c        |  46 ++++++
> >  drivers/iommu/intel-iommu.c | 259 +++++++++++++++++++++++++++++++--
> >  drivers/iommu/intel-pasid.c | 343
> > +++++++++++++++++++++++++++++++++++++-------
> > drivers/iommu/intel-pasid.h |  25 +++- drivers/iommu/intel-svm.c
> > | 298 ++++++++++++++++++++++++++++++--------
> > include/linux/intel-iommu.h |  39 ++++- include/linux/intel-svm.h
> > |  17 +++ 8 files changed, 904 insertions(+), 124 deletions(-)
> > 
> > -- 
> > 2.7.4
> >   

[Jacob Pan]
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation
  2019-10-23  1:53     ` Lu Baolu
@ 2019-10-23 17:55       ` Jacob Pan
  2019-10-23 21:11         ` Jacob Pan
  2019-10-25  1:39         ` Lu Baolu
  0 siblings, 2 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-23 17:55 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, Raj, Ashok, Jean-Philippe Brucker, iommu, LKML,
	Alex Williamson, David Woodhouse, Jonathan Cameron

On Wed, 23 Oct 2019 09:53:04 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> Hi Ashok,
> 
> Thanks for reviewing the patch.
> 
> On 10/23/19 8:45 AM, Raj, Ashok wrote:
> > On Tue, Oct 22, 2019 at 04:53:14PM -0700, Jacob Pan wrote:  
> >> From: Lu Baolu <baolu.lu@linux.intel.com>
> >>
> >> If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
> >> IOMMU driver should rely on the emulation software to allocate
> >> and free PASID IDs. The Intel vt-d spec revision 3.0 defines a
> >> register set to support this. This includes a capability register,
> >> a virtual command register and a virtual response register. Refer
> >> to section 10.4.42, 10.4.43, 10.4.44 for more information.  
> > 
> > The above paragraph seems a bit confusing, there is no reference
> > to caching mode for for VCMD... some suggestion below.
> > 
> > Enabling IOMMU in a guest requires communication with the host
> > driver for certain aspects. Use of PASID ID to enable Shared Virtual
> > Addressing (SVA) requires managing PASID's in the host. VT-d 3.0
> > spec provides a Virtual Command Register (VCMD) to facilitate this.
> > Writes to this register in the guest are trapped by Qemu and
> > proxies the call to the host driver....  
> 
> Yours is better. Will use it.
> 
I will roll that in to v7
> > 
> >   
> >>
> >> This patch adds the enlightened PASID allocation/free interfaces
> >> via the virtual command register.
> >>
> >> Cc: Ashok Raj <ashok.raj@intel.com>
> >> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >> Cc: Kevin Tian <kevin.tian@intel.com>
> >> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> >> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> >> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> >> ---
> >>   drivers/iommu/intel-pasid.c | 83
> >> +++++++++++++++++++++++++++++++++++++++++++++
> >> drivers/iommu/intel-pasid.h | 13 ++++++-
> >> include/linux/intel-iommu.h |  2 ++ 3 files changed, 97
> >> insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/iommu/intel-pasid.c
> >> b/drivers/iommu/intel-pasid.c index 040a445be300..76bcbb21e112
> >> 100644 --- a/drivers/iommu/intel-pasid.c
> >> +++ b/drivers/iommu/intel-pasid.c
> >> @@ -63,6 +63,89 @@ void *intel_pasid_lookup_id(int pasid)
> >>   	return p;
> >>   }
> >>   
> >> +static int check_vcmd_pasid(struct intel_iommu *iommu)
> >> +{
> >> +	u64 cap;
> >> +
> >> +	if (!ecap_vcs(iommu->ecap)) {
> >> +		pr_warn("IOMMU: %s: Hardware doesn't support
> >> virtual command\n",
> >> +			iommu->name);
> >> +		return -ENODEV;
> >> +	}
> >> +
> >> +	cap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
> >> +	if (!(cap & DMA_VCS_PAS)) {
> >> +		pr_warn("IOMMU: %s: Emulation software doesn't
> >> support PASID allocation\n",
> >> +			iommu->name);
> >> +		return -ENODEV;
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int
> >> *pasid) +{
> >> +	u64 res;
> >> +	u8 status_code;
> >> +	unsigned long flags;
> >> +	int ret = 0;
> >> +
> >> +	ret = check_vcmd_pasid(iommu);  
> > 
> > Do you have to check this everytime? every dmar_readq is going to
> > trap to the other side ...  
> 
> Yes. We don't need to check it every time. Check once and save the
> result during boot is enough.
> 
> How about below incremental change?
> 
Below is good but I was thinking to include vccap in struct
intel_iommu{} where cap and ecaps reside. i.e.
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 14b87ae2916a..e2cf25c9c956 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -528,6 +528,7 @@ struct intel_iommu {
        u64             reg_size; /* size of hw register set */
        u64             cap;
        u64             ecap;
+       u64             vccap;

Also, we can use a static branch here.

> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index ff7e877b7a4d..c15d9d7e1e73 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -29,22 +29,29 @@ u32 intel_pasid_max_id = PASID_MAX;
> 
>   static int check_vcmd_pasid(struct intel_iommu *iommu)
>   {
> +       static int vcmd_supported = -EINVAL;
>          u64 cap;
> 
> +       if (vcmd_supported != -EINVAL)
> +               return vcmd_supported;
> +
>          if (!ecap_vcs(iommu->ecap)) {
>                  pr_warn("IOMMU: %s: Hardware doesn't support virtual 
> command\n",
>                          iommu->name);
> -               return -ENODEV;
> +               vcmd_supported = -ENODEV;
> +               return vcmd_supported;
>          }
> 
>          cap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
>          if (!(cap & DMA_VCS_PAS)) {
>                  pr_warn("IOMMU: %s: Emulation software doesn't
> support PASID allocation\n",
>                          iommu->name);
> -               return -ENODEV;
> +               vcmd_supported = -ENODEV;
> +               return vcmd_supported;
>          }
> 
> -       return 0;
> +       vcmd_supported = 0;
> +       return vcmd_supported;
>   }
> 
>   int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid)
> 
> Best regards,
> baolu
> 
> >   
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	raw_spin_lock_irqsave(&iommu->register_lock, flags);
> >> +	dmar_writeq(iommu->reg + DMAR_VCMD_REG, VCMD_CMD_ALLOC);
> >> +	IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
> >> +		      !(res & VCMD_VRSP_IP), res);
> >> +	raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
> >> +
> >> +	status_code = VCMD_VRSP_SC(res);
> >> +	switch (status_code) {
> >> +	case VCMD_VRSP_SC_SUCCESS:
> >> +		*pasid = VCMD_VRSP_RESULT(res);
> >> +		break;
> >> +	case VCMD_VRSP_SC_NO_PASID_AVAIL:
> >> +		pr_info("IOMMU: %s: No PASID available\n",
> >> iommu->name);
> >> +		ret = -ENOMEM;
> >> +		break;
> >> +	default:
> >> +		ret = -ENODEV;
> >> +		pr_warn("IOMMU: %s: Unexpected error code %d\n",
> >> +			iommu->name, status_code);
> >> +	}
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int
> >> pasid) +{
> >> +	u64 res;
> >> +	u8 status_code;
> >> +	unsigned long flags;
> >> +
> >> +	if (check_vcmd_pasid(iommu))
> >> +		return;
> >> +
> >> +	raw_spin_lock_irqsave(&iommu->register_lock, flags);
> >> +	dmar_writeq(iommu->reg + DMAR_VCMD_REG, (pasid << 8) |
> >> VCMD_CMD_FREE);
> >> +	IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
> >> +		      !(res & VCMD_VRSP_IP), res);
> >> +	raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
> >> +
> >> +	status_code = VCMD_VRSP_SC(res);
> >> +	switch (status_code) {
> >> +	case VCMD_VRSP_SC_SUCCESS:
> >> +		break;
> >> +	case VCMD_VRSP_SC_INVALID_PASID:
> >> +		pr_info("IOMMU: %s: Invalid PASID\n",
> >> iommu->name);
> >> +		break;
> >> +	default:
> >> +		pr_warn("IOMMU: %s: Unexpected error code %d\n",
> >> +			iommu->name, status_code);
> >> +	}
> >> +}
> >> +
> >>   /*
> >>    * Per device pasid table management:
> >>    */
> >> diff --git a/drivers/iommu/intel-pasid.h
> >> b/drivers/iommu/intel-pasid.h index fc8cd8f17de1..e413e884e685
> >> 100644 --- a/drivers/iommu/intel-pasid.h
> >> +++ b/drivers/iommu/intel-pasid.h
> >> @@ -23,6 +23,16 @@
> >>   #define is_pasid_enabled(entry)		(((entry)->lo >>
> >> 3) & 0x1) #define get_pasid_dir_size(entry)	(1 <<
> >> ((((entry)->lo >> 9) & 0x7) + 7)) 
> >> +/* Virtual command interface for enlightened pasid management. */
> >> +#define VCMD_CMD_ALLOC			0x1
> >> +#define VCMD_CMD_FREE			0x2
> >> +#define VCMD_VRSP_IP			0x1
> >> +#define VCMD_VRSP_SC(e)			(((e) >> 1) & 0x3)
> >> +#define VCMD_VRSP_SC_SUCCESS		0
> >> +#define VCMD_VRSP_SC_NO_PASID_AVAIL	1
> >> +#define VCMD_VRSP_SC_INVALID_PASID	1
> >> +#define VCMD_VRSP_RESULT(e)		(((e) >> 8) & 0xfffff)
> >> +
> >>   /*
> >>    * Domain ID reserved for pasid entries programmed for
> >> first-level
> >>    * only and pass-through transfer modes.
> >> @@ -95,5 +105,6 @@ int intel_pasid_setup_pass_through(struct
> >> intel_iommu *iommu, struct device *dev, int pasid);
> >>   void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
> >>   				 struct device *dev, int pasid);
> >> -
> >> +int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int
> >> *pasid); +void vcmd_free_pasid(struct intel_iommu *iommu, unsigned
> >> int pasid); #endif /* __INTEL_PASID_H */
> >> diff --git a/include/linux/intel-iommu.h
> >> b/include/linux/intel-iommu.h index ed11ef594378..eea7468694a7
> >> 100644 --- a/include/linux/intel-iommu.h
> >> +++ b/include/linux/intel-iommu.h
> >> @@ -161,6 +161,7 @@
> >>   #define ecap_smpwc(e)		(((e) >> 48) & 0x1)
> >>   #define ecap_flts(e)		(((e) >> 47) & 0x1)
> >>   #define ecap_slts(e)		(((e) >> 46) & 0x1)
> >> +#define ecap_vcs(e)		(((e) >> 44) & 0x1)
> >>   #define ecap_smts(e)		(((e) >> 43) & 0x1)
> >>   #define ecap_dit(e)		((e >> 41) & 0x1)
> >>   #define ecap_pasid(e)		((e >> 40) & 0x1)
> >> @@ -279,6 +280,7 @@
> >>   
> >>   /* PRS_REG */
> >>   #define DMA_PRS_PPR	((u32)1)
> >> +#define DMA_VCS_PAS	((u64)1)
> >>   
> >>   #define IOMMU_WAIT_OP(iommu, offset, op, cond,
> >> sts)			\ do
> >> {
> >> \ -- 2.7.4
> >>  
> >   

[Jacob Pan]
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation
  2019-10-23 17:55       ` Jacob Pan
@ 2019-10-23 21:11         ` Jacob Pan
  2019-10-25  1:42           ` Lu Baolu
  2019-10-25  1:39         ` Lu Baolu
  1 sibling, 1 reply; 27+ messages in thread
From: Jacob Pan @ 2019-10-23 21:11 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, Raj, Ashok, Jean-Philippe Brucker, iommu, LKML,
	Alex Williamson, David Woodhouse, Jonathan Cameron

On Wed, 23 Oct 2019 10:55:23 -0700
Jacob Pan <jacob.jun.pan@linux.intel.com> wrote:

> > > Do you have to check this everytime? every dmar_readq is going to
> > > trap to the other side ...    
> > 
> > Yes. We don't need to check it every time. Check once and save the
> > result during boot is enough.
> > 
> > How about below incremental change?
> >   
> Below is good but I was thinking to include vccap in struct
> intel_iommu{} where cap and ecaps reside. i.e.
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 14b87ae2916a..e2cf25c9c956 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -528,6 +528,7 @@ struct intel_iommu {
>         u64             reg_size; /* size of hw register set */
>         u64             cap;
>         u64             ecap;
> +       u64             vccap;
> 
> Also, we can use a static branch here.
> 
On a second thought, we cannot use static(branch) here in that we
cannot assume there is only one vIOMMU all the time. Have to cache the
vccap per iommu.

> > diff --git a/drivers/iommu/intel-pasid.c
> > b/drivers/iommu/intel-pasid.c index ff7e877b7a4d..c15d9d7e1e73
> > 100644 --- a/drivers/iommu/intel-pasid.c
> > +++ b/drivers/iommu/intel-pasid.c
> > @@ -29,22 +29,29 @@ u32 intel_pasid_max_id = PASID_MAX;
> > 
[Jacob Pan]
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID
  2019-10-23  2:21     ` Lu Baolu
@ 2019-10-23 23:01       ` Jacob Pan
  2019-10-25  1:44         ` Lu Baolu
  0 siblings, 1 reply; 27+ messages in thread
From: Jacob Pan @ 2019-10-23 23:01 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, Raj, Ashok, Jean-Philippe Brucker, iommu, LKML,
	Alex Williamson, David Woodhouse, Jonathan Cameron

On Wed, 23 Oct 2019 10:21:51 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> >> +#ifdef CONFIG_INTEL_IOMMU_SVM  
> > 
> > Maybe move them to intel-svm.c instead? that's where the bulk
> > of the svm support is?  
> 
> I think this is a generic PASID allocator for guest IOMMU although
> vSVA is currently the only consumer. Instead of making it SVM
> specific, I'd like to suggest moving it to intel-pasid.c and replace
> the @svm parameter with a void * one in intel_ioasid_free().

make sense to use void*, no need to tie that to svm bind data type.

In terms of location, perhaps we can move if we have more consumers of
custom allocator?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID
  2019-10-23  4:04     ` Jacob Pan
@ 2019-10-23 23:04       ` Jacob Pan
  0 siblings, 0 replies; 27+ messages in thread
From: Jacob Pan @ 2019-10-23 23:04 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Tian, Kevin, David Woodhouse, iommu, LKML, Alex Williamson,
	Jean-Philippe Brucker, Jonathan Cameron

On Tue, 22 Oct 2019 21:04:00 -0700
Jacob Pan <jacob.jun.pan@linux.intel.com> wrote:

> > > +		if (cap_caching_mode(iommu->cap) &&
> > > sm_supported(iommu)) {    
> > 
> > do you need to check against cap_caching_mode() or ecap_vcmd?
> >   
> I guess ecap_vcmd() will suffice. Kind of redundant.
Actually, we can check vcmd and vcmd_pasid here, then we dont need to
check it on every alloc/free calls.

Thanks,

Jacob
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation
  2019-10-23 17:55       ` Jacob Pan
  2019-10-23 21:11         ` Jacob Pan
@ 2019-10-25  1:39         ` Lu Baolu
  1 sibling, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2019-10-25  1:39 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Tian, Kevin, Alex Williamson, Raj, Ashok, Jean-Philippe Brucker,
	LKML, iommu, David Woodhouse, Jonathan Cameron

Hi,

On 10/24/19 1:55 AM, Jacob Pan wrote:
> On Wed, 23 Oct 2019 09:53:04 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
>> Hi Ashok,
>>
>> Thanks for reviewing the patch.
>>
>> On 10/23/19 8:45 AM, Raj, Ashok wrote:
>>> On Tue, Oct 22, 2019 at 04:53:14PM -0700, Jacob Pan wrote:
>>>> From: Lu Baolu <baolu.lu@linux.intel.com>
>>>>
>>>> If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
>>>> IOMMU driver should rely on the emulation software to allocate
>>>> and free PASID IDs. The Intel vt-d spec revision 3.0 defines a
>>>> register set to support this. This includes a capability register,
>>>> a virtual command register and a virtual response register. Refer
>>>> to section 10.4.42, 10.4.43, 10.4.44 for more information.
>>>
>>> The above paragraph seems a bit confusing, there is no reference
>>> to caching mode for for VCMD... some suggestion below.
>>>
>>> Enabling IOMMU in a guest requires communication with the host
>>> driver for certain aspects. Use of PASID ID to enable Shared Virtual
>>> Addressing (SVA) requires managing PASID's in the host. VT-d 3.0
>>> spec provides a Virtual Command Register (VCMD) to facilitate this.
>>> Writes to this register in the guest are trapped by Qemu and
>>> proxies the call to the host driver....
>>
>> Yours is better. Will use it.
>>
> I will roll that in to v7
>>>
>>>    
>>>>
>>>> This patch adds the enlightened PASID allocation/free interfaces
>>>> via the virtual command register.
>>>>
>>>> Cc: Ashok Raj <ashok.raj@intel.com>
>>>> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
>>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>>>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>>>> ---
>>>>    drivers/iommu/intel-pasid.c | 83
>>>> +++++++++++++++++++++++++++++++++++++++++++++
>>>> drivers/iommu/intel-pasid.h | 13 ++++++-
>>>> include/linux/intel-iommu.h |  2 ++ 3 files changed, 97
>>>> insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/intel-pasid.c
>>>> b/drivers/iommu/intel-pasid.c index 040a445be300..76bcbb21e112
>>>> 100644 --- a/drivers/iommu/intel-pasid.c
>>>> +++ b/drivers/iommu/intel-pasid.c
>>>> @@ -63,6 +63,89 @@ void *intel_pasid_lookup_id(int pasid)
>>>>    	return p;
>>>>    }
>>>>    
>>>> +static int check_vcmd_pasid(struct intel_iommu *iommu)
>>>> +{
>>>> +	u64 cap;
>>>> +
>>>> +	if (!ecap_vcs(iommu->ecap)) {
>>>> +		pr_warn("IOMMU: %s: Hardware doesn't support
>>>> virtual command\n",
>>>> +			iommu->name);
>>>> +		return -ENODEV;
>>>> +	}
>>>> +
>>>> +	cap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
>>>> +	if (!(cap & DMA_VCS_PAS)) {
>>>> +		pr_warn("IOMMU: %s: Emulation software doesn't
>>>> support PASID allocation\n",
>>>> +			iommu->name);
>>>> +		return -ENODEV;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int
>>>> *pasid) +{
>>>> +	u64 res;
>>>> +	u8 status_code;
>>>> +	unsigned long flags;
>>>> +	int ret = 0;
>>>> +
>>>> +	ret = check_vcmd_pasid(iommu);
>>>
>>> Do you have to check this everytime? every dmar_readq is going to
>>> trap to the other side ...
>>
>> Yes. We don't need to check it every time. Check once and save the
>> result during boot is enough.
>>
>> How about below incremental change?
>>
> Below is good but I was thinking to include vccap in struct
> intel_iommu{} where cap and ecaps reside. i.e.
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 14b87ae2916a..e2cf25c9c956 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -528,6 +528,7 @@ struct intel_iommu {
>          u64             reg_size; /* size of hw register set */
>          u64             cap;
>          u64             ecap;
> +       u64             vccap;
> 
> Also, we can use a static branch here.

Yeah! Good idea.

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation
  2019-10-23 21:11         ` Jacob Pan
@ 2019-10-25  1:42           ` Lu Baolu
  0 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2019-10-25  1:42 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Tian, Kevin, Alex Williamson, Raj, Ashok, Jean-Philippe Brucker,
	LKML, iommu, David Woodhouse, Jonathan Cameron

Hi,

On 10/24/19 5:11 AM, Jacob Pan wrote:
> On Wed, 23 Oct 2019 10:55:23 -0700
> Jacob Pan <jacob.jun.pan@linux.intel.com> wrote:
> 
>>>> Do you have to check this everytime? every dmar_readq is going to
>>>> trap to the other side ...
>>>
>>> Yes. We don't need to check it every time. Check once and save the
>>> result during boot is enough.
>>>
>>> How about below incremental change?
>>>    
>> Below is good but I was thinking to include vccap in struct
>> intel_iommu{} where cap and ecaps reside. i.e.
>> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
>> index 14b87ae2916a..e2cf25c9c956 100644
>> --- a/include/linux/intel-iommu.h
>> +++ b/include/linux/intel-iommu.h
>> @@ -528,6 +528,7 @@ struct intel_iommu {
>>          u64             reg_size; /* size of hw register set */
>>          u64             cap;
>>          u64             ecap;
>> +       u64             vccap;
>>
>> Also, we can use a static branch here.
>>
> On a second thought, we cannot use static(branch) here in that we
> cannot assume there is only one vIOMMU all the time. Have to cache the
> vccap per iommu.

intel_iommu is a per iommu structure, right? Or I missed anything?

> 
>>> diff --git a/drivers/iommu/intel-pasid.c
>>> b/drivers/iommu/intel-pasid.c index ff7e877b7a4d..c15d9d7e1e73
>>> 100644 --- a/drivers/iommu/intel-pasid.c
>>> +++ b/drivers/iommu/intel-pasid.c
>>> @@ -29,22 +29,29 @@ u32 intel_pasid_max_id = PASID_MAX;
>>>
> [Jacob Pan]
> 

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID
  2019-10-23 23:01       ` Jacob Pan
@ 2019-10-25  1:44         ` Lu Baolu
  0 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2019-10-25  1:44 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Tian, Kevin, Alex Williamson, Raj, Ashok, Jean-Philippe Brucker,
	LKML, iommu, David Woodhouse, Jonathan Cameron

Hi,

On 10/24/19 7:01 AM, Jacob Pan wrote:
> On Wed, 23 Oct 2019 10:21:51 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
>>>> +#ifdef CONFIG_INTEL_IOMMU_SVM
>>>
>>> Maybe move them to intel-svm.c instead? that's where the bulk
>>> of the svm support is?
>>
>> I think this is a generic PASID allocator for guest IOMMU although
>> vSVA is currently the only consumer. Instead of making it SVM
>> specific, I'd like to suggest moving it to intel-pasid.c and replace
>> the @svm parameter with a void * one in intel_ioasid_free().
> 
> make sense to use void*, no need to tie that to svm bind data type.
> 
> In terms of location, perhaps we can move if we have more consumers of
> custom allocator?
> 

Make sense to me.

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-10-25  1:46 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-22 23:53 [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Jacob Pan
2019-10-22 23:53 ` [PATCH v6 01/10] iommu/vt-d: Enlightened PASID allocation Jacob Pan
2019-10-23  0:45   ` Raj, Ashok
2019-10-23  1:53     ` Lu Baolu
2019-10-23 17:55       ` Jacob Pan
2019-10-23 21:11         ` Jacob Pan
2019-10-25  1:42           ` Lu Baolu
2019-10-25  1:39         ` Lu Baolu
2019-10-22 23:53 ` [PATCH v6 02/10] iommu/vt-d: Add custom allocator for IOASID Jacob Pan
2019-10-23  0:51   ` Raj, Ashok
2019-10-23  2:21     ` Lu Baolu
2019-10-23 23:01       ` Jacob Pan
2019-10-25  1:44         ` Lu Baolu
2019-10-23  4:04     ` Jacob Pan
2019-10-23 23:04       ` Jacob Pan
2019-10-22 23:53 ` [PATCH v6 03/10] iommu/vt-d: Replace Intel specific PASID allocator with IOASID Jacob Pan
2019-10-23  0:58   ` Raj, Ashok
2019-10-23  4:19     ` Jacob Pan
2019-10-22 23:53 ` [PATCH v6 04/10] iommu/vt-d: Move domain helper to header Jacob Pan
2019-10-22 23:53 ` [PATCH v6 05/10] iommu/vt-d: Avoid duplicated code for PASID setup Jacob Pan
2019-10-22 23:53 ` [PATCH v6 06/10] iommu/vt-d: Add nested translation helper function Jacob Pan
2019-10-22 23:53 ` [PATCH v6 07/10] iommu/vt-d: Misc macro clean up for SVM Jacob Pan
2019-10-22 23:53 ` [PATCH v6 08/10] iommu/vt-d: Add bind guest PASID support Jacob Pan
2019-10-22 23:53 ` [PATCH v6 09/10] iommu/vt-d: Support flushing more translation cache types Jacob Pan
2019-10-22 23:53 ` [PATCH v6 10/10] iommu/vt-d: Add svm/sva invalidate function Jacob Pan
2019-10-23  0:27 ` [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support Raj, Ashok
2019-10-23 13:46   ` Jacob Pan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).