* [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table
@ 2024-01-16 16:53 Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 01/17] iommu/amd: Pass struct iommu_dev_data to set_dte_entry() Vasant Hegde
                   ` (16 more replies)
  0 siblings, 17 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

This is part 3 of the 4-part series introducing Shared Virtual
Address (SVA) support, which focuses on refactoring the GCR3 table
handling. It moves GCR3-related information from the protection
domain structure to the per-device iommu_dev_data structure.

It contains the following enhancements:

* Patch 1 - 3:
  Cleanup and introduce helper functions

* Patch 4 - 14:
  Introduce per-device GCR3 table, per-device domain ID, per-device invalidation,
  and related code refactoring

* Patch 15 - 17:
  Remove unused variables and rearrange functions to make the code easier to maintain

This patch series is based on top of iommu/next branch (commit 75f74f85a42e).

This is also available on GitHub:
  https://github.com/AMDESE/linux/tree/iommu_sva_part3_v5_v6.7_rc8

Thanks to Jason for reviewing the previous versions and providing valuable feedback.

Changes from v4 -> v5:
  - Added get_amd_iommu_from_dev_data()
  - Passed gcr3_info instead of dev_data to setup/free_gcr3_table()
  - Added patch to EXPORT iommu_group_mutex_assert()
  - Moved dev_data->domid to gcr3_info->domid


v4: https://lore.kernel.org/linux-iommu/20231212085224.6985-1-vasant.hegde@amd.com/T/#t

Changes from v3 -> v4:
  - Added new patch to enable IOMMU GT feature early
  - Fixed dirty tracking path to use get_amd_iommu_from_dev()
  - Fixed duplicate flush issue in do_attach() path
  - Removed the gcr3_info->giov variable. Instead we use the domain page table mode
    to decide the DTE[GIOV] bit.
  - Free gcr3_table before calling set_dte_entry() (otherwise it could end up
    setting up the GCR3 table with glx 0)
  - Adjusted gfp_t flag for GCR3 table allocation (GFP_ATOMIC -> GFP_KERNEL)
  - Moved the below patches from the invalidation improvement series to this series:
    * Remove unused flush_pasid() related functions
      (originally part of the invalidation improvement series; moved here because
       set_gcr3() was still using __amd_iommu_flush_tlb())
    * Rearrange device flush code

v3: https://lore.kernel.org/linux-iommu/20231013151652.6008-1-vasant.hegde@amd.com/T/#t

Changes from v2 -> v3:
  - Dropped "iommu/amd: Use struct protection_domain in helper functions" as the
    flush-related functions are already fixed and the set_gcr3-related functions are
    reworked in this series.
  - Dropped the to_pdomain() move from this series. It will be introduced
    in the SVA series where it will be used.
  - Added per-device domain ID and per-device invalidation functions
  - Added functions to flush the TLB for a given device
  - Added Reviewed-by tags

v2: https://lore.kernel.org/linux-iommu/20230816174031.634453-1-vasant.hegde@amd.com/T/#t

Changes from v1 -> v2:
  - Dropped iommu_v2 module related support, as the newly introduced Part 2
    removed the iommu_v2 module.
  - Moved Patch 'Use struct protection_domain in helper functions' from Part1 to
    Part3
  - Updated get_amd_iommu_from_dev() to retrieve iommu from device structure
  - Removed 'PD_MODE_PT'

v1: https://lore.kernel.org/linux-iommu/20230808100232.5977-1-vasant.hegde@amd.com/T/#t

Thank you,
Vasant / Suravee

Suravee Suthikulpanit (8):
  iommu/amd: Introduce get_amd_iommu_from_dev()
  iommu/amd: Introduce struct protection_domain.pd_mode
  iommu/amd: Introduce per-device GCR3 table
  iommu/amd: Refactor helper function for setting / clearing GCR3
  iommu/amd: Refactor attaching / detaching device functions
  iommu/amd: Refactor protection_domain helper functions
  iommu/amd: Refactor GCR3 table helper functions
  iommu/amd: Remove unused GCR3 table parameters from struct protection_domain

Vasant Hegde (9):
  iommu/amd: Pass struct iommu_dev_data to set_dte_entry()
  iommu/amd: Enable Guest Translation before registering devices
  iommu/amd: Use protection_domain.flags to check page table mode
  iommu/amd: Introduce per-device domain ID to workaround potential TLB aliasing issue
  iommu/amd: Add support for device based TLB invalidation
  iommu/amd: Rearrange GCR3 table setup code
  iommu: Introduce iommu_group_mutex_assert()
  iommu/amd: Remove unused flush pasid functions
  iommu/amd: Rearrange device flush code

 drivers/iommu/amd/amd_iommu.h       |  29 +-
 drivers/iommu/amd/amd_iommu_types.h |  26 +-
 drivers/iommu/amd/init.c            |   6 +-
 drivers/iommu/amd/io_pgtable_v2.c   |  21 +-
 drivers/iommu/amd/iommu.c           | 633 +++++++++++++---------------
 drivers/iommu/iommu.c               |  17 +
 include/linux/iommu.h               |  21 +
 7 files changed, 377 insertions(+), 376 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 01/17] iommu/amd: Pass struct iommu_dev_data to set_dte_entry()
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 02/17] iommu/amd: Enable Guest Translation before registering devices Vasant Hegde
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

Pass the iommu_dev_data structure instead of passing individual variables.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 92d6596bdfff..77612142f436 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1704,12 +1704,14 @@ static int setup_gcr3_table(struct protection_domain *domain, int pasids)
 	return 0;
 }
 
-static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
-			  struct protection_domain *domain, bool ats, bool ppr)
+static void set_dte_entry(struct amd_iommu *iommu,
+			  struct iommu_dev_data *dev_data)
 {
 	u64 pte_root = 0;
 	u64 flags = 0;
 	u32 old_domid;
+	u16 devid = dev_data->devid;
+	struct protection_domain *domain = dev_data->domain;
 	struct dev_table_entry *dev_table = get_dev_table(iommu);
 
 	if (domain->iop.mode != PAGE_MODE_NONE)
@@ -1729,10 +1731,10 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 
 	flags = dev_table[devid].data[1];
 
-	if (ats)
+	if (dev_data->ats_enabled)
 		flags |= DTE_FLAG_IOTLB;
 
-	if (ppr)
+	if (dev_data->ppr)
 		pte_root |= 1ULL << DEV_ENTRY_PPR;
 
 	if (domain->dirty_tracking)
@@ -1808,12 +1810,10 @@ static void do_attach(struct iommu_dev_data *dev_data,
 		      struct protection_domain *domain)
 {
 	struct amd_iommu *iommu;
-	bool ats;
 
 	iommu = rlookup_amd_iommu(dev_data->dev);
 	if (!iommu)
 		return;
-	ats   = dev_data->ats_enabled;
 
 	/* Update data structures */
 	dev_data->domain = domain;
@@ -1828,8 +1828,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 	domain->dev_cnt                 += 1;
 
 	/* Update device table */
-	set_dte_entry(iommu, dev_data->devid, domain,
-		      ats, dev_data->ppr);
+	set_dte_entry(iommu, dev_data);
 	clone_aliases(iommu, dev_data->dev);
 
 	device_flush_dte(dev_data);
@@ -2013,8 +2012,7 @@ static void update_device_table(struct protection_domain *domain)
 
 		if (!iommu)
 			continue;
-		set_dte_entry(iommu, dev_data->devid, domain,
-			      dev_data->ats_enabled, dev_data->ppr);
+		set_dte_entry(iommu, dev_data);
 		clone_aliases(iommu, dev_data->dev);
 	}
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 02/17] iommu/amd: Enable Guest Translation before registering devices
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 01/17] iommu/amd: Pass struct iommu_dev_data to set_dte_entry() Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 03/17] iommu/amd: Introduce get_amd_iommu_from_dev() Vasant Hegde
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

The IOMMU Guest Translation (GT) feature needs to be enabled before
invalidating guest translations (CMD_INV_IOMMU_PAGES with GN=1).

Currently, the GT feature is enabled after setting up the interrupt
handler. So far this was fine, as we were not invalidating the guest
page table before that point.

An upcoming series will introduce a per-device GCR3 table and will
invalidate guest pages after configuring it. Hence, move GT feature
enablement to early_enable_iommu().

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/init.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c83bd0c2a1c9..959820ccfbcc 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2769,6 +2769,7 @@ static void early_enable_iommu(struct amd_iommu *iommu)
 	iommu_enable_command_buffer(iommu);
 	iommu_enable_event_buffer(iommu);
 	iommu_set_exclusion_range(iommu);
+	iommu_enable_gt(iommu);
 	iommu_enable_ga(iommu);
 	iommu_enable_xt(iommu);
 	iommu_enable_irtcachedis(iommu);
@@ -2825,6 +2826,7 @@ static void early_enable_iommus(void)
 			iommu_disable_irtcachedis(iommu);
 			iommu_enable_command_buffer(iommu);
 			iommu_enable_event_buffer(iommu);
+			iommu_enable_gt(iommu);
 			iommu_enable_ga(iommu);
 			iommu_enable_xt(iommu);
 			iommu_enable_irtcachedis(iommu);
@@ -2838,10 +2840,8 @@ static void enable_iommus_v2(void)
 {
 	struct amd_iommu *iommu;
 
-	for_each_iommu(iommu) {
+	for_each_iommu(iommu)
 		iommu_enable_ppr_log(iommu);
-		iommu_enable_gt(iommu);
-	}
 }
 
 static void enable_iommus_vapic(void)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 03/17] iommu/amd: Introduce get_amd_iommu_from_dev()
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 01/17] iommu/amd: Pass struct iommu_dev_data to set_dte_entry() Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 02/17] iommu/amd: Enable Guest Translation before registering devices Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-19 19:00   ` Jason Gunthorpe
  2024-01-16 16:53 ` [PATCH v5 04/17] iommu/amd: Introduce struct protection_domain.pd_mode Vasant Hegde
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Introduce get_amd_iommu_from_dev() and get_amd_iommu_from_dev_data(),
and replace rlookup_amd_iommu() with the new helper functions where
applicable, to avoid an unnecessary lookup loop when going from
struct device to struct amd_iommu.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h | 15 ++++++++++
 drivers/iommu/amd/iommu.c     | 54 +++++++++--------------------------
 include/linux/iommu.h         | 16 +++++++++++
 3 files changed, 44 insertions(+), 41 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 8b3601f285fd..31275bc6e9e4 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -150,6 +150,21 @@ static inline void *alloc_pgtable_page(int nid, gfp_t gfp)
 	return page ? page_address(page) : NULL;
 }
 
+/*
+ * This must be called after device probe completes. During probe,
+ * use rlookup_amd_iommu() to get the iommu.
+ */
+static inline struct amd_iommu *get_amd_iommu_from_dev(struct device *dev)
+{
+	return iommu_get_iommu_dev(dev, struct amd_iommu, iommu);
+}
+
+/* This must be called after device probe completes. */
+static inline struct amd_iommu *get_amd_iommu_from_dev_data(struct iommu_dev_data *dev_data)
+{
+	return iommu_get_iommu_dev(dev_data->dev, struct amd_iommu, iommu);
+}
+
 bool translation_pre_enabled(struct amd_iommu *iommu);
 bool amd_iommu_is_attach_deferred(struct device *dev);
 int __init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 77612142f436..eefd399c5086 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1388,14 +1388,9 @@ void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
 static int device_flush_iotlb(struct iommu_dev_data *dev_data, u64 address,
 			      size_t size, ioasid_t pasid, bool gn)
 {
-	struct amd_iommu *iommu;
+	struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
 	struct iommu_cmd cmd;
-	int qdep;
-
-	qdep     = dev_data->ats_qdep;
-	iommu    = rlookup_amd_iommu(dev_data->dev);
-	if (!iommu)
-		return -EINVAL;
+	int qdep = dev_data->ats_qdep;
 
 	build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, address,
 			      size, pasid, gn);
@@ -1415,16 +1410,12 @@ static int device_flush_dte_alias(struct pci_dev *pdev, u16 alias, void *data)
  */
 static int device_flush_dte(struct iommu_dev_data *dev_data)
 {
-	struct amd_iommu *iommu;
+	struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
 	struct pci_dev *pdev = NULL;
 	struct amd_iommu_pci_seg *pci_seg;
 	u16 alias;
 	int ret;
 
-	iommu = rlookup_amd_iommu(dev_data->dev);
-	if (!iommu)
-		return -EINVAL;
-
 	if (dev_is_pci(dev_data->dev))
 		pdev = to_pci_dev(dev_data->dev);
 
@@ -1809,11 +1800,7 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
 static void do_attach(struct iommu_dev_data *dev_data,
 		      struct protection_domain *domain)
 {
-	struct amd_iommu *iommu;
-
-	iommu = rlookup_amd_iommu(dev_data->dev);
-	if (!iommu)
-		return;
+	struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
 
 	/* Update data structures */
 	dev_data->domain = domain;
@@ -1837,11 +1824,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 static void do_detach(struct iommu_dev_data *dev_data)
 {
 	struct protection_domain *domain = dev_data->domain;
-	struct amd_iommu *iommu;
-
-	iommu = rlookup_amd_iommu(dev_data->dev);
-	if (!iommu)
-		return;
+	struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
 
 	/* Update data structures */
 	dev_data->domain = NULL;
@@ -2008,10 +1991,8 @@ static void update_device_table(struct protection_domain *domain)
 	struct iommu_dev_data *dev_data;
 
 	list_for_each_entry(dev_data, &domain->dev_list, list) {
-		struct amd_iommu *iommu = rlookup_amd_iommu(dev_data->dev);
+		struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
 
-		if (!iommu)
-			continue;
 		set_dte_entry(iommu, dev_data);
 		clone_aliases(iommu, dev_data->dev);
 	}
@@ -2192,11 +2173,8 @@ static struct iommu_domain *do_iommu_domain_alloc(unsigned int type,
 	struct protection_domain *domain;
 	struct amd_iommu *iommu = NULL;
 
-	if (dev) {
-		iommu = rlookup_amd_iommu(dev);
-		if (!iommu)
-			return ERR_PTR(-ENODEV);
-	}
+	if (dev)
+		iommu = get_amd_iommu_from_dev(dev);
 
 	/*
 	 * Since DTE[Mode]=0 is prohibited on SNP-enabled system,
@@ -2277,7 +2255,7 @@ static int amd_iommu_attach_device(struct iommu_domain *dom,
 {
 	struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
 	struct protection_domain *domain = to_pdomain(dom);
-	struct amd_iommu *iommu = rlookup_amd_iommu(dev);
+	struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
 	int ret;
 
 	/*
@@ -2416,7 +2394,7 @@ static bool amd_iommu_capable(struct device *dev, enum iommu_cap cap)
 	case IOMMU_CAP_DEFERRED_FLUSH:
 		return true;
 	case IOMMU_CAP_DIRTY_TRACKING: {
-		struct amd_iommu *iommu = rlookup_amd_iommu(dev);
+		struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
 
 		return amd_iommu_hd_support(iommu);
 	}
@@ -2445,9 +2423,7 @@ static int amd_iommu_set_dirty_tracking(struct iommu_domain *domain,
 	}
 
 	list_for_each_entry(dev_data, &pdomain->dev_list, list) {
-		iommu = rlookup_amd_iommu(dev_data->dev);
-		if (!iommu)
-			continue;
+		iommu = get_amd_iommu_from_dev_data(dev_data);
 
 		dev_table = get_dev_table(iommu);
 		pte_root = dev_table[dev_data->devid].data[0];
@@ -2507,9 +2483,7 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 		return;
 
 	devid = PCI_SBDF_TO_DEVID(sbdf);
-	iommu = rlookup_amd_iommu(dev);
-	if (!iommu)
-		return;
+	iommu = get_amd_iommu_from_dev(dev);
 	pci_seg = iommu->pci_seg;
 
 	list_for_each_entry(entry, &pci_seg->unity_map, list) {
@@ -2843,9 +2817,7 @@ int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
 	struct iommu_cmd cmd;
 
 	dev_data = dev_iommu_priv_get(&pdev->dev);
-	iommu    = rlookup_amd_iommu(&pdev->dev);
-	if (!iommu)
-		return -ENODEV;
+	iommu    = get_amd_iommu_from_dev(&pdev->dev);
 
 	build_complete_ppr(&cmd, dev_data->devid, pasid, status,
 			   tag, dev_data->pri_tlp);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2f765ae06021..7f6342bc71c3 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -675,6 +675,22 @@ static inline struct iommu_device *dev_to_iommu_device(struct device *dev)
 	return (struct iommu_device *)dev_get_drvdata(dev);
 }
 
+/**
+ * iommu_get_iommu_dev - Get iommu_device for a device
+ * @dev: an end-point device
+ *
+ * Note that this function must be called from within iommu_ops
+ * callbacks to retrieve the iommu_device for a device; the core code
+ * guarantees it will not invoke an op without an attached iommu.
+ */
+static inline struct iommu_device *__iommu_get_iommu_dev(struct device *dev)
+{
+	return dev->iommu->iommu_dev;
+}
+
+#define iommu_get_iommu_dev(dev, type, member) \
+	container_of(__iommu_get_iommu_dev(dev), type, member)
+
 static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
 {
 	*gather = (struct iommu_iotlb_gather) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 04/17] iommu/amd: Introduce struct protection_domain.pd_mode
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (2 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 03/17] iommu/amd: Introduce get_amd_iommu_from_dev() Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 05/17] iommu/amd: Introduce per-device GCR3 table Vasant Hegde
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This enum variable is used to track the type of page table used by the
protection domain. It will replace protection_domain.flags in a
subsequent series.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/amd_iommu_types.h | 6 ++++++
 drivers/iommu/amd/iommu.c           | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 809d74faa1a5..934a63f64839 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -549,6 +549,11 @@ struct amd_io_pgtable {
 	u64			*pgd;		/* v2 pgtable pgd pointer */
 };
 
+enum protection_domain_mode {
+	PD_MODE_V1 = 1,
+	PD_MODE_V2,
+};
+
 /*
  * This structure contains generic data for  IOMMU protection domains
  * independent of their use.
@@ -564,6 +569,7 @@ struct protection_domain {
 	int nid;		/* Node ID */
 	u64 *gcr3_tbl;		/* Guest CR3 table */
 	unsigned long flags;	/* flags to find out type of domain */
+	enum protection_domain_mode pd_mode; /* Track page table type */
 	bool dirty_tracking;	/* dirty tracking is enabled in the domain */
 	unsigned dev_cnt;	/* devices assigned to this domain */
 	unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index eefd399c5086..88a536dafa7a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2073,6 +2073,7 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 			return -ENOMEM;
 	}
 
+	domain->pd_mode = PD_MODE_V1;
 	amd_iommu_domain_set_pgtable(domain, pt_root, mode);
 
 	return 0;
@@ -2081,6 +2082,7 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 static int protection_domain_init_v2(struct protection_domain *domain)
 {
 	domain->flags |= PD_GIOV_MASK;
+	domain->pd_mode = PD_MODE_V2;
 
 	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 05/17] iommu/amd: Introduce per-device GCR3 table
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (3 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 04/17] iommu/amd: Introduce struct protection_domain.pd_mode Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-19 19:03   ` Jason Gunthorpe
  2024-01-16 16:53 ` [PATCH v5 06/17] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

The AMD IOMMU GCR3 table is indexed by PASID. Each entry stores a guest
CR3 register value, which is the address of the root of the guest I/O
page table. The GCR3 table can be programmed per-device; however, the
Linux AMD IOMMU driver currently manages the table on a per-domain basis.

PASID is a device feature. When SVA is enabled, a PASID is bound to a
device, not to a domain. Hence it makes sense to have a per-device GCR3
table.

Introduce struct gcr3_tbl_info, embedded in struct iommu_dev_data, to
keep track of the GCR3 table configuration. This will eventually replace
the gcr3-related variables in the protection_domain structure.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 934a63f64839..fead9033796f 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -541,6 +541,12 @@ struct amd_irte_ops;
 #define io_pgtable_cfg_to_data(x) \
 	container_of((x), struct amd_io_pgtable, pgtbl_cfg)
 
+struct gcr3_tbl_info {
+	u64	*gcr3_tbl;	/* Guest CR3 table */
+	int	glx;		/* Number of levels for GCR3 table */
+	u32	pasid_cnt;	/* Track attached PASIDs */
+};
+
 struct amd_io_pgtable {
 	struct io_pgtable_cfg	pgtbl_cfg;
 	struct io_pgtable	iop;
@@ -822,6 +828,7 @@ struct iommu_dev_data {
 	struct list_head list;		  /* For domain->dev_list */
 	struct llist_node dev_data_list;  /* For global dev_data_list */
 	struct protection_domain *domain; /* Domain the device is bound to */
+	struct gcr3_tbl_info gcr3_info;   /* Per-device GCR3 table */
 	struct device *dev;
 	u16 devid;			  /* PCI Device ID */
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 06/17] iommu/amd: Use protection_domain.flags to check page table mode
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (4 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 05/17] iommu/amd: Introduce per-device GCR3 table Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 07/17] iommu/amd: Introduce per-device domain ID to workaround potential TLB aliasing issue Vasant Hegde
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

Page table mode (v1, v2 or pt) is a per-domain property. An earlier
patch in this series introduced protection_domain.pd_mode to track the
per-domain page table mode. Use that variable instead of the global
'amd_iommu_pgtable' to check the page table mode in the
{map/unmap}_pages path.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 88a536dafa7a..dac04585c7fc 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2315,7 +2315,7 @@ static int amd_iommu_map_pages(struct iommu_domain *dom, unsigned long iova,
 	int prot = 0;
 	int ret = -EINVAL;
 
-	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+	if ((domain->pd_mode == PD_MODE_V1) &&
 	    (domain->iop.mode == PAGE_MODE_NONE))
 		return -EINVAL;
 
@@ -2361,7 +2361,7 @@ static size_t amd_iommu_unmap_pages(struct iommu_domain *dom, unsigned long iova
 	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
 	size_t r;
 
-	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+	if ((domain->pd_mode == PD_MODE_V1) &&
 	    (domain->iop.mode == PAGE_MODE_NONE))
 		return 0;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 07/17] iommu/amd: Introduce per-device domain ID to workaround potential TLB aliasing issue
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (5 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 06/17] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 08/17] iommu/amd: Add support for device based TLB invalidation Vasant Hegde
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

With the v1 page table, the AMD IOMMU spec states that the hardware must
use the domain ID to tag its internal translation caches. I/O devices
with different v1 page tables must be given different domain IDs. I/O
devices that share the same v1 page table __may__ be given the same
domain ID. This domain ID management policy is currently implemented by
the AMD IOMMU driver. In this case, only the domain ID is needed when
issuing the INVALIDATE_IOMMU_PAGES command to invalidate the IOMMU
translation cache (TLB).

With the v2 page table, the hardware uses both the domain ID and the
PASID to tag the TLB and as parameters of the INVALIDATE_IOMMU_PAGES
command. Since the GCR3 table is set up per-device, there is no
guarantee that a PASID is unique across multiple devices; the same PASID
on different devices could refer to different v2 page tables. In such a
case, if multiple devices share the same domain ID, the IOMMU
translation cache for these devices would be polluted due to TLB
aliasing.

Hence, avoid the TLB aliasing issue with the v2 page table by allocating
a unique domain ID for each device, even when multiple devices share the
same v1 page table. Please note that this workaround results in multiple
INVALIDATE_IOMMU_PAGES commands (one per domain ID) when unmapping a
translation.

The domain ID can be shared until a device starts using PASID. We will
enhance this code later to allocate a per-device domain ID only when it
is needed.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  1 +
 drivers/iommu/amd/iommu.c           | 84 +++++++++++++++++++++++------
 2 files changed, 70 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index fead9033796f..9e23710fd3ec 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -545,6 +545,7 @@ struct gcr3_tbl_info {
 	u64	*gcr3_tbl;	/* Guest CR3 table */
 	int	glx;		/* Number of levels for GCR3 table */
 	u32	pasid_cnt;	/* Track attached PASIDs */
+	u16	domid;		/* Per device domain ID */
 };
 
 struct amd_io_pgtable {
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index dac04585c7fc..34004f0e90dc 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -90,6 +90,14 @@ static inline bool pdom_is_v2_pgtbl_mode(struct protection_domain *pdom)
 	return (pdom && (pdom->flags & PD_IOMMUV2_MASK));
 }
 
+/*
+ * Allocate per device domain ID when using V2 page table
+ */
+static inline bool domain_id_is_per_dev(struct protection_domain *pdom)
+{
+	return (pdom && pdom->pd_mode != PD_MODE_V1);
+}
+
 static inline int get_acpihid_device_id(struct device *dev,
 					struct acpihid_map_entry **entry)
 {
@@ -1444,27 +1452,37 @@ static int device_flush_dte(struct iommu_dev_data *dev_data)
 	return ret;
 }
 
-/*
- * TLB invalidation function which is called from the mapping functions.
- * It invalidates a single PTE if the range to flush is within a single
- * page. Otherwise it flushes the whole TLB of the IOMMU.
- */
-static void __domain_flush_pages(struct protection_domain *domain,
+static int domain_flush_pages_v2(struct protection_domain *pdom,
 				 u64 address, size_t size)
 {
 	struct iommu_dev_data *dev_data;
 	struct iommu_cmd cmd;
-	int ret = 0, i;
-	ioasid_t pasid = IOMMU_NO_PASID;
-	bool gn = false;
+	int ret = 0;
 
-	if (pdom_is_v2_pgtbl_mode(domain))
-		gn = true;
+	list_for_each_entry(dev_data, &pdom->dev_list, list) {
+		struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
+		u16 domid = dev_data->gcr3_info.domid;
+
+		build_inv_iommu_pages(&cmd, address, size,
+				      domid, IOMMU_NO_PASID, true);
+
+		ret |= iommu_queue_command(iommu, &cmd);
+	}
+
+	return ret;
+}
 
-	build_inv_iommu_pages(&cmd, address, size, domain->id, pasid, gn);
+static int domain_flush_pages_v1(struct protection_domain *pdom,
+				 u64 address, size_t size)
+{
+	struct iommu_cmd cmd;
+	int ret = 0, i;
+
+	build_inv_iommu_pages(&cmd, address, size,
+			      pdom->id, IOMMU_NO_PASID, false);
 
 	for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
-		if (!domain->dev_iommu[i])
+		if (!pdom->dev_iommu[i])
 			continue;
 
 		/*
@@ -1474,6 +1492,28 @@ static void __domain_flush_pages(struct protection_domain *domain,
 		ret |= iommu_queue_command(amd_iommus[i], &cmd);
 	}
 
+	return ret;
+}
+
+/*
+ * TLB invalidation function which is called from the mapping functions.
+ * It flushes range of PTEs of the domain.
+ */
+static void __domain_flush_pages(struct protection_domain *domain,
+				 u64 address, size_t size)
+{
+	struct iommu_dev_data *dev_data;
+	int ret = 0;
+	ioasid_t pasid = IOMMU_NO_PASID;
+	bool gn = false;
+
+	if (pdom_is_v2_pgtbl_mode(domain)) {
+		gn = true;
+		ret = domain_flush_pages_v2(domain, address, size);
+	} else {
+		ret = domain_flush_pages_v1(domain, address, size);
+	}
+
 	list_for_each_entry(dev_data, &domain->dev_list, list) {
 
 		if (!dev_data->ats_enabled)
@@ -1702,9 +1742,15 @@ static void set_dte_entry(struct amd_iommu *iommu,
 	u64 flags = 0;
 	u32 old_domid;
 	u16 devid = dev_data->devid;
+	u16 domid;
 	struct protection_domain *domain = dev_data->domain;
 	struct dev_table_entry *dev_table = get_dev_table(iommu);
 
+	if (domain_id_is_per_dev(domain))
+		domid = dev_data->gcr3_info.domid;
+	else
+		domid = domain->id;
+
 	if (domain->iop.mode != PAGE_MODE_NONE)
 		pte_root = iommu_virt_to_phys(domain->iop.root);
 
@@ -1717,7 +1763,7 @@ static void set_dte_entry(struct amd_iommu *iommu,
 	 * When SNP is enabled, Only set TV bit when IOMMU
 	 * page translation is in use.
 	 */
-	if (!amd_iommu_snp_en || (domain->id != 0))
+	if (!amd_iommu_snp_en || (domid != 0))
 		pte_root |= DTE_FLAG_TV;
 
 	flags = dev_table[devid].data[1];
@@ -1766,7 +1812,7 @@ static void set_dte_entry(struct amd_iommu *iommu,
 	}
 
 	flags &= ~DEV_DOMID_MASK;
-	flags |= domain->id;
+	flags |= domid;
 
 	old_domid = dev_table[devid].data[1] & DEV_DOMID_MASK;
 	dev_table[devid].data[1]  = flags;
@@ -1814,6 +1860,10 @@ static void do_attach(struct iommu_dev_data *dev_data,
 	domain->dev_iommu[iommu->index] += 1;
 	domain->dev_cnt                 += 1;
 
+	/* Allocate per device domain ID */
+	if (domain_id_is_per_dev(domain))
+		dev_data->gcr3_info.domid = domain_id_alloc();
+
 	/* Update device table */
 	set_dte_entry(iommu, dev_data);
 	clone_aliases(iommu, dev_data->dev);
@@ -1841,6 +1891,10 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	/* decrease reference counters - needs to happen after the flushes */
 	domain->dev_iommu[iommu->index] -= 1;
 	domain->dev_cnt                 -= 1;
+
+	/* Free per device domain ID */
+	if (domain_id_is_per_dev(domain))
+		domain_id_free(dev_data->gcr3_info.domid);
 }
 
 /*
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread
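[Editor's note: for illustration only, the per-device domain-ID selection added to set_dte_entry() above can be modeled in plain C. All structure and function names below are simplified stand-ins, not the kernel's types:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the domain-ID choice in set_dte_entry(): when the
 * domain uses per-device domain IDs, the DTE gets the ID allocated in
 * gcr3_info; otherwise it keeps the shared protection-domain ID. */
struct gcr3_tbl_info_sketch { uint16_t domid; };
struct dev_data_sketch      { struct gcr3_tbl_info_sketch gcr3_info; };
struct pdom_sketch          { uint16_t id; bool per_dev_domid; };

static uint16_t pick_domid(const struct pdom_sketch *pdom,
			   const struct dev_data_sketch *dev_data)
{
	if (pdom->per_dev_domid)
		return dev_data->gcr3_info.domid;
	return pdom->id;
}
```

This mirrors why do_attach()/do_detach() above gain the domain_id_alloc()/domain_id_free() calls: per-device IDs have a per-device lifetime.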

* [PATCH v5 08/17] iommu/amd: Add support for device based TLB invalidation
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (6 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 07/17] iommu/amd: Introduce per-device domain ID to workaround potential TLB aliasing issue Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 09/17] iommu/amd: Rearrange GCR3 table setup code Vasant Hegde
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

Add support to invalidate the TLB/IOTLB for a given device.

These functions will be used in subsequent patches to introduce
per-device GCR3 tables and SVA support.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/amd_iommu.h |  5 +++++
 drivers/iommu/amd/iommu.c     | 23 +++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 31275bc6e9e4..5e4614dbd435 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -63,6 +63,11 @@ void amd_iommu_domain_update(struct protection_domain *domain);
 void amd_iommu_domain_flush_complete(struct protection_domain *domain);
 void amd_iommu_domain_flush_pages(struct protection_domain *domain,
 				  u64 address, size_t size);
+void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
+				     ioasid_t pasid, u64 address, size_t size);
+void amd_iommu_dev_flush_pasid_all(struct iommu_dev_data *dev_data,
+				   ioasid_t pasid);
+
 int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
 int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
 			      unsigned long cr3);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 34004f0e90dc..f83e0521d9a6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1582,6 +1582,29 @@ static void amd_iommu_domain_flush_all(struct protection_domain *domain)
 				     CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
 }
 
+void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
+				     ioasid_t pasid, u64 address, size_t size)
+{
+	struct iommu_cmd cmd;
+	struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
+
+	build_inv_iommu_pages(&cmd, address, size,
+			      dev_data->gcr3_info.domid, pasid, true);
+	iommu_queue_command(iommu, &cmd);
+
+	if (dev_data->ats_enabled)
+		device_flush_iotlb(dev_data, address, size, pasid, true);
+
+	iommu_completion_wait(iommu);
+}
+
+void amd_iommu_dev_flush_pasid_all(struct iommu_dev_data *dev_data,
+				   ioasid_t pasid)
+{
+	amd_iommu_dev_flush_pasid_pages(dev_data, pasid, 0,
+					CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
+}
+
 void amd_iommu_domain_flush_complete(struct protection_domain *domain)
 {
 	int i;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread
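[Editor's note: the invalidation helper above follows a fixed order: queue the IOMMU TLB invalidation first, then (only if ATS is enabled) the device IOTLB flush, then wait for completion. A toy userspace sketch of that ordering, with invented names, just to make the sequencing checkable:]

```c
#include <assert.h>
#include <stdbool.h>

/* Record the order of the steps so the sequencing can be checked. */
enum step { IOMMU_TLB, DEV_IOTLB, COMPLETION_WAIT };

static enum step log_buf[3];
static int log_len;

static void flush_pasid_sketch(bool ats_enabled)
{
	log_len = 0;
	log_buf[log_len++] = IOMMU_TLB;         /* queue INVALIDATE_IOMMU_PAGES */
	if (ats_enabled)
		log_buf[log_len++] = DEV_IOTLB; /* queue device IOTLB flush */
	log_buf[log_len++] = COMPLETION_WAIT;   /* iommu_completion_wait() */
}
```

Flushing the IOMMU TLB before the device IOTLB matters because a device TLB refill goes through the IOMMU TLB; the reverse order could repopulate the device TLB from stale IOMMU entries.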

* [PATCH v5 09/17] iommu/amd: Rearrange GCR3 table setup code
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (7 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 08/17] iommu/amd: Add support for device based TLB invalidation Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert() Vasant Hegde
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

Consolidate GCR3 table related code in one place so that it is easy
to maintain.

Note that this patch doesn't move __set_gcr3()/__clear_gcr3(). Since we
are moving the GCR3 table from per-domain to per-device, subsequent
patches will rework these functions and move them at that time.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 64 +++++++++++++++++++--------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index f83e0521d9a6..22092e741121 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1758,6 +1758,38 @@ static int setup_gcr3_table(struct protection_domain *domain, int pasids)
 	return 0;
 }
 
+static u64 *__get_gcr3_pte(u64 *root, int level, u32 pasid, bool alloc)
+{
+	int index;
+	u64 *pte;
+
+	while (true) {
+
+		index = (pasid >> (9 * level)) & 0x1ff;
+		pte   = &root[index];
+
+		if (level == 0)
+			break;
+
+		if (!(*pte & GCR3_VALID)) {
+			if (!alloc)
+				return NULL;
+
+			root = (void *)get_zeroed_page(GFP_ATOMIC);
+			if (root == NULL)
+				return NULL;
+
+			*pte = iommu_virt_to_phys(root) | GCR3_VALID;
+		}
+
+		root = iommu_phys_to_virt(*pte & PAGE_MASK);
+
+		level -= 1;
+	}
+
+	return pte;
+}
+
 static void set_dte_entry(struct amd_iommu *iommu,
 			  struct iommu_dev_data *dev_data)
 {
@@ -2796,38 +2828,6 @@ int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid)
 	return ret;
 }
 
-static u64 *__get_gcr3_pte(u64 *root, int level, u32 pasid, bool alloc)
-{
-	int index;
-	u64 *pte;
-
-	while (true) {
-
-		index = (pasid >> (9 * level)) & 0x1ff;
-		pte   = &root[index];
-
-		if (level == 0)
-			break;
-
-		if (!(*pte & GCR3_VALID)) {
-			if (!alloc)
-				return NULL;
-
-			root = (void *)get_zeroed_page(GFP_ATOMIC);
-			if (root == NULL)
-				return NULL;
-
-			*pte = iommu_virt_to_phys(root) | GCR3_VALID;
-		}
-
-		root = iommu_phys_to_virt(*pte & PAGE_MASK);
-
-		level -= 1;
-	}
-
-	return pte;
-}
-
 static int __set_gcr3(struct protection_domain *domain, u32 pasid,
 		      unsigned long cr3)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread
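[Editor's note: the table walk in __get_gcr3_pte() above consumes the PASID nine bits at a time, one 512-entry table per level. The index math can be checked in isolation; this is a sketch, not kernel code:]

```c
#include <assert.h>
#include <stdint.h>

/* Each GCR3 table level holds 512 (2^9) entries, so level N is indexed
 * by PASID bits [9N+8 : 9N], matching __get_gcr3_pte() above. */
static int gcr3_index(uint32_t pasid, int level)
{
	return (pasid >> (9 * level)) & 0x1ff;
}
```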

* [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert()
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (8 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 09/17] iommu/amd: Rearrange GCR3 table setup code Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-19 19:09   ` Jason Gunthorpe
  2024-01-16 16:53 ` [PATCH v5 11/17] iommu/amd: Refactor helper function for setting / clearing GCR3 Vasant Hegde
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

Add a function to assert that the iommu group mutex is held, so that
device drivers can rely on the group mutex instead of adding another
driver-level lock before modifying driver-specific device data
structures.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/iommu.c | 17 +++++++++++++++++
 include/linux/iommu.h |  5 +++++
 2 files changed, 22 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 502779d57b68..b27d98374d43 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1246,6 +1246,23 @@ void iommu_group_remove_device(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_group_remove_device);
 
+/**
+ * iommu_group_mutex_assert - Check device group mutex lock
+ * @dev: the device that has group param set
+ *
+ * This function is called by an iommu driver to check whether it holds
+ * the group mutex for the given device.
+ *
+ * Note that this function must be called after device group param is set.
+ */
+void iommu_group_mutex_assert(struct device *dev)
+{
+	struct iommu_group *group = dev->iommu_group;
+
+	lockdep_assert_held(&group->mutex);
+}
+EXPORT_SYMBOL_GPL(iommu_group_mutex_assert);
+
 static struct device *iommu_group_first_dev(struct iommu_group *group)
 {
 	lockdep_assert_held(&group->mutex);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7f6342bc71c3..c983b6a1ebce 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -751,6 +751,7 @@ extern int iommu_group_set_name(struct iommu_group *group, const char *name);
 extern int iommu_group_add_device(struct iommu_group *group,
 				  struct device *dev);
 extern void iommu_group_remove_device(struct device *dev);
+extern void iommu_group_mutex_assert(struct device *dev);
 extern int iommu_group_for_each_dev(struct iommu_group *group, void *data,
 				    int (*fn)(struct device *, void *));
 extern struct iommu_group *iommu_group_get(struct device *dev);
@@ -1144,6 +1145,10 @@ static inline void iommu_group_remove_device(struct device *dev)
 {
 }
 
+static inline void iommu_group_mutex_assert(struct device *dev)
+{
+}
+
 static inline int iommu_group_for_each_dev(struct iommu_group *group,
 					   void *data,
 					   int (*fn)(struct device *, void *))
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread
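[Editor's note: the helper above is a thin wrapper around lockdep_assert_held(). A userspace analogue of the pattern, using an explicit held flag in place of lockdep — names are illustrative only:]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for lockdep: the group records whether its mutex is
 * held, and helpers that require the lock assert on that state instead
 * of taking a second, driver-private lock. */
struct group_sketch { bool mutex_held; };

static void group_lock(struct group_sketch *g)   { g->mutex_held = true; }
static void group_unlock(struct group_sketch *g) { g->mutex_held = false; }

static void group_mutex_assert_sketch(const struct group_sketch *g)
{
	assert(g->mutex_held);
}

/* A driver helper that must only run under the group mutex. */
static int set_dev_state(struct group_sketch *g, int *state, int val)
{
	group_mutex_assert_sketch(g);
	*state = val;
	return 0;
}
```

In the kernel the assertion costs nothing unless CONFIG_LOCKDEP is enabled, which is why it can replace a dedicated driver lock in these paths.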

* [PATCH v5 11/17] iommu/amd: Refactor helper function for setting / clearing GCR3
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (9 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert() Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 12/17] iommu/amd: Refactor attaching / detaching device functions Vasant Hegde
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Refactor the GCR3 helper functions in preparation for using the
per-device GCR3 table.
  * Add a new function, update_gcr3(), to update the per-device GCR3 table

  * Remove the per-domain default GCR3 setup during v2 page table
    allocation. A subsequent patch will add support to set up the default
    GCR3 while attaching a device to a domain.

  * Remove amd_iommu_domain_update() from the v2 page table path as the
    device detach path will take care of updating the domain.

  * Consolidate GCR3 table related code in one place so that it is easy
    to maintain.

  * Rename functions to reflect their usage.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h     |   8 ++-
 drivers/iommu/amd/io_pgtable_v2.c |  21 +-----
 drivers/iommu/amd/iommu.c         | 113 ++++++++++++++----------------
 3 files changed, 59 insertions(+), 83 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 5e4614dbd435..3a9b32ad1411 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -52,6 +52,11 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
 int amd_iommu_pdev_enable_cap_pri(struct pci_dev *pdev);
 void amd_iommu_pdev_disable_cap_pri(struct pci_dev *pdev);
 
+/* GCR3 setup */
+int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data,
+		       ioasid_t pasid, unsigned long gcr3);
+int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, ioasid_t pasid);
+
 int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid, u64 address);
 /*
  * This function flushes all internal caches of
@@ -69,9 +74,6 @@ void amd_iommu_dev_flush_pasid_all(struct iommu_dev_data *dev_data,
 				   ioasid_t pasid);
 
 int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
-int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
-			      unsigned long cr3);
-int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, u32 pasid);
 
 #ifdef CONFIG_IRQ_REMAP
 int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
diff --git a/drivers/iommu/amd/io_pgtable_v2.c b/drivers/iommu/amd/io_pgtable_v2.c
index 6d69ba60744f..93489d2db4e8 100644
--- a/drivers/iommu/amd/io_pgtable_v2.c
+++ b/drivers/iommu/amd/io_pgtable_v2.c
@@ -350,38 +350,26 @@ static const struct iommu_flush_ops v2_flush_ops = {
 
 static void v2_free_pgtable(struct io_pgtable *iop)
 {
-	struct protection_domain *pdom;
 	struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
 
-	pdom = container_of(pgtable, struct protection_domain, iop);
-	if (!(pdom->flags & PD_IOMMUV2_MASK))
+	if (!pgtable || !pgtable->pgd)
 		return;
 
-	/* Clear gcr3 entry */
-	amd_iommu_domain_clear_gcr3(&pdom->domain, 0);
-
-	/* Make changes visible to IOMMUs */
-	amd_iommu_domain_update(pdom);
-
 	/* Free page table */
 	free_pgtable(pgtable->pgd, get_pgtable_level());
+	pgtable->pgd = NULL;
 }
 
 static struct io_pgtable *v2_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
 {
 	struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);
 	struct protection_domain *pdom = (struct protection_domain *)cookie;
-	int ret;
 	int ias = IOMMU_IN_ADDR_BIT_SIZE;
 
 	pgtable->pgd = alloc_pgtable_page(pdom->nid, GFP_ATOMIC);
 	if (!pgtable->pgd)
 		return NULL;
 
-	ret = amd_iommu_domain_set_gcr3(&pdom->domain, 0, iommu_virt_to_phys(pgtable->pgd));
-	if (ret)
-		goto err_free_pgd;
-
 	if (get_pgtable_level() == PAGE_MODE_5_LEVEL)
 		ias = 57;
 
@@ -395,11 +383,6 @@ static struct io_pgtable *v2_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
 	cfg->tlb           = &v2_flush_ops;
 
 	return &pgtable->iop;
-
-err_free_pgd:
-	free_pgtable_page(pgtable->pgd);
-
-	return NULL;
 }
 
 struct io_pgtable_init_fns io_pgtable_amd_iommu_v2_init_fns = {
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 22092e741121..c9cf5fd03d0d 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1758,10 +1758,13 @@ static int setup_gcr3_table(struct protection_domain *domain, int pasids)
 	return 0;
 }
 
-static u64 *__get_gcr3_pte(u64 *root, int level, u32 pasid, bool alloc)
+static u64 *__get_gcr3_pte(struct gcr3_tbl_info *gcr3_info,
+			   ioasid_t pasid, bool alloc)
 {
 	int index;
 	u64 *pte;
+	u64 *root = gcr3_info->gcr3_tbl;
+	int level = gcr3_info->glx;
 
 	while (true) {
 
@@ -1790,6 +1793,54 @@ static u64 *__get_gcr3_pte(u64 *root, int level, u32 pasid, bool alloc)
 	return pte;
 }
 
+static int update_gcr3(struct iommu_dev_data *dev_data,
+		       ioasid_t pasid, unsigned long gcr3, bool set)
+{
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	u64 *pte;
+
+	pte = __get_gcr3_pte(gcr3_info, pasid, true);
+	if (pte == NULL)
+		return -ENOMEM;
+
+	if (set)
+		*pte = (gcr3 & PAGE_MASK) | GCR3_VALID;
+	else
+		*pte = 0;
+
+	amd_iommu_dev_flush_pasid_all(dev_data, pasid);
+	return 0;
+}
+
+int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data, ioasid_t pasid,
+		       unsigned long gcr3)
+{
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	int ret;
+
+	iommu_group_mutex_assert(dev_data->dev);
+
+	ret = update_gcr3(dev_data, pasid, gcr3, true);
+	if (!ret)
+		gcr3_info->pasid_cnt++;
+
+	return ret;
+}
+
+int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, ioasid_t pasid)
+{
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	int ret;
+
+	iommu_group_mutex_assert(dev_data->dev);
+
+	ret = update_gcr3(dev_data, pasid, 0, false);
+	if (!ret)
+		gcr3_info->pasid_cnt--;
+
+	return ret;
+}
+
 static void set_dte_entry(struct amd_iommu *iommu,
 			  struct iommu_dev_data *dev_data)
 {
@@ -2828,66 +2879,6 @@ int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid)
 	return ret;
 }
 
-static int __set_gcr3(struct protection_domain *domain, u32 pasid,
-		      unsigned long cr3)
-{
-	u64 *pte;
-
-	if (domain->iop.mode != PAGE_MODE_NONE)
-		return -EINVAL;
-
-	pte = __get_gcr3_pte(domain->gcr3_tbl, domain->glx, pasid, true);
-	if (pte == NULL)
-		return -ENOMEM;
-
-	*pte = (cr3 & PAGE_MASK) | GCR3_VALID;
-
-	return __amd_iommu_flush_tlb(domain, pasid);
-}
-
-static int __clear_gcr3(struct protection_domain *domain, u32 pasid)
-{
-	u64 *pte;
-
-	if (domain->iop.mode != PAGE_MODE_NONE)
-		return -EINVAL;
-
-	pte = __get_gcr3_pte(domain->gcr3_tbl, domain->glx, pasid, false);
-	if (pte == NULL)
-		return 0;
-
-	*pte = 0;
-
-	return __amd_iommu_flush_tlb(domain, pasid);
-}
-
-int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
-			      unsigned long cr3)
-{
-	struct protection_domain *domain = to_pdomain(dom);
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&domain->lock, flags);
-	ret = __set_gcr3(domain, pasid, cr3);
-	spin_unlock_irqrestore(&domain->lock, flags);
-
-	return ret;
-}
-
-int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, u32 pasid)
-{
-	struct protection_domain *domain = to_pdomain(dom);
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&domain->lock, flags);
-	ret = __clear_gcr3(domain, pasid);
-	spin_unlock_irqrestore(&domain->lock, flags);
-
-	return ret;
-}
-
 int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
 			   int status, int tag)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread
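[Editor's note: the amd_iommu_set_gcr3()/amd_iommu_clear_gcr3() wrappers above only adjust pasid_cnt when update_gcr3() succeeds. That bookkeeping can be sketched as follows — a simplified model with an injected failure, not the kernel code:]

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the pasid_cnt accounting: the counter moves only
 * when the underlying GCR3 update succeeds. */
struct gcr3_sketch { int pasid_cnt; };

static bool fail_update;             /* injected failure for the sketch */

static int update_gcr3_sketch(struct gcr3_sketch *info, bool set)
{
	(void)info;
	(void)set;
	return fail_update ? -1 : 0; /* kernel would return -ENOMEM etc. */
}

static int set_gcr3_sketch(struct gcr3_sketch *info)
{
	int ret = update_gcr3_sketch(info, true);
	if (!ret)
		info->pasid_cnt++;
	return ret;
}

static int clear_gcr3_sketch(struct gcr3_sketch *info)
{
	int ret = update_gcr3_sketch(info, false);
	if (!ret)
		info->pasid_cnt--;
	return ret;
}
```

Keeping the counter in step with successful updates is what lets later SVA code know how many PASIDs are live on a device's GCR3 table.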

* [PATCH v5 12/17] iommu/amd: Refactor attaching / detaching device functions
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (10 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 11/17] iommu/amd: Refactor helper function for setting / clearing GCR3 Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 13/17] iommu/amd: Refactor protection_domain helper functions Vasant Hegde
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

If the default domain is configured with a v2 page table, then set up the
default GCR3 entry with the domain GCR3 pointer so that all devices in
the domain use the same page table for translation. Also return the page
table setup status from the do_attach() function.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c9cf5fd03d0d..b5347acee5f3 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1949,10 +1949,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
 	amd_iommu_apply_erratum_63(iommu, devid);
 }
 
-static void do_attach(struct iommu_dev_data *dev_data,
-		      struct protection_domain *domain)
+static int do_attach(struct iommu_dev_data *dev_data,
+		     struct protection_domain *domain)
 {
 	struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
+	int ret = 0;
 
 	/* Update data structures */
 	dev_data->domain = domain;
@@ -1970,11 +1971,28 @@ static void do_attach(struct iommu_dev_data *dev_data,
 	if (domain_id_is_per_dev(domain))
 		dev_data->gcr3_info.domid = domain_id_alloc();
 
+	/* Init GCR3 table and update device table */
+	if (domain->pd_mode == PD_MODE_V2) {
+		/* By default, setup GCR3 table to support single PASID */
+		ret = setup_gcr3_table(dev_data->domain, 1);
+		if (ret)
+			return ret;
+
+		ret = update_gcr3(dev_data, 0,
+				  iommu_virt_to_phys(domain->iop.pgd), true);
+		if (ret) {
+			free_gcr3_table(dev_data->domain);
+			return ret;
+		}
+	}
+
 	/* Update device table */
 	set_dte_entry(iommu, dev_data);
 	clone_aliases(iommu, dev_data->dev);
 
 	device_flush_dte(dev_data);
+
+	return ret;
 }
 
 static void do_detach(struct iommu_dev_data *dev_data)
@@ -1982,6 +2000,12 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	struct protection_domain *domain = dev_data->domain;
 	struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
 
+	/* Clear GCR3 table */
+	if (domain->pd_mode == PD_MODE_V2) {
+		update_gcr3(dev_data, 0, 0, false);
+		free_gcr3_table(dev_data->domain);
+	}
+
 	/* Update data structures */
 	dev_data->domain = NULL;
 	list_del(&dev_data->list);
@@ -2028,7 +2052,7 @@ static int attach_device(struct device *dev,
 	if (dev_is_pci(dev))
 		pdev_enable_caps(to_pci_dev(dev));
 
-	do_attach(dev_data, domain);
+	ret = do_attach(dev_data, domain);
 
 out:
 	spin_unlock(&dev_data->lock);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread
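[Editor's note: the v2 attach path above follows the usual unwind pattern: allocate the GCR3 table, install the default (PASID 0) entry, and free the table again if the install fails. A userspace sketch with hypothetical names:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct dev_sketch { void *gcr3_tbl; };

static bool fail_install;             /* injected failure for the sketch */

static int setup_table_sketch(struct dev_sketch *d)
{
	d->gcr3_tbl = calloc(512, 8); /* one 4KiB level-0 table */
	return d->gcr3_tbl ? 0 : -1;
}

static void free_table_sketch(struct dev_sketch *d)
{
	free(d->gcr3_tbl);
	d->gcr3_tbl = NULL;
}

static int install_entry_sketch(struct dev_sketch *d)
{
	(void)d;
	return fail_install ? -1 : 0;
}

/* Mirrors the do_attach() flow: on install failure the table must not
 * be left allocated. */
static int attach_v2_sketch(struct dev_sketch *d)
{
	int ret = setup_table_sketch(d);
	if (ret)
		return ret;
	ret = install_entry_sketch(d);
	if (ret)
		free_table_sketch(d);
	return ret;
}
```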

* [PATCH v5 13/17] iommu/amd: Refactor protection_domain helper functions
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (11 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 12/17] iommu/amd: Refactor attaching / detaching device functions Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 14/17] iommu/amd: Refactor GCR3 table " Vasant Hegde
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Remove the code that sets up the GCR3 table, and only handle domain
create / destroy, since GCR3 is no longer part of the domain.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b5347acee5f3..b6b7b55c3682 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2233,9 +2233,6 @@ static void protection_domain_free(struct protection_domain *domain)
 	if (domain->iop.pgtbl_cfg.tlb)
 		free_io_pgtable_ops(&domain->iop.iop.ops);
 
-	if (domain->flags & PD_IOMMUV2_MASK)
-		free_gcr3_table(domain);
-
 	if (domain->iop.root)
 		free_page((unsigned long)domain->iop.root);
 
@@ -2263,15 +2260,10 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 	return 0;
 }
 
-static int protection_domain_init_v2(struct protection_domain *domain)
+static int protection_domain_init_v2(struct protection_domain *pdom)
 {
-	domain->flags |= PD_GIOV_MASK;
-	domain->pd_mode = PD_MODE_V2;
-
-	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
-
-	if (setup_gcr3_table(domain, 1))
-		return -ENOMEM;
+	pdom->pd_mode = PD_MODE_V2;
+	pdom->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
 
 	return 0;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 14/17] iommu/amd: Refactor GCR3 table helper functions
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (12 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 13/17] iommu/amd: Refactor protection_domain helper functions Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-19 19:59   ` Jason Gunthorpe
  2024-01-16 16:53 ` [PATCH v5 15/17] iommu/amd: Remove unused flush pasid functions Vasant Hegde
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Use the new per-device struct gcr3_tbl_info, and use the GFP_KERNEL
flag instead of GFP_ATOMIC for GCR3 table allocation. Also modify
set_dte_entry() to use the new per-device GCR3 table.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 53 +++++++++++++++++++++++----------------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b6b7b55c3682..4f7e0dc3a2f4 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -79,6 +79,9 @@ struct kmem_cache *amd_iommu_irq_cache;
 
 static void detach_device(struct device *dev);
 
+static void set_dte_entry(struct amd_iommu *iommu,
+			  struct iommu_dev_data *dev_data);
+
 /****************************************************************************
  *
  * Helper functions
@@ -1710,16 +1713,19 @@ static void free_gcr3_tbl_level2(u64 *tbl)
 	}
 }
 
-static void free_gcr3_table(struct protection_domain *domain)
+static void free_gcr3_table(struct gcr3_tbl_info *gcr3_info)
 {
-	if (domain->glx == 2)
-		free_gcr3_tbl_level2(domain->gcr3_tbl);
-	else if (domain->glx == 1)
-		free_gcr3_tbl_level1(domain->gcr3_tbl);
+	if (gcr3_info->glx == 2)
+		free_gcr3_tbl_level2(gcr3_info->gcr3_tbl);
+	else if (gcr3_info->glx == 1)
+		free_gcr3_tbl_level1(gcr3_info->gcr3_tbl);
 	else
-		BUG_ON(domain->glx != 0);
+		WARN_ON_ONCE(gcr3_info->glx != 0);
+
+	gcr3_info->glx = 0;
 
-	free_page((unsigned long)domain->gcr3_tbl);
+	free_page((unsigned long)gcr3_info->gcr3_tbl);
+	gcr3_info->gcr3_tbl = NULL;
 }
 
 /*
@@ -1738,22 +1744,22 @@ static int get_gcr3_levels(int pasids)
 	return levels ? (DIV_ROUND_UP(levels, 9) - 1) : levels;
 }
 
-/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
-static int setup_gcr3_table(struct protection_domain *domain, int pasids)
+static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
+			    int nid, int pasids)
 {
 	int levels = get_gcr3_levels(pasids);
 
 	if (levels > amd_iommu_max_glx_val)
 		return -EINVAL;
 
-	domain->gcr3_tbl = alloc_pgtable_page(domain->nid, GFP_ATOMIC);
-	if (domain->gcr3_tbl == NULL)
-		return -ENOMEM;
+	if (gcr3_info->gcr3_tbl)
+		return -EBUSY;
 
-	domain->glx      = levels;
-	domain->flags   |= PD_IOMMUV2_MASK;
+	gcr3_info->gcr3_tbl = alloc_pgtable_page(nid, GFP_KERNEL);
+	if (gcr3_info->gcr3_tbl == NULL)
+		return -ENOMEM;
 
-	amd_iommu_domain_update(domain);
+	gcr3_info->glx = levels;
 
 	return 0;
 }
@@ -1851,6 +1857,7 @@ static void set_dte_entry(struct amd_iommu *iommu,
 	u16 domid;
 	struct protection_domain *domain = dev_data->domain;
 	struct dev_table_entry *dev_table = get_dev_table(iommu);
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
 
 	if (domain_id_is_per_dev(domain))
 		domid = dev_data->gcr3_info.domid;
@@ -1883,9 +1890,9 @@ static void set_dte_entry(struct amd_iommu *iommu,
 	if (domain->dirty_tracking)
 		pte_root |= DTE_FLAG_HAD;
 
-	if (domain->flags & PD_IOMMUV2_MASK) {
-		u64 gcr3 = iommu_virt_to_phys(domain->gcr3_tbl);
-		u64 glx  = domain->glx;
+	if (gcr3_info && gcr3_info->gcr3_tbl) {
+		u64 gcr3 = iommu_virt_to_phys(gcr3_info->gcr3_tbl);
+		u64 glx  = gcr3_info->glx;
 		u64 tmp;
 
 		pte_root |= DTE_FLAG_GV;
@@ -1913,7 +1920,8 @@ static void set_dte_entry(struct amd_iommu *iommu,
 				((u64)GUEST_PGTABLE_5_LEVEL << DTE_GPT_LEVEL_SHIFT);
 		}
 
-		if (domain->flags & PD_GIOV_MASK)
+		/* GIOV is supported with V2 page table mode only */
+		if (pdom_is_v2_pgtbl_mode(domain))
 			pte_root |= DTE_FLAG_GIOV;
 	}
 
@@ -1974,14 +1982,15 @@ static int do_attach(struct iommu_dev_data *dev_data,
 	/* Init GCR3 table and update device table */
 	if (domain->pd_mode == PD_MODE_V2) {
 		/* By default, setup GCR3 table to support single PASID */
-		ret = setup_gcr3_table(dev_data->domain, 1);
+		ret = setup_gcr3_table(&dev_data->gcr3_info,
+				       dev_to_node(dev_data->dev), 1);
 		if (ret)
 			return ret;
 
 		ret = update_gcr3(dev_data, 0,
 				  iommu_virt_to_phys(domain->iop.pgd), true);
 		if (ret) {
-			free_gcr3_table(dev_data->domain);
+			free_gcr3_table(&dev_data->gcr3_info);
 			return ret;
 		}
 	}
@@ -2003,7 +2012,7 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	/* Clear GCR3 table */
 	if (domain->pd_mode == PD_MODE_V2) {
 		update_gcr3(dev_data, 0, 0, false);
-		free_gcr3_table(dev_data->domain);
+		free_gcr3_table(&dev_data->gcr3_info);
 	}
 
 	/* Update data structures */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread
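[Editor's note: setup_gcr3_table() above sizes the table from the requested PASID count via get_gcr3_levels(): take ceil(log2(pasids)) bits and divide by the nine bits each table level resolves. A self-contained sketch of that computation, with the kernel's get_count_order() approximated in plain C:]

```c
#include <assert.h>

/* ceil(log2(n)) for n >= 1, approximating the kernel's get_count_order() */
static int count_order(int n)
{
	int order = 0;

	while ((1 << order) < n)
		order++;
	return order;
}

/* Number of extra GCR3 table levels needed for a given PASID count,
 * mirroring get_gcr3_levels(): each level resolves 9 PASID bits, and a
 * single-level table (level 0) covers up to 512 PASIDs. */
static int gcr3_levels_sketch(int pasids)
{
	int bits = count_order(pasids);

	return bits ? ((bits + 8) / 9) - 1 : 0; /* DIV_ROUND_UP(bits, 9) - 1 */
}
```

This is why do_attach() can pass pasids = 1: a single PASID needs zero extra levels, i.e. one 4KiB table.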

* [PATCH v5 15/17] iommu/amd: Remove unused flush pasid functions
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (13 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 14/17] iommu/amd: Refactor GCR3 table " Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 16/17] iommu/amd: Rearrange device flush code Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 17/17] iommu/amd: Remove unused GCR3 table parameters from struct protection_domain Vasant Hegde
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

We have removed the iommu_v2 module and converted the v2 page table to
use the common flush functions. We have also moved the GCR3 table to be
per-device, so the PASID-related flush functions are no longer used.
Hence remove these unused functions.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/amd_iommu.h |   3 -
 drivers/iommu/amd/iommu.c     | 100 ----------------------------------
 2 files changed, 103 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 3a9b32ad1411..b61293f2aef6 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -57,7 +57,6 @@ int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data,
 		       ioasid_t pasid, unsigned long gcr3);
 int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, ioasid_t pasid);
 
-int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid, u64 address);
 /*
  * This function flushes all internal caches of
  * the IOMMU used by this driver.
@@ -73,8 +72,6 @@ void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
 void amd_iommu_dev_flush_pasid_all(struct iommu_dev_data *dev_data,
 				   ioasid_t pasid);
 
-int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
-
 #ifdef CONFIG_IRQ_REMAP
 int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
 #else
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4f7e0dc3a2f4..ac81872ce5c0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2804,106 +2804,6 @@ const struct iommu_ops amd_iommu_ops = {
 	}
 };
 
-static int __flush_pasid(struct protection_domain *domain, u32 pasid,
-			 u64 address, size_t size)
-{
-	struct iommu_dev_data *dev_data;
-	struct iommu_cmd cmd;
-	int i, ret;
-
-	if (!(domain->flags & PD_IOMMUV2_MASK))
-		return -EINVAL;
-
-	build_inv_iommu_pages(&cmd, address, size, domain->id, pasid, true);
-
-	/*
-	 * IOMMU TLB needs to be flushed before Device TLB to
-	 * prevent device TLB refill from IOMMU TLB
-	 */
-	for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
-		if (domain->dev_iommu[i] == 0)
-			continue;
-
-		ret = iommu_queue_command(amd_iommus[i], &cmd);
-		if (ret != 0)
-			goto out;
-	}
-
-	/* Wait until IOMMU TLB flushes are complete */
-	amd_iommu_domain_flush_complete(domain);
-
-	/* Now flush device TLBs */
-	list_for_each_entry(dev_data, &domain->dev_list, list) {
-		struct amd_iommu *iommu;
-		int qdep;
-
-		/*
-		   There might be non-IOMMUv2 capable devices in an IOMMUv2
-		 * domain.
-		 */
-		if (!dev_data->ats_enabled)
-			continue;
-
-		qdep  = dev_data->ats_qdep;
-		iommu = rlookup_amd_iommu(dev_data->dev);
-		if (!iommu)
-			continue;
-		build_inv_iotlb_pages(&cmd, dev_data->devid, qdep,
-				      address, size, pasid, true);
-
-		ret = iommu_queue_command(iommu, &cmd);
-		if (ret != 0)
-			goto out;
-	}
-
-	/* Wait until all device TLBs are flushed */
-	amd_iommu_domain_flush_complete(domain);
-
-	ret = 0;
-
-out:
-
-	return ret;
-}
-
-static int __amd_iommu_flush_page(struct protection_domain *domain, u32 pasid,
-				  u64 address)
-{
-	return __flush_pasid(domain, pasid, address, PAGE_SIZE);
-}
-
-int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid,
-			 u64 address)
-{
-	struct protection_domain *domain = to_pdomain(dom);
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&domain->lock, flags);
-	ret = __amd_iommu_flush_page(domain, pasid, address);
-	spin_unlock_irqrestore(&domain->lock, flags);
-
-	return ret;
-}
-
-static int __amd_iommu_flush_tlb(struct protection_domain *domain, u32 pasid)
-{
-	return __flush_pasid(domain, pasid, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
-}
-
-int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid)
-{
-	struct protection_domain *domain = to_pdomain(dom);
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&domain->lock, flags);
-	ret = __amd_iommu_flush_tlb(domain, pasid);
-	spin_unlock_irqrestore(&domain->lock, flags);
-
-	return ret;
-}
-
 int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
 			   int status, int tag)
 {
-- 
2.31.1



* [PATCH v5 16/17] iommu/amd: Rearrange device flush code
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (14 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 15/17] iommu/amd: Remove unused flush pasid functions Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  2024-01-16 16:53 ` [PATCH v5 17/17] iommu/amd: Remove unused GCR3 table parameters from struct protection_domain Vasant Hegde
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

Consolidate all flush-related code in one place so that it is easy
to maintain.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 92 ++++++++++++++++++---------------------
 1 file changed, 43 insertions(+), 49 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ac81872ce5c0..6e7436787c45 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1649,6 +1649,49 @@ static void domain_flush_devices(struct protection_domain *domain)
 		device_flush_dte(dev_data);
 }
 
+static void update_device_table(struct protection_domain *domain)
+{
+	struct iommu_dev_data *dev_data;
+
+	list_for_each_entry(dev_data, &domain->dev_list, list) {
+		struct amd_iommu *iommu = rlookup_amd_iommu(dev_data->dev);
+
+		set_dte_entry(iommu, dev_data);
+		clone_aliases(iommu, dev_data->dev);
+	}
+}
+
+void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
+{
+	update_device_table(domain);
+	domain_flush_devices(domain);
+}
+
+void amd_iommu_domain_update(struct protection_domain *domain)
+{
+	/* Update device table */
+	amd_iommu_update_and_flush_device_table(domain);
+
+	/* Flush domain TLB(s) and wait for completion */
+	amd_iommu_domain_flush_all(domain);
+}
+
+int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
+			   int status, int tag)
+{
+	struct iommu_dev_data *dev_data;
+	struct amd_iommu *iommu;
+	struct iommu_cmd cmd;
+
+	dev_data = dev_iommu_priv_get(&pdev->dev);
+	iommu    = get_amd_iommu_from_dev(&pdev->dev);
+
+	build_complete_ppr(&cmd, dev_data->devid, pasid, status,
+			   tag, dev_data->pri_tlp);
+
+	return iommu_queue_command(iommu, &cmd);
+}
+
 /****************************************************************************
  *
  * The next functions belong to the domain allocation. A domain is
@@ -2173,39 +2216,6 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
 	return acpihid_device_group(dev);
 }
 
-/*****************************************************************************
- *
- * The next functions belong to the dma_ops mapping/unmapping code.
- *
- *****************************************************************************/
-
-static void update_device_table(struct protection_domain *domain)
-{
-	struct iommu_dev_data *dev_data;
-
-	list_for_each_entry(dev_data, &domain->dev_list, list) {
-		struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
-
-		set_dte_entry(iommu, dev_data);
-		clone_aliases(iommu, dev_data->dev);
-	}
-}
-
-void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
-{
-	update_device_table(domain);
-	domain_flush_devices(domain);
-}
-
-void amd_iommu_domain_update(struct protection_domain *domain)
-{
-	/* Update device table */
-	amd_iommu_update_and_flush_device_table(domain);
-
-	/* Flush domain TLB(s) and wait for completion */
-	amd_iommu_domain_flush_all(domain);
-}
-
 /*****************************************************************************
  *
  * The following functions belong to the exported interface of AMD IOMMU
@@ -2804,22 +2814,6 @@ const struct iommu_ops amd_iommu_ops = {
 	}
 };
 
-int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
-			   int status, int tag)
-{
-	struct iommu_dev_data *dev_data;
-	struct amd_iommu *iommu;
-	struct iommu_cmd cmd;
-
-	dev_data = dev_iommu_priv_get(&pdev->dev);
-	iommu    = get_amd_iommu_from_dev(&pdev->dev);
-
-	build_complete_ppr(&cmd, dev_data->devid, pasid, status,
-			   tag, dev_data->pri_tlp);
-
-	return iommu_queue_command(iommu, &cmd);
-}
-
 #ifdef CONFIG_IRQ_REMAP
 
 /*****************************************************************************
-- 
2.31.1



* [PATCH v5 17/17] iommu/amd: Remove unused GCR3 table parameters from struct protection_domain
  2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
                   ` (15 preceding siblings ...)
  2024-01-16 16:53 ` [PATCH v5 16/17] iommu/amd: Rearrange device flush code Vasant Hegde
@ 2024-01-16 16:53 ` Vasant Hegde
  16 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-16 16:53 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde,
	Jason Gunthorpe

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

The GCR3 table parameters in struct protection_domain are no longer used,
since they have been moved to struct iommu_dev_data and the driver has been
ported to use them. Remove them.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/amd_iommu_types.h | 12 ------------
 drivers/iommu/amd/iommu.c           |  2 +-
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 9e23710fd3ec..d5786c33aee6 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -453,15 +453,6 @@
 
 #define MAX_DOMAIN_ID 65536
 
-/* Protection domain flags */
-#define PD_DMA_OPS_MASK		BIT(0) /* domain used for dma_ops */
-#define PD_DEFAULT_MASK		BIT(1) /* domain is a default dma_ops
-					      domain for an IOMMU */
-#define PD_PASSTHROUGH_MASK	BIT(2) /* domain has no page
-					      translation */
-#define PD_IOMMUV2_MASK		BIT(3) /* domain has gcr3 table */
-#define PD_GIOV_MASK		BIT(4) /* domain enable GIOV support */
-
 /* Timeout stuff */
 #define LOOP_TIMEOUT		100000
 #define MMIO_STATUS_TIMEOUT	2000000
@@ -572,10 +563,7 @@ struct protection_domain {
 	struct amd_io_pgtable iop;
 	spinlock_t lock;	/* mostly used to lock the page table*/
 	u16 id;			/* the domain id written to the device table */
-	int glx;		/* Number of levels for GCR3 table */
 	int nid;		/* Node ID */
-	u64 *gcr3_tbl;		/* Guest CR3 table */
-	unsigned long flags;	/* flags to find out type of domain */
 	enum protection_domain_mode pd_mode; /* Track page table type */
 	bool dirty_tracking;	/* dirty tracking is enabled in the domain */
 	unsigned dev_cnt;	/* devices assigned to this domain */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6e7436787c45..256b5507272f 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -90,7 +90,7 @@ static void set_dte_entry(struct amd_iommu *iommu,
 
 static inline bool pdom_is_v2_pgtbl_mode(struct protection_domain *pdom)
 {
-	return (pdom && (pdom->flags & PD_IOMMUV2_MASK));
+	return (pdom && (pdom->pd_mode == PD_MODE_V2));
 }
 
 /*
-- 
2.31.1



* Re: [PATCH v5 03/17] iommu/amd: Introduce get_amd_iommu_from_dev()
  2024-01-16 16:53 ` [PATCH v5 03/17] iommu/amd: Introduce get_amd_iommu_from_dev() Vasant Hegde
@ 2024-01-19 19:00   ` Jason Gunthorpe
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-19 19:00 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Jan 16, 2024 at 04:53:21PM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> Introduce get_amd_iommu_from_dev() and get_amd_iommu_from_dev_data().
> And replace rlookup_amd_iommu() with the new helper function where
> applicable to avoid unnecessary loop to look up struct amd_iommu from
> struct device.
> 
> Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/amd_iommu.h | 15 ++++++++++
>  drivers/iommu/amd/iommu.c     | 54 +++++++++--------------------------
>  include/linux/iommu.h         | 16 +++++++++++
>  3 files changed, 44 insertions(+), 41 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


* Re: [PATCH v5 05/17] iommu/amd: Introduce per-device GCR3 table
  2024-01-16 16:53 ` [PATCH v5 05/17] iommu/amd: Introduce per-device GCR3 table Vasant Hegde
@ 2024-01-19 19:03   ` Jason Gunthorpe
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-19 19:03 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Jan 16, 2024 at 04:53:23PM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> AMD IOMMU GCR3 table is indexed by PASID. Each entry stores guest CR3
> register value, which is an address to the root of guest IO page table.
> The GCR3 table can be programmed per-device. However, the Linux AMD IOMMU
> driver currently manages the table on a per-domain basis.
> 
> PASID is a device feature. When SVA is enabled it will bind a PASID to the
> device, not the domain. Hence it makes sense to have a per-device GCR3 table.
> 
> Introduce struct iommu_dev_data.gcr3_tbl_info to keep track of the GCR3
> table configuration. This will eventually replace the gcr3-related variables
> in the protection_domain structure.
> 
> Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 7 +++++++
>  1 file changed, 7 insertions(+)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


* Re: [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert()
  2024-01-16 16:53 ` [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert() Vasant Hegde
@ 2024-01-19 19:09   ` Jason Gunthorpe
  2024-01-22  6:24     ` Vasant Hegde
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-19 19:09 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Jan 16, 2024 at 04:53:28PM +0000, Vasant Hegde wrote:

> +/**
> + * iommu_group_mutex_assert - Check device group mutex lock
> + * @dev: the device that has group param set
> + *
> + * This function is called by an iommu driver to check whether it holds
> + * group mutex lock for the given device or not.
> + *
> + * Note that this function must be called after device group param is set.
> + */
> +void iommu_group_mutex_assert(struct device *dev)
> +{
> +	struct iommu_group *group = dev->iommu_group;
> +
> +	lockdep_assert_held(&group->mutex);
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_mutex_assert);
> +
>  static struct device *iommu_group_first_dev(struct iommu_group *group)
>  {
>  	lockdep_assert_held(&group->mutex);
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 7f6342bc71c3..c983b6a1ebce 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -751,6 +751,7 @@ extern int iommu_group_set_name(struct iommu_group *group, const char *name);
>  extern int iommu_group_add_device(struct iommu_group *group,
>  				  struct device *dev);
>  extern void iommu_group_remove_device(struct device *dev);
> +extern void iommu_group_mutex_assert(struct device *dev);

This shouldn't be unconditional. Like this outside the other ifdefs:

#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU)
void iommu_group_mutex_assert(struct device *dev);
#else
static inline void iommu_group_mutex_assert(struct device *dev)
{
}
#endif

And a similar ifdef around the implementation I suppose.
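Combining the two, the guarded declaration plus the matching guard around the
implementation would look roughly like this (a sketch only; the gate symbol
follows the suggestion above, and the exact Kconfig symbol may differ):

```c
/* include/linux/iommu.h -- outside the other CONFIG ifdefs */
#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU)
void iommu_group_mutex_assert(struct device *dev);
#else
static inline void iommu_group_mutex_assert(struct device *dev)
{
}
#endif

/* drivers/iommu/iommu.c -- same gate around the out-of-line body */
#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU)
void iommu_group_mutex_assert(struct device *dev)
{
	struct iommu_group *group = dev->iommu_group;

	lockdep_assert_held(&group->mutex);
}
EXPORT_SYMBOL_GPL(iommu_group_mutex_assert);
#endif
```

With this shape, builds without lockdep see only the empty inline stub and
generate no call at all.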

Jason


* Re: [PATCH v5 14/17] iommu/amd: Refactor GCR3 table helper functions
  2024-01-16 16:53 ` [PATCH v5 14/17] iommu/amd: Refactor GCR3 table " Vasant Hegde
@ 2024-01-19 19:59   ` Jason Gunthorpe
  2024-01-22 10:23     ` Vasant Hegde
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-19 19:59 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Jan 16, 2024 at 04:53:32PM +0000, Vasant Hegde wrote:

> @@ -1738,22 +1744,22 @@ static int get_gcr3_levels(int pasids)
>  	return levels ? (DIV_ROUND_UP(levels, 9) - 1) : levels;
>  }
>  
> -/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
> -static int setup_gcr3_table(struct protection_domain *domain, int pasids)
> +static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
> +			    int nid, int pasids)
>  {
>  	int levels = get_gcr3_levels(pasids);
>  
>  	if (levels > amd_iommu_max_glx_val)
>  		return -EINVAL;
>  
> -	domain->gcr3_tbl = alloc_pgtable_page(domain->nid, GFP_ATOMIC);
> -	if (domain->gcr3_tbl == NULL)
> -		return -ENOMEM;
> +	if (gcr3_info->gcr3_tbl)
> +		return -EBUSY;
>  
> -	domain->glx      = levels;
> -	domain->flags   |= PD_IOMMUV2_MASK;
> +	gcr3_info->gcr3_tbl = alloc_pgtable_page(nid, GFP_KERNEL);
> +	if (gcr3_info->gcr3_tbl == NULL)
> +		return -ENOMEM;
>  
> -	amd_iommu_domain_update(domain);
> +	gcr3_info->glx = levels;

I think this patch should also move the domain_id
allocation/deallocation into setup_gcr3_table()/free_gcr3_table()

But it would really be better to move patch 7 to after the gcr3 layer
is cleaned up and separated from protection_domain so you don't have to
move it in the first place.

It is missing some error handling too

And domain_id_is_per_dev() is pretty redundant once you do that. See below

> @@ -1974,14 +1982,15 @@ static int do_attach(struct iommu_dev_data *dev_data,
>  	/* Init GCR3 table and update device table */
>  	if (domain->pd_mode == PD_MODE_V2) {
>  		/* By default, setup GCR3 table to support single PASID */
> -		ret = setup_gcr3_table(dev_data->domain, 1);
> +		ret = setup_gcr3_table(&dev_data->gcr3_info,
> +				       dev_to_node(dev_data->dev), 1);

IMO it is better to pass the struct amd_iommu * to setup_gcr3_table()
and then it can get the NUMA locality more sensibly via

 dev_to_node(&iommu->dev->dev)

The iommu is the thing that will walk the table and needs the
locality, not the end point device. Locality should flow from
the affiliated amd_iommu always.

Jason

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 256b5507272ff9..fb94ffa269717b 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1769,6 +1769,8 @@ static void free_gcr3_table(struct gcr3_tbl_info *gcr3_info)
 
 	free_page((unsigned long)gcr3_info->gcr3_tbl);
 	gcr3_info->gcr3_tbl = NULL;
+
+	domain_id_free(gcr3_info->domid);
 }
 
 /*
@@ -1788,7 +1790,7 @@ static int get_gcr3_levels(int pasids)
 }
 
 static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
-			    int nid, int pasids)
+			    struct amd_iommu *iommu, int pasids)
 {
 	int levels = get_gcr3_levels(pasids);
 
@@ -1798,12 +1800,19 @@ static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
 	if (gcr3_info->gcr3_tbl)
 		return -EBUSY;
 
-	gcr3_info->gcr3_tbl = alloc_pgtable_page(nid, GFP_KERNEL);
+	gcr3_info->gcr3_tbl = alloc_pgtable_page(
+		iommu ? dev_to_node(&iommu->dev->dev) : 1, GFP_KERNEL);
 	if (gcr3_info->gcr3_tbl == NULL)
 		return -ENOMEM;
 
 	gcr3_info->glx = levels;
 
+	gcr3_info->domid = domain_id_alloc();
+	if (!gcr3_info->domid) {
+		free_page((unsigned long)gcr3_info->gcr3_tbl);
+		return -ENOMEM;
+	}
+
 	return 0;
 }
 
@@ -1902,7 +1911,7 @@ static void set_dte_entry(struct amd_iommu *iommu,
 	struct dev_table_entry *dev_table = get_dev_table(iommu);
 	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
 
-	if (domain_id_is_per_dev(domain))
+	if (gcr3_info && gcr3_info->gcr3_tbl)
 		domid = dev_data->gcr3_info.domid;
 	else
 		domid = domain->id;
@@ -2018,15 +2027,10 @@ static int do_attach(struct iommu_dev_data *dev_data,
 	domain->dev_iommu[iommu->index] += 1;
 	domain->dev_cnt                 += 1;
 
-	/* Allocate per device domain ID */
-	if (domain_id_is_per_dev(domain))
-		dev_data->gcr3_info.domid = domain_id_alloc();
-
 	/* Init GCR3 table and update device table */
 	if (domain->pd_mode == PD_MODE_V2) {
 		/* By default, setup GCR3 table to support single PASID */
-		ret = setup_gcr3_table(&dev_data->gcr3_info,
-				       dev_to_node(dev_data->dev), 1);
+		ret = setup_gcr3_table(&dev_data->gcr3_info, iommu, 1);
 		if (ret)
 			return ret;
 
@@ -2073,10 +2077,6 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	/* decrease reference counters - needs to happen after the flushes */
 	domain->dev_iommu[iommu->index] -= 1;
 	domain->dev_cnt                 -= 1;
-
-	/* Free per device domain ID */
-	if (domain_id_is_per_dev(domain))
-		domain_id_free(dev_data->gcr3_info.domid);
 }
 
 /*


* Re: [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert()
  2024-01-19 19:09   ` Jason Gunthorpe
@ 2024-01-22  6:24     ` Vasant Hegde
  2024-01-22 18:00       ` Jason Gunthorpe
  0 siblings, 1 reply; 31+ messages in thread
From: Vasant Hegde @ 2024-01-22  6:24 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

Jason,


On 1/20/2024 12:39 AM, Jason Gunthorpe wrote:
> On Tue, Jan 16, 2024 at 04:53:28PM +0000, Vasant Hegde wrote:
> 
>> +/**
>> + * iommu_group_mutex_assert - Check device group mutex lock
>> + * @dev: the device that has group param set
>> + *
>> + * This function is called by an iommu driver to check whether it holds
>> + * group mutex lock for the given device or not.
>> + *
>> + * Note that this function must be called after device group param is set.
>> + */
>> +void iommu_group_mutex_assert(struct device *dev)
>> +{
>> +	struct iommu_group *group = dev->iommu_group;
>> +
>> +	lockdep_assert_held(&group->mutex);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_group_mutex_assert);
>> +
>>  static struct device *iommu_group_first_dev(struct iommu_group *group)
>>  {
>>  	lockdep_assert_held(&group->mutex);
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 7f6342bc71c3..c983b6a1ebce 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -751,6 +751,7 @@ extern int iommu_group_set_name(struct iommu_group *group, const char *name);
>>  extern int iommu_group_add_device(struct iommu_group *group,
>>  				  struct device *dev);
>>  extern void iommu_group_remove_device(struct device *dev);
>> +extern void iommu_group_mutex_assert(struct device *dev);
> 
> This shouldn't be unconditional. Like this outside the other ifdefs:
> 
> #if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU)

This is already covered by the `CONFIG_IOMMU_API` check.
Also, lockdep_assert_held() is already covered by the LOCKDEP config check
in the lockdep.h file.

So I think another explicit check is redundant.

Is there any other reason to have an explicit check?


-Vasant


> void iommu_group_mutex_assert(struct device *dev);
> #else
> static inline void iommu_group_mutex_assert(struct device *dev)
> {
> }
> #endif
> 
> And a similar ifdef around the implementation I suppose.
> 
> Jason


* Re: [PATCH v5 14/17] iommu/amd: Refactor GCR3 table helper functions
  2024-01-19 19:59   ` Jason Gunthorpe
@ 2024-01-22 10:23     ` Vasant Hegde
  2024-01-22 18:26       ` Jason Gunthorpe
  0 siblings, 1 reply; 31+ messages in thread
From: Vasant Hegde @ 2024-01-22 10:23 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

Jason,


On 1/20/2024 1:29 AM, Jason Gunthorpe wrote:
> On Tue, Jan 16, 2024 at 04:53:32PM +0000, Vasant Hegde wrote:
> 
>> @@ -1738,22 +1744,22 @@ static int get_gcr3_levels(int pasids)
>>  	return levels ? (DIV_ROUND_UP(levels, 9) - 1) : levels;
>>  }
>>  
>> -/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
>> -static int setup_gcr3_table(struct protection_domain *domain, int pasids)
>> +static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
>> +			    int nid, int pasids)
>>  {
>>  	int levels = get_gcr3_levels(pasids);
>>  
>>  	if (levels > amd_iommu_max_glx_val)
>>  		return -EINVAL;
>>  
>> -	domain->gcr3_tbl = alloc_pgtable_page(domain->nid, GFP_ATOMIC);
>> -	if (domain->gcr3_tbl == NULL)
>> -		return -ENOMEM;
>> +	if (gcr3_info->gcr3_tbl)
>> +		return -EBUSY;
>>  
>> -	domain->glx      = levels;
>> -	domain->flags   |= PD_IOMMUV2_MASK;
>> +	gcr3_info->gcr3_tbl = alloc_pgtable_page(nid, GFP_KERNEL);
>> +	if (gcr3_info->gcr3_tbl == NULL)
>> +		return -ENOMEM;
>>  
>> -	amd_iommu_domain_update(domain);
>> +	gcr3_info->glx = levels;
> 
> I think this patch should also move the domain_id
> allocation/deallocation into setup_gcr3_table()/free_gcr3_table()

We do domain ID allocation in the device attach path only. But GCR3 table
allocation/setup can happen while enabling SVA (like the passthrough -> SVA
path). Hence I have kept it in the attach() path itself.


> 
> But it would really be better to move patch 7 to after the gcr3 layer
> is cleaned and separated from protection_domain so you don't have to
> move it in the first place..

Fine I can do that.

> 
> It is missing some error handling too

Where?

> 
> And domain_id_is_per_dev() is pretty redundant once you do that. See below

We need to handle passthrough as well. It doesn't make sense to allocate and
keep a GCR3 table when we are not going to use it.

-Vasant


> 
>> @@ -1974,14 +1982,15 @@ static int do_attach(struct iommu_dev_data *dev_data,
>>  	/* Init GCR3 table and update device table */
>>  	if (domain->pd_mode == PD_MODE_V2) {
>>  		/* By default, setup GCR3 table to support single PASID */
>> -		ret = setup_gcr3_table(dev_data->domain, 1);
>> +		ret = setup_gcr3_table(&dev_data->gcr3_info,
>> +				       dev_to_node(dev_data->dev), 1);
> 
> IMO it is better to pass the struct amd_iommu * to setup_gcr3_table()
> and then it can get the NUMA locality more sensibly via
> 
>  dev_to_node(&iommu->dev->dev)
> 
> The iommu is the thing that will walk the table and needs the
> locality, not the end point device. Locality should flow from
> the affiliated amd_iommu always.
> 
> Jason
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 256b5507272ff9..fb94ffa269717b 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -1769,6 +1769,8 @@ static void free_gcr3_table(struct gcr3_tbl_info *gcr3_info)
>  
>  	free_page((unsigned long)gcr3_info->gcr3_tbl);
>  	gcr3_info->gcr3_tbl = NULL;
> +
> +	domain_id_free(gcr3_info->domid);
>  }
>  
>  /*
> @@ -1788,7 +1790,7 @@ static int get_gcr3_levels(int pasids)
>  }
>  
>  static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
> -			    int nid, int pasids)
> +			    struct amd_iommu *iommu, int pasids)
>  {
>  	int levels = get_gcr3_levels(pasids);
>  
> @@ -1798,12 +1800,19 @@ static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
>  	if (gcr3_info->gcr3_tbl)
>  		return -EBUSY;
>  
> -	gcr3_info->gcr3_tbl = alloc_pgtable_page(nid, GFP_KERNEL);
> +	gcr3_info->gcr3_tbl = alloc_pgtable_page(
> +		iommu ? dev_to_node(&iommu->dev->dev) : 1, GFP_KERNEL);
>  	if (gcr3_info->gcr3_tbl == NULL)
>  		return -ENOMEM;
>  
>  	gcr3_info->glx = levels;
>  
> +	gcr3_info->domid = domain_id_alloc();
> +	if (!gcr3_info->domid) {
> +		free_page((unsigned long)gcr3_info->gcr3_tbl);
> +		return -ENOMEM;
> +	}
> +
>  	return 0;
>  }
>  
> @@ -1902,7 +1911,7 @@ static void set_dte_entry(struct amd_iommu *iommu,
>  	struct dev_table_entry *dev_table = get_dev_table(iommu);
>  	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
>  
> -	if (domain_id_is_per_dev(domain))
> +	if (gcr3_info && gcr3_info->gcr3_tbl)
>  		domid = dev_data->gcr3_info.domid;
>  	else
>  		domid = domain->id;
> @@ -2018,15 +2027,10 @@ static int do_attach(struct iommu_dev_data *dev_data,
>  	domain->dev_iommu[iommu->index] += 1;
>  	domain->dev_cnt                 += 1;
>  
> -	/* Allocate per device domain ID */
> -	if (domain_id_is_per_dev(domain))
> -		dev_data->gcr3_info.domid = domain_id_alloc();
> -
>  	/* Init GCR3 table and update device table */
>  	if (domain->pd_mode == PD_MODE_V2) {
>  		/* By default, setup GCR3 table to support single PASID */
> -		ret = setup_gcr3_table(&dev_data->gcr3_info,
> -				       dev_to_node(dev_data->dev), 1);
> +		ret = setup_gcr3_table(&dev_data->gcr3_info, iommu, 1);
>  		if (ret)
>  			return ret;
>  
> @@ -2073,10 +2077,6 @@ static void do_detach(struct iommu_dev_data *dev_data)
>  	/* decrease reference counters - needs to happen after the flushes */
>  	domain->dev_iommu[iommu->index] -= 1;
>  	domain->dev_cnt                 -= 1;
> -
> -	/* Free per device domain ID */
> -	if (domain_id_is_per_dev(domain))
> -		domain_id_free(dev_data->gcr3_info.domid);
>  }
>  
>  /*


* Re: [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert()
  2024-01-22  6:24     ` Vasant Hegde
@ 2024-01-22 18:00       ` Jason Gunthorpe
  2024-01-23  5:27         ` Vasant Hegde
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-22 18:00 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Mon, Jan 22, 2024 at 11:54:39AM +0530, Vasant Hegde wrote:
> Jason,
> 
> 
> On 1/20/2024 12:39 AM, Jason Gunthorpe wrote:
> > On Tue, Jan 16, 2024 at 04:53:28PM +0000, Vasant Hegde wrote:
> > 
> >> +/**
> >> + * iommu_group_mutex_assert - Check device group mutex lock
> >> + * @dev: the device that has group param set
> >> + *
> >> + * This function is called by an iommu driver to check whether it holds
> >> + * group mutex lock for the given device or not.
> >> + *
> >> + * Note that this function must be called after device group param is set.
> >> + */
> >> +void iommu_group_mutex_assert(struct device *dev)
> >> +{
> >> +	struct iommu_group *group = dev->iommu_group;
> >> +
> >> +	lockdep_assert_held(&group->mutex);
> >> +}
> >> +EXPORT_SYMBOL_GPL(iommu_group_mutex_assert);
> >> +
> >>  static struct device *iommu_group_first_dev(struct iommu_group *group)
> >>  {
> >>  	lockdep_assert_held(&group->mutex);
> >> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> >> index 7f6342bc71c3..c983b6a1ebce 100644
> >> --- a/include/linux/iommu.h
> >> +++ b/include/linux/iommu.h
> >> @@ -751,6 +751,7 @@ extern int iommu_group_set_name(struct iommu_group *group, const char *name);
> >>  extern int iommu_group_add_device(struct iommu_group *group,
> >>  				  struct device *dev);
> >>  extern void iommu_group_remove_device(struct device *dev);
> >> +extern void iommu_group_mutex_assert(struct device *dev);
> > 
> > This shouldn't be unconditional. Like this outside the other ifdefs:
> > 
> > #if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU)
> 
> This is already covered by `CONFIG_IOMMU_API` check.
> Also lockdep_assert_held() is already covered by LOCKDEP config check in
> lockdep.h file.
> 
> So I think another explicit check in redundant.
> 
> Is there  any other reason to have explicit check?

We don't want the out-of-line function call overhead even when lockdep
is not turned on.

The point is that without lockdep the inline stub should be the
only thing present and the compiler simply does nothing at all.

What you have here will generate a function call and a stack-frame
push to do nothing.

Jason


* Re: [PATCH v5 14/17] iommu/amd: Refactor GCR3 table helper functions
  2024-01-22 10:23     ` Vasant Hegde
@ 2024-01-22 18:26       ` Jason Gunthorpe
  2024-01-23  8:54         ` Vasant Hegde
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-22 18:26 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Mon, Jan 22, 2024 at 03:53:18PM +0530, Vasant Hegde wrote:
> Jason,
> 
> 
> On 1/20/2024 1:29 AM, Jason Gunthorpe wrote:
> > On Tue, Jan 16, 2024 at 04:53:32PM +0000, Vasant Hegde wrote:
> > 
> >> @@ -1738,22 +1744,22 @@ static int get_gcr3_levels(int pasids)
> >>  	return levels ? (DIV_ROUND_UP(levels, 9) - 1) : levels;
> >>  }
> >>  
> >> -/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
> >> -static int setup_gcr3_table(struct protection_domain *domain, int pasids)
> >> +static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
> >> +			    int nid, int pasids)
> >>  {
> >>  	int levels = get_gcr3_levels(pasids);
> >>  
> >>  	if (levels > amd_iommu_max_glx_val)
> >>  		return -EINVAL;
> >>  
> >> -	domain->gcr3_tbl = alloc_pgtable_page(domain->nid, GFP_ATOMIC);
> >> -	if (domain->gcr3_tbl == NULL)
> >> -		return -ENOMEM;
> >> +	if (gcr3_info->gcr3_tbl)
> >> +		return -EBUSY;
> >>  
> >> -	domain->glx      = levels;
> >> -	domain->flags   |= PD_IOMMUV2_MASK;
> >> +	gcr3_info->gcr3_tbl = alloc_pgtable_page(nid, GFP_KERNEL);
> >> +	if (gcr3_info->gcr3_tbl == NULL)
> >> +		return -ENOMEM;
> >>  
> >> -	amd_iommu_domain_update(domain);
> >> +	gcr3_info->glx = levels;
> > 
> > I think this patch should also move the domain_id
> > allocation/deallocation into setup_gcr3_table()/free_gcr3_table()
> 
> We do domain allocation in device attach path only. But GCR3 table
> allocation/setup can happen while enabling SVA (like passthrough -> SVA path).
> Hence I have kept it in attach() path itself.

Huh? That doesn't sound right. Any time you install a GCR3 table into
a DTE you need to get a domain_id *for that GCR3 table*. There is no
other place to get a domain id!?

All this logic should be shared between the pasid and rid attach
paths.

The passthrough thing is only an issue of DTE construction.

If you build a DTE with a GCR3 table and RID=IDENTITY then you set
some bits, and that is it. Detect that case directly when you build
the DTE. It should have no effect on what domain ID is used to tag
translations retrieved from a GCR3 table.

(I say that with some trepidation because it isn't clear to me how the
AMD IOMMU caches the DTE entries themselves)

I've said before the DTE construction should be fixed up before making
things more complicated :\

> > It is missing some error handling too
> 
> Where?

See the diff I sent; it was the error unwinds around gcr3 table failure.
 
> > And domain_id_is_per_dev() is pretty redundant once you do that. See below
> 
> We need to handle passthrough as well. It doesn't make sense to allocate and
> keep GCR3 table when we are not going to use it.

Then don't, and my diff didn't - but check for the passthrough case
directly against the attached domain as identity.

> > @@ -1902,7 +1911,7 @@ static void set_dte_entry(struct amd_iommu *iommu,
> >  	struct dev_table_entry *dev_table = get_dev_table(iommu);
> >  	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
> >  
> > -	if (domain_id_is_per_dev(domain))
> > +	if (gcr3_info && gcr3_info->gcr3_tbl)
> >  		domid = dev_data->gcr3_info.domid;
> >  	else
> >  		domid = domain->id;

Because here it already has the (gcr3_info && gcr3_info->gcr3_tbl)
test below to decide if the gcr3 will be written to the DTE, so of
course it should be the same test to decide where the domain id comes
from.

> > @@ -2018,15 +2027,10 @@ static int do_attach(struct iommu_dev_data *dev_data,
> >  	domain->dev_iommu[iommu->index] += 1;
> >  	domain->dev_cnt                 += 1;
> >  
> > -	/* Allocate per device domain ID */
> > -	if (domain_id_is_per_dev(domain))
> > -		dev_data->gcr3_info.domid = domain_id_alloc();
> > -

And this gets moved into this test:

> >  	/* Init GCR3 table and update device table */
> >  	if (domain->pd_mode == PD_MODE_V2) {

Here, which is obviously correct at this point as you only need a GCR3
table when installing a V2 table on the RID.

> > @@ -2073,10 +2077,6 @@ static void do_detach(struct iommu_dev_data *dev_data)
> >  	/* decrease reference counters - needs to happen after the flushes */
> >  	domain->dev_iommu[iommu->index] -= 1;
> >  	domain->dev_cnt                 -= 1;
> > -
> > -	/* Free per device domain ID */
> > -	if (domain_id_is_per_dev(domain))
> > -		domain_id_free(dev_data->gcr3_info.domid);

And this got moved into the gcr3 free a few lines up that checked the
pd_mode.

Jason


* Re: [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert()
  2024-01-22 18:00       ` Jason Gunthorpe
@ 2024-01-23  5:27         ` Vasant Hegde
  0 siblings, 0 replies; 31+ messages in thread
From: Vasant Hegde @ 2024-01-23  5:27 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel



On 1/22/2024 11:30 PM, Jason Gunthorpe wrote:
> On Mon, Jan 22, 2024 at 11:54:39AM +0530, Vasant Hegde wrote:
>> Jason,
>>
>>
>> On 1/20/2024 12:39 AM, Jason Gunthorpe wrote:
>>> On Tue, Jan 16, 2024 at 04:53:28PM +0000, Vasant Hegde wrote:
>>>
>>>> +/**
>>>> + * iommu_group_mutex_assert - Check device group mutex lock
>>>> + * @dev: the device that has group param set
>>>> + *
>>>> + * This function is called by an iommu driver to check whether it holds
>>>> + * group mutex lock for the given device or not.
>>>> + *
>>>> + * Note that this function must be called after device group param is set.
>>>> + */
>>>> +void iommu_group_mutex_assert(struct device *dev)
>>>> +{
>>>> +	struct iommu_group *group = dev->iommu_group;
>>>> +
>>>> +	lockdep_assert_held(&group->mutex);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(iommu_group_mutex_assert);
>>>> +
>>>>  static struct device *iommu_group_first_dev(struct iommu_group *group)
>>>>  {
>>>>  	lockdep_assert_held(&group->mutex);
>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>> index 7f6342bc71c3..c983b6a1ebce 100644
>>>> --- a/include/linux/iommu.h
>>>> +++ b/include/linux/iommu.h
>>>> @@ -751,6 +751,7 @@ extern int iommu_group_set_name(struct iommu_group *group, const char *name);
>>>>  extern int iommu_group_add_device(struct iommu_group *group,
>>>>  				  struct device *dev);
>>>>  extern void iommu_group_remove_device(struct device *dev);
>>>> +extern void iommu_group_mutex_assert(struct device *dev);
>>>
>>> This shouldn't be unconditional. Like this outside the other ifdefs:
>>>
>>> #if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU)
>>
>> This is already covered by the `CONFIG_IOMMU_API` check.
>> Also lockdep_assert_held() is already covered by the LOCKDEP config check in
>> lockdep.h file.
>>
>> So I think another explicit check is redundant.
>>
>> Is there any other reason to have an explicit check?
> 
> We don't want the out of line function call overhead even when lockdep
> is not turned on.
> 
> The point is that without lockdep the inline stub should be the
> only thing present and the compiler simply does nothing at all.
> 
> What you have here will generate a function call and a stack frame
> push to do nothing.
> 

Ok. Will fix it.

-Vasant



* Re: [PATCH v5 14/17] iommu/amd: Refactor GCR3 table helper functions
  2024-01-22 18:26       ` Jason Gunthorpe
@ 2024-01-23  8:54         ` Vasant Hegde
  2024-01-25  1:46           ` Jason Gunthorpe
  0 siblings, 1 reply; 31+ messages in thread
From: Vasant Hegde @ 2024-01-23  8:54 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel



On 1/22/2024 11:56 PM, Jason Gunthorpe wrote:
> On Mon, Jan 22, 2024 at 03:53:18PM +0530, Vasant Hegde wrote:
>> Jason,
>>
>>
>> On 1/20/2024 1:29 AM, Jason Gunthorpe wrote:
>>> On Tue, Jan 16, 2024 at 04:53:32PM +0000, Vasant Hegde wrote:
>>>
>>>> @@ -1738,22 +1744,22 @@ static int get_gcr3_levels(int pasids)
>>>>  	return levels ? (DIV_ROUND_UP(levels, 9) - 1) : levels;
>>>>  }
>>>>  
>>>> -/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
>>>> -static int setup_gcr3_table(struct protection_domain *domain, int pasids)
>>>> +static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
>>>> +			    int nid, int pasids)
>>>>  {
>>>>  	int levels = get_gcr3_levels(pasids);
>>>>  
>>>>  	if (levels > amd_iommu_max_glx_val)
>>>>  		return -EINVAL;
>>>>  
>>>> -	domain->gcr3_tbl = alloc_pgtable_page(domain->nid, GFP_ATOMIC);
>>>> -	if (domain->gcr3_tbl == NULL)
>>>> -		return -ENOMEM;
>>>> +	if (gcr3_info->gcr3_tbl)
>>>> +		return -EBUSY;
>>>>  
>>>> -	domain->glx      = levels;
>>>> -	domain->flags   |= PD_IOMMUV2_MASK;
>>>> +	gcr3_info->gcr3_tbl = alloc_pgtable_page(nid, GFP_KERNEL);
>>>> +	if (gcr3_info->gcr3_tbl == NULL)
>>>> +		return -ENOMEM;
>>>>  
>>>> -	amd_iommu_domain_update(domain);
>>>> +	gcr3_info->glx = levels;
>>>
>>> I think this patch should also move the domain_id
>>> allocation/deallocation into setup_gcr3_table()/free_gcr3_table()
>>
>> We do domain allocation in device attach path only. But GCR3 table
>> allocation/setup can happen while enabling SVA (like passthrough -> SVA path).
>> Hence I have kept it in attach() path itself.
> 
> Huh? That doesn't sound right. Any time you install a GCR3 table into
> a DTE you need to get a domain_id *for that GCR3 table*. There is no
> other place to get a domain id!?

It's already fixed. If it's a per-device domain ID, it comes from
dev_data/gcr3_info. Else it will come from protection_domain->domain_id.

domain_id_is_per_dev() decides how to allocate the domain ID. Right now for V2 and
passthrough mode we allocate a per-device domain ID as they can switch to SVA.



> 
> All this logic should be shared between the pasid and rid attach
> paths.
> 
> The passthrough thing is only an issue of DTE construction.
> 
> If you build a DTE with a GCR3 table and RID=IDENTITY then you set
> some bits, and that is it. Detect that case directly when you build
> the DTE. It should have no effect on what domain ID is used to tag
> translations retrieved from a GCR3 table.

We build the DTE as soon as we attach the device to a domain.

Now moving domain ID allocation to setup_gcr3_table() complicates things:
  - In the attach_device() path we want to allocate a domain ID but not a GCR3
    table. We could allocate the GCR3 table, but if we don't use it, it's a
    waste of memory.
  - In the SVA enablement path we want to allocate the GCR3 table. But by then
    the domain ID should have been allocated. We don't want to allocate another
    domain ID and change the ID in the SVA enablement path, as our domain ID is
    not specific to GCR3.



> 
> (I say that with some trepidation because it isn't clear to me how the
> AMD IOMMU caches the DTE entries themselves)
> 
> I've said before the DTE construction should be fixed up before making
> things more complicated :\

We are not making anything complicated here!

> 
>>> It is missing some error handling too
>>
>> Where?
> 
> See the diff I sent; it was the error unwinds around gcr3 table failure.

Ok.

>  
>>> And domain_id_is_per_dev() is pretty redundant once you do that. See below
>>
>> We need to handle passthrough as well. It doesn't make sense to allocate and
>> keep GCR3 table when we are not going to use it.
> 
> Then don't, and my diff didn't - but check for the passthrough case
> directly against the attached domain as identity.


We don't want to add a condition check that depends on the code path:
  like attach_device: allocate domain ID but not GCR3
  SVA path: allocate GCR3 but not domain ID.

-Vasant


> 
>>> @@ -1902,7 +1911,7 @@ static void set_dte_entry(struct amd_iommu *iommu,
>>>  	struct dev_table_entry *dev_table = get_dev_table(iommu);
>>>  	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
>>>  
>>> -	if (domain_id_is_per_dev(domain))
>>> +	if (gcr3_info && gcr3_info->gcr3_tbl)
>>>  		domid = dev_data->gcr3_info.domid;
>>>  	else
>>>  		domid = domain->id;
> 
> Because here it already has the (gcr3_info && gcr3_info->gcr3_tbl)
> test below to decide if the gcr3 will be written to the DTE, so of
> course it should be the same test to decide where the domain id comes
> from.
> 
>>> @@ -2018,15 +2027,10 @@ static int do_attach(struct iommu_dev_data *dev_data,
>>>  	domain->dev_iommu[iommu->index] += 1;
>>>  	domain->dev_cnt                 += 1;
>>>  
>>> -	/* Allocate per device domain ID */
>>> -	if (domain_id_is_per_dev(domain))
>>> -		dev_data->gcr3_info.domid = domain_id_alloc();
>>> -
> 
> And this gets moved into this test:
> 
>>>  	/* Init GCR3 table and update device table */
>>>  	if (domain->pd_mode == PD_MODE_V2) {
> 
> Here, which is obviously correct at this point as you only need a GCR3
> table when installing a V2 table on the RID.
> 
>>> @@ -2073,10 +2077,6 @@ static void do_detach(struct iommu_dev_data *dev_data)
>>>  	/* decrease reference counters - needs to happen after the flushes */
>>>  	domain->dev_iommu[iommu->index] -= 1;
>>>  	domain->dev_cnt                 -= 1;
>>> -
>>> -	/* Free per device domain ID */
>>> -	if (domain_id_is_per_dev(domain))
>>> -		domain_id_free(dev_data->gcr3_info.domid);
> 
> And this got moved into the gcr3 free a few lines up that checked the
> pd_mode.
> 
> Jason


* Re: [PATCH v5 14/17] iommu/amd: Refactor GCR3 table helper functions
  2024-01-23  8:54         ` Vasant Hegde
@ 2024-01-25  1:46           ` Jason Gunthorpe
  2024-01-25 12:11             ` Vasant Hegde
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-25  1:46 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Jan 23, 2024 at 02:24:51PM +0530, Vasant Hegde wrote:

> domain_id_is_per_dev() decides how to allocate domain ID. Right now for V2 and
> pass through mode we allocate per-device-domain-ID as they can switch to SVA.

That makes no sense. You don't need a domain id for passthrough mode
unless you are also installing a gcr3 table. You don't need to install
a gcr3 table unless there is a PASID being attached too.

Pre-setting the domain ID to avoid setting it when the GCR3 is later
loaded is spaghetti logic.

> > All this logic should be shared between the pasid and rid attach
> > paths.
> > 
> > The passthrough thing is only an issue of DTE construction.
> > 
> > If you build a DTE with a GCR3 table and RID=IDENTITY then you set
> > some bits, and that is it. Detect that case directly when you build
> > the DTE. It should have no effect on what domain ID is used to tag
> > translations retrieved from a GCR3 table.
> 
> We build DTE as soon as we attach device to domain.
> 
> Now moving domain ID allocation to setup_gcr3_table complicates things.
>   - In attach_device() path we want to allocate domain ID but not GCR3 table
> >     We can allocate GCR3 table, but if we don't use it it's a waste of memory.

It is a waste to allocate the domain id for identity too.

>   - In SVA enablement path we want to allocate GCR3 table
>     But by then domain ID should have been allocated. We don't want to allocate
> another domain_ID and change ID in SVA enablement path as our domain ID is not
> specific to GCR3.

Again, this seems to be a complication that is being created by the
DTE construction. You shouldn't need the caller to carefully sequence
what it is doing. The DTE programming should just install the correct
DTE for the *current state*.

A RID only identity/passthrough DTE does not have a GCR3 table and
does not need a unique domain ID.

> >> We need to handle passthrough as well. It doesn't make sense to allocate and
> >> keep GCR3 table when we are not going to use it.
> > 
> > Then don't, and my diff didn't - but check for the passthrough case
> > directly against the attached domain as identity.
> 
> We don't want to add condition check that depends on code path:
>   like attach_device : allocate domain ID but not GCR3
>   SVA path : Allocate GCR3 but not domain ID.

I'm not saying you should do that, I'm actively saying you should not
do that! All paths should be symmetric.

This matters: if you hope to implement the full driver functionality,
the DTE construction must be clean; there are too many cases
otherwise.

Jason


* Re: [PATCH v5 14/17] iommu/amd: Refactor GCR3 table helper functions
  2024-01-25  1:46           ` Jason Gunthorpe
@ 2024-01-25 12:11             ` Vasant Hegde
  2024-01-25 15:53               ` Jason Gunthorpe
  0 siblings, 1 reply; 31+ messages in thread
From: Vasant Hegde @ 2024-01-25 12:11 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel



On 1/25/2024 7:16 AM, Jason Gunthorpe wrote:
> On Tue, Jan 23, 2024 at 02:24:51PM +0530, Vasant Hegde wrote:
> 
>> domain_id_is_per_dev() decides how to allocate domain ID. Right now for V2 and
>> pass through mode we allocate per-device-domain-ID as they can switch to SVA.
> 
> That makes no sense. You don't need a domain id for passthrough mode
> unless you are also installing a gcr3 table. You don't need to install
> a gcr3 table unless there is a PASID being attached too.
> 
> Pre-setting the domain ID to avoid setting it when the GCR3 is later
> loaded is spaghetti logic.
> 
>>> All this logic should be shared between the pasid and rid attach
>>> paths.
>>>
>>> The passthrough thing is only an issue of DTE construction.
>>>
>>> If you build a DTE with a GCR3 table and RID=IDENTITY then you set
>>> some bits, and that is it. Detect that case directly when you build
>>> the DTE. It should have no effect on what domain ID is used to tag
>>> translations retrieved from a GCR3 table.
>>
>> We build DTE as soon as we attach device to domain.
>>
>> Now moving domain ID allocation to setup_gcr3_table complicates things.
>>   - In attach_device() path we want to allocate domain ID but not GCR3 table
>>     We can allocate GCR3 table, but if we don't use it it's a waste of memory.
> 
> It is a waste to allocate the domain id for identity too.
> 
>>   - In SVA enablement path we want to allocate GCR3 table
>>     But by then domain ID should have been allocated. We don't want to allocate
>> another domain_ID and change ID in SVA enablement path as our domain ID is not
>> specific to GCR3.
> 
> Again, this seems to be a complication that is being created by the
> DTE construction. You shouldn't need the caller to carefully sequence
> what it is doing. The DTE programming should just install the correct
> DTE for the *current state*.
> 
> A RID only identity/passthrough DTE does not have a GCR3 table and
> does not need a unique domain ID.
> 
>>>> We need to handle passthrough as well. It doesn't make sense to allocate and
>>>> keep GCR3 table when we are not going to use it.
>>>
>>> Then don't, and my diff didn't - but check for the passthrough case
>>> directly against the attached domain as identity.
>>
>> We don't want to add condition check that depends on code path:
>>   like attach_device : allocate domain ID but not GCR3
>>   SVA path : Allocate GCR3 but not domain ID.
> 
> I'm not saying you should do that, I'm actively saying you should not
> do that! All paths should be symmetric.

I still think it's not a good thing to do domain ID allocation inside
setup_gcr3_table(). Anyway, I have changed it in v6.


-Vasant


* Re: [PATCH v5 14/17] iommu/amd: Refactor GCR3 table helper functions
  2024-01-25 12:11             ` Vasant Hegde
@ 2024-01-25 15:53               ` Jason Gunthorpe
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-25 15:53 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Thu, Jan 25, 2024 at 05:41:29PM +0530, Vasant Hegde wrote:

> > I'm not saying you should do that, I'm actively saying you should not
> > do that! All paths should be symmetric.
> 
> I still think it's not a good thing to do domain ID allocation inside
> setup_gcr3_table(). Anyway, I have changed it in v6.

Well, we can find out later if you have to change it and then we can
understand why it is not a good thing :)

Jason


end of thread, other threads:[~2024-01-25 15:53 UTC | newest]

Thread overview: 31+ messages
-- links below jump to the message on this page --
2024-01-16 16:53 [PATCH v5 00/17] iommu/amd: SVA Support (part 3) - refactor support for GCR3 table Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 01/17] iommu/amd: Pass struct iommu_dev_data to set_dte_entry() Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 02/17] iommu/amd: Enable Guest Translation before registering devices Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 03/17] iommu/amd: Introduce get_amd_iommu_from_dev() Vasant Hegde
2024-01-19 19:00   ` Jason Gunthorpe
2024-01-16 16:53 ` [PATCH v5 04/17] iommu/amd: Introduce struct protection_domain.pd_mode Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 05/17] iommu/amd: Introduce per-device GCR3 table Vasant Hegde
2024-01-19 19:03   ` Jason Gunthorpe
2024-01-16 16:53 ` [PATCH v5 06/17] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 07/17] iommu/amd: Introduce per-device domain ID to workaround potential TLB aliasing issue Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 08/17] iommu/amd: Add support for device based TLB invalidation Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 09/17] iommu/amd: Rearrange GCR3 table setup code Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 10/17] iommu: Introduce iommu_group_mutex_assert() Vasant Hegde
2024-01-19 19:09   ` Jason Gunthorpe
2024-01-22  6:24     ` Vasant Hegde
2024-01-22 18:00       ` Jason Gunthorpe
2024-01-23  5:27         ` Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 11/17] iommu/amd: Refactor helper function for setting / clearing GCR3 Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 12/17] iommu/amd: Refactor attaching / detaching device functions Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 13/17] iommu/amd: Refactor protection_domain helper functions Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 14/17] iommu/amd: Refactor GCR3 table " Vasant Hegde
2024-01-19 19:59   ` Jason Gunthorpe
2024-01-22 10:23     ` Vasant Hegde
2024-01-22 18:26       ` Jason Gunthorpe
2024-01-23  8:54         ` Vasant Hegde
2024-01-25  1:46           ` Jason Gunthorpe
2024-01-25 12:11             ` Vasant Hegde
2024-01-25 15:53               ` Jason Gunthorpe
2024-01-16 16:53 ` [PATCH v5 15/17] iommu/amd: Remove unused flush pasid functions Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 16/17] iommu/amd: Rearrange device flush code Vasant Hegde
2024-01-16 16:53 ` [PATCH v5 17/17] iommu/amd: Remove unused GCR3 table parameters from struct protection_domain Vasant Hegde
