All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table
@ 2023-08-08 10:02 Vasant Hegde
  2023-08-08 10:02 ` [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu() Vasant Hegde
                   ` (10 more replies)
  0 siblings, 11 replies; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

This is part 2 of the 3-part series to introduce Share Virtual
Address (SVA) support, which focuses on refactoring GCR3 table.
This moves GCR3 related information from protection domain
structure to iommu_dev_data (per device) structure.

It contains the following enhancements:

* Patch 1:
  Rename helper function

* Patch 2 - 8:
  Introduce per device GCR3 table and code refactoring

* Patch 9 - 10:
  Add support for iommu_v2 module to adopt to new interfaces

* Patch 11:
  Remove unused variable from protection domain structure

This patch series is based on top of SVA Part 1 [1] :
  [1] https://lore.kernel.org/linux-iommu/20230804064216.835544-1-vasant.hegde@amd.com/T/#t

This is also available at github :
  https://github.com/AMDESE/linux/tree/iommu_sva_part2_v1_v6.5_rc1

Thank you,
Vasant / Suravee

Suravee Suthikulpanit (9):
  iommu/amd: Rename helper function rlookup_amd_iommu()
  iommu/amd: Introduce struct protection_domain.pd_mode
  iommu/amd: Introduce per-device GCR3 table
  iommu/amd: Refactor helper function for setting / clearing GCR3
  iommu/amd: Refactor helper function for attaching / detaching device
  iommu/amd: Refactor protection_domain helper functions
  iommu/amd: Refactor GCR3 table helper functions
  iommu/amd: Introduce helper functions for AMD IOMMU v2 driver
  iommu/amd: Remove unused GCR3 table parameters from struct protection_domain

Vasant Hegde (2):
  iommu/amd: Use protection_domain.flags to check page table mode
  iommu/amd/iommu_v2: Add support to switch default domain to SVA mode

 drivers/iommu/amd/amd_iommu.h       |  17 +-
 drivers/iommu/amd/amd_iommu_types.h |  27 +-
 drivers/iommu/amd/io_pgtable_v2.c   |  20 +-
 drivers/iommu/amd/iommu.c           | 387 +++++++++++++++++++---------
 drivers/iommu/amd/iommu_v2.c        |  66 ++---
 5 files changed, 307 insertions(+), 210 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu()
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 15:28   ` Jason Gunthorpe
  2023-08-08 10:02 ` [PATCH 02/11] iommu/amd: Introduce struct protection_domain.pd_mode Vasant Hegde
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Rename and make it available to other code.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/iommu.c     | 30 +++++++++++++++---------------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 56996dc2868d..1471ab2dddd1 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -24,6 +24,7 @@ int amd_iommu_init_devices(void);
 void amd_iommu_uninit_devices(void);
 void amd_iommu_init_notifier(void);
 void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
+struct amd_iommu *amd_iommu_rlookup_iommu(struct device *dev);
 
 #ifdef CONFIG_AMD_IOMMU_DEBUGFS
 void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 26e509ffe77a..77b913211c35 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -164,7 +164,7 @@ static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid)
 	return NULL;
 }
 
-static struct amd_iommu *rlookup_amd_iommu(struct device *dev)
+struct amd_iommu *amd_iommu_rlookup_iommu(struct device *dev)
 {
 	u16 seg = get_device_segment(dev);
 	int devid = get_device_sbdf_id(dev);
@@ -218,7 +218,7 @@ static int clone_alias(struct pci_dev *pdev, u16 alias, void *data)
 	if (devid == alias)
 		return 0;
 
-	iommu = rlookup_amd_iommu(&pdev->dev);
+	iommu = amd_iommu_rlookup_iommu(&pdev->dev);
 	if (!iommu)
 		return 0;
 
@@ -469,7 +469,7 @@ static bool check_device(struct device *dev)
 		return false;
 	devid = PCI_SBDF_TO_DEVID(sbdf);
 
-	iommu = rlookup_amd_iommu(dev);
+	iommu = amd_iommu_rlookup_iommu(dev);
 	if (!iommu)
 		return false;
 
@@ -1427,7 +1427,7 @@ static int device_flush_iotlb(struct iommu_dev_data *dev_data,
 	int qdep;
 
 	qdep     = dev_data->ats_qdep;
-	iommu    = rlookup_amd_iommu(dev_data->dev);
+	iommu    = amd_iommu_rlookup_iommu(dev_data->dev);
 	if (!iommu)
 		return -EINVAL;
 
@@ -1454,7 +1454,7 @@ static int device_flush_dte(struct iommu_dev_data *dev_data)
 	u16 alias;
 	int ret;
 
-	iommu = rlookup_amd_iommu(dev_data->dev);
+	iommu = amd_iommu_rlookup_iommu(dev_data->dev);
 	if (!iommu)
 		return -EINVAL;
 
@@ -1822,7 +1822,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 	struct amd_iommu *iommu;
 	bool ats;
 
-	iommu = rlookup_amd_iommu(dev_data->dev);
+	iommu = amd_iommu_rlookup_iommu(dev_data->dev);
 	if (!iommu)
 		return;
 	ats   = dev_data->ats_enabled;
@@ -1852,7 +1852,7 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	struct protection_domain *domain = dev_data->domain;
 	struct amd_iommu *iommu;
 
-	iommu = rlookup_amd_iommu(dev_data->dev);
+	iommu = amd_iommu_rlookup_iommu(dev_data->dev);
 	if (!iommu)
 		return;
 
@@ -1965,7 +1965,7 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
 	if (!check_device(dev))
 		return ERR_PTR(-ENODEV);
 
-	iommu = rlookup_amd_iommu(dev);
+	iommu = amd_iommu_rlookup_iommu(dev);
 	if (!iommu)
 		return ERR_PTR(-ENODEV);
 
@@ -2006,7 +2006,7 @@ static void amd_iommu_release_device(struct device *dev)
 	if (!check_device(dev))
 		return;
 
-	iommu = rlookup_amd_iommu(dev);
+	iommu = amd_iommu_rlookup_iommu(dev);
 	if (!iommu)
 		return;
 
@@ -2033,7 +2033,7 @@ static void update_device_table(struct protection_domain *domain)
 	struct iommu_dev_data *dev_data;
 
 	list_for_each_entry(dev_data, &domain->dev_list, list) {
-		struct amd_iommu *iommu = rlookup_amd_iommu(dev_data->dev);
+		struct amd_iommu *iommu = amd_iommu_rlookup_iommu(dev_data->dev);
 
 		if (!iommu)
 			continue;
@@ -2254,7 +2254,7 @@ static int amd_iommu_attach_device(struct iommu_domain *dom,
 {
 	struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
 	struct protection_domain *domain = to_pdomain(dom);
-	struct amd_iommu *iommu = rlookup_amd_iommu(dev);
+	struct amd_iommu *iommu = amd_iommu_rlookup_iommu(dev);
 	int ret;
 
 	/*
@@ -2405,7 +2405,7 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 		return;
 
 	devid = PCI_SBDF_TO_DEVID(sbdf);
-	iommu = rlookup_amd_iommu(dev);
+	iommu = amd_iommu_rlookup_iommu(dev);
 	if (!iommu)
 		return;
 	pci_seg = iommu->pci_seg;
@@ -2641,7 +2641,7 @@ static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 			continue;
 
 		qdep  = dev_data->ats_qdep;
-		iommu = rlookup_amd_iommu(dev_data->dev);
+		iommu = amd_iommu_rlookup_iommu(dev_data->dev);
 		if (!iommu)
 			continue;
 		build_inv_iotlb_pasid(&cmd, dev_data->devid, pasid,
@@ -2800,7 +2800,7 @@ int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
 	struct iommu_cmd cmd;
 
 	dev_data = dev_iommu_priv_get(&pdev->dev);
-	iommu    = rlookup_amd_iommu(&pdev->dev);
+	iommu    = amd_iommu_rlookup_iommu(&pdev->dev);
 	if (!iommu)
 		return -ENODEV;
 
@@ -2943,7 +2943,7 @@ static int set_remap_table_entry_alias(struct pci_dev *pdev, u16 alias,
 {
 	struct irq_remap_table *table = data;
 	struct amd_iommu_pci_seg *pci_seg;
-	struct amd_iommu *iommu = rlookup_amd_iommu(&pdev->dev);
+	struct amd_iommu *iommu = amd_iommu_rlookup_iommu(&pdev->dev);
 
 	if (!iommu)
 		return -EINVAL;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 02/11] iommu/amd: Introduce struct protection_domain.pd_mode
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
  2023-08-08 10:02 ` [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu() Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 15:30   ` Jason Gunthorpe
  2023-08-08 10:02 ` [PATCH 03/11] iommu/amd: Introduce per-device GCR3 table Vasant Hegde
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This enum variable is used to track the type of page table used by the
protection domain. It will replace the protection_domain.flags in
subsequent series.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h | 7 +++++++
 drivers/iommu/amd/iommu.c           | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 0d339e022572..5e89032b3dee 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -555,6 +555,12 @@ struct amd_io_pgtable {
 	u64			*pgd;		/* v2 pgtable pgd pointer */
 };
 
+enum protection_domain_mode {
+	PD_MODE_PT = 0,
+	PD_MODE_V1,
+	PD_MODE_V2,
+};
+
 /*
  * This structure contains generic data for  IOMMU protection domains
  * independent of their use.
@@ -570,6 +576,7 @@ struct protection_domain {
 	int nid;		/* Node ID */
 	u64 *gcr3_tbl;		/* Guest CR3 table */
 	unsigned long flags;	/* flags to find out type of domain */
+	enum protection_domain_mode pd_mode; /* Track page table type */
 	unsigned dev_cnt;	/* devices assigned to this domain */
 	unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
 };
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 77b913211c35..28b2e93926b5 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2119,6 +2119,7 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 			return -ENOMEM;
 	}
 
+	domain->pd_mode = PD_MODE_V1;
 	amd_iommu_domain_set_pgtable(domain, pt_root, mode);
 
 	return 0;
@@ -2127,6 +2128,7 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 static int protection_domain_init_v2(struct protection_domain *domain)
 {
 	domain->flags |= PD_GIOV_MASK;
+	domain->pd_mode = PD_MODE_V2;
 
 	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
 
@@ -2158,6 +2160,7 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 	switch (type) {
 	/* No need to allocate io pgtable ops in passthrough mode */
 	case IOMMU_DOMAIN_IDENTITY:
+		domain->pd_mode = PD_MODE_PT;
 		return domain;
 	case IOMMU_DOMAIN_DMA:
 	case IOMMU_DOMAIN_DMA_FQ:
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 03/11] iommu/amd: Introduce per-device GCR3 table
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
  2023-08-08 10:02 ` [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu() Vasant Hegde
  2023-08-08 10:02 ` [PATCH 02/11] iommu/amd: Introduce struct protection_domain.pd_mode Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 15:32   ` Jason Gunthorpe
  2023-08-08 10:02 ` [PATCH 04/11] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

AMD IOMMU GCR3 table is indexed by PASID. Each entry stores guest CR3
register value, which is an address to the root of guest IO page table.
The GCR3 table can be programmed per-device. However, Linux AMD IOMMU
driver currently managing the table on a per-domain basis.

PASID is a device feature. When SVA is enabled it will bind PASID to
device, not domain. Hence it makes sense to have per device GCR3 table.

Note that when system is booted with amd_iommu=pgtbl_v2, it will use
PASID 0 for DMA translation. In this case all devices in the domain will
continue to share same GCR3 table.

Introduce struct iommu_dev_data.gcr3_tbl_info to keep track of GCR3 table
configuration. This will eventually replaces gcr3 related variables in
protection_domain structure.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 5e89032b3dee..a275a4bbf8d0 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -547,6 +547,13 @@ struct amd_irte_ops;
 #define io_pgtable_cfg_to_data(x) \
 	container_of((x), struct amd_io_pgtable, pgtbl_cfg)
 
+struct gcr3_tbl_info {
+	u64	*gcr3_tbl;	/* Guest CR3 table */
+	int	glx;		/* Number of levels for GCR3 table */
+	int	pasid_cnt;	/* Track attached PASIDs */
+	bool	giov;		/* GIOV bit support */
+};
+
 struct amd_io_pgtable {
 	struct io_pgtable_cfg	pgtbl_cfg;
 	struct io_pgtable	iop;
@@ -825,6 +832,7 @@ struct iommu_dev_data {
 	struct list_head list;		  /* For domain->dev_list */
 	struct llist_node dev_data_list;  /* For global dev_data_list */
 	struct protection_domain *domain; /* Domain the device is bound to */
+	struct gcr3_tbl_info gcr3_info;   /* Per-device GCR3 table */
 	struct device *dev;
 	u16 devid;			  /* PCI Device ID */
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 04/11] iommu/amd: Use protection_domain.flags to check page table mode
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
                   ` (2 preceding siblings ...)
  2023-08-08 10:02 ` [PATCH 03/11] iommu/amd: Introduce per-device GCR3 table Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 15:35   ` Jason Gunthorpe
  2023-08-08 10:02 ` [PATCH 05/11] iommu/amd: Refactor helper function for setting / clearing GCR3 Vasant Hegde
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

Page table mode (v1, v2 or pt) is per domain property. Recently we have
enhanced protection_domain.pd_mode to track per domain page table mode.
Use that variable to check the page table mode instead of global
'amd_iommu_pgtable' in {map/unmap}_pages path.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 28b2e93926b5..6f6311e8b004 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2307,7 +2307,7 @@ static int amd_iommu_map_pages(struct iommu_domain *dom, unsigned long iova,
 	int prot = 0;
 	int ret = -EINVAL;
 
-	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+	if ((domain->pd_mode == PD_MODE_V1) &&
 	    (domain->iop.mode == PAGE_MODE_NONE))
 		return -EINVAL;
 
@@ -2353,7 +2353,7 @@ static size_t amd_iommu_unmap_pages(struct iommu_domain *dom, unsigned long iova
 	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
 	size_t r;
 
-	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+	if ((domain->pd_mode == PD_MODE_V1) &&
 	    (domain->iop.mode == PAGE_MODE_NONE))
 		return 0;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 05/11] iommu/amd: Refactor helper function for setting / clearing GCR3
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
                   ` (3 preceding siblings ...)
  2023-08-08 10:02 ` [PATCH 04/11] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 10:02 ` [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device Vasant Hegde
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Refactor GCR3 helper functions in preparation to use per device
GCR3 table.
  * Use new per device GCR3 table to set/clear the gcr3 entries.

  * Add internal functions which will be used by subsequent patches
    to set/clear default gcr3 entries.

  * Remove per domain default GCR3 setup during v2 page table allocation.
    Subsequent patch will add support to setup default gcr3 while
    attaching device to domain.

  * Remove amd_iommu_domain_update() from V2 page table path as device
    detach path will take care of updating the domain.

  * Rename functions to reflect its usage.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h     |  8 ++--
 drivers/iommu/amd/io_pgtable_v2.c | 20 ++------
 drivers/iommu/amd/iommu.c         | 79 ++++++++++++++++++-------------
 drivers/iommu/amd/iommu_v2.c      |  7 +--
 4 files changed, 59 insertions(+), 55 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 1471ab2dddd1..c9a35b6e2a87 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -59,6 +59,11 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
 int amd_iommu_pdev_enable_cap_pri(struct pci_dev *pdev);
 void amd_iommu_pdev_disable_cap_pri(struct pci_dev *pdev);
 
+/* GCR3 setup */
+int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data,
+		       u32 pasid, unsigned long gcr3);
+int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, u32 pasid);
+
 int amd_iommu_register_ppr_notifier(struct notifier_block *nb);
 int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb);
 void amd_iommu_domain_direct_map(struct iommu_domain *dom);
@@ -69,9 +74,6 @@ void amd_iommu_domain_update(struct protection_domain *domain);
 void amd_iommu_domain_flush_complete(struct protection_domain *domain);
 void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain);
 int amd_iommu_flush_tlb(struct protection_domain *domain, u32 pasid);
-int amd_iommu_domain_set_gcr3(struct protection_domain *domain, u32 pasid,
-			      unsigned long cr3);
-int amd_iommu_domain_clear_gcr3(struct protection_domain *domain, u32 pasid);
 
 #ifdef CONFIG_IRQ_REMAP
 int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
diff --git a/drivers/iommu/amd/io_pgtable_v2.c b/drivers/iommu/amd/io_pgtable_v2.c
index c17cda83bca5..8c588f93cbed 100644
--- a/drivers/iommu/amd/io_pgtable_v2.c
+++ b/drivers/iommu/amd/io_pgtable_v2.c
@@ -360,34 +360,25 @@ static void v2_free_pgtable(struct io_pgtable *iop)
 	struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
 
 	pdom = container_of(pgtable, struct protection_domain, iop);
-	if (!(pdom->flags & PD_IOMMUV2_MASK))
-		return;
-
-	/* Clear gcr3 entry */
-	amd_iommu_domain_clear_gcr3(pdom, 0);
 
-	/* Make changes visible to IOMMUs */
-	amd_iommu_domain_update(pdom);
+	if (!pgtable->pgd)
+		return;
 
 	/* Free page table */
 	free_pgtable(pgtable->pgd, get_pgtable_level());
+	pgtable->pgd = NULL;
 }
 
 static struct io_pgtable *v2_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
 {
 	struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);
 	struct protection_domain *pdom = (struct protection_domain *)cookie;
-	int ret;
 	int ias = IOMMU_IN_ADDR_BIT_SIZE;
 
 	pgtable->pgd = alloc_pgtable_page(pdom->nid, GFP_ATOMIC);
 	if (!pgtable->pgd)
 		return NULL;
 
-	ret = amd_iommu_domain_set_gcr3(pdom, 0, iommu_virt_to_phys(pgtable->pgd));
-	if (ret)
-		goto err_free_pgd;
-
 	if (get_pgtable_level() == PAGE_MODE_5_LEVEL)
 		ias = 57;
 
@@ -401,11 +392,6 @@ static struct io_pgtable *v2_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
 	cfg->tlb           = &v2_flush_ops;
 
 	return &pgtable->iop;
-
-err_free_pgd:
-	free_pgtable_page(pgtable->pgd);
-
-	return NULL;
 }
 
 struct io_pgtable_init_fns io_pgtable_amd_iommu_v2_init_fns = {
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6f6311e8b004..cc42732820dd 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -78,6 +78,11 @@ struct kmem_cache *amd_iommu_irq_cache;
 
 static void detach_device(struct device *dev);
 
+static int __set_gcr3(struct iommu_dev_data *dev_data,
+		       u32 pasid, unsigned long gcr3);
+
+static int __clear_gcr3(struct iommu_dev_data *dev_data, u32 pasid);
+
 /****************************************************************************
  *
  * Helper functions
@@ -2735,65 +2740,75 @@ static u64 *__get_gcr3_pte(u64 *root, int level, u32 pasid, bool alloc)
 	return pte;
 }
 
-static int __set_gcr3(struct protection_domain *domain, u32 pasid,
-		      unsigned long cr3)
+static int __set_gcr3(struct iommu_dev_data *dev_data,
+		      u32 pasid, unsigned long gcr3)
 {
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
 	u64 *pte;
 
-	if (domain->iop.mode != PAGE_MODE_NONE)
-		return -EINVAL;
+	lockdep_assert_held(&dev_data->lock);
 
-	pte = __get_gcr3_pte(domain->gcr3_tbl, domain->glx, pasid, true);
+	pte = __get_gcr3_pte(gcr3_info->gcr3_tbl,
+			     gcr3_info->glx, pasid, true);
 	if (pte == NULL)
 		return -ENOMEM;
 
-	*pte = (cr3 & PAGE_MASK) | GCR3_VALID;
+	*pte = (gcr3 & PAGE_MASK) | GCR3_VALID;
+	__amd_iommu_flush_tlb(dev_data->domain, pasid);
 
-	return __amd_iommu_flush_tlb(domain, pasid);
+	return 0;
 }
 
-static int __clear_gcr3(struct protection_domain *domain, u32 pasid)
+int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data, u32 pasid,
+		       unsigned long gcr3)
 {
-	u64 *pte;
-
-	if (domain->iop.mode != PAGE_MODE_NONE)
-		return -EINVAL;
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	int ret;
 
-	pte = __get_gcr3_pte(domain->gcr3_tbl, domain->glx, pasid, false);
-	if (pte == NULL)
-		return 0;
+	spin_lock(&dev_data->lock);
 
-	*pte = 0;
+	ret = __set_gcr3(dev_data, pasid, gcr3);
+	if (!ret)
+		gcr3_info->pasid_cnt++;
 
-	return __amd_iommu_flush_tlb(domain, pasid);
+	spin_unlock(&dev_data->lock);
+	return ret;
 }
+EXPORT_SYMBOL(amd_iommu_set_gcr3);
 
-int amd_iommu_domain_set_gcr3(struct protection_domain *domain, u32 pasid,
-			      unsigned long cr3)
+static int __clear_gcr3(struct iommu_dev_data *dev_data, u32 pasid)
 {
-	unsigned long flags;
-	int ret;
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	u64 *pte;
 
-	spin_lock_irqsave(&domain->lock, flags);
-	ret = __set_gcr3(domain, pasid, cr3);
-	spin_unlock_irqrestore(&domain->lock, flags);
+	lockdep_assert_held(&dev_data->lock);
 
-	return ret;
+	pte = __get_gcr3_pte(gcr3_info->gcr3_tbl,
+			     gcr3_info->glx, pasid, false);
+	if (pte == NULL)
+		return -EINVAL;
+
+	*pte = 0;
+	__amd_iommu_flush_tlb(dev_data->domain, pasid);
+
+	return 0;
 }
-EXPORT_SYMBOL(amd_iommu_domain_set_gcr3);
 
-int amd_iommu_domain_clear_gcr3(struct protection_domain *domain, u32 pasid)
+int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, u32 pasid)
 {
-	unsigned long flags;
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
 	int ret;
 
-	spin_lock_irqsave(&domain->lock, flags);
-	ret = __clear_gcr3(domain, pasid);
-	spin_unlock_irqrestore(&domain->lock, flags);
+	spin_lock(&dev_data->lock);
 
+	ret = __clear_gcr3(dev_data, pasid);
+	if (!ret)
+		gcr3_info->pasid_cnt--;
+
+	spin_unlock(&dev_data->lock);
 	return ret;
 }
-EXPORT_SYMBOL(amd_iommu_domain_clear_gcr3);
+EXPORT_SYMBOL(amd_iommu_clear_gcr3);
 
 int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
 			   int status, int tag)
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 4148cbf069dc..8453b2d9d27b 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -271,6 +271,7 @@ static void put_pasid_state_wait(struct pasid_state *pasid_state)
 static void unbind_pasid(struct pasid_state *pasid_state)
 {
 	struct device_state *dev_state = pasid_state->device_state;
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&dev_state->pdev->dev);
 
 	/*
 	 * Mark pasid_state as invalid, no more faults will we added to the
@@ -282,7 +283,7 @@ static void unbind_pasid(struct pasid_state *pasid_state)
 	smp_wmb();
 
 	/* After this the device/pasid can't access the mm anymore */
-	amd_iommu_domain_clear_gcr3(dev_state->pdom, pasid_state->pasid);
+	amd_iommu_clear_gcr3(dev_data, pasid_state->pasid);
 
 	/* Make sure no more pending faults are in the queue */
 	flush_workqueue(iommu_wq);
@@ -605,6 +606,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, u32 pasid,
 	struct mm_struct *mm;
 	u32 sbdf;
 	int ret;
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
 
 	might_sleep();
 
@@ -650,8 +652,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, u32 pasid,
 	if (ret)
 		goto out_unregister;
 
-	ret = amd_iommu_domain_set_gcr3(dev_state->pdom, pasid,
-					__pa(pasid_state->mm->pgd));
+	ret = amd_iommu_set_gcr3(dev_data, pasid, __pa(pasid_state->mm->pgd));
 	if (ret)
 		goto out_clear_state;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
                   ` (4 preceding siblings ...)
  2023-08-08 10:02 ` [PATCH 05/11] iommu/amd: Refactor helper function for setting / clearing GCR3 Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 15:39   ` Jason Gunthorpe
  2023-08-08 10:02 ` [PATCH 07/11] iommu/amd: Refactor protection_domain helper functions Vasant Hegde
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

To use the new helper function for setting up GCR3 table.

If system is booted with V2 page table then setup default GCR3 with
domain GCR3 pointer. So that all devices in the domain uses same page
table for translation. Also return page table setup status from
do_attach() function.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 54 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index cc42732820dd..4fed3fbe069e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1821,15 +1821,41 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
 	amd_iommu_apply_erratum_63(iommu, devid);
 }
 
-static void do_attach(struct iommu_dev_data *dev_data,
-		      struct protection_domain *domain)
+/*
+ * Note: This is currently used when booting w/ amd_iommu=pgtbl_v2
+ */
+static int default_gcr3_init(struct iommu_dev_data *dev_data)
+{
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+
+	/* By default, GCR3 is set to support non-PASID devices. */
+	gcr3_info->giov = true;
+
+	/*
+	 * By default, setup GCR3 table to support MAX PASIDs
+	 * support by the IOMMU HW.
+	 */
+	return setup_gcr3_table(dev_data->domain, -1);
+}
+
+static inline void default_gcr3_destroy(struct iommu_dev_data *dev_data)
+{
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+
+	gcr3_info->giov = false;
+	free_gcr3_table(dev_data->domain);
+}
+
+static int do_attach(struct iommu_dev_data *dev_data,
+		     struct protection_domain *domain)
 {
 	struct amd_iommu *iommu;
 	bool ats;
+	int ret = 0;
 
 	iommu = amd_iommu_rlookup_iommu(dev_data->dev);
 	if (!iommu)
-		return;
+		return -EINVAL;
 	ats   = dev_data->ats_enabled;
 
 	/* Update data structures */
@@ -1844,12 +1870,27 @@ static void do_attach(struct iommu_dev_data *dev_data,
 	domain->dev_iommu[iommu->index] += 1;
 	domain->dev_cnt                 += 1;
 
+	if (domain->pd_mode == PD_MODE_V2) {
+		ret = default_gcr3_init(dev_data);
+		if (ret)
+			return ret;
+
+		ret = __set_gcr3(dev_data, 0,
+				 iommu_virt_to_phys(domain->iop.pgd));
+		if (ret) {
+			default_gcr3_destroy(dev_data);
+			return ret;
+		}
+	}
+
 	/* Update device table */
 	set_dte_entry(iommu, dev_data->devid, domain,
 		      ats, dev_data->ppr);
 	clone_aliases(iommu, dev_data->dev);
 
 	device_flush_dte(dev_data);
+
+	return ret;
 }
 
 static void do_detach(struct iommu_dev_data *dev_data)
@@ -1861,6 +1902,11 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	if (!iommu)
 		return;
 
+	if (domain->pd_mode == PD_MODE_V2) {
+		__clear_gcr3(dev_data, 0);
+		default_gcr3_destroy(dev_data);
+	}
+
 	/* Update data structures */
 	dev_data->domain = NULL;
 	list_del(&dev_data->list);
@@ -1906,7 +1952,7 @@ static int attach_device(struct device *dev,
 	if (dev_is_pci(dev))
 		pdev_enable_caps(to_pci_dev(dev));
 
-	do_attach(dev_data, domain);
+	ret = do_attach(dev_data, domain);
 
 	/*
 	 * We might boot into a crash-kernel here. The crashed kernel
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 07/11] iommu/amd: Refactor protection_domain helper functions
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
                   ` (5 preceding siblings ...)
  2023-08-08 10:02 ` [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 10:02 ` [PATCH 08/11] iommu/amd: Refactor GCR3 table " Vasant Hegde
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

To removes the code to setup GCR3 table, and only handle domain
create / destroy, since GCR3 is no longer part of a domain.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4fed3fbe069e..c9318be9fe4e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2146,9 +2146,6 @@ static void protection_domain_free(struct protection_domain *domain)
 	if (domain->iop.pgtbl_cfg.tlb)
 		free_io_pgtable_ops(&domain->iop.iop.ops);
 
-	if (domain->flags & PD_IOMMUV2_MASK)
-		free_gcr3_table(domain);
-
 	if (domain->iop.root)
 		free_page((unsigned long)domain->iop.root);
 
@@ -2176,15 +2173,10 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 	return 0;
 }
 
-static int protection_domain_init_v2(struct protection_domain *domain)
+static int protection_domain_init_v2(struct protection_domain *pdom)
 {
-	domain->flags |= PD_GIOV_MASK;
-	domain->pd_mode = PD_MODE_V2;
-
-	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
-
-	if (setup_gcr3_table(domain, 1))
-		return -ENOMEM;
+	pdom->pd_mode = PD_MODE_V2;
+	pdom->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
 
 	return 0;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 08/11] iommu/amd: Refactor GCR3 table helper functions
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
                   ` (6 preceding siblings ...)
  2023-08-08 10:02 ` [PATCH 07/11] iommu/amd: Refactor protection_domain helper functions Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 10:02 ` [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver Vasant Hegde
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

To use the new per-device struct gcr3_tbl_info. Also modify
set_dte_entry() to use new per device GCR3 table.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 75 +++++++++++++++++++++++++--------------
 1 file changed, 48 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c9318be9fe4e..bba2034a28f6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -78,6 +78,11 @@ struct kmem_cache *amd_iommu_irq_cache;
 
 static void detach_device(struct device *dev);
 
+static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
+			  struct gcr3_tbl_info *gcr3_info,
+			  struct protection_domain *domain, bool ats,
+			  enum ppr_handlers ppr);
+
 static int __set_gcr3(struct iommu_dev_data *dev_data,
 		       u32 pasid, unsigned long gcr3);
 
@@ -1679,16 +1684,27 @@ static void free_gcr3_tbl_level2(u64 *tbl)
 	}
 }
 
-static void free_gcr3_table(struct protection_domain *domain)
+static void free_gcr3_table(struct iommu_dev_data *dev_data)
 {
-	if (domain->glx == 2)
-		free_gcr3_tbl_level2(domain->gcr3_tbl);
-	else if (domain->glx == 1)
-		free_gcr3_tbl_level1(domain->gcr3_tbl);
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	struct amd_iommu *iommu = amd_iommu_rlookup_iommu(dev_data->dev);
+
+	if (gcr3_info->glx == 2)
+		free_gcr3_tbl_level2(gcr3_info->gcr3_tbl);
+	else if (gcr3_info->glx == 1)
+		free_gcr3_tbl_level1(gcr3_info->gcr3_tbl);
 	else
-		BUG_ON(domain->glx != 0);
+		WARN_ON_ONCE(gcr3_info->glx != 0);
+
+	gcr3_info->glx = 0;
+
+	set_dte_entry(iommu, dev_data->devid, NULL, dev_data->domain,
+		      dev_data->ats_enabled, dev_data->ppr);
+	clone_aliases(iommu, dev_data->dev);
+	device_flush_dte(dev_data);
 
-	free_page((unsigned long)domain->gcr3_tbl);
+	free_page((unsigned long)gcr3_info->gcr3_tbl);
+	gcr3_info->gcr3_tbl = NULL;
 }
 
 /*
@@ -1703,27 +1719,35 @@ static int get_gcr3_levels(int pasids)
 	return (DIV_ROUND_UP(get_count_order(pasids), 9) - 1);
 }
 
-/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
-static int setup_gcr3_table(struct protection_domain *domain, int pasids)
+static int setup_gcr3_table(struct iommu_dev_data *dev_data, int pasids)
 {
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	struct amd_iommu *iommu = amd_iommu_rlookup_iommu(dev_data->dev);
 	int levels = get_gcr3_levels(pasids);
 
 	if (levels > amd_iommu_max_glx_val)
 		return -EINVAL;
 
-	domain->gcr3_tbl = alloc_pgtable_page(domain->nid, GFP_ATOMIC);
-	if (domain->gcr3_tbl == NULL)
+	if (gcr3_info->gcr3_tbl)
+		return -EBUSY;
+
+	gcr3_info->gcr3_tbl = alloc_pgtable_page(dev_to_node(dev_data->dev),
+						 GFP_ATOMIC);
+	if (gcr3_info->gcr3_tbl == NULL)
 		return -ENOMEM;
 
-	domain->glx      = levels;
-	domain->flags   |= PD_IOMMUV2_MASK;
+	gcr3_info->glx = levels;
 
-	amd_iommu_domain_update(domain);
+	set_dte_entry(iommu, dev_data->devid, &dev_data->gcr3_info,
+		      dev_data->domain, dev_data->ats_enabled, dev_data->ppr);
+	clone_aliases(iommu, dev_data->dev);
+	device_flush_dte(dev_data);
 
 	return 0;
 }
 
 static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
+			  struct gcr3_tbl_info *gcr3_info,
 			  struct protection_domain *domain, bool ats,
 			  enum ppr_handlers ppr)
 {
@@ -1755,9 +1779,9 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 	if (ppr != PPR_HANDLER_NONE)
 		pte_root |= 1ULL << DEV_ENTRY_PPR;
 
-	if (domain->flags & PD_IOMMUV2_MASK) {
-		u64 gcr3 = iommu_virt_to_phys(domain->gcr3_tbl);
-		u64 glx  = domain->glx;
+	if (gcr3_info && gcr3_info->gcr3_tbl) {
+		u64 gcr3 = iommu_virt_to_phys(gcr3_info->gcr3_tbl);
+		u64 glx  = gcr3_info->glx;
 		u64 tmp;
 
 		pte_root |= DTE_FLAG_GV;
@@ -1785,7 +1809,7 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 				((u64)GUEST_PGTABLE_5_LEVEL << DTE_GPT_LEVEL_SHIFT);
 		}
 
-		if (domain->flags & PD_GIOV_MASK)
+		if (gcr3_info->giov)
 			pte_root |= DTE_FLAG_GIOV;
 	}
 
@@ -1835,7 +1859,7 @@ static int default_gcr3_init(struct iommu_dev_data *dev_data)
 	 * By default, setup GCR3 table to support MAX PASIDs
 	 * support by the IOMMU HW.
 	 */
-	return setup_gcr3_table(dev_data->domain, -1);
+	return setup_gcr3_table(dev_data, -1);
 }
 
 static inline void default_gcr3_destroy(struct iommu_dev_data *dev_data)
@@ -1843,7 +1867,7 @@ static inline void default_gcr3_destroy(struct iommu_dev_data *dev_data)
 	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
 
 	gcr3_info->giov = false;
-	free_gcr3_table(dev_data->domain);
+	free_gcr3_table(dev_data);
 }
 
 static int do_attach(struct iommu_dev_data *dev_data,
@@ -1884,8 +1908,8 @@ static int do_attach(struct iommu_dev_data *dev_data,
 	}
 
 	/* Update device table */
-	set_dte_entry(iommu, dev_data->devid, domain,
-		      ats, dev_data->ppr);
+	set_dte_entry(iommu, dev_data->devid, &dev_data->gcr3_info,
+		      domain, ats, dev_data->ppr);
 	clone_aliases(iommu, dev_data->dev);
 
 	device_flush_dte(dev_data);
@@ -2088,8 +2112,8 @@ static void update_device_table(struct protection_domain *domain)
 
 		if (!iommu)
 			continue;
-		set_dte_entry(iommu, dev_data->devid, domain,
-			      dev_data->ats_enabled, dev_data->ppr);
+		set_dte_entry(iommu, dev_data->devid, &dev_data->gcr3_info,
+			      domain, dev_data->ats_enabled, dev_data->ppr);
 		clone_aliases(iommu, dev_data->dev);
 	}
 }
@@ -2637,9 +2661,6 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
 	if (pdom->dev_cnt > 0 || pdom->flags & PD_IOMMUV2_MASK)
 		goto out;
 
-	if (!pdom->gcr3_tbl)
-		ret = setup_gcr3_table(pdom, pasids);
-
 out:
 	spin_unlock_irqrestore(&pdom->lock, flags);
 	return ret;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
                   ` (7 preceding siblings ...)
  2023-08-08 10:02 ` [PATCH 08/11] iommu/amd: Refactor GCR3 table " Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 15:49   ` Jason Gunthorpe
  2023-08-08 10:02 ` [PATCH 10/11] iommu/amd/iommu_v2: Add support to switch default domain to SVA mode Vasant Hegde
  2023-08-08 10:02 ` [PATCH 11/11] iommu/amd: Remove unused GCR3 table parameters from struct protection_domain Vasant Hegde
  10 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Introducing new functions to init and uninit v2 domain. Also setup /
clear GCR3 table. These functions will make necessary changes to
existing domain to switch to SVA mode. iommu_v2 module will call these
new functions instead of setting up new domain to enable SVA mode.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h |  6 +++
 drivers/iommu/amd/iommu.c     | 95 +++++++++++++++++++++++++++++++++++
 2 files changed, 101 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index c9a35b6e2a87..f5a9697488f8 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -64,6 +64,12 @@ int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data,
 		       u32 pasid, unsigned long gcr3);
 int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, u32 pasid);
 
+/* v2 API init */
+int amd_iommu_v2api_domain_init(struct protection_domain *pdom);
+void amd_iommu_v2api_domain_uninit(struct protection_domain *pdom);
+int amd_iommu_v2api_gcr3_init(struct pci_dev *pdev, int pasids);
+int amd_iommu_v2api_gcr3_uninit(struct pci_dev *pdev);
+
 int amd_iommu_register_ppr_notifier(struct notifier_block *nb);
 int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb);
 void amd_iommu_domain_direct_map(struct iommu_domain *dom);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index bba2034a28f6..b56d1e624cf6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2179,6 +2179,101 @@ static void protection_domain_free(struct protection_domain *domain)
 	kfree(domain);
 }
 
+/*******************************
+ * V2API-related helper functions
+ * Note: Separate these out for now. It will be removed when
+ *       deprecate the AMD IOMMU v2 API support.
+ */
+int amd_iommu_v2api_domain_init(struct protection_domain *pdom)
+{
+	unsigned long flags;
+	int ret = 0;
+
+	spin_lock_irqsave(&pdom->lock, flags);
+
+	/* Only allow v2API support on pass-through domain. */
+	if (pdom->pd_mode != PD_MODE_PT) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	pdom->pd_mode = PD_MODE_V2;
+
+out:
+	spin_unlock_irqrestore(&pdom->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(amd_iommu_v2api_domain_init);
+
+void amd_iommu_v2api_domain_uninit(struct protection_domain *pdom)
+{
+	unsigned long flags;
+
+	if (!pdom)
+		return;
+
+	spin_lock_irqsave(&pdom->lock, flags);
+	/* Switch back to pass-through mode */
+	pdom->pd_mode = PD_MODE_PT;
+	spin_unlock_irqrestore(&pdom->lock, flags);
+}
+EXPORT_SYMBOL(amd_iommu_v2api_domain_uninit);
+
+int amd_iommu_v2api_gcr3_init(struct pci_dev *pdev, int pasids)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&dev_data->lock, flags);
+
+	/* V2API mode needs ATS, PASID and PRI support */
+	if (!dev_data->ats_enabled || !dev_data->pri_enabled ||
+	    !dev_data->pasid_enabled) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/*
+	 * V2API is supported on passthrough domain only.
+	 * Hence we cannot support GIOV for V2API.
+	 */
+	gcr3_info->giov = false;
+	gcr3_info->pasid_cnt = 0;
+
+	/* Allocate GCR3 table */
+	ret = setup_gcr3_table(dev_data, pasids);
+
+out:
+	spin_unlock_irqrestore(&dev_data->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(amd_iommu_v2api_gcr3_init);
+
+int amd_iommu_v2api_gcr3_uninit(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+	struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
+	unsigned long flags;
+	int ret = 0;
+
+	spin_lock_irqsave(&dev_data->lock, flags);
+
+	if (gcr3_info->pasid_cnt) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	free_gcr3_table(dev_data);
+
+out:
+	spin_unlock_irqrestore(&dev_data->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(amd_iommu_v2api_gcr3_uninit);
+
+
 static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 {
 	u64 *pt_root = NULL;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 10/11] iommu/amd/iommu_v2: Add support to switch default domain to SVA mode
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
                   ` (8 preceding siblings ...)
  2023-08-08 10:02 ` [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  2023-08-08 15:51   ` Jason Gunthorpe
  2023-08-08 10:02 ` [PATCH 11/11] iommu/amd: Remove unused GCR3 table parameters from struct protection_domain Vasant Hegde
  10 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

Current iommu_v2 module allocates a secondary domain every time a
device driver calls amd_iommu_init_device(). Then it detaches all
devices in the group from the default domain, and reattaches to the
new domain. Finally it configures V2API mode.

Previous patch added support to switch same domain to SVA mode. Use
these new interfaces to enable SVA mode.

Note that even with this change, PASID is managed by device driver. This
imposes restriction where there can be only one SVA enabled device per
IOMMU group (same as current behaviour).

Finally remove unused functions.

Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h |  2 --
 drivers/iommu/amd/iommu.c     | 37 ----------------------
 drivers/iommu/amd/iommu_v2.c  | 59 +++++++----------------------------
 3 files changed, 12 insertions(+), 86 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index f5a9697488f8..2ec335ac0b1c 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -72,8 +72,6 @@ int amd_iommu_v2api_gcr3_uninit(struct pci_dev *pdev);
 
 int amd_iommu_register_ppr_notifier(struct notifier_block *nb);
 int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb);
-void amd_iommu_domain_direct_map(struct iommu_domain *dom);
-int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
 int amd_iommu_flush_page(struct protection_domain *domain, u32 pasid, u64 address);
 void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 void amd_iommu_domain_update(struct protection_domain *domain);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b56d1e624cf6..77b87511e085 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2725,43 +2725,6 @@ int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(amd_iommu_unregister_ppr_notifier);
 
-void amd_iommu_domain_direct_map(struct iommu_domain *dom)
-{
-	struct protection_domain *domain = to_pdomain(dom);
-	unsigned long flags;
-
-	spin_lock_irqsave(&domain->lock, flags);
-
-	if (domain->iop.pgtbl_cfg.tlb)
-		free_io_pgtable_ops(&domain->iop.iop.ops);
-
-	spin_unlock_irqrestore(&domain->lock, flags);
-}
-EXPORT_SYMBOL(amd_iommu_domain_direct_map);
-
-int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
-{
-	struct protection_domain *pdom = to_pdomain(dom);
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&pdom->lock, flags);
-
-	/*
-	 * Save us all sanity checks whether devices already in the
-	 * domain support IOMMUv2. Just force that the domain has no
-	 * devices attached when it is switched into IOMMUv2 mode.
-	 */
-	ret = -EBUSY;
-	if (pdom->dev_cnt > 0 || pdom->flags & PD_IOMMUV2_MASK)
-		goto out;
-
-out:
-	spin_unlock_irqrestore(&pdom->lock, flags);
-	return ret;
-}
-EXPORT_SYMBOL(amd_iommu_domain_enable_v2);
-
 static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 			 u64 address, bool size)
 {
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 8453b2d9d27b..f05bd9156c8d 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -111,9 +111,6 @@ static struct device_state *get_device_state(u32 sbdf)
 
 static void free_device_state(struct device_state *dev_state)
 {
-	struct iommu_group *group;
-	struct iommu_domain *domain = &dev_state->pdom->domain;
-
 	/* Get rid of any remaining pasid states */
 	free_pasid_states(dev_state);
 
@@ -123,20 +120,8 @@ static void free_device_state(struct device_state *dev_state)
 	 */
 	wait_event(dev_state->wq, !atomic_read(&dev_state->count));
 
-	/*
-	 * First detach device from domain - No more PRI requests will arrive
-	 * from that device after it is unbound from the IOMMUv2 domain.
-	 */
-	group = iommu_group_get(&dev_state->pdev->dev);
-	if (WARN_ON(!group))
-		return;
-
-	iommu_detach_group(domain, group);
-
-	iommu_group_put(group);
-
-	/* Everything is down now, free the IOMMUv2 domain */
-	iommu_domain_free(domain);
+	amd_iommu_v2api_gcr3_uninit(dev_state->pdev);
+	amd_iommu_v2api_domain_uninit(dev_state->pdom);
 
 	/* Finally get rid of the device-state */
 	kfree(dev_state);
@@ -735,9 +720,7 @@ EXPORT_SYMBOL(amd_iommu_unbind_pasid);
 
 int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 {
-	struct iommu_domain *domain;
 	struct device_state *dev_state;
-	struct iommu_group *group;
 	unsigned long flags;
 	int ret, tmp;
 	u32 sbdf;
@@ -774,6 +757,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 	init_waitqueue_head(&dev_state->wq);
 	dev_state->pdev  = pdev;
 	dev_state->sbdf = sbdf;
+	dev_state->pdom = dev_data->domain;
 
 	tmp = pasids;
 	for (dev_state->pasid_levels = 0; (tmp - 1) & ~0x1ff; tmp >>= 9)
@@ -787,39 +771,20 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 	if (dev_state->states == NULL)
 		goto out_free_dev_state;
 
-	domain = iommu_domain_alloc(&pci_bus_type);
-	if (domain == NULL)
+	ret = amd_iommu_v2api_domain_init(dev_data->domain);
+	if (ret)
 		goto out_free_states;
 
-	/* Retrieve new protection_domain that has just been allocated */
-	dev_state->pdom = to_pdomain(domain);
-
-	/* See iommu_is_default_domain() */
-	domain->type = IOMMU_DOMAIN_IDENTITY;
-	amd_iommu_domain_direct_map(&dev_state->pdom->domain);
-
-	ret = amd_iommu_domain_enable_v2(domain, pasids);
+	ret = amd_iommu_v2api_gcr3_init(pdev, pasids);
 	if (ret)
-		goto out_free_domain;
-
-	group = iommu_group_get(&pdev->dev);
-	if (!group) {
-		ret = -EINVAL;
-		goto out_free_domain;
-	}
-
-	ret = iommu_attach_group(domain, group);
-	if (ret != 0)
-		goto out_drop_group;
-
-	iommu_group_put(group);
+		goto out_uninit_domain;
 
 	spin_lock_irqsave(&state_lock, flags);
 
 	if (__get_device_state(sbdf) != NULL) {
 		spin_unlock_irqrestore(&state_lock, flags);
 		ret = -EBUSY;
-		goto out_free_domain;
+		goto out_uninit_gcr3;
 	}
 
 	list_add_tail(&dev_state->list, &state_list);
@@ -828,11 +793,11 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 
 	return 0;
 
-out_drop_group:
-	iommu_group_put(group);
+out_uninit_gcr3:
+	amd_iommu_v2api_gcr3_uninit(dev_state->pdev);
 
-out_free_domain:
-	iommu_domain_free(domain);
+out_uninit_domain:
+	amd_iommu_v2api_domain_uninit(dev_state->pdom);
 
 out_free_states:
 	free_page((unsigned long)dev_state->states);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 11/11] iommu/amd: Remove unused GCR3 table parameters from struct protection_domain
  2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
                   ` (9 preceding siblings ...)
  2023-08-08 10:02 ` [PATCH 10/11] iommu/amd/iommu_v2: Add support to switch default domain to SVA mode Vasant Hegde
@ 2023-08-08 10:02 ` Vasant Hegde
  10 siblings, 0 replies; 35+ messages in thread
From: Vasant Hegde @ 2023-08-08 10:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Since they are moved to struct iommu_dev_data, and the driver has been
ported to use them.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h | 12 ------------
 drivers/iommu/amd/iommu.c           |  2 +-
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index a275a4bbf8d0..cbc36a81c9cc 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -442,15 +442,6 @@
 
 #define MAX_DOMAIN_ID 65536
 
-/* Protection domain flags */
-#define PD_DMA_OPS_MASK		BIT(0) /* domain used for dma_ops */
-#define PD_DEFAULT_MASK		BIT(1) /* domain is a default dma_ops
-					      domain for an IOMMU */
-#define PD_PASSTHROUGH_MASK	BIT(2) /* domain has no page
-					      translation */
-#define PD_IOMMUV2_MASK		BIT(3) /* domain has gcr3 table */
-#define PD_GIOV_MASK		BIT(4) /* domain enable GIOV support */
-
 /* Timeout stuff */
 #define LOOP_TIMEOUT		100000
 #define MMIO_STATUS_TIMEOUT	2000000
@@ -579,10 +570,7 @@ struct protection_domain {
 	struct amd_io_pgtable iop;
 	spinlock_t lock;	/* mostly used to lock the page table*/
 	u16 id;			/* the domain id written to the device table */
-	int glx;		/* Number of levels for GCR3 table */
 	int nid;		/* Node ID */
-	u64 *gcr3_tbl;		/* Guest CR3 table */
-	unsigned long flags;	/* flags to find out type of domain */
 	enum protection_domain_mode pd_mode; /* Track page table type */
 	unsigned dev_cnt;	/* devices assigned to this domain */
 	unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 77b87511e085..8434cbeff085 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2732,7 +2732,7 @@ static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 	struct iommu_cmd cmd;
 	int i, ret;
 
-	if (!(domain->flags & PD_IOMMUV2_MASK))
+	if (!(domain->pd_mode & PD_MODE_V2))
 		return -EINVAL;
 
 	build_inv_iommu_pasid(&cmd, domain->id, pasid, address, size);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu()
  2023-08-08 10:02 ` [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu() Vasant Hegde
@ 2023-08-08 15:28   ` Jason Gunthorpe
  2023-08-10 23:06     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-08 15:28 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Aug 08, 2023 at 10:02:22AM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> Rename and make it available to other code.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/amd_iommu.h |  1 +
>  drivers/iommu/amd/iommu.c     | 30 +++++++++++++++---------------
>  2 files changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
> index 56996dc2868d..1471ab2dddd1 100644
> --- a/drivers/iommu/amd/amd_iommu.h
> +++ b/drivers/iommu/amd/amd_iommu.h
> @@ -24,6 +24,7 @@ int amd_iommu_init_devices(void);
>  void amd_iommu_uninit_devices(void);
>  void amd_iommu_init_notifier(void);
>  void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
> +struct amd_iommu *amd_iommu_rlookup_iommu(struct device *dev);
>  
>  #ifdef CONFIG_AMD_IOMMU_DEBUGFS
>  void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 26e509ffe77a..77b913211c35 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -164,7 +164,7 @@ static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid)
>  	return NULL;
>  }
>  
> -static struct amd_iommu *rlookup_amd_iommu(struct device *dev)
> +struct amd_iommu *amd_iommu_rlookup_iommu(struct device *dev)
>  {

Er, wah? This is a really confused function. It should just be:

static inline struct amd_iommu *amd_device_to_iommu(struct device *dev)
{
	return container_of(dev->iommu->iommu_dev, struct amd_iommu, iommu);
}

Right? At least for many of the call sites.

The core code maintains the association of the iommu_device with each
probe'd device automatically. The drivers don't need tables and
searching.

Jason 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 02/11] iommu/amd: Introduce struct protection_domain.pd_mode
  2023-08-08 10:02 ` [PATCH 02/11] iommu/amd: Introduce struct protection_domain.pd_mode Vasant Hegde
@ 2023-08-08 15:30   ` Jason Gunthorpe
  2023-08-11 10:04     ` Vasant Hegde
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-08 15:30 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Aug 08, 2023 at 10:02:23AM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> This enum variable is used to track the type of page table used by the
> protection domain. It will replace the protection_domain.flags in
> subsequent series.
> 
> Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 7 +++++++
>  drivers/iommu/amd/iommu.c           | 3 +++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> index 0d339e022572..5e89032b3dee 100644
> --- a/drivers/iommu/amd/amd_iommu_types.h
> +++ b/drivers/iommu/amd/amd_iommu_types.h
> @@ -555,6 +555,12 @@ struct amd_io_pgtable {
>  	u64			*pgd;		/* v2 pgtable pgd pointer */
>  };
>  
> +enum protection_domain_mode {
> +	PD_MODE_PT = 0,
> +	PD_MODE_V1,
> +	PD_MODE_V2,
> +};

PD_MODE_PT is redundant because the domain->type ==
IOMMU_DOMAIN_IDENTITY already indicates identity. It is better not to
denormalize the data structures.

But right idea

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 03/11] iommu/amd: Introduce per-device GCR3 table
  2023-08-08 10:02 ` [PATCH 03/11] iommu/amd: Introduce per-device GCR3 table Vasant Hegde
@ 2023-08-08 15:32   ` Jason Gunthorpe
  2023-08-10 23:31     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-08 15:32 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Aug 08, 2023 at 10:02:24AM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> AMD IOMMU GCR3 table is indexed by PASID. Each entry stores guest CR3
> register value, which is an address to the root of guest IO page table.
> The GCR3 table can be programmed per-device. However, Linux AMD IOMMU
> driver currently managing the table on a per-domain basis.

Why do you call it "guest"? There is no guest here. This "GCR3" is the
root of the PASID table for the device. It is used when ever PASID is
required even without VMs.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/11] iommu/amd: Use protection_domain.flags to check page table mode
  2023-08-08 10:02 ` [PATCH 04/11] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
@ 2023-08-08 15:35   ` Jason Gunthorpe
  2023-08-11 10:10     ` Vasant Hegde
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-08 15:35 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Aug 08, 2023 at 10:02:25AM +0000, Vasant Hegde wrote:
> Page table mode (v1, v2 or pt) is per domain property. Recently we have
> enhanced protection_domain.pd_mode to track per domain page table mode.
> Use that variable to check the page table mode instead of global
> 'amd_iommu_pgtable' in {map/unmap}_pages path.
> 
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/iommu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

This is OK, but IMHO, it is much better to provide different ops for
the v1/v2 page table map/unmap functions since they don't really share
any code.

eg having a function pointer that does nothing other than call another
function pointer:

	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
	if (ops->map_pages) {
		ret = ops->map_pages(ops, iova, paddr, pgsize,
				     pgcount, prot, gfp, mapped);
	}

Is quite inefficient these days. If you start with a map_pages_v2
domain op then you can just directly call iommu_v2_map_pages().

We did alot of work on the core side to enable this.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-08 10:02 ` [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device Vasant Hegde
@ 2023-08-08 15:39   ` Jason Gunthorpe
  2023-08-11 10:07     ` Vasant Hegde
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-08 15:39 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Aug 08, 2023 at 10:02:27AM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> To use the new helper function for setting up GCR3 table.
> 
> If system is booted with V2 page table then setup default GCR3 with
> domain GCR3 pointer. So that all devices in the domain uses same page
> table for translation. Also return page table setup status from
> do_attach() function.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/iommu.c | 54 ++++++++++++++++++++++++++++++++++++---
>  1 file changed, 50 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index cc42732820dd..4fed3fbe069e 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -1821,15 +1821,41 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
>  	amd_iommu_apply_erratum_63(iommu, devid);
>  }
>  
> -static void do_attach(struct iommu_dev_data *dev_data,
> -		      struct protection_domain *domain)
> +/*
> + * Note: This is currently used when booting w/ amd_iommu=pgtbl_v2
> + */

I would drop all this commentary about amd_iommu=pgtbl_v2. That
command line option should be removed.

This is more about enabling the GCR3 when PASID is required.

> +static int default_gcr3_init(struct iommu_dev_data *dev_data)
> +{

I wouldn't call it default. On the ARM series we've been calling this
the 'kernel owned PASID table'. It simply IS the GCR3 table for this
struct device.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver
  2023-08-08 10:02 ` [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver Vasant Hegde
@ 2023-08-08 15:49   ` Jason Gunthorpe
  2023-08-11  1:34     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-08 15:49 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Aug 08, 2023 at 10:02:30AM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> Introducing new functions to init and uninit v2 domain. Also setup /
> clear GCR3 table. These functions will make necessary changes to
> existing domain to switch to SVA mode. 

Huh? "necessary changes to existing domains" ? That shouldn't be
something a driver ever does.

> +/*******************************
> + * V2API-related helper functions
> + * Note: Separate these out for now. It will be removed when
> + *       deprecate the AMD IOMMU v2 API support.
> + */
> +int amd_iommu_v2api_domain_init(struct protection_domain *pdom)
> +{
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	spin_lock_irqsave(&pdom->lock, flags);
> +
> +	/* Only allow v2API support on pass-through domain. */
> +	if (pdom->pd_mode != PD_MODE_PT) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	pdom->pd_mode = PD_MODE_V2;

Wah?? You change a PT domain to a V2 domain? That seems totally
nonsensical.

This is continuing the wrong direction of the old code.

When you first create the GCR3 table you have to inspect which domain
is attached to the RID and setup the DTE/GCR3 accordingly.

You don't *change* the domain that is attached to the RID (other
things might be using it!) You just note that it is IDENTITY and do
the right things.

This should all be part of generic code that is working with the GCR3
object, not here in the v2api section.

Keep in mind that the API flow is to get a 'ref' on the GCR3 (eg
because the caller wants to put a domain in it) and a pairing 'unref'.

eg if it was already a v2 PT when you want to attach a PASID then
great, the device already has a gcr3 and so it just keeps
going. Having a PASID will block the GCR3 from being freed and it will
block V1 domains from being attached to the RID.

When the GCR3 becomes fully unrefed then all the PASIDs must be
removed, the RID must be an IDENTIY and then the driver can free the
GCR3 and reconfigure the DTE back to IDENTITY. Otherwise the DTE
points to the kernel owned GCR3.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 10/11] iommu/amd/iommu_v2: Add support to switch default domain to SVA mode
  2023-08-08 10:02 ` [PATCH 10/11] iommu/amd/iommu_v2: Add support to switch default domain to SVA mode Vasant Hegde
@ 2023-08-08 15:51   ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-08 15:51 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Tue, Aug 08, 2023 at 10:02:31AM +0000, Vasant Hegde wrote:
> Current iommu_v2 module allocates a secondary domain every time a
> device driver calls amd_iommu_init_device(). Then it detaches all
> devices in the group from the default domain, and reattaches to the
> new domain. Finally it configures V2API mode.
> 
> Previous patch added support to switch same domain to SVA mode. Use
> these new interfaces to enable SVA mode.
> 
> Note that even with this change, PASID is managed by device driver. This
> imposes restriction where there can be only one SVA enabled device per
> IOMMU group (same as current behaviour).
> 
> Finally remove unused functions.
> 
> Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/amd_iommu.h |  2 --
>  drivers/iommu/amd/iommu.c     | 37 ----------------------
>  drivers/iommu/amd/iommu_v2.c  | 59 +++++++----------------------------
>  3 files changed, 12 insertions(+), 86 deletions(-)

I'm glad to see this code go, but my remarks are basically the same as
the last patch

amd_iommu_init_device() should force the kernel owned GCR3 table, not
mess with domains.

Basically all it should do is get a ref on the GCR3 table. A ref may
already exist if the GCR3 is already being used by a v2 RID domain.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu()
  2023-08-08 15:28   ` Jason Gunthorpe
@ 2023-08-10 23:06     ` Suthikulpanit, Suravee
  2023-08-11 13:06       ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Suthikulpanit, Suravee @ 2023-08-10 23:06 UTC (permalink / raw)
  To: Jason Gunthorpe, Vasant Hegde; +Cc: iommu, joro, wei.huang2, jsnitsel



On 8/8/2023 8:28 AM, Jason Gunthorpe wrote:
> On Tue, Aug 08, 2023 at 10:02:22AM +0000, Vasant Hegde wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> Rename and make it available to other code.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
>> ---
>>   drivers/iommu/amd/amd_iommu.h |  1 +
>>   drivers/iommu/amd/iommu.c     | 30 +++++++++++++++---------------
>>   2 files changed, 16 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
>> index 56996dc2868d..1471ab2dddd1 100644
>> --- a/drivers/iommu/amd/amd_iommu.h
>> +++ b/drivers/iommu/amd/amd_iommu.h
>> @@ -24,6 +24,7 @@ int amd_iommu_init_devices(void);
>>   void amd_iommu_uninit_devices(void);
>>   void amd_iommu_init_notifier(void);
>>   void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
>> +struct amd_iommu *amd_iommu_rlookup_iommu(struct device *dev);
>>   
>>   #ifdef CONFIG_AMD_IOMMU_DEBUGFS
>>   void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
>> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
>> index 26e509ffe77a..77b913211c35 100644
>> --- a/drivers/iommu/amd/iommu.c
>> +++ b/drivers/iommu/amd/iommu.c
>> @@ -164,7 +164,7 @@ static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid)
>>   	return NULL;
>>   }
>>   
>> -static struct amd_iommu *rlookup_amd_iommu(struct device *dev)
>> +struct amd_iommu *amd_iommu_rlookup_iommu(struct device *dev)
>>   {
> 
> Er, wah? This is a really confused function. It should just be:
> 
> static inline struct amd_iommu *amd_device_to_iommu(struct device *dev)
> {
> 	return container_of(dev->iommu->iommu_dev, struct amd_iommu, iommu);
> }
> 
> Right? At least for many of the call sites.
> 
> The core code maintains the association of the iommu_device with each
> probe'd device automatically. The drivers don't need tables and
> searching.

The rlookup table is setup from the information in the IVRS table, which 
contains list of devices belong to an IOMMU instance. This information 
is used to for setting dev->iommu during probing (See 
drivers/iommu/amd/iommu.c: amd_iommu_probe_device()).

Subsequently, it should be able to use the amd_device_to_iommu() you 
provided as long as the device has not been released.

So, we can add the helper function above, and make sure to check that 
dev->iommu is not NULL before calling the container_of().

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 03/11] iommu/amd: Introduce per-device GCR3 table
  2023-08-08 15:32   ` Jason Gunthorpe
@ 2023-08-10 23:31     ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 35+ messages in thread
From: Suthikulpanit, Suravee @ 2023-08-10 23:31 UTC (permalink / raw)
  To: Jason Gunthorpe, Vasant Hegde; +Cc: iommu, joro, wei.huang2, jsnitsel



On 8/8/2023 8:32 AM, Jason Gunthorpe wrote:
> On Tue, Aug 08, 2023 at 10:02:24AM +0000, Vasant Hegde wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> AMD IOMMU GCR3 table is indexed by PASID. Each entry stores guest CR3
>> register value, which is an address to the root of guest IO page table.
>> The GCR3 table can be programmed per-device. However, Linux AMD IOMMU
>> driver currently managing the table on a per-domain basis.
> 
> Why do you call it "guest"? There is no guest here. This "GCR3" is the
> root of the PASID table for the device. It is used when ever PASID is
> required even without VMs.
> 
> Jason

That's just how the AMD IOMMU spec refers to the stage 1 table (a.k.a v2 
table). Agree that this is confusing, but that's just how things being 
defined in different architectures.

The Guest CR3 (GCR3) table uses PASID to do indexing into the table, and 
the entry contains the guest CR3, which is root of the stage 1 page table.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver
  2023-08-08 15:49   ` Jason Gunthorpe
@ 2023-08-11  1:34     ` Suthikulpanit, Suravee
  2023-08-11 13:17       ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Suthikulpanit, Suravee @ 2023-08-11  1:34 UTC (permalink / raw)
  To: Jason Gunthorpe, Vasant Hegde; +Cc: iommu, joro, wei.huang2, jsnitsel

Jason,

On 8/8/2023 8:49 AM, Jason Gunthorpe wrote:
> On Tue, Aug 08, 2023 at 10:02:30AM +0000, Vasant Hegde wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> Introducing new functions to init and uninit v2 domain. Also setup /
>> clear GCR3 table. These functions will make necessary changes to
>> existing domain to switch to SVA mode.
> 
> Huh? "necessary changes to existing domains" ? That shouldn't be
> something a driver ever does.
> 
>> +/*******************************
>> + * V2API-related helper functions
>> + * Note: Separate these out for now. It will be removed when
>> + *       deprecate the AMD IOMMU v2 API support.
>> + */
>> +int amd_iommu_v2api_domain_init(struct protection_domain *pdom)
>> +{
>> +	unsigned long flags;
>> +	int ret = 0;
>> +
>> +	spin_lock_irqsave(&pdom->lock, flags);
>> +
>> +	/* Only allow v2API support on pass-through domain. */
>> +	if (pdom->pd_mode != PD_MODE_PT) {
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>> +
>> +	pdom->pd_mode = PD_MODE_V2;
> 
> Wah?? You change a PT domain to a V2 domain? That seems totally
> nonsensical.
> 
> This is continuing the wrong direction of the old code.
> 
> When you first create the GCR3 table you have to inspect which domain
> is attached to the RID and setup the DTE/GCR3 accordingly.
> 
> You don't *change* the domain that is attached to the RID (other
> things might be using it!) You just note that it is IDENTITY and do
> the right things.
> 
> This should all be part of generic code that is working with the GCR3
> object, not here in the v2api section.
> 
> Keep in mind that the API flow is to get a 'ref' on the GCR3 (eg
> because the caller wants to put a domain in it) and a pairing 'unref'.
> 
> eg if it was already a v2 PT when you want to attach a PASID then
> great, the device already has a gcr3 and so it just keeps
> going. Having a PASID will block the GCR3 from being freed and it will
> block V1 domains from being attached to the RID.
> 
> When the GCR3 becomes fully unrefed then all the PASIDs must be
> removed, the RID must be an IDENTIY and then the driver can free the
> GCR3 and reconfigure the DTE back to IDENTITY. Otherwise the DTE
> points to the kernel owned GCR3.
> 
> Jason

Supporting the IOMMU v2API adds complication on how IOMMU domain for v2 
page table is managed due to several restrictions, and it is conflicting 
with the upcoming SVA support.

Since this patch targeting v6.6 is dropping IOMMUv2 support:

[PATCH V2 4/5] drm/amdkfd: drop IOMMUv2 support
(https://www.spinics.net/lists/amd-gfx/msg96082.html)

Subsequently, we will go ahead and remove the AMD IOMMU v2 driver since 
there is no in-kernel dependencies. This should simplify the domain 
management, and should be easier to introduce the SVA support afterward.

We will rework this series and send out v2 shortly.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 02/11] iommu/amd: Introduce struct protection_domain.pd_mode
  2023-08-08 15:30   ` Jason Gunthorpe
@ 2023-08-11 10:04     ` Vasant Hegde
  2023-08-11 13:21       ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-11 10:04 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

Jason,


On 8/8/2023 9:00 PM, Jason Gunthorpe wrote:
> On Tue, Aug 08, 2023 at 10:02:23AM +0000, Vasant Hegde wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> This enum variable is used to track the type of page table used by the
>> protection domain. It will replace the protection_domain.flags in
>> subsequent series.
>>
>> Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
>> ---
>>  drivers/iommu/amd/amd_iommu_types.h | 7 +++++++
>>  drivers/iommu/amd/iommu.c           | 3 +++
>>  2 files changed, 10 insertions(+)
>>
>> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
>> index 0d339e022572..5e89032b3dee 100644
>> --- a/drivers/iommu/amd/amd_iommu_types.h
>> +++ b/drivers/iommu/amd/amd_iommu_types.h
>> @@ -555,6 +555,12 @@ struct amd_io_pgtable {
>>  	u64			*pgd;		/* v2 pgtable pgd pointer */
>>  };
>>  
>> +enum protection_domain_mode {
>> +	PD_MODE_PT = 0,
>> +	PD_MODE_V1,
>> +	PD_MODE_V2,
>> +};
> 
> PD_MODE_PT is redundant because the domain->type ==
> IOMMU_DOMAIN_IDENTITY already indicates identity. It is better not to
> denormalize the data structures.

We need to track how we have setup the hardware page table here. We can use
`IOMMU_DOMAIN_IDENTITY` for PT mode, but we don't have anything for other two
modes. So having macros makes it easy. Hence we have introduced new macros for
protection domain.

-Vasant

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-08 15:39   ` Jason Gunthorpe
@ 2023-08-11 10:07     ` Vasant Hegde
  2023-08-11 13:20       ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-11 10:07 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

Jason,


On 8/8/2023 9:09 PM, Jason Gunthorpe wrote:
> On Tue, Aug 08, 2023 at 10:02:27AM +0000, Vasant Hegde wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> To use the new helper function for setting up GCR3 table.
>>
>> If system is booted with V2 page table then setup default GCR3 with
>> domain GCR3 pointer. So that all devices in the domain uses same page
>> table for translation. Also return page table setup status from
>> do_attach() function.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
>> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
>> ---
>>  drivers/iommu/amd/iommu.c | 54 ++++++++++++++++++++++++++++++++++++---
>>  1 file changed, 50 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
>> index cc42732820dd..4fed3fbe069e 100644
>> --- a/drivers/iommu/amd/iommu.c
>> +++ b/drivers/iommu/amd/iommu.c
>> @@ -1821,15 +1821,41 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
>>  	amd_iommu_apply_erratum_63(iommu, devid);
>>  }
>>  
>> -static void do_attach(struct iommu_dev_data *dev_data,
>> -		      struct protection_domain *domain)
>> +/*
>> + * Note: This is currently used when booting w/ amd_iommu=pgtbl_v2
>> + */
> 
> I would drop all this commentary about amd_iommu=pgtbl_v2. That
> command line option should be removed.

AMD IOMMU can boot with PT, V1 page table and V2 page table. We need a way to
tell what page table we want to use. Hence we have these command line options.

-Vasant


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/11] iommu/amd: Use protection_domain.flags to check page table mode
  2023-08-08 15:35   ` Jason Gunthorpe
@ 2023-08-11 10:10     ` Vasant Hegde
  0 siblings, 0 replies; 35+ messages in thread
From: Vasant Hegde @ 2023-08-11 10:10 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

Jason,


On 8/8/2023 9:05 PM, Jason Gunthorpe wrote:
> On Tue, Aug 08, 2023 at 10:02:25AM +0000, Vasant Hegde wrote:
>> Page table mode (v1, v2 or pt) is per domain property. Recently we have
>> enhanced protection_domain.pd_mode to track per domain page table mode.
>> Use that variable to check the page table mode instead of global
>> 'amd_iommu_pgtable' in {map/unmap}_pages path.
>>
>> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
>> ---
>>  drivers/iommu/amd/iommu.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> This is OK, but IMHO, it is much better to provide different ops for
> the v1/v2 page table map/unmap functions since they don't really share
> any code.
> 
> eg having a function pointer that does nothing other than call another
> function pointer:
> 
> 	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
> 	if (ops->map_pages) {
> 		ret = ops->map_pages(ops, iova, paddr, pgsize,
> 				     pgcount, prot, gfp, mapped);
> 	}
> 
> Is quite inefficient these days. If you start with a map_pages_v2
> domain op then you can just directly call iommu_v2_map_pages().
> 
> We did alot of work on the core side to enable this.

Agreed. This code can be improved. We will look into it later.

-Vasant


> 
> Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu()
  2023-08-10 23:06     ` Suthikulpanit, Suravee
@ 2023-08-11 13:06       ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-11 13:06 UTC (permalink / raw)
  To: Suthikulpanit, Suravee; +Cc: Vasant Hegde, iommu, joro, wei.huang2, jsnitsel

On Thu, Aug 10, 2023 at 04:06:52PM -0700, Suthikulpanit, Suravee wrote:
> 
> 
> On 8/8/2023 8:28 AM, Jason Gunthorpe wrote:
> > On Tue, Aug 08, 2023 at 10:02:22AM +0000, Vasant Hegde wrote:
> > > From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > > 
> > > Rename and make it available to other code.
> > > 
> > > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > > Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> > > ---
> > >   drivers/iommu/amd/amd_iommu.h |  1 +
> > >   drivers/iommu/amd/iommu.c     | 30 +++++++++++++++---------------
> > >   2 files changed, 16 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
> > > index 56996dc2868d..1471ab2dddd1 100644
> > > --- a/drivers/iommu/amd/amd_iommu.h
> > > +++ b/drivers/iommu/amd/amd_iommu.h
> > > @@ -24,6 +24,7 @@ int amd_iommu_init_devices(void);
> > >   void amd_iommu_uninit_devices(void);
> > >   void amd_iommu_init_notifier(void);
> > >   void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
> > > +struct amd_iommu *amd_iommu_rlookup_iommu(struct device *dev);
> > >   #ifdef CONFIG_AMD_IOMMU_DEBUGFS
> > >   void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
> > > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > > index 26e509ffe77a..77b913211c35 100644
> > > --- a/drivers/iommu/amd/iommu.c
> > > +++ b/drivers/iommu/amd/iommu.c
> > > @@ -164,7 +164,7 @@ static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid)
> > >   	return NULL;
> > >   }
> > > -static struct amd_iommu *rlookup_amd_iommu(struct device *dev)
> > > +struct amd_iommu *amd_iommu_rlookup_iommu(struct device *dev)
> > >   {
> > 
> > Er, wah? This is a really confused function. It should just be:
> > 
> > static inline struct amd_iommu *amd_device_to_iommu(struct device *dev)
> > {
> > 	return container_of(dev->iommu->iommu_dev, struct amd_iommu, iommu);
> > }
> > 
> > Right? At least for many of the call sites.
> > 
> > The core code maintains the association of the iommu_device with each
> > probe'd device automatically. The drivers don't need tables and
> > searching.
> 
> The rlookup table is setup from the information in the IVRS table, which
> contains list of devices belong to an IOMMU instance. This information is
> used to for setting dev->iommu during probing (See
> drivers/iommu/amd/iommu.c: amd_iommu_probe_device()).

Yes, I saw that..

> Subsequently, it should be able to use the amd_device_to_iommu() you
> provided as long as the device has not been released.
> 
> So, we can add the helper function above, and make sure to check that
> dev->iommu is not NULL before calling the container_of().

There should not be a null check

Such a function should only be called from one of the iommu ops and
the core code guarentees it will not invoke the op without an attached
iommu_driver

Provide a little inline function in iommu.h to get the iommu_dev and
give that explanation in a comment.

If you want to do this outside the ops then you have a locking problem
and have to address it properly. Try to avoid that.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver
  2023-08-11  1:34     ` Suthikulpanit, Suravee
@ 2023-08-11 13:17       ` Jason Gunthorpe
  2023-08-11 16:51         ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-11 13:17 UTC (permalink / raw)
  To: Suthikulpanit, Suravee; +Cc: Vasant Hegde, iommu, joro, wei.huang2, jsnitsel

On Thu, Aug 10, 2023 at 06:34:46PM -0700, Suthikulpanit, Suravee wrote:

> Supporting the IOMMU v2API adds complication on how IOMMU domain for v2 page
> table is managed due to several restrictions, and it is conflicting with the
> upcoming SVA support.
> 
> Since this patch targeting v6.6 is dropping IOMMUv2 support:
> 
> [PATCH V2 4/5] drm/amdkfd: drop IOMMUv2 support
> (https://www.spinics.net/lists/amd-gfx/msg96082.html)
> 
> Subsequently, we will go ahead and remove the AMD IOMMU v2 driver since
> there is no in-kernel dependencies. This should simplify the domain
> management, and should be easier to introduce the SVA support afterward.
> 
> We will rework this series and send out v2 shortly.

Oh wow that is great news. Lets go forward on that basis.

Please make sure the AMD driver has zero EXPORT_SYMBOL's in it after
the above is applied

Thanks,
Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-11 10:07     ` Vasant Hegde
@ 2023-08-11 13:20       ` Jason Gunthorpe
  2023-08-11 16:50         ` Vasant Hegde
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-11 13:20 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Fri, Aug 11, 2023 at 03:37:25PM +0530, Vasant Hegde wrote:
> Jason,
> 
> 
> On 8/8/2023 9:09 PM, Jason Gunthorpe wrote:
> > On Tue, Aug 08, 2023 at 10:02:27AM +0000, Vasant Hegde wrote:
> >> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> >>
> >> To use the new helper function for setting up GCR3 table.
> >>
> >> If system is booted with V2 page table then setup default GCR3 with
> >> domain GCR3 pointer. So that all devices in the domain uses same page
> >> table for translation. Also return page table setup status from
> >> do_attach() function.
> >>
> >> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> >> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
> >> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> >> ---
> >>  drivers/iommu/amd/iommu.c | 54 ++++++++++++++++++++++++++++++++++++---
> >>  1 file changed, 50 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> >> index cc42732820dd..4fed3fbe069e 100644
> >> --- a/drivers/iommu/amd/iommu.c
> >> +++ b/drivers/iommu/amd/iommu.c
> >> @@ -1821,15 +1821,41 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
> >>  	amd_iommu_apply_erratum_63(iommu, devid);
> >>  }
> >>  
> >> -static void do_attach(struct iommu_dev_data *dev_data,
> >> -		      struct protection_domain *domain)
> >> +/*
> >> + * Note: This is currently used when booting w/ amd_iommu=pgtbl_v2
> >> + */
> > 
> > I would drop all this commentary about amd_iommu=pgtbl_v2. That
> > command line option should be removed.
> 
> AMD IOMMU can boot with PT, V1 page table and V2 page table. We need a way to
> tell what page table we want to use. Hence we have these command line options.
 
No, you don't, it is wrong.

PT or not is controlled only by the core code using existing common
mechanims. Not by iommu drivers.

For v1/v2 kernel needs to automatically select the best page table
option, not try to make an admin figure it out with a command line option.

I expect it to be removed in due course.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 02/11] iommu/amd: Introduce struct protection_domain.pd_mode
  2023-08-11 10:04     ` Vasant Hegde
@ 2023-08-11 13:21       ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-11 13:21 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Fri, Aug 11, 2023 at 03:34:46PM +0530, Vasant Hegde wrote:
> Jason,
> 
> 
> On 8/8/2023 9:00 PM, Jason Gunthorpe wrote:
> > On Tue, Aug 08, 2023 at 10:02:23AM +0000, Vasant Hegde wrote:
> >> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> >>
> >> This enum variable is used to track the type of page table used by the
> >> protection domain. It will replace the protection_domain.flags in
> >> subsequent series.
> >>
> >> Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
> >> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> >> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> >> ---
> >>  drivers/iommu/amd/amd_iommu_types.h | 7 +++++++
> >>  drivers/iommu/amd/iommu.c           | 3 +++
> >>  2 files changed, 10 insertions(+)
> >>
> >> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> >> index 0d339e022572..5e89032b3dee 100644
> >> --- a/drivers/iommu/amd/amd_iommu_types.h
> >> +++ b/drivers/iommu/amd/amd_iommu_types.h
> >> @@ -555,6 +555,12 @@ struct amd_io_pgtable {
> >>  	u64			*pgd;		/* v2 pgtable pgd pointer */
> >>  };
> >>  
> >> +enum protection_domain_mode {
> >> +	PD_MODE_PT = 0,
> >> +	PD_MODE_V1,
> >> +	PD_MODE_V2,
> >> +};
> > 
> > PD_MODE_PT is redundant because the domain->type ==
> > IOMMU_DOMAIN_IDENTITY already indicates identity. It is better not to
> > denormalize the data structures.
> 
> We need to track how we have setup the hardware page table here. We can use
> `IOMMU_DOMAIN_IDENTITY` for PT mode, but we don't have anything for other two
> modes. So having macros makes it easy. Hence we have introduced new macros for
> protection domain.

I think this was immediately abused in the next patch, so I'd prefer
not to see this kind of duplication. It should describe the format of
the IOPTEs and if the domain doesn't have IOPTEs then it shouldn't
need to ever look at this value.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-11 13:20       ` Jason Gunthorpe
@ 2023-08-11 16:50         ` Vasant Hegde
  2023-08-11 23:50           ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Vasant Hegde @ 2023-08-11 16:50 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

Jason,


On 8/11/2023 6:50 PM, Jason Gunthorpe wrote:
> On Fri, Aug 11, 2023 at 03:37:25PM +0530, Vasant Hegde wrote:
>> Jason,
>>
>>
>> On 8/8/2023 9:09 PM, Jason Gunthorpe wrote:
>>> On Tue, Aug 08, 2023 at 10:02:27AM +0000, Vasant Hegde wrote:
>>>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>>>
>>>> To use the new helper function for setting up GCR3 table.
>>>>
>>>> If system is booted with V2 page table then setup default GCR3 with
>>>> domain GCR3 pointer. So that all devices in the domain uses same page
>>>> table for translation. Also return page table setup status from
>>>> do_attach() function.
>>>>
>>>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>>> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
>>>> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
>>>> ---
>>>>  drivers/iommu/amd/iommu.c | 54 ++++++++++++++++++++++++++++++++++++---
>>>>  1 file changed, 50 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
>>>> index cc42732820dd..4fed3fbe069e 100644
>>>> --- a/drivers/iommu/amd/iommu.c
>>>> +++ b/drivers/iommu/amd/iommu.c
>>>> @@ -1821,15 +1821,41 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
>>>>  	amd_iommu_apply_erratum_63(iommu, devid);
>>>>  }
>>>>  
>>>> -static void do_attach(struct iommu_dev_data *dev_data,
>>>> -		      struct protection_domain *domain)
>>>> +/*
>>>> + * Note: This is currently used when booting w/ amd_iommu=pgtbl_v2
>>>> + */
>>>
>>> I would drop all this commentary about amd_iommu=pgtbl_v2. That
>>> command line option should be removed.
>>
>> AMD IOMMU can boot with PT, V1 page table and V2 page table. We need a way to
>> tell what page table we want to use. Hence we have these command line options.
>  
> No, you don't, it is wrong.
> 
> PT or not is controlled only by the core code using existing common
> mechanims. Not by iommu drivers.

Yes. We don't have 'pt' in amd_iommu.

> 
> For v1/v2 kernel needs to automatically select the best page table
> option, not try to make an admin figure it out with a command line option.

Our hardware is capable to boot with either V1 or V2 page table. By default
driver decides what is the best mode.
But if user wants he has a flexibility to choose V1 vs V2 page table. In that
mode we will adhere to user provided option and all devices will be put in that
mode.

-Vasant


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver
  2023-08-11 13:17       ` Jason Gunthorpe
@ 2023-08-11 16:51         ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 35+ messages in thread
From: Suthikulpanit, Suravee @ 2023-08-11 16:51 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Vasant Hegde, iommu, joro, wei.huang2, jsnitsel



On 8/11/2023 6:17 AM, Jason Gunthorpe wrote:
> On Thu, Aug 10, 2023 at 06:34:46PM -0700, Suthikulpanit, Suravee wrote:
> 
>> Supporting the IOMMU v2API adds complication on how IOMMU domain for v2 page
>> table is managed due to several restrictions, and it is conflicting with the
>> upcoming SVA support.
>>
>> Since this patch targeting v6.6 is dropping IOMMUv2 support:
>>
>> [PATCH V2 4/5] drm/amdkfd: drop IOMMUv2 support
>> (https://www.spinics.net/lists/amd-gfx/msg96082.html)
>>
>> Subsequently, we will go ahead and remove the AMD IOMMU v2 driver since
>> there is no in-kernel dependencies. This should simplify the domain
>> management, and should be easier to introduce the SVA support afterward.
>>
>> We will rework this series and send out v2 shortly.
> 
> Oh wow that is great news. Lets go forward on that basis.
> 
> Please make sure the AMD driver has zero EXPORT_SYMBOL's in it after
> the above is applied
> 
> Thanks,
> Jason

We will take care of that.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-11 16:50         ` Vasant Hegde
@ 2023-08-11 23:50           ` Jason Gunthorpe
  2023-08-15  4:19             ` Tian, Kevin
  2023-08-15  5:33             ` Suthikulpanit, Suravee
  0 siblings, 2 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-11 23:50 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

On Fri, Aug 11, 2023 at 10:20:56PM +0530, Vasant Hegde wrote:

> Our hardware is capable to boot with either V1 or V2 page table. By default
> driver decides what is the best mode.
> But if user wants he has a flexibility to choose V1 vs V2 page table. In that
> mode we will adhere to user provided option and all devices will be put in that
> mode.

There is no valid reason for the user to override the kernel. If we do
this for AMD we have to do it for every single driver that supports
two formats. It makes no sense

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-11 23:50           ` Jason Gunthorpe
@ 2023-08-15  4:19             ` Tian, Kevin
  2023-08-15  5:33             ` Suthikulpanit, Suravee
  1 sibling, 0 replies; 35+ messages in thread
From: Tian, Kevin @ 2023-08-15  4:19 UTC (permalink / raw)
  To: Jason Gunthorpe, Vasant Hegde
  Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel

> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Saturday, August 12, 2023 7:51 AM
> 
> On Fri, Aug 11, 2023 at 10:20:56PM +0530, Vasant Hegde wrote:
> 
> > Our hardware is capable to boot with either V1 or V2 page table. By default
> > driver decides what is the best mode.
> > But if user wants he has a flexibility to choose V1 vs V2 page table. In that
> > mode we will adhere to user provided option and all devices will be put in
> that
> > mode.
> 
> There is no valid reason for the user to override the kernel. If we do
> this for AMD we have to do it for every single driver that supports
> two formats. It makes no sense
> 

We ever considered a similar option for vt-d but gave up as the only
purpose at the moment was for silicon validation. Out of that we didn't
find a valid reason for real user...

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-11 23:50           ` Jason Gunthorpe
  2023-08-15  4:19             ` Tian, Kevin
@ 2023-08-15  5:33             ` Suthikulpanit, Suravee
  2023-08-15 11:42               ` Jason Gunthorpe
  1 sibling, 1 reply; 35+ messages in thread
From: Suthikulpanit, Suravee @ 2023-08-15  5:33 UTC (permalink / raw)
  To: Jason Gunthorpe, Vasant Hegde; +Cc: iommu, joro, wei.huang2, jsnitsel

Jason,

On 8/11/2023 4:50 PM, Jason Gunthorpe wrote:
> On Fri, Aug 11, 2023 at 10:20:56PM +0530, Vasant Hegde wrote:
> 
>> Our hardware is capable to boot with either V1 or V2 page table. By default
>> driver decides what is the best mode.
>> But if user wants he has a flexibility to choose V1 vs V2 page table. In that
>> mode we will adhere to user provided option and all devices will be put in that
>> mode.
> 
> There is no valid reason for the user to override the kernel. If we do
> this for AMD we have to do it for every single driver that supports
> two formats. It makes no sense
> 
> Jason

The amd_iommu=pgtbl_v2 was first introduced to enable nested IOMMU 
translation w/ HW-vIOMMU support. This option is necessary for the guest 
kernel since we need guest kernel to use IOMMU v2 page table for DMA-API 
but the IOMMU driver is currently default to use pgtbl_v1 for DMA-API. 
Currently, there is no good way to auto detect this use case.

In bare-metal environment, both pgtbl_v1 and pgtbl_v2 are supported. In 
the future, we might consider migrate to use pgtbl_v2 by default. 
However, there are certain concerns and restrictions w/ the use of pgtbl_v2:

* pgtbl_v2 is not supported with SNP-enabled system. SNP-enablement can 
be detected with amd_iommu_snp_en, which is initialized early when the 
amd_iommu_snp_enable() is called. So, IOMMU driver should be able to 
automatically switch to use pgtbl_v1.

* There is an upcoming feature (e.g. TMPM : 
https://www.amd.com/system/files/TechDocs/58151_0.51-PUB.pdf), which 
requires devices to be setup w/ pgtbl_v1. However, the TMPM driver is 
loaded after devices are probed. Therefore, this would require Linux to 
boot with option amd_iommu=pgtbl_v1.

* IOMMU pgtbl_v1 and pgtbl_v2 are different from HW perspective, which 
could yield different performance depending on the workload and 
end-point devices. Therefore, the kernel option 
amd_iommu=pgtbl_v1|pgtbl_v2 adds flexibility and give users additional 
control if needed.

For SVA, we need to ensure that the iommu group is using pgtbl_v2 or in 
pass-through mode (no page table setup) prior to enabling SVA for the 
device. Currently, this would require the pgtbl_v2 option. However, we 
could consider default to using v2 table for an IOMMU group if it 
contains an PASID-capable device.

Thanks,
Suravee


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device
  2023-08-15  5:33             ` Suthikulpanit, Suravee
@ 2023-08-15 11:42               ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2023-08-15 11:42 UTC (permalink / raw)
  To: Suthikulpanit, Suravee; +Cc: Vasant Hegde, iommu, joro, wei.huang2, jsnitsel

On Mon, Aug 14, 2023 at 10:33:08PM -0700, Suthikulpanit, Suravee wrote:
> Jason,
> 
> On 8/11/2023 4:50 PM, Jason Gunthorpe wrote:
> > On Fri, Aug 11, 2023 at 10:20:56PM +0530, Vasant Hegde wrote:
> > 
> > > Our hardware is capable to boot with either V1 or V2 page table. By default
> > > driver decides what is the best mode.
> > > But if user wants he has a flexibility to choose V1 vs V2 page table. In that
> > > mode we will adhere to user provided option and all devices will be put in that
> > > mode.
> > 
> > There is no valid reason for the user to override the kernel. If we do
> > this for AMD we have to do it for every single driver that supports
> > two formats. It makes no sense
> > 
> > Jason
> 
> The amd_iommu=pgtbl_v2 was first introduced to enable nested IOMMU
> translation w/ HW-vIOMMU support. This option is necessary for the guest
> kernel since we need guest kernel to use IOMMU v2 page table for DMA-API but
> the IOMMU driver is currently default to use pgtbl_v1 for DMA-API.
> Currently, there is no good way to auto detect this use case.

I will say it again - upstream kernel does not support nested IOMMU,
please stop adding junk to your driver to support out of tree patch
sets.

iommufd has an automatic solution to this problem.

> * There is an upcoming feature (e.g. TMPM :
> https://www.amd.com/system/files/TechDocs/58151_0.51-PUB.pdf), which
> requires devices to be setup w/ pgtbl_v1. However, the TMPM driver is loaded
> after devices are probed. Therefore, this would require Linux to boot with
> option amd_iommu=pgtbl_v1.

That is horrible, and it is even worse that you plan to support this
with a command line option. You need a proper fix.

> * IOMMU pgtbl_v1 and pgtbl_v2 are different from HW perspective, which could
> yield different performance depending on the workload and end-point devices.
> Therefore, the kernel option amd_iommu=pgtbl_v1|pgtbl_v2 adds flexibility
> and give users additional control if needed.

Maybe, but I'd want to see actual data on this. I'm skeptical.
 
> For SVA, we need to ensure that the iommu group is using pgtbl_v2 or in
> pass-through mode (no page table setup) prior to enabling SVA for the
> device. Currently, this would require the pgtbl_v2 option. However, we could
> consider default to using v2 table for an IOMMU group if it contains an
> PASID-capable device.

It cannot continue to require the command line option - we expect that
all the iommu drivers will support SVA without admin
intervention. This will likel have to happen via some kind of 'is
pasid possible' core call which will make the distinction based on the
composition of the group regarding pasid support of the devices in it.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2023-08-15 11:57 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-08 10:02 [PATCH 00/11] iommu/amd: SVA support (part 2) - refactor support for GCR3 table Vasant Hegde
2023-08-08 10:02 ` [PATCH 01/11] iommu/amd: Rename helper function rlookup_amd_iommu() Vasant Hegde
2023-08-08 15:28   ` Jason Gunthorpe
2023-08-10 23:06     ` Suthikulpanit, Suravee
2023-08-11 13:06       ` Jason Gunthorpe
2023-08-08 10:02 ` [PATCH 02/11] iommu/amd: Introduce struct protection_domain.pd_mode Vasant Hegde
2023-08-08 15:30   ` Jason Gunthorpe
2023-08-11 10:04     ` Vasant Hegde
2023-08-11 13:21       ` Jason Gunthorpe
2023-08-08 10:02 ` [PATCH 03/11] iommu/amd: Introduce per-device GCR3 table Vasant Hegde
2023-08-08 15:32   ` Jason Gunthorpe
2023-08-10 23:31     ` Suthikulpanit, Suravee
2023-08-08 10:02 ` [PATCH 04/11] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
2023-08-08 15:35   ` Jason Gunthorpe
2023-08-11 10:10     ` Vasant Hegde
2023-08-08 10:02 ` [PATCH 05/11] iommu/amd: Refactor helper function for setting / clearing GCR3 Vasant Hegde
2023-08-08 10:02 ` [PATCH 06/11] iommu/amd: Refactor helper function for attaching / detaching device Vasant Hegde
2023-08-08 15:39   ` Jason Gunthorpe
2023-08-11 10:07     ` Vasant Hegde
2023-08-11 13:20       ` Jason Gunthorpe
2023-08-11 16:50         ` Vasant Hegde
2023-08-11 23:50           ` Jason Gunthorpe
2023-08-15  4:19             ` Tian, Kevin
2023-08-15  5:33             ` Suthikulpanit, Suravee
2023-08-15 11:42               ` Jason Gunthorpe
2023-08-08 10:02 ` [PATCH 07/11] iommu/amd: Refactor protection_domain helper functions Vasant Hegde
2023-08-08 10:02 ` [PATCH 08/11] iommu/amd: Refactor GCR3 table " Vasant Hegde
2023-08-08 10:02 ` [PATCH 09/11] iommu/amd: Introduce helper functions for AMD IOMMU v2 driver Vasant Hegde
2023-08-08 15:49   ` Jason Gunthorpe
2023-08-11  1:34     ` Suthikulpanit, Suravee
2023-08-11 13:17       ` Jason Gunthorpe
2023-08-11 16:51         ` Suthikulpanit, Suravee
2023-08-08 10:02 ` [PATCH 10/11] iommu/amd/iommu_v2: Add support to switch default domain to SVA mode Vasant Hegde
2023-08-08 15:51   ` Jason Gunthorpe
2023-08-08 10:02 ` [PATCH 11/11] iommu/amd: Remove unused GCR3 table parameters from struct protection_domain Vasant Hegde

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.