iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel
@ 2024-04-30 15:24 Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory Suravee Suthikulpanit
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

To boot a VM w/ x2APIC ID > 255, guest interrupt remapping emulation 
is required. For SEV guest, this can be achieved using an emulated
AMD IOMMU.

In order to support emulated AMD IOMMU in SEV guest, memory pages used
by the guest IOMMU data structures must be in decrypted mode. Also GPAs
for these pages must not have the memory encryption bit set.

Testing:
  - Booting Linux SEV guest w/ 512 vcpus w/ QEMU emulated amd-iommu with
    qemu-system-x86_64 option: -device amd-iommu,intremap=on,xtsup=on
    (emulated devices only for now).

GIT repos:
* https://github.com/AMDESE/linux-iommu/tree/iommu_next_sev-iommu-v1 

Thanks,
Suravee

Suravee Suthikulpanit (9):
  iommu/amd: Introduce helper functions for managing IOMMU memory
  iommu/amd: Convert Device Table pointer to use struct amd_iommu_mem
  iommu/amd: Convert Command Buffer pointer to use struct amd_iommu_mem
  iommu/amd: Convert Completion-Wait Semaphore pointer to use struct
    amd_iommu_mem
  iommu/amd: Convert Event Log pointer to use struct amd_iommu_mem
  iommu/amd: Convert PPR Log pointer to use the struct amd_iommu_mem
  iommu/amd: Remove iommu_alloc_4k_pages() helper function
  iommu/amd: Decrypt interrupt remapping table for AMD IOMMU emulation
    in SEV guest
  iommu/amd: Set default domain to IDENTITY_DOMAIN when running in SEV
    guest

 drivers/iommu/amd/amd_iommu.h       |  31 +++++-
 drivers/iommu/amd/amd_iommu_types.h |  28 ++++--
 drivers/iommu/amd/init.c            | 144 +++++++++++++++-------------
 drivers/iommu/amd/iommu.c           | 133 +++++++++++++++++++------
 drivers/iommu/amd/ppr.c             |  22 +++--
 5 files changed, 246 insertions(+), 112 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-05-01 16:17   ` Jason Gunthorpe
  2024-04-30 15:24 ` [PATCH 2/9] iommu/amd: Convert Device Table pointer to use struct amd_iommu_mem Suravee Suthikulpanit
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

Depending on the modes of operation, certain AMD IOMMU data structures are
allocated with constraints. For example:

 * Some buffers must be 4K-aligned when running in SNP-enabled host

 * To support AMD IOMMU emulation in an SEV guest, some data structures
   cannot be encrypted so that the VMM can access the memory successfully.

Introduce struct amd_iommu_mem along with helper functions to allocate and
free memory areas based on the modes of operation. Initially, two modes
are supported :

1. 4K-aligned mode: Required when running on SNP-enabled host system.
Note that this mode replaces the iommu_alloc_4k_pages().

2. SEV-guest shared mode: Set memory as shared when in an SEV guest.
Note that the memory encryption bit of the GPA must be cleared when
programming into the guest base address MMIO register field of
these buffers.

Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       | 29 +++++++++++++
 drivers/iommu/amd/amd_iommu_types.h | 15 +++++++
 drivers/iommu/amd/iommu.c           | 66 +++++++++++++++++++++++++++++
 3 files changed, 110 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 2fde1302a584..ccd9003813ac 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -28,6 +28,11 @@ void iommu_feature_enable(struct amd_iommu *iommu, u8 bit);
 void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
 				  gfp_t gfp, size_t size);
 
+void *amd_iommu_get_zeroed_mem(gfp_t gfp_mask, struct amd_iommu_mem *mem);
+void *amd_iommu_get_zeroed_mem_node(int nid, gfp_t gfp_mask,
+				    struct amd_iommu_mem *mem);
+void amd_iommu_free_mem(struct amd_iommu_mem *mem);
+
 #ifdef CONFIG_AMD_IOMMU_DEBUGFS
 void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
 #else
@@ -137,6 +142,30 @@ static inline u64 iommu_virt_to_phys(void *vaddr)
 	return (u64)__sme_set(virt_to_phys(vaddr));
 }
 
+static inline bool amd_iommu_mem_is_decrypted(struct amd_iommu_mem *mem)
+{
+	return (mem->modes & ALLOC_MODE_GUEST_MEM_DECRYPT);
+}
+
+static inline bool amd_iommu_mem_is_4k(struct amd_iommu_mem *mem)
+{
+	return (mem->modes & ALLOC_MODE_4K);
+}
+
+static inline u64 amd_iommu_mem_to_phys(struct amd_iommu_mem *mem)
+{
+	/*
+	 * Return physical address without the encryption bit for data
+	 * structures allocated with the flag ALLOC_MODE_GUEST_MEM_DECRYPT
+	 * when running in SEV guest.
+	 */
+	if (amd_iommu_mem_is_decrypted(mem) &&
+	    cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+		return (u64)virt_to_phys(mem->buf);
+
+	return iommu_virt_to_phys(mem->buf);
+}
+
 static inline void *iommu_phys_to_virt(unsigned long paddr)
 {
 	return phys_to_virt(__sme_clr(paddr));
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 2b76b5dedc1d..a42e10777922 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -479,6 +479,21 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
+/*
+ * Allocation modes for struct amd_iommu_mem.
+ */
+
+/* Allocate memory as 4K-aligned mode */
+#define ALLOC_MODE_4K			BIT(0)
+/* Allocate memory as decrypted mode (for SEV guest) */
+#define ALLOC_MODE_GUEST_MEM_DECRYPT	BIT(1)
+
+struct amd_iommu_mem {
+	void *buf;
+	int order;
+	u32 modes;
+};
+
 struct irq_remap_table {
 	raw_spinlock_t lock;
 	unsigned min_index;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 1fdf37e04215..05e967ad7032 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -37,6 +37,7 @@
 #include <asm/iommu.h>
 #include <asm/gart.h>
 #include <asm/dma.h>
+#include <asm/set_memory.h>
 #include <uapi/linux/iommufd.h>
 
 #include "amd_iommu.h"
@@ -574,6 +575,71 @@ static void amd_iommu_uninit_device(struct device *dev)
 	 */
 }
 
+void amd_iommu_free_mem(struct amd_iommu_mem *mem)
+{
+	int ret;
+	unsigned long addr = (unsigned long)mem->buf;
+
+	if (amd_iommu_mem_is_decrypted(mem) &&
+	    cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
+		ret = set_memory_encrypted(addr, 1 << mem->order);
+		if (ret) {
+			pr_warn("%s: Fail to set memory encrypted (ret=%d)\n",
+				__func__, ret);
+			return;
+		}
+	}
+
+	iommu_free_pages(mem->buf, mem->order);
+	mem->buf = NULL;
+}
+
+void *amd_iommu_get_zeroed_mem_node(int nid, gfp_t gfp_mask, struct amd_iommu_mem *mem)
+{
+	int ret;
+	unsigned long addr;
+	int numpages = (1 << mem->order);
+
+	mem->buf = iommu_alloc_pages_node(nid, gfp_mask, mem->order);
+	if (!mem->buf)
+		return NULL;
+
+	addr = (unsigned long)mem->buf;
+
+	/* Allocate SEV guest memory as decrypted */
+	if (amd_iommu_mem_is_decrypted(mem) &&
+	    cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
+		ret = set_memory_decrypted(addr, numpages);
+		if (ret) {
+			pr_warn("%s: Fail to set memory decrypted (ret=%d)\n",
+				__func__, ret);
+			goto out_free;
+		}
+	}
+
+	/* Allocate memory as 4K-aligned on SNP-enabled system. */
+	if (amd_iommu_mem_is_4k(mem) &&
+	    check_feature(FEATURE_SNP)) {
+		ret = set_memory_4k(addr, numpages);
+		if (ret) {
+			pr_warn("%s: Fail to set memory 4K(ret=%d)\n",
+				__func__, ret);
+			goto out_free;
+		}
+	}
+
+	return mem->buf;
+
+out_free:
+	amd_iommu_free_mem(mem);
+	return NULL;
+}
+
+void *amd_iommu_get_zeroed_mem(gfp_t gfp_mask, struct amd_iommu_mem *mem)
+{
+	return amd_iommu_get_zeroed_mem_node(NUMA_NO_NODE, gfp_mask, mem);
+}
+
 /****************************************************************************
  *
  * Interrupt handling functions
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/9] iommu/amd: Convert Device Table pointer to use struct amd_iommu_mem
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 3/9] iommu/amd: Convert Command Buffer " Suravee Suthikulpanit
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

And specify the memory to be decrypted when running in an SEV guest
so that the VMM can access the memory successfully.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/init.c            | 26 +++++++++++++-------------
 drivers/iommu/amd/iommu.c           |  2 +-
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index a42e10777922..d9159f2e3f0f 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -638,7 +638,7 @@ struct amd_iommu_pci_seg {
 	 * information about the domain the device belongs to as well as the
 	 * page table root pointer.
 	 */
-	struct dev_table_entry *dev_table;
+	struct amd_iommu_mem dev_table_mem;
 
 	/*
 	 * The rlookup iommu table is used to find the IOMMU which is
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5aae03d24e2a..c68ff602d534 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -409,11 +409,10 @@ static void iommu_set_device_table(struct amd_iommu *iommu)
 {
 	u64 entry;
 	u32 dev_table_size = iommu->pci_seg->dev_table_size;
-	void *dev_table = (void *)get_dev_table(iommu);
 
 	BUG_ON(iommu->mmio_base == NULL);
 
-	entry = iommu_virt_to_phys(dev_table);
+	entry = amd_iommu_mem_to_phys(&iommu->pci_seg->dev_table_mem);
 	entry |= (dev_table_size >> 12) - 1;
 	memcpy_toio(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET,
 			&entry, sizeof(entry));
@@ -650,9 +649,12 @@ static int __init find_last_devid_acpi(struct acpi_table_header *table, u16 pci_
 /* Allocate per PCI segment device table */
 static inline int __init alloc_dev_table(struct amd_iommu_pci_seg *pci_seg)
 {
-	pci_seg->dev_table = iommu_alloc_pages(GFP_KERNEL | GFP_DMA32,
-					       get_order(pci_seg->dev_table_size));
-	if (!pci_seg->dev_table)
+	struct amd_iommu_mem *mem = &pci_seg->dev_table_mem;
+
+	mem->modes = ALLOC_MODE_GUEST_MEM_DECRYPT;
+	mem->order = get_order(pci_seg->dev_table_size);
+	mem->buf = amd_iommu_get_zeroed_mem(GFP_KERNEL | GFP_DMA32, mem);
+	if (!mem->buf)
 		return -ENOMEM;
 
 	return 0;
@@ -660,9 +662,7 @@ static inline int __init alloc_dev_table(struct amd_iommu_pci_seg *pci_seg)
 
 static inline void free_dev_table(struct amd_iommu_pci_seg *pci_seg)
 {
-	iommu_free_pages(pci_seg->dev_table,
-			 get_order(pci_seg->dev_table_size));
-	pci_seg->dev_table = NULL;
+	amd_iommu_free_mem(&pci_seg->dev_table_mem);
 }
 
 /* Allocate per PCI segment IOMMU rlookup table. */
@@ -2570,7 +2570,7 @@ static int __init init_memory_definitions(struct acpi_table_header *table)
 static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 {
 	u32 devid;
-	struct dev_table_entry *dev_table = pci_seg->dev_table;
+	struct dev_table_entry *dev_table = pci_seg->dev_table_mem.buf;
 
 	if (dev_table == NULL)
 		return;
@@ -2585,7 +2585,7 @@ static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 static void __init uninit_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 {
 	u32 devid;
-	struct dev_table_entry *dev_table = pci_seg->dev_table;
+	struct dev_table_entry *dev_table = pci_seg->dev_table_mem.buf;
 
 	if (dev_table == NULL)
 		return;
@@ -2606,7 +2606,7 @@ static void init_device_table(void)
 
 	for_each_pci_segment(pci_seg) {
 		for (devid = 0; devid <= pci_seg->last_bdf; ++devid)
-			__set_dev_entry_bit(pci_seg->dev_table,
+			__set_dev_entry_bit(pci_seg->dev_table_mem.buf,
 					    devid, DEV_ENTRY_IRQ_TBL_EN);
 	}
 }
@@ -2778,9 +2778,9 @@ static void early_enable_iommus(void)
 		pr_info("Copied DEV table from previous kernel.\n");
 
 		for_each_pci_segment(pci_seg) {
-			iommu_free_pages(pci_seg->dev_table,
+			iommu_free_pages(pci_seg->dev_table_mem.buf,
 					 get_order(pci_seg->dev_table_size));
-			pci_seg->dev_table = pci_seg->old_dev_tbl_cpy;
+			pci_seg->dev_table_mem.buf = pci_seg->old_dev_tbl_cpy;
 		}
 
 		for_each_iommu(iommu) {
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 05e967ad7032..3df07e8ef002 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -144,7 +144,7 @@ struct dev_table_entry *get_dev_table(struct amd_iommu *iommu)
 	struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
 
 	BUG_ON(pci_seg == NULL);
-	dev_table = pci_seg->dev_table;
+	dev_table = pci_seg->dev_table_mem.buf;
 	BUG_ON(dev_table == NULL);
 
 	return dev_table;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/9] iommu/amd: Convert Command Buffer pointer to use struct amd_iommu_mem
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 2/9] iommu/amd: Convert Device Table pointer to use struct amd_iommu_mem Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 4/9] iommu/amd: Convert Completion-Wait Semaphore " Suravee Suthikulpanit
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

And specify the memory to be decrypted when running in an SEV guest
so that the VMM can access the memory successfully.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/init.c            | 17 +++++++++++------
 drivers/iommu/amd/iommu.c           |  2 +-
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index d9159f2e3f0f..653955ab120d 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -733,7 +733,7 @@ struct amd_iommu {
 	u64 exclusion_length;
 
 	/* command buffer virtual address */
-	u8 *cmd_buf;
+	struct amd_iommu_mem cmd_buf_mem;
 	u32 cmd_buf_head;
 	u32 cmd_buf_tail;
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c68ff602d534..77147dc3b79f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -735,10 +735,15 @@ static void __init free_alias_table(struct amd_iommu_pci_seg *pci_seg)
  */
 static int __init alloc_command_buffer(struct amd_iommu *iommu)
 {
-	iommu->cmd_buf = iommu_alloc_pages(GFP_KERNEL,
-					   get_order(CMD_BUFFER_SIZE));
+	struct amd_iommu_mem *mem = &iommu->cmd_buf_mem;
 
-	return iommu->cmd_buf ? 0 : -ENOMEM;
+	mem->modes = ALLOC_MODE_GUEST_MEM_DECRYPT;
+	mem->order = get_order(CMD_BUFFER_SIZE);
+	mem->buf = amd_iommu_get_zeroed_mem(GFP_KERNEL, mem);
+	if (!mem->buf)
+		return -ENOMEM;
+
+	return 0;
 }
 
 /*
@@ -812,9 +817,9 @@ static void iommu_enable_command_buffer(struct amd_iommu *iommu)
 {
 	u64 entry;
 
-	BUG_ON(iommu->cmd_buf == NULL);
+	BUG_ON(iommu->cmd_buf_mem.buf == NULL);
 
-	entry = iommu_virt_to_phys(iommu->cmd_buf);
+	entry = amd_iommu_mem_to_phys(&iommu->cmd_buf_mem);
 	entry |= MMIO_CMD_SIZE_512;
 
 	memcpy_toio(iommu->mmio_base + MMIO_CMD_BUF_OFFSET,
@@ -833,7 +838,7 @@ static void iommu_disable_command_buffer(struct amd_iommu *iommu)
 
 static void __init free_command_buffer(struct amd_iommu *iommu)
 {
-	iommu_free_pages(iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
+	amd_iommu_free_mem(&iommu->cmd_buf_mem);
 }
 
 void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu, gfp_t gfp,
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3df07e8ef002..2b18134f1eb5 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1078,7 +1078,7 @@ static void copy_cmd_to_buffer(struct amd_iommu *iommu,
 
 	/* Copy command to buffer */
 	tail = iommu->cmd_buf_tail;
-	target = iommu->cmd_buf + tail;
+	target = ((u8 *)iommu->cmd_buf_mem.buf) + tail;
 	memcpy(target, cmd, sizeof(*cmd));
 
 	tail = (tail + sizeof(*cmd)) % CMD_BUFFER_SIZE;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/9] iommu/amd: Convert Completion-Wait Semaphore pointer to use struct amd_iommu_mem
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
                   ` (2 preceding siblings ...)
  2024-04-30 15:24 ` [PATCH 3/9] iommu/amd: Convert Command Buffer " Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 5/9] iommu/amd: Convert Event Log " Suravee Suthikulpanit
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

And specify the memory to be 4K-aligned when running in SNP host, and
to be decrypted when running in an SEV guestso that the VMM can access
the memory successfully.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/init.c            | 17 +++++++++++------
 drivers/iommu/amd/iommu.c           |  5 +++--
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 653955ab120d..e671e9220a21 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -798,7 +798,7 @@ struct amd_iommu {
 #endif
 
 	u32 flags;
-	volatile u64 *cmd_sem;
+	struct amd_iommu_mem cmd_sem_mem;
 	atomic64_t cmd_sem_val;
 
 #ifdef CONFIG_AMD_IOMMU_DEBUGFS
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 77147dc3b79f..51861874656e 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -383,7 +383,7 @@ static void iommu_set_exclusion_range(struct amd_iommu *iommu)
 
 static void iommu_set_cwwb_range(struct amd_iommu *iommu)
 {
-	u64 start = iommu_virt_to_phys((void *)iommu->cmd_sem);
+	u64 start = amd_iommu_mem_to_phys(&iommu->cmd_sem_mem);
 	u64 entry = start & PM_ADDR_MASK;
 
 	if (!check_feature(FEATURE_SNP))
@@ -963,15 +963,20 @@ static int iommu_init_ga_log(struct amd_iommu *iommu)
 
 static int __init alloc_cwwb_sem(struct amd_iommu *iommu)
 {
-	iommu->cmd_sem = iommu_alloc_4k_pages(iommu, GFP_KERNEL, 1);
+	struct amd_iommu_mem *mem = &iommu->cmd_sem_mem;
 
-	return iommu->cmd_sem ? 0 : -ENOMEM;
+	mem->modes = ALLOC_MODE_4K | ALLOC_MODE_GUEST_MEM_DECRYPT;
+	mem->order = get_order(1);
+	mem->buf = amd_iommu_get_zeroed_mem(GFP_KERNEL, mem);
+	if (!mem->buf)
+		return -ENOMEM;
+
+	return 0;
 }
 
 static void __init free_cwwb_sem(struct amd_iommu *iommu)
 {
-	if (iommu->cmd_sem)
-		iommu_free_page((void *)iommu->cmd_sem);
+	amd_iommu_free_mem(&iommu->cmd_sem_mem);
 }
 
 static void iommu_enable_xt(struct amd_iommu *iommu)
@@ -3841,7 +3846,7 @@ int amd_iommu_snp_disable(void)
 		if (ret)
 			return ret;
 
-		ret = iommu_make_shared((void *)iommu->cmd_sem, PAGE_SIZE);
+		ret = iommu_make_shared(iommu->cmd_sem_mem.buf, PAGE_SIZE);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2b18134f1eb5..bd29d26c8d44 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1056,8 +1056,9 @@ irqreturn_t amd_iommu_int_handler(int irq, void *data)
 static int wait_on_sem(struct amd_iommu *iommu, u64 data)
 {
 	int i = 0;
+	u64 *cmd_sem = (u64 *)iommu->cmd_sem_mem.buf;
 
-	while (*iommu->cmd_sem != data && i < LOOP_TIMEOUT) {
+	while (*cmd_sem != data && i < LOOP_TIMEOUT) {
 		udelay(1);
 		i += 1;
 	}
@@ -1092,7 +1093,7 @@ static void build_completion_wait(struct iommu_cmd *cmd,
 				  struct amd_iommu *iommu,
 				  u64 data)
 {
-	u64 paddr = iommu_virt_to_phys((void *)iommu->cmd_sem);
+	u64 paddr = amd_iommu_mem_to_phys(&iommu->cmd_sem_mem);
 
 	memset(cmd, 0, sizeof(*cmd));
 	cmd->data[0] = lower_32_bits(paddr) | CMD_COMPL_WAIT_STORE_MASK;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/9] iommu/amd: Convert Event Log pointer to use struct amd_iommu_mem
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
                   ` (3 preceding siblings ...)
  2024-04-30 15:24 ` [PATCH 4/9] iommu/amd: Convert Completion-Wait Semaphore " Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 6/9] iommu/amd: Convert PPR Log pointer to use the " Suravee Suthikulpanit
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

And specify the memory to be 4K-aligned when running in SNP host, and
to be decrypted when running in an SEV guest so that the VMM can access
the memory successfully.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/init.c            | 20 +++++++++++++-------
 drivers/iommu/amd/iommu.c           |  2 +-
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index e671e9220a21..af4e9ca6414e 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -738,7 +738,7 @@ struct amd_iommu {
 	u32 cmd_buf_tail;
 
 	/* event buffer virtual address */
-	u8 *evt_buf;
+	struct amd_iommu_mem evt_buf_mem;
 
 	/* Name for event log interrupt */
 	unsigned char evt_irq_name[16];
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 51861874656e..5242a9a16946 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -860,19 +860,25 @@ void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu, gfp_t gfp,
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_event_buffer(struct amd_iommu *iommu)
 {
-	iommu->evt_buf = iommu_alloc_4k_pages(iommu, GFP_KERNEL,
-					      EVT_BUFFER_SIZE);
+	struct amd_iommu_mem *mem = &iommu->evt_buf_mem;
 
-	return iommu->evt_buf ? 0 : -ENOMEM;
+	mem->modes = ALLOC_MODE_4K | ALLOC_MODE_GUEST_MEM_DECRYPT;
+	mem->order = get_order(EVT_BUFFER_SIZE);
+	mem->buf = amd_iommu_get_zeroed_mem(GFP_KERNEL, mem);
+	if (!mem->buf)
+		return -ENOMEM;
+
+	return 0;
 }
 
 static void iommu_enable_event_buffer(struct amd_iommu *iommu)
 {
 	u64 entry;
 
-	BUG_ON(iommu->evt_buf == NULL);
+	BUG_ON(iommu->evt_buf_mem.buf == NULL);
 
-	entry = iommu_virt_to_phys(iommu->evt_buf) | EVT_LEN_MASK;
+	entry = amd_iommu_mem_to_phys(&iommu->evt_buf_mem);
+	entry |= EVT_LEN_MASK;
 
 	memcpy_toio(iommu->mmio_base + MMIO_EVT_BUF_OFFSET,
 		    &entry, sizeof(entry));
@@ -894,7 +900,7 @@ static void iommu_disable_event_buffer(struct amd_iommu *iommu)
 
 static void __init free_event_buffer(struct amd_iommu *iommu)
 {
-	iommu_free_pages(iommu->evt_buf, get_order(EVT_BUFFER_SIZE));
+	amd_iommu_free_mem(&iommu->evt_buf_mem);
 }
 
 static void free_ga_log(struct amd_iommu *iommu)
@@ -3838,7 +3844,7 @@ int amd_iommu_snp_disable(void)
 		return 0;
 
 	for_each_iommu(iommu) {
-		ret = iommu_make_shared(iommu->evt_buf, EVT_BUFFER_SIZE);
+		ret = iommu_make_shared(iommu->evt_buf_mem.buf, EVT_BUFFER_SIZE);
 		if (ret)
 			return ret;
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index bd29d26c8d44..4f95c726e139 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -890,7 +890,7 @@ static void iommu_poll_events(struct amd_iommu *iommu)
 	tail = readl(iommu->mmio_base + MMIO_EVT_TAIL_OFFSET);
 
 	while (head != tail) {
-		iommu_print_event(iommu, iommu->evt_buf + head);
+		iommu_print_event(iommu, (u8 *)iommu->evt_buf_mem.buf + head);
 		head = (head + EVENT_ENTRY_SIZE) % EVT_BUFFER_SIZE;
 	}
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/9] iommu/amd: Convert PPR Log pointer to use the struct amd_iommu_mem
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
                   ` (4 preceding siblings ...)
  2024-04-30 15:24 ` [PATCH 5/9] iommu/amd: Convert Event Log " Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 7/9] iommu/amd: Remove iommu_alloc_4k_pages() helper function Suravee Suthikulpanit
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

And specify the memory to be 4K-aligned when running in SNP host,

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/init.c            |  2 +-
 drivers/iommu/amd/ppr.c             | 22 ++++++++++++++--------
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index af4e9ca6414e..8ced34cac1db 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -744,7 +744,7 @@ struct amd_iommu {
 	unsigned char evt_irq_name[16];
 
 	/* Base of the PPR log, if present */
-	u8 *ppr_log;
+	struct amd_iommu_mem ppr_log_mem;
 
 	/* Name for PPR log interrupt */
 	unsigned char ppr_irq_name[16];
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5242a9a16946..b62d4c806155 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3848,7 +3848,7 @@ int amd_iommu_snp_disable(void)
 		if (ret)
 			return ret;
 
-		ret = iommu_make_shared(iommu->ppr_log, PPR_LOG_SIZE);
+		ret = iommu_make_shared(iommu->ppr_log_mem.buf, PPR_LOG_SIZE);
 		if (ret)
 			return ret;
 
diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
index 091423bb8aac..13983b4bf47b 100644
--- a/drivers/iommu/amd/ppr.c
+++ b/drivers/iommu/amd/ppr.c
@@ -19,21 +19,27 @@
 
 int __init amd_iommu_alloc_ppr_log(struct amd_iommu *iommu)
 {
-	iommu->ppr_log = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO,
-					      PPR_LOG_SIZE);
-	return iommu->ppr_log ? 0 : -ENOMEM;
+	struct amd_iommu_mem *mem = &iommu->ppr_log_mem;
+
+	mem->modes = ALLOC_MODE_4K;
+	mem->order = get_order(PPR_LOG_SIZE);
+	mem->buf = amd_iommu_get_zeroed_mem(GFP_KERNEL, mem);
+	if (!mem->buf)
+		return -ENOMEM;
+	return 0;
 }
 
 void amd_iommu_enable_ppr_log(struct amd_iommu *iommu)
 {
 	u64 entry;
 
-	if (iommu->ppr_log == NULL)
+	if (iommu->ppr_log_mem.buf == NULL)
 		return;
 
 	iommu_feature_enable(iommu, CONTROL_PPR_EN);
 
-	entry = iommu_virt_to_phys(iommu->ppr_log) | PPR_LOG_SIZE_512;
+	entry = amd_iommu_mem_to_phys(&iommu->ppr_log_mem);
+	entry |= PPR_LOG_SIZE_512;
 
 	memcpy_toio(iommu->mmio_base + MMIO_PPR_LOG_OFFSET,
 		    &entry, sizeof(entry));
@@ -48,7 +54,7 @@ void amd_iommu_enable_ppr_log(struct amd_iommu *iommu)
 
 void __init amd_iommu_free_ppr_log(struct amd_iommu *iommu)
 {
-	iommu_free_pages(iommu->ppr_log, get_order(PPR_LOG_SIZE));
+	amd_iommu_free_mem(&iommu->ppr_log_mem);
 }
 
 /*
@@ -163,7 +169,7 @@ void amd_iommu_poll_ppr_log(struct amd_iommu *iommu)
 {
 	u32 head, tail;
 
-	if (iommu->ppr_log == NULL)
+	if (iommu->ppr_log_mem.buf == NULL)
 		return;
 
 	head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
@@ -174,7 +180,7 @@ void amd_iommu_poll_ppr_log(struct amd_iommu *iommu)
 		u64 entry[2];
 		int i;
 
-		raw = (u64 *)(iommu->ppr_log + head);
+		raw = (u64 *)((u8 *)iommu->ppr_log_mem.buf) + head;
 
 		/*
 		 * Hardware bug: Interrupt may arrive before the entry is
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 7/9] iommu/amd: Remove iommu_alloc_4k_pages() helper function
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
                   ` (5 preceding siblings ...)
  2024-04-30 15:24 ` [PATCH 6/9] iommu/amd: Convert PPR Log pointer to use the " Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 8/9] iommu/amd: Decrypt interrupt remapping table for AMD IOMMU emulation in SEV guest Suravee Suthikulpanit
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

Since it is replaced with the amd_iommu_get_zeroed_mem().

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h |  2 --
 drivers/iommu/amd/init.c      | 16 ----------------
 2 files changed, 18 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index ccd9003813ac..1ca7e1e389c4 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -25,8 +25,6 @@ void amd_iommu_restart_ga_log(struct amd_iommu *iommu);
 void amd_iommu_restart_ppr_log(struct amd_iommu *iommu);
 void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
 void iommu_feature_enable(struct amd_iommu *iommu, u8 bit);
-void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
-				  gfp_t gfp, size_t size);
 
 void *amd_iommu_get_zeroed_mem(gfp_t gfp_mask, struct amd_iommu_mem *mem);
 void *amd_iommu_get_zeroed_mem_node(int nid, gfp_t gfp_mask,
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b62d4c806155..1b74a31b4337 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -841,22 +841,6 @@ static void __init free_command_buffer(struct amd_iommu *iommu)
 	amd_iommu_free_mem(&iommu->cmd_buf_mem);
 }
 
-void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu, gfp_t gfp,
-				  size_t size)
-{
-	int order = get_order(size);
-	void *buf = iommu_alloc_pages(gfp, order);
-
-	if (buf &&
-	    check_feature(FEATURE_SNP) &&
-	    set_memory_4k((unsigned long)buf, (1 << order))) {
-		iommu_free_pages(buf, order);
-		buf = NULL;
-	}
-
-	return buf;
-}
-
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_event_buffer(struct amd_iommu *iommu)
 {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 8/9] iommu/amd: Decrypt interrupt remapping table for AMD IOMMU emulation in SEV guest
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
                   ` (6 preceding siblings ...)
  2024-04-30 15:24 ` [PATCH 7/9] iommu/amd: Remove iommu_alloc_4k_pages() helper function Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-04-30 15:24 ` [PATCH 9/9] iommu/amd: Set default domain to IDENTITY_DOMAIN when running " Suravee Suthikulpanit
  2024-05-13 20:05 ` [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Michael Kelley
  9 siblings, 0 replies; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

The interrupt remapping table must be decrypted so that the VMM can access
the memory to emulate interrupt remapping. However, the amd iommu driver
currently allocate the table with kmem_cache mainly to enforce 128-byte
memory alignment as specified in the AMD IOMMU spec.

For SEV guest, memory encryption is done on a page basis. The driver must
be modified to allocate the table using page-aligned memory, which still
satisfies the original 128-byte alignment.

In addition, the table is setup per-device, which can be allocated with
NUMA-aware page to help reduce IRTE access latency.

Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  3 +-
 drivers/iommu/amd/init.c            | 31 +++++++-----------
 drivers/iommu/amd/iommu.c           | 50 ++++++++++++++++-------------
 3 files changed, 41 insertions(+), 43 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 8ced34cac1db..980fbb9bae39 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -309,7 +309,6 @@
  * AMD IOMMU hardware only support 512 IRTEs despite
  * the architectural limitation of 2048 entries.
  */
-#define DTE_INTTAB_ALIGNMENT    128
 #define DTE_INTTABLEN_VALUE     9ULL
 #define DTE_INTTABLEN           (DTE_INTTABLEN_VALUE << 1)
 #define DTE_INTTABLEN_MASK      (0xfULL << 1)
@@ -497,7 +496,7 @@ struct amd_iommu_mem {
 struct irq_remap_table {
 	raw_spinlock_t lock;
 	unsigned min_index;
-	u32 *table;
+	struct amd_iommu_mem mem;
 };
 
 /* Interrupt remapping feature used? */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 1b74a31b4337..b3ff89952c7f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -697,6 +697,17 @@ static inline int __init alloc_irq_lookup_table(struct amd_iommu_pci_seg *pci_se
 
 static inline void free_irq_lookup_table(struct amd_iommu_pci_seg *pci_seg)
 {
+	int i;
+	struct irq_remap_table *table;
+
+	for (i = 0 ; i <= pci_seg->last_bdf; ++i) {
+		table = pci_seg->irq_lookup_table[i];
+		if (table) {
+			amd_iommu_free_mem(&table->mem);
+			kfree(table);
+		}
+	}
+
 	kmemleak_free(pci_seg->irq_lookup_table);
 	iommu_free_pages(pci_seg->irq_lookup_table,
 			 get_order(pci_seg->rlookup_table_size));
@@ -2923,9 +2934,6 @@ static struct syscore_ops amd_iommu_syscore_ops = {
 
 static void __init free_iommu_resources(void)
 {
-	kmem_cache_destroy(amd_iommu_irq_cache);
-	amd_iommu_irq_cache = NULL;
-
 	free_iommu_all();
 	free_pci_segments();
 }
@@ -3026,7 +3034,7 @@ static void __init ivinfo_init(void *ivrs)
 static int __init early_amd_iommu_init(void)
 {
 	struct acpi_table_header *ivrs_base;
-	int remap_cache_sz, ret;
+	int ret;
 	acpi_status status;
 
 	if (!amd_iommu_detected)
@@ -3090,21 +3098,6 @@ static int __init early_amd_iommu_init(void)
 
 	if (amd_iommu_irq_remap) {
 		struct amd_iommu_pci_seg *pci_seg;
-		/*
-		 * Interrupt remapping enabled, create kmem_cache for the
-		 * remapping tables.
-		 */
-		ret = -ENOMEM;
-		if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
-			remap_cache_sz = MAX_IRQS_PER_TABLE * sizeof(u32);
-		else
-			remap_cache_sz = MAX_IRQS_PER_TABLE * (sizeof(u64) * 2);
-		amd_iommu_irq_cache = kmem_cache_create("irq_remap_cache",
-							remap_cache_sz,
-							DTE_INTTAB_ALIGNMENT,
-							0, NULL);
-		if (!amd_iommu_irq_cache)
-			goto out;
 
 		for_each_pci_segment(pci_seg) {
 			if (alloc_irq_lookup_table(pci_seg))
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4f95c726e139..f98a10b7925b 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -73,8 +73,6 @@ struct iommu_cmd {
 	u32 data[4];
 };
 
-struct kmem_cache *amd_iommu_irq_cache;
-
 static void detach_device(struct device *dev);
 
 static void set_dte_entry(struct amd_iommu *iommu,
@@ -2998,7 +2996,7 @@ static void set_dte_irq_entry(struct amd_iommu *iommu, u16 devid,
 
 	dte	= dev_table[devid].data[2];
 	dte	&= ~DTE_IRQ_PHYS_ADDR_MASK;
-	dte	|= iommu_virt_to_phys(table->table);
+	dte	|= amd_iommu_mem_to_phys(&table->mem);
 	dte	|= DTE_IRQ_REMAP_INTCTL;
 	dte	|= DTE_INTTABLEN;
 	dte	|= DTE_IRQ_REMAP_ENABLE;
@@ -3024,27 +3022,35 @@ static struct irq_remap_table *get_irq_table(struct amd_iommu *iommu, u16 devid)
 	return table;
 }
 
-static struct irq_remap_table *__alloc_irq_table(void)
+static size_t get_irq_table_size(void)
+{
+	if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
+		return (MAX_IRQS_PER_TABLE * sizeof(u32));
+	else
+		return (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2));
+}
+
+static struct irq_remap_table *__alloc_irq_table(struct amd_iommu *iommu)
 {
+	struct amd_iommu_mem *mem;
 	struct irq_remap_table *table;
+	int order = get_order(get_irq_table_size());
+	int nid = (iommu && iommu->dev) ? dev_to_node(&iommu->dev->dev) : NUMA_NO_NODE;
 
 	table = kzalloc(sizeof(*table), GFP_KERNEL);
 	if (!table)
 		return NULL;
 
-	table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_KERNEL);
-	if (!table->table) {
+	mem = &table->mem;
+	mem->modes = ALLOC_MODE_GUEST_MEM_DECRYPT;
+	mem->order = order;
+	mem->buf = amd_iommu_get_zeroed_mem_node(nid, GFP_KERNEL, mem);
+	if (!mem->buf) {
 		kfree(table);
 		return NULL;
 	}
 	raw_spin_lock_init(&table->lock);
 
-	if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
-		memset(table->table, 0,
-		       MAX_IRQS_PER_TABLE * sizeof(u32));
-	else
-		memset(table->table, 0,
-		       (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
 	return table;
 }
 
@@ -3101,7 +3107,7 @@ static struct irq_remap_table *alloc_irq_table(struct amd_iommu *iommu,
 	spin_unlock_irqrestore(&iommu_table_lock, flags);
 
 	/* Nothing there yet, allocate new irq remapping table */
-	new_table = __alloc_irq_table();
+	new_table = __alloc_irq_table(iommu);
 	if (!new_table)
 		return NULL;
 
@@ -3136,7 +3142,7 @@ static struct irq_remap_table *alloc_irq_table(struct amd_iommu *iommu,
 	spin_unlock_irqrestore(&iommu_table_lock, flags);
 
 	if (new_table) {
-		kmem_cache_free(amd_iommu_irq_cache, new_table->table);
+		amd_iommu_free_mem(&new_table->mem);
 		kfree(new_table);
 	}
 	return table;
@@ -3202,7 +3208,7 @@ static int __modify_irte_ga(struct amd_iommu *iommu, u16 devid, int index,
 
 	raw_spin_lock_irqsave(&table->lock, flags);
 
-	entry = (struct irte_ga *)table->table;
+	entry = (struct irte_ga *)table->mem.buf;
 	entry = &entry[index];
 
 	/*
@@ -3244,7 +3250,7 @@ static int modify_irte(struct amd_iommu *iommu,
 		return -ENOMEM;
 
 	raw_spin_lock_irqsave(&table->lock, flags);
-	table->table[index] = irte->val;
+	((u32 *)table->mem.buf)[index] = irte->val;
 	raw_spin_unlock_irqrestore(&table->lock, flags);
 
 	iommu_flush_irt_and_complete(iommu, devid);
@@ -3358,12 +3364,12 @@ static void irte_ga_set_affinity(struct amd_iommu *iommu, void *entry, u16 devid
 #define IRTE_ALLOCATED (~1U)
 static void irte_set_allocated(struct irq_remap_table *table, int index)
 {
-	table->table[index] = IRTE_ALLOCATED;
+	((u32 *)table->mem.buf)[index] = IRTE_ALLOCATED;
 }
 
 static void irte_ga_set_allocated(struct irq_remap_table *table, int index)
 {
-	struct irte_ga *ptr = (struct irte_ga *)table->table;
+	struct irte_ga *ptr = (struct irte_ga *)table->mem.buf;
 	struct irte_ga *irte = &ptr[index];
 
 	memset(&irte->lo.val, 0, sizeof(u64));
@@ -3373,7 +3379,7 @@ static void irte_ga_set_allocated(struct irq_remap_table *table, int index)
 
 static bool irte_is_allocated(struct irq_remap_table *table, int index)
 {
-	union irte *ptr = (union irte *)table->table;
+	union irte *ptr = (union irte *)table->mem.buf;
 	union irte *irte = &ptr[index];
 
 	return irte->val != 0;
@@ -3381,7 +3387,7 @@ static bool irte_is_allocated(struct irq_remap_table *table, int index)
 
 static bool irte_ga_is_allocated(struct irq_remap_table *table, int index)
 {
-	struct irte_ga *ptr = (struct irte_ga *)table->table;
+	struct irte_ga *ptr = (struct irte_ga *)table->mem.buf;
 	struct irte_ga *irte = &ptr[index];
 
 	return irte->hi.fields.vector != 0;
@@ -3389,12 +3395,12 @@ static bool irte_ga_is_allocated(struct irq_remap_table *table, int index)
 
 static void irte_clear_allocated(struct irq_remap_table *table, int index)
 {
-	table->table[index] = 0;
+	((u32 *)table->mem.buf)[index] = 0;
 }
 
 static void irte_ga_clear_allocated(struct irq_remap_table *table, int index)
 {
-	struct irte_ga *ptr = (struct irte_ga *)table->table;
+	struct irte_ga *ptr = (struct irte_ga *)table->mem.buf;
 	struct irte_ga *irte = &ptr[index];
 
 	memset(&irte->lo.val, 0, sizeof(u64));
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 9/9] iommu/amd: Set default domain to IDENTITY_DOMAIN when running in SEV guest
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
                   ` (7 preceding siblings ...)
  2024-04-30 15:24 ` [PATCH 8/9] iommu/amd: Decrypt interrupt remapping table for AMD IOMMU emulation in SEV guest Suravee Suthikulpanit
@ 2024-04-30 15:24 ` Suravee Suthikulpanit
  2024-05-01 14:17   ` Jason Gunthorpe
  2024-05-13 20:05 ` [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Michael Kelley
  9 siblings, 1 reply; 19+ messages in thread
From: Suravee Suthikulpanit @ 2024-04-30 15:24 UTC (permalink / raw)
  To: linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes,
	Suravee Suthikulpanit

Since SEV guest depends on the unencrypted swiotlb bounce buffer
to support DMA, the guest AMD IOMMU driver must be force to setup to
pass-through mode.

Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/init.c  | 15 +++++++++++++++
 drivers/iommu/amd/iommu.c |  6 ++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b3ff89952c7f..1dccf030f674 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3179,6 +3179,20 @@ static bool __init detect_ivrs(void)
 	return true;
 }
 
+static void iommu_sev_guest_enable(void)
+{
+	/*
+	 * Force IOMMU default domain to pass-through for
+	 * SEV guest since we cannot support DMA-remapping.
+	 * Note: This check must be done after IOMMU_ENABLED state.
+	 */
+	if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+		return;
+
+	pr_info("Force pass-through for SEV guest\n");
+	iommu_set_default_passthrough(false);
+}
+
 static void iommu_snp_enable(void)
 {
 #ifdef CONFIG_KVM_AMD_SEV
@@ -3247,6 +3261,7 @@ static int __init state_next(void)
 		break;
 	case IOMMU_ENABLED:
 		register_syscore_ops(&amd_iommu_syscore_ops);
+		iommu_sev_guest_enable();
 		iommu_snp_enable();
 		ret = amd_iommu_init_pci();
 		init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index f98a10b7925b..c985d23c8528 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2876,6 +2876,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
 		return IOMMU_DOMAIN_IDENTITY;
 	}
 
+	/*
+	 * Force identity map for SEV guest.
+	 */
+	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+		return IOMMU_DOMAIN_IDENTITY;
+
 	return 0;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 9/9] iommu/amd: Set default domain to IDENTITY_DOMAIN when running in SEV guest
  2024-04-30 15:24 ` [PATCH 9/9] iommu/amd: Set default domain to IDENTITY_DOMAIN when running " Suravee Suthikulpanit
@ 2024-05-01 14:17   ` Jason Gunthorpe
  2024-05-13 12:17     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2024-05-01 14:17 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, thomas.lendacky, vasant.hegde,
	michael.roth, jon.grimm, rientjes

On Tue, Apr 30, 2024 at 03:24:30PM +0000, Suravee Suthikulpanit wrote:
> Since SEV guest depends on the unencrypted swiotlb bounce buffer
> to support DMA, the guest AMD IOMMU driver must be force to setup to
> pass-through mode.

You should block the creation of paging domains as well if the HW
can't support them.

But, is there actually a functional problem here? Doesn't swiotlb work
OK with iommu even with the encrypted memory cases? What is missing if
not?

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory
  2024-04-30 15:24 ` [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory Suravee Suthikulpanit
@ 2024-05-01 16:17   ` Jason Gunthorpe
  2024-05-13 18:59     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2024-05-01 16:17 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, thomas.lendacky, vasant.hegde,
	michael.roth, jon.grimm, rientjes

On Tue, Apr 30, 2024 at 03:24:22PM +0000, Suravee Suthikulpanit wrote:
> Depending on the modes of operation, certain AMD IOMMU data structures are
> allocated with constraints. For example:
> 
>  * Some buffers must be 4K-aligned when running in SNP-enabled host
> 
>  * To support AMD IOMMU emulation in an SEV guest, some data structures
>    cannot be encrypted so that the VMM can access the memory successfully.

Uh, this seems like a really bad idea. The VM's integrity strongly
depends on the correct function of the HW. If the IOMMU datastructures
are not protected then the whole thing is not secure.

For instance allowing hostile VMs to manipulate the DTE, or interfere
with the command queue, destroys any possibility to have secure DMA.

Is this some precursor to implementing a secure iommu where the data
structures will remain encrypted? What is even the point of putting a
non-secure viommu into a SEV guest anyhow?

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 9/9] iommu/amd: Set default domain to IDENTITY_DOMAIN when running in SEV guest
  2024-05-01 14:17   ` Jason Gunthorpe
@ 2024-05-13 12:17     ` Suthikulpanit, Suravee
  2024-05-13 23:10       ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Suthikulpanit, Suravee @ 2024-05-13 12:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, iommu, joro, thomas.lendacky, vasant.hegde,
	michael.roth, jon.grimm, rientjes

Jason,

On 5/1/2024 9:17 PM, Jason Gunthorpe wrote:
> On Tue, Apr 30, 2024 at 03:24:30PM +0000, Suravee Suthikulpanit wrote:
>> Since SEV guest depends on the unencrypted swiotlb bounce buffer
>> to support DMA, the guest AMD IOMMU driver must be force to setup to
>> pass-through mode.
> 
> You should block the creation of paging domains as well if the HW
> can't support them.

Sure, I'll add a logic to check and block domain creation.

> But, is there actually a functional problem here? Doesn't swiotlb work
> OK with iommu even with the encrypted memory cases? What is missing if
> not?

Currently, SEV guest is default to use SWIOTLB. This does not have any 
issues.

However, in order to support vcpus w/ x2APIC ID (> 255) in a guest, it 
requires guest interrupt remapping support. This is achieved by adding 
QEMU-emulated AMD or Intel vIOMMU models.

In case of AMD IOMMU, depending on the CONFIG_IOMMU_DEFAULT_PASSTHROUGH 
kernel config, it would default to setup the v1 table for DMA remapping, 
which is not supported in the SEV guest (since it requires to use SWIOTLB).

This patch is needed to avoid the need to have 
CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y, or specifying kernel command-line 
option iommu=pt in the guest.

Thanks,
Suravee

> Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory
  2024-05-01 16:17   ` Jason Gunthorpe
@ 2024-05-13 18:59     ` Suthikulpanit, Suravee
  2024-05-13 23:13       ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Suthikulpanit, Suravee @ 2024-05-13 18:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, iommu, joro, thomas.lendacky, vasant.hegde,
	michael.roth, jon.grimm, rientjes

Jason

On 5/1/2024 11:17 PM, Jason Gunthorpe wrote:
> On Tue, Apr 30, 2024 at 03:24:22PM +0000, Suravee Suthikulpanit wrote:
>> Depending on the modes of operation, certain AMD IOMMU data structures are
>> allocated with constraints. For example:
>>
>>   * Some buffers must be 4K-aligned when running in SNP-enabled host
>>
>>   * To support AMD IOMMU emulation in an SEV guest, some data structures
>>     cannot be encrypted so that the VMM can access the memory successfully.
> 
> Uh, this seems like a really bad idea. The VM's integrity strongly
> depends on the correct function of the HW. If the IOMMU datastructures
> are not protected then the whole thing is not secure.
> 
> For instance allowing hostile VMs to manipulate the DTE, or interfere
> with the command queue, destroys any possibility to have secure DMA.

Currently, we have already set the area used for guest SWIOTLB region as 
shared memory to support DMA in SEV guest. Here, we are setting 
additional guest IOMMU data structures as shared:

* Device Table
* Command Buffer
* Completion-Wait Semaphore Buffer
* Per-device Interrupt Remapping Table

, which are necessary for QEMU interrupt remapping emulation. Therefore,
we are not making the VM any less secure from device perspective.

> Is this some precursor to implementing a secure iommu where the data
> structures will remain encrypted? 

Yes, the is precursor to secure vIOMMU support in the guest.

> What is even the point of putting a non-secure viommu into a SEV guest anyhow?

This is needed to provide interrupt remapping support for vcpu with 
x2APIC ID (> 255) in the guest, which is already available w/ 
QEMU-emulated AMD vIOMMU.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel
  2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
                   ` (8 preceding siblings ...)
  2024-04-30 15:24 ` [PATCH 9/9] iommu/amd: Set default domain to IDENTITY_DOMAIN when running " Suravee Suthikulpanit
@ 2024-05-13 20:05 ` Michael Kelley
  2024-05-14 19:02   ` Suthikulpanit, Suravee
  9 siblings, 1 reply; 19+ messages in thread
From: Michael Kelley @ 2024-05-13 20:05 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Sent: Tuesday, April 30, 2024 8:24 AM
> 
> To boot a VM w/ x2APIC ID > 255, guest interrupt remapping emulation
> is required.

Top-level question:  Is there a reason the MSI extended destination ID mechanism is
insufficient to avoid the need for interrupt remapping?  (see function pointer
"msi_ext_dest_id").  I'm unclear on whether it is or not. If it is not sufficient, perhaps
you could explain why.

> For SEV guest, this can be achieved using an emulated
> AMD IOMMU.

You've used "SEV" here and in several other places.  I think you intend this to be
the more specific "SEV-SNP", and exclude SEV and SEV-ES. For avoid any confusion,
I'd suggest using "SEV-SNP" throughout if that's what you mean.

Michael

> 
> In order to support emulated AMD IOMMU in SEV guest, memory pages used
> by the guest IOMMU data structures must be in decrypted mode. Also GPAs
> for these pages must not have the memory encryption bit set.
> 
> Testing:
>   - Booting Linux SEV guest w/ 512 vcpus w/ QEMU emulated amd-iommu with
>     qemu-system-x86_64 option: -device amd-iommu,intremap=on,xtsup=on
>     (emulated devices only for now).
> 
> GIT repos:
> * https://github.com/AMDESE/linux-iommu/tree/iommu_next_sev-iommu-v1
> 
> Thanks,
> Suravee
> 
> Suravee Suthikulpanit (9):
>   iommu/amd: Introduce helper functions for managing IOMMU memory
>   iommu/amd: Convert Device Table pointer to use struct amd_iommu_mem
>   iommu/amd: Convert Command Buffer pointer to use struct amd_iommu_mem
>   iommu/amd: Convert Completion-Wait Semaphore pointer to use struct
>     amd_iommu_mem
>   iommu/amd: Convert Event Log pointer to use struct amd_iommu_mem
>   iommu/amd: Convert PPR Log pointer to use the struct amd_iommu_mem
>   iommu/amd: Remove iommu_alloc_4k_pages() helper function
>   iommu/amd: Decrypt interrupt remapping table for AMD IOMMU emulation
>     in SEV guest
>   iommu/amd: Set default domain to IDENTITY_DOMAIN when running in SEV
>     guest
> 
>  drivers/iommu/amd/amd_iommu.h       |  31 +++++-
>  drivers/iommu/amd/amd_iommu_types.h |  28 ++++--
>  drivers/iommu/amd/init.c            | 144 +++++++++++++++-------------
>  drivers/iommu/amd/iommu.c           | 133 +++++++++++++++++++------
>  drivers/iommu/amd/ppr.c             |  22 +++--
>  5 files changed, 246 insertions(+), 112 deletions(-)
> 
> --
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 9/9] iommu/amd: Set default domain to IDENTITY_DOMAIN when running in SEV guest
  2024-05-13 12:17     ` Suthikulpanit, Suravee
@ 2024-05-13 23:10       ` Jason Gunthorpe
  0 siblings, 0 replies; 19+ messages in thread
From: Jason Gunthorpe @ 2024-05-13 23:10 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: linux-kernel, iommu, joro, thomas.lendacky, vasant.hegde,
	michael.roth, jon.grimm, rientjes

On Mon, May 13, 2024 at 07:17:49PM +0700, Suthikulpanit, Suravee wrote:
> Jason,
> 
> On 5/1/2024 9:17 PM, Jason Gunthorpe wrote:
> > On Tue, Apr 30, 2024 at 03:24:30PM +0000, Suravee Suthikulpanit wrote:
> > > Since SEV guest depends on the unencrypted swiotlb bounce buffer
> > > to support DMA, the guest AMD IOMMU driver must be force to setup to
> > > pass-through mode.
> > 
> > You should block the creation of paging domains as well if the HW
> > can't support them.
> 
> Sure, I'll add a logic to check and block domain creation.
> 
> > But, is there actually a functional problem here? Doesn't swiotlb work
> > OK with iommu even with the encrypted memory cases? What is missing if
> > not?
> 
> Currently, SEV guest is default to use SWIOTLB. This does not have any
> issues.
> 
> However, in order to support vcpus w/ x2APIC ID (> 255) in a guest, it
> requires guest interrupt remapping support. This is achieved by adding
> QEMU-emulated AMD or Intel vIOMMU models.
> 
> In case of AMD IOMMU, depending on the CONFIG_IOMMU_DEFAULT_PASSTHROUGH
> kernel config, it would default to setup the v1 table for DMA remapping,
> which is not supported in the SEV guest (since it requires to use SWIOTLB).

But this just means you are inserting an iommu hw that is totally
non-working. I'd expect that the iommu continues to work correctly but
cannot access any encrypted pages..

If it is unusable do you even need to allow it to probe to any
drivers? Nothing works so there isn't much point to binding devices to
the iommu..?

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory
  2024-05-13 18:59     ` Suthikulpanit, Suravee
@ 2024-05-13 23:13       ` Jason Gunthorpe
  0 siblings, 0 replies; 19+ messages in thread
From: Jason Gunthorpe @ 2024-05-13 23:13 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: linux-kernel, iommu, joro, thomas.lendacky, vasant.hegde,
	michael.roth, jon.grimm, rientjes

On Tue, May 14, 2024 at 01:59:33AM +0700, Suthikulpanit, Suravee wrote:
> Jason
> 
> On 5/1/2024 11:17 PM, Jason Gunthorpe wrote:
> > On Tue, Apr 30, 2024 at 03:24:22PM +0000, Suravee Suthikulpanit wrote:
> > > Depending on the modes of operation, certain AMD IOMMU data structures are
> > > allocated with constraints. For example:
> > > 
> > >   * Some buffers must be 4K-aligned when running in SNP-enabled host
> > > 
> > >   * To support AMD IOMMU emulation in an SEV guest, some data structures
> > >     cannot be encrypted so that the VMM can access the memory successfully.
> > 
> > Uh, this seems like a really bad idea. The VM's integrity strongly
> > depends on the correct function of the HW. If the IOMMU datastructures
> > are not protected then the whole thing is not secure.
> > 
> > For instance allowing hostile VMs to manipulate the DTE, or interfere
> > with the command queue, destroys any possibility to have secure DMA.
> 
> Currently, we have already set the area used for guest SWIOTLB region as
> shared memory to support DMA in SEV guest. Here, we are setting additional
> guest IOMMU data structures as shared:
> 
> * Device Table
> * Command Buffer
> * Completion-Wait Semaphore Buffer
> * Per-device Interrupt Remapping Table

And if a hostile VMM starts messing with this is everything going to
hold up? Or will you get crashes and security bugs?

I don't think it is a good idea to put things in non-secure memory
without also doing a full security audit.

> > Is this some precursor to implementing a secure iommu where the data
> > structures will remain encrypted?
> 
> Yes, the is precursor to secure vIOMMU support in the guest.

How does the guest tell if the vIOMMU is secure, and shouldn't you in
this patch refuse to load on a secure vIOMMU at all?

Maybe it would be a better idea to have a mini irq side only driver
that is audited and safe to use non-secure memory than trying to
repurpose the entire complex driver?

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel
  2024-05-13 20:05 ` [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Michael Kelley
@ 2024-05-14 19:02   ` Suthikulpanit, Suravee
  2024-05-14 21:34     ` Michael Kelley
  0 siblings, 1 reply; 19+ messages in thread
From: Suthikulpanit, Suravee @ 2024-05-14 19:02 UTC (permalink / raw)
  To: Michael Kelley, linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes



On 5/14/2024 3:05 AM, Michael Kelley wrote:
> From: Suravee Suthikulpanit<suravee.suthikulpanit@amd.com>  Sent: Tuesday, April 30, 2024 8:24 AM
>> To boot a VM w/ x2APIC ID > 255, guest interrupt remapping emulation
>> is required.
>
> Top-level question:  Is there a reason the MSI extended destination ID mechanism is
> insufficient to avoid the need for interrupt remapping?  (see function pointer
> "msi_ext_dest_id").  I'm unclear on whether it is or not. If it is not sufficient, perhaps
> you could explain why.

In case of running a Linux VM w/ QEMU/KVM as hypervisor, the 
qemu-system-x86_64 option kvm-msi-ext-dest-id=on would allow booting the 
VM w/ x2APIC ID > 255. However, for other hypervisor, it might not 
support this feature.

>> For SEV guest, this can be achieved using an emulated
>> AMD IOMMU.
> You've used "SEV" here and in several other places.  I think you intend this to be
> the more specific "SEV-SNP", and exclude SEV and SEV-ES. For avoid any confusion,
> I'd suggest using "SEV-SNP" throughout if that's what you mean.

Actually, The CC_ATTR_GUEST_MEM_ENCRYPT attribute is true for all SEV 
guests, so this will enable IOMMU emulation for all SEV guests.

Thanks,
Suravee

> Michael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel
  2024-05-14 19:02   ` Suthikulpanit, Suravee
@ 2024-05-14 21:34     ` Michael Kelley
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Kelley @ 2024-05-14 21:34 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu, joro
  Cc: thomas.lendacky, vasant.hegde, michael.roth, jon.grimm, rientjes

From: Suthikulpanit, Suravee <suravee.suthikulpanit@amd.com> Sent: Tuesday, May 14, 2024 12:02 PM
> 
> On 5/14/2024 3:05 AM, Michael Kelley wrote:
> > From: Suravee Suthikulpanit<suravee.suthikulpanit@amd.com>  Sent: Tuesday, April
> 30, 2024 8:24 AM
> >> To boot a VM w/ x2APIC ID > 255, guest interrupt remapping emulation
> >> is required.
> >
> > Top-level question:  Is there a reason the MSI extended destination ID mechanism is
> > insufficient to avoid the need for interrupt remapping?  (see function pointer
> > "msi_ext_dest_id").  I'm unclear on whether it is or not. If it is not sufficient, perhaps
> > you could explain why.
> 
> In case of running a Linux VM w/ QEMU/KVM as hypervisor, the
> qemu-system-x86_64 option kvm-msi-ext-dest-id=on would allow booting the
> VM w/ x2APIC ID > 255. However, for other hypervisor, it might not
> support this feature.

Two observations:
1) KVM, Hyper-V, and Xen all have support for MSI extended destination ID.
Is it reasonable to make it a requirement that any hypervisor supporting
SEV/SEV-ES/SEV-SNP guests must also support MSI extended destination ID
if the guest is to run with more than 255 vCPUs?  With that requirement,
you don't need any interrupt remapping or IOMMU emulation.

2) It would help to be more explicit about the basic premise of this patch
set.  I *think* the idea is that KVM/QEMU already supports an emulated
AMD IOMMU in a normal VM.  That emulation depends on QEMU having
access to data structures in guest memory (just like a physical AMD IOMMU
would).  But with SEV/SEV-ES/SEV-SNP, guest memory is encrypted and
QEMU doesn't have the access.  So this patch set changes the IOMMU
related data structures to be allocated in decrypted guest memory.  Of
course, as Jason has pointed out, this would seem to open the guest to
security risks from a compromised hypervisor/VMM, negating what
SEV* is trying to provide as guest confidentiality.

My #1 above might be a lot less risky from a security perspective.

> 
> >> For SEV guest, this can be achieved using an emulated
> >> AMD IOMMU.
> > You've used "SEV" here and in several other places.  I think you intend this to be
> > the more specific "SEV-SNP", and exclude SEV and SEV-ES. For avoid any confusion,
> > I'd suggest using "SEV-SNP" throughout if that's what you mean.
> 
> Actually, The CC_ATTR_GUEST_MEM_ENCRYPT attribute is true for all SEV
> guests, so this will enable IOMMU emulation for all SEV guests.
> 

If that's the case, I'd suggest updating the Subject: line of this cover letter
to not be specific to SEV-SNP, and to call out in the text that's your intent
for the patch set to work for all SEV* flavors.  Also, the commit messages
throughout the patch set sometimes reference "SNP" and sometimes "SEV".
That's confusing if your intent is applicability to all SEV* flavors.

Michael

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2024-05-14 21:35 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-30 15:24 [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Suravee Suthikulpanit
2024-04-30 15:24 ` [PATCH 1/9] iommu/amd: Introduce helper functions for managing IOMMU memory Suravee Suthikulpanit
2024-05-01 16:17   ` Jason Gunthorpe
2024-05-13 18:59     ` Suthikulpanit, Suravee
2024-05-13 23:13       ` Jason Gunthorpe
2024-04-30 15:24 ` [PATCH 2/9] iommu/amd: Convert Device Table pointer to use struct amd_iommu_mem Suravee Suthikulpanit
2024-04-30 15:24 ` [PATCH 3/9] iommu/amd: Convert Command Buffer " Suravee Suthikulpanit
2024-04-30 15:24 ` [PATCH 4/9] iommu/amd: Convert Completion-Wait Semaphore " Suravee Suthikulpanit
2024-04-30 15:24 ` [PATCH 5/9] iommu/amd: Convert Event Log " Suravee Suthikulpanit
2024-04-30 15:24 ` [PATCH 6/9] iommu/amd: Convert PPR Log pointer to use the " Suravee Suthikulpanit
2024-04-30 15:24 ` [PATCH 7/9] iommu/amd: Remove iommu_alloc_4k_pages() helper function Suravee Suthikulpanit
2024-04-30 15:24 ` [PATCH 8/9] iommu/amd: Decrypt interrupt remapping table for AMD IOMMU emulation in SEV guest Suravee Suthikulpanit
2024-04-30 15:24 ` [PATCH 9/9] iommu/amd: Set default domain to IDENTITY_DOMAIN when running " Suravee Suthikulpanit
2024-05-01 14:17   ` Jason Gunthorpe
2024-05-13 12:17     ` Suthikulpanit, Suravee
2024-05-13 23:10       ` Jason Gunthorpe
2024-05-13 20:05 ` [PATCH 0/9] iommu/amd: Add AMD IOMMU emulation support for SEV-SNP guest kernel Michael Kelley
2024-05-14 19:02   ` Suthikulpanit, Suravee
2024-05-14 21:34     ` Michael Kelley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).