qemu-devel.nongnu.org archive mirror
* [PATCH ats_vtd v2 00/25] ATS support for VT-d
@ 2024-05-15  7:14 CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 01/25] intel_iommu: fix FRCD construction macro CLEMENT MATHIEU--DRIF
                   ` (24 more replies)
  0 siblings, 25 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

This series belongs to a list of series that add SVM support for VT-d.

As a starting point, we use the series called 'intel_iommu: Enable stage-1 translation' (rfc2) by Zhenzhong Duan and Yi Liu.

Here we focus on the implementation of ATS support in the IOMMU and on a PCI-level
API for ATS to be used by virtual devices.

This work is based on the VT-d specification version 4.1 (March 2023).
Here is a link to a GitHub repository where you can find the following elements:
    - QEMU with all the patches for SVM
        - ATS
        - PRI
        - Device IOTLB invalidations
        - Requests with already translated addresses
    - A demo device
    - A simple driver for the demo device
    - A userspace program (for testing and demonstration purposes)

https://github.com/BullSequana/Qemu-in-guest-SVM-demo

v2
    - handle huge pages better by detecting the page table level at which the translation errors occur
    - Changes after review by Zhenzhong Duan:
        - Set the access bit after checking permissions
        - helper for PASID and ATS: make the commit message more accurate ('present' replaced with 'enabled')
        - pcie_pasid_init: add PCI_PASID_CAP_WIDTH_SHIFT and use it instead of PCI_EXT_CAP_PASID_SIZEOF for shifting the pasid width when preparing the capability register
        - pci: do not check pci_bus_bypass_iommu after calling pci_device_get_iommu_bus_devfn
        - do not alter formatting of IOMMUTLBEntry declaration
        - vtd_iova_fl_check_canonical: directly use s->aw_bits instead of aw for the sake of clarity

Clément Mathieu--Drif (25):
  intel_iommu: fix FRCD construction macro.
  intel_iommu: make types match
  intel_iommu: check if the input address is canonical
  intel_iommu: set accessed and dirty bits during first stage
    translation
  intel_iommu: return page walk level even when the translation fails
  intel_iommu: extract device IOTLB invalidation logic
  intel_iommu: do not consider wait_desc as an invalid descriptor
  memory: add permissions in IOMMUAccessFlags
  pcie: add helper to declare PASID capability for a pcie device
  pcie: helper functions to check if PASID and ATS are enabled
  intel_iommu: declare supported PASID size
  intel_iommu: add an internal API to find an address space with PASID
  intel_iommu: add support for PASID-based device IOTLB invalidation
  pci: cache the bus mastering status in the device
  pci: add IOMMU operations to get address spaces and memory regions
    with PASID
  pci: add a pci-level initialization function for iommu notifiers
  intel_iommu: implement the get_address_space_pasid iommu operation
  intel_iommu: implement the get_memory_region_pasid iommu operation
  memory: Allow to store the PASID in IOMMUTLBEntry
  intel_iommu: fill the PASID field when creating an instance of
    IOMMUTLBEntry
  atc: generic ATC that can be used by PCIe devices that support SVM
  memory: add an API for ATS support
  pci: add a pci-level API for ATS
  intel_iommu: set the address mask even when a translation fails
  intel_iommu: add support for ATS

 hw/i386/intel_iommu.c                     | 324 +++++++++++---
 hw/i386/intel_iommu_internal.h            |  21 +-
 hw/pci/pci.c                              | 125 +++++-
 hw/pci/pcie.c                             |  42 ++
 include/exec/memory.h                     |  50 ++-
 include/hw/i386/intel_iommu.h             |   2 +-
 include/hw/pci/pci.h                      |  99 +++++
 include/hw/pci/pci_device.h               |   1 +
 include/hw/pci/pcie.h                     |   9 +-
 include/hw/pci/pcie_regs.h                |   3 +
 include/standard-headers/linux/pci_regs.h |   1 +
 system/memory.c                           |  20 +
 tests/unit/meson.build                    |   1 +
 tests/unit/test-atc.c                     | 502 ++++++++++++++++++++++
 util/atc.c                                | 211 +++++++++
 util/atc.h                                | 117 +++++
 util/meson.build                          |   1 +
 17 files changed, 1453 insertions(+), 76 deletions(-)
 create mode 100644 tests/unit/test-atc.c
 create mode 100644 util/atc.c
 create mode 100644 util/atc.h

-- 
2.44.0


* [PATCH ats_vtd v2 01/25] intel_iommu: fix FRCD construction macro.
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 03/25] intel_iommu: check if the input address is canonical CLEMENT MATHIEU--DRIF
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

The constant must be an unsigned 64-bit value. Otherwise the shifted
result is a negative int, and its sign extension overwrites the other
fields when a PASID is present.
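
To illustrate the sign-extension problem described above, here is a
minimal sketch. The two helpers are hypothetical stand-ins for the macro
before and after the fix, not code from the series. The "before" helper
models the usual outcome of shifting an int argument into bit 31 by
using INT32_MIN directly, which keeps the example well defined in C.

```c
#include <stdint.h>

/*
 * Hypothetical model of the old macro when the argument is an int or
 * bool: bit 31 of a 32-bit int is the sign bit, so widening the result
 * to 64 bits sets bits 63:32 as well.
 */
static uint64_t frcd_pp_int(int present)
{
    int32_t field = (present & 0x1) ? INT32_MIN : 0;
    return (uint64_t)field; /* sign-extends to 0xffffffff80000000 */
}

/* Hypothetical model of the fixed macro: the shift happens in 64 bits. */
static uint64_t frcd_pp_ull(int present)
{
    return ((uint64_t)present & 0x1ULL) << 31;
}
```

With present = 1, the first helper sets every bit from 31 to 63, which
would clobber any field OR-ed into the upper half of the same 64-bit
word, while the second sets bit 31 only.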

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu_internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index c5efcff9fd..4f6b0154b5 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -271,7 +271,7 @@
 /* For the low 64-bit of 128-bit */
 #define VTD_FRCD_FI(val)        ((val) & ~0xfffULL)
 #define VTD_FRCD_PV(val)        (((val) & 0xffffULL) << 40)
-#define VTD_FRCD_PP(val)        (((val) & 0x1) << 31)
+#define VTD_FRCD_PP(val)        (((val) & 0x1ULL) << 31)
 #define VTD_FRCD_IR_IDX(val)    (((val) & 0xffffULL) << 48)
 
 /* DMA Remapping Fault Conditions */
-- 
2.44.0


* [PATCH ats_vtd v2 02/25] intel_iommu: make types match
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 01/25] intel_iommu: fix FRCD construction macro CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 03/25] intel_iommu: check if the input address is canonical CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 04/25] intel_iommu: set accessed and dirty bits during first stage translation CLEMENT MATHIEU--DRIF
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

The 'level' field in vtd_iotlb_key is a uint8_t, so there is no need to
store level as an int in vtd_lookup_iotlb (this avoids a 'losing
precision' warning).

VTDIOTLBPageInvInfo.mask is used in bitwise operations with 64-bit
addresses, so widen it to uint64_t.
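
As a rough illustration of why the width of the mask matters, the sketch
below applies an 8-bit and a 64-bit mask to the same frame number. The
helper names and the 'am' (address mask order) parameter are assumptions
made for this example and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical: the mask is truncated to 8 bits before use. */
static uint64_t masked_gfn_narrow(uint64_t gfn, unsigned am)
{
    uint8_t mask = (uint8_t)~((1ULL << am) - 1);
    return gfn & mask; /* every bit above bit 7 is lost */
}

/* Hypothetical: the mask keeps its full 64-bit width. */
static uint64_t masked_gfn_wide(uint64_t gfn, unsigned am)
{
    uint64_t mask = ~((1ULL << am) - 1);
    return gfn & mask; /* only the low 'am' bits are cleared */
}
```

For gfn = 0x12345 and am = 4, the narrow variant yields 0x40 while the
wide variant yields 0x12340.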

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c          | 2 +-
 hw/i386/intel_iommu_internal.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 70735e2379..80cdf37870 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -386,7 +386,7 @@ static VTDIOTLBEntry *vtd_lookup_iotlb(IntelIOMMUState *s, uint16_t source_id,
 {
     struct vtd_iotlb_key key;
     VTDIOTLBEntry *entry;
-    int level;
+    uint8_t level;
 
     for (level = VTD_SL_PT_LEVEL; level < VTD_SL_PML4_LEVEL; level++) {
         key.gfn = vtd_get_iotlb_gfn(addr, level);
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 4f6b0154b5..901691afb9 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -473,7 +473,7 @@ struct VTDIOTLBPageInvInfo {
     uint16_t domain_id;
     uint32_t pasid;
     uint64_t addr;
-    uint8_t mask;
+    uint64_t mask;
 };
 typedef struct VTDIOTLBPageInvInfo VTDIOTLBPageInvInfo;
 
-- 
2.44.0


* [PATCH ats_vtd v2 03/25] intel_iommu: check if the input address is canonical
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 01/25] intel_iommu: fix FRCD construction macro CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 02/25] intel_iommu: make types match CLEMENT MATHIEU--DRIF
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

First stage translation must fail if the address to translate is
not canonical.
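
For readers unfamiliar with the rule, an address is canonical when all
bits above the most significant implemented bit are copies of that bit.
The sketch below restates the check as a standalone function. The
addr_width parameter is an assumption for this example (for instance 48)
and stands in for the limit that the patch obtains from vtd_iova_limit().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch only. addr_width is a hypothetical parameter and must satisfy
 * 1 <= addr_width <= 63 so that the shift below is well defined.
 */
static bool iova_is_canonical(uint64_t iova, unsigned addr_width)
{
    uint64_t limit = 1ULL << addr_width;
    uint64_t upper_mask = ~(limit - 1);       /* bits 63:addr_width */
    uint64_t upper = iova & upper_mask;
    bool msb = (iova & (limit >> 1)) != 0;    /* bit addr_width - 1 */

    /* Upper bits must be all ones if msb is set, all zeroes otherwise. */
    return msb ? (upper == upper_mask) : (upper == 0);
}
```

With a 48-bit width, 0x00007fffffffffff and 0xffff800000000000 pass the
check, whereas 0x0000800000000000 and 0xffff7fffffffffff do not.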

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c          | 21 +++++++++++++++++++++
 hw/i386/intel_iommu_internal.h |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 80cdf37870..0ecf00f37a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1912,6 +1912,7 @@ static const bool vtd_qualified_faults[] = {
     [VTD_FR_PASID_ENTRY_P] = true,
     [VTD_FR_PASID_TABLE_ENTRY_INV] = true,
     [VTD_FR_SM_INTERRUPT_ADDR] = true,
+    [VTD_FR_FS_NON_CANONICAL] = true,
     [VTD_FR_MAX] = false,
 };
 
@@ -2023,6 +2024,20 @@ static inline uint64_t vtd_get_flpte_addr(uint64_t flpte, uint8_t aw)
     return flpte & VTD_FL_PT_BASE_ADDR_MASK(aw);
 }
 
+/* Return true if IOVA is canonical, otherwise false. */
+static bool vtd_iova_fl_check_canonical(IntelIOMMUState *s, uint64_t iova,
+                                        VTDContextEntry *ce, uint32_t pasid)
+{
+    uint64_t iova_limit = vtd_iova_limit(s, ce, s->aw_bits, pasid);
+    uint64_t upper_bits_mask = ~(iova_limit - 1);
+    uint64_t upper_bits = iova & upper_bits_mask;
+    bool msb = ((iova & (iova_limit >> 1)) != 0);
+    return !(
+             (!msb && (upper_bits != 0)) ||
+             (msb && (upper_bits != upper_bits_mask))
+            );
+}
+
 /*
  * Given the @iova, get relevant @flptep. @flpte_level will be the last level
  * of the translation, can be used for deciding the size of large page.
@@ -2038,6 +2053,12 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
     uint32_t offset;
     uint64_t flpte;
 
+    if (!vtd_iova_fl_check_canonical(s, iova, ce, pasid)) {
+        error_report_once("%s: detected non canonical IOVA (iova=0x%" PRIx64 ","
+                          "pasid=0x%" PRIx32 ")", __func__, iova, pasid);
+        return -VTD_FR_FS_NON_CANONICAL;
+    }
+
     while (true) {
         offset = vtd_iova_fl_level_offset(iova, level);
         flpte = vtd_get_flpte(addr, offset);
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 901691afb9..e9448291a4 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -324,6 +324,8 @@ typedef enum VTDFaultReason {
     VTD_FR_PASID_ENTRY_P = 0x59, /* The Present(P) field of pasidt-entry is 0 */
     VTD_FR_PASID_TABLE_ENTRY_INV = 0x5b,  /*Invalid PASID table entry */
 
+    VTD_FR_FS_NON_CANONICAL = 0x80, /* SNG.1 : Address for FS not canonical.*/
+
     /* Output address in the interrupt address range for scalable mode */
     VTD_FR_SM_INTERRUPT_ADDR = 0x87,
     VTD_FR_MAX,                 /* Guard */
-- 
2.44.0


* [PATCH ats_vtd v2 04/25] intel_iommu: set accessed and dirty bits during first stage translation
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (2 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 02/25] intel_iommu: make types match CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 07/25] intel_iommu: do not consider wait_desc as an invalid descriptor CLEMENT MATHIEU--DRIF
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c          | 25 +++++++++++++++++++++++++
 hw/i386/intel_iommu_internal.h |  3 +++
 2 files changed, 28 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 0ecf00f37a..252364893b 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1913,6 +1913,7 @@ static const bool vtd_qualified_faults[] = {
     [VTD_FR_PASID_TABLE_ENTRY_INV] = true,
     [VTD_FR_SM_INTERRUPT_ADDR] = true,
     [VTD_FR_FS_NON_CANONICAL] = true,
+    [VTD_FR_FS_BIT_UPDATE_FAILED] = true,
     [VTD_FR_MAX] = false,
 };
 
@@ -2038,6 +2039,20 @@ static bool vtd_iova_fl_check_canonical(IntelIOMMUState *s, uint64_t iova,
             );
 }
 
+static MemTxResult vtd_set_flag_in_pte(dma_addr_t base_addr, uint32_t index,
+                                       uint64_t pte, uint64_t flag)
+{
+    if (pte & flag) {
+        return MEMTX_OK;
+    }
+    pte |= flag;
+    pte = cpu_to_le64(pte);
+    return dma_memory_write(&address_space_memory,
+                            base_addr + index * sizeof(pte),
+                            &pte, sizeof(pte),
+                            MEMTXATTRS_UNSPECIFIED);
+}
+
 /*
  * Given the @iova, get relevant @flptep. @flpte_level will be the last level
  * of the translation, can be used for deciding the size of large page.
@@ -2083,7 +2098,17 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
             return -VTD_FR_WRITE;
         }
 
+        if (vtd_set_flag_in_pte(addr, offset, flpte, VTD_FL_PTE_A)
+                                                                != MEMTX_OK) {
+            return -VTD_FR_FS_BIT_UPDATE_FAILED;
+        }
+
         if (vtd_is_last_flpte(flpte, level)) {
+            if (is_write &&
+                (vtd_set_flag_in_pte(addr, offset, flpte, VTD_FL_PTE_D) !=
+                                                                    MEMTX_OK)) {
+                    return -VTD_FR_FS_BIT_UPDATE_FAILED;
+            }
             *flptep = flpte;
             *flpte_level = level;
             return 0;
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index e9448291a4..14879d3a58 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -328,6 +328,7 @@ typedef enum VTDFaultReason {
 
     /* Output address in the interrupt address range for scalable mode */
     VTD_FR_SM_INTERRUPT_ADDR = 0x87,
+    VTD_FR_FS_BIT_UPDATE_FAILED = 0x91, /* SFS.10 */
     VTD_FR_MAX,                 /* Guard */
 } VTDFaultReason;
 
@@ -649,6 +650,8 @@ typedef struct VTDPIOTLBInvInfo {
 /* First Level Paging Structure */
 #define VTD_FL_PT_LEVEL             1
 #define VTD_FL_PT_ENTRY_NR          512
+#define VTD_FL_PTE_A                0x20
+#define VTD_FL_PTE_D                0x40
 
 /* Masks for First Level Paging Entry */
 #define VTD_FL_RW_MASK              (1ULL << 1)
-- 
2.44.0


* [PATCH ats_vtd v2 07/25] intel_iommu: do not consider wait_desc as an invalid descriptor
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (3 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 04/25] intel_iommu: set accessed and dirty bits during first stage translation CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 05/25] intel_iommu: return page walk level even when the translation fails CLEMENT MATHIEU--DRIF
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/i386/intel_iommu.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index dbdf13470d..373f3d254a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3362,6 +3362,11 @@ static bool vtd_process_wait_desc(IntelIOMMUState *s, VTDInvDesc *inv_desc)
     } else if (inv_desc->lo & VTD_INV_DESC_WAIT_IF) {
         /* Interrupt flag */
         vtd_generate_completion_event(s);
+    } else if (inv_desc->lo & VTD_INV_DESC_WAIT_FN) {
+        /*
+         * SW = 0, IF = 0, FN = 1
+         * Nothing to do as we process the events sequentially
+         */
     } else {
         error_report_once("%s: invalid wait desc: hi=%"PRIx64", lo=%"PRIx64
                           " (unknown type)", __func__, inv_desc->hi,
-- 
2.44.0


* [PATCH ats_vtd v2 05/25] intel_iommu: return page walk level even when the translation fails
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (4 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 07/25] intel_iommu: do not consider wait_desc as an invalid descriptor CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 06/25] intel_iommu: extract device IOTLB invalidation logic CLEMENT MATHIEU--DRIF
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

We use this information in vtd_do_iommu_translate to populate the
IOMMUTLBEntry and indicate the correct page mask. This prevents ATS
devices from sending many useless translation requests when a megapage
or gigapage iova is not mapped to a physical address.
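
To make the link between the page walk level and the page mask concrete,
here is a small sketch. The helper and macro names are hypothetical. It
assumes 4 KiB base pages and 512 entries (9 index bits) per table, so
each level up multiplies the covered range by 512.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT_4K 12 /* 4 KiB base page */
#define SKETCH_LEVEL_BITS     9 /* 512 entries per paging structure */

/*
 * Hypothetical helper: address mask covered by one entry at the given
 * level (1 = 4 KiB page, 2 = 2 MiB megapage, 3 = 1 GiB gigapage).
 */
static uint64_t level_addr_mask(unsigned level)
{
    unsigned shift = SKETCH_PAGE_SHIFT_4K + (level - 1) * SKETCH_LEVEL_BITS;
    return (1ULL << shift) - 1;
}
```

If a walk stops at level 2 because the entry is not present, reporting
the mask 0x1fffff lets a device conclude that the whole 2 MiB range is
unmapped instead of probing each of its 512 4 KiB pages in turn.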

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 252364893b..7a4dd738a3 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2064,9 +2064,9 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
                              uint32_t pasid)
 {
     dma_addr_t addr = vtd_get_iova_pgtbl_base(s, ce, pasid);
-    uint32_t level = vtd_get_iova_level(s, ce, pasid);
     uint32_t offset;
     uint64_t flpte;
+    *flpte_level = vtd_get_iova_level(s, ce, pasid);
 
     if (!vtd_iova_fl_check_canonical(s, iova, ce, pasid)) {
         error_report_once("%s: detected non canonical IOVA (iova=0x%" PRIx64 ","
@@ -2075,10 +2075,10 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
     }
 
     while (true) {
-        offset = vtd_iova_fl_level_offset(iova, level);
+        offset = vtd_iova_fl_level_offset(iova, *flpte_level);
         flpte = vtd_get_flpte(addr, offset);
         if (flpte == (uint64_t)-1) {
-            if (level == vtd_get_iova_level(s, ce, pasid)) {
+            if (*flpte_level == vtd_get_iova_level(s, ce, pasid)) {
                 /* Invalid programming of context-entry */
                 return -VTD_FR_CONTEXT_ENTRY_INV;
             } else {
@@ -2103,19 +2103,18 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
             return -VTD_FR_FS_BIT_UPDATE_FAILED;
         }
 
-        if (vtd_is_last_flpte(flpte, level)) {
+        if (vtd_is_last_flpte(flpte, *flpte_level)) {
             if (is_write &&
                 (vtd_set_flag_in_pte(addr, offset, flpte, VTD_FL_PTE_D) !=
                                                                     MEMTX_OK)) {
                     return -VTD_FR_FS_BIT_UPDATE_FAILED;
             }
             *flptep = flpte;
-            *flpte_level = level;
             return 0;
         }
 
         addr = vtd_get_flpte_addr(flpte, aw_bits);
-        level--;
+        (*flpte_level)--;
     }
 }
 
-- 
2.44.0


* [PATCH ats_vtd v2 06/25] intel_iommu: extract device IOTLB invalidation logic
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (5 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 05/25] intel_iommu: return page walk level even when the translation fails CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 12/25] intel_iommu: add an internal API to find an address space with PASID CLEMENT MATHIEU--DRIF
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF, Philippe Mathieu-Daudé

This piece of code can be shared by both IOTLB invalidation and
PASID-based IOTLB invalidation.
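
The range-size encoding handled by the extracted helper can be sketched
on its own as follows. The names below are hypothetical, and the
trailing-ones counter is a plain loop written for this example rather
than QEMU's cto64().

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ATS_PAGE_SHIFT 12
#define SKETCH_ATS_PAGE_SIZE  (1ULL << SKETCH_ATS_PAGE_SHIFT)

/* Count the consecutive one bits starting at bit 0. */
static unsigned count_trailing_ones(uint64_t v)
{
    unsigned n = 0;

    while (v & 1) {
        n++;
        v >>= 1;
    }
    return n;
}

/*
 * Hypothetical helper: with S = 0 the range is a single 4K page; with
 * S = 1 each trailing one bit in the page number doubles an 8K base.
 */
static uint64_t inv_range_size(bool s_bit, uint64_t addr)
{
    if (!s_bit) {
        return SKETCH_ATS_PAGE_SIZE;
    }
    return (SKETCH_ATS_PAGE_SIZE * 2)
           << count_trailing_ones(addr >> SKETCH_ATS_PAGE_SHIFT);
}
```

For example, with S = 1 an address whose bits 15:12 are 0111 decodes to
a 64K range, and one whose bits 15:12 are 0000 decodes to 8K.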

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c | 57 +++++++++++++++++++++++++------------------
 1 file changed, 33 insertions(+), 24 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 7a4dd738a3..dbdf13470d 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4292,6 +4292,38 @@ static bool vtd_process_inv_iec_desc(IntelIOMMUState *s,
     return true;
 }
 
+static void do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
+                                     bool size, hwaddr addr)
+{
+    /*
+     * According to ATS spec table 2.4:
+     * S = 0, bits 15:12 = xxxx     range size: 4K
+     * S = 1, bits 15:12 = xxx0     range size: 8K
+     * S = 1, bits 15:12 = xx01     range size: 16K
+     * S = 1, bits 15:12 = x011     range size: 32K
+     * S = 1, bits 15:12 = 0111     range size: 64K
+     * ...
+     */
+
+    IOMMUTLBEvent event;
+    uint64_t sz;
+
+    if (size) {
+        sz = (VTD_PAGE_SIZE * 2) << cto64(addr >> VTD_PAGE_SHIFT);
+        addr &= ~(sz - 1);
+    } else {
+        sz = VTD_PAGE_SIZE;
+    }
+
+    event.type = IOMMU_NOTIFIER_DEVIOTLB_UNMAP;
+    event.entry.target_as = &vtd_dev_as->as;
+    event.entry.addr_mask = sz - 1;
+    event.entry.iova = addr;
+    event.entry.perm = IOMMU_NONE;
+    event.entry.translated_addr = 0;
+    memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
+}
+
 static bool vtd_process_device_piotlb_desc(IntelIOMMUState *s,
                                            VTDInvDesc *inv_desc)
 {
@@ -4307,9 +4339,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
                                           VTDInvDesc *inv_desc)
 {
     VTDAddressSpace *vtd_dev_as;
-    IOMMUTLBEvent event;
     hwaddr addr;
-    uint64_t sz;
     uint16_t sid;
     bool size;
 
@@ -4334,28 +4364,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
         goto done;
     }
 
-    /* According to ATS spec table 2.4:
-     * S = 0, bits 15:12 = xxxx     range size: 4K
-     * S = 1, bits 15:12 = xxx0     range size: 8K
-     * S = 1, bits 15:12 = xx01     range size: 16K
-     * S = 1, bits 15:12 = x011     range size: 32K
-     * S = 1, bits 15:12 = 0111     range size: 64K
-     * ...
-     */
-    if (size) {
-        sz = (VTD_PAGE_SIZE * 2) << cto64(addr >> VTD_PAGE_SHIFT);
-        addr &= ~(sz - 1);
-    } else {
-        sz = VTD_PAGE_SIZE;
-    }
-
-    event.type = IOMMU_NOTIFIER_DEVIOTLB_UNMAP;
-    event.entry.target_as = &vtd_dev_as->as;
-    event.entry.addr_mask = sz - 1;
-    event.entry.iova = addr;
-    event.entry.perm = IOMMU_NONE;
-    event.entry.translated_addr = 0;
-    memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
+    do_invalidate_device_tlb(vtd_dev_as, size, addr);
 
 done:
     return true;
-- 
2.44.0


* [PATCH ats_vtd v2 12/25] intel_iommu: add an internal API to find an address space with PASID
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (6 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 06/25] intel_iommu: extract device IOTLB invalidation logic CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 10/25] pcie: helper functions to check if PASID and ATS are enabled CLEMENT MATHIEU--DRIF
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

This will be used to implement PASID-based device IOTLB invalidation.

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c | 39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 3bb4d385a8..166103510e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -65,6 +65,11 @@ struct vtd_as_key {
     uint32_t pasid;
 };
 
+struct vtd_as_raw_key {
+    uint16_t sid;
+    uint32_t pasid;
+};
+
 struct vtd_iotlb_key {
     uint64_t gfn;
     uint32_t pasid;
@@ -1931,29 +1936,33 @@ static inline bool vtd_is_interrupt_addr(hwaddr addr)
     return VTD_INTERRUPT_ADDR_FIRST <= addr && addr <= VTD_INTERRUPT_ADDR_LAST;
 }
 
-static gboolean vtd_find_as_by_sid(gpointer key, gpointer value,
-                                   gpointer user_data)
+static gboolean vtd_find_as_by_sid_and_pasid(gpointer key, gpointer value,
+                                             gpointer user_data)
 {
     struct vtd_as_key *as_key = (struct vtd_as_key *)key;
-    uint16_t target_sid = *(uint16_t *)user_data;
+    struct vtd_as_raw_key target = *(struct vtd_as_raw_key *)user_data;
     uint16_t sid = PCI_BUILD_BDF(pci_bus_num(as_key->bus), as_key->devfn);
-    return sid == target_sid;
+
+    return (as_key->pasid == target.pasid) &&
+           (sid == target.sid);
 }
 
-static VTDAddressSpace *vtd_get_as_by_sid(IntelIOMMUState *s, uint16_t sid)
+static VTDAddressSpace *vtd_get_as_by_sid_and_pasid(IntelIOMMUState *s,
+                                                    uint16_t sid,
+                                                    uint32_t pasid)
 {
-    uint8_t bus_num = PCI_BUS_NUM(sid);
-    VTDAddressSpace *vtd_as = s->vtd_as_cache[bus_num];
-
-    if (vtd_as &&
-        (sid == PCI_BUILD_BDF(pci_bus_num(vtd_as->bus), vtd_as->devfn))) {
-        return vtd_as;
-    }
+    struct vtd_as_raw_key key = {
+        .sid = sid,
+        .pasid = pasid
+    };
 
-    vtd_as = g_hash_table_find(s->vtd_address_spaces, vtd_find_as_by_sid, &sid);
-    s->vtd_as_cache[bus_num] = vtd_as;
+    return g_hash_table_find(s->vtd_address_spaces,
+                             vtd_find_as_by_sid_and_pasid, &key);
+}
 
-    return vtd_as;
+static VTDAddressSpace *vtd_get_as_by_sid(IntelIOMMUState *s, uint16_t sid)
+{
+    return vtd_get_as_by_sid_and_pasid(s, sid, PCI_NO_PASID);
 }
 
 static void vtd_pt_enable_fast_path(IntelIOMMUState *s, uint16_t source_id)
-- 
2.44.0


* [PATCH ats_vtd v2 09/25] pcie: add helper to declare PASID capability for a pcie device
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (8 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 10/25] pcie: helper functions to check if PASID and ATS are enabled CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 11/25] intel_iommu: declare supported PASID size CLEMENT MATHIEU--DRIF
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/pci/pcie.c                             | 24 +++++++++++++++++++++++
 include/hw/pci/pcie.h                     |  6 +++++-
 include/hw/pci/pcie_regs.h                |  3 +++
 include/standard-headers/linux/pci_regs.h |  1 +
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 4b2f0805c6..d6a052b616 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -1177,3 +1177,27 @@ void pcie_acs_reset(PCIDevice *dev)
         pci_set_word(dev->config + dev->exp.acs_cap + PCI_ACS_CTRL, 0);
     }
 }
+
+/* PASID */
+void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
+                     bool exec_perm, bool priv_mod)
+{
+    assert(pasid_width <= PCI_EXT_CAP_PASID_MAX_WIDTH);
+    static const uint16_t control_reg_rw_mask = 0x07;
+    uint16_t capability_reg = pasid_width;
+
+    pcie_add_capability(dev, PCI_EXT_CAP_ID_PASID, PCI_PASID_VER, offset,
+                        PCI_EXT_CAP_PASID_SIZEOF);
+
+    capability_reg <<= PCI_PASID_CAP_WIDTH_SHIFT;
+    capability_reg |= exec_perm ? PCI_PASID_CAP_EXEC : 0;
+    capability_reg |= priv_mod  ? PCI_PASID_CAP_PRIV : 0;
+    pci_set_word(dev->config + offset + PCI_PASID_CAP, capability_reg);
+
+    /* Everything is disabled by default */
+    pci_set_word(dev->config + offset + PCI_PASID_CTRL, 0);
+
+    pci_set_word(dev->wmask + offset + PCI_PASID_CTRL, control_reg_rw_mask);
+
+    dev->exp.pasid_cap = offset;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index 11f5a91bbb..c59627d556 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -69,8 +69,9 @@ struct PCIExpressDevice {
     uint16_t aer_cap;
     PCIEAERLog aer_log;
 
-    /* Offset of ATS capability in config space */
+    /* Offset of ATS and PASID capabilities in config space */
     uint16_t ats_cap;
+    uint16_t pasid_cap;
 
     /* ACS */
     uint16_t acs_cap;
@@ -147,4 +148,7 @@ void pcie_cap_slot_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
                              Error **errp);
 void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
                                      DeviceState *dev, Error **errp);
+
+void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
+                     bool exec_perm, bool priv_mod);
 #endif /* QEMU_PCIE_H */
diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
index 9d3b6868dc..0a86598f80 100644
--- a/include/hw/pci/pcie_regs.h
+++ b/include/hw/pci/pcie_regs.h
@@ -86,6 +86,9 @@ typedef enum PCIExpLinkWidth {
 #define PCI_ARI_VER                     1
 #define PCI_ARI_SIZEOF                  8
 
+/* PASID */
+#define PCI_PASID_VER                   1
+#define PCI_EXT_CAP_PASID_MAX_WIDTH     20
 /* AER */
 #define PCI_ERR_VER                     2
 #define PCI_ERR_SIZEOF                  0x48
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index a39193213f..406dce8e82 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -935,6 +935,7 @@
 #define  PCI_PASID_CAP_EXEC	0x0002	/* Exec permissions Supported */
 #define  PCI_PASID_CAP_PRIV	0x0004	/* Privilege Mode Supported */
 #define  PCI_PASID_CAP_WIDTH	0x1f00
+#define  PCI_PASID_CAP_WIDTH_SHIFT  8
 #define PCI_PASID_CTRL		0x06    /* PASID control register */
 #define  PCI_PASID_CTRL_ENABLE	0x0001	/* Enable bit */
 #define  PCI_PASID_CTRL_EXEC	0x0002	/* Exec permissions Enable */
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 11/25] intel_iommu: declare supported PASID size
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (9 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 09/25] pcie: add helper to declare PASID capability for a pcie device CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 08/25] memory: add permissions in IOMMUAccessFlags CLEMENT MATHIEU--DRIF
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
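In the VT-d specification, ECAP.PSS occupies bits 39:35 and holds the
supported PASID width minus one, so the value 19 used below advertises
20-bit PASIDs. A minimal standalone sketch of that encoding follows; the
ECAP_PSS_* macros and ecap_pss_* helpers are hypothetical names used only
for illustration and are not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

#define ECAP_PSS_SHIFT 35
#define ECAP_PSS_MASK  (0x1fULL << ECAP_PSS_SHIFT)   /* bits 39:35 */

/* Hypothetical helper: encode a PASID width (1..32 bits) into ECAP.PSS. */
static uint64_t ecap_pss_encode(unsigned width)
{
    assert(width >= 1 && width <= 32);
    return ((uint64_t)(width - 1) << ECAP_PSS_SHIFT) & ECAP_PSS_MASK;
}

/* Hypothetical helper: recover the PASID width from an ECAP value. */
static unsigned ecap_pss_width(uint64_t ecap)
{
    return (unsigned)((ecap & ECAP_PSS_MASK) >> ECAP_PSS_SHIFT) + 1;
}
```

With these helpers, ecap_pss_encode(20) yields the same bits as the
VTD_ECAP_PSS constant introduced by this patch.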
 hw/i386/intel_iommu.c          | 2 +-
 hw/i386/intel_iommu_internal.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 373f3d254a..3bb4d385a8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -5819,7 +5819,7 @@ static void vtd_cap_init(IntelIOMMUState *s)
     }
 
     if (s->pasid) {
-        s->ecap |= VTD_ECAP_PASID;
+        s->ecap |= VTD_ECAP_PASID | VTD_ECAP_PSS;
     }
 }
 
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 14879d3a58..d63ff049a7 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -193,6 +193,7 @@
 #define VTD_ECAP_MHMV               (15ULL << 20)
 #define VTD_ECAP_NEST               (1ULL << 26)
 #define VTD_ECAP_SRS                (1ULL << 31)
+#define VTD_ECAP_PSS                (19ULL << 35)
 #define VTD_ECAP_PASID              (1ULL << 40)
 #define VTD_ECAP_SMTS               (1ULL << 43)
 #define VTD_ECAP_SLTS               (1ULL << 46)
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 10/25] pcie: helper functions to check if PASID and ATS are enabled
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (7 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 12/25] intel_iommu: add an internal API to find an address space with PASID CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 09/25] pcie: add helper to declare PASID capability for a pcie device CLEMENT MATHIEU--DRIF
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

pcie_ats_enabled and pcie_pasid_enabled first check whether the
corresponding capability is present. If it is, they read its control
register in the configuration space to tell whether the feature is enabled.

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
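The check performed by the two helpers can be sketched outside QEMU as
follows. This is only an illustration under stated assumptions: struct
mock_dev, cfg_get_word and mock_pasid_enabled are hypothetical stand-ins
for PCIDevice, pci_get_word and pcie_pasid_enabled, and the register
offsets are the PCI_PASID_* values from pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_PASID_CTRL         0x06    /* PASID control register */
#define PCI_PASID_CTRL_ENABLE  0x0001  /* Enable bit */

/* Stand-in for a device: raw config space plus the cached capability offset */
struct mock_dev {
    uint8_t  config[4096];
    uint16_t pasid_cap;     /* 0 means the capability was never declared */
};

/* Read a little-endian 16-bit word from config space */
static uint16_t cfg_get_word(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static bool mock_pasid_enabled(const struct mock_dev *dev)
{
    /* Step 1: the capability must be present */
    if (!dev->pasid_cap) {
        return false;
    }
    assert((size_t)dev->pasid_cap + PCI_PASID_CTRL + 2 <= sizeof(dev->config));
    /* Step 2: the enable bit of the control register gives the status */
    return (cfg_get_word(dev->config + dev->pasid_cap + PCI_PASID_CTRL) &
            PCI_PASID_CTRL_ENABLE) != 0;
}
```

A device whose capability offset is still 0 is reported as disabled
without touching the configuration space at all.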
 hw/pci/pcie.c         | 18 ++++++++++++++++++
 include/hw/pci/pcie.h |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index d6a052b616..4efd84fed5 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -1201,3 +1201,21 @@ void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
 
     dev->exp.pasid_cap = offset;
 }
+
+bool pcie_pasid_enabled(const PCIDevice *dev)
+{
+    if (!pci_is_express(dev) || !dev->exp.pasid_cap) {
+        return false;
+    }
+    return (pci_get_word(dev->config + dev->exp.pasid_cap + PCI_PASID_CTRL) &
+                PCI_PASID_CTRL_ENABLE) != 0;
+}
+
+bool pcie_ats_enabled(const PCIDevice *dev)
+{
+    if (!pci_is_express(dev) || !dev->exp.ats_cap) {
+        return false;
+    }
+    return (pci_get_word(dev->config + dev->exp.ats_cap + PCI_ATS_CTRL) &
+                PCI_ATS_CTRL_ENABLE) != 0;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index c59627d556..8c222f09da 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -151,4 +151,7 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
 
 void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
                      bool exec_perm, bool priv_mod);
+
+bool pcie_pasid_enabled(const PCIDevice *dev);
+bool pcie_ats_enabled(const PCIDevice *dev);
 #endif /* QEMU_PCIE_H */
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 08/25] memory: add permissions in IOMMUAccessFlags
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (10 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 11/25] intel_iommu: declare supported PASID size CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 14/25] pci: cache the bus mastering status in the device CLEMENT MATHIEU--DRIF
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

This will be necessary for devices implementing ATS.
We also define a new macro IOMMU_ACCESS_FLAG_FULL in addition to
IOMMU_ACCESS_FLAG to support more access flags.
IOMMU_ACCESS_FLAG is kept for convenience and backward compatibility.

The following flags are added (defined by the PCIe 5 specification):
    - Execute Requested
    - Privileged Mode Requested
    - Global
    - Untranslated Only

IOMMU_ACCESS_FLAG sets the additional flags to 0.

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
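How the two macros relate can be checked in isolation. The sketch below
copies the enum and macros from this patch into a standalone file; the
static_assert lines are additions for illustration only and are not part
of the patch.

```c
#include <assert.h>

typedef enum {
    IOMMU_NONE = 0,
    IOMMU_RO   = 1,
    IOMMU_WO   = 2,
    IOMMU_RW   = 3,
    IOMMU_EXEC = 4,
    IOMMU_PRIV = 8,
    IOMMU_GLOBAL = 16,
    IOMMU_UNTRANSLATED_ONLY = 32,
} IOMMUAccessFlags;

#define IOMMU_ACCESS_FLAG(r, w)     (((r) ? IOMMU_RO : 0) | \
                                    ((w) ? IOMMU_WO : 0))
#define IOMMU_ACCESS_FLAG_FULL(r, w, x, p, g, uo) \
                                    (IOMMU_ACCESS_FLAG(r, w) | \
                                    ((x) ? IOMMU_EXEC : 0) | \
                                    ((p) ? IOMMU_PRIV : 0) | \
                                    ((g) ? IOMMU_GLOBAL : 0) | \
                                    ((uo) ? IOMMU_UNTRANSLATED_ONLY : 0))

/* The new flags must not collide with the existing read/write bits */
static_assert(((IOMMU_EXEC | IOMMU_PRIV | IOMMU_GLOBAL |
                IOMMU_UNTRANSLATED_ONLY) & IOMMU_RW) == 0,
              "new flags overlap read/write");

/* With every extra flag cleared, the full macro equals the legacy one */
static_assert(IOMMU_ACCESS_FLAG_FULL(1, 1, 0, 0, 0, 0) ==
              IOMMU_ACCESS_FLAG(1, 1),
              "legacy macro must stay compatible");
```

For example, a privileged read with execute requested would be written as
IOMMU_ACCESS_FLAG_FULL(1, 0, 1, 1, 0, 0).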
 include/exec/memory.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b3..2c0e964c07 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -110,15 +110,34 @@ struct MemoryRegionSection {
 
 typedef struct IOMMUTLBEntry IOMMUTLBEntry;
 
-/* See address_space_translate: bit 0 is read, bit 1 is write.  */
+/*
+ * See address_space_translate:
+ *      - bit 0 : read
+ *      - bit 1 : write
+ *      - bit 2 : exec
+ *      - bit 3 : priv
+ *      - bit 4 : global
+ *      - bit 5 : untranslated only
+ */
 typedef enum {
     IOMMU_NONE = 0,
     IOMMU_RO   = 1,
     IOMMU_WO   = 2,
     IOMMU_RW   = 3,
+    IOMMU_EXEC = 4,
+    IOMMU_PRIV = 8,
+    IOMMU_GLOBAL = 16,
+    IOMMU_UNTRANSLATED_ONLY = 32,
 } IOMMUAccessFlags;
 
-#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))
+#define IOMMU_ACCESS_FLAG(r, w)     (((r) ? IOMMU_RO : 0) | \
+                                    ((w) ? IOMMU_WO : 0))
+#define IOMMU_ACCESS_FLAG_FULL(r, w, x, p, g, uo) \
+                                    (IOMMU_ACCESS_FLAG(r, w) | \
+                                    ((x) ? IOMMU_EXEC : 0) | \
+                                    ((p) ? IOMMU_PRIV : 0) | \
+                                    ((g) ? IOMMU_GLOBAL : 0) | \
+                                    ((uo) ? IOMMU_UNTRANSLATED_ONLY : 0))
 
 struct IOMMUTLBEntry {
     AddressSpace    *target_as;
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 16/25] pci: add a pci-level initialization function for iommu notifiers
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (12 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 14/25] pci: cache the bus mastering status in the device CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 13/25] intel_iommu: add support for PASID-based device IOTLB invalidation CLEMENT MATHIEU--DRIF
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

We add a convenient way to initialize a device-IOTLB notifier.
This is meant to be used by ATS-capable devices.

pci_device_iommu_memory_region_pasid is introduced in this commit and
will be used in several other SVM-related functions exposed in
the PCI API.

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
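The guard at the top of pci_device_iommu_memory_region_pasid decides when
the lookup is refused. The standalone sketch below restates that
predicate with hypothetical stand-ins (struct mock_dev,
mock_can_lookup_region); MOCK_NO_PASID is assumed to mirror QEMU's
PCI_NO_PASID, and the 20-bit bound comes from
PCI_EXT_CAP_PASID_MAX_WIDTH.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_NO_PASID        UINT32_MAX  /* assumption: same role as PCI_NO_PASID */
#define MOCK_PASID_MAX_WIDTH 20

/* Stand-in for the two pieces of device state the guard looks at */
struct mock_dev {
    bool is_master;       /* cached bus mastering status */
    bool pasid_enabled;   /* what pcie_pasid_enabled() would report */
};

static bool mock_can_lookup_region(const struct mock_dev *dev, uint32_t pasid)
{
    if (pasid != MOCK_NO_PASID) {
        /* A real PASID never exceeds 20 bits */
        assert(pasid < (1u << MOCK_PASID_MAX_WIDTH));
    }
    /* A device that is not a bus master gets no IOMMU memory region */
    if (!dev->is_master) {
        return false;
    }
    /* A real PASID is only accepted once the PASID capability is enabled */
    if (pasid != MOCK_NO_PASID && !dev->pasid_enabled) {
        return false;
    }
    return true;
}
```

Unlike pci_device_iommu_address_space_pasid, this internal helper accepts
the no-PASID value, so requests without PASID can share the same path.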
 hw/pci/pci.c         | 38 ++++++++++++++++++++++++++++++++++++++
 include/hw/pci/pci.h | 13 +++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2b42b4e4cc..f90eb04fda 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2747,6 +2747,44 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
     return &address_space_memory;
 }
 
+static IOMMUMemoryRegion *pci_device_iommu_memory_region_pasid(PCIDevice *dev,
+                                                               uint32_t pasid)
+{
+    PCIBus *bus;
+    PCIBus *iommu_bus;
+    int devfn;
+
+    /*
+     * This function is only used inside this module,
+     * so it may also be called with PCI_NO_PASID.
+     */
+    if (!dev->is_master ||
+            ((pasid != PCI_NO_PASID) && !pcie_pasid_enabled(dev))) {
+        return NULL;
+    }
+
+    pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
+    if (iommu_bus && iommu_bus->iommu_ops->get_memory_region_pasid) {
+        return iommu_bus->iommu_ops->get_memory_region_pasid(bus,
+                                 iommu_bus->iommu_opaque, devfn, pasid);
+    }
+    return NULL;
+}
+
+bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
+                                   IOMMUNotifier *n, IOMMUNotify fn)
+{
+    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+                                                                        pasid);
+    if (!iommu_mr) {
+        return false;
+    }
+    iommu_notifier_init(n, fn, IOMMU_NOTIFIER_DEVIOTLB_EVENTS, 0, HWADDR_MAX,
+                        memory_region_iommu_attrs_to_index(iommu_mr,
+                                                       MEMTXATTRS_UNSPECIFIED));
+    return true;
+}
+
 AddressSpace *pci_device_iommu_address_space_pasid(PCIDevice *dev,
                                                    uint32_t pasid)
 {
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 0c532c563c..1587c18cd9 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -458,6 +458,19 @@ int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
                                 Error **errp);
 void pci_device_unset_iommu_device(PCIDevice *dev);
 
+/**
+ * pci_iommu_init_iotlb_notifier: initialize an IOMMU notifier
+ *
+ * This function is used by devices before registering an IOTLB notifier
+ *
+ * @dev: the device
+ * @pasid: the pasid of the address space to watch
+ * @n: the notifier to initialize
+ * @fn: the callback to be installed
+ */
+bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
+                                   IOMMUNotifier *n, IOMMUNotify fn);
+
 /**
  * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
  *
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 13/25] intel_iommu: add support for PASID-based device IOTLB invalidation
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (13 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 16/25] pci: add a pci-level initialization function for iommu notifiers CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 15/25] pci: add IOMMU operations to get address spaces and memory regions with PASID CLEMENT MATHIEU--DRIF
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
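The field layout of the PASID-based device IOTLB invalidate descriptor,
as encoded by the masks added in intel_iommu_internal.h, can be exercised
with a standalone decoder. This is only a sketch: the DESC_* macros copy
the mask values from this patch under shorter names, and struct
piotlb_inv and decode_piotlb_inv are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_ADDR(hi)    ((hi) & 0xfffffffffffff000ULL)
#define DESC_SIZE(hi)    (((hi) >> 11) & 0x1)
#define DESC_GLOBAL(hi)  ((hi) & 0x1)
#define DESC_SID(lo)     (((lo) >> 16) & 0xffffULL)
#define DESC_PASID(lo)   (((lo) >> 32) & 0xfffffULL)
#define DESC_RSVD_HI     0x7feULL
#define DESC_RSVD_LO     0xfff000000000f000ULL

/* The reserved masks must not cover any decoded field */
static_assert((DESC_RSVD_LO & (0xffffULL << 16)) == 0, "SID is reserved");
static_assert((DESC_RSVD_LO & (0xfffffULL << 32)) == 0, "PASID is reserved");
static_assert((DESC_RSVD_HI & ((1ULL << 11) | 1ULL)) == 0, "S or G is reserved");

struct piotlb_inv {
    uint16_t sid;
    uint32_t pasid;
    uint64_t addr;
    bool size;
    bool global;
};

/* Returns false when a reserved bit is set, like the early exit in the patch */
static bool decode_piotlb_inv(uint64_t lo, uint64_t hi, struct piotlb_inv *out)
{
    if ((hi & DESC_RSVD_HI) || (lo & DESC_RSVD_LO)) {
        return false;
    }
    out->sid = (uint16_t)DESC_SID(lo);
    out->pasid = (uint32_t)DESC_PASID(lo);
    out->addr = DESC_ADDR(hi);
    out->size = DESC_SIZE(hi);
    out->global = DESC_GLOBAL(hi);
    return true;
}
```

When the global bit is set, the PASID field is ignored and every
PASID-tagged address space of the source-id is invalidated, which is why
the handler only extracts the PASID in the non-global branch.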
 hw/i386/intel_iommu.c          | 42 ++++++++++++++++++++++++++++++----
 hw/i386/intel_iommu_internal.h | 10 ++++++++
 2 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 166103510e..fd4710ba28 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4341,11 +4341,43 @@ static void do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
 static bool vtd_process_device_piotlb_desc(IntelIOMMUState *s,
                                            VTDInvDesc *inv_desc)
 {
-    /*
-     * no need to handle it for passthru device, for emulated
-     * devices with device tlb, it may be required, but for now,
-     * return is enough
-     */
+    uint16_t sid;
+    VTDAddressSpace *vtd_dev_as;
+    bool size;
+    bool global;
+    hwaddr addr;
+    uint32_t pasid;
+
+    if ((inv_desc->hi & VTD_INV_DESC_PASID_DEVICE_IOTLB_RSVD_HI) ||
+         (inv_desc->lo & VTD_INV_DESC_PASID_DEVICE_IOTLB_RSVD_LO)) {
+        error_report_once("%s: invalid pasid-based dev iotlb inv desc: "
+                          "hi=%"PRIx64 ", lo=%"PRIx64 " (reserved nonzero)",
+                          __func__, inv_desc->hi, inv_desc->lo);
+        return false;
+    }
+
+    global = VTD_INV_DESC_PASID_DEVICE_IOTLB_GLOBAL(inv_desc->hi);
+    size = VTD_INV_DESC_PASID_DEVICE_IOTLB_SIZE(inv_desc->hi);
+    addr = VTD_INV_DESC_PASID_DEVICE_IOTLB_ADDR(inv_desc->hi);
+    sid = VTD_INV_DESC_PASID_DEVICE_IOTLB_SID(inv_desc->lo);
+    if (global) {
+        QLIST_FOREACH(vtd_dev_as, &s->vtd_as_with_notifiers, next) {
+            if ((vtd_dev_as->pasid != PCI_NO_PASID) &&
+                (PCI_BUILD_BDF(pci_bus_num(vtd_dev_as->bus),
+                                           vtd_dev_as->devfn) == sid)) {
+                do_invalidate_device_tlb(vtd_dev_as, size, addr);
+            }
+        }
+    } else {
+        pasid = VTD_INV_DESC_PASID_DEVICE_IOTLB_PASID(inv_desc->lo);
+        vtd_dev_as = vtd_get_as_by_sid_and_pasid(s, sid, pasid);
+        if (!vtd_dev_as) {
+            return true;
+        }
+
+        do_invalidate_device_tlb(vtd_dev_as, size, addr);
+    }
+
     return true;
 }
 
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index d63ff049a7..3d59e10488 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -424,6 +424,16 @@ typedef union VTDInvDesc VTDInvDesc;
 #define VTD_INV_DESC_DEVICE_IOTLB_RSVD_HI 0xffeULL
 #define VTD_INV_DESC_DEVICE_IOTLB_RSVD_LO 0xffff0000ffe0fff8
 
+/* Mask for PASID Device IOTLB Invalidate Descriptor */
+#define VTD_INV_DESC_PASID_DEVICE_IOTLB_ADDR(val) ((val) & \
+                                                   0xfffffffffffff000ULL)
+#define VTD_INV_DESC_PASID_DEVICE_IOTLB_SIZE(val) (((val) >> 11) & 0x1)
+#define VTD_INV_DESC_PASID_DEVICE_IOTLB_GLOBAL(val) ((val) & 0x1)
+#define VTD_INV_DESC_PASID_DEVICE_IOTLB_SID(val) (((val) >> 16) & 0xffffULL)
+#define VTD_INV_DESC_PASID_DEVICE_IOTLB_PASID(val) (((val) >> 32) & 0xfffffULL)
+#define VTD_INV_DESC_PASID_DEVICE_IOTLB_RSVD_HI 0x7feULL
+#define VTD_INV_DESC_PASID_DEVICE_IOTLB_RSVD_LO 0xfff000000000f000ULL
+
 /* Rsvd field masks for spte */
 #define VTD_SPTE_SNP 0x800ULL
 
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 14/25] pci: cache the bus mastering status in the device
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (11 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 08/25] memory: add permissions in IOMMUAccessFlags CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 16/25] pci: add a pci-level initialization function for iommu notifiers CLEMENT MATHIEU--DRIF
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
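The idea of the patch is to route every update of the bus master enable
region through one setter, so that the cached flag can never drift from
the region state. A minimal standalone sketch of that pattern, with
hypothetical stand-ins (struct mock_dev, mock_set_master,
mock_command_written) and PCI_COMMAND_MASTER taken from pci_regs.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_MASTER 0x4  /* Bus Master Enable bit of the command register */

struct mock_dev {
    uint16_t command;
    bool has_power;
    bool region_enabled;  /* stand-in for bus_master_enable_region */
    bool is_master;       /* cached copy, cheap to test on hot paths */
};

/* Single point of update: the region state and the cache change together */
static void mock_set_master(struct mock_dev *d, bool enable)
{
    d->region_enabled = enable;
    d->is_master = enable;
}

/* Mirrors the command-register write path: mastering also requires power */
static void mock_command_written(struct mock_dev *d, uint16_t command)
{
    d->command = command;
    mock_set_master(d, (d->command & PCI_COMMAND_MASTER) && d->has_power);
    assert(d->is_master == d->region_enabled);
}
```

Callers such as the PASID lookup functions can then test the cached flag
instead of re-reading the command register.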
 hw/pci/pci.c                | 24 ++++++++++++++----------
 include/hw/pci/pci_device.h |  1 +
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 045d69f4c1..e5f72f9f1d 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -116,6 +116,12 @@ static GSequence *pci_acpi_index_list(void)
     return used_acpi_index_list;
 }
 
+static void pci_set_master(PCIDevice *d, bool enable)
+{
+    memory_region_set_enabled(&d->bus_master_enable_region, enable);
+    d->is_master = enable; /* cache the status */
+}
+
 static void pci_init_bus_master(PCIDevice *pci_dev)
 {
     AddressSpace *dma_as = pci_device_iommu_address_space(pci_dev);
@@ -123,7 +129,7 @@ static void pci_init_bus_master(PCIDevice *pci_dev)
     memory_region_init_alias(&pci_dev->bus_master_enable_region,
                              OBJECT(pci_dev), "bus master",
                              dma_as->root, 0, memory_region_size(dma_as->root));
-    memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
+    pci_set_master(pci_dev, false);
     memory_region_add_subregion(&pci_dev->bus_master_container_region, 0,
                                 &pci_dev->bus_master_enable_region);
 }
@@ -657,9 +663,8 @@ static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
         pci_bridge_update_mappings(PCI_BRIDGE(s));
     }
 
-    memory_region_set_enabled(&s->bus_master_enable_region,
-                              pci_get_word(s->config + PCI_COMMAND)
-                              & PCI_COMMAND_MASTER);
+    pci_set_master(s,
+                   pci_get_word(s->config + PCI_COMMAND) & PCI_COMMAND_MASTER);
 
     g_free(config);
     return 0;
@@ -1611,9 +1616,9 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
 
     if (ranges_overlap(addr, l, PCI_COMMAND, 2)) {
         pci_update_irq_disabled(d, was_irq_disabled);
-        memory_region_set_enabled(&d->bus_master_enable_region,
-                                  (pci_get_word(d->config + PCI_COMMAND)
-                                   & PCI_COMMAND_MASTER) && d->has_power);
+        pci_set_master(d,
+                      (pci_get_word(d->config + PCI_COMMAND) &
+                            PCI_COMMAND_MASTER) && d->has_power);
     }
 
     msi_write_config(d, addr, val_in, l);
@@ -2888,9 +2893,8 @@ void pci_set_power(PCIDevice *d, bool state)
 
     d->has_power = state;
     pci_update_mappings(d);
-    memory_region_set_enabled(&d->bus_master_enable_region,
-                              (pci_get_word(d->config + PCI_COMMAND)
-                               & PCI_COMMAND_MASTER) && d->has_power);
+    pci_set_master(d, (pci_get_word(d->config + PCI_COMMAND)
+                        & PCI_COMMAND_MASTER) && d->has_power);
     if (!d->has_power) {
         pci_device_reset(d);
     }
diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index d3dd0f64b2..7fa501569a 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -87,6 +87,7 @@ struct PCIDevice {
     char name[64];
     PCIIORegion io_regions[PCI_NUM_REGIONS];
     AddressSpace bus_master_as;
+    bool is_master;
     MemoryRegion bus_master_container_region;
     MemoryRegion bus_master_enable_region;
 
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 15/25] pci: add IOMMU operations to get address spaces and memory regions with PASID
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (14 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 13/25] intel_iommu: add support for PASID-based device IOTLB invalidation CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 18/25] intel_iommu: implement the get_memory_region_pasid iommu operation CLEMENT MATHIEU--DRIF
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/pci/pci.c         | 19 +++++++++++++++++++
 include/hw/pci/pci.h | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index e5f72f9f1d..2b42b4e4cc 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2747,6 +2747,25 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
     return &address_space_memory;
 }
 
+AddressSpace *pci_device_iommu_address_space_pasid(PCIDevice *dev,
+                                                   uint32_t pasid)
+{
+    PCIBus *bus;
+    PCIBus *iommu_bus;
+    int devfn;
+
+    if (!dev->is_master || !pcie_pasid_enabled(dev) || pasid == PCI_NO_PASID) {
+        return NULL;
+    }
+
+    pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
+    if (iommu_bus && iommu_bus->iommu_ops->get_address_space_pasid) {
+        return iommu_bus->iommu_ops->get_address_space_pasid(bus,
+                                    iommu_bus->iommu_opaque, devfn, pasid);
+    }
+    return NULL;
+}
+
 int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
                                 Error **errp)
 {
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 849e391813..0c532c563c 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -385,6 +385,38 @@ typedef struct PCIIOMMUOps {
      * @devfn: device and function number
      */
     AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+    /**
+     * @get_address_space_pasid: same as get_address_space but returns an
+     * address space with the requested PASID
+     *
+     * This callback is required for PASID-based operations
+     *
+     * @bus: the #PCIBus being accessed.
+     *
+     * @opaque: the data passed to pci_setup_iommu().
+     *
+     * @devfn: device and function number
+     *
+     * @pasid: the pasid associated with the requested address space
+     */
+    AddressSpace * (*get_address_space_pasid)(PCIBus *bus, void *opaque,
+                                              int devfn, uint32_t pasid);
+    /**
+     * @get_memory_region_pasid: get the iommu memory region for a given
+     * device and pasid
+     *
+     * @bus: the #PCIBus being accessed.
+     *
+     * @opaque: the data passed to pci_setup_iommu().
+     *
+     * @devfn: device and function number
+     *
+     * @pasid: the pasid associated with the requested memory region
+     */
+    IOMMUMemoryRegion * (*get_memory_region_pasid)(PCIBus *bus,
+                                                   void *opaque,
+                                                   int devfn,
+                                                   uint32_t pasid);
     /**
      * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
      *
@@ -420,6 +452,8 @@ typedef struct PCIIOMMUOps {
 } PCIIOMMUOps;
 
 AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
+AddressSpace *pci_device_iommu_address_space_pasid(PCIDevice *dev,
+                                                   uint32_t pasid);
 int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
                                 Error **errp);
 void pci_device_unset_iommu_device(PCIDevice *dev);
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (17 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 19/25] memory: Allow to store the PASID in IOMMUTLBEntry CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-17 10:40   ` Duan, Zhenzhong
  2024-05-15  7:14 ` [PATCH ats_vtd v2 17/25] intel_iommu: implement the get_address_space_pasid iommu operation CLEMENT MATHIEU--DRIF
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 53f17d66c0..c4ebd4569e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2299,6 +2299,7 @@ out:
     entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) & page_mask;
     entry->addr_mask = ~page_mask;
     entry->perm = access_flags;
+    entry->pasid = pasid;
     return true;
 
 error:
@@ -2307,6 +2308,7 @@ error:
     entry->translated_addr = 0;
     entry->addr_mask = 0;
     entry->perm = IOMMU_NONE;
+    entry->pasid = PCI_NO_PASID;
     return false;
 }
 
@@ -3497,6 +3499,7 @@ static void vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
                 event.entry.target_as = &address_space_memory;
                 event.entry.iova = notifier->start;
                 event.entry.perm = IOMMU_NONE;
+                event.entry.pasid = pasid;
                 event.entry.addr_mask = notifier->end - notifier->start;
                 event.entry.translated_addr = 0;
 
@@ -3678,6 +3681,7 @@ static void vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
             event.entry.target_as = &address_space_memory;
             event.entry.iova = addr;
             event.entry.perm = IOMMU_NONE;
+            event.entry.pasid = pasid;
             event.entry.addr_mask = size - 1;
             event.entry.translated_addr = 0;
 
@@ -4335,6 +4339,7 @@ static void do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
     event.entry.iova = addr;
     event.entry.perm = IOMMU_NONE;
     event.entry.translated_addr = 0;
+    event.entry.pasid = vtd_dev_as->pasid;
     memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
 }
 
@@ -4911,6 +4916,7 @@ static IOMMUTLBEntry vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
     IOMMUTLBEntry iotlb = {
         /* We'll fill in the rest later. */
         .target_as = &address_space_memory,
+        .pasid = vtd_as->pasid,
     };
     bool success;
 
@@ -4923,6 +4929,7 @@ static IOMMUTLBEntry vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
         iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
         iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
         iotlb.perm = IOMMU_RW;
+        iotlb.pasid = PCI_NO_PASID;
         success = true;
     }
 
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 19/25] memory: Allow to store the PASID in IOMMUTLBEntry
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (16 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 18/25] intel_iommu: implement the get_memory_region_pasid iommu operation CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry CLEMENT MATHIEU--DRIF
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

This will be useful for devices that support ATS.

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 include/exec/memory.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2c0e964c07..198b71e9af 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -145,6 +145,7 @@ struct IOMMUTLBEntry {
     hwaddr           translated_addr;
     hwaddr           addr_mask;  /* 0xfff = 4k translation */
     IOMMUAccessFlags perm;
+    uint32_t         pasid;
 };
 
 /*
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 18/25] intel_iommu: implement the get_memory_region_pasid iommu operation
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (15 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 15/25] pci: add IOMMU operations to get address spaces and memory regions with PASID CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 19/25] memory: Allow to store the PASID in IOMMUTLBEntry CLEMENT MATHIEU--DRIF
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index e48b169cda..53f17d66c0 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -5997,9 +5997,24 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
     return vtd_host_dma_iommu_pasid(bus, opaque, devfn, PCI_NO_PASID);
 }
 
+static IOMMUMemoryRegion *vtd_get_memory_region_pasid(PCIBus *bus,
+                                                      void *opaque,
+                                                      int devfn,
+                                                      uint32_t pasid)
+{
+    IntelIOMMUState *s = opaque;
+    VTDAddressSpace *vtd_as;
+
+    assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+
+    vtd_as = vtd_find_add_as(s, bus, devfn, pasid);
+    return &vtd_as->iommu;
+}
+
 static PCIIOMMUOps vtd_iommu_ops = {
     .get_address_space = vtd_host_dma_iommu,
     .get_address_space_pasid = vtd_host_dma_iommu_pasid,
+    .get_memory_region_pasid = vtd_get_memory_region_pasid,
     .set_iommu_device = vtd_dev_set_iommu_device,
     .unset_iommu_device = vtd_dev_unset_iommu_device,
 };
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH ats_vtd v2 17/25] intel_iommu: implement the get_address_space_pasid iommu operation
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (18 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c         | 13 ++++++++++---
 include/hw/i386/intel_iommu.h |  2 +-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index fd4710ba28..e48b169cda 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -5429,7 +5429,7 @@ static const MemoryRegionOps vtd_mem_ir_fault_ops = {
 };
 
 VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
-                                 int devfn, unsigned int pasid)
+                                 int devfn, uint32_t pasid)
 {
     /*
      * We can't simply use sid here since the bus number might not be
@@ -5980,19 +5980,26 @@ static void vtd_reset(DeviceState *dev)
     vtd_refresh_pasid_bind(s);
 }
 
-static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *vtd_host_dma_iommu_pasid(PCIBus *bus, void *opaque,
+                                              int devfn, uint32_t pasid)
 {
     IntelIOMMUState *s = opaque;
     VTDAddressSpace *vtd_as;
 
     assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
 
-    vtd_as = vtd_find_add_as(s, bus, devfn, PCI_NO_PASID);
+    vtd_as = vtd_find_add_as(s, bus, devfn, pasid);
     return &vtd_as->as;
 }
 
+static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+{
+    return vtd_host_dma_iommu_pasid(bus, opaque, devfn, PCI_NO_PASID);
+}
+
 static PCIIOMMUOps vtd_iommu_ops = {
     .get_address_space = vtd_host_dma_iommu,
+    .get_address_space_pasid = vtd_host_dma_iommu_pasid,
     .set_iommu_device = vtd_dev_set_iommu_device,
     .unset_iommu_device = vtd_dev_unset_iommu_device,
 };
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 0d5b933159..bac40e4d40 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -325,6 +325,6 @@ struct IntelIOMMUState {
  * create a new one if none exists
  */
 VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
-                                 int devfn, unsigned int pasid);
+                                 int devfn, uint32_t pasid);
 
 #endif
-- 
2.44.0

* [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (19 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 17/25] intel_iommu: implement the get_address_space_pasid iommu operation CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-17 10:44   ` Duan, Zhenzhong
  2024-05-15  7:14 ` [PATCH ats_vtd v2 24/25] intel_iommu: set the address mask even when a translation fails CLEMENT MATHIEU--DRIF
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

SVM-capable devices need to cache translations, so we provide
a first implementation of such a cache.

This cache uses a two-level design based on hash tables.
The first level is indexed by PASID and the second by virtual address.
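
The two-level design described above can be illustrated with a minimal,
self-contained sketch. This is not the code from this patch: small linear
tables stand in for the hash tables, and every name (ToyAtc, ToySpace,
ToyEntry, toy_create_space, toy_update, toy_lookup) is hypothetical. The
lookup retries with a wider mask at each level so that an address inside a
huge page still finds its entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PASIDS  4
#define TOY_MAX_ENTRIES 8

/* One cached translation; iova must be aligned on (addr_mask + 1). */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
    uint64_t translated_addr;
} ToyEntry;

/* Second level: the entries of one address space, keyed by iova. */
typedef struct {
    uint32_t pasid;
    size_t count;
    ToyEntry entries[TOY_MAX_ENTRIES];
} ToySpace;

/* First level: the address spaces, keyed by PASID. */
typedef struct {
    size_t count;
    ToySpace spaces[TOY_MAX_PASIDS];
    uint64_t min_addr_mask; /* 0xfff for 4 KiB pages */
    uint8_t level_offset;   /* index bits per level, 9 for 4 KiB pages */
    uint8_t levels;         /* 4 for 48-bit addresses */
} ToyAtc;

ToySpace *toy_find_space(ToyAtc *atc, uint32_t pasid)
{
    for (size_t i = 0; i < atc->count; i++) {
        if (atc->spaces[i].pasid == pasid) {
            return &atc->spaces[i];
        }
    }
    return NULL;
}

int toy_create_space(ToyAtc *atc, uint32_t pasid)
{
    if (toy_find_space(atc, pasid)) {
        return 0; /* creating a space twice is harmless */
    }
    if (atc->count == TOY_MAX_PASIDS) {
        return -1;
    }
    atc->spaces[atc->count] = (ToySpace){ .pasid = pasid };
    atc->count++;
    return 0;
}

/* Insert or replace; fails if the space is unknown or full. */
int toy_update(ToyAtc *atc, uint32_t pasid, ToyEntry e)
{
    ToySpace *as = toy_find_space(atc, pasid);

    assert((e.iova & e.addr_mask) == 0);
    if (!as) {
        return -1;
    }
    for (size_t i = 0; i < as->count; i++) {
        if (as->entries[i].iova == e.iova) {
            as->entries[i] = e;
            return 0;
        }
    }
    if (as->count == TOY_MAX_ENTRIES) {
        return -1;
    }
    as->entries[as->count++] = e;
    return 0;
}

ToyEntry *toy_lookup(ToyAtc *atc, uint32_t pasid, uint64_t addr)
{
    ToySpace *as = toy_find_space(atc, pasid);
    uint64_t mask = atc->min_addr_mask;

    if (!as) {
        return NULL;
    }
    /* Try each possible page size, from the smallest to the largest. */
    for (uint8_t level = 0; level < atc->levels; level++) {
        uint64_t key = addr & ~mask;

        for (size_t i = 0; i < as->count; i++) {
            ToyEntry *e = &as->entries[i];
            /* The coverage test is an extra guard chosen for this sketch. */
            if (e->iova == key && (addr & ~e->addr_mask) == e->iova) {
                return e;
            }
        }
        mask = (mask << atc->level_offset) |
               ((1ULL << atc->level_offset) - 1);
    }
    return NULL;
}
```

With min_addr_mask = 0xfff, level_offset = 9 and levels = 4, a 2 MiB entry
stored at its aligned iova is found on the second pass of the loop for any
address inside the page, while an update for a PASID that was never
created is rejected.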

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 tests/unit/meson.build |   1 +
 tests/unit/test-atc.c  | 502 +++++++++++++++++++++++++++++++++++++++++
 util/atc.c             | 211 +++++++++++++++++
 util/atc.h             | 117 ++++++++++
 util/meson.build       |   1 +
 5 files changed, 832 insertions(+)
 create mode 100644 tests/unit/test-atc.c
 create mode 100644 util/atc.c
 create mode 100644 util/atc.h

diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index 228a21d03c..5c9a6fe9f4 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -52,6 +52,7 @@ tests = {
   'test-interval-tree': [],
   'test-xs-node': [qom],
   'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-dmabuf.c'],
+  'test-atc': []
 }
 
 if have_system or have_tools
diff --git a/tests/unit/test-atc.c b/tests/unit/test-atc.c
new file mode 100644
index 0000000000..60fa60924a
--- /dev/null
+++ b/tests/unit/test-atc.c
@@ -0,0 +1,502 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "util/atc.h"
+
+static inline bool tlb_entry_equal(IOMMUTLBEntry *e1, IOMMUTLBEntry *e2)
+{
+    if (!e1 || !e2) {
+        return !e1 && !e2;
+    }
+    return e1->iova == e2->iova &&
+            e1->addr_mask == e2->addr_mask &&
+            e1->pasid == e2->pasid &&
+            e1->perm == e2->perm &&
+            e1->target_as == e2->target_as &&
+            e1->translated_addr == e2->translated_addr;
+}
+
+static void assert_lookup_equals(ATC *atc, IOMMUTLBEntry *target,
+                                 uint32_t pasid, hwaddr iova)
+{
+    IOMMUTLBEntry *result;
+    result = atc_lookup(atc, pasid, iova);
+    g_assert(tlb_entry_equal(result, target));
+}
+
+static void check_creation(uint64_t page_size, uint8_t address_width,
+                           uint8_t levels, uint8_t level_offset,
+                           bool should_work) {
+    ATC *atc = atc_new(page_size, address_width);
+    if (atc) {
+        if (atc->levels != levels || atc->level_offset != level_offset) {
+            g_assert(false); /* ATC created but invalid configuration : fail */
+        }
+        atc_destroy(atc);
+        g_assert(should_work);
+    } else {
+        g_assert(!should_work);
+    }
+}
+
+static void test_creation_parameters(void)
+{
+    check_creation(8, 39, 3, 9, false);
+    check_creation(4095, 39, 3, 9, false);
+    check_creation(4097, 39, 3, 9, false);
+    check_creation(8192, 48, 0, 0, false);
+
+    check_creation(4096, 38, 0, 0, false);
+    check_creation(4096, 39, 3, 9, true);
+    check_creation(4096, 40, 0, 0, false);
+    check_creation(4096, 47, 0, 0, false);
+    check_creation(4096, 48, 4, 9, true);
+    check_creation(4096, 49, 0, 0, false);
+    check_creation(4096, 56, 0, 0, false);
+    check_creation(4096, 57, 5, 9, true);
+    check_creation(4096, 58, 0, 0, false);
+
+    check_creation(16384, 35, 0, 0, false);
+    check_creation(16384, 36, 2, 11, true);
+    check_creation(16384, 37, 0, 0, false);
+    check_creation(16384, 46, 0, 0, false);
+    check_creation(16384, 47, 3, 11, true);
+    check_creation(16384, 48, 0, 0, false);
+    check_creation(16384, 57, 0, 0, false);
+    check_creation(16384, 58, 4, 11, true);
+    check_creation(16384, 59, 0, 0, false);
+}
+
+static void test_single_entry(void)
+{
+    IOMMUTLBEntry entry = {
+        .iova = 0x123456789000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 5,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xdeadbeefULL,
+    };
+
+    ATC *atc = atc_new(4096, 48);
+    g_assert(atc);
+
+    assert_lookup_equals(atc, NULL, entry.pasid,
+                         entry.iova + (entry.addr_mask / 2));
+
+    atc_create_address_space_cache(atc, entry.pasid);
+    g_assert(atc_update(atc, &entry) == 0);
+
+    assert_lookup_equals(atc, NULL, entry.pasid + 1,
+                         entry.iova + (entry.addr_mask / 2));
+    assert_lookup_equals(atc, &entry, entry.pasid,
+                         entry.iova + (entry.addr_mask / 2));
+
+    atc_destroy(atc);
+}
+
+static void test_page_boundaries(void)
+{
+    static const uint32_t pasid = 5;
+    static const hwaddr page_size = 4096;
+
+    /* 2 consecutive entries */
+    IOMMUTLBEntry e1 = {
+        .iova = 0x123456789000ULL,
+        .addr_mask = page_size - 1,
+        .pasid = pasid,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xdeadbeefULL,
+    };
+    IOMMUTLBEntry e2 = {
+        .iova = e1.iova + page_size,
+        .addr_mask = page_size - 1,
+        .pasid = pasid,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x900df00dULL,
+    };
+
+    ATC *atc = atc_new(page_size, 48);
+
+    atc_create_address_space_cache(atc, e1.pasid);
+    /* creating the address space twice should not be a problem */
+    atc_create_address_space_cache(atc, e1.pasid);
+
+    atc_update(atc, &e1);
+    atc_update(atc, &e2);
+
+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova - 1);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova + e1.addr_mask);
+    g_assert((e1.iova + e1.addr_mask + 1) == e2.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova + e2.addr_mask);
+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova + e2.addr_mask + 1);
+
+    assert_lookup_equals(atc, NULL, e1.pasid + 10, e1.iova);
+    assert_lookup_equals(atc, NULL, e2.pasid + 10, e2.iova);
+    atc_destroy(atc);
+}
+
+static void test_huge_page(void)
+{
+    static const uint32_t pasid = 5;
+    static const hwaddr page_size = 4096;
+    IOMMUTLBEntry e1 = {
+        .iova = 0x123456600000ULL,
+        .addr_mask = 0x1fffffULL,
+        .pasid = pasid,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xdeadbeefULL,
+    };
+    hwaddr addr;
+
+    ATC *atc = atc_new(page_size, 48);
+
+    atc_create_address_space_cache(atc, e1.pasid);
+    atc_update(atc, &e1);
+
+    for (addr = e1.iova; addr <= e1.iova + e1.addr_mask; addr += page_size) {
+        assert_lookup_equals(atc, &e1, e1.pasid, addr);
+    }
+    /* addr is now out of the huge page */
+    assert_lookup_equals(atc, NULL, e1.pasid, addr);
+    atc_destroy(atc);
+}
+
+static void test_pasid(void)
+{
+    hwaddr addr = 0xaaaaaaaaa000ULL;
+    IOMMUTLBEntry e1 = {
+        .iova = addr,
+        .addr_mask = 0xfffULL,
+        .pasid = 8,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xdeadbeefULL,
+    };
+    IOMMUTLBEntry e2 = {
+        .iova = addr,
+        .addr_mask = 0xfffULL,
+        .pasid = 2,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xb001ULL,
+    };
+    uint16_t i;
+
+    ATC *atc = atc_new(4096, 48);
+
+    atc_create_address_space_cache(atc, e1.pasid);
+    atc_create_address_space_cache(atc, e2.pasid);
+    atc_update(atc, &e1);
+    atc_update(atc, &e2);
+
+    for (i = 0; i <= MAX(e1.pasid, e2.pasid) + 1; ++i) {
+        if (i == e1.pasid || i == e2.pasid) {
+            continue;
+        }
+        assert_lookup_equals(atc, NULL, i, addr);
+    }
+    assert_lookup_equals(atc, &e1, e1.pasid, addr);
+    assert_lookup_equals(atc, &e2, e2.pasid, addr);
+    atc_destroy(atc);
+}
+
+static void test_large_address(void)
+{
+    IOMMUTLBEntry e1 = {
+        .iova = 0xaaaaaaaaa000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 8,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x5eeeeeedULL,
+    };
+    IOMMUTLBEntry e2 = {
+        .iova = 0x1f00baaaaabf000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = e1.pasid,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xdeadbeefULL,
+    };
+
+    ATC *atc = atc_new(4096, 57);
+
+    atc_create_address_space_cache(atc, e1.pasid);
+    atc_update(atc, &e1);
+    atc_update(atc, &e2);
+
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+    atc_destroy(atc);
+}
+
+static void test_bigger_page(void)
+{
+    IOMMUTLBEntry e1 = {
+        .iova = 0xaabbccdde000ULL,
+        .addr_mask = 0x1fffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x5eeeeeedULL,
+    };
+    hwaddr i;
+
+    ATC *atc = atc_new(8192, 43);
+
+    atc_create_address_space_cache(atc, e1.pasid);
+    atc_update(atc, &e1);
+
+    i = e1.iova & (~e1.addr_mask);
+    assert_lookup_equals(atc, NULL, e1.pasid, i - 1);
+    while (i <= e1.iova + e1.addr_mask) {
+        assert_lookup_equals(atc, &e1, e1.pasid, i);
+        ++i;
+    }
+    assert_lookup_equals(atc, NULL, e1.pasid, i);
+    atc_destroy(atc);
+}
+
+static void test_unknown_pasid(void)
+{
+    IOMMUTLBEntry e1 = {
+        .iova = 0xaabbccfff000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x5eeeeeedULL,
+    };
+
+    ATC *atc = atc_new(4096, 48);
+    g_assert(atc_update(atc, &e1) != 0);
+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+}
+
+static void test_invalidation(void)
+{
+    static uint64_t page_size = 4096;
+    IOMMUTLBEntry e1 = {
+        .iova = 0xaabbccddf000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x5eeeeeedULL,
+    };
+    IOMMUTLBEntry e2 = {
+        .iova = 0xffe00000ULL,
+        .addr_mask = 0x1fffffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xb000001ULL,
+    };
+    IOMMUTLBEntry e3;
+
+    ATC *atc = atc_new(page_size , 48);
+    atc_create_address_space_cache(atc, e1.pasid);
+
+    atc_update(atc, &e1);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    atc_invalidate(atc, &e1);
+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+
+    atc_update(atc, &e1);
+    atc_update(atc, &e2);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+    atc_invalidate(atc, &e2);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+
+    /* invalidate a huge page by invalidating a small region */
+    for (hwaddr addr = e2.iova; addr <= (e2.iova + e2.addr_mask);
+         addr += page_size) {
+        atc_update(atc, &e2);
+        assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+        e3 = (IOMMUTLBEntry){
+            .iova = addr,
+            .addr_mask = page_size - 1,
+            .pasid = e2.pasid,
+            .perm = IOMMU_RW,
+            .translated_addr = 0,
+        };
+        atc_invalidate(atc, &e3);
+        assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+    }
+}
+
+static void test_delete_address_space_cache(void)
+{
+    static uint64_t page_size = 4096;
+    IOMMUTLBEntry e1 = {
+        .iova = 0xaabbccddf000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x5eeeeeedULL,
+    };
+    IOMMUTLBEntry e2 = {
+        .iova = e1.iova,
+        .addr_mask = 0xfffULL,
+        .pasid = 2,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x5eeeeeedULL,
+    };
+
+    ATC *atc = atc_new(page_size , 48);
+    atc_create_address_space_cache(atc, e1.pasid);
+
+    atc_update(atc, &e1);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    atc_invalidate(atc, &e2); /* unknown pasid: this is a no-op */
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+
+    atc_create_address_space_cache(atc, e2.pasid);
+    atc_update(atc, &e2);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+    atc_invalidate(atc, &e1);
+    /* e1 has been removed but e2 is still there */
+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+    atc_update(atc, &e1);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+    atc_delete_address_space_cache(atc, e2.pasid);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+}
+
+static void test_invalidate_entire_address_space(void)
+{
+    static uint64_t page_size = 4096;
+    IOMMUTLBEntry e1 = {
+        .iova = 0x1000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x5eedULL,
+    };
+    IOMMUTLBEntry e2 = {
+        .iova = 0xfffffffff000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xbeefULL,
+    };
+    IOMMUTLBEntry e3 = {
+        .iova = 0,
+        .addr_mask = 0xffffffffffffffffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0,
+    };
+
+    ATC *atc = atc_new(page_size , 48);
+    atc_create_address_space_cache(atc, e1.pasid);
+
+    atc_update(atc, &e1);
+    atc_update(atc, &e2);
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+    atc_invalidate(atc, &e3);
+    /* both e1 and e2 have been removed */
+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+
+    atc_destroy(atc);
+}
+
+static void test_reset(void)
+{
+    static uint64_t page_size = 4096;
+    IOMMUTLBEntry e1 = {
+        .iova = 0x1000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 1,
+        .perm = IOMMU_RW,
+        .translated_addr = 0x5eedULL,
+    };
+    IOMMUTLBEntry e2 = {
+        .iova = 0xfffffffff000ULL,
+        .addr_mask = 0xfffULL,
+        .pasid = 2,
+        .perm = IOMMU_RW,
+        .translated_addr = 0xbeefULL,
+    };
+
+    ATC *atc = atc_new(page_size , 48);
+    atc_create_address_space_cache(atc, e1.pasid);
+    atc_create_address_space_cache(atc, e2.pasid);
+    atc_update(atc, &e1);
+    atc_update(atc, &e2);
+
+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+    atc_reset(atc);
+
+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+}
+
+static void test_get_max_number_of_pages(void)
+{
+    static uint64_t page_size = 4096;
+    hwaddr base = 0xc0fee000; /* aligned */
+    ATC *atc = atc_new(page_size , 48);
+    g_assert(atc_get_max_number_of_pages(atc, base, page_size / 2) == 1);
+    g_assert(atc_get_max_number_of_pages(atc, base, page_size) == 1);
+    g_assert(atc_get_max_number_of_pages(atc, base, page_size + 1) == 2);
+
+    g_assert(atc_get_max_number_of_pages(atc, base + 10, 1) == 1);
+    g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size - 10) == 1);
+    g_assert(atc_get_max_number_of_pages(atc, base + 10,
+                                         page_size - 10 + 1) == 2);
+    g_assert(atc_get_max_number_of_pages(atc, base + 10,
+                                         page_size - 10 + 2) == 2);
+
+    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 1) == 1);
+    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 2) == 2);
+    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 3) == 2);
+
+    g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size * 20) == 21);
+    g_assert(atc_get_max_number_of_pages(atc, base + 10,
+                                         (page_size * 20) + (page_size - 10))
+                                          == 21);
+    g_assert(atc_get_max_number_of_pages(atc, base + 10,
+                                         (page_size * 20) +
+                                         (page_size - 10 + 1)) == 22);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+    g_test_add_func("/atc/test_creation_parameters", test_creation_parameters);
+    g_test_add_func("/atc/test_single_entry", test_single_entry);
+    g_test_add_func("/atc/test_page_boundaries", test_page_boundaries);
+    g_test_add_func("/atc/test_huge_page", test_huge_page);
+    g_test_add_func("/atc/test_pasid", test_pasid);
+    g_test_add_func("/atc/test_large_address", test_large_address);
+    g_test_add_func("/atc/test_bigger_page", test_bigger_page);
+    g_test_add_func("/atc/test_unknown_pasid", test_unknown_pasid);
+    g_test_add_func("/atc/test_invalidation", test_invalidation);
+    g_test_add_func("/atc/test_delete_address_space_cache",
+                    test_delete_address_space_cache);
+    g_test_add_func("/atc/test_invalidate_entire_address_space",
+                    test_invalidate_entire_address_space);
+    g_test_add_func("/atc/test_reset", test_reset);
+    g_test_add_func("/atc/test_get_max_number_of_pages",
+                    test_get_max_number_of_pages);
+    return g_test_run();
+}
diff --git a/util/atc.c b/util/atc.c
new file mode 100644
index 0000000000..d951532e26
--- /dev/null
+++ b/util/atc.c
@@ -0,0 +1,211 @@
+/*
+ * QEMU emulation of an ATC
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "util/atc.h"
+
+
+#define PAGE_TABLE_ENTRY_SIZE 8
+
+/* a pasid is hashed using the identity function */
+static guint atc_pasid_key_hash(gconstpointer v)
+{
+    return (guint)(uintptr_t)v; /* pasid */
+}
+
+/* pasid equality */
+static gboolean atc_pasid_key_equal(gconstpointer v1, gconstpointer v2)
+{
+    return v1 == v2;
+}
+
+/* Hash function for IOTLB entries */
+static guint atc_addr_key_hash(gconstpointer v)
+{
+    hwaddr addr = (hwaddr)v;
+    return (guint)((addr >> 32) ^ (addr & 0xffffffffU));
+}
+
+/* Equality test for IOTLB entries */
+static gboolean atc_addr_key_equal(gconstpointer v1, gconstpointer v2)
+{
+    return (hwaddr)v1 == (hwaddr)v2;
+}
+
+static void atc_address_space_free(void *as)
+{
+    g_hash_table_unref(as);
+}
+
+/* return log2(val), or UINT8_MAX if val is not a power of 2 */
+static uint8_t ilog2(uint64_t val)
+{
+    uint8_t result = 0;
+    while (val != 1) {
+        if (val & 1) {
+            return UINT8_MAX;
+        }
+
+        val >>= 1;
+        result += 1;
+    }
+    return result;
+}
+
+ATC *atc_new(uint64_t page_size, uint8_t address_width)
+{
+    ATC *atc;
+    uint8_t log_page_size = ilog2(page_size);
+    /* number of address bits used to store all the intermediate indexes */
+    uint64_t addr_lookup_indexes_size;
+
+    if (log_page_size == UINT8_MAX) {
+        return NULL;
+    }
+    /*
+     * We only support page table entries of 8 (PAGE_TABLE_ENTRY_SIZE) bytes
+     * log2(page_size / 8) = log2(page_size) - 3
+     * is the level offset
+     */
+    if (log_page_size <= 3) {
+        return NULL;
+    }
+
+    atc = g_new0(ATC, 1);
+    atc->level_offset = log_page_size - 3;
+    /* at this point, we know that page_size is a power of 2 */
+    atc->min_addr_mask = page_size - 1;
+    addr_lookup_indexes_size = address_width - log_page_size;
+    if ((addr_lookup_indexes_size % atc->level_offset) != 0) {
+        goto error;
+    }
+    atc->address_spaces = g_hash_table_new_full(atc_pasid_key_hash,
+                                                atc_pasid_key_equal,
+                                                NULL, atc_address_space_free);
+    atc->levels = addr_lookup_indexes_size / atc->level_offset;
+    atc->page_size = page_size;
+    return atc;
+
+error:
+    g_free(atc);
+    return NULL;
+}
+
+static inline GHashTable *atc_get_address_space_cache(ATC *atc, uint32_t pasid)
+{
+    return g_hash_table_lookup(atc->address_spaces,
+                               (gconstpointer)(uintptr_t)pasid);
+}
+
+void atc_create_address_space_cache(ATC *atc, uint32_t pasid)
+{
+    GHashTable *as_cache;
+
+    as_cache = atc_get_address_space_cache(atc, pasid);
+    if (!as_cache) {
+        as_cache = g_hash_table_new_full(atc_addr_key_hash,
+                                         atc_addr_key_equal,
+                                         NULL, g_free);
+        g_hash_table_replace(atc->address_spaces,
+                             (gpointer)(uintptr_t)pasid, as_cache);
+    }
+}
+
+void atc_delete_address_space_cache(ATC *atc, uint32_t pasid)
+{
+    g_hash_table_remove(atc->address_spaces, (gpointer)(uintptr_t)pasid);
+}
+
+int atc_update(ATC *atc, IOMMUTLBEntry *entry)
+{
+    IOMMUTLBEntry *value;
+    GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
+    if (!as_cache) {
+        return -ENODEV;
+    }
+    value = g_memdup2(entry, sizeof(*value));
+    g_hash_table_replace(as_cache, (gpointer)(entry->iova), value);
+    return 0;
+}
+
+IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr)
+{
+    IOMMUTLBEntry *entry;
+    hwaddr mask = atc->min_addr_mask;
+    hwaddr key = addr & (~mask);
+    GHashTable *as_cache = atc_get_address_space_cache(atc, pasid);
+
+    if (!as_cache) {
+        return NULL;
+    }
+
+    /*
+     * Iterate over the possible page sizes and try to find a hit
+     */
+    for (uint8_t level = 0; level < atc->levels; ++level) {
+        entry = g_hash_table_lookup(as_cache, (gconstpointer)key);
+        if (entry) {
+            return entry;
+        }
+        mask = (mask << atc->level_offset) | ((1ULL << atc->level_offset) - 1);
+        key = addr & (~mask);
+    }
+
+    return NULL;
+}
+
+static gboolean atc_invalidate_entry_predicate(gpointer key, gpointer value,
+                                               gpointer user_data)
+{
+    IOMMUTLBEntry *entry = (IOMMUTLBEntry *)value;
+    IOMMUTLBEntry *target = (IOMMUTLBEntry *)user_data;
+    hwaddr target_mask = ~target->addr_mask;
+    hwaddr entry_mask = ~entry->addr_mask;
+    return ((target->iova & target_mask) == (entry->iova & target_mask)) ||
+           ((target->iova & entry_mask) == (entry->iova & entry_mask));
+}
+
+void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry)
+{
+    GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
+    if (!as_cache) {
+        return;
+    }
+    g_hash_table_foreach_remove(as_cache,
+                                atc_invalidate_entry_predicate,
+                                entry);
+}
+
+void atc_destroy(ATC *atc)
+{
+    g_hash_table_unref(atc->address_spaces);
+}
+
+size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length)
+{
+    hwaddr page_mask = ~(atc->min_addr_mask);
+    size_t result = (length / atc->page_size);
+    if ((((addr & page_mask) + length - 1) & page_mask) !=
+        ((addr + length - 1) & page_mask)) {
+        result += 1;
+    }
+    return result + (length % atc->page_size != 0 ? 1 : 0);
+}
+
+void atc_reset(ATC *atc)
+{
+    g_hash_table_remove_all(atc->address_spaces);
+}
diff --git a/util/atc.h b/util/atc.h
new file mode 100644
index 0000000000..8be95f5cca
--- /dev/null
+++ b/util/atc.h
@@ -0,0 +1,117 @@
+/*
+ * QEMU emulation of an ATC
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef UTIL_ATC_H
+#define UTIL_ATC_H
+
+#include "qemu/osdep.h"
+#include "exec/memory.h"
+
+typedef struct ATC {
+    GHashTable *address_spaces; /* Key : pasid, value : GHashTable */
+    hwaddr min_addr_mask;
+    uint64_t page_size;
+    uint8_t levels;
+    uint8_t level_offset;
+} ATC;
+
+/*
+ * atc_new: Create an ATC.
+ *
+ * Return an ATC or NULL if the creation failed
+ *
+ * @page_size: size of the smallest page, in bytes (must be a power of 2)
+ * @address_width: width of the virtual addresses used by the IOMMU (in bits)
+ */
+ATC *atc_new(uint64_t page_size, uint8_t address_width);
+
+/*
+ * atc_update: Insert or update an entry in the cache
+ *
+ * Return 0 if the operation succeeds, a negative error code otherwise
+ *
+ * The insertion will fail if the address space associated with this pasid
+ * has not been created with atc_create_address_space_cache
+ *
+ * @atc: the ATC to update
+ * @entry: the tlb entry to insert into the cache
+ */
+int atc_update(ATC *atc, IOMMUTLBEntry *entry);
+
+/*
+ * atc_create_address_space_cache: declare a new address space
+ * identified by a PASID
+ *
+ * @atc: the ATC to update
+ * @pasid: the pasid of the address space to be created
+ */
+void atc_create_address_space_cache(ATC *atc, uint32_t pasid);
+
+/*
+ * atc_delete_address_space_cache: delete an address space
+ * identified by a PASID
+ *
+ * @atc: the ATC to update
+ * @pasid: the pasid of the address space to be deleted
+ */
+void atc_delete_address_space_cache(ATC *atc, uint32_t pasid);
+
+/*
+ * atc_lookup: query the cache in a given address space
+ *
+ * @atc: the ATC to query
+ * @pasid: the pasid of the address space to query
+ * @addr: the virtual address to translate
+ */
+IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr);
+
+/*
+ * atc_invalidate: invalidate an entry in the cache
+ *
+ * @atc: the ATC to update
+ * @entry: the entry to invalidate
+ */
+void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry);
+
+/*
+ * atc_destroy: delete an ATC
+ *
+ * @atc: the cache to be deleted
+ */
+void atc_destroy(ATC *atc);
+
+/*
+ * atc_get_max_number_of_pages: get the number of pages a memory operation
+ * will access if all the pages concerned have the minimum size.
+ *
+ * This function can be used to determine the size of the result array to be
+ * allocated when issuing an ATS request.
+ *
+ * @atc: the cache
+ * @addr: start address
+ * @length: number of bytes accessed from addr
+ */
+size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length);
+
+/*
+ * atc_reset: invalidates all the entries stored in the ATC
+ *
+ * @atc: the cache
+ */
+void atc_reset(ATC *atc);
+
+#endif
diff --git a/util/meson.build b/util/meson.build
index 0ef9886be0..a2e0e9e5d7 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -94,6 +94,7 @@ if have_block
   util_ss.add(files('hbitmap.c'))
   util_ss.add(files('hexdump.c'))
   util_ss.add(files('iova-tree.c'))
+  util_ss.add(files('atc.c'))
   util_ss.add(files('iov.c', 'uri.c'))
   util_ss.add(files('nvdimm-utils.c'))
   util_ss.add(files('block-helpers.c'))
-- 
2.44.0

* [PATCH ats_vtd v2 24/25] intel_iommu: set the address mask even when a translation fails
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (20 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 23/25] pci: add a pci-level API for ATS CLEMENT MATHIEU--DRIF
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Implement the behavior defined in section 10.2.3.5 of the PCIe
specification (rev 5): the address range must be reported even when the
translation fails. This is needed by devices that support ATS.
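
As an illustration only, the mask selection described above can be sketched
outside of QEMU as follows. The SKETCH_* names and both helpers are
hypothetical; the sketch assumes 4 KiB base pages and 9 address bits resolved
per page-table level, and falls back to a 4 KiB range when the level at which
the walk failed is unknown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: 4 KiB base pages, 9 address bits per level. */
#define SKETCH_PAGE_SHIFT_4K 12
#define SKETCH_LEVEL_BITS    9
#define SKETCH_LEVEL_UNKNOWN UINT32_MAX

/* addr_mask-style value (low bits set) for a page mapped at `level`. */
static uint64_t sketch_level_addr_mask(uint32_t level)
{
    assert(level >= 1 && level <= 5);
    uint32_t shift = SKETCH_PAGE_SHIFT_4K + (level - 1) * SKETCH_LEVEL_BITS;
    return (1ULL << shift) - 1;
}

/* Mask reported for a failed translation. */
static uint64_t sketch_error_addr_mask(uint32_t level)
{
    if (level == SKETCH_LEVEL_UNKNOWN) {
        /* The walk never reached a page table: report a 4 KiB range. */
        return (1ULL << SKETCH_PAGE_SHIFT_4K) - 1;
    }
    return sketch_level_addr_mask(level);
}
```

With these assumptions, a failure detected at level 2 would report a 2 MiB
range, which lets the caller skip the whole huge page in one step.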

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index c4ebd4569e..67b9ff4934 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2164,7 +2164,8 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
     uint8_t bus_num = pci_bus_num(bus);
     VTDContextCacheEntry *cc_entry;
     uint64_t pte, page_mask;
-    uint32_t level, pasid = vtd_as->pasid;
+    uint32_t level = UINT32_MAX;
+    uint32_t pasid = vtd_as->pasid;
     uint16_t source_id = PCI_BUILD_BDF(bus_num, devfn);
     int ret_fr;
     bool is_fpd_set = false;
@@ -2306,7 +2307,12 @@ error:
     vtd_iommu_unlock(s);
     entry->iova = 0;
     entry->translated_addr = 0;
-    entry->addr_mask = 0;
+    /*
+     * Set the mask for ATS: the range must be present even when the
+     * translation fails (PCIe rev 5, section 10.2.3.5).
+     */
+    entry->addr_mask = (level != UINT32_MAX) ?
+                       (~vtd_slpt_level_page_mask(level)) : (~VTD_PAGE_MASK_4K);
     entry->perm = IOMMU_NONE;
     entry->pasid = PCI_NO_PASID;
     return false;
-- 
2.44.0

* [PATCH ats_vtd v2 22/25] memory: add an API for ATS support
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (22 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 23/25] pci: add a pci-level API for ATS CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 25/25] intel_iommu: add support for ATS CLEMENT MATHIEU--DRIF
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

IOMMUs have to implement iommu_ats_request_translation to support ATS.

Devices can use IOMMU_TLB_ENTRY_TRANSLATION_ERROR to check the TLB
entries returned by a translation request.
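
A minimal sketch of how a device model might apply this check to a result
array is given below. It is not QEMU code: SketchTLBEntry, the SK_* constants
and sketch_count_errors are stand-ins invented for the example, and the
permission values are assumed to follow the usual read = 1, write = 2
encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the IOMMU access flags (assumed encoding, sketch only). */
enum { SK_IOMMU_NONE = 0, SK_IOMMU_RO = 1, SK_IOMMU_WO = 2, SK_IOMMU_RW = 3 };

/* Reduced stand-in for IOMMUTLBEntry. */
typedef struct SketchTLBEntry {
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
    int perm;
} SketchTLBEntry;

/* Same test as the macro added by this patch: neither R nor W is granted. */
#define SK_TLB_ENTRY_TRANSLATION_ERROR(entry) \
    ((((entry)->perm) & SK_IOMMU_RW) == SK_IOMMU_NONE)

/* Count the entries that the device must not use for DMA. */
static size_t sketch_count_errors(const SketchTLBEntry *result, size_t n)
{
    size_t errors = 0;

    assert(result || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (SK_TLB_ENTRY_TRANSLATION_ERROR(&result[i])) {
            errors++;
        }
    }
    return errors;
}
```

A read-only entry is not an error under this test; only entries granting
neither read nor write access are counted.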

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 include/exec/memory.h | 26 ++++++++++++++++++++++++++
 system/memory.c       | 20 ++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 198b71e9af..98b02b942c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -148,6 +148,10 @@ struct IOMMUTLBEntry {
     uint32_t         pasid;
 };
 
+/* Check if an IOMMU TLB entry indicates a translation error */
+#define IOMMU_TLB_ENTRY_TRANSLATION_ERROR(entry) ((((entry)->perm) & IOMMU_RW) \
+                                                    == IOMMU_NONE)
+
 /*
  * Bitmap for different IOMMUNotifier capabilities. Each notifier can
  * register with one or multiple IOMMU Notifier capability bit(s).
@@ -567,6 +571,20 @@ struct IOMMUMemoryRegionClass {
      int (*iommu_set_iova_ranges)(IOMMUMemoryRegion *iommu,
                                   GList *iova_ranges,
                                   Error **errp);
+
+    /**
+     * @iommu_ats_request_translation:
+     * This method must be implemented if the IOMMU has ATS enabled
+     *
+     * @see pci_ats_request_translation_pasid
+     */
+    ssize_t (*iommu_ats_request_translation)(IOMMUMemoryRegion *iommu,
+                                             bool priv_req, bool exec_req,
+                                             hwaddr addr, size_t length,
+                                             bool no_write,
+                                             IOMMUTLBEntry *result,
+                                             size_t result_length,
+                                             uint32_t *err_count);
 };
 
 typedef struct RamDiscardListener RamDiscardListener;
@@ -1870,6 +1888,14 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n);
 void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
                                              IOMMUNotifier *n);
 
+ssize_t memory_region_iommu_ats_request_translation(IOMMUMemoryRegion *iommu_mr,
+                                                bool priv_req, bool exec_req,
+                                                hwaddr addr, size_t length,
+                                                bool no_write,
+                                                IOMMUTLBEntry *result,
+                                                size_t result_length,
+                                                uint32_t *err_count);
+
 /**
  * memory_region_iommu_get_attr: return an IOMMU attr if get_attr() is
  * defined on the IOMMU.
diff --git a/system/memory.c b/system/memory.c
index a229a79988..9c9418c5ee 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2000,6 +2000,26 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
     memory_region_update_iommu_notify_flags(iommu_mr, NULL);
 }
 
+ssize_t memory_region_iommu_ats_request_translation(IOMMUMemoryRegion *iommu_mr,
+                                                    bool priv_req,
+                                                    bool exec_req,
+                                                    hwaddr addr, size_t length,
+                                                    bool no_write,
+                                                    IOMMUTLBEntry *result,
+                                                    size_t result_length,
+                                                    uint32_t *err_count)
+{
+    IOMMUMemoryRegionClass *imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
+
+    if (!imrc->iommu_ats_request_translation) {
+        return -ENODEV;
+    }
+
+    return imrc->iommu_ats_request_translation(iommu_mr, priv_req, exec_req,
+                                               addr, length, no_write, result,
+                                               result_length, err_count);
+}
+
 void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
                                     IOMMUTLBEvent *event)
 {
-- 
2.44.0

* [PATCH ats_vtd v2 23/25] pci: add a pci-level API for ATS
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (21 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 24/25] intel_iommu: set the address mask even when a translation fails CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 22/25] memory: add an API for ATS support CLEMENT MATHIEU--DRIF
  2024-05-15  7:14 ` [PATCH ats_vtd v2 25/25] intel_iommu: add support for ATS CLEMENT MATHIEU--DRIF
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Devices implementing ATS can send translation requests using
pci_ats_request_translation_pasid.

Invalidation events are sent back to the device through the IOMMU
notifier managed with pci_register_iommu_tlb_event_notifier and
pci_unregister_iommu_tlb_event_notifier.
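
The following stand-alone sketch illustrates, under simplifying assumptions,
how a caller might size its result buffer and what a -ENOMEM return means.
Nothing here is the QEMU API: SketchEntry, sketch_translate, sketch_max_pages
and sketch_request are hypothetical, and the mock translator maps every 4 KiB
page read/write.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h> /* ssize_t */

typedef struct SketchEntry {
    uint64_t iova;
    uint64_t addr_mask;
    int perm;
} SketchEntry;

/* Hypothetical translator: every 4 KiB page is mapped read/write. */
static SketchEntry sketch_translate(uint64_t addr)
{
    SketchEntry e = { .iova = addr & ~0xfffULL, .addr_mask = 0xfffULL,
                      .perm = 3 };
    return e;
}

/* Upper bound on entries: minimum-size pages touched by the access. */
static size_t sketch_max_pages(uint64_t addr, size_t length,
                               unsigned page_shift)
{
    if (length == 0) {
        return 0;
    }
    return (size_t)(((addr + length - 1) >> page_shift) -
                    (addr >> page_shift) + 1);
}

/* Walk the range one translation at a time, as an IOMMU model could. */
static ssize_t sketch_request(uint64_t addr, size_t length,
                              SketchEntry *result, size_t result_length)
{
    uint64_t end = addr + length;
    size_t n = 0;

    assert(result_length);
    while (addr < end && n < result_length) {
        SketchEntry e = sketch_translate(addr);
        result[n++] = e;
        /* Jump to the first byte after the page just translated. */
        addr = (addr & ~e.addr_mask) + e.addr_mask + 1;
    }
    /* Part of the range is still untranslated: the buffer was too small. */
    return (addr < end) ? -ENOMEM : (ssize_t)n;
}
```

Sizing the buffer with the minimum page size is pessimistic when huge pages
are in use, but it should be enough to avoid the -ENOMEM case.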

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
 include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 96 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index f90eb04fda..20b838657e 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2831,6 +2831,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
     }
 }
 
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+                                          bool priv_req, bool exec_req, hwaddr addr,
+                                          size_t length, bool no_write,
+                                          IOMMUTLBEntry *result,
+                                          size_t result_length,
+                                          uint32_t *err_count)
+{
+    assert(result_length);
+    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+                                                                        pasid);
+    if (!iommu_mr || !pcie_ats_enabled(dev)) {
+        return -EPERM;
+    }
+    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
+                                                       exec_req, addr, length,
+                                                       no_write, result,
+                                                       result_length,
+                                                       err_count);
+}
+
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+                                          IOMMUNotifier *n)
+{
+    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+                                                                        pasid);
+    if (!iommu_mr) {
+        return -EPERM;
+    }
+    return memory_region_register_iommu_notifier(MEMORY_REGION(iommu_mr), n,
+                                                 &error_fatal);
+}
+
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+                                             IOMMUNotifier *n)
+{
+    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+                                                                        pasid);
+    if (!iommu_mr) {
+        return -EPERM;
+    }
+    memory_region_unregister_iommu_notifier(MEMORY_REGION(iommu_mr), n);
+    return 0;
+}
+
 void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
 {
     /*
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 1587c18cd9..dc247d24bd 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -471,6 +471,58 @@ void pci_device_unset_iommu_device(PCIDevice *dev);
 bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
                                    IOMMUNotifier *n, IOMMUNotify fn);
 
+/**
+ * pci_ats_request_translation_pasid: perform an ATS request
+ *
+ * Return the number of translations stored in @result in case of success,
+ * a negative error code otherwise.
+ * -ENOMEM is returned when the result buffer is not large enough to store
+ * all the translations.
+ *
+ * @dev: the ATS-capable PCI device
+ * @pasid: the pasid of the address space in which the translation will be made
+ * @priv_req: privileged mode bit (PASID TLP)
+ * @exec_req: execute request bit (PASID TLP)
+ * @addr: start address of the memory range to be translated
+ * @length: length of the memory range in bytes
+ * @no_write: request a read-only access translation (if supported by the IOMMU)
+ * @result: buffer in which the TLB entries will be stored
+ * @result_length: result buffer length
+ * @err_count: number of untranslated subregions
+ */
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+                                          bool priv_req, bool exec_req, hwaddr addr,
+                                          size_t length, bool no_write,
+                                          IOMMUTLBEntry *result,
+                                          size_t result_length,
+                                          uint32_t *err_count);
+
+/**
+ * pci_register_iommu_tlb_event_notifier: register a notifier for changes to
+ * IOMMU translation entries in a specific address space.
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to get notified
+ * @pasid: the pasid of the address space to track
+ * @n: the notifier to register
+ */
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+                                          IOMMUNotifier *n);
+
+/**
+ * pci_unregister_iommu_tlb_event_notifier: unregister a notifier that has been
+ * registered with pci_register_iommu_tlb_event_notifier
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to unsubscribe
+ * @pasid: the pasid of the address space to be untracked
+ * @n: the notifier to unregister
+ */
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+                                            IOMMUNotifier *n);
+
 /**
  * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
  *
-- 
2.44.0

* [PATCH ats_vtd v2 25/25] intel_iommu: add support for ATS
  2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
                   ` (23 preceding siblings ...)
  2024-05-15  7:14 ` [PATCH ats_vtd v2 22/25] memory: add an API for ATS support CLEMENT MATHIEU--DRIF
@ 2024-05-15  7:14 ` CLEMENT MATHIEU--DRIF
  24 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-15  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: jasowang, zhenzhong.duan, kevin.tian, yi.l.liu, joao.m.martins,
	peterx, CLEMENT MATHIEU--DRIF

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/i386/intel_iommu.c          | 75 ++++++++++++++++++++++++++++++++--
 hw/i386/intel_iommu_internal.h |  1 +
 2 files changed, 73 insertions(+), 3 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 67b9ff4934..7421a99373 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -5394,12 +5394,10 @@ static void vtd_report_ir_illegal_access(VTDAddressSpace *vtd_as,
     bool is_fpd_set = false;
     VTDContextEntry ce;
 
-    assert(vtd_as->pasid != PCI_NO_PASID);
-
     /* Try out best to fetch FPD, we can't do anything more */
     if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) == 0) {
         is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
-        if (!is_fpd_set && s->root_scalable) {
+        if (!is_fpd_set && s->root_scalable && vtd_as->pasid != PCI_NO_PASID) {
             vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set, vtd_as->pasid);
         }
     }
@@ -6024,6 +6022,75 @@ static IOMMUMemoryRegion *vtd_get_memory_region_pasid(PCIBus *bus,
     return &vtd_as->iommu;
 }
 
+static IOMMUTLBEntry vtd_iommu_ats_do_translate(IOMMUMemoryRegion *iommu,
+                                                hwaddr addr,
+                                                IOMMUAccessFlags flags,
+                                                int iommu_idx)
+{
+    IOMMUTLBEntry entry;
+    VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
+
+    if (vtd_is_interrupt_addr(addr)) {
+        vtd_report_ir_illegal_access(vtd_as, addr, flags & IOMMU_WO);
+        entry.iova = 0;
+        entry.translated_addr = 0;
+        entry.addr_mask = ~VTD_PAGE_MASK_4K;
+        entry.perm = IOMMU_NONE;
+        entry.pasid = PCI_NO_PASID;
+    } else {
+        entry = vtd_iommu_translate(iommu, addr, flags, iommu_idx);
+    }
+    return entry;
+}
+
+static ssize_t vtd_iommu_ats_request_translation(IOMMUMemoryRegion *iommu,
+                                                 bool priv_req, bool exec_req,
+                                                 hwaddr addr, size_t length,
+                                                 bool no_write,
+                                                 IOMMUTLBEntry *result,
+                                                 size_t result_length,
+                                                 uint32_t *err_count)
+{
+    IOMMUAccessFlags flags = IOMMU_ACCESS_FLAG_FULL(true, !no_write, exec_req,
+                                                    priv_req, false, false);
+    ssize_t res_index = 0;
+    hwaddr target_address = addr + length;
+    IOMMUTLBEntry entry;
+
+    *err_count = 0;
+
+    while ((addr < target_address) && (res_index < result_length)) {
+        entry = vtd_iommu_ats_do_translate(iommu, addr, flags, 0);
+        if (!IOMMU_TLB_ENTRY_TRANSLATION_ERROR(&entry)) { /* Translation done */
+            if (no_write) {
+                /* The device should not use this entry for a write access */
+                entry.perm &= ~IOMMU_WO;
+            }
+            /*
+             * 4.1.2 : Global Mapping (G) : Remapping hardware provides a value
+             * of 0 in this field
+             */
+            entry.perm &= ~IOMMU_GLOBAL;
+        } else {
+            *err_count += 1;
+        }
+        result[res_index] = entry;
+        res_index += 1;
+        addr = (addr & (~entry.addr_mask)) + (entry.addr_mask + 1);
+    }
+
+    /* Buffer too small */
+    if (addr < target_address) {
+        return -ENOMEM;
+    }
+    return res_index;
+}
+
+static uint64_t vtd_get_min_page_size(IOMMUMemoryRegion *iommu)
+{
+    return VTD_PAGE_SIZE;
+}
+
 static PCIIOMMUOps vtd_iommu_ops = {
     .get_address_space = vtd_host_dma_iommu,
     .get_address_space_pasid = vtd_host_dma_iommu_pasid,
@@ -6230,6 +6297,8 @@ static void vtd_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->translate = vtd_iommu_translate;
     imrc->notify_flag_changed = vtd_iommu_notify_flag_changed;
     imrc->replay = vtd_iommu_replay;
+    imrc->iommu_ats_request_translation = vtd_iommu_ats_request_translation;
+    imrc->get_min_page_size = vtd_get_min_page_size;
 }
 
 static const TypeInfo vtd_iommu_memory_region_info = {
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 3d59e10488..aa4d0d5f16 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -193,6 +193,7 @@
 #define VTD_ECAP_MHMV               (15ULL << 20)
 #define VTD_ECAP_NEST               (1ULL << 26)
 #define VTD_ECAP_SRS                (1ULL << 31)
+#define VTD_ECAP_NWFS               (1ULL << 33)
 #define VTD_ECAP_PSS                (19ULL << 35)
 #define VTD_ECAP_PASID              (1ULL << 40)
 #define VTD_ECAP_SMTS               (1ULL << 43)
-- 
2.44.0

* RE: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry
  2024-05-15  7:14 ` [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry CLEMENT MATHIEU--DRIF
@ 2024-05-17 10:40   ` Duan, Zhenzhong
  2024-05-17 11:11     ` CLEMENT MATHIEU--DRIF
  0 siblings, 1 reply; 32+ messages in thread
From: Duan, Zhenzhong @ 2024-05-17 10:40 UTC (permalink / raw)
  To: CLEMENT MATHIEU--DRIF, qemu-devel
  Cc: jasowang, Tian, Kevin, Liu, Yi L, joao.m.martins, peterx



>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>Subject: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>creating an instance of IOMMUTLBEntry
>
>Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>---
> hw/i386/intel_iommu.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
>diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>index 53f17d66c0..c4ebd4569e 100644
>--- a/hw/i386/intel_iommu.c
>+++ b/hw/i386/intel_iommu.c
>@@ -2299,6 +2299,7 @@ out:
>     entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) &
>page_mask;
>     entry->addr_mask = ~page_mask;
>     entry->perm = access_flags;
>+    entry->pasid = pasid;

For PCI_NO_PASID, do we want to assign PCI_NO_PASID or rid2pasid?

Thanks
Zhenzhong

>     return true;
>
> error:
>@@ -2307,6 +2308,7 @@ error:
>     entry->translated_addr = 0;
>     entry->addr_mask = 0;
>     entry->perm = IOMMU_NONE;
>+    entry->pasid = PCI_NO_PASID;
>     return false;
> }
>
>@@ -3497,6 +3499,7 @@ static void
>vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
>                 event.entry.target_as = &address_space_memory;
>                 event.entry.iova = notifier->start;
>                 event.entry.perm = IOMMU_NONE;
>+                event.entry.pasid = pasid;
>                 event.entry.addr_mask = notifier->end - notifier->start;
>                 event.entry.translated_addr = 0;
>
>@@ -3678,6 +3681,7 @@ static void
>vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
>             event.entry.target_as = &address_space_memory;
>             event.entry.iova = addr;
>             event.entry.perm = IOMMU_NONE;
>+            event.entry.pasid = pasid;
>             event.entry.addr_mask = size - 1;
>             event.entry.translated_addr = 0;
>
>@@ -4335,6 +4339,7 @@ static void
>do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
>     event.entry.iova = addr;
>     event.entry.perm = IOMMU_NONE;
>     event.entry.translated_addr = 0;
>+    event.entry.pasid = vtd_dev_as->pasid;
>     memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
> }
>
>@@ -4911,6 +4916,7 @@ static IOMMUTLBEntry
>vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>     IOMMUTLBEntry iotlb = {
>         /* We'll fill in the rest later. */
>         .target_as = &address_space_memory,
>+        .pasid = vtd_as->pasid,
>     };
>     bool success;
>
>@@ -4923,6 +4929,7 @@ static IOMMUTLBEntry
>vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>         iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
>         iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
>         iotlb.perm = IOMMU_RW;
>+        iotlb.pasid = PCI_NO_PASID;
>         success = true;
>     }
>
>--
>2.44.0

* RE: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM
  2024-05-15  7:14 ` [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
@ 2024-05-17 10:44   ` Duan, Zhenzhong
  2024-05-17 11:12     ` CLEMENT MATHIEU--DRIF
  0 siblings, 1 reply; 32+ messages in thread
From: Duan, Zhenzhong @ 2024-05-17 10:44 UTC (permalink / raw)
  To: CLEMENT MATHIEU--DRIF, qemu-devel
  Cc: jasowang, Tian, Kevin, Liu, Yi L, joao.m.martins, peterx



>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>Subject: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe
>devices that support SVM
>
>As the SVM-capable devices will need to cache translations, we provide
>a first implementation.
>
>This cache uses a two-level design based on hash tables.
>The first level is indexed by a PASID and the second by a virtual address.
>
>Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>---
> tests/unit/meson.build |   1 +
> tests/unit/test-atc.c  | 502
>+++++++++++++++++++++++++++++++++++++++++
> util/atc.c             | 211 +++++++++++++++++
> util/atc.h             | 117 ++++++++++
> util/meson.build       |   1 +
> 5 files changed, 832 insertions(+)
> create mode 100644 tests/unit/test-atc.c
> create mode 100644 util/atc.c
> create mode 100644 util/atc.h

Maybe the unit test can be split from the functional change?

>
>diff --git a/tests/unit/meson.build b/tests/unit/meson.build
>index 228a21d03c..5c9a6fe9f4 100644
>--- a/tests/unit/meson.build
>+++ b/tests/unit/meson.build
>@@ -52,6 +52,7 @@ tests = {
>   'test-interval-tree': [],
>   'test-xs-node': [qom],
>   'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-
>dmabuf.c'],
>+  'test-atc': []
> }
>
> if have_system or have_tools
>diff --git a/tests/unit/test-atc.c b/tests/unit/test-atc.c
>new file mode 100644
>index 0000000000..60fa60924a
>--- /dev/null
>+++ b/tests/unit/test-atc.c
>@@ -0,0 +1,502 @@
>+/*
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+
>+ * This program is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>+ * GNU General Public License for more details.
>+
>+ * You should have received a copy of the GNU General Public License along
>+ * with this program; if not, see <http://www.gnu.org/licenses/>.
>+ */
>+
>+#include "util/atc.h"
>+
>+static inline bool tlb_entry_equal(IOMMUTLBEntry *e1, IOMMUTLBEntry
>*e2)
>+{
>+    if (!e1 || !e2) {
>+        return !e1 && !e2;
>+    }
>+    return e1->iova == e2->iova &&
>+            e1->addr_mask == e2->addr_mask &&
>+            e1->pasid == e2->pasid &&
>+            e1->perm == e2->perm &&
>+            e1->target_as == e2->target_as &&
>+            e1->translated_addr == e2->translated_addr;
>+}
>+
>+static void assert_lookup_equals(ATC *atc, IOMMUTLBEntry *target,
>+                                 uint32_t pasid, hwaddr iova)
>+{
>+    IOMMUTLBEntry *result;
>+    result = atc_lookup(atc, pasid, iova);
>+    g_assert(tlb_entry_equal(result, target));
>+}
>+
>+static void check_creation(uint64_t page_size, uint8_t address_width,
>+                           uint8_t levels, uint8_t level_offset,
>+                           bool should_work) {
>+    ATC *atc = atc_new(page_size, address_width);
>+    if (atc) {
>+        if (atc->levels != levels || atc->level_offset != level_offset) {
>+            g_assert(false); /* ATC created but invalid configuration : fail */
>+        }
>+        atc_destroy(atc);
>+        g_assert(should_work);
>+    } else {
>+        g_assert(!should_work);
>+    }
>+}
>+
>+static void test_creation_parameters(void)
>+{
>+    check_creation(8, 39, 3, 9, false);
>+    check_creation(4095, 39, 3, 9, false);
>+    check_creation(4097, 39, 3, 9, false);
>+    check_creation(8192, 48, 0, 0, false);
>+
>+    check_creation(4096, 38, 0, 0, false);
>+    check_creation(4096, 39, 3, 9, true);
>+    check_creation(4096, 40, 0, 0, false);
>+    check_creation(4096, 47, 0, 0, false);
>+    check_creation(4096, 48, 4, 9, true);
>+    check_creation(4096, 49, 0, 0, false);
>+    check_creation(4096, 56, 0, 0, false);
>+    check_creation(4096, 57, 5, 9, true);
>+    check_creation(4096, 58, 0, 0, false);
>+
>+    check_creation(16384, 35, 0, 0, false);
>+    check_creation(16384, 36, 2, 11, true);
>+    check_creation(16384, 37, 0, 0, false);
>+    check_creation(16384, 46, 0, 0, false);
>+    check_creation(16384, 47, 3, 11, true);
>+    check_creation(16384, 48, 0, 0, false);
>+    check_creation(16384, 57, 0, 0, false);
>+    check_creation(16384, 58, 4, 11, true);
>+    check_creation(16384, 59, 0, 0, false);
>+}
>+
>+static void test_single_entry(void)
>+{
>+    IOMMUTLBEntry entry = {
>+        .iova = 0x123456789000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 5,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xdeadbeefULL,
>+    };
>+
>+    ATC *atc = atc_new(4096, 48);
>+    g_assert(atc);
>+
>+    assert_lookup_equals(atc, NULL, entry.pasid,
>+                         entry.iova + (entry.addr_mask / 2));
>+
>+    atc_create_address_space_cache(atc, entry.pasid);
>+    g_assert(atc_update(atc, &entry) == 0);
>+
>+    assert_lookup_equals(atc, NULL, entry.pasid + 1,
>+                         entry.iova + (entry.addr_mask / 2));
>+    assert_lookup_equals(atc, &entry, entry.pasid,
>+                         entry.iova + (entry.addr_mask / 2));
>+
>+    atc_destroy(atc);
>+}
>+
>+static void test_page_boundaries(void)
>+{
>+    static const uint32_t pasid = 5;
>+    static const hwaddr page_size = 4096;
>+
>+    /* 2 consecutive entries */
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0x123456789000ULL,
>+        .addr_mask = page_size - 1,
>+        .pasid = pasid,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xdeadbeefULL,
>+    };
>+    IOMMUTLBEntry e2 = {
>+        .iova = e1.iova + page_size,
>+        .addr_mask = page_size - 1,
>+        .pasid = pasid,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x900df00dULL,
>+    };
>+
>+    ATC *atc = atc_new(page_size, 48);
>+
>+    atc_create_address_space_cache(atc, e1.pasid);
>+    /* creating the address space twice should not be a problem */
>+    atc_create_address_space_cache(atc, e1.pasid);
>+
>+    atc_update(atc, &e1);
>+    atc_update(atc, &e2);
>+
>+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova - 1);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova + e1.addr_mask);
>+    g_assert((e1.iova + e1.addr_mask + 1) == e2.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova + e2.addr_mask);
>+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova + e2.addr_mask + 1);
>+
>+    assert_lookup_equals(atc, NULL, e1.pasid + 10, e1.iova);
>+    assert_lookup_equals(atc, NULL, e2.pasid + 10, e2.iova);
>+    atc_destroy(atc);
>+}
>+
>+static void test_huge_page(void)
>+{
>+    static const uint32_t pasid = 5;
>+    static const hwaddr page_size = 4096;
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0x123456600000ULL,
>+        .addr_mask = 0x1fffffULL,
>+        .pasid = pasid,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xdeadbeefULL,
>+    };
>+    hwaddr addr;
>+
>+    ATC *atc = atc_new(page_size, 48);
>+
>+    atc_create_address_space_cache(atc, e1.pasid);
>+    atc_update(atc, &e1);
>+
>+    for (addr = e1.iova; addr <= e1.iova + e1.addr_mask; addr += page_size) {
>+        assert_lookup_equals(atc, &e1, e1.pasid, addr);
>+    }
>+    /* addr is now out of the huge page */
>+    assert_lookup_equals(atc, NULL, e1.pasid, addr);
>+    atc_destroy(atc);
>+}
>+
>+static void test_pasid(void)
>+{
>+    hwaddr addr = 0xaaaaaaaaa000ULL;
>+    IOMMUTLBEntry e1 = {
>+        .iova = addr,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 8,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xdeadbeefULL,
>+    };
>+    IOMMUTLBEntry e2 = {
>+        .iova = addr,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 2,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xb001ULL,
>+    };
>+    uint16_t i;
>+
>+    ATC *atc = atc_new(4096, 48);
>+
>+    atc_create_address_space_cache(atc, e1.pasid);
>+    atc_create_address_space_cache(atc, e2.pasid);
>+    atc_update(atc, &e1);
>+    atc_update(atc, &e2);
>+
>+    for (i = 0; i <= MAX(e1.pasid, e2.pasid) + 1; ++i) {
>+        if (i == e1.pasid || i == e2.pasid) {
>+            continue;
>+        }
>+        assert_lookup_equals(atc, NULL, i, addr);
>+    }
>+    assert_lookup_equals(atc, &e1, e1.pasid, addr);
>+    assert_lookup_equals(atc, &e2, e2.pasid, addr);
>+    atc_destroy(atc);
>+}
>+
>+static void test_large_address(void)
>+{
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0xaaaaaaaaa000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 8,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x5eeeeeedULL,
>+    };
>+    IOMMUTLBEntry e2 = {
>+        .iova = 0x1f00baaaaabf000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = e1.pasid,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xdeadbeefULL,
>+    };
>+
>+    ATC *atc = atc_new(4096, 57);
>+
>+    atc_create_address_space_cache(atc, e1.pasid);
>+    atc_update(atc, &e1);
>+    atc_update(atc, &e2);
>+
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+    atc_destroy(atc);
>+}
>+
>+static void test_bigger_page(void)
>+{
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0xaabbccdde000ULL,
>+        .addr_mask = 0x1fffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x5eeeeeedULL,
>+    };
>+    hwaddr i;
>+
>+    ATC *atc = atc_new(8192, 43);
>+
>+    atc_create_address_space_cache(atc, e1.pasid);
>+    atc_update(atc, &e1);
>+
>+    i = e1.iova & (~e1.addr_mask);
>+    assert_lookup_equals(atc, NULL, e1.pasid, i - 1);
>+    while (i <= e1.iova + e1.addr_mask) {
>+        assert_lookup_equals(atc, &e1, e1.pasid, i);
>+        ++i;
>+    }
>+    assert_lookup_equals(atc, NULL, e1.pasid, i);
>+    atc_destroy(atc);
>+}
>+
>+static void test_unknown_pasid(void)
>+{
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0xaabbccfff000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x5eeeeeedULL,
>+    };
>+
>+    ATC *atc = atc_new(4096, 48);
>+    g_assert(atc_update(atc, &e1) != 0);
>+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>+}
>+
>+static void test_invalidation(void)
>+{
>+    static uint64_t page_size = 4096;
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0xaabbccddf000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x5eeeeeedULL,
>+    };
>+    IOMMUTLBEntry e2 = {
>+        .iova = 0xffe00000ULL,
>+        .addr_mask = 0x1fffffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xb000001ULL,
>+    };
>+    IOMMUTLBEntry e3;
>+
>+    ATC *atc = atc_new(page_size, 48);
>+    atc_create_address_space_cache(atc, e1.pasid);
>+
>+    atc_update(atc, &e1);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    atc_invalidate(atc, &e1);
>+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>+
>+    atc_update(atc, &e1);
>+    atc_update(atc, &e2);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+    atc_invalidate(atc, &e2);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>+
>+    /* invalidate a huge page by invalidating a small region */
>+    for (hwaddr addr = e2.iova; addr <= (e2.iova + e2.addr_mask);
>+         addr += page_size) {
>+        atc_update(atc, &e2);
>+        assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+        e3 = (IOMMUTLBEntry){
>+            .iova = addr,
>+            .addr_mask = page_size - 1,
>+            .pasid = e2.pasid,
>+            .perm = IOMMU_RW,
>+            .translated_addr = 0,
>+        };
>+        atc_invalidate(atc, &e3);
>+        assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>+    }
>+}
>+
>+static void test_delete_address_space_cache(void)
>+{
>+    static uint64_t page_size = 4096;
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0xaabbccddf000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x5eeeeeedULL,
>+    };
>+    IOMMUTLBEntry e2 = {
>+        .iova = e1.iova,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 2,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x5eeeeeedULL,
>+    };
>+
>+    ATC *atc = atc_new(page_size, 48);
>+    atc_create_address_space_cache(atc, e1.pasid);
>+
>+    atc_update(atc, &e1);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    atc_invalidate(atc, &e2); /* unknown pasid: this is a nop */
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+
>+    atc_create_address_space_cache(atc, e2.pasid);
>+    atc_update(atc, &e2);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+    atc_invalidate(atc, &e1);
>+    /* e1 has been removed but e2 is still there */
>+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+
>+    atc_update(atc, &e1);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+
>+    atc_delete_address_space_cache(atc, e2.pasid);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>+}
>+
>+static void test_invalidate_entire_address_space(void)
>+{
>+    static uint64_t page_size = 4096;
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0x1000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x5eedULL,
>+    };
>+    IOMMUTLBEntry e2 = {
>+        .iova = 0xfffffffff000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xbeefULL,
>+    };
>+    IOMMUTLBEntry e3 = {
>+        .iova = 0,
>+        .addr_mask = 0xffffffffffffffffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0,
>+    };
>+
>+    ATC *atc = atc_new(page_size, 48);
>+    atc_create_address_space_cache(atc, e1.pasid);
>+
>+    atc_update(atc, &e1);
>+    atc_update(atc, &e2);
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+    atc_invalidate(atc, &e3);
>+    /* both e1 and e2 have been removed */
>+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>+
>+    atc_destroy(atc);
>+}
>+
>+static void test_reset(void)
>+{
>+    static uint64_t page_size = 4096;
>+    IOMMUTLBEntry e1 = {
>+        .iova = 0x1000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 1,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0x5eedULL,
>+    };
>+    IOMMUTLBEntry e2 = {
>+        .iova = 0xfffffffff000ULL,
>+        .addr_mask = 0xfffULL,
>+        .pasid = 2,
>+        .perm = IOMMU_RW,
>+        .translated_addr = 0xbeefULL,
>+    };
>+
>+    ATC *atc = atc_new(page_size, 48);
>+    atc_create_address_space_cache(atc, e1.pasid);
>+    atc_create_address_space_cache(atc, e2.pasid);
>+    atc_update(atc, &e1);
>+    atc_update(atc, &e2);
>+
>+    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>+
>+    atc_reset(atc);
>+
>+    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>+    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>+}
>+
>+static void test_get_max_number_of_pages(void)
>+{
>+    static uint64_t page_size = 4096;
>+    hwaddr base = 0xc0fee000; /* aligned */
>+    ATC *atc = atc_new(page_size, 48);
>+    g_assert(atc_get_max_number_of_pages(atc, base, page_size / 2) == 1);
>+    g_assert(atc_get_max_number_of_pages(atc, base, page_size) == 1);
>+    g_assert(atc_get_max_number_of_pages(atc, base, page_size + 1) == 2);
>+
>+    g_assert(atc_get_max_number_of_pages(atc, base + 10, 1) == 1);
>+    g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size - 10) == 1);
>+    g_assert(atc_get_max_number_of_pages(atc, base + 10,
>+                                         page_size - 10 + 1) == 2);
>+    g_assert(atc_get_max_number_of_pages(atc, base + 10,
>+                                         page_size - 10 + 2) == 2);
>+
>+    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 1) == 1);
>+    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 2) == 2);
>+    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 3) == 2);
>+
>+    g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size * 20) == 21);
>+    g_assert(atc_get_max_number_of_pages(atc, base + 10,
>+                                         (page_size * 20) + (page_size - 10))
>+                                          == 21);
>+    g_assert(atc_get_max_number_of_pages(atc, base + 10,
>+                                         (page_size * 20) +
>+                                         (page_size - 10 + 1)) == 22);
>+}
>+
>+int main(int argc, char **argv)
>+{
>+    g_test_init(&argc, &argv, NULL);
>+    g_test_add_func("/atc/test_creation_parameters", test_creation_parameters);
>+    g_test_add_func("/atc/test_single_entry", test_single_entry);
>+    g_test_add_func("/atc/test_page_boundaries", test_page_boundaries);
>+    g_test_add_func("/atc/test_huge_page", test_huge_page);
>+    g_test_add_func("/atc/test_pasid", test_pasid);
>+    g_test_add_func("/atc/test_large_address", test_large_address);
>+    g_test_add_func("/atc/test_bigger_page", test_bigger_page);
>+    g_test_add_func("/atc/test_unknown_pasid", test_unknown_pasid);
>+    g_test_add_func("/atc/test_invalidation", test_invalidation);
>+    g_test_add_func("/atc/test_delete_address_space_cache",
>+                    test_delete_address_space_cache);
>+    g_test_add_func("/atc/test_invalidate_entire_address_space",
>+                    test_invalidate_entire_address_space);
>+    g_test_add_func("/atc/test_reset", test_reset);
>+    g_test_add_func("/atc/test_get_max_number_of_pages",
>+                    test_get_max_number_of_pages);
>+    return g_test_run();
>+}
>diff --git a/util/atc.c b/util/atc.c
>new file mode 100644
>index 0000000000..d951532e26
>--- /dev/null
>+++ b/util/atc.c
>@@ -0,0 +1,211 @@
>+/*
>+ * QEMU emulation of an ATC
>+ *
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+
>+ * This program is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>+ * GNU General Public License for more details.
>+
>+ * You should have received a copy of the GNU General Public License along
>+ * with this program; if not, see <http://www.gnu.org/licenses/>.
>+ */
>+
>+#include "util/atc.h"
>+
>+
>+#define PAGE_TABLE_ENTRY_SIZE 8
>+
>+/* a pasid is hashed using the identity function */
>+static guint atc_pasid_key_hash(gconstpointer v)
>+{
>+    return (guint)(uintptr_t)v; /* pasid */
>+}
>+
>+/* pasid equality */
>+static gboolean atc_pasid_key_equal(gconstpointer v1, gconstpointer v2)
>+{
>+    return v1 == v2;
>+}
>+
>+/* Hash function for IOTLB entries */
>+static guint atc_addr_key_hash(gconstpointer v)
>+{
>+    hwaddr addr = (hwaddr)v;
>+    return (guint)((addr >> 32) ^ (addr & 0xffffffffU));
>+}
>+
>+/* Equality test for IOTLB entries */
>+static gboolean atc_addr_key_equal(gconstpointer v1, gconstpointer v2)
>+{
>+    return (hwaddr)v1 == (hwaddr)v2;
>+}
>+
>+static void atc_address_space_free(void *as)
>+{
>+    g_hash_table_unref(as);
>+}
>+
>+/* return log2(val), or UINT8_MAX if val is not a power of 2 */
>+static uint8_t ilog2(uint64_t val)
>+{
>+    uint8_t result = 0;
>+    while (val != 1) {
>+        if (val & 1) {
>+            return UINT8_MAX;
>+        }
>+
>+        val >>= 1;
>+        result += 1;
>+    }
>+    return result;
>+}
>+
>+ATC *atc_new(uint64_t page_size, uint8_t address_width)
>+{
>+    ATC *atc;
>+    uint8_t log_page_size = ilog2(page_size);
>+    /* number of address bits used to store all the intermediate indexes */
>+    uint64_t addr_lookup_indexes_size;
>+
>+    if (log_page_size == UINT8_MAX) {
>+        return NULL;
>+    }
>+    /*
>+     * We only support page table entries of 8 (PAGE_TABLE_ENTRY_SIZE) bytes
>+     * log2(page_size / 8) = log2(page_size) - 3
>+     * is the level offset
>+     */
>+    if (log_page_size <= 3) {
>+        return NULL;
>+    }
>+
>+    atc = g_new0(ATC, 1);
>+    atc->address_spaces = g_hash_table_new_full(atc_pasid_key_hash,
>+                                                atc_pasid_key_equal,
>+                                                NULL, atc_address_space_free);
>+    atc->level_offset = log_page_size - 3;
>+    /* at this point, we know that page_size is a power of 2 */
>+    atc->min_addr_mask = page_size - 1;
>+    addr_lookup_indexes_size = address_width - log_page_size;
>+    if ((addr_lookup_indexes_size % atc->level_offset) != 0) {
>+        goto error;
>+    }
>+    atc->levels = addr_lookup_indexes_size / atc->level_offset;
>+    atc->page_size = page_size;
>+    return atc;
>+
>+error:
>+    g_hash_table_unref(atc->address_spaces);
>+    g_free(atc);
>+    return NULL;
>+}
>+
>+static inline GHashTable *atc_get_address_space_cache(ATC *atc, uint32_t pasid)
>+{
>+    return g_hash_table_lookup(atc->address_spaces,
>+                               (gconstpointer)(uintptr_t)pasid);
>+}
>+
>+void atc_create_address_space_cache(ATC *atc, uint32_t pasid)
>+{
>+    GHashTable *as_cache;
>+
>+    as_cache = atc_get_address_space_cache(atc, pasid);
>+    if (!as_cache) {
>+        as_cache = g_hash_table_new_full(atc_addr_key_hash,
>+                                         atc_addr_key_equal,
>+                                         NULL, g_free);
>+        g_hash_table_replace(atc->address_spaces,
>+                             (gpointer)(uintptr_t)pasid, as_cache);
>+    }
>+}
>+
>+void atc_delete_address_space_cache(ATC *atc, uint32_t pasid)
>+{
>+    g_hash_table_remove(atc->address_spaces, (gpointer)(uintptr_t)pasid);
>+}
>+
>+int atc_update(ATC *atc, IOMMUTLBEntry *entry)
>+{
>+    IOMMUTLBEntry *value;
>+    GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
>+    if (!as_cache) {
>+        return -ENODEV;
>+    }
>+    value = g_memdup2(entry, sizeof(*value));
>+    g_hash_table_replace(as_cache, (gpointer)(entry->iova), value);
>+    return 0;
>+}
>+
>+IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr)
>+{
>+    IOMMUTLBEntry *entry;
>+    hwaddr mask = atc->min_addr_mask;
>+    hwaddr key = addr & (~mask);
>+    GHashTable *as_cache = atc_get_address_space_cache(atc, pasid);
>+
>+    if (!as_cache) {
>+        return NULL;
>+    }
>+
>+    /*
>+     * Iterate over the possible page sizes and try to find a hit
>+     */
>+    for (uint8_t level = 0; level < atc->levels; ++level) {
>+        entry = g_hash_table_lookup(as_cache, (gconstpointer)key);
>+        if (entry && ((addr & ~entry->addr_mask) == entry->iova)) {
>+            return entry;
>+        }
>+        mask = (mask << atc->level_offset) | ((1ULL << atc->level_offset) - 1);
>+        key = addr & (~mask);
>+    }
>+
>+    return NULL;
>+}
>+
>+static gboolean atc_invalidate_entry_predicate(gpointer key, gpointer value,
>+                                               gpointer user_data)
>+{
>+    IOMMUTLBEntry *entry = (IOMMUTLBEntry *)value;
>+    IOMMUTLBEntry *target = (IOMMUTLBEntry *)user_data;
>+    hwaddr target_mask = ~target->addr_mask;
>+    hwaddr entry_mask = ~entry->addr_mask;
>+    return ((target->iova & target_mask) == (entry->iova & target_mask)) ||
>+           ((target->iova & entry_mask) == (entry->iova & entry_mask));
>+}
>+
>+void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry)
>+{
>+    GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
>+    if (!as_cache) {
>+        return;
>+    }
>+    g_hash_table_foreach_remove(as_cache,
>+                                atc_invalidate_entry_predicate,
>+                                entry);
>+}
>+
>+void atc_destroy(ATC *atc)
>+{
>+    g_hash_table_unref(atc->address_spaces);
>+    g_free(atc);
>+}
>+
>+size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length)
>+{
>+    hwaddr page_mask = ~(atc->min_addr_mask);
>+    size_t result = (length / atc->page_size);
>+    if ((((addr & page_mask) + length - 1) & page_mask) !=
>+        ((addr + length - 1) & page_mask)) {
>+        result += 1;
>+    }
>+    return result + (length % atc->page_size != 0 ? 1 : 0);
>+}
>+
>+void atc_reset(ATC *atc)
>+{
>+    g_hash_table_remove_all(atc->address_spaces);
>+}
>diff --git a/util/atc.h b/util/atc.h
>new file mode 100644
>index 0000000000..8be95f5cca
>--- /dev/null
>+++ b/util/atc.h
>@@ -0,0 +1,117 @@
>+/*
>+ * QEMU emulation of an ATC
>+ *
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+
>+ * This program is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>+ * GNU General Public License for more details.
>+
>+ * You should have received a copy of the GNU General Public License along
>+ * with this program; if not, see <http://www.gnu.org/licenses/>.
>+ */
>+
>+#ifndef UTIL_ATC_H
>+#define UTIL_ATC_H
>+
>+#include "qemu/osdep.h"
>+#include "exec/memory.h"
>+
>+typedef struct ATC {
>+    GHashTable *address_spaces; /* Key : pasid, value : GHashTable */
>+    hwaddr min_addr_mask;
>+    uint64_t page_size;
>+    uint8_t levels;
>+    uint8_t level_offset;
>+} ATC;
>+
>+/*
>+ * atc_new: Create an ATC.
>+ *
>+ * Return an ATC or NULL if the creation failed
>+ *
>+ * @page_size: minimum page size of the cache in bytes (a power of 2)
>+ * @address_width: width of the virtual addresses used by the IOMMU (in bits)
>+ */
>+ATC *atc_new(uint64_t page_size, uint8_t address_width);
>+
>+/*
>+ * atc_update: Insert or update an entry in the cache
>+ *
>+ * Return 0 if the operation succeeds, a negative error code otherwise
>+ *
>+ * The insertion will fail if the address space associated with this pasid
>+ * has not been created with atc_create_address_space_cache
>+ *
>+ * @atc: the ATC to update
>+ * @entry: the tlb entry to insert into the cache
>+ */
>+int atc_update(ATC *atc, IOMMUTLBEntry *entry);
>+
>+/*
>+ * atc_create_address_space_cache: declare a new address space
>+ * identified by a PASID
>+ *
>+ * @atc: the ATC to update
>+ * @pasid: the pasid of the address space to be created
>+ */
>+void atc_create_address_space_cache(ATC *atc, uint32_t pasid);
>+
>+/*
>+ * atc_delete_address_space_cache: delete an address space
>+ * identified by a PASID
>+ *
>+ * @atc: the ATC to update
>+ * @pasid: the pasid of the address space to be deleted
>+ */
>+void atc_delete_address_space_cache(ATC *atc, uint32_t pasid);
>+
>+/*
>+ * atc_lookup: query the cache in a given address space
>+ *
>+ * @atc: the ATC to query
>+ * @pasid: the pasid of the address space to query
>+ * @addr: the virtual address to translate
>+ */
>+IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr);
>+
>+/*
>+ * atc_invalidate: invalidate an entry in the cache
>+ *
>+ * @atc: the ATC to update
>+ * @entry: the entry to invalidate
>+ */
>+void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry);
>+
>+/*
>+ * atc_destroy: delete an ATC
>+ *
>+ * @atc: the cache to be deleted
>+ */
>+void atc_destroy(ATC *atc);
>+
>+/*
>+ * atc_get_max_number_of_pages: get the number of pages a memory operation
>+ * will access if all the pages concerned have the minimum size.
>+ *
>+ * This function can be used to determine the size of the result array to be
>+ * allocated when issuing an ATS request.
>+ *
>+ * @atc: the cache
>+ * @addr: start address
>+ * @length: number of bytes accessed from addr
>+ */
>+size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length);
>+
>+/*
>+ * atc_reset: invalidates all the entries stored in the ATC
>+ *
>+ * @atc: the cache
>+ */
>+void atc_reset(ATC *atc);
>+
>+#endif
>diff --git a/util/meson.build b/util/meson.build
>index 0ef9886be0..a2e0e9e5d7 100644
>--- a/util/meson.build
>+++ b/util/meson.build
>@@ -94,6 +94,7 @@ if have_block
>   util_ss.add(files('hbitmap.c'))
>   util_ss.add(files('hexdump.c'))
>   util_ss.add(files('iova-tree.c'))
>+  util_ss.add(files('atc.c'))
>   util_ss.add(files('iov.c', 'uri.c'))
>   util_ss.add(files('nvdimm-utils.c'))
>   util_ss.add(files('block-helpers.c'))
>--
>2.44.0
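
As an illustration of the probing done by atc_lookup() in the patch above, here is a minimal standalone sketch of how the candidate hash keys are derived level by level. The helper name atc_sketch_key is hypothetical and not part of the patch; it only mirrors the mask arithmetic (min_addr_mask widened by level_offset bits per level) and leaves out the hash tables entirely.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): return the key that a lookup
 * would probe at the given level for addr.
 * Level 0 uses the minimum page size; each further level widens the
 * mask by level_offset bits, as in atc_lookup().
 */
static uint64_t atc_sketch_key(uint64_t addr, uint64_t page_size,
                               uint8_t level_offset, uint8_t level)
{
    uint64_t mask = page_size - 1;

    /* page_size must be a non-zero power of 2 */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    /* keep the shift below well defined */
    assert(level_offset > 0 && level_offset < 64);

    for (uint8_t i = 0; i < level; ++i) {
        mask = (mask << level_offset) | ((1ULL << level_offset) - 1);
    }
    return addr & ~mask;
}
```

With page_size = 4096 and level_offset = 9, the address 0x123456789abc gives the keys 0x123456789000 (4 KiB page), 0x123456600000 (2 MiB page) and 0x123440000000 (1 GiB page), so a huge-page entry is found by the probe whose mask matches its size.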

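The page count checked by test_get_max_number_of_pages can also be written as the distance between the first and the last page touched. The sketch below is only an alternative formulation under stated assumptions (non-zero length, power-of-2 page size); sketch_pages_touched is a hypothetical name, not a function from the patch. It gives the same values as the assertions in that test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): number of minimum-size pages
 * touched by the byte range [addr, addr + length).
 */
static size_t sketch_pages_touched(uint64_t addr, size_t length,
                                   uint64_t page_size)
{
    uint64_t page_mask;
    uint64_t first_page;
    uint64_t last_page;

    /* assumptions of this sketch */
    assert(length > 0);
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    page_mask = ~(page_size - 1);
    first_page = addr & page_mask;                /* page of first byte */
    last_page = (addr + length - 1) & page_mask;  /* page of last byte */
    return (size_t)((last_page - first_page) / page_size) + 1;
}
```

For example, an access of page_size bytes starting 10 bytes into a page spans two pages, while an access of page_size - 10 bytes from the same address stays within one.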
^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry
  2024-05-17 10:40   ` Duan, Zhenzhong
@ 2024-05-17 11:11     ` CLEMENT MATHIEU--DRIF
  2024-05-21  3:11       ` Duan, Zhenzhong
  0 siblings, 1 reply; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-17 11:11 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: jasowang, Tian, Kevin, Liu, Yi L, joao.m.martins, peterx


On 17/05/2024 12:40, Duan, Zhenzhong wrote:
>> -----Original Message-----
>> From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>> Subject: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>> creating an instance of IOMMUTLBEntry
>>
>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>> ---
>> hw/i386/intel_iommu.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 53f17d66c0..c4ebd4569e 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -2299,6 +2299,7 @@ out:
>>      entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) &
>> page_mask;
>>      entry->addr_mask = ~page_mask;
>>      entry->perm = access_flags;
>> +    entry->pasid = pasid;
> For PCI_NO_PASID, do we want to assign PCI_NO_PASID or rid2pasid?
We have the following statement a few lines above:

if (rid2pasid) {
    pasid = VTD_CE_GET_RID2PASID(&ce);
}

So we store rid2pasid if the feature is enabled.

But maybe we should store PCI_NO_PASID, because the rest of the world is
not supposed to be aware of what we are doing with rid2pasid.

Does that look good to you?
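
To make the option under discussion concrete, here is a rough standalone sketch that keeps two values apart: the PASID used internally for the page walk and the PASID reported in the entry. Everything in it is an assumption for illustration: SKETCH_NO_PASID stands in for PCI_NO_PASID (its value here is assumed), and SketchPasids and sketch_select_pasids are hypothetical names, not code from intel_iommu.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PCI_NO_PASID so the sketch compiles alone (assumed value) */
#define SKETCH_NO_PASID UINT32_MAX

typedef struct SketchPasids {
    uint32_t walk_pasid;     /* used internally to walk the page tables */
    uint32_t reported_pasid; /* stored in the entry seen by the caller */
} SketchPasids;

/*
 * Hypothetical helper: a request without PASID is walked with the
 * rid2pasid value when that feature is enabled, but the caller still
 * sees "no PASID" in the resulting entry.
 */
static SketchPasids sketch_select_pasids(uint32_t req_pasid,
                                         bool rid2pasid_enabled,
                                         uint32_t rid2pasid_value)
{
    SketchPasids p = {
        .walk_pasid = req_pasid,
        .reported_pasid = req_pasid,
    };

    /* the substituted value must itself be a real PASID */
    assert(rid2pasid_value != SKETCH_NO_PASID);

    if (req_pasid == SKETCH_NO_PASID && rid2pasid_enabled) {
        p.walk_pasid = rid2pasid_value;
    }
    return p;
}
```

With this split, the rid2pasid substitution stays private to the IOMMU, and consumers of the entry only ever see the PASID they asked with.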
>
> Thanks
> Zhenzhong
>
>>      return true;
>>
>> error:
>> @@ -2307,6 +2308,7 @@ error:
>>      entry->translated_addr = 0;
>>      entry->addr_mask = 0;
>>      entry->perm = IOMMU_NONE;
>> +    entry->pasid = PCI_NO_PASID;
>>      return false;
>> }
>>
>> @@ -3497,6 +3499,7 @@ static void
>> vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
>>                  event.entry.target_as = &address_space_memory;
>>                  event.entry.iova = notifier->start;
>>                  event.entry.perm = IOMMU_NONE;
>> +                event.entry.pasid = pasid;
>>                  event.entry.addr_mask = notifier->end - notifier->start;
>>                  event.entry.translated_addr = 0;
>>
>> @@ -3678,6 +3681,7 @@ static void
>> vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
>>              event.entry.target_as = &address_space_memory;
>>              event.entry.iova = addr;
>>              event.entry.perm = IOMMU_NONE;
>> +            event.entry.pasid = pasid;
>>              event.entry.addr_mask = size - 1;
>>              event.entry.translated_addr = 0;
>>
>> @@ -4335,6 +4339,7 @@ static void
>> do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
>>      event.entry.iova = addr;
>>      event.entry.perm = IOMMU_NONE;
>>      event.entry.translated_addr = 0;
>> +    event.entry.pasid = vtd_dev_as->pasid;
>>      memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
>> }
>>
>> @@ -4911,6 +4916,7 @@ static IOMMUTLBEntry
>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>      IOMMUTLBEntry iotlb = {
>>          /* We'll fill in the rest later. */
>>          .target_as = &address_space_memory,
>> +        .pasid = vtd_as->pasid,
>>      };
>>      bool success;
>>
>> @@ -4923,6 +4929,7 @@ static IOMMUTLBEntry
>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>          iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
>>          iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
>>          iotlb.perm = IOMMU_RW;
>> +        iotlb.pasid = PCI_NO_PASID;
>>          success = true;
>>      }
>>
>> --
>> 2.44.0

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM
  2024-05-17 10:44   ` Duan, Zhenzhong
@ 2024-05-17 11:12     ` CLEMENT MATHIEU--DRIF
  0 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-17 11:12 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: jasowang, Tian, Kevin, Liu, Yi L, joao.m.martins, peterx


On 17/05/2024 12:44, Duan, Zhenzhong wrote:
>> -----Original Message-----
>> From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>> Subject: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe
>> devices that support SVM
>>
>> As the SVM-capable devices will need to cache translations, we provide
>> a first implementation.
>>
>> This cache uses a two-level design based on hash tables.
>> The first level is indexed by a PASID and the second by a virtual address.
>>
>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>> ---
>> tests/unit/meson.build |   1 +
>> tests/unit/test-atc.c  | 502
>> +++++++++++++++++++++++++++++++++++++++++
>> util/atc.c             | 211 +++++++++++++++++
>> util/atc.h             | 117 ++++++++++
>> util/meson.build       |   1 +
>> 5 files changed, 832 insertions(+)
>> create mode 100644 tests/unit/test-atc.c
>> create mode 100644 util/atc.c
>> create mode 100644 util/atc.h
> Maybe the unit test can be split from functional change?
Will do!
>> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
>> index 228a21d03c..5c9a6fe9f4 100644
>> --- a/tests/unit/meson.build
>> +++ b/tests/unit/meson.build
>> @@ -52,6 +52,7 @@ tests = {
>>    'test-interval-tree': [],
>>    'test-xs-node': [qom],
>>    'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-
>> dmabuf.c'],
>> +  'test-atc': []
>> }
>>
>> if have_system or have_tools
>> diff --git a/tests/unit/test-atc.c b/tests/unit/test-atc.c
>> new file mode 100644
>> index 0000000000..60fa60924a
>> --- /dev/null
>> +++ b/tests/unit/test-atc.c
>> @@ -0,0 +1,502 @@
>> +/*
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> +
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> +
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include "util/atc.h"
>> +
>> +static inline bool tlb_entry_equal(IOMMUTLBEntry *e1, IOMMUTLBEntry
>> *e2)
>> +{
>> +    if (!e1 || !e2) {
>> +        return !e1 && !e2;
>> +    }
>> +    return e1->iova == e2->iova &&
>> +            e1->addr_mask == e2->addr_mask &&
>> +            e1->pasid == e2->pasid &&
>> +            e1->perm == e2->perm &&
>> +            e1->target_as == e2->target_as &&
>> +            e1->translated_addr == e2->translated_addr;
>> +}
>> +
>> +static void assert_lookup_equals(ATC *atc, IOMMUTLBEntry *target,
>> +                                 uint32_t pasid, hwaddr iova)
>> +{
>> +    IOMMUTLBEntry *result;
>> +    result = atc_lookup(atc, pasid, iova);
>> +    g_assert(tlb_entry_equal(result, target));
>> +}
>> +
>> +static void check_creation(uint64_t page_size, uint8_t address_width,
>> +                           uint8_t levels, uint8_t level_offset,
>> +                           bool should_work) {
>> +    ATC *atc = atc_new(page_size, address_width);
>> +    if (atc) {
>> +        if (atc->levels != levels || atc->level_offset != level_offset) {
>> +            g_assert(false); /* ATC created but invalid configuration : fail */
>> +        }
>> +        atc_destroy(atc);
>> +        g_assert(should_work);
>> +    } else {
>> +        g_assert(!should_work);
>> +    }
>> +}
>> +
>> +static void test_creation_parameters(void)
>> +{
>> +    check_creation(8, 39, 3, 9, false);
>> +    check_creation(4095, 39, 3, 9, false);
>> +    check_creation(4097, 39, 3, 9, false);
>> +    check_creation(8192, 48, 0, 0, false);
>> +
>> +    check_creation(4096, 38, 0, 0, false);
>> +    check_creation(4096, 39, 3, 9, true);
>> +    check_creation(4096, 40, 0, 0, false);
>> +    check_creation(4096, 47, 0, 0, false);
>> +    check_creation(4096, 48, 4, 9, true);
>> +    check_creation(4096, 49, 0, 0, false);
>> +    check_creation(4096, 56, 0, 0, false);
>> +    check_creation(4096, 57, 5, 9, true);
>> +    check_creation(4096, 58, 0, 0, false);
>> +
>> +    check_creation(16384, 35, 0, 0, false);
>> +    check_creation(16384, 36, 2, 11, true);
>> +    check_creation(16384, 37, 0, 0, false);
>> +    check_creation(16384, 46, 0, 0, false);
>> +    check_creation(16384, 47, 3, 11, true);
>> +    check_creation(16384, 48, 0, 0, false);
>> +    check_creation(16384, 57, 0, 0, false);
>> +    check_creation(16384, 58, 4, 11, true);
>> +    check_creation(16384, 59, 0, 0, false);
>> +}
>> +
>> +static void test_single_entry(void)
>> +{
>> +    IOMMUTLBEntry entry = {
>> +        .iova = 0x123456789000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 5,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xdeadbeefULL,
>> +    };
>> +
>> +    ATC *atc = atc_new(4096, 48);
>> +    g_assert(atc);
>> +
>> +    assert_lookup_equals(atc, NULL, entry.pasid,
>> +                         entry.iova + (entry.addr_mask / 2));
>> +
>> +    atc_create_address_space_cache(atc, entry.pasid);
>> +    g_assert(atc_update(atc, &entry) == 0);
>> +
>> +    assert_lookup_equals(atc, NULL, entry.pasid + 1,
>> +                         entry.iova + (entry.addr_mask / 2));
>> +    assert_lookup_equals(atc, &entry, entry.pasid,
>> +                         entry.iova + (entry.addr_mask / 2));
>> +
>> +    atc_destroy(atc);
>> +}
>> +
>> +static void test_page_boundaries(void)
>> +{
>> +    static const uint32_t pasid = 5;
>> +    static const hwaddr page_size = 4096;
>> +
>> +    /* 2 consecutive entries */
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0x123456789000ULL,
>> +        .addr_mask = page_size - 1,
>> +        .pasid = pasid,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xdeadbeefULL,
>> +    };
>> +    IOMMUTLBEntry e2 = {
>> +        .iova = e1.iova + page_size,
>> +        .addr_mask = page_size - 1,
>> +        .pasid = pasid,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x900df00dULL,
>> +    };
>> +
>> +    ATC *atc = atc_new(page_size, 48);
>> +
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +    /* creating the address space twice should not be a problem */
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +
>> +    atc_update(atc, &e1);
>> +    atc_update(atc, &e2);
>> +
>> +    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova - 1);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova + e1.addr_mask);
>> +    g_assert((e1.iova + e1.addr_mask + 1) == e2.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova + e2.addr_mask);
>> +    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova + e2.addr_mask + 1);
>> +
>> +    assert_lookup_equals(atc, NULL, e1.pasid + 10, e1.iova);
>> +    assert_lookup_equals(atc, NULL, e2.pasid + 10, e2.iova);
>> +    atc_destroy(atc);
>> +}
>> +
>> +static void test_huge_page(void)
>> +{
>> +    static const uint32_t pasid = 5;
>> +    static const hwaddr page_size = 4096;
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0x123456600000ULL,
>> +        .addr_mask = 0x1fffffULL,
>> +        .pasid = pasid,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xdeadbeefULL,
>> +    };
>> +    hwaddr addr;
>> +
>> +    ATC *atc = atc_new(page_size, 48);
>> +
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +    atc_update(atc, &e1);
>> +
>> +    for (addr = e1.iova; addr <= e1.iova + e1.addr_mask; addr += page_size) {
>> +        assert_lookup_equals(atc, &e1, e1.pasid, addr);
>> +    }
>> +    /* addr is now out of the huge page */
>> +    assert_lookup_equals(atc, NULL, e1.pasid, addr);
>> +    atc_destroy(atc);
>> +}
>> +
>> +static void test_pasid(void)
>> +{
>> +    hwaddr addr = 0xaaaaaaaaa000ULL;
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = addr,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 8,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xdeadbeefULL,
>> +    };
>> +    IOMMUTLBEntry e2 = {
>> +        .iova = addr,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 2,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xb001ULL,
>> +    };
>> +    uint16_t i;
>> +
>> +    ATC *atc = atc_new(4096, 48);
>> +
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +    atc_create_address_space_cache(atc, e2.pasid);
>> +    atc_update(atc, &e1);
>> +    atc_update(atc, &e2);
>> +
>> +    for (i = 0; i <= MAX(e1.pasid, e2.pasid) + 1; ++i) {
>> +        if (i == e1.pasid || i == e2.pasid) {
>> +            continue;
>> +        }
>> +        assert_lookup_equals(atc, NULL, i, addr);
>> +    }
>> +    assert_lookup_equals(atc, &e1, e1.pasid, addr);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, addr);
>> +    atc_destroy(atc);
>> +}
>> +
>> +static void test_large_address(void)
>> +{
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0xaaaaaaaaa000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 8,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x5eeeeeedULL,
>> +    };
>> +    IOMMUTLBEntry e2 = {
>> +        .iova = 0x1f00baaaaabf000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = e1.pasid,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xdeadbeefULL,
>> +    };
>> +
>> +    ATC *atc = atc_new(4096, 57);
>> +
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +    atc_update(atc, &e1);
>> +    atc_update(atc, &e2);
>> +
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +    atc_destroy(atc);
>> +}
>> +
>> +static void test_bigger_page(void)
>> +{
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0xaabbccdde000ULL,
>> +        .addr_mask = 0x1fffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x5eeeeeedULL,
>> +    };
>> +    hwaddr i;
>> +
>> +    ATC *atc = atc_new(8192, 43);
>> +
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +    atc_update(atc, &e1);
>> +
>> +    i = e1.iova & (~e1.addr_mask);
>> +    assert_lookup_equals(atc, NULL, e1.pasid, i - 1);
>> +    while (i <= e1.iova + e1.addr_mask) {
>> +        assert_lookup_equals(atc, &e1, e1.pasid, i);
>> +        ++i;
>> +    }
>> +    assert_lookup_equals(atc, NULL, e1.pasid, i);
>> +    atc_destroy(atc);
>> +}
>> +
>> +static void test_unknown_pasid(void)
>> +{
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0xaabbccfff000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x5eeeeeedULL,
>> +    };
>> +
>> +    ATC *atc = atc_new(4096, 48);
>> +    g_assert(atc_update(atc, &e1) != 0);
>> +    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>> +}
>> +
>> +static void test_invalidation(void)
>> +{
>> +    static uint64_t page_size = 4096;
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0xaabbccddf000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x5eeeeeedULL,
>> +    };
>> +    IOMMUTLBEntry e2 = {
>> +        .iova = 0xffe00000ULL,
>> +        .addr_mask = 0x1fffffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xb000001ULL,
>> +    };
>> +    IOMMUTLBEntry e3;
>> +
>> +    ATC *atc = atc_new(page_size , 48);
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +
>> +    atc_update(atc, &e1);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    atc_invalidate(atc, &e1);
>> +    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>> +
>> +    atc_update(atc, &e1);
>> +    atc_update(atc, &e2);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +    atc_invalidate(atc, &e2);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>> +
>> +    /* invalidate a huge page by invalidating a small region */
>> +    for (hwaddr addr = e2.iova; addr <= (e2.iova + e2.addr_mask);
>> +         addr += page_size) {
>> +        atc_update(atc, &e2);
>> +        assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +        e3 = (IOMMUTLBEntry){
>> +            .iova = addr,
>> +            .addr_mask = page_size - 1,
>> +            .pasid = e2.pasid,
>> +            .perm = IOMMU_RW,
>> +            .translated_addr = 0,
>> +        };
>> +        atc_invalidate(atc, &e3);
>> +        assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>> +    }
>> +}
>> +
>> +static void test_delete_address_space_cache(void)
>> +{
>> +    static uint64_t page_size = 4096;
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0xaabbccddf000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x5eeeeeedULL,
>> +    };
>> +    IOMMUTLBEntry e2 = {
>> +        .iova = e1.iova,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 2,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x5eeeeeedULL,
>> +    };
>> +
>> +    ATC *atc = atc_new(page_size , 48);
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +
>> +    atc_update(atc, &e1);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    atc_invalidate(atc, &e2); /* unknown pasid: this is a no-op */
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +
>> +    atc_create_address_space_cache(atc, e2.pasid);
>> +    atc_update(atc, &e2);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +    atc_invalidate(atc, &e1);
>> +    /* e1 has been removed but e2 is still there */
>> +    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +
>> +    atc_update(atc, &e1);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +
>> +    atc_delete_address_space_cache(atc, e2.pasid);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>> +}
>> +
>> +static void test_invalidate_entire_address_space(void)
>> +{
>> +    static uint64_t page_size = 4096;
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0x1000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x5eedULL,
>> +    };
>> +    IOMMUTLBEntry e2 = {
>> +        .iova = 0xfffffffff000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xbeefULL,
>> +    };
>> +    IOMMUTLBEntry e3 = {
>> +        .iova = 0,
>> +        .addr_mask = 0xffffffffffffffffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0,
>> +    };
>> +
>> +    ATC *atc = atc_new(page_size , 48);
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +
>> +    atc_update(atc, &e1);
>> +    atc_update(atc, &e2);
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +    atc_invalidate(atc, &e3);
>> +    /* both e1 and e2 have been removed */
>> +    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>> +
>> +    atc_destroy(atc);
>> +}
>> +
>> +static void test_reset(void)
>> +{
>> +    static uint64_t page_size = 4096;
>> +    IOMMUTLBEntry e1 = {
>> +        .iova = 0x1000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 1,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0x5eedULL,
>> +    };
>> +    IOMMUTLBEntry e2 = {
>> +        .iova = 0xfffffffff000ULL,
>> +        .addr_mask = 0xfffULL,
>> +        .pasid = 2,
>> +        .perm = IOMMU_RW,
>> +        .translated_addr = 0xbeefULL,
>> +    };
>> +
>> +    ATC *atc = atc_new(page_size , 48);
>> +    atc_create_address_space_cache(atc, e1.pasid);
>> +    atc_create_address_space_cache(atc, e2.pasid);
>> +    atc_update(atc, &e1);
>> +    atc_update(atc, &e2);
>> +
>> +    assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
>> +
>> +    atc_reset(atc);
>> +
>> +    assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
>> +    assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
>> +}
>> +
>> +static void test_get_max_number_of_pages(void)
>> +{
>> +    static uint64_t page_size = 4096;
>> +    hwaddr base = 0xc0fee000; /* aligned */
>> +    ATC *atc = atc_new(page_size , 48);
>> +    g_assert(atc_get_max_number_of_pages(atc, base, page_size / 2) == 1);
>> +    g_assert(atc_get_max_number_of_pages(atc, base, page_size) == 1);
>> +    g_assert(atc_get_max_number_of_pages(atc, base, page_size + 1) == 2);
>> +
>> +    g_assert(atc_get_max_number_of_pages(atc, base + 10, 1) == 1);
>> +    g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size - 10) == 1);
>> +    g_assert(atc_get_max_number_of_pages(atc, base + 10,
>> +                                         page_size - 10 + 1) == 2);
>> +    g_assert(atc_get_max_number_of_pages(atc, base + 10,
>> +                                         page_size - 10 + 2) == 2);
>> +
>> +    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 1) == 1);
>> +    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 2) == 2);
>> +    g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 3) == 2);
>> +
>> +    g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size * 20) == 21);
>> +    g_assert(atc_get_max_number_of_pages(atc, base + 10,
>> +                                         (page_size * 20) + (page_size - 10))
>> +                                          == 21);
>> +    g_assert(atc_get_max_number_of_pages(atc, base + 10,
>> +                                         (page_size * 20) +
>> +                                         (page_size - 10 + 1)) == 22);
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +    g_test_init(&argc, &argv, NULL);
>> +    g_test_add_func("/atc/test_creation_parameters", test_creation_parameters);
>> +    g_test_add_func("/atc/test_single_entry", test_single_entry);
>> +    g_test_add_func("/atc/test_page_boundaries", test_page_boundaries);
>> +    g_test_add_func("/atc/test_huge_page", test_huge_page);
>> +    g_test_add_func("/atc/test_pasid", test_pasid);
>> +    g_test_add_func("/atc/test_large_address", test_large_address);
>> +    g_test_add_func("/atc/test_bigger_page", test_bigger_page);
>> +    g_test_add_func("/atc/test_unknown_pasid", test_unknown_pasid);
>> +    g_test_add_func("/atc/test_invalidation", test_invalidation);
>> +    g_test_add_func("/atc/test_delete_address_space_cache",
>> +                    test_delete_address_space_cache);
>> +    g_test_add_func("/atc/test_invalidate_entire_address_space",
>> +                    test_invalidate_entire_address_space);
>> +    g_test_add_func("/atc/test_reset", test_reset);
>> +    g_test_add_func("/atc/test_get_max_number_of_pages",
>> +                    test_get_max_number_of_pages);
>> +    return g_test_run();
>> +}
>> diff --git a/util/atc.c b/util/atc.c
>> new file mode 100644
>> index 0000000000..d951532e26
>> --- /dev/null
>> +++ b/util/atc.c
>> @@ -0,0 +1,211 @@
>> +/*
>> + * QEMU emulation of an ATC
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> +
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> +
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include "util/atc.h"
>> +
>> +
>> +#define PAGE_TABLE_ENTRY_SIZE 8
>> +
>> +/* a pasid is hashed using the identity function */
>> +static guint atc_pasid_key_hash(gconstpointer v)
>> +{
>> +    return (guint)(uintptr_t)v; /* pasid */
>> +}
>> +
>> +/* pasid equality */
>> +static gboolean atc_pasid_key_equal(gconstpointer v1, gconstpointer v2)
>> +{
>> +    return v1 == v2;
>> +}
>> +
>> +/* Hash function for IOTLB entries */
>> +static guint atc_addr_key_hash(gconstpointer v)
>> +{
>> +    hwaddr addr = (hwaddr)v;
>> +    return (guint)((addr >> 32) ^ (addr & 0xffffffffU));
>> +}
>> +
>> +/* Equality test for IOTLB entries */
>> +static gboolean atc_addr_key_equal(gconstpointer v1, gconstpointer v2)
>> +{
>> +    return (hwaddr)v1 == (hwaddr)v2;
>> +}
>> +
>> +static void atc_address_space_free(void *as)
>> +{
>> +    g_hash_table_unref(as);
>> +}
>> +
>> +/* return log2(val), or UINT8_MAX if val is not a power of 2 */
>> +static uint8_t ilog2(uint64_t val)
>> +{
>> +    uint8_t result = 0;
>> +    while (val != 1) {
>> +        if (val & 1) {
>> +            return UINT8_MAX;
>> +        }
>> +
>> +        val >>= 1;
>> +        result += 1;
>> +    }
>> +    return result;
>> +}
>> +
>> +ATC *atc_new(uint64_t page_size, uint8_t address_width)
>> +{
>> +    ATC *atc;
>> +    uint8_t log_page_size = ilog2(page_size);
>> +    /* number of bits used to store all the intermediate indexes */
>> +    uint64_t addr_lookup_indexes_size;
>> +
>> +    if (log_page_size == UINT8_MAX) {
>> +        return NULL;
>> +    }
>> +    /*
>> +     * We only support page table entries of 8 (PAGE_TABLE_ENTRY_SIZE) bytes
>> +     * log2(page_size / 8) = log2(page_size) - 3
>> +     * is the level offset
>> +     */
>> +    if (log_page_size <= 3) {
>> +        return NULL;
>> +    }
>> +
>> +    atc = g_new0(ATC, 1);
>> +    atc->address_spaces = g_hash_table_new_full(atc_pasid_key_hash,
>> +                                                atc_pasid_key_equal,
>> +                                                NULL, atc_address_space_free);
>> +    atc->level_offset = log_page_size - 3;
>> +    /* at this point, we know that page_size is a power of 2 */
>> +    atc->min_addr_mask = page_size - 1;
>> +    addr_lookup_indexes_size = address_width - log_page_size;
>> +    if ((addr_lookup_indexes_size % atc->level_offset) != 0) {
>> +        goto error;
>> +    }
>> +    atc->levels = addr_lookup_indexes_size / atc->level_offset;
>> +    atc->page_size = page_size;
>> +    return atc;
>> +
>> +error:
>> +    g_free(atc);
>> +    return NULL;
>> +}
>> +
>> +static inline GHashTable *atc_get_address_space_cache(ATC *atc, uint32_t pasid)
>> +{
>> +    return g_hash_table_lookup(atc->address_spaces,
>> +                               (gconstpointer)(uintptr_t)pasid);
>> +}
>> +
>> +void atc_create_address_space_cache(ATC *atc, uint32_t pasid)
>> +{
>> +    GHashTable *as_cache;
>> +
>> +    as_cache = atc_get_address_space_cache(atc, pasid);
>> +    if (!as_cache) {
>> +        as_cache = g_hash_table_new_full(atc_addr_key_hash,
>> +                                         atc_addr_key_equal,
>> +                                         NULL, g_free);
>> +        g_hash_table_replace(atc->address_spaces,
>> +                             (gpointer)(uintptr_t)pasid, as_cache);
>> +    }
>> +}
>> +
>> +void atc_delete_address_space_cache(ATC *atc, uint32_t pasid)
>> +{
>> +    g_hash_table_remove(atc->address_spaces, (gpointer)(uintptr_t)pasid);
>> +}
>> +
>> +int atc_update(ATC *atc, IOMMUTLBEntry *entry)
>> +{
>> +    IOMMUTLBEntry *value;
>> +    GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
>> +    if (!as_cache) {
>> +        return -ENODEV;
>> +    }
>> +    value = g_memdup2(entry, sizeof(*value));
>> +    g_hash_table_replace(as_cache, (gpointer)(entry->iova), value);
>> +    return 0;
>> +}
>> +
>> +IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr)
>> +{
>> +    IOMMUTLBEntry *entry;
>> +    hwaddr mask = atc->min_addr_mask;
>> +    hwaddr key = addr & (~mask);
>> +    GHashTable *as_cache = atc_get_address_space_cache(atc, pasid);
>> +
>> +    if (!as_cache) {
>> +        return NULL;
>> +    }
>> +
>> +    /*
>> +     * Iterate over the possible page sizes and try to find a hit
>> +     */
>> +    for (uint8_t level = 0; level < atc->levels; ++level) {
>> +        entry = g_hash_table_lookup(as_cache, (gconstpointer)key);
>> +        if (entry) {
>> +            return entry;
>> +        }
>> +        mask = (mask << atc->level_offset) | ((1 << atc->level_offset) - 1);
>> +        key = addr & (~mask);
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +static gboolean atc_invalidate_entry_predicate(gpointer key, gpointer value,
>> +                                               gpointer user_data)
>> +{
>> +    IOMMUTLBEntry *entry = (IOMMUTLBEntry *)value;
>> +    IOMMUTLBEntry *target = (IOMMUTLBEntry *)user_data;
>> +    hwaddr target_mask = ~target->addr_mask;
>> +    hwaddr entry_mask = ~entry->addr_mask;
>> +    return ((target->iova & target_mask) == (entry->iova & target_mask)) ||
>> +           ((target->iova & entry_mask) == (entry->iova & entry_mask));
>> +}
>> +
>> +void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry)
>> +{
>> +    GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
>> +    if (!as_cache) {
>> +        return;
>> +    }
>> +    g_hash_table_foreach_remove(as_cache,
>> +                                atc_invalidate_entry_predicate,
>> +                                entry);
>> +}
>> +
>> +void atc_destroy(ATC *atc)
>> +{
>> +    g_hash_table_unref(atc->address_spaces);
>> +}
>> +
>> +size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length)
>> +{
>> +    hwaddr page_mask = ~(atc->min_addr_mask);
>> +    size_t result = (length / atc->page_size);
>> +    if ((((addr & page_mask) + length - 1) & page_mask) !=
>> +        ((addr + length - 1) & page_mask)) {
>> +        result += 1;
>> +    }
>> +    return result + (length % atc->page_size != 0 ? 1 : 0);
>> +}
>> +
>> +void atc_reset(ATC *atc)
>> +{
>> +    g_hash_table_remove_all(atc->address_spaces);
>> +}
>> diff --git a/util/atc.h b/util/atc.h
>> new file mode 100644
>> index 0000000000..8be95f5cca
>> --- /dev/null
>> +++ b/util/atc.h
>> @@ -0,0 +1,117 @@
>> +/*
>> + * QEMU emulation of an ATC
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> +
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> +
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef UTIL_ATC_H
>> +#define UTIL_ATC_H
>> +
>> +#include "qemu/osdep.h"
>> +#include "exec/memory.h"
>> +
>> +typedef struct ATC {
>> +    GHashTable *address_spaces; /* Key : pasid, value : GHashTable */
>> +    hwaddr min_addr_mask;
>> +    uint64_t page_size;
>> +    uint8_t levels;
>> +    uint8_t level_offset;
>> +} ATC;
>> +
>> +/*
>> + * atc_new: Create an ATC.
>> + *
>> + * Return an ATC or NULL if the creation failed
>> + *
>> + * @page_size: size of the smallest page, in bytes (must be a power of 2)
>> + * @address_width: width of the virtual addresses used by the IOMMU (in bits)
>> + */
>> +ATC *atc_new(uint64_t page_size, uint8_t address_width);
>> +
>> +/*
>> + * atc_update: Insert or update an entry in the cache
>> + *
>> + * Return 0 if the operation succeeds, a negative error code otherwise
>> + *
>> + * The insertion will fail if the address space associated with this pasid
>> + * has not been created with atc_create_address_space_cache
>> + *
>> + * @atc: the ATC to update
>> + * @entry: the tlb entry to insert into the cache
>> + */
>> +int atc_update(ATC *atc, IOMMUTLBEntry *entry);
>> +
>> +/*
>> + * atc_create_address_space_cache: declare a new address space
>> + * identified by a PASID
>> + *
>> + * @atc: the ATC to update
>> + * @pasid: the pasid of the address space to be created
>> + */
>> +void atc_create_address_space_cache(ATC *atc, uint32_t pasid);
>> +
>> +/*
>> + * atc_delete_address_space_cache: delete an address space
>> + * identified by a PASID
>> + *
>> + * @atc: the ATC to update
>> + * @pasid: the pasid of the address space to be deleted
>> + */
>> +void atc_delete_address_space_cache(ATC *atc, uint32_t pasid);
>> +
>> +/*
>> + * atc_lookup: query the cache in a given address space
>> + *
>> + * @atc: the ATC to query
>> + * @pasid: the pasid of the address space to query
>> + * @addr: the virtual address to translate
>> + */
>> +IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr);
>> +
>> +/*
>> + * atc_invalidate: invalidate an entry in the cache
>> + *
>> + * @atc: the ATC to update
>> + * @entry: the entry to invalidate
>> + */
>> +void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry);
>> +
>> +/*
>> + * atc_destroy: delete an ATC
>> + *
>> + * @atc: the cache to be deleted
>> + */
>> +void atc_destroy(ATC *atc);
>> +
>> +/*
>> + * atc_get_max_number_of_pages: get the number of pages a memory operation
>> + * will access if all the pages concerned have the minimum size.
>> + *
>> + * This function can be used to determine the size of the result array to be
>> + * allocated when issuing an ATS request.
>> + *
>> + * @atc: the cache
>> + * @addr: start address
>> + * @length: number of bytes accessed from addr
>> + */
>> +size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length);
>> +
>> +/*
>> + * atc_reset: invalidates all the entries stored in the ATC
>> + *
>> + * @atc: the cache
>> + */
>> +void atc_reset(ATC *atc);
>> +
>> +#endif
>> diff --git a/util/meson.build b/util/meson.build
>> index 0ef9886be0..a2e0e9e5d7 100644
>> --- a/util/meson.build
>> +++ b/util/meson.build
>> @@ -94,6 +94,7 @@ if have_block
>>    util_ss.add(files('hbitmap.c'))
>>    util_ss.add(files('hexdump.c'))
>>    util_ss.add(files('iova-tree.c'))
>> +  util_ss.add(files('atc.c'))
>>    util_ss.add(files('iov.c', 'uri.c'))
>>    util_ss.add(files('nvdimm-utils.c'))
>>    util_ss.add(files('block-helpers.c'))
>> --
>> 2.44.0
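
To make the multi-level probing in atc_lookup() easier to follow, here is a minimal stand-alone sketch of how the candidate key is derived at each level. The helper name atc_candidate_key is hypothetical (it is not part of the patch), and the parameters simply mirror the min_addr_mask and level_offset fields of the ATC struct quoted above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: return the hash-table key
 * that a lookup would probe at the given level.
 *
 * Level 0 uses the minimum page mask; each further level widens the mask
 * by level_offset bits, exactly like the loop body in atc_lookup().
 */
static uint64_t atc_candidate_key(uint64_t addr, uint64_t min_addr_mask,
                                  uint8_t level_offset, uint8_t level)
{
    uint64_t mask = min_addr_mask;

    /* keep the shifts below well defined */
    assert(level_offset > 0 && level_offset < 64);

    for (uint8_t i = 0; i < level; ++i) {
        mask = (mask << level_offset) | ((1ULL << level_offset) - 1);
    }
    return addr & ~mask;
}
```

With a 4 KiB minimum page (min_addr_mask = 0xfff) and 8-byte table entries (level_offset = 9), the address 0x123456789abc is probed as 0x123456789000 at level 0 and as 0x123456600000 at level 1, which is the huge-page iova used by test_huge_page.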

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry
  2024-05-17 11:11     ` CLEMENT MATHIEU--DRIF
@ 2024-05-21  3:11       ` Duan, Zhenzhong
  2024-05-21  5:09         ` CLEMENT MATHIEU--DRIF
  0 siblings, 1 reply; 32+ messages in thread
From: Duan, Zhenzhong @ 2024-05-21  3:11 UTC (permalink / raw)
  To: CLEMENT MATHIEU--DRIF, qemu-devel
  Cc: jasowang, Tian, Kevin, Liu, Yi L, joao.m.martins, peterx



>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>Subject: Re: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>creating an instance of IOMMUTLBEntry
>
>
>On 17/05/2024 12:40, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>>> Subject: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>>> creating an instance of IOMMUTLBEntry
>>>
>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>> ---
>>> hw/i386/intel_iommu.c | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>> index 53f17d66c0..c4ebd4569e 100644
>>> --- a/hw/i386/intel_iommu.c
>>> +++ b/hw/i386/intel_iommu.c
>>> @@ -2299,6 +2299,7 @@ out:
>>>      entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) & page_mask;
>>>      entry->addr_mask = ~page_mask;
>>>      entry->perm = access_flags;
>>> +    entry->pasid = pasid;
>> For PCI_NO_PASID, do we want to assign PCI_NO_PASID or rid2pasid?
>we have the following statement a few lines above :
>if (rid2pasid) {
>         pasid = VTD_CE_GET_RID2PASID(&ce);
>}
>
>so we store rid2pasid if the feature is enabled.
>
>But maybe we should store PCI_NO_PASID because the rest of the world is
>not supposed to be aware of what we are doing with rid2pasid.
>
>Does it look good to you?

Yes, that makes sense.
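
For illustration only, the direction discussed above could be sketched as follows: the rid2pasid value is used internally for the page walk, while the entry handed to the rest of QEMU keeps the PASID of the original request. All names below (SKETCH_NO_PASID, SketchPasidChoice, sketch_choose_pasid) are hypothetical, and SKETCH_NO_PASID is only a placeholder for PCI_NO_PASID. This is not the code that will end up in the next revision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: placeholder standing in for QEMU's PCI_NO_PASID */
#define SKETCH_NO_PASID UINT32_MAX

typedef struct {
    uint32_t walk_pasid;   /* PASID used internally for the page walk */
    uint32_t entry_pasid;  /* PASID stored in the resulting TLB entry */
} SketchPasidChoice;

/*
 * Hypothetical helper: a request without PASID is walked with rid2pasid
 * when the feature is enabled, but the entry still reports "no PASID",
 * so callers never see the internal rid2pasid value.
 */
static SketchPasidChoice sketch_choose_pasid(uint32_t request_pasid,
                                             bool rid2pasid_enabled,
                                             uint32_t rid2pasid)
{
    SketchPasidChoice c = {
        .walk_pasid = request_pasid,
        .entry_pasid = request_pasid,
    };

    if (request_pasid == SKETCH_NO_PASID && rid2pasid_enabled) {
        c.walk_pasid = rid2pasid; /* internal detail only */
    }
    return c;
}
```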

>>
>> Thanks
>> Zhenzhong
>>
>>>      return true;
>>>
>>> error:
>>> @@ -2307,6 +2308,7 @@ error:
>>>      entry->translated_addr = 0;
>>>      entry->addr_mask = 0;
>>>      entry->perm = IOMMU_NONE;
>>> +    entry->pasid = PCI_NO_PASID;
>>>      return false;
>>> }
>>>
>>> @@ -3497,6 +3499,7 @@ static void
>>> vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
>>>                  event.entry.target_as = &address_space_memory;
>>>                  event.entry.iova = notifier->start;
>>>                  event.entry.perm = IOMMU_NONE;
>>> +                event.entry.pasid = pasid;
>>>                  event.entry.addr_mask = notifier->end - notifier->start;
>>>                  event.entry.translated_addr = 0;
>>>
>>> @@ -3678,6 +3681,7 @@ static void
>>> vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
>>>              event.entry.target_as = &address_space_memory;
>>>              event.entry.iova = addr;
>>>              event.entry.perm = IOMMU_NONE;
>>> +            event.entry.pasid = pasid;
>>>              event.entry.addr_mask = size - 1;
>>>              event.entry.translated_addr = 0;
>>>
>>> @@ -4335,6 +4339,7 @@ static void
>>> do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
>>>      event.entry.iova = addr;
>>>      event.entry.perm = IOMMU_NONE;
>>>      event.entry.translated_addr = 0;
>>> +    event.entry.pasid = vtd_dev_as->pasid;
>>>      memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
>>> }
>>>
>>> @@ -4911,6 +4916,7 @@ static IOMMUTLBEntry
>>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>>      IOMMUTLBEntry iotlb = {
>>>          /* We'll fill in the rest later. */
>>>          .target_as = &address_space_memory,
>>> +        .pasid = vtd_as->pasid,
>>>      };
>>>      bool success;
>>>
>>> @@ -4923,6 +4929,7 @@ static IOMMUTLBEntry
>>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>>          iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
>>>          iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
>>>          iotlb.perm = IOMMU_RW;
>>> +        iotlb.pasid = PCI_NO_PASID;
>>>          success = true;
>>>      }
>>>
>>> --
>>> 2.44.0
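
As a side note on why the invalidation events need the pasid field: a consumer such as the ATC in util/atc.c of this series first selects the address space by pasid, then drops every cached entry whose range matches the event. The sketch below restates that matching rule on plain integers. The function name sketch_ranges_match is hypothetical, and the body is assumed to mirror atc_invalidate_entry_predicate rather than replace it.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the invalidation predicate:
 * an entry matches when it lies inside the invalidated range, or when
 * the invalidated address lies inside the (possibly huge) cached entry.
 */
static bool sketch_ranges_match(uint64_t target_iova, uint64_t target_addr_mask,
                                uint64_t entry_iova, uint64_t entry_addr_mask)
{
    uint64_t target_mask = ~target_addr_mask;
    uint64_t entry_mask = ~entry_addr_mask;

    return ((target_iova & target_mask) == (entry_iova & target_mask)) ||
           ((target_iova & entry_mask) == (entry_iova & entry_mask));
}
```

A 4 KiB invalidation at 0xffe01000 therefore removes a 2 MiB entry cached at 0xffe00000, and an event with addr_mask set to all ones matches every entry of the selected address space.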

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry
  2024-05-21  3:11       ` Duan, Zhenzhong
@ 2024-05-21  5:09         ` CLEMENT MATHIEU--DRIF
  0 siblings, 0 replies; 32+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-05-21  5:09 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: jasowang, Tian, Kevin, Liu, Yi L, joao.m.martins, peterx


On 21/05/2024 05:11, Duan, Zhenzhong wrote:
>
>
>> -----Original Message-----
>> From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>> Subject: Re: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>> creating an instance of IOMMUTLBEntry
>>
>>
>> On 17/05/2024 12:40, Duan, Zhenzhong wrote:
>>>
>>>> -----Original Message-----
>>>> From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>>>> Subject: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>>>> creating an instance of IOMMUTLBEntry
>>>>
>>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>>> ---
>>>> hw/i386/intel_iommu.c | 7 +++++++
>>>> 1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>> index 53f17d66c0..c4ebd4569e 100644
>>>> --- a/hw/i386/intel_iommu.c
>>>> +++ b/hw/i386/intel_iommu.c
>>>> @@ -2299,6 +2299,7 @@ out:
>>>>       entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) & page_mask;
>>>>       entry->addr_mask = ~page_mask;
>>>>       entry->perm = access_flags;
>>>> +    entry->pasid = pasid;
>>> For PCI_NO_PASID, do we want to assign PCI_NO_PASID or rid2pasid?
>> we have the following statement a few lines above :
>> if (rid2pasid) {
>>          pasid = VTD_CE_GET_RID2PASID(&ce);
>> }
>>
>> so we store rid2pasid if the feature is enabled.
>>
>> But maybe we should store PCI_NO_PASID because the rest of the world is
>> not supposed to be aware of what we are doing with rid2pasid.
>>
>> Does it look good to you?
> Yes, that makes sense.
ok, will do
>
>>> Thanks
>>> Zhenzhong
>>>
>>>>       return true;
>>>>
>>>> error:
>>>> @@ -2307,6 +2308,7 @@ error:
>>>>       entry->translated_addr = 0;
>>>>       entry->addr_mask = 0;
>>>>       entry->perm = IOMMU_NONE;
>>>> +    entry->pasid = PCI_NO_PASID;
>>>>       return false;
>>>> }
>>>>
>>>> @@ -3497,6 +3499,7 @@ static void
>>>> vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
>>>>                   event.entry.target_as = &address_space_memory;
>>>>                   event.entry.iova = notifier->start;
>>>>                   event.entry.perm = IOMMU_NONE;
>>>> +                event.entry.pasid = pasid;
>>>>                   event.entry.addr_mask = notifier->end - notifier->start;
>>>>                   event.entry.translated_addr = 0;
>>>>
>>>> @@ -3678,6 +3681,7 @@ static void
>>>> vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
>>>>               event.entry.target_as = &address_space_memory;
>>>>               event.entry.iova = addr;
>>>>               event.entry.perm = IOMMU_NONE;
>>>> +            event.entry.pasid = pasid;
>>>>               event.entry.addr_mask = size - 1;
>>>>               event.entry.translated_addr = 0;
>>>>
>>>> @@ -4335,6 +4339,7 @@ static void
>>>> do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
>>>>       event.entry.iova = addr;
>>>>       event.entry.perm = IOMMU_NONE;
>>>>       event.entry.translated_addr = 0;
>>>> +    event.entry.pasid = vtd_dev_as->pasid;
>>>>       memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
>>>> }
>>>>
>>>> @@ -4911,6 +4916,7 @@ static IOMMUTLBEntry
>>>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>>>       IOMMUTLBEntry iotlb = {
>>>>           /* We'll fill in the rest later. */
>>>>           .target_as = &address_space_memory,
>>>> +        .pasid = vtd_as->pasid,
>>>>       };
>>>>       bool success;
>>>>
>>>> @@ -4923,6 +4929,7 @@ static IOMMUTLBEntry
>>>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>>>           iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
>>>>           iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
>>>>           iotlb.perm = IOMMU_RW;
>>>> +        iotlb.pasid = PCI_NO_PASID;
>>>>           success = true;
>>>>       }
>>>>
>>>> --
>>>> 2.44.0

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2024-05-21  5:10 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-15  7:14 [PATCH ats_vtd v2 00/25] ATS support for VT-d CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 01/25] intel_iommu: fix FRCD construction macro CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 03/25] intel_iommu: check if the input address is canonical CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 02/25] intel_iommu: make types match CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 04/25] intel_iommu: set accessed and dirty bits during first stage translation CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 07/25] intel_iommu: do not consider wait_desc as an invalid descriptor CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 05/25] intel_iommu: return page walk level even when the translation fails CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 06/25] intel_iommu: extract device IOTLB invalidation logic CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 12/25] intel_iommu: add an internal API to find an address space with PASID CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 10/25] pcie: helper functions to check if PASID and ATS are enabled CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 09/25] pcie: add helper to declare PASID capability for a pcie device CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 11/25] intel_iommu: declare supported PASID size CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 08/25] memory: add permissions in IOMMUAccessFlags CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 14/25] pci: cache the bus mastering status in the device CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 16/25] pci: add a pci-level initialization function for iommu notifiers CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 13/25] intel_iommu: add support for PASID-based device IOTLB invalidation CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 15/25] pci: add IOMMU operations to get address spaces and memory regions with PASID CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 18/25] intel_iommu: implement the get_memory_region_pasid iommu operation CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 19/25] memory: Allow to store the PASID in IOMMUTLBEntry CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry CLEMENT MATHIEU--DRIF
2024-05-17 10:40   ` Duan, Zhenzhong
2024-05-17 11:11     ` CLEMENT MATHIEU--DRIF
2024-05-21  3:11       ` Duan, Zhenzhong
2024-05-21  5:09         ` CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 17/25] intel_iommu: implement the get_address_space_pasid iommu operation CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
2024-05-17 10:44   ` Duan, Zhenzhong
2024-05-17 11:12     ` CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 24/25] intel_iommu: set the address mask even when a translation fails CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 23/25] pci: add a pci-level API for ATS CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 22/25] memory: add an API for ATS support CLEMENT MATHIEU--DRIF
2024-05-15  7:14 ` [PATCH ats_vtd v2 25/25] intel_iommu: add support for ATS CLEMENT MATHIEU--DRIF
