All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown
@ 2021-02-26 10:56 Julien Grall
  2021-02-26 10:56 ` [PATCH for-4.15 v5 1/3] xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled Julien Grall
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Julien Grall @ 2021-02-26 10:56 UTC (permalink / raw)
  To: xen-devel
  Cc: hongyxia, iwj, Julien Grall, Jan Beulich, Paul Durrant,
	Andrew Cooper, Kevin Tian

From: Julien Grall <jgrall@amazon.com>

Hi all,

This series is a collection of bug fixes for the IOMMU teardown code.
All of them are candidate for 4.15 as they can either leak memory or
lead to host crash/host corruption.

This is sent directly on xen-devel because all the issues were either
introduced in 4.15 or happen in the domain creation code.

Major changes since v4:
    - New patch added (it is a split of patch #1 in v4)

Major changes since v3:
    - Remove patch #3 "xen/iommu: x86: Harden the IOMMU page-table
    allocator" as it is not strictly necessary for 4.15.
    - Re-order the patches to avoid on a follow-up patch to fix
    completely the issue.

Major changes since v2:
    - patch #1 "xen/x86: p2m: Don't map the special pages in the IOMMU
    page-tables" has been removed. This requires Jan's patch [1] to
    fully mitigate memory leaks.

Release-Acked-by: Ian Jackson <iwj@xenproject.org>

@Ian, I assumed that the release-acked-by would stand even with the
patch split. Let me know if if this is not the case.

Cheers,

[1] <90271e69-c07e-a32c-5531-a79b10ef03dd@suse.com>

Julien Grall (3):
  xen/iommu: x86: Don't try to free page tables is the IOMMU is not
    enabled
  xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
  xen/iommu: x86: Clear the root page-table before freeing the
    page-tables

 xen/drivers/passthrough/amd/iommu_map.c     | 12 +++++++++++
 xen/drivers/passthrough/amd/pci_amd_iommu.c | 12 ++++++++++-
 xen/drivers/passthrough/vtd/iommu.c         | 24 ++++++++++++++++++++-
 xen/drivers/passthrough/x86/iommu.c         | 19 ++++++++++++++++
 xen/include/xen/iommu.h                     |  1 +
 5 files changed, 66 insertions(+), 2 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH for-4.15 v5 1/3] xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled
  2021-02-26 10:56 [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
@ 2021-02-26 10:56 ` Julien Grall
  2021-02-26 13:27   ` Jan Beulich
  2021-02-26 10:56 ` [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Julien Grall @ 2021-02-26 10:56 UTC (permalink / raw)
  To: xen-devel; +Cc: hongyxia, iwj, Julien Grall, Jan Beulich, Paul Durrant

From: Julien Grall <jgrall@amazon.com>

When using CONFIG_BIGMEM=y, the page_list cannot be accessed whilst it
is is unitialized. However, iommu_free_pgtables() will be called even if
the domain is not using an IOMMU.

Consequently, Xen will try to go through the page list and deference a
NULL pointer.

Bail out early if the domain is not using an IOMMU.

Fixes: 15bc9a1ef51c ("x86/iommu: add common page-table allocator")
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v5:
        - Patch added. This was split from "xen/x86: iommu: Ignore
        IOMMU mapping requests when a domain is dying"
---
 xen/drivers/passthrough/x86/iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index cea1032b3d02..58a330e82247 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -267,6 +267,9 @@ int iommu_free_pgtables(struct domain *d)
     struct page_info *pg;
     unsigned int done = 0;
 
+    if ( !is_iommu_enabled(d) )
+        return 0;
+
     while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
     {
         free_domheap_page(pg);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
  2021-02-26 10:56 [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
  2021-02-26 10:56 ` [PATCH for-4.15 v5 1/3] xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled Julien Grall
@ 2021-02-26 10:56 ` Julien Grall
  2021-02-26 13:30   ` Jan Beulich
  2021-03-01  2:52   ` Tian, Kevin
  2021-02-26 10:56 ` [PATCH for-4.15 v5 3/3] xen/iommu: x86: Clear the root page-table before freeing the page-tables Julien Grall
  2021-03-02  9:56 ` [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
  3 siblings, 2 replies; 10+ messages in thread
From: Julien Grall @ 2021-02-26 10:56 UTC (permalink / raw)
  To: xen-devel
  Cc: hongyxia, iwj, Julien Grall, Jan Beulich, Andrew Cooper,
	Kevin Tian, Paul Durrant

From: Julien Grall <jgrall@amazon.com>

The new x86 IOMMU page-tables allocator will release the pages when
relinquishing the domain resources. However, this is not sufficient
when the domain is dying because nothing prevents page-table to be
allocated.

As the domain is dying, it is not necessary to continue to modify the
IOMMU page-tables as they are going to be destroyed soon.

At the moment, page-table allocates will only happen when iommu_map().
So after this change there will be no more page-table allocation
happening because we don't use superpage mappings yet when not sharing
page tables.

In order to observe d->is_dying correctly, we need to rely on per-arch
locking, so the check to ignore IOMMU mapping is added on the per-driver
map_page() callback.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

As discussed in v3, this is only covering 4.15. We can discuss
post-4.15 how to catch page-table allocations if another caller (e.g.
iommu_unmap() if we ever decide to support superpages) start to use the
page-table allocator.

Changes in v5:
    - Clarify in the commit message why fixing iommu_map() is enough
    - Split "if ( !is_iommu_enabled(d) )"  in a separate patch
    - Update the comment on top of the spin_barrier()

Changes in v4:
    - Move the patch to the top of the queue
    - Reword the commit message

Changes in v3:
    - Patch added. This is a replacement of "xen/iommu: iommu_map: Don't
    crash the domain if it is dying"
---
 xen/drivers/passthrough/amd/iommu_map.c | 12 ++++++++++++
 xen/drivers/passthrough/vtd/iommu.c     | 12 ++++++++++++
 xen/drivers/passthrough/x86/iommu.c     |  3 +++
 3 files changed, 27 insertions(+)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index d3a8b1aec766..560af54b765b 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -285,6 +285,18 @@ int amd_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
 
     spin_lock(&hd->arch.mapping_lock);
 
+    /*
+     * IOMMU mapping request can be safely ignored when the domain is dying.
+     *
+     * hd->arch.mapping_lock guarantees that d->is_dying will be observed
+     * before any page tables are freed (see iommu_free_pgtables()).
+     */
+    if ( d->is_dying )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        return 0;
+    }
+
     rc = amd_iommu_alloc_root(d);
     if ( rc )
     {
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index d136fe36883b..b549a71530d5 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1762,6 +1762,18 @@ static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
 
     spin_lock(&hd->arch.mapping_lock);
 
+    /*
+     * IOMMU mapping request can be safely ignored when the domain is dying.
+     *
+     * hd->arch.mapping_lock guarantees that d->is_dying will be observed
+     * before any page tables are freed (see iommu_free_pgtables())
+     */
+    if ( d->is_dying )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        return 0;
+    }
+
     pg_maddr = addr_to_dma_page_maddr(d, dfn_to_daddr(dfn), 1);
     if ( !pg_maddr )
     {
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 58a330e82247..ad19b7dd461c 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -270,6 +270,9 @@ int iommu_free_pgtables(struct domain *d)
     if ( !is_iommu_enabled(d) )
         return 0;
 
+    /* After this barrier, no new IOMMU mappings can be inserted. */
+    spin_barrier(&hd->arch.mapping_lock);
+
     while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
     {
         free_domheap_page(pg);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH for-4.15 v5 3/3] xen/iommu: x86: Clear the root page-table before freeing the page-tables
  2021-02-26 10:56 [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
  2021-02-26 10:56 ` [PATCH for-4.15 v5 1/3] xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled Julien Grall
  2021-02-26 10:56 ` [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
@ 2021-02-26 10:56 ` Julien Grall
  2021-03-01  2:54   ` Tian, Kevin
  2021-03-02  9:56 ` [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
  3 siblings, 1 reply; 10+ messages in thread
From: Julien Grall @ 2021-02-26 10:56 UTC (permalink / raw)
  To: xen-devel
  Cc: hongyxia, iwj, Julien Grall, Jan Beulich, Andrew Cooper,
	Kevin Tian, Paul Durrant

From: Julien Grall <jgrall@amazon.com>

The new per-domain IOMMU page-table allocator will now free the
page-tables when domain's resources are relinquished. However, the
per-domain IOMMU structure will still contain a dangling pointer to
the root page-table.

Xen may access the IOMMU page-tables afterwards at least in the case of
PV domain:

(XEN) Xen call trace:
(XEN)    [<ffff82d04025b4b2>] R iommu.c#addr_to_dma_page_maddr+0x12e/0x1d8
(XEN)    [<ffff82d04025b695>] F iommu.c#intel_iommu_unmap_page+0x5d/0xf8
(XEN)    [<ffff82d0402695f3>] F iommu_unmap+0x9c/0x129
(XEN)    [<ffff82d0402696a6>] F iommu_legacy_unmap+0x26/0x63
(XEN)    [<ffff82d04033c5c7>] F mm.c#cleanup_page_mappings+0x139/0x144
(XEN)    [<ffff82d04033c61d>] F put_page+0x4b/0xb3
(XEN)    [<ffff82d04033c87f>] F put_page_from_l1e+0x136/0x13b
(XEN)    [<ffff82d04033cada>] F devalidate_page+0x256/0x8dc
(XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
(XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
(XEN)    [<ffff82d04033d8d6>] F mm.c#put_page_from_l2e+0x8a/0xcf
(XEN)    [<ffff82d04033cc27>] F devalidate_page+0x3a3/0x8dc
(XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
(XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
(XEN)    [<ffff82d04033d807>] F mm.c#put_page_from_l3e+0x8a/0xcf
(XEN)    [<ffff82d04033cdf0>] F devalidate_page+0x56c/0x8dc
(XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
(XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
(XEN)    [<ffff82d04033d6c7>] F mm.c#put_page_from_l4e+0x69/0x6d
(XEN)    [<ffff82d04033cf24>] F devalidate_page+0x6a0/0x8dc
(XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
(XEN)    [<ffff82d04033d92e>] F put_page_type_preemptible+0x13/0x15
(XEN)    [<ffff82d04032598a>] F domain.c#relinquish_memory+0x1ff/0x4e9
(XEN)    [<ffff82d0403295f2>] F domain_relinquish_resources+0x2b6/0x36a
(XEN)    [<ffff82d040205cdf>] F domain_kill+0xb8/0x141
(XEN)    [<ffff82d040236cac>] F do_domctl+0xb6f/0x18e5
(XEN)    [<ffff82d04031d098>] F pv_hypercall+0x2f0/0x55f
(XEN)    [<ffff82d04039b432>] F lstar_enter+0x112/0x120

This will result to a use after-free and possibly an host crash or
memory corruption.

It would not be possible to free the page-tables further down in
domain_relinquish_resources() because cleanup_page_mappings() will only
be called when the last reference on the page dropped. This may happen
much later if another domain still hold a reference.

After all the PCI devices have been de-assigned, nobody should use the
IOMMU page-tables and it is therefore pointless to try to modify them.

So we can simply clear any reference to the root page-table in the
per-domain IOMMU structure. This requires to introduce a new callback of
the method will depend on the IOMMU driver used.

Take the opportunity to add an ASSERT() in arch_iommu_domain_destroy()
to check if we freed all the IOMMU page tables.

Fixes: 3eef6d07d722 ("x86/iommu: convert VT-d code to use new page table allocator")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
    Changes in v5:
        - Add Jan's reviewed-by
        - Fix typo
        - Use ! rather than == NULL

    Changes in v4:
        - Move the patch later in the series as we need to prevent
        iommu_map() to allocate memory first.
        - Add an ASSERT() in arch_iommu_domain_destroy().

    Changes in v3:
        - Move the patch earlier in the series
        - Reword the commit message

    Changes in v2:
        - Introduce clear_root_pgtable()
        - Move the patch later in the series
---
 xen/drivers/passthrough/amd/pci_amd_iommu.c | 12 +++++++++++-
 xen/drivers/passthrough/vtd/iommu.c         | 12 +++++++++++-
 xen/drivers/passthrough/x86/iommu.c         | 13 +++++++++++++
 xen/include/xen/iommu.h                     |  1 +
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 42b5a5a9bec4..085fe2f5771e 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -381,9 +381,18 @@ static int amd_iommu_assign_device(struct domain *d, u8 devfn,
     return reassign_device(pdev->domain, d, devfn, pdev);
 }
 
+static void amd_iommu_clear_root_pgtable(struct domain *d)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+
+    spin_lock(&hd->arch.mapping_lock);
+    hd->arch.amd.root_table = NULL;
+    spin_unlock(&hd->arch.mapping_lock);
+}
+
 static void amd_iommu_domain_destroy(struct domain *d)
 {
-    dom_iommu(d)->arch.amd.root_table = NULL;
+    ASSERT(!dom_iommu(d)->arch.amd.root_table);
 }
 
 static int amd_iommu_add_device(u8 devfn, struct pci_dev *pdev)
@@ -565,6 +574,7 @@ static const struct iommu_ops __initconstrel _iommu_ops = {
     .remove_device = amd_iommu_remove_device,
     .assign_device  = amd_iommu_assign_device,
     .teardown = amd_iommu_domain_destroy,
+    .clear_root_pgtable = amd_iommu_clear_root_pgtable,
     .map_page = amd_iommu_map_page,
     .unmap_page = amd_iommu_unmap_page,
     .iotlb_flush = amd_iommu_flush_iotlb_pages,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index b549a71530d5..475efb3be3bd 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1726,6 +1726,15 @@ out:
     return ret;
 }
 
+static void iommu_clear_root_pgtable(struct domain *d)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+
+    spin_lock(&hd->arch.mapping_lock);
+    hd->arch.vtd.pgd_maddr = 0;
+    spin_unlock(&hd->arch.mapping_lock);
+}
+
 static void iommu_domain_teardown(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
@@ -1740,7 +1749,7 @@ static void iommu_domain_teardown(struct domain *d)
         xfree(mrmrr);
     }
 
-    hd->arch.vtd.pgd_maddr = 0;
+    ASSERT(!hd->arch.vtd.pgd_maddr);
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
@@ -2731,6 +2740,7 @@ static struct iommu_ops __initdata vtd_ops = {
     .remove_device = intel_iommu_remove_device,
     .assign_device  = intel_iommu_assign_device,
     .teardown = iommu_domain_teardown,
+    .clear_root_pgtable = iommu_clear_root_pgtable,
     .map_page = intel_iommu_map_page,
     .unmap_page = intel_iommu_unmap_page,
     .lookup_page = intel_iommu_lookup_page,
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index ad19b7dd461c..b90bb31bfeea 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -149,6 +149,13 @@ int arch_iommu_domain_init(struct domain *d)
 
 void arch_iommu_domain_destroy(struct domain *d)
 {
+    /*
+     * There should be not page-tables left allocated by the time the
+     * domain is destroyed. Note that arch_iommu_domain_destroy() is
+     * called unconditionally, so pgtables may be uninitialized.
+     */
+    ASSERT(!dom_iommu(d)->platform_ops ||
+           page_list_empty(&dom_iommu(d)->arch.pgtables.list));
 }
 
 static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
@@ -273,6 +280,12 @@ int iommu_free_pgtables(struct domain *d)
     /* After this barrier, no new IOMMU mappings can be inserted. */
     spin_barrier(&hd->arch.mapping_lock);
 
+    /*
+     * Pages will be moved to the free list below. So we want to
+     * clear the root page-table to avoid any potential use after-free.
+     */
+    hd->platform_ops->clear_root_pgtable(d);
+
     while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
     {
         free_domheap_page(pg);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 863a68fe1622..d59ed7cbad43 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -272,6 +272,7 @@ struct iommu_ops {
 
     int (*adjust_irq_affinities)(void);
     void (*sync_cache)(const void *addr, unsigned int size);
+    void (*clear_root_pgtable)(struct domain *d);
 #endif /* CONFIG_X86 */
 
     int __must_check (*suspend)(void);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH for-4.15 v5 1/3] xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled
  2021-02-26 10:56 ` [PATCH for-4.15 v5 1/3] xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled Julien Grall
@ 2021-02-26 13:27   ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2021-02-26 13:27 UTC (permalink / raw)
  To: Julien Grall; +Cc: hongyxia, iwj, Julien Grall, Paul Durrant, xen-devel

On 26.02.2021 11:56, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> When using CONFIG_BIGMEM=y, the page_list cannot be accessed whilst it
> is is unitialized. However, iommu_free_pgtables() will be called even if
> the domain is not using an IOMMU.
> 
> Consequently, Xen will try to go through the page list and deference a
> NULL pointer.
> 
> Bail out early if the domain is not using an IOMMU.
> 
> Fixes: 15bc9a1ef51c ("x86/iommu: add common page-table allocator")
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
  2021-02-26 10:56 ` [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
@ 2021-02-26 13:30   ` Jan Beulich
  2021-03-01  9:21     ` Julien Grall
  2021-03-01  2:52   ` Tian, Kevin
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2021-02-26 13:30 UTC (permalink / raw)
  To: Julien Grall
  Cc: hongyxia, iwj, Julien Grall, Andrew Cooper, Kevin Tian,
	Paul Durrant, xen-devel

On 26.02.2021 11:56, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> The new x86 IOMMU page-tables allocator will release the pages when
> relinquishing the domain resources. However, this is not sufficient
> when the domain is dying because nothing prevents page-table to be
> allocated.
> 
> As the domain is dying, it is not necessary to continue to modify the
> IOMMU page-tables as they are going to be destroyed soon.
> 
> At the moment, page-table allocates will only happen when iommu_map().
> So after this change there will be no more page-table allocation
> happening because we don't use superpage mappings yet when not sharing
> page tables.
> 
> In order to observe d->is_dying correctly, we need to rely on per-arch
> locking, so the check to ignore IOMMU mapping is added on the per-driver
> map_page() callback.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Does this also want a Fixes: tag (the same as patch 1)?

Jan


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
  2021-02-26 10:56 ` [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
  2021-02-26 13:30   ` Jan Beulich
@ 2021-03-01  2:52   ` Tian, Kevin
  1 sibling, 0 replies; 10+ messages in thread
From: Tian, Kevin @ 2021-03-01  2:52 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: hongyxia, iwj, Julien Grall, Jan Beulich, Cooper, Andrew, Paul Durrant

> From: Julien Grall <julien@xen.org>
> Sent: Friday, February 26, 2021 6:57 PM
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> The new x86 IOMMU page-tables allocator will release the pages when
> relinquishing the domain resources. However, this is not sufficient
> when the domain is dying because nothing prevents page-table to be
> allocated.
> 
> As the domain is dying, it is not necessary to continue to modify the
> IOMMU page-tables as they are going to be destroyed soon.
> 
> At the moment, page-table allocates will only happen when iommu_map().
> So after this change there will be no more page-table allocation
> happening because we don't use superpage mappings yet when not sharing
> page tables.
> 
> In order to observe d->is_dying correctly, we need to rely on per-arch
> locking, so the check to ignore IOMMU mapping is added on the per-driver
> map_page() callback.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> 
> ---
> 
> As discussed in v3, this is only covering 4.15. We can discuss
> post-4.15 how to catch page-table allocations if another caller (e.g.
> iommu_unmap() if we ever decide to support superpages) start to use the
> page-table allocator.
> 
> Changes in v5:
>     - Clarify in the commit message why fixing iommu_map() is enough
>     - Split "if ( !is_iommu_enabled(d) )"  in a separate patch
>     - Update the comment on top of the spin_barrier()
> 
> Changes in v4:
>     - Move the patch to the top of the queue
>     - Reword the commit message
> 
> Changes in v3:
>     - Patch added. This is a replacement of "xen/iommu: iommu_map: Don't
>     crash the domain if it is dying"
> ---
>  xen/drivers/passthrough/amd/iommu_map.c | 12 ++++++++++++
>  xen/drivers/passthrough/vtd/iommu.c     | 12 ++++++++++++
>  xen/drivers/passthrough/x86/iommu.c     |  3 +++
>  3 files changed, 27 insertions(+)
> 
> diff --git a/xen/drivers/passthrough/amd/iommu_map.c
> b/xen/drivers/passthrough/amd/iommu_map.c
> index d3a8b1aec766..560af54b765b 100644
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -285,6 +285,18 @@ int amd_iommu_map_page(struct domain *d, dfn_t
> dfn, mfn_t mfn,
> 
>      spin_lock(&hd->arch.mapping_lock);
> 
> +    /*
> +     * IOMMU mapping request can be safely ignored when the domain is
> dying.
> +     *
> +     * hd->arch.mapping_lock guarantees that d->is_dying will be observed
> +     * before any page tables are freed (see iommu_free_pgtables()).
> +     */
> +    if ( d->is_dying )
> +    {
> +        spin_unlock(&hd->arch.mapping_lock);
> +        return 0;
> +    }
> +
>      rc = amd_iommu_alloc_root(d);
>      if ( rc )
>      {
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index d136fe36883b..b549a71530d5 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1762,6 +1762,18 @@ static int __must_check
> intel_iommu_map_page(struct domain *d, dfn_t dfn,
> 
>      spin_lock(&hd->arch.mapping_lock);
> 
> +    /*
> +     * IOMMU mapping request can be safely ignored when the domain is
> dying.
> +     *
> +     * hd->arch.mapping_lock guarantees that d->is_dying will be observed
> +     * before any page tables are freed (see iommu_free_pgtables())
> +     */
> +    if ( d->is_dying )
> +    {
> +        spin_unlock(&hd->arch.mapping_lock);
> +        return 0;
> +    }
> +
>      pg_maddr = addr_to_dma_page_maddr(d, dfn_to_daddr(dfn), 1);
>      if ( !pg_maddr )
>      {
> diff --git a/xen/drivers/passthrough/x86/iommu.c
> b/xen/drivers/passthrough/x86/iommu.c
> index 58a330e82247..ad19b7dd461c 100644
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -270,6 +270,9 @@ int iommu_free_pgtables(struct domain *d)
>      if ( !is_iommu_enabled(d) )
>          return 0;
> 
> +    /* After this barrier, no new IOMMU mappings can be inserted. */
> +    spin_barrier(&hd->arch.mapping_lock);
> +
>      while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
>      {
>          free_domheap_page(pg);
> --
> 2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH for-4.15 v5 3/3] xen/iommu: x86: Clear the root page-table before freeing the page-tables
  2021-02-26 10:56 ` [PATCH for-4.15 v5 3/3] xen/iommu: x86: Clear the root page-table before freeing the page-tables Julien Grall
@ 2021-03-01  2:54   ` Tian, Kevin
  0 siblings, 0 replies; 10+ messages in thread
From: Tian, Kevin @ 2021-03-01  2:54 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: hongyxia, iwj, Julien Grall, Jan Beulich, Cooper, Andrew, Paul Durrant

> From: Julien Grall <julien@xen.org>
> Sent: Friday, February 26, 2021 6:57 PM
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> The new per-domain IOMMU page-table allocator will now free the
> page-tables when domain's resources are relinquished. However, the
> per-domain IOMMU structure will still contain a dangling pointer to
> the root page-table.
> 
> Xen may access the IOMMU page-tables afterwards at least in the case of
> PV domain:
> 
> (XEN) Xen call trace:
> (XEN)    [<ffff82d04025b4b2>] R
> iommu.c#addr_to_dma_page_maddr+0x12e/0x1d8
> (XEN)    [<ffff82d04025b695>] F
> iommu.c#intel_iommu_unmap_page+0x5d/0xf8
> (XEN)    [<ffff82d0402695f3>] F iommu_unmap+0x9c/0x129
> (XEN)    [<ffff82d0402696a6>] F iommu_legacy_unmap+0x26/0x63
> (XEN)    [<ffff82d04033c5c7>] F mm.c#cleanup_page_mappings+0x139/0x144
> (XEN)    [<ffff82d04033c61d>] F put_page+0x4b/0xb3
> (XEN)    [<ffff82d04033c87f>] F put_page_from_l1e+0x136/0x13b
> (XEN)    [<ffff82d04033cada>] F devalidate_page+0x256/0x8dc
> (XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
> (XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
> (XEN)    [<ffff82d04033d8d6>] F mm.c#put_page_from_l2e+0x8a/0xcf
> (XEN)    [<ffff82d04033cc27>] F devalidate_page+0x3a3/0x8dc
> (XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
> (XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
> (XEN)    [<ffff82d04033d807>] F mm.c#put_page_from_l3e+0x8a/0xcf
> (XEN)    [<ffff82d04033cdf0>] F devalidate_page+0x56c/0x8dc
> (XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
> (XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
> (XEN)    [<ffff82d04033d6c7>] F mm.c#put_page_from_l4e+0x69/0x6d
> (XEN)    [<ffff82d04033cf24>] F devalidate_page+0x6a0/0x8dc
> (XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
> (XEN)    [<ffff82d04033d92e>] F put_page_type_preemptible+0x13/0x15
> (XEN)    [<ffff82d04032598a>] F domain.c#relinquish_memory+0x1ff/0x4e9
> (XEN)    [<ffff82d0403295f2>] F domain_relinquish_resources+0x2b6/0x36a
> (XEN)    [<ffff82d040205cdf>] F domain_kill+0xb8/0x141
> (XEN)    [<ffff82d040236cac>] F do_domctl+0xb6f/0x18e5
> (XEN)    [<ffff82d04031d098>] F pv_hypercall+0x2f0/0x55f
> (XEN)    [<ffff82d04039b432>] F lstar_enter+0x112/0x120
> 
> This will result to a use after-free and possibly an host crash or
> memory corruption.
> 
> It would not be possible to free the page-tables further down in
> domain_relinquish_resources() because cleanup_page_mappings() will only
> be called when the last reference on the page dropped. This may happen
> much later if another domain still hold a reference.
> 
> After all the PCI devices have been de-assigned, nobody should use the
> IOMMU page-tables and it is therefore pointless to try to modify them.
> 
> So we can simply clear any reference to the root page-table in the
> per-domain IOMMU structure. This requires to introduce a new callback of
> the method will depend on the IOMMU driver used.
> 
> Take the opportunity to add an ASSERT() in arch_iommu_domain_destroy()
> to check if we freed all the IOMMU page tables.
> 
> Fixes: 3eef6d07d722 ("x86/iommu: convert VT-d code to use new page table
> allocator")
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> 
> ---
>     Changes in v5:
>         - Add Jan's reviewed-by
>         - Fix typo
>         - Use ! rather than == NULL
> 
>     Changes in v4:
>         - Move the patch later in the series as we need to prevent
>         iommu_map() to allocate memory first.
>         - Add an ASSERT() in arch_iommu_domain_destroy().
> 
>     Changes in v3:
>         - Move the patch earlier in the series
>         - Reword the commit message
> 
>     Changes in v2:
>         - Introduce clear_root_pgtable()
>         - Move the patch later in the series
> ---
>  xen/drivers/passthrough/amd/pci_amd_iommu.c | 12 +++++++++++-
>  xen/drivers/passthrough/vtd/iommu.c         | 12 +++++++++++-
>  xen/drivers/passthrough/x86/iommu.c         | 13 +++++++++++++
>  xen/include/xen/iommu.h                     |  1 +
>  4 files changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index 42b5a5a9bec4..085fe2f5771e 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -381,9 +381,18 @@ static int amd_iommu_assign_device(struct domain
> *d, u8 devfn,
>      return reassign_device(pdev->domain, d, devfn, pdev);
>  }
> 
> +static void amd_iommu_clear_root_pgtable(struct domain *d)
> +{
> +    struct domain_iommu *hd = dom_iommu(d);
> +
> +    spin_lock(&hd->arch.mapping_lock);
> +    hd->arch.amd.root_table = NULL;
> +    spin_unlock(&hd->arch.mapping_lock);
> +}
> +
>  static void amd_iommu_domain_destroy(struct domain *d)
>  {
> -    dom_iommu(d)->arch.amd.root_table = NULL;
> +    ASSERT(!dom_iommu(d)->arch.amd.root_table);
>  }
> 
>  static int amd_iommu_add_device(u8 devfn, struct pci_dev *pdev)
> @@ -565,6 +574,7 @@ static const struct iommu_ops __initconstrel
> _iommu_ops = {
>      .remove_device = amd_iommu_remove_device,
>      .assign_device  = amd_iommu_assign_device,
>      .teardown = amd_iommu_domain_destroy,
> +    .clear_root_pgtable = amd_iommu_clear_root_pgtable,
>      .map_page = amd_iommu_map_page,
>      .unmap_page = amd_iommu_unmap_page,
>      .iotlb_flush = amd_iommu_flush_iotlb_pages,
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index b549a71530d5..475efb3be3bd 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1726,6 +1726,15 @@ out:
>      return ret;
>  }
> 
> +static void iommu_clear_root_pgtable(struct domain *d)
> +{
> +    struct domain_iommu *hd = dom_iommu(d);
> +
> +    spin_lock(&hd->arch.mapping_lock);
> +    hd->arch.vtd.pgd_maddr = 0;
> +    spin_unlock(&hd->arch.mapping_lock);
> +}
> +
>  static void iommu_domain_teardown(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> @@ -1740,7 +1749,7 @@ static void iommu_domain_teardown(struct
> domain *d)
>          xfree(mrmrr);
>      }
> 
> -    hd->arch.vtd.pgd_maddr = 0;
> +    ASSERT(!hd->arch.vtd.pgd_maddr);
>  }
> 
>  static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
> @@ -2731,6 +2740,7 @@ static struct iommu_ops __initdata vtd_ops = {
>      .remove_device = intel_iommu_remove_device,
>      .assign_device  = intel_iommu_assign_device,
>      .teardown = iommu_domain_teardown,
> +    .clear_root_pgtable = iommu_clear_root_pgtable,
>      .map_page = intel_iommu_map_page,
>      .unmap_page = intel_iommu_unmap_page,
>      .lookup_page = intel_iommu_lookup_page,
> diff --git a/xen/drivers/passthrough/x86/iommu.c
> b/xen/drivers/passthrough/x86/iommu.c
> index ad19b7dd461c..b90bb31bfeea 100644
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -149,6 +149,13 @@ int arch_iommu_domain_init(struct domain *d)
> 
>  void arch_iommu_domain_destroy(struct domain *d)
>  {
> +    /*
> +     * There should be not page-tables left allocated by the time the
> +     * domain is destroyed. Note that arch_iommu_domain_destroy() is
> +     * called unconditionally, so pgtables may be uninitialized.
> +     */
> +    ASSERT(!dom_iommu(d)->platform_ops ||
> +           page_list_empty(&dom_iommu(d)->arch.pgtables.list));
>  }
> 
>  static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
> @@ -273,6 +280,12 @@ int iommu_free_pgtables(struct domain *d)
>      /* After this barrier, no new IOMMU mappings can be inserted. */
>      spin_barrier(&hd->arch.mapping_lock);
> 
> +    /*
> +     * Pages will be moved to the free list below. So we want to
> +     * clear the root page-table to avoid any potential use after-free.
> +     */
> +    hd->platform_ops->clear_root_pgtable(d);
> +
>      while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
>      {
>          free_domheap_page(pg);
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 863a68fe1622..d59ed7cbad43 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -272,6 +272,7 @@ struct iommu_ops {
> 
>      int (*adjust_irq_affinities)(void);
>      void (*sync_cache)(const void *addr, unsigned int size);
> +    void (*clear_root_pgtable)(struct domain *d);
>  #endif /* CONFIG_X86 */
> 
>      int __must_check (*suspend)(void);
> --
> 2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
  2021-02-26 13:30   ` Jan Beulich
@ 2021-03-01  9:21     ` Julien Grall
  0 siblings, 0 replies; 10+ messages in thread
From: Julien Grall @ 2021-03-01  9:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: hongyxia, iwj, Julien Grall, Andrew Cooper, Kevin Tian,
	Paul Durrant, xen-devel

Hi Jan,

On 26/02/2021 13:30, Jan Beulich wrote:
> On 26.02.2021 11:56, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> The new x86 IOMMU page-tables allocator will release the pages when
>> relinquishing the domain resources. However, this is not sufficient
>> when the domain is dying because nothing prevents page-table to be
>> allocated.
>>
>> As the domain is dying, it is not necessary to continue to modify the
>> IOMMU page-tables as they are going to be destroyed soon.
>>
>> At the moment, page-table allocates will only happen when iommu_map().
>> So after this change there will be no more page-table allocation
>> happening because we don't use superpage mappings yet when not sharing
>> page tables.
>>
>> In order to observe d->is_dying correctly, we need to rely on per-arch
>> locking, so the check to ignore IOMMU mapping is added on the per-driver
>> map_page() callback.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks!

> 
> Does this also want a Fixes: tag (the same as patch 1)?

I think so. I will add it when committing the series.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown
  2021-02-26 10:56 [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
                   ` (2 preceding siblings ...)
  2021-02-26 10:56 ` [PATCH for-4.15 v5 3/3] xen/iommu: x86: Clear the root page-table before freeing the page-tables Julien Grall
@ 2021-03-02  9:56 ` Julien Grall
  3 siblings, 0 replies; 10+ messages in thread
From: Julien Grall @ 2021-03-02  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: hongyxia, iwj, Julien Grall, Jan Beulich, Paul Durrant,
	Andrew Cooper, Kevin Tian

Hi,

On 26/02/2021 10:56, Julien Grall wrote:
> Julien Grall (3):
>    xen/iommu: x86: Don't try to free page tables is the IOMMU is not
>      enabled
>    xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
>    xen/iommu: x86: Clear the root page-table before freeing the
>      page-tables

I have committed the 3 patches.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-03-02  9:56 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-26 10:56 [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
2021-02-26 10:56 ` [PATCH for-4.15 v5 1/3] xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled Julien Grall
2021-02-26 13:27   ` Jan Beulich
2021-02-26 10:56 ` [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
2021-02-26 13:30   ` Jan Beulich
2021-03-01  9:21     ` Julien Grall
2021-03-01  2:52   ` Tian, Kevin
2021-02-26 10:56 ` [PATCH for-4.15 v5 3/3] xen/iommu: x86: Clear the root page-table before freeing the page-tables Julien Grall
2021-03-01  2:54   ` Tian, Kevin
2021-03-02  9:56 ` [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.