xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown
@ 2021-02-23 12:34 Julien Grall
  2021-02-23 12:34 ` [for-4.15][PATCH v4 1/2] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Julien Grall @ 2021-02-23 12:34 UTC (permalink / raw)
  To: xen-devel; +Cc: hongyxia, iwj, Julien Grall

From: Julien Grall <jgrall@amazon.com>

Hi all,

This series is a collection of bug fixes for the IOMMU teardown code.
All of them are candidate for 4.15 as they can either leak memory or
lead to host crash/host corruption.

This is sent directly on xen-devel because all the issues were either
introduced in 4.15 or happen in the domain creation code.

Major changes since v3:
    - Remove patch #3 "xen/iommu: x86: Harden the IOMMU page-table
    allocator" as it is not strictly necessary for 4.15.
    - Re-order the patches to avoid on a follow-up patch to fix
    completely the issue.

Major changes since v2:
    - patch #1 "xen/x86: p2m: Don't map the special pages in the IOMMU
    page-tables" has been removed. This requires Jan's patch [1] to
    fully mitigate memory leaks.

Cheers,

[1] <90271e69-c07e-a32c-5531-a79b10ef03dd@suse.com>

Julien Grall (2):
  xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
  xen/iommu: x86: Clear the root page-table before freeing the
    page-tables

 xen/drivers/passthrough/amd/iommu_map.c     | 12 +++++++++++
 xen/drivers/passthrough/amd/pci_amd_iommu.c | 12 ++++++++++-
 xen/drivers/passthrough/vtd/iommu.c         | 24 ++++++++++++++++++++-
 xen/drivers/passthrough/x86/iommu.c         | 19 ++++++++++++++++
 xen/include/xen/iommu.h                     |  1 +
 5 files changed, 66 insertions(+), 2 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 4+ messages in thread

* [for-4.15][PATCH v4 1/2] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
  2021-02-23 12:34 [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
@ 2021-02-23 12:34 ` Julien Grall
  2021-02-23 12:34 ` [for-4.15][PATCH v4 2/2] xen/iommu: x86: Clear the root page-table before freeing the page-tables Julien Grall
  2021-02-24  9:43 ` [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
  2 siblings, 0 replies; 4+ messages in thread
From: Julien Grall @ 2021-02-23 12:34 UTC (permalink / raw)
  To: xen-devel; +Cc: hongyxia, iwj, Julien Grall

From: Julien Grall <jgrall@amazon.com>

The new x86 IOMMU page-tables allocator will release the pages when
relinquishing the domain resources. However, this is not sufficient
when the domain is dying because nothing prevents page-table to be
allocated.

As the domain is dying, it is not necessary to continue to modify the
IOMMU page-tables as they are going to be destroyed soon.

At the moment, page-table allocates will only happen when iommu_map().
So after this change there will be no more page-table allocation
happening.

In order to observe d->is_dying correctly, we need to rely on per-arch
locking, so the check to ignore IOMMU mapping is added on the per-driver
map_page() callback.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

As discussed in v3, this is only covering 4.15. We can discuss
post-4.15 how to catch page-table allocations if another caller (e.g.
iommu_unmap() if we ever decide to support superpages) start to use the
page-table allocator.

Changes in v4:
    - Move the patch to the top of the queue
    - Reword the commit message

Changes in v3:
    - Patch added. This is a replacement of "xen/iommu: iommu_map: Don't
    crash the domain if it is dying"
---
 xen/drivers/passthrough/amd/iommu_map.c | 12 ++++++++++++
 xen/drivers/passthrough/vtd/iommu.c     | 12 ++++++++++++
 xen/drivers/passthrough/x86/iommu.c     |  6 ++++++
 3 files changed, 30 insertions(+)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index d3a8b1aec766..560af54b765b 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -285,6 +285,18 @@ int amd_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
 
     spin_lock(&hd->arch.mapping_lock);
 
+    /*
+     * IOMMU mapping request can be safely ignored when the domain is dying.
+     *
+     * hd->arch.mapping_lock guarantees that d->is_dying will be observed
+     * before any page tables are freed (see iommu_free_pgtables()).
+     */
+    if ( d->is_dying )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        return 0;
+    }
+
     rc = amd_iommu_alloc_root(d);
     if ( rc )
     {
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index d136fe36883b..b549a71530d5 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1762,6 +1762,18 @@ static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
 
     spin_lock(&hd->arch.mapping_lock);
 
+    /*
+     * IOMMU mapping request can be safely ignored when the domain is dying.
+     *
+     * hd->arch.mapping_lock guarantees that d->is_dying will be observed
+     * before any page tables are freed (see iommu_free_pgtables())
+     */
+    if ( d->is_dying )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        return 0;
+    }
+
     pg_maddr = addr_to_dma_page_maddr(d, dfn_to_daddr(dfn), 1);
     if ( !pg_maddr )
     {
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index cea1032b3d02..c6b03624fe28 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -267,6 +267,12 @@ int iommu_free_pgtables(struct domain *d)
     struct page_info *pg;
     unsigned int done = 0;
 
+    if ( !is_iommu_enabled(d) )
+        return 0;
+
+    /* After this barrier, no more IOMMU mapping can happen */
+    spin_barrier(&hd->arch.mapping_lock);
+
     while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
     {
         free_domheap_page(pg);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [for-4.15][PATCH v4 2/2] xen/iommu: x86: Clear the root page-table before freeing the page-tables
  2021-02-23 12:34 [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
  2021-02-23 12:34 ` [for-4.15][PATCH v4 1/2] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
@ 2021-02-23 12:34 ` Julien Grall
  2021-02-24  9:43 ` [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
  2 siblings, 0 replies; 4+ messages in thread
From: Julien Grall @ 2021-02-23 12:34 UTC (permalink / raw)
  To: xen-devel; +Cc: hongyxia, iwj, Julien Grall

From: Julien Grall <jgrall@amazon.com>

The new per-domain IOMMU page-table allocator will now free the
page-tables when domain's resources are relinquished. However, the
per-domain IOMMU structure will still contain a dangling pointer to
the root page-table.

Xen may access the IOMMU page-tables afterwards at least in the case of
PV domain:

(XEN) Xen call trace:
(XEN)    [<ffff82d04025b4b2>] R iommu.c#addr_to_dma_page_maddr+0x12e/0x1d8
(XEN)    [<ffff82d04025b695>] F iommu.c#intel_iommu_unmap_page+0x5d/0xf8
(XEN)    [<ffff82d0402695f3>] F iommu_unmap+0x9c/0x129
(XEN)    [<ffff82d0402696a6>] F iommu_legacy_unmap+0x26/0x63
(XEN)    [<ffff82d04033c5c7>] F mm.c#cleanup_page_mappings+0x139/0x144
(XEN)    [<ffff82d04033c61d>] F put_page+0x4b/0xb3
(XEN)    [<ffff82d04033c87f>] F put_page_from_l1e+0x136/0x13b
(XEN)    [<ffff82d04033cada>] F devalidate_page+0x256/0x8dc
(XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
(XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
(XEN)    [<ffff82d04033d8d6>] F mm.c#put_page_from_l2e+0x8a/0xcf
(XEN)    [<ffff82d04033cc27>] F devalidate_page+0x3a3/0x8dc
(XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
(XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
(XEN)    [<ffff82d04033d807>] F mm.c#put_page_from_l3e+0x8a/0xcf
(XEN)    [<ffff82d04033cdf0>] F devalidate_page+0x56c/0x8dc
(XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
(XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
(XEN)    [<ffff82d04033d6c7>] F mm.c#put_page_from_l4e+0x69/0x6d
(XEN)    [<ffff82d04033cf24>] F devalidate_page+0x6a0/0x8dc
(XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
(XEN)    [<ffff82d04033d92e>] F put_page_type_preemptible+0x13/0x15
(XEN)    [<ffff82d04032598a>] F domain.c#relinquish_memory+0x1ff/0x4e9
(XEN)    [<ffff82d0403295f2>] F domain_relinquish_resources+0x2b6/0x36a
(XEN)    [<ffff82d040205cdf>] F domain_kill+0xb8/0x141
(XEN)    [<ffff82d040236cac>] F do_domctl+0xb6f/0x18e5
(XEN)    [<ffff82d04031d098>] F pv_hypercall+0x2f0/0x55f
(XEN)    [<ffff82d04039b432>] F lstar_enter+0x112/0x120

This will result to a use after-free and possibly an host crash or
memory corruption.

It would not be possible to free the page-tables further down in
domain_relinquish_resources() because cleanup_page_mappings() will only
be called when the last reference on the page dropped. This may happen
much later if another domain still hold a reference.

After all the PCI devices have been de-assigned, nobody should use the
IOMMU page-tables and it is therefore pointless to try to modify them.

So we can simply clear any reference to the root page-table in the
per-domain IOMMU structure. This requires to introduce a new callback of
the method will depend on the IOMMU driver used.

Take the opportunity to add an ASSERT() in arch_iommu_domain_destroy()
to check if we freed all the IOMMU page tables.

Fixes: 3eef6d07d722 ("x86/iommu: convert VT-d code to use new page table allocator")
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v4:
        - Move the patch later in the series as we need to prevent
        iommu_map() to allocate memory first.
        - Add an ASSERT() in arch_iommu_domain_destroy().

    Changes in v3:
        - Move the patch earlier in the series
        - Reword the commit message

    Changes in v2:
        - Introduce clear_root_pgtable()
        - Move the patch later in the series
---
 xen/drivers/passthrough/amd/pci_amd_iommu.c | 12 +++++++++++-
 xen/drivers/passthrough/vtd/iommu.c         | 12 +++++++++++-
 xen/drivers/passthrough/x86/iommu.c         | 13 +++++++++++++
 xen/include/xen/iommu.h                     |  1 +
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 42b5a5a9bec4..085fe2f5771e 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -381,9 +381,18 @@ static int amd_iommu_assign_device(struct domain *d, u8 devfn,
     return reassign_device(pdev->domain, d, devfn, pdev);
 }
 
+static void amd_iommu_clear_root_pgtable(struct domain *d)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+
+    spin_lock(&hd->arch.mapping_lock);
+    hd->arch.amd.root_table = NULL;
+    spin_unlock(&hd->arch.mapping_lock);
+}
+
 static void amd_iommu_domain_destroy(struct domain *d)
 {
-    dom_iommu(d)->arch.amd.root_table = NULL;
+    ASSERT(!dom_iommu(d)->arch.amd.root_table);
 }
 
 static int amd_iommu_add_device(u8 devfn, struct pci_dev *pdev)
@@ -565,6 +574,7 @@ static const struct iommu_ops __initconstrel _iommu_ops = {
     .remove_device = amd_iommu_remove_device,
     .assign_device  = amd_iommu_assign_device,
     .teardown = amd_iommu_domain_destroy,
+    .clear_root_pgtable = amd_iommu_clear_root_pgtable,
     .map_page = amd_iommu_map_page,
     .unmap_page = amd_iommu_unmap_page,
     .iotlb_flush = amd_iommu_flush_iotlb_pages,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index b549a71530d5..475efb3be3bd 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1726,6 +1726,15 @@ out:
     return ret;
 }
 
+static void iommu_clear_root_pgtable(struct domain *d)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+
+    spin_lock(&hd->arch.mapping_lock);
+    hd->arch.vtd.pgd_maddr = 0;
+    spin_unlock(&hd->arch.mapping_lock);
+}
+
 static void iommu_domain_teardown(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
@@ -1740,7 +1749,7 @@ static void iommu_domain_teardown(struct domain *d)
         xfree(mrmrr);
     }
 
-    hd->arch.vtd.pgd_maddr = 0;
+    ASSERT(!hd->arch.vtd.pgd_maddr);
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
@@ -2731,6 +2740,7 @@ static struct iommu_ops __initdata vtd_ops = {
     .remove_device = intel_iommu_remove_device,
     .assign_device  = intel_iommu_assign_device,
     .teardown = iommu_domain_teardown,
+    .clear_root_pgtable = iommu_clear_root_pgtable,
     .map_page = intel_iommu_map_page,
     .unmap_page = intel_iommu_unmap_page,
     .lookup_page = intel_iommu_lookup_page,
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index c6b03624fe28..faeb549591d8 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -149,6 +149,13 @@ int arch_iommu_domain_init(struct domain *d)
 
 void arch_iommu_domain_destroy(struct domain *d)
 {
+    /*
+     * There should be not page-tables left allocated by the time the
+     * domain is destroyed. Note that arch_iommu_domain_destroy() is
+     * called unconditionally, so pgtables may be unitialized.
+     */
+    ASSERT(dom_iommu(d)->platform_ops == NULL ||
+           page_list_empty(&dom_iommu(d)->arch.pgtables.list));
 }
 
 static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
@@ -273,6 +280,12 @@ int iommu_free_pgtables(struct domain *d)
     /* After this barrier, no more IOMMU mapping can happen */
     spin_barrier(&hd->arch.mapping_lock);
 
+    /*
+     * Pages will be moved to the free list below. So we want to
+     * clear the root page-table to avoid any potential use after-free.
+     */
+    hd->platform_ops->clear_root_pgtable(d);
+
     while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
     {
         free_domheap_page(pg);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 863a68fe1622..d59ed7cbad43 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -272,6 +272,7 @@ struct iommu_ops {
 
     int (*adjust_irq_affinities)(void);
     void (*sync_cache)(const void *addr, unsigned int size);
+    void (*clear_root_pgtable)(struct domain *d);
 #endif /* CONFIG_X86 */
 
     int __must_check (*suspend)(void);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown
  2021-02-23 12:34 [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
  2021-02-23 12:34 ` [for-4.15][PATCH v4 1/2] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
  2021-02-23 12:34 ` [for-4.15][PATCH v4 2/2] xen/iommu: x86: Clear the root page-table before freeing the page-tables Julien Grall
@ 2021-02-24  9:43 ` Julien Grall
  2 siblings, 0 replies; 4+ messages in thread
From: Julien Grall @ 2021-02-24  9:43 UTC (permalink / raw)
  To: xen-devel; +Cc: hongyxia, iwj, Julien Grall

Hi all,

Please ignore this version. I forgot to CC the maintainers on it. I will 
resend a new series.

Cheers,

On 23/02/2021 12:34, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Hi all,
> 
> This series is a collection of bug fixes for the IOMMU teardown code.
> All of them are candidate for 4.15 as they can either leak memory or
> lead to host crash/host corruption.
> 
> This is sent directly on xen-devel because all the issues were either
> introduced in 4.15 or happen in the domain creation code.
> 
> Major changes since v3:
>      - Remove patch #3 "xen/iommu: x86: Harden the IOMMU page-table
>      allocator" as it is not strictly necessary for 4.15.
>      - Re-order the patches to avoid on a follow-up patch to fix
>      completely the issue.
> 
> Major changes since v2:
>      - patch #1 "xen/x86: p2m: Don't map the special pages in the IOMMU
>      page-tables" has been removed. This requires Jan's patch [1] to
>      fully mitigate memory leaks.
> 
> Cheers,
> 
> [1] <90271e69-c07e-a32c-5531-a79b10ef03dd@suse.com>
> 
> Julien Grall (2):
>    xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
>    xen/iommu: x86: Clear the root page-table before freeing the
>      page-tables
> 
>   xen/drivers/passthrough/amd/iommu_map.c     | 12 +++++++++++
>   xen/drivers/passthrough/amd/pci_amd_iommu.c | 12 ++++++++++-
>   xen/drivers/passthrough/vtd/iommu.c         | 24 ++++++++++++++++++++-
>   xen/drivers/passthrough/x86/iommu.c         | 19 ++++++++++++++++
>   xen/include/xen/iommu.h                     |  1 +
>   5 files changed, 66 insertions(+), 2 deletions(-)
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-02-24  9:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-23 12:34 [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
2021-02-23 12:34 ` [for-4.15][PATCH v4 1/2] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
2021-02-23 12:34 ` [for-4.15][PATCH v4 2/2] xen/iommu: x86: Clear the root page-table before freeing the page-tables Julien Grall
2021-02-24  9:43 ` [for-4.15][PATCH v4 0/2] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).