All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: Julien Grall <julien@xen.org>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Cc: "hongyxia@amazon.co.uk" <hongyxia@amazon.co.uk>,
	"iwj@xenproject.org" <iwj@xenproject.org>,
	Julien Grall <jgrall@amazon.com>, Jan Beulich <jbeulich@suse.com>,
	"Cooper, Andrew" <andrew.cooper3@citrix.com>,
	"Paul Durrant" <paul@xen.org>
Subject: RE: [PATCH for-4.15 v5 3/3] xen/iommu: x86: Clear the root page-table before freeing the page-tables
Date: Mon, 1 Mar 2021 02:54:23 +0000	[thread overview]
Message-ID: <MWHPR11MB18863C7E6DD57B1915C806D78C9A9@MWHPR11MB1886.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20210226105640.12037-4-julien@xen.org>

> From: Julien Grall <julien@xen.org>
> Sent: Friday, February 26, 2021 6:57 PM
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> The new per-domain IOMMU page-table allocator will now free the
> page-tables when domain's resources are relinquished. However, the
> per-domain IOMMU structure will still contain a dangling pointer to
> the root page-table.
> 
> Xen may access the IOMMU page-tables afterwards at least in the case of
> PV domain:
> 
> (XEN) Xen call trace:
> (XEN)    [<ffff82d04025b4b2>] R
> iommu.c#addr_to_dma_page_maddr+0x12e/0x1d8
> (XEN)    [<ffff82d04025b695>] F
> iommu.c#intel_iommu_unmap_page+0x5d/0xf8
> (XEN)    [<ffff82d0402695f3>] F iommu_unmap+0x9c/0x129
> (XEN)    [<ffff82d0402696a6>] F iommu_legacy_unmap+0x26/0x63
> (XEN)    [<ffff82d04033c5c7>] F mm.c#cleanup_page_mappings+0x139/0x144
> (XEN)    [<ffff82d04033c61d>] F put_page+0x4b/0xb3
> (XEN)    [<ffff82d04033c87f>] F put_page_from_l1e+0x136/0x13b
> (XEN)    [<ffff82d04033cada>] F devalidate_page+0x256/0x8dc
> (XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
> (XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
> (XEN)    [<ffff82d04033d8d6>] F mm.c#put_page_from_l2e+0x8a/0xcf
> (XEN)    [<ffff82d04033cc27>] F devalidate_page+0x3a3/0x8dc
> (XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
> (XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
> (XEN)    [<ffff82d04033d807>] F mm.c#put_page_from_l3e+0x8a/0xcf
> (XEN)    [<ffff82d04033cdf0>] F devalidate_page+0x56c/0x8dc
> (XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
> (XEN)    [<ffff82d04033d64d>] F mm.c#put_pt_page+0x6f/0x80
> (XEN)    [<ffff82d04033d6c7>] F mm.c#put_page_from_l4e+0x69/0x6d
> (XEN)    [<ffff82d04033cf24>] F devalidate_page+0x6a0/0x8dc
> (XEN)    [<ffff82d04033d396>] F mm.c#_put_page_type+0x236/0x47e
> (XEN)    [<ffff82d04033d92e>] F put_page_type_preemptible+0x13/0x15
> (XEN)    [<ffff82d04032598a>] F domain.c#relinquish_memory+0x1ff/0x4e9
> (XEN)    [<ffff82d0403295f2>] F domain_relinquish_resources+0x2b6/0x36a
> (XEN)    [<ffff82d040205cdf>] F domain_kill+0xb8/0x141
> (XEN)    [<ffff82d040236cac>] F do_domctl+0xb6f/0x18e5
> (XEN)    [<ffff82d04031d098>] F pv_hypercall+0x2f0/0x55f
> (XEN)    [<ffff82d04039b432>] F lstar_enter+0x112/0x120
> 
> This will result to a use after-free and possibly an host crash or
> memory corruption.
> 
> It would not be possible to free the page-tables further down in
> domain_relinquish_resources() because cleanup_page_mappings() will only
> be called when the last reference on the page dropped. This may happen
> much later if another domain still hold a reference.
> 
> After all the PCI devices have been de-assigned, nobody should use the
> IOMMU page-tables and it is therefore pointless to try to modify them.
> 
> So we can simply clear any reference to the root page-table in the
> per-domain IOMMU structure. This requires to introduce a new callback of
> the method will depend on the IOMMU driver used.
> 
> Take the opportunity to add an ASSERT() in arch_iommu_domain_destroy()
> to check if we freed all the IOMMU page tables.
> 
> Fixes: 3eef6d07d722 ("x86/iommu: convert VT-d code to use new page table
> allocator")
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> 
> ---
>     Changes in v5:
>         - Add Jan's reviewed-by
>         - Fix typo
>         - Use ! rather than == NULL
> 
>     Changes in v4:
>         - Move the patch later in the series as we need to prevent
>         iommu_map() to allocate memory first.
>         - Add an ASSERT() in arch_iommu_domain_destroy().
> 
>     Changes in v3:
>         - Move the patch earlier in the series
>         - Reword the commit message
> 
>     Changes in v2:
>         - Introduce clear_root_pgtable()
>         - Move the patch later in the series
> ---
>  xen/drivers/passthrough/amd/pci_amd_iommu.c | 12 +++++++++++-
>  xen/drivers/passthrough/vtd/iommu.c         | 12 +++++++++++-
>  xen/drivers/passthrough/x86/iommu.c         | 13 +++++++++++++
>  xen/include/xen/iommu.h                     |  1 +
>  4 files changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index 42b5a5a9bec4..085fe2f5771e 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -381,9 +381,18 @@ static int amd_iommu_assign_device(struct domain
> *d, u8 devfn,
>      return reassign_device(pdev->domain, d, devfn, pdev);
>  }
> 
> +static void amd_iommu_clear_root_pgtable(struct domain *d)
> +{
> +    struct domain_iommu *hd = dom_iommu(d);
> +
> +    spin_lock(&hd->arch.mapping_lock);
> +    hd->arch.amd.root_table = NULL;
> +    spin_unlock(&hd->arch.mapping_lock);
> +}
> +
>  static void amd_iommu_domain_destroy(struct domain *d)
>  {
> -    dom_iommu(d)->arch.amd.root_table = NULL;
> +    ASSERT(!dom_iommu(d)->arch.amd.root_table);
>  }
> 
>  static int amd_iommu_add_device(u8 devfn, struct pci_dev *pdev)
> @@ -565,6 +574,7 @@ static const struct iommu_ops __initconstrel
> _iommu_ops = {
>      .remove_device = amd_iommu_remove_device,
>      .assign_device  = amd_iommu_assign_device,
>      .teardown = amd_iommu_domain_destroy,
> +    .clear_root_pgtable = amd_iommu_clear_root_pgtable,
>      .map_page = amd_iommu_map_page,
>      .unmap_page = amd_iommu_unmap_page,
>      .iotlb_flush = amd_iommu_flush_iotlb_pages,
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index b549a71530d5..475efb3be3bd 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1726,6 +1726,15 @@ out:
>      return ret;
>  }
> 
> +static void iommu_clear_root_pgtable(struct domain *d)
> +{
> +    struct domain_iommu *hd = dom_iommu(d);
> +
> +    spin_lock(&hd->arch.mapping_lock);
> +    hd->arch.vtd.pgd_maddr = 0;
> +    spin_unlock(&hd->arch.mapping_lock);
> +}
> +
>  static void iommu_domain_teardown(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> @@ -1740,7 +1749,7 @@ static void iommu_domain_teardown(struct
> domain *d)
>          xfree(mrmrr);
>      }
> 
> -    hd->arch.vtd.pgd_maddr = 0;
> +    ASSERT(!hd->arch.vtd.pgd_maddr);
>  }
> 
>  static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
> @@ -2731,6 +2740,7 @@ static struct iommu_ops __initdata vtd_ops = {
>      .remove_device = intel_iommu_remove_device,
>      .assign_device  = intel_iommu_assign_device,
>      .teardown = iommu_domain_teardown,
> +    .clear_root_pgtable = iommu_clear_root_pgtable,
>      .map_page = intel_iommu_map_page,
>      .unmap_page = intel_iommu_unmap_page,
>      .lookup_page = intel_iommu_lookup_page,
> diff --git a/xen/drivers/passthrough/x86/iommu.c
> b/xen/drivers/passthrough/x86/iommu.c
> index ad19b7dd461c..b90bb31bfeea 100644
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -149,6 +149,13 @@ int arch_iommu_domain_init(struct domain *d)
> 
>  void arch_iommu_domain_destroy(struct domain *d)
>  {
> +    /*
> +     * There should be not page-tables left allocated by the time the
> +     * domain is destroyed. Note that arch_iommu_domain_destroy() is
> +     * called unconditionally, so pgtables may be uninitialized.
> +     */
> +    ASSERT(!dom_iommu(d)->platform_ops ||
> +           page_list_empty(&dom_iommu(d)->arch.pgtables.list));
>  }
> 
>  static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
> @@ -273,6 +280,12 @@ int iommu_free_pgtables(struct domain *d)
>      /* After this barrier, no new IOMMU mappings can be inserted. */
>      spin_barrier(&hd->arch.mapping_lock);
> 
> +    /*
> +     * Pages will be moved to the free list below. So we want to
> +     * clear the root page-table to avoid any potential use after-free.
> +     */
> +    hd->platform_ops->clear_root_pgtable(d);
> +
>      while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
>      {
>          free_domheap_page(pg);
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 863a68fe1622..d59ed7cbad43 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -272,6 +272,7 @@ struct iommu_ops {
> 
>      int (*adjust_irq_affinities)(void);
>      void (*sync_cache)(const void *addr, unsigned int size);
> +    void (*clear_root_pgtable)(struct domain *d);
>  #endif /* CONFIG_X86 */
> 
>      int __must_check (*suspend)(void);
> --
> 2.17.1



  reply	other threads:[~2021-03-01  2:54 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-26 10:56 [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall
2021-02-26 10:56 ` [PATCH for-4.15 v5 1/3] xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled Julien Grall
2021-02-26 13:27   ` Jan Beulich
2021-02-26 10:56 ` [PATCH for-4.15 v5 2/3] xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying Julien Grall
2021-02-26 13:30   ` Jan Beulich
2021-03-01  9:21     ` Julien Grall
2021-03-01  2:52   ` Tian, Kevin
2021-02-26 10:56 ` [PATCH for-4.15 v5 3/3] xen/iommu: x86: Clear the root page-table before freeing the page-tables Julien Grall
2021-03-01  2:54   ` Tian, Kevin [this message]
2021-03-02  9:56 ` [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for IOMMU teardown Julien Grall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MWHPR11MB18863C7E6DD57B1915C806D78C9A9@MWHPR11MB1886.namprd11.prod.outlook.com \
    --to=kevin.tian@intel.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=hongyxia@amazon.co.uk \
    --cc=iwj@xenproject.org \
    --cc=jbeulich@suse.com \
    --cc=jgrall@amazon.com \
    --cc=julien@xen.org \
    --cc=paul@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.