All of lore.kernel.org
 help / color / mirror / Atom feed
From: Roger Pau Monne <roger.pau@citrix.com>
To: xen-devel@lists.xenproject.org
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Wei Liu <wei.liu2@citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>, Tim Deegan <tim@xen.org>,
	Julien Grall <julien.grall@arm.com>,
	Jan Beulich <jbeulich@suse.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges
Date: Wed, 8 Aug 2018 12:07:47 +0200	[thread overview]
Message-ID: <20180808100747.19464-5-roger.pau@citrix.com> (raw)
In-Reply-To: <20180808100747.19464-1-roger.pau@citrix.com>

Several people have reported hardware issues (malfunctioning USB
controllers) due to iommu page faults on Intel hardware. Those faults
are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
be worked around on VTd hardware by manually adding RMRR entries on
the command line, this is however limited to Intel hardware and quite
cumbersome to do.

In order to solve those issues add a new dom0-iommu=reserved option
that identity maps all regions marked as reserved in the memory map.
Note that regions used by devices emulated by Xen (LAPIC, IO-APIC or
PCIe MCFG regions) are specifically avoided. Note that this option is
available to a PVH Dom0 (as opposed to the inclusive option which only
works for PV Dom0).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
---
Changes since v3:
 - Add mappings if the iommu page tables are shared.

Changes since v2:
 - Fix comment regarding dom0-strict.
 - Change documentation style of xen command line.
 - Rename iommu_map to hwdom_iommu_map.
 - Move all the checks to hwdom_iommu_map.

Changes since v1:
 - Introduce a new reserved option instead of abusing the inclusive
   option.
 - Use the same helper function for PV and PVH in order to decide if a
   page should be added to the domain page tables.
 - Use the data inside of the domain struct to detect overlaps with
   emulated MMIO regions.
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 docs/misc/xen-command-line.markdown         | 11 ++-
 xen/arch/x86/hvm/io.c                       |  5 ++
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  3 +
 xen/drivers/passthrough/iommu.c             |  3 +
 xen/drivers/passthrough/vtd/iommu.c         |  3 +
 xen/drivers/passthrough/x86/iommu.c         | 92 ++++++++++++++-------
 xen/include/asm-x86/hvm/io.h                |  3 +
 xen/include/xen/iommu.h                     |  2 +-
 8 files changed, 91 insertions(+), 31 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 90b32fe3f0..59ec2afc5d 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1205,7 +1205,7 @@ detection of systems known to misbehave upon accesses to that port.
 >> Enable IOMMU debugging code (implies `verbose`).
 
 ### dom0-iommu
-> `= List of [ none | strict | relaxed | inclusive ]`
+> `= List of [ none | strict | relaxed | inclusive | reserved ]`
 
 * `none`: disables DMA remapping for Dom0.
 
@@ -1233,6 +1233,15 @@ meaning:
   option is only applicable to a PV Dom0 and is enabled by default on Intel
   hardware.
 
+* `reserved`: sets up DMA remapping for all the reserved regions in the memory
+  map for Dom0. Use this to work around firmware issues providing incorrect
+  RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU accesses
+  for Dom0, all memory regions marked as reserved in the memory map that don't
+  overlap with any MMIO region from emulated devices will be identity mapped.
+  This option maps a subset of the memory that would be mapped when using the
+  `inclusive` option. This option is available to a PVH Dom0 and is enabled by
+  default on Intel hardware.
+
 ### iommu\_dev\_iotlb\_timeout
 > `= <integer>`
 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index bf4d8748d3..5e01c33890 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -404,6 +404,11 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d,
     return NULL;
 }
 
+bool vpci_mmcfg_address(const struct domain *d, paddr_t addr)
+{
+    return vpci_mmcfg_find(d, addr);
+}
+
 static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
                                            paddr_t addr, pci_sbdf_t *sbdf)
 {
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 0e0c99c942..2c2867d088 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -256,6 +256,9 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
     /* Inclusive IOMMU mappings are disabled by default on AMD hardware. */
     iommu_dom0_inclusive = iommu_dom0_inclusive == -1 ? false
                                                       : iommu_dom0_inclusive;
+    /* Reserved IOMMU mappings are disabled by default on AMD hardware. */
+    iommu_dom0_reserved = iommu_dom0_reserved == -1 ? false
+                                                    : iommu_dom0_reserved;
 
     if ( allocate_domain_resources(dom_iommu(d)) )
         BUG();
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index f15c94be42..9c991bd2cf 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -75,6 +75,7 @@ custom_param("dom0-iommu", parse_dom0_iommu_param);
 bool __hwdom_initdata iommu_dom0_strict;
 bool __read_mostly iommu_dom0_passthrough;
 int8_t __hwdom_initdata iommu_dom0_inclusive = -1;
+int8_t __hwdom_initdata iommu_dom0_reserved = -1;
 
 DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
 
@@ -162,6 +163,8 @@ static int __init parse_dom0_iommu_param(const char *s)
             iommu_dom0_strict = false;
         else if ( !strncmp(s, "inclusive", ss - s) )
             iommu_dom0_inclusive = val;
+        else if ( !strncmp(s, "reserved", ss - s) )
+            iommu_dom0_reserved = val;
         else
             rc = -EINVAL;
 
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 7c7e15755d..77a076215b 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1307,6 +1307,9 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
     /* Inclusive mappings are enabled by default on Intel hardware for PV. */
     iommu_dom0_inclusive = iommu_dom0_inclusive == -1 ? is_pv_domain(d)
                                                       : iommu_dom0_inclusive;
+    /* Reserved IOMMU mappings are enabled by default on Intel hardware. */
+    iommu_dom0_reserved = iommu_dom0_reserved == -1 ? true
+                                                    : iommu_dom0_reserved;
 
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 5a7a765e9d..6271d8b671 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -20,6 +20,7 @@
 #include <xen/softirq.h>
 #include <xsm/xsm.h>
 
+#include <asm/hvm/io.h>
 #include <asm/setup.h>
 
 void iommu_update_ire_from_apic(
@@ -134,13 +135,67 @@ void arch_iommu_domain_destroy(struct domain *d)
 {
 }
 
+static bool __hwdom_init hwdom_iommu_map(const struct domain *d, unsigned long pfn,
+                                         unsigned long max_pfn)
+{
+    unsigned int i;
+
+    /*
+     * Ignore any address below 1MB, that's already identity mapped by the
+     * domain builder for HVM.
+     */
+    if ( (is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||
+         /* Exclude Xen bits. */
+         xen_in_range(pfn) || (pfn > max_pfn && !mfn_valid(_mfn(pfn))) )
+        return false;
+
+    /*
+     * If dom0-strict mode is enabled or the guest type is PVH/HVM then exclude
+     * conventional RAM and let the common code map dom0's pages.
+     */
+    if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
+         (iommu_dom0_strict || is_hvm_domain(d)) )
+        return false;
+    if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
+         !iommu_dom0_reserved && !iommu_dom0_inclusive )
+        return false;
+    if ( !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE) &&
+         !page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
+         !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
+         (!iommu_dom0_inclusive || pfn > max_pfn) )
+        return false;
+
+    /* Check that it doesn't overlap with the LAPIC */
+    if ( has_vlapic(d) )
+    {
+        const struct vcpu *v;
+
+        for_each_vcpu(d, v)
+            if ( pfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(v))) )
+                return false;
+    }
+    /* ... or the IO-APIC */
+    for ( i = 0; has_vioapic(d) && i < d->arch.hvm_domain.nr_vioapics; i++ )
+        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
+            return false;
+    /*
+     * ... or the PCIe MCFG regions.
+     * TODO: runtime added MMCFG regions are not checked to make sure they
+     * don't overlap with already mapped regions, thus preventing trapping.
+     */
+    if ( has_vpci(d) && vpci_mmcfg_address(d, pfn << PAGE_SHIFT) )
+        return false;
+
+    return true;
+}
+
 void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
 {
     unsigned long i, top, max_pfn;
 
     BUG_ON(!is_hardware_domain(d));
 
-    if ( iommu_dom0_passthrough || !is_pv_domain(d) )
+    if ( iommu_dom0_passthrough )
         return;
 
     max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
@@ -149,39 +204,18 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
     for ( i = 0; i < top; i++ )
     {
         unsigned long pfn = pdx_to_pfn(i);
-        bool map;
         int rc;
 
-        /*
-         * Set up 1:1 mapping for dom0. Default to include only
-         * conventional RAM areas and let RMRRs include needed reserved
-         * regions. When set, the inclusive mapping additionally maps in
-         * every pfn up to 4GB except those that fall in unusable ranges.
-         */
-        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
+        if ( !hwdom_iommu_map(d, pfn, max_pfn) )
             continue;
 
-        if ( iommu_dom0_inclusive && pfn <= max_pfn )
-            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
+        if ( iommu_use_hap_pt(d) )
+        {
+            ASSERT(is_hvm_domain(d));
+            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
+        }
         else
-            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
-
-        if ( !map )
-            continue;
-
-        /* Exclude Xen bits */
-        if ( xen_in_range(pfn) )
-            continue;
-
-        /*
-         * If dom0-strict mode is enabled then exclude conventional RAM
-         * and let the common code map dom0's pages.
-         */
-        if ( iommu_dom0_strict &&
-             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
-            continue;
-
-        rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
+            rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
         if ( rc )
             printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index e6b6ed0b92..8cca456b55 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -180,6 +180,9 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
 /* Destroy tracked MMCFG areas. */
 void destroy_vpci_mmcfg(struct domain *d);
 
+/* Check if an address is between a MMCFG region for a domain. */
+bool vpci_mmcfg_address(const struct domain *d, paddr_t addr);
+
 #endif /* __ASM_X86_HVM_IO_H__ */
 
 
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 99e5b89c0f..fed1b1ea7a 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -37,7 +37,7 @@ extern bool_t iommu_debug;
 extern bool_t amd_iommu_perdev_intremap;
 
 extern bool iommu_dom0_strict, iommu_dom0_passthrough;
-extern int8_t iommu_dom0_inclusive;
+extern int8_t iommu_dom0_inclusive, iommu_dom0_reserved;
 
 extern unsigned int iommu_dev_iotlb_timeout;
 
-- 
2.18.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  parent reply	other threads:[~2018-08-08 10:08 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-08 10:07 [PATCH v4 0/4] 86/iommu: PVH Dom0 workarounds for missing RMRR entries Roger Pau Monne
2018-08-08 10:07 ` [PATCH v4 1/4] iommu: introduce dom0-iommu option Roger Pau Monne
2018-08-08 12:10   ` Jan Beulich
2018-08-08 15:50     ` Roger Pau Monné
2018-08-09  7:00       ` Jan Beulich
2018-08-09 10:01         ` Roger Pau Monné
2018-08-09 10:29           ` Jan Beulich
2018-08-09 10:51             ` Roger Pau Monné
2018-08-09 11:46               ` Jan Beulich
2018-08-09 10:18         ` Paul Durrant
2018-08-08 10:07 ` [PATCH v4 2/4] iommu: make iommu_inclusive_mapping a suboption of dom0-iommu Roger Pau Monne
2018-08-08 12:32   ` Jan Beulich
2018-08-08 16:09     ` Roger Pau Monné
2018-08-09  7:17       ` Jan Beulich
2018-08-08 10:07 ` [PATCH v4 3/4] dom0/pvh: change the order of the MMCFG initialization Roger Pau Monne
2018-08-08 12:35   ` Jan Beulich
2018-08-10 10:04     ` Roger Pau Monné
2018-08-10 11:41       ` Jan Beulich
2018-08-08 10:07 ` Roger Pau Monne [this message]
2018-08-08 13:15   ` [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges Jan Beulich
2018-08-08 16:18     ` Roger Pau Monné
2018-08-09  7:36       ` Jan Beulich
2018-08-09 10:23         ` Roger Pau Monné
2018-08-09 10:33           ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180808100747.19464-5-roger.pau@citrix.com \
    --to=roger.pau@citrix.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=julien.grall@arm.com \
    --cc=sstabellini@kernel.org \
    --cc=tim@xen.org \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.