All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: qemu-devel@nongnu.org
Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com,
	jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com,
	David Gibson <david@gibson.dropbear.id.au>,
	alex.williamson@redhat.com, bd.aviv@gmail.com
Subject: [Qemu-devel] [PATCH v7 16/17] intel_iommu: allow dynamic switch of IOMMU region
Date: Tue,  7 Feb 2017 16:28:18 +0800	[thread overview]
Message-ID: <1486456099-7345-17-git-send-email-peterx@redhat.com> (raw)
In-Reply-To: <1486456099-7345-1-git-send-email-peterx@redhat.com>

This is preparation work to finally enabled dynamic switching ON/OFF for
VT-d protection. The old VT-d codes is using static IOMMU address space,
and that won't satisfy vfio-pci device listeners.

Let me explain.

vfio-pci devices depend on the memory region listener and IOMMU replay
mechanism to make sure the device mapping is coherent with the guest
even if there are domain switches. And there are two kinds of domain
switches:

  (1) switch from domain A -> B
  (2) switch from domain A -> no domain (e.g., turn DMAR off)

Case (1) is handled by the context entry invalidation handling by the
VT-d replay logic. What the replay function should do here is to replay
the existing page mappings in domain B.

However for case (2), we don't want to replay any domain mappings - we
just need the default GPA->HPA mappings (the address_space_memory
mapping). And this patch helps on case (2) to build up the mapping
automatically by leveraging the vfio-pci memory listeners.

Another important thing that this patch does is to seperate
IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
depend on the DMAR region (like before this patch). It should be a
standalone region, and it should be able to be activated without
DMAR (which is a common behavior of Linux kernel - by default it enables
IR while disabled DMAR).

Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c         | 78 ++++++++++++++++++++++++++++++++++++++++---
 hw/i386/trace-events          |  2 +-
 include/hw/i386/intel_iommu.h |  2 ++
 3 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f8d5713..4fe161f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1291,9 +1291,49 @@ static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s)
     vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS);
 }
 
+static void vtd_switch_address_space(VTDAddressSpace *as)
+{
+    assert(as);
+
+    trace_vtd_switch_address_space(pci_bus_num(as->bus),
+                                   VTD_PCI_SLOT(as->devfn),
+                                   VTD_PCI_FUNC(as->devfn),
+                                   as->iommu_state->dmar_enabled);
+
+    /* Turn off first then on the other */
+    if (as->iommu_state->dmar_enabled) {
+        memory_region_set_enabled(&as->sys_alias, false);
+        memory_region_set_enabled(&as->iommu, true);
+    } else {
+        memory_region_set_enabled(&as->iommu, false);
+        memory_region_set_enabled(&as->sys_alias, true);
+    }
+}
+
+static void vtd_switch_address_space_all(IntelIOMMUState *s)
+{
+    GHashTableIter iter;
+    VTDBus *vtd_bus;
+    int i;
+
+    g_hash_table_iter_init(&iter, s->vtd_as_by_busptr);
+    while (g_hash_table_iter_next(&iter, NULL, (void **)&vtd_bus)) {
+        for (i = 0; i < X86_IOMMU_PCI_DEVFN_MAX; i++) {
+            if (!vtd_bus->dev_as[i]) {
+                continue;
+            }
+            vtd_switch_address_space(vtd_bus->dev_as[i]);
+        }
+    }
+}
+
 /* Handle Translation Enable/Disable */
 static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
 {
+    if (s->dmar_enabled == en) {
+        return;
+    }
+
     VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
 
     if (en) {
@@ -1308,6 +1348,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
         /* Ok - report back to driver */
         vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
     }
+
+    vtd_switch_address_space_all(s);
 }
 
 /* Handle Interrupt Remap Enable/Disable */
@@ -2529,15 +2571,43 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
         vtd_dev_as->devfn = (uint8_t)devfn;
         vtd_dev_as->iommu_state = s;
         vtd_dev_as->context_cache_entry.context_cache_gen = 0;
+
+        /*
+         * Memory region relationships looks like (Address range shows
+         * only lower 32 bits to make it short in length...):
+         *
+         * |-----------------+-------------------+----------|
+         * | Name            | Address range     | Priority |
+         * |-----------------+-------------------+----------+
+         * | vtd_root        | 00000000-ffffffff |        0 |
+         * |  intel_iommu    | 00000000-ffffffff |        1 |
+         * |  vtd_sys_alias  | 00000000-ffffffff |        1 |
+         * |  intel_iommu_ir | fee00000-feefffff |       64 |
+         * |-----------------+-------------------+----------|
+         *
+         * We enable/disable DMAR by switching enablement for
+         * vtd_sys_alias and intel_iommu regions. IR region is always
+         * enabled.
+         */
         memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
                                  &s->iommu_ops, "intel_iommu", UINT64_MAX);
+        memory_region_init_alias(&vtd_dev_as->sys_alias, OBJECT(s),
+                                 "vtd_sys_alias", get_system_memory(),
+                                 0, memory_region_size(get_system_memory()));
         memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
                               &vtd_mem_ir_ops, s, "intel_iommu_ir",
                               VTD_INTERRUPT_ADDR_SIZE);
-        memory_region_add_subregion(&vtd_dev_as->iommu, VTD_INTERRUPT_ADDR_FIRST,
-                                    &vtd_dev_as->iommu_ir);
-        address_space_init(&vtd_dev_as->as,
-                           &vtd_dev_as->iommu, name);
+        memory_region_init(&vtd_dev_as->root, OBJECT(s),
+                           "vtd_root", UINT64_MAX);
+        memory_region_add_subregion_overlap(&vtd_dev_as->root,
+                                            VTD_INTERRUPT_ADDR_FIRST,
+                                            &vtd_dev_as->iommu_ir, 64);
+        address_space_init(&vtd_dev_as->as, &vtd_dev_as->root, name);
+        memory_region_add_subregion_overlap(&vtd_dev_as->root, 0,
+                                            &vtd_dev_as->sys_alias, 1);
+        memory_region_add_subregion_overlap(&vtd_dev_as->root, 0,
+                                            &vtd_dev_as->iommu, 1);
+        vtd_switch_address_space(vtd_dev_as);
     }
     return vtd_dev_as;
 }
diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index 463db0d..ebb650b 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -4,7 +4,6 @@
 x86_iommu_iec_notify(bool global, uint32_t index, uint32_t mask) "Notify IEC invalidation: global=%d index=%" PRIu32 " mask=%" PRIu32
 
 # hw/i386/intel_iommu.c
-vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
 vtd_inv_desc(const char *type, uint64_t hi, uint64_t lo) "invalidate desc type %s high 0x%"PRIx64" low 0x%"PRIx64
 vtd_inv_desc_invalid(uint64_t hi, uint64_t lo) "invalid inv desc hi 0x%"PRIx64" lo 0x%"PRIx64
 vtd_inv_desc_cc_domain(uint16_t domain) "context invalidate domain 0x%"PRIx16
@@ -37,6 +36,7 @@ vtd_page_walk_one(uint32_t level, uint64_t iova, uint64_t gpa, uint64_t mask, in
 vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Page walk skip iova 0x%"PRIx64" - 0x%"PRIx64" due to unable to read"
 vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova 0x%"PRIx64" - 0x%"PRIx64" due to perm empty"
 vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip iova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set"
+vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
 
 # hw/i386/amd_iommu.c
 amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 0x%"PRIx64" +  offset 0x%"PRIx32
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index fe645aa..8f212a1 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -83,6 +83,8 @@ struct VTDAddressSpace {
     uint8_t devfn;
     AddressSpace as;
     MemoryRegion iommu;
+    MemoryRegion root;
+    MemoryRegion sys_alias;
     MemoryRegion iommu_ir;      /* Interrupt region: 0xfeeXXXXX */
     IntelIOMMUState *iommu_state;
     VTDContextCacheEntry context_cache_entry;
-- 
2.7.4

  parent reply	other threads:[~2017-02-07  8:30 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-07  8:28 [Qemu-devel] [PATCH v7 00/17] VT-d: vfio enablement and misc enhances Peter Xu
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 01/17] vfio: trace map/unmap for notify as well Peter Xu
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 02/17] vfio: introduce vfio_get_vaddr() Peter Xu
2017-02-10  1:12   ` David Gibson
2017-02-10  5:50     ` Peter Xu
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 03/17] vfio: allow to notify unmap for very large region Peter Xu
2017-02-10  1:13   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 04/17] intel_iommu: add "caching-mode" option Peter Xu
2017-02-10  1:14   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 05/17] intel_iommu: simplify irq region translation Peter Xu
2017-02-10  1:15   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 06/17] intel_iommu: renaming gpa to iova where proper Peter Xu
2017-02-10  1:17   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 07/17] intel_iommu: convert dbg macros to traces for inv Peter Xu
2017-02-08  2:47   ` Jason Wang
2017-02-10  1:19   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 08/17] intel_iommu: convert dbg macros to trace for trans Peter Xu
2017-02-08  2:49   ` Jason Wang
2017-02-10  1:20   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 09/17] intel_iommu: vtd_slpt_level_shift check level Peter Xu
2017-02-10  1:20   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 10/17] memory: add section range info for IOMMU notifier Peter Xu
2017-02-10  2:29   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 11/17] memory: provide IOMMU_NOTIFIER_FOREACH macro Peter Xu
2017-02-10  2:30   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 12/17] memory: provide iommu_replay_all() Peter Xu
2017-02-10  2:31   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 13/17] memory: introduce memory_region_notify_one() Peter Xu
2017-02-10  2:33   ` David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 14/17] memory: add MemoryRegionIOMMUOps.replay() callback Peter Xu
2017-02-10  2:34   ` David Gibson
2017-03-27  8:35   ` Liu, Yi L
2017-03-27  9:12     ` Peter Xu
2017-03-27  9:21       ` Liu, Yi L
2017-03-30 11:06         ` Liu, Yi L
2017-03-30 11:57           ` Jason Wang
2017-03-31  2:56             ` Peter Xu
2017-03-31  4:21               ` Jason Wang
2017-03-31  5:01                 ` Peter Xu
2017-03-31  5:12                   ` Jason Wang
2017-03-31  5:28                     ` Peter Xu
2017-03-31  5:34             ` Liu, Yi L
2017-03-31  7:16               ` Jason Wang
2017-03-31  7:30                 ` Liu, Yi L
2017-04-01  5:00                   ` Jason Wang
2017-04-01  6:39                     ` Liu, Yi L
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 15/17] intel_iommu: provide its own replay() callback Peter Xu
2017-02-10  2:36   ` David Gibson
2017-02-07  8:28 ` Peter Xu [this message]
2017-02-10  2:38   ` [Qemu-devel] [PATCH v7 16/17] intel_iommu: allow dynamic switch of IOMMU region David Gibson
2017-02-07  8:28 ` [Qemu-devel] [PATCH v7 17/17] intel_iommu: enable vfio devices Peter Xu
2017-02-10  6:24   ` Jason Wang
2017-03-16  4:05   ` Peter Xu
2017-03-19 15:34     ` Aviv B.D.
2017-03-20  1:56       ` Peter Xu
2017-03-20  2:12         ` Liu, Yi L
2017-03-20  2:41           ` Peter Xu
2017-02-17 17:18 ` [Qemu-devel] [PATCH v7 00/17] VT-d: vfio enablement and misc enhances Alex Williamson
2017-02-20  7:47   ` Peter Xu
2017-02-20  8:17     ` Liu, Yi L
2017-02-20  8:32       ` Peter Xu
2017-02-20 19:15     ` Alex Williamson
2017-02-28  7:52 ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1486456099-7345-17-git-send-email-peterx@redhat.com \
    --to=peterx@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=bd.aviv@gmail.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=jan.kiszka@siemens.com \
    --cc=jasowang@redhat.com \
    --cc=kevin.tian@intel.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=tianyu.lan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.