All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
@ 2016-12-22  9:48 Peter Xu
  2016-12-22  9:52 ` Jason Wang
  2017-01-05  3:30 ` Tian, Kevin
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Xu @ 2016-12-22  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: tianyu.lan, kevin.tian, mst, jan.kiszka, jasowang, peterx,
	alex.williamson, bd.aviv, david

This is preparation work to finally enabled dynamic switching ON/OFF for
VT-d protection. The old VT-d codes is using static IOMMU region, and
that won't satisfy vfio-pci device listeners.

Let me explain.

vfio-pci devices depend on the memory region listener and IOMMU replay
mechanism to make sure the device mapping is coherent with the guest
even if there are domain switches. And there are two kinds of domain
switches:

  (1) switch from domain A -> B
  (2) switch from domain A -> no domain (e.g., turn DMAR off)

Case (1) is handled by the context entry invalidation handling by the
VT-d replay logic. What the replay function should do here is to replay
the existing page mappings in domain B.

However for case (2), we don't want to replay any domain mappings - we
just need the default GPA->HPA mappings (the address_space_memory
mapping). And this patch helps on case (2) to build up the mapping
automatically by leveraging the vfio-pci memory listeners.

Another important thing that this patch does is to seperate
IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
depend on the DMAR region (like before this patch). It should be a
standalone region, and it should be able to be activated without
DMAR (which is a common behavior of Linux kernel - by default it enables
IR while disabled DMAR).

Signed-off-by: Peter Xu <peterx@redhat.com>
---
v3:
- fix another trivial style issue patchew reported but I missed in v2

v2:
- fix issues reported by patchew
- switch domain by enable/disable memory regions [David]
- provide vtd_switch_address_space{_all}()
- provide a better comment on the memory regions

test done: with intel_iommu device, boot vm with/without
"intel_iommu=on" parameter.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c         | 79 ++++++++++++++++++++++++++++++++++++++++---
 hw/i386/trace-events          |  3 ++
 include/hw/i386/intel_iommu.h |  2 ++
 3 files changed, 80 insertions(+), 4 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 5f3e351..2710ee774 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -35,6 +35,7 @@
 #include "sysemu/kvm.h"
 #include "hw/i386/apic_internal.h"
 #include "kvm_i386.h"
+#include "trace.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -1179,9 +1180,49 @@ static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s)
     vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS);
 }
 
+static void vtd_switch_address_space(VTDAddressSpace *as, bool iommu_enabled)
+{
+    assert(as);
+
+    trace_vtd_switch_address_space(pci_bus_num(as->bus),
+                                   VTD_PCI_SLOT(as->devfn),
+                                   VTD_PCI_FUNC(as->devfn),
+                                   iommu_enabled);
+
+    /* Turn off first then on the other */
+    if (iommu_enabled) {
+        memory_region_set_enabled(&as->sys_alias, false);
+        memory_region_set_enabled(&as->iommu, true);
+    } else {
+        memory_region_set_enabled(&as->iommu, false);
+        memory_region_set_enabled(&as->sys_alias, true);
+    }
+}
+
+static void vtd_switch_address_space_all(IntelIOMMUState *s, bool enabled)
+{
+    GHashTableIter iter;
+    VTDBus *vtd_bus;
+    int i;
+
+    g_hash_table_iter_init(&iter, s->vtd_as_by_busptr);
+    while (g_hash_table_iter_next(&iter, NULL, (void **)&vtd_bus)) {
+        for (i = 0; i < X86_IOMMU_PCI_DEVFN_MAX; i++) {
+            if (!vtd_bus->dev_as[i]) {
+                continue;
+            }
+            vtd_switch_address_space(vtd_bus->dev_as[i], enabled);
+        }
+    }
+}
+
 /* Handle Translation Enable/Disable */
 static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
 {
+    if (s->dmar_enabled == en) {
+        return;
+    }
+
     VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
 
     if (en) {
@@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
         /* Ok - report back to driver */
         vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
     }
+
+    vtd_switch_address_space_all(s, en);
 }
 
 /* Handle Interrupt Remap Enable/Disable */
@@ -2343,15 +2386,43 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
         vtd_dev_as->devfn = (uint8_t)devfn;
         vtd_dev_as->iommu_state = s;
         vtd_dev_as->context_cache_entry.context_cache_gen = 0;
+
+        /*
+         * Memory region relationships looks like (Address range shows
+         * only lower 32 bits to make it short in length...):
+         *
+         * |-----------------+-------------------+----------|
+         * | Name            | Address range     | Priority |
+         * |-----------------+-------------------+----------+
+         * | vtd_root        | 00000000-ffffffff |        0 |
+         * |  intel_iommu    | 00000000-ffffffff |        1 |
+         * |  vtd_sys_alias  | 00000000-ffffffff |        1 |
+         * |  intel_iommu_ir | fee00000-feefffff |       64 |
+         * |-----------------+-------------------+----------|
+         *
+         * We enable/disable DMAR by switching enablement for
+         * vtd_sys_alias and intel_iommu regions. IR region is always
+         * enabled.
+         */
         memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
                                  &s->iommu_ops, "intel_iommu", UINT64_MAX);
+        memory_region_init_alias(&vtd_dev_as->sys_alias, OBJECT(s),
+                                 "vtd_sys_alias", get_system_memory(),
+                                 0, memory_region_size(get_system_memory()));
         memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
                               &vtd_mem_ir_ops, s, "intel_iommu_ir",
                               VTD_INTERRUPT_ADDR_SIZE);
-        memory_region_add_subregion(&vtd_dev_as->iommu, VTD_INTERRUPT_ADDR_FIRST,
-                                    &vtd_dev_as->iommu_ir);
-        address_space_init(&vtd_dev_as->as,
-                           &vtd_dev_as->iommu, "intel_iommu");
+        memory_region_init(&vtd_dev_as->root, OBJECT(s),
+                           "vtd_root", UINT64_MAX);
+        memory_region_add_subregion_overlap(&vtd_dev_as->root,
+                                            VTD_INTERRUPT_ADDR_FIRST,
+                                            &vtd_dev_as->iommu_ir, 64);
+        address_space_init(&vtd_dev_as->as, &vtd_dev_as->root, "intel_iommu");
+        memory_region_add_subregion_overlap(&vtd_dev_as->root, 0,
+                                            &vtd_dev_as->sys_alias, 1);
+        memory_region_add_subregion_overlap(&vtd_dev_as->root, 0,
+                                            &vtd_dev_as->iommu, 1);
+        vtd_switch_address_space(vtd_dev_as, s->dmar_enabled);
     }
     return vtd_dev_as;
 }
diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index d2b4973..aee93bb 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -10,6 +10,9 @@ xen_pv_mmio_write(uint64_t addr) "WARNING: write to Xen PV Device MMIO space (ad
 # hw/i386/x86-iommu.c
 x86_iommu_iec_notify(bool global, uint32_t index, uint32_t mask) "Notify IEC invalidation: global=%d index=%" PRIu32 " mask=%" PRIu32
 
+# hw/i386/intel_iommu.c
+vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
+
 # hw/i386/amd_iommu.c
 amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 0x%"PRIx64" +  offset 0x%"PRIx32
 amdvi_cache_update(uint16_t domid, uint8_t bus, uint8_t slot, uint8_t func, uint64_t gpa, uint64_t txaddr) " update iotlb domid 0x%"PRIx16" devid: %02x:%02x.%x gpa 0x%"PRIx64" hpa 0x%"PRIx64
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 405c9d1..85c1b9b 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -83,6 +83,8 @@ struct VTDAddressSpace {
     uint8_t devfn;
     AddressSpace as;
     MemoryRegion iommu;
+    MemoryRegion root;
+    MemoryRegion sys_alias;
     MemoryRegion iommu_ir;      /* Interrupt region: 0xfeeXXXXX */
     IntelIOMMUState *iommu_state;
     VTDContextCacheEntry context_cache_entry;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2016-12-22  9:48 [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region Peter Xu
@ 2016-12-22  9:52 ` Jason Wang
  2016-12-22 11:04   ` Peter Xu
  2017-01-05  3:30 ` Tian, Kevin
  1 sibling, 1 reply; 11+ messages in thread
From: Jason Wang @ 2016-12-22  9:52 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: tianyu.lan, kevin.tian, mst, jan.kiszka, alex.williamson, bd.aviv, david



On 2016年12月22日 17:48, Peter Xu wrote:
>   /* Handle Translation Enable/Disable */
>   static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
>   {
> +    if (s->dmar_enabled == en) {
> +        return;
> +    }
> +
>       VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
>   
>       if (en) {
> @@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
>           /* Ok - report back to driver */
>           vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
>       }
> +
> +    vtd_switch_address_space_all(s, en);
>   }

We may need something like notifier here to tell e.g vhost to stop 
device IOTLB. (Since it's likely this series were applied after device 
IOTLB patches)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2016-12-22  9:52 ` Jason Wang
@ 2016-12-22 11:04   ` Peter Xu
  2016-12-22 11:34     ` Jason Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Xu @ 2016-12-22 11:04 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, tianyu.lan, kevin.tian, mst, jan.kiszka,
	alex.williamson, bd.aviv, david

On Thu, Dec 22, 2016 at 05:52:58PM +0800, Jason Wang wrote:
> 
> 
> On 2016年12月22日 17:48, Peter Xu wrote:
> >  /* Handle Translation Enable/Disable */
> >  static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> >  {
> >+    if (s->dmar_enabled == en) {
> >+        return;
> >+    }
> >+
> >      VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
> >      if (en) {
> >@@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> >          /* Ok - report back to driver */
> >          vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
> >      }
> >+
> >+    vtd_switch_address_space_all(s, en);
> >  }
> 
> We may need something like notifier here to tell e.g vhost to stop device
> IOTLB. (Since it's likely this series were applied after device IOTLB
> patches)

Yes, I missed vhost case.

To notify vhost, IMO we should be able to use memory listeners just
like how vfio devices do (please refer to vfio_memory_listener).
However, I think the bigger issue is we still don't have a dynamic way
to turn vhost DMAR on/off, right?

If so, we may need to re-touch all the parts to enable the dynamic
switching of DMA remapping - QEMU vhost, kernel vhost, and virtio on
the guest side... I start to doubt whether that effort will worth it
due to this single change, especially if this feature (dynamic on/off
DMA remapping) won't be used by most VMs (i.e., Linux should only turn
VT-d on when kernel detects it, and should never turn it off in
anyway).

(However I do think this is an improvement to current VT-d though)

Thanks,

-- peterx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2016-12-22 11:04   ` Peter Xu
@ 2016-12-22 11:34     ` Jason Wang
  2016-12-23  3:26       ` Peter Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Wang @ 2016-12-22 11:34 UTC (permalink / raw)
  To: Peter Xu
  Cc: tianyu.lan, kevin.tian, mst, jan.kiszka, bd.aviv, qemu-devel,
	alex.williamson, david



On 2016年12月22日 19:04, Peter Xu wrote:
> On Thu, Dec 22, 2016 at 05:52:58PM +0800, Jason Wang wrote:
>>
>> On 2016年12月22日 17:48, Peter Xu wrote:
>>>   /* Handle Translation Enable/Disable */
>>>   static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
>>>   {
>>> +    if (s->dmar_enabled == en) {
>>> +        return;
>>> +    }
>>> +
>>>       VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
>>>       if (en) {
>>> @@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
>>>           /* Ok - report back to driver */
>>>           vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
>>>       }
>>> +
>>> +    vtd_switch_address_space_all(s, en);
>>>   }
>> We may need something like notifier here to tell e.g vhost to stop device
>> IOTLB. (Since it's likely this series were applied after device IOTLB
>> patches)
> Yes, I missed vhost case.
>
> To notify vhost, IMO we should be able to use memory listeners just
> like how vfio devices do (please refer to vfio_memory_listener).

Just for switching? This seems an overkill since we don't depends on it 
for all other things. Guest should setup correct mappings by explicitly 
notify device IOTLB. A quick glance at ATS spec, for enabling and 
disabling, looks like it was done through enable bit of ASTctl instead 
of here.

So we are probably ok here :)

> However, I think the bigger issue is we still don't have a dynamic way
> to turn vhost DMAR on/off, right?

The API was ready for this, just issue another set_feature ioctl without 
IOMMU_PLATFORM. (But unfortunately, vhost need a small patch to make 
this work).

>
> If so, we may need to re-touch all the parts to enable the dynamic
> switching of DMA remapping - QEMU vhost, kernel vhost, and virtio on
> the guest side... I start to doubt whether that effort will worth it
> due to this single change, especially if this feature (dynamic on/off
> DMA remapping) won't be used by most VMs (i.e., Linux should only turn
> VT-d on when kernel detects it, and should never turn it off in
> anyway).

For vhost part, the changes should be very minor, probably just:

- a patch to make sure vhost can disable device IOTLB during feature set
- properly handling enabling bit of ATSctl in qemu (probably through an 
notifier)

>
> (However I do think this is an improvement to current VT-d though)
>
> Thanks,
>
> -- peterx
>

+1. We should try to emulate exactly what hardware does to avoid 
breaking all kinds of guest or userspace drivers.

Thanks

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2016-12-22 11:34     ` Jason Wang
@ 2016-12-23  3:26       ` Peter Xu
  2016-12-26  2:45         ` Jason Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Xu @ 2016-12-23  3:26 UTC (permalink / raw)
  To: Jason Wang
  Cc: tianyu.lan, kevin.tian, mst, jan.kiszka, bd.aviv, qemu-devel,
	alex.williamson, david

On Thu, Dec 22, 2016 at 07:34:10PM +0800, Jason Wang wrote:
> 
> 
> On 2016年12月22日 19:04, Peter Xu wrote:
> >On Thu, Dec 22, 2016 at 05:52:58PM +0800, Jason Wang wrote:
> >>
> >>On 2016年12月22日 17:48, Peter Xu wrote:
> >>>  /* Handle Translation Enable/Disable */
> >>>  static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> >>>  {
> >>>+    if (s->dmar_enabled == en) {
> >>>+        return;
> >>>+    }
> >>>+
> >>>      VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
> >>>      if (en) {
> >>>@@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> >>>          /* Ok - report back to driver */
> >>>          vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
> >>>      }
> >>>+
> >>>+    vtd_switch_address_space_all(s, en);
> >>>  }
> >>We may need something like notifier here to tell e.g vhost to stop device
> >>IOTLB. (Since it's likely this series were applied after device IOTLB
> >>patches)
> >Yes, I missed vhost case.
> >
> >To notify vhost, IMO we should be able to use memory listeners just
> >like how vfio devices do (please refer to vfio_memory_listener).
> 
> Just for switching? This seems an overkill since we don't depends on it for
> all other things. Guest should setup correct mappings by explicitly notify
> device IOTLB. A quick glance at ATS spec, for enabling and disabling, looks
> like it was done through enable bit of ASTctl instead of here.
> 
> So we are probably ok here :)

So we need a more general way to notify ATS about this, right? (e.g.,
for future devices that support ATS as well, not only vhost)

> 
> >However, I think the bigger issue is we still don't have a dynamic way
> >to turn vhost DMAR on/off, right?
> 
> The API was ready for this, just issue another set_feature ioctl without
> IOMMU_PLATFORM. (But unfortunately, vhost need a small patch to make this
> work).

That'll be nice. :)

> 
> >
> >If so, we may need to re-touch all the parts to enable the dynamic
> >switching of DMA remapping - QEMU vhost, kernel vhost, and virtio on
> >the guest side... I start to doubt whether that effort will worth it
> >due to this single change, especially if this feature (dynamic on/off
> >DMA remapping) won't be used by most VMs (i.e., Linux should only turn
> >VT-d on when kernel detects it, and should never turn it off in
> >anyway).
> 
> For vhost part, the changes should be very minor, probably just:
> 
> - a patch to make sure vhost can disable device IOTLB during feature set
> - properly handling enabling bit of ATSctl in qemu (probably through an
> notifier)

Do you think I should provide another patch for the domain switch
notification (along with this one)? Or I can just postpone this patch
until DMAR series merged (after all this patch is not an urgent one).

> 
> >
> >(However I do think this is an improvement to current VT-d though)
> >
> >Thanks,
> >
> >-- peterx
> >
> 
> +1. We should try to emulate exactly what hardware does to avoid breaking
> all kinds of guest or userspace drivers.

Yes. Thanks,

-- peterx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2016-12-23  3:26       ` Peter Xu
@ 2016-12-26  2:45         ` Jason Wang
  2017-01-03 21:20           ` Alex Williamson
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Wang @ 2016-12-26  2:45 UTC (permalink / raw)
  To: Peter Xu
  Cc: tianyu.lan, kevin.tian, mst, jan.kiszka, bd.aviv, qemu-devel,
	alex.williamson, david



On 2016年12月23日 11:26, Peter Xu wrote:
> On Thu, Dec 22, 2016 at 07:34:10PM +0800, Jason Wang wrote:
>>
>> On 2016年12月22日 19:04, Peter Xu wrote:
>>> On Thu, Dec 22, 2016 at 05:52:58PM +0800, Jason Wang wrote:
>>>> On 2016年12月22日 17:48, Peter Xu wrote:
>>>>>   /* Handle Translation Enable/Disable */
>>>>>   static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
>>>>>   {
>>>>> +    if (s->dmar_enabled == en) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>>       VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
>>>>>       if (en) {
>>>>> @@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
>>>>>           /* Ok - report back to driver */
>>>>>           vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
>>>>>       }
>>>>> +
>>>>> +    vtd_switch_address_space_all(s, en);
>>>>>   }
>>>> We may need something like notifier here to tell e.g vhost to stop device
>>>> IOTLB. (Since it's likely this series were applied after device IOTLB
>>>> patches)
>>> Yes, I missed vhost case.
>>>
>>> To notify vhost, IMO we should be able to use memory listeners just
>>> like how vfio devices do (please refer to vfio_memory_listener).
>> Just for switching? This seems an overkill since we don't depends on it for
>> all other things. Guest should setup correct mappings by explicitly notify
>> device IOTLB. A quick glance at ATS spec, for enabling and disabling, looks
>> like it was done through enable bit of ASTctl instead of here.
>>
>> So we are probably ok here :)
> So we need a more general way to notify ATS about this, right? (e.g.,
> for future devices that support ATS as well, not only vhost)

Yes, if we want to make ATS visible to guest. But looks like it needs 
more e.g VFIO support for device IOTLB invalidation.

>
>>> However, I think the bigger issue is we still don't have a dynamic way
>>> to turn vhost DMAR on/off, right?
>> The API was ready for this, just issue another set_feature ioctl without
>> IOMMU_PLATFORM. (But unfortunately, vhost need a small patch to make this
>> work).
> That'll be nice. :)
>
>>> If so, we may need to re-touch all the parts to enable the dynamic
>>> switching of DMA remapping - QEMU vhost, kernel vhost, and virtio on
>>> the guest side... I start to doubt whether that effort will worth it
>>> due to this single change, especially if this feature (dynamic on/off
>>> DMA remapping) won't be used by most VMs (i.e., Linux should only turn
>>> VT-d on when kernel detects it, and should never turn it off in
>>> anyway).
>> For vhost part, the changes should be very minor, probably just:
>>
>> - a patch to make sure vhost can disable device IOTLB during feature set
>> - properly handling enabling bit of ATSctl in qemu (probably through an
>> notifier)
> Do you think I should provide another patch for the domain switch
> notification (along with this one)? Or I can just postpone this patch
> until DMAR series merged (after all this patch is not an urgent one).

My understanding is, at least device IOTLB does not care about this. So 
we can leave it as is.

Thanks

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2016-12-26  2:45         ` Jason Wang
@ 2017-01-03 21:20           ` Alex Williamson
  2017-01-05  4:36             ` Tian, Kevin
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2017-01-03 21:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Peter Xu, tianyu.lan, kevin.tian, mst, jan.kiszka, bd.aviv,
	qemu-devel, david

On Mon, 26 Dec 2016 10:45:55 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2016年12月23日 11:26, Peter Xu wrote:
> > On Thu, Dec 22, 2016 at 07:34:10PM +0800, Jason Wang wrote:  
> >>
> >> On 2016年12月22日 19:04, Peter Xu wrote:  
> >>> On Thu, Dec 22, 2016 at 05:52:58PM +0800, Jason Wang wrote:  
> >>>> On 2016年12月22日 17:48, Peter Xu wrote:  
> >>>>>   /* Handle Translation Enable/Disable */
> >>>>>   static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> >>>>>   {
> >>>>> +    if (s->dmar_enabled == en) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>>       VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
> >>>>>       if (en) {
> >>>>> @@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> >>>>>           /* Ok - report back to driver */
> >>>>>           vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
> >>>>>       }
> >>>>> +
> >>>>> +    vtd_switch_address_space_all(s, en);
> >>>>>   }  
> >>>> We may need something like notifier here to tell e.g vhost to stop device
> >>>> IOTLB. (Since it's likely this series were applied after device IOTLB
> >>>> patches)  
> >>> Yes, I missed vhost case.
> >>>
> >>> To notify vhost, IMO we should be able to use memory listeners just
> >>> like how vfio devices do (please refer to vfio_memory_listener).  
> >> Just for switching? This seems an overkill since we don't depends on it for
> >> all other things. Guest should setup correct mappings by explicitly notify
> >> device IOTLB. A quick glance at ATS spec, for enabling and disabling, looks
> >> like it was done through enable bit of ASTctl instead of here.
> >>
> >> So we are probably ok here :)  
> > So we need a more general way to notify ATS about this, right? (e.g.,
> > for future devices that support ATS as well, not only vhost)  
> 
> Yes, if we want to make ATS visible to guest. But looks like it needs 
> more e.g VFIO support for device IOTLB invalidation.

ATS is already visible for vfio-pci devices, it will be enabled if the
host IOMMU supports ATS.  If the host IOMMU does not support ATS,
enabling it on the device by the guest will generate unsupported
requests.  This is yet another case, along with address width, where
it's potentially really complicated and not amenable to hotplug to have
a guest IOMMU.  AFAICT, there's nothing to be added to vfio for device
iotlb invalidations, this occurs automatically between the host IOMMU
and device when a DMA unmap occurs.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2016-12-22  9:48 [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region Peter Xu
  2016-12-22  9:52 ` Jason Wang
@ 2017-01-05  3:30 ` Tian, Kevin
  2017-01-05  3:52   ` Peter Xu
  1 sibling, 1 reply; 11+ messages in thread
From: Tian, Kevin @ 2017-01-05  3:30 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Lan, Tianyu, mst, jan.kiszka, jasowang, alex.williamson, bd.aviv, david

> From: Peter Xu [mailto:peterx@redhat.com]
> Sent: Thursday, December 22, 2016 5:48 PM
> 
> This is preparation work to finally enabled dynamic switching ON/OFF for
> VT-d protection. The old VT-d codes is using static IOMMU region, and
> that won't satisfy vfio-pci device listeners.

Is "IOMMU address space" more accurate than "IOMMU region" here,
based on following description and actual code?

> 
> Let me explain.
> 
> vfio-pci devices depend on the memory region listener and IOMMU replay
> mechanism to make sure the device mapping is coherent with the guest
> even if there are domain switches. And there are two kinds of domain
> switches:
> 
>   (1) switch from domain A -> B
>   (2) switch from domain A -> no domain (e.g., turn DMAR off)
> 
> Case (1) is handled by the context entry invalidation handling by the
> VT-d replay logic. What the replay function should do here is to replay
> the existing page mappings in domain B.
> 
> However for case (2), we don't want to replay any domain mappings - we
> just need the default GPA->HPA mappings (the address_space_memory
> mapping). And this patch helps on case (2) to build up the mapping
> automatically by leveraging the vfio-pci memory listeners.
> 
> Another important thing that this patch does is to seperate
> IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
> depend on the DMAR region (like before this patch). It should be a
> standalone region, and it should be able to be activated without
> DMAR (which is a common behavior of Linux kernel - by default it enables
> IR while disabled DMAR).

Not just because Linux kernel behavior. It's the spec behavior - IR
and DMAR can be enabled/disabled separately. :-)

> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> v3:
> - fix another trivial style issue patchew reported but I missed in v2
> 
> v2:
> - fix issues reported by patchew
> - switch domain by enable/disable memory regions [David]
> - provide vtd_switch_address_space{_all}()
> - provide a better comment on the memory regions
> 
> test done: with intel_iommu device, boot vm with/without
> "intel_iommu=on" parameter.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/i386/intel_iommu.c         | 79
> ++++++++++++++++++++++++++++++++++++++++---
>  hw/i386/trace-events          |  3 ++
>  include/hw/i386/intel_iommu.h |  2 ++
>  3 files changed, 80 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 5f3e351..2710ee774 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -35,6 +35,7 @@
>  #include "sysemu/kvm.h"
>  #include "hw/i386/apic_internal.h"
>  #include "kvm_i386.h"
> +#include "trace.h"
> 
>  /*#define DEBUG_INTEL_IOMMU*/
>  #ifdef DEBUG_INTEL_IOMMU
> @@ -1179,9 +1180,49 @@ static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s)
>      vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS);
>  }
> 
> +static void vtd_switch_address_space(VTDAddressSpace *as, bool iommu_enabled)
> +{
> +    assert(as);
> +
> +    trace_vtd_switch_address_space(pci_bus_num(as->bus),
> +                                   VTD_PCI_SLOT(as->devfn),
> +                                   VTD_PCI_FUNC(as->devfn),
> +                                   iommu_enabled);
> +
> +    /* Turn off first then on the other */
> +    if (iommu_enabled) {
> +        memory_region_set_enabled(&as->sys_alias, false);
> +        memory_region_set_enabled(&as->iommu, true);
> +    } else {
> +        memory_region_set_enabled(&as->iommu, false);
> +        memory_region_set_enabled(&as->sys_alias, true);
> +    }
> +}
> +
> +static void vtd_switch_address_space_all(IntelIOMMUState *s, bool enabled)
> +{
> +    GHashTableIter iter;
> +    VTDBus *vtd_bus;
> +    int i;
> +
> +    g_hash_table_iter_init(&iter, s->vtd_as_by_busptr);
> +    while (g_hash_table_iter_next(&iter, NULL, (void **)&vtd_bus)) {
> +        for (i = 0; i < X86_IOMMU_PCI_DEVFN_MAX; i++) {
> +            if (!vtd_bus->dev_as[i]) {
> +                continue;
> +            }
> +            vtd_switch_address_space(vtd_bus->dev_as[i], enabled);
> +        }
> +    }
> +}
> +
>  /* Handle Translation Enable/Disable */
>  static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
>  {
> +    if (s->dmar_enabled == en) {
> +        return;
> +    }
> +
>      VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
> 
>      if (en) {
> @@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
>          /* Ok - report back to driver */
>          vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
>      }
> +
> +    vtd_switch_address_space_all(s, en);
>  }

A context entry can be configured as pass-through, meaning no addr
translation for DMAs from this device when IOMMU is globally enabled.
There is no translation structure per se, so 'sys_alias" is also required
in such configuration.

> 
>  /* Handle Interrupt Remap Enable/Disable */
> @@ -2343,15 +2386,43 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s,
> PCIBus *bus, int devfn)
>          vtd_dev_as->devfn = (uint8_t)devfn;
>          vtd_dev_as->iommu_state = s;
>          vtd_dev_as->context_cache_entry.context_cache_gen = 0;
> +
> +        /*
> +         * Memory region relationships looks like (Address range shows
> +         * only lower 32 bits to make it short in length...):
> +         *
> +         * |-----------------+-------------------+----------|
> +         * | Name            | Address range     | Priority |
> +         * |-----------------+-------------------+----------+
> +         * | vtd_root        | 00000000-ffffffff |        0 |
> +         * |  intel_iommu    | 00000000-ffffffff |        1 |
> +         * |  vtd_sys_alias  | 00000000-ffffffff |        1 |
> +         * |  intel_iommu_ir | fee00000-feefffff |       64 |
> +         * |-----------------+-------------------+----------|
> +         *
> +         * We enable/disable DMAR by switching enablement for
> +         * vtd_sys_alias and intel_iommu regions. IR region is always
> +         * enabled.
> +         */
>          memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
>                                   &s->iommu_ops, "intel_iommu", UINT64_MAX);
> +        memory_region_init_alias(&vtd_dev_as->sys_alias, OBJECT(s),
> +                                 "vtd_sys_alias", get_system_memory(),
> +                                 0, memory_region_size(get_system_memory()));
>          memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
>                                &vtd_mem_ir_ops, s, "intel_iommu_ir",
>                                VTD_INTERRUPT_ADDR_SIZE);
> -        memory_region_add_subregion(&vtd_dev_as->iommu,
> VTD_INTERRUPT_ADDR_FIRST,
> -                                    &vtd_dev_as->iommu_ir);
> -        address_space_init(&vtd_dev_as->as,
> -                           &vtd_dev_as->iommu, "intel_iommu");
> +        memory_region_init(&vtd_dev_as->root, OBJECT(s),
> +                           "vtd_root", UINT64_MAX);
> +        memory_region_add_subregion_overlap(&vtd_dev_as->root,
> +                                            VTD_INTERRUPT_ADDR_FIRST,
> +                                            &vtd_dev_as->iommu_ir, 64);
> +        address_space_init(&vtd_dev_as->as, &vtd_dev_as->root, "intel_iommu");
> +        memory_region_add_subregion_overlap(&vtd_dev_as->root, 0,
> +                                            &vtd_dev_as->sys_alias, 1);
> +        memory_region_add_subregion_overlap(&vtd_dev_as->root, 0,
> +                                            &vtd_dev_as->iommu, 1);
> +        vtd_switch_address_space(vtd_dev_as, s->dmar_enabled);
>      }
>      return vtd_dev_as;
>  }
> diff --git a/hw/i386/trace-events b/hw/i386/trace-events
> index d2b4973..aee93bb 100644
> --- a/hw/i386/trace-events
> +++ b/hw/i386/trace-events
> @@ -10,6 +10,9 @@ xen_pv_mmio_write(uint64_t addr) "WARNING: write to Xen PV
> Device MMIO space (ad
>  # hw/i386/x86-iommu.c
>  x86_iommu_iec_notify(bool global, uint32_t index, uint32_t mask) "Notify IEC invalidation:
> global=%d index=%" PRIu32 " mask=%" PRIu32
> 
> +# hw/i386/intel_iommu.c
> +vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on)
> "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
> +
>  # hw/i386/amd_iommu.c
>  amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr
> 0x%"PRIx64" +  offset 0x%"PRIx32
>  amdvi_cache_update(uint16_t domid, uint8_t bus, uint8_t slot, uint8_t func, uint64_t gpa,
> uint64_t txaddr) " update iotlb domid 0x%"PRIx16" devid: %02x:%02x.%x gpa
> 0x%"PRIx64" hpa 0x%"PRIx64
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 405c9d1..85c1b9b 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -83,6 +83,8 @@ struct VTDAddressSpace {
>      uint8_t devfn;
>      AddressSpace as;
>      MemoryRegion iommu;
> +    MemoryRegion root;
> +    MemoryRegion sys_alias;
>      MemoryRegion iommu_ir;      /* Interrupt region: 0xfeeXXXXX */
>      IntelIOMMUState *iommu_state;
>      VTDContextCacheEntry context_cache_entry;
> --
> 2.7.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2017-01-05  3:30 ` Tian, Kevin
@ 2017-01-05  3:52   ` Peter Xu
  2017-01-05  4:15     ` Tian, Kevin
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Xu @ 2017-01-05  3:52 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: qemu-devel, Lan, Tianyu, mst, jan.kiszka, jasowang,
	alex.williamson, bd.aviv, david

Hi, Kevin,

On Thu, Jan 05, 2017 at 03:30:28AM +0000, Tian, Kevin wrote:
> > From: Peter Xu [mailto:peterx@redhat.com]
> > Sent: Thursday, December 22, 2016 5:48 PM
> > 
> > This is preparation work to finally enabled dynamic switching ON/OFF for
> > VT-d protection. The old VT-d codes is using static IOMMU region, and
> > that won't satisfy vfio-pci device listeners.
> 
> Is "IOMMU address space" more accurate than "IOMMU region" here,
> based on following description and actual code?

Sounds reasonable. Will fix.

(I have merged this patch with the VT-d vfio enablement series, so
 will change there rather than sending another standalone patch for
 this one)

> 
> > 
> > Let me explain.
> > 
> > vfio-pci devices depend on the memory region listener and IOMMU replay
> > mechanism to make sure the device mapping is coherent with the guest
> > even if there are domain switches. And there are two kinds of domain
> > switches:
> > 
> >   (1) switch from domain A -> B
> >   (2) switch from domain A -> no domain (e.g., turn DMAR off)
> > 
> > Case (1) is handled by the context entry invalidation handling by the
> > VT-d replay logic. What the replay function should do here is to replay
> > the existing page mappings in domain B.
> > 
> > However for case (2), we don't want to replay any domain mappings - we
> > just need the default GPA->HPA mappings (the address_space_memory
> > mapping). And this patch helps on case (2) to build up the mapping
> > automatically by leveraging the vfio-pci memory listeners.
> > 
> > Another important thing that this patch does is to seperate
> > IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
> > depend on the DMAR region (like before this patch). It should be a
> > standalone region, and it should be able to be activated without
> > DMAR (which is a common behavior of Linux kernel - by default it enables
> > IR while disabled DMAR).
> 
> Not just because Linux kernel behavior. It's the spec behavior - IR
> and DMAR can be enabled/disabled separately. :-)

Thanks to point out. I wasn't meant to emphasize on "that's specific
for Linux" - I was trying to say "the old code won't work even with
default Linux parameters". So I see no conflicts. :)

[...]

> >  /* Handle Translation Enable/Disable */
> >  static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> >  {
> > +    if (s->dmar_enabled == en) {
> > +        return;
> > +    }
> > +
> >      VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
> > 
> >      if (en) {
> > @@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> >          /* Ok - report back to driver */
> >          vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
> >      }
> > +
> > +    vtd_switch_address_space_all(s, en);
> >  }
> 
> A context entry can be configured as pass-through, meaning no addr
> translation for DMAs from this device when IOMMU is globally enabled.
> There is no translation structure per se, so 'sys_alias" is also required
> in such configuration.

Right. But current vt-d emulation still doesn't support per-device
pass-through. See vtd_dev_to_context_entry():

    ...
    } else if (ce->lo & VTD_CONTEXT_ENTRY_TT) {
        VTD_DPRINTF(GENERAL, "error: unsupported Translation Type in "
                    "context-entry hi 0x%"PRIx64 " lo 0x%"PRIx64,
                    ce->hi, ce->lo);
        return -VTD_FR_CONTEXT_ENTRY_INV;
    }
    ...

And:

    #define VTD_CONTEXT_ENTRY_TT        (3ULL << 2) /* Translation Type */
    #define VTD_CONTEXT_TT_PASS_THROUGH 2

IMO we can add it when we support device passthrough.

Thanks,

-- peterx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2017-01-05  3:52   ` Peter Xu
@ 2017-01-05  4:15     ` Tian, Kevin
  0 siblings, 0 replies; 11+ messages in thread
From: Tian, Kevin @ 2017-01-05  4:15 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Lan, Tianyu, mst, jan.kiszka, jasowang,
	alex.williamson, bd.aviv, david

> From: Peter Xu [mailto:peterx@redhat.com]
> Sent: Thursday, January 05, 2017 11:53 AM
> 
> > >  /* Handle Translation Enable/Disable */
> > >  static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> > >  {
> > > +    if (s->dmar_enabled == en) {
> > > +        return;
> > > +    }
> > > +
> > >      VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
> > >
> > >      if (en) {
> > > @@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool
> en)
> > >          /* Ok - report back to driver */
> > >          vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
> > >      }
> > > +
> > > +    vtd_switch_address_space_all(s, en);
> > >  }
> >
> > A context entry can be configured as pass-through, meaning no addr
> > translation for DMAs from this device when IOMMU is globally enabled.
> > There is no translation structure per se, so 'sys_alias" is also required
> > in such configuration.
> 
> Right. But current vt-d emulation still doesn't support per-device
> pass-through. See vtd_dev_to_context_entry():
> 
>     ...
>     } else if (ce->lo & VTD_CONTEXT_ENTRY_TT) {
>         VTD_DPRINTF(GENERAL, "error: unsupported Translation Type in "
>                     "context-entry hi 0x%"PRIx64 " lo 0x%"PRIx64,
>                     ce->hi, ce->lo);
>         return -VTD_FR_CONTEXT_ENTRY_INV;
>     }
>     ...
> 
> And:
> 
>     #define VTD_CONTEXT_ENTRY_TT        (3ULL << 2) /* Translation Type */
>     #define VTD_CONTEXT_TT_PASS_THROUGH 2
> 
> IMO we can add it when we support device passthrough.
> 

Then fine with me. :-)

Thanks
Kevin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region
  2017-01-03 21:20           ` Alex Williamson
@ 2017-01-05  4:36             ` Tian, Kevin
  0 siblings, 0 replies; 11+ messages in thread
From: Tian, Kevin @ 2017-01-05  4:36 UTC (permalink / raw)
  To: Alex Williamson, Jason Wang
  Cc: Peter Xu, Lan, Tianyu, mst, jan.kiszka, bd.aviv, qemu-devel, david

> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Wednesday, January 04, 2017 5:21 AM
> 
> On Mon, 26 Dec 2016 10:45:55 +0800
> Jason Wang <jasowang@redhat.com> wrote:
> 
> > On 2016年12月23日 11:26, Peter Xu wrote:
> > > On Thu, Dec 22, 2016 at 07:34:10PM +0800, Jason Wang wrote:
> > >>
> > >> On 2016年12月22日 19:04, Peter Xu wrote:
> > >>> On Thu, Dec 22, 2016 at 05:52:58PM +0800, Jason Wang wrote:
> > >>>> On 2016年12月22日 17:48, Peter Xu wrote:
> > >>>>>   /* Handle Translation Enable/Disable */
> > >>>>>   static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
> > >>>>>   {
> > >>>>> +    if (s->dmar_enabled == en) {
> > >>>>> +        return;
> > >>>>> +    }
> > >>>>> +
> > >>>>>       VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
> > >>>>>       if (en) {
> > >>>>> @@ -1196,6 +1237,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s,
> bool en)
> > >>>>>           /* Ok - report back to driver */
> > >>>>>           vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
> > >>>>>       }
> > >>>>> +
> > >>>>> +    vtd_switch_address_space_all(s, en);
> > >>>>>   }
> > >>>> We may need something like notifier here to tell e.g vhost to stop device
> > >>>> IOTLB. (Since it's likely this series were applied after device IOTLB
> > >>>> patches)
> > >>> Yes, I missed vhost case.
> > >>>
> > >>> To notify vhost, IMO we should be able to use memory listeners just
> > >>> like how vfio devices do (please refer to vfio_memory_listener).
> > >> Just for switching? This seems an overkill since we don't depends on it for
> > >> all other things. Guest should setup correct mappings by explicitly notify
> > >> device IOTLB. A quick glance at ATS spec, for enabling and disabling, looks
> > >> like it was done through enable bit of ASTctl instead of here.
> > >>
> > >> So we are probably ok here :)
> > > So we need a more general way to notify ATS about this, right? (e.g.,
> > > for future devices that support ATS as well, not only vhost)
> >
> > Yes, if we want to make ATS visible to guest. But looks like it needs
> > more e.g VFIO support for device IOTLB invalidation.
> 
> ATS is already visible for vfio-pci devices, it will be enabled if the
> host IOMMU supports ATS.  If the host IOMMU does not support ATS,
> enabling it on the device by the guest will generate unsupported
> requests.  This is yet another case, along with address width, where
> it's potentially really complicated and not amenable to hotplug to have
> a guest IOMMU.  AFAICT, there's nothing to be added to vfio for device
> iotlb invalidations, this occurs automatically between the host IOMMU
> and device when a DMA unmap occurs.  Thanks,
> 

True for current usage - use only 2nd level translation for DMA protection
purpose (IOVA->PA). There is a potential requirement of handling iotlb
invalidation through VFIO in future though - when enabling shared virtual
Memory (SVM) capability in VM where physical IOMMU will be configured
in a nested translation mode, with 1st level translation for GVA->GPA and
the 2nd level translation for GPA->HPA. It's designed in a way that guest 
1st level translation table is directly linked to context entry (all the addresses 
in that table are interpreted as GPA), w/o need of shadowing like for 2nd
level. In such case any IOTLB invalidation related to 1st level translation 
need be forwarded from Qemu vIOMMU to host IOMMU driver, through VFIO.

Not an immediate worry (not for this patch) but a thing we will need think 
later when adding SVM support. 

Thanks
Kevin

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-01-05  4:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-22  9:48 [Qemu-devel] [PATCH v3] intel_iommu: allow dynamic switch of IOMMU region Peter Xu
2016-12-22  9:52 ` Jason Wang
2016-12-22 11:04   ` Peter Xu
2016-12-22 11:34     ` Jason Wang
2016-12-23  3:26       ` Peter Xu
2016-12-26  2:45         ` Jason Wang
2017-01-03 21:20           ` Alex Williamson
2017-01-05  4:36             ` Tian, Kevin
2017-01-05  3:30 ` Tian, Kevin
2017-01-05  3:52   ` Peter Xu
2017-01-05  4:15     ` Tian, Kevin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.