* [Qemu-devel] [PATCH v5 0/2] VFIO/SMMUv3: Fail on VFIO/HW nested paging detection
From: Eric Auger @ 2019-08-29  9:01 UTC
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell, pbonzini, alex.williamson
  Cc: aik, peterx

As of today when a guest is assigned with a host PCI device and
an SMMUv3, VFIO calls memory_region_iommu_replay() default
implementation. This translates the whole address range and
completely stalls the execution. As VFIO/SMMUv3 integration
is not supported yet (it requires SMMUv3 HW nested paging), let's
recognize this situation and fail.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v4.1.0_smmu_vfio_fail_v5

History:

v4 -> v5:
- v4 patches: 1, 4, 5 were upstreamed separately
- IOMMU_ATTR_HW_NESTED_PAGING renamed into
  IOMMU_ATTR_NEED_HW_NESTED_PAGING

v3 -> v4:
- see individual patches

v2 -> v3:
- squash IOMMU_ATTR_VFIO_NESTED introduction and SMMUv3 usage
- assert when recognizing VFIO/NESTED case
- collect R-bs

v1 -> v2:
- Added "memory: Remove unused memory_region_iommu_replay_all()" &
  "hw/arm/smmuv3: Log a guest error when decoding an invalid STE"
- do not attempt to implement replay Cb but rather remove the call
  in case it is not needed
- explain why we do not remove other log messages on config decoding

Eric Auger (2):
  memory: Add IOMMU_ATTR_NEED_HW_NESTED_PAGING IOMMU memory region
    attribute
  hw/vfio/common: Fail on VFIO/HW nested paging detection

 hw/arm/smmuv3.c       | 12 ++++++++++++
 hw/vfio/common.c      | 10 ++++++++++
 include/exec/memory.h |  8 +++++++-
 3 files changed, 29 insertions(+), 1 deletion(-)

-- 
2.20.1

* [Qemu-devel] [PATCH v5 1/2] memory: Add IOMMU_ATTR_NEED_HW_NESTED_PAGING IOMMU memory region attribute
From: Eric Auger @ 2019-08-29  9:01 UTC
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell, pbonzini, alex.williamson
  Cc: aik, peterx

We introduce a new IOMMU Memory Region attribute,
IOMMU_ATTR_NEED_HW_NESTED_PAGING that tells whether the virtual
IOMMU relies on physical IOMMU HW nested paging capability
when protecting host assigned devices.

Current Intel virtual IOMMU device supports "Caching
Mode" and does not require 2 stages at physical level to be
integrated with VFIO. However SMMUv3 does not implement such
"caching mode" and requires HW nested paging.

As such SMMUv3 is the first IOMMU device to advertise this
attribute. This new attribute will allow the VFIO code to
specialize its handling.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v4 -> v5:
- patches 1, 4, 5 were upstreamed separately
- s/IOMMU_ATTR_HW_NESTED_PAGING/IOMMU_ATTR_NEED_HW_NESTED_PAGING

v3 -> v4:
- s/IOMMU_ATTR_VFIO_NESTED/IOMMU_ATTR_HW_NESTED_PAGING
- add comments related to the existing attributes
- fix space after the cast
---
 hw/arm/smmuv3.c       | 12 ++++++++++++
 include/exec/memory.h |  8 +++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 2eaf07fb5f..a932bf7136 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1490,6 +1490,17 @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
     }
 }
 
+static int smmuv3_get_attr(IOMMUMemoryRegion *iommu,
+                           enum IOMMUMemoryRegionAttr attr,
+                           void *data)
+{
+    if (attr == IOMMU_ATTR_NEED_HW_NESTED_PAGING) {
+        *(bool *)data = true;
+        return 0;
+    }
+    return -EINVAL;
+}
+
 static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
                                                   void *data)
 {
@@ -1497,6 +1508,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
 
     imrc->translate = smmuv3_translate;
     imrc->notify_flag_changed = smmuv3_notify_flag_changed;
+    imrc->get_attr = smmuv3_get_attr;
 }
 
 static const TypeInfo smmuv3_type_info = {
diff --git a/include/exec/memory.h b/include/exec/memory.h
index fddc2ff48a..61493633fa 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -212,7 +212,13 @@ typedef struct MemoryRegionClass {
 
 
 enum IOMMUMemoryRegionAttr {
-    IOMMU_ATTR_SPAPR_TCE_FD
+    /* Retrieve an integer corresponding to the TCE file descriptor */
+    IOMMU_ATTR_SPAPR_TCE_FD,
+    /*
+     * Retrieve a boolean that indicates whether the virtual IOMMU relies
+     * on physical IOMMU HW nested paging to protect host assigned devices
+     */
+    IOMMU_ATTR_NEED_HW_NESTED_PAGING,
 };
 
 /**
-- 
2.20.1

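The attribute query added by patch 1/2 can be illustrated outside of QEMU with a small stand-alone mock. This is only a sketch: MockIOMMURegion, mock_smmuv3_get_attr() and mock_iommu_get_attr() are hypothetical names that do not exist in QEMU, and the dispatcher's fallback to -EINVAL when a vIOMMU provides no get_attr callback is an assumption about how memory_region_iommu_get_attr() behaves.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Same attribute names as in the patch; everything else is a mock. */
enum IOMMUMemoryRegionAttr {
    IOMMU_ATTR_SPAPR_TCE_FD,
    IOMMU_ATTR_NEED_HW_NESTED_PAGING,
};

/* Hypothetical stand-in for an IOMMU memory region and its class hook. */
typedef struct MockIOMMURegion MockIOMMURegion;
struct MockIOMMURegion {
    int (*get_attr)(MockIOMMURegion *mr,
                    enum IOMMUMemoryRegionAttr attr, void *data);
};

/* Mirrors smmuv3_get_attr(): only the nested-paging attribute is known. */
static int mock_smmuv3_get_attr(MockIOMMURegion *mr,
                                enum IOMMUMemoryRegionAttr attr, void *data)
{
    (void)mr;
    assert(data != NULL);
    if (attr == IOMMU_ATTR_NEED_HW_NESTED_PAGING) {
        *(bool *)data = true;
        return 0;
    }
    return -EINVAL;
}

/*
 * Assumed dispatcher behaviour: a vIOMMU without a get_attr callback
 * answers every query with -EINVAL and leaves *data untouched.
 */
static int mock_iommu_get_attr(MockIOMMURegion *mr,
                               enum IOMMUMemoryRegionAttr attr, void *data)
{
    if (mr->get_attr == NULL) {
        return -EINVAL;
    }
    return mr->get_attr(mr, attr, data);
}
```

A caller has to check the return value before trusting the output: with this mock, a region without the callback never writes to the bool, so only a 0 return makes the value meaningful.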
* [Qemu-devel] [PATCH v5 2/2] hw/vfio/common: Fail on VFIO/HW nested paging detection
From: Eric Auger @ 2019-08-29  9:01 UTC
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell, pbonzini, alex.williamson
  Cc: aik, peterx

As of today, VFIO only works along with vIOMMU supporting
caching mode. The SMMUv3 does not support this mode and
requires HW nested paging to work properly with VFIO.

So any attempt to run a VFIO device protected by such IOMMU
would prevent the assigned device from working and at the
moment the guest does not even boot as the default
memory_region_iommu_replay() implementation attempts to
translate the whole address space and completely stalls
the guest.

So let's fail on that case.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- use IOMMU_ATTR_HW_NESTED_PAGING
- do not abort anymore but jump to fail
---
 hw/vfio/common.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 3e03c495d8..e8c009d019 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -606,9 +606,19 @@ static void vfio_listener_region_add(MemoryListener *listener,
     if (memory_region_is_iommu(section->mr)) {
         VFIOGuestIOMMU *giommu;
         IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
+        bool nested;
         int iommu_idx;
 
         trace_vfio_listener_region_add_iommu(iova, end);
+
+        if (!memory_region_iommu_get_attr(iommu_mr,
+                                          IOMMU_ATTR_NEED_HW_NESTED_PAGING,
+                                          (void *)&nested) && nested) {
+            error_report("VFIO/vIOMMU integration based on HW nested paging "
+                         "is not yet supported");
+            ret = -EINVAL;
+            goto fail;
+        }
         /*
          * FIXME: For VFIO iommu types which have KVM acceleration to
          * avoid bouncing all map/unmaps through qemu this way, this
-- 
2.20.1

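The condition added to vfio_listener_region_add() in patch 2/2 only fails when the attribute query succeeds and reports true. The stand-alone sketch below walks through the three possible outcomes. All names in it (nested_query_fn, the query_* helpers, region_add_check()) are hypothetical and exist only for illustration; they are not QEMU functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical query callback: returns 0 and fills *nested on success,
 * or a negative errno when the vIOMMU does not know the attribute.
 */
typedef int (*nested_query_fn)(bool *nested);

/* vIOMMU that does not implement the attribute at all. */
static int query_unsupported(bool *nested)
{
    (void)nested;
    return -EINVAL;
}

/* vIOMMU that implements the attribute and reports "not needed". */
static int query_not_needed(bool *nested)
{
    *nested = false;
    return 0;
}

/* vIOMMU that, like the SMMUv3 in patch 1/2, reports "needed". */
static int query_needed(bool *nested)
{
    *nested = true;
    return 0;
}

/* Same shape as the check in the patch: reject only on success + true. */
static int region_add_check(nested_query_fn query)
{
    bool nested = false; /* initialised here; the patch relies on && order */

    if (!query(&nested) && nested) {
        fprintf(stderr, "VFIO/vIOMMU integration based on HW nested "
                        "paging is not yet supported\n");
        return -EINVAL;
    }
    return 0;
}
```

Because && short-circuits, the patch never reads the uninitialised bool when the query fails; the sketch initialises it anyway so that the behaviour does not depend on evaluation order being preserved in later edits.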
* Re: [Qemu-devel] [PATCH v5 2/2] hw/vfio/common: Fail on VFIO/HW nested paging detection
From: Alex Williamson @ 2019-08-29 18:14 UTC
  To: Eric Auger
  Cc: peter.maydell, aik, qemu-devel, peterx, qemu-arm, pbonzini, eric.auger.pro

On Thu, 29 Aug 2019 11:01:41 +0200
Eric Auger <eric.auger@redhat.com> wrote:

> As of today, VFIO only works along with vIOMMU supporting
> caching mode. The SMMUv3 does not support this mode and
> requires HW nested paging to work properly with VFIO.
>
> So any attempt to run a VFIO device protected by such IOMMU
> would prevent the assigned device from working and at the
> moment the guest does not even boot as the default
> memory_region_iommu_replay() implementation attempts to
> translate the whole address space and completely stalls
> the guest.

Why doesn't this stall an x86 guest?

I'm a bit confused about what this provides versus the flag_changed
notifier looking for IOMMU_NOTIFIER_MAP, which AIUI is the common
deficiency between VT-d w/o caching-mode and SMMUv3 w/o nested mode.
The iommu notifier is registered prior to calling iommu_replay, so it
seems we already have an opportunity to do something there.  Help me
understand why this is needed.  Thanks,

Alex

>
> So let's fail on that case.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v3 -> v4:
> - use IOMMU_ATTR_HW_NESTED_PAGING
> - do not abort anymore but jump to fail
> ---
>  hw/vfio/common.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 3e03c495d8..e8c009d019 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -606,9 +606,19 @@ static void vfio_listener_region_add(MemoryListener *listener,
>      if (memory_region_is_iommu(section->mr)) {
>          VFIOGuestIOMMU *giommu;
>          IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> +        bool nested;
>          int iommu_idx;
>
>          trace_vfio_listener_region_add_iommu(iova, end);
> +
> +        if (!memory_region_iommu_get_attr(iommu_mr,
> +                                          IOMMU_ATTR_NEED_HW_NESTED_PAGING,
> +                                          (void *)&nested) && nested) {
> +            error_report("VFIO/vIOMMU integration based on HW nested paging "
> +                         "is not yet supported");
> +            ret = -EINVAL;
> +            goto fail;
> +        }
>          /*
>           * FIXME: For VFIO iommu types which have KVM acceleration to
>           * avoid bouncing all map/unmaps through qemu this way, this

* Re: [Qemu-devel] [PATCH v5 2/2] hw/vfio/common: Fail on VFIO/HW nested paging detection
From: Auger Eric @ 2019-08-30  8:06 UTC
  To: Alex Williamson
  Cc: peter.maydell, aik, qemu-devel, peterx, qemu-arm, pbonzini, eric.auger.pro

Hi Alex,

On 8/29/19 8:14 PM, Alex Williamson wrote:
> On Thu, 29 Aug 2019 11:01:41 +0200
> Eric Auger <eric.auger@redhat.com> wrote:
>
>> As of today, VFIO only works along with vIOMMU supporting
>> caching mode. The SMMUv3 does not support this mode and
>> requires HW nested paging to work properly with VFIO.
>>
>> So any attempt to run a VFIO device protected by such IOMMU
>> would prevent the assigned device from working and at the
>> moment the guest does not even boot as the default
>> memory_region_iommu_replay() implementation attempts to
>> translate the whole address space and completely stalls
>> the guest.
>
> Why doesn't this stall an x86 guest?
It does not stall on x86 since intel_iommu implements a custom replay
(see vtd_iommu_replay), so the dummy default one is not executed.
This function performs a full page table walk, scanning all the valid
entries and calling the MAP notifier on those. Although this operation
is tedious, it is nothing compared to the dummy default replay
function, which calls translate() on the whole address range (page by
page).
>
> I'm a bit confused about what this provides versus the flag_changed
> notifier looking for IOMMU_NOTIFIER_MAP, which AIUI is the common
> deficiency between VT-d w/o caching-mode and SMMUv3 w/o nested mode.
> The iommu notifier is registered prior to calling iommu_replay, so it
> seems we already have an opportunity to do something there.  Help me
> understand why this is needed.  Thanks,

At the moment the smmuv3 notify_flag_changed callback implementation
(smmuv3_notify_flag_changed) emits a warning when it detects that a MAP
notifier is being registered:

warn_report("SMMUv3 does not support notification on MAP: "
            "device %s will not function properly", pcidev->name);

and then the replay gets executed, looping forever.

I could exit instead of emitting a warning but the drawback is that on
vfio hotplug, it will also exit whereas we would rather simply reject
the hotplug.

I think the solution based on the IOMMU MR attribute handles both the
static and hotplug cases. Also looking further, I will need this
IOMMU MR attribute for 2stage SMMU integration (see [RFC v5 14/29] vfio:
Force nested if iommu requires it). I know that it has been pending for
a while and it is still hypothetical, but setting up 2stage will require
specific treatments in the vfio common.c code: opting in to the 2stage
mode and registering specific iommu mr notifiers. Using the IOMMU MR
attribute allows me to detect which kind of VFIO/IOMMU integration I
need to set up.

Thanks

Eric
>
> Alex
>
>>
>> So let's fail on that case.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v3 -> v4:
>> - use IOMMU_ATTR_HW_NESTED_PAGING
>> - do not abort anymore but jump to fail
>> ---
>>  hw/vfio/common.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 3e03c495d8..e8c009d019 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -606,9 +606,19 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>      if (memory_region_is_iommu(section->mr)) {
>>          VFIOGuestIOMMU *giommu;
>>          IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
>> +        bool nested;
>>          int iommu_idx;
>>
>>          trace_vfio_listener_region_add_iommu(iova, end);
>> +
>> +        if (!memory_region_iommu_get_attr(iommu_mr,
>> +                                          IOMMU_ATTR_NEED_HW_NESTED_PAGING,
>> +                                          (void *)&nested) && nested) {
>> +            error_report("VFIO/vIOMMU integration based on HW nested paging "
>> +                         "is not yet supported");
>> +            ret = -EINVAL;
>> +            goto fail;
>> +        }
>>          /*
>>           * FIXME: For VFIO iommu types which have KVM acceleration to
>>           * avoid bouncing all map/unmaps through qemu this way, this
>

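The cost difference described in the reply above (a page-by-page translate() over the whole range versus a walk that only descends into populated tables) can be made concrete with a toy model. This is a sketch under stated assumptions, not the vtd_iommu_replay or memory_region_iommu_replay code: the two-level table, the 512-entry fan-out, the 4 KiB page size and every toy_* name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRIES    512u /* entries per table level (assumption) */
#define TOY_PAGE_SHIFT 12u  /* 4 KiB pages (assumption) */

/* Toy two-level page table: a root pointing at optional leaf tables. */
typedef struct {
    bool valid[TOY_ENTRIES];
} ToyLeafTable;

typedef struct {
    const ToyLeafTable *next[TOY_ENTRIES];
} ToyRootTable;

typedef void (*toy_map_notify_fn)(uint64_t iova, void *opaque);

/* translate() calls issued by a replay that visits every page. */
static uint64_t toy_default_replay_calls(unsigned addr_bits,
                                         unsigned page_shift)
{
    assert(addr_bits >= page_shift && addr_bits - page_shift < 64);
    return UINT64_C(1) << (addr_bits - page_shift);
}

/*
 * Walk-based replay: skip unpopulated subtrees, notify on valid leaves.
 * Returns the number of table entries that were examined.
 */
static uint64_t toy_walk_replay(const ToyRootTable *root,
                                toy_map_notify_fn notify, void *opaque)
{
    uint64_t examined = 0;

    for (uint64_t i = 0; i < TOY_ENTRIES; i++) {
        examined++;
        if (root->next[i] == NULL) {
            continue; /* nothing mapped below this root entry */
        }
        for (uint64_t j = 0; j < TOY_ENTRIES; j++) {
            examined++;
            if (root->next[i]->valid[j]) {
                notify((i * TOY_ENTRIES + j) << TOY_PAGE_SHIFT, opaque);
            }
        }
    }
    return examined;
}

/* Helper notifier that counts how many mappings were reported. */
static void toy_count_notify(uint64_t iova, void *opaque)
{
    (void)iova;
    (*(unsigned *)opaque)++;
}
```

With one populated leaf table the walk examines 512 root entries plus 512 leaf entries, whereas a page-by-page pass over an assumed 48-bit range with 4 KiB pages would need 2^36 (about 6.9 x 10^10) translate() calls, which is consistent with the guest appearing to stall.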
* Re: [Qemu-devel] [PATCH v5 2/2] hw/vfio/common: Fail on VFIO/HW nested paging detection
From: Alex Williamson @ 2019-08-30 17:22 UTC
  To: Auger Eric
  Cc: peter.maydell, aik, qemu-devel, peterx, qemu-arm, pbonzini, eric.auger.pro

On Fri, 30 Aug 2019 10:06:56 +0200
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Alex,
>
> On 8/29/19 8:14 PM, Alex Williamson wrote:
> > On Thu, 29 Aug 2019 11:01:41 +0200
> > Eric Auger <eric.auger@redhat.com> wrote:
> >
> >> As of today, VFIO only works along with vIOMMU supporting
> >> caching mode. The SMMUv3 does not support this mode and
> >> requires HW nested paging to work properly with VFIO.
> >>
> >> So any attempt to run a VFIO device protected by such IOMMU
> >> would prevent the assigned device from working and at the
> >> moment the guest does not even boot as the default
> >> memory_region_iommu_replay() implementation attempts to
> >> translate the whole address space and completely stalls
> >> the guest.
> >
> > Why doesn't this stall an x86 guest?
> It does not stall on x86 since intel_iommu implements a custom replay
> (see vtd_iommu_replay), so the dummy default one is not executed.
> This function performs a full page table walk, scanning all the valid
> entries and calling the MAP notifier on those. Although this operation
> is tedious, it is nothing compared to the dummy default replay
> function, which calls translate() on the whole address range (page by
> page).

Ah right.  OTOH, what are the arguments against smmuv3 providing a
replay function?

> > I'm a bit confused about what this provides versus the flag_changed
> > notifier looking for IOMMU_NOTIFIER_MAP, which AIUI is the common
> > deficiency between VT-d w/o caching-mode and SMMUv3 w/o nested mode.
> > The iommu notifier is registered prior to calling iommu_replay, so it
> > seems we already have an opportunity to do something there.  Help me
> > understand why this is needed.  Thanks,
>
> At the moment the smmuv3 notify_flag_changed callback implementation
> (smmuv3_notify_flag_changed) emits a warning when it detects that a MAP
> notifier is being registered:
>
> warn_report("SMMUv3 does not support notification on MAP: "
>             "device %s will not function properly", pcidev->name);
>
> and then the replay gets executed, looping forever.
>
> I could exit instead of emitting a warning but the drawback is that on
> vfio hotplug, it will also exit whereas we would rather simply reject
> the hotplug.

There are solutions to the above by modifying the existing framework
rather than creating a parallel solution though.  For instance, could
memory_region_register_iommu_notifier() reject the notifier if the flag
change is incompatible, allowing the fault to propagate back to vfio
and taking a similar exit path as provided here?

> I think the solution based on the IOMMU MR attribute handles both the
> static and hotplug cases. Also looking further, I will need this
> IOMMU MR attribute for 2stage SMMU integration (see [RFC v5 14/29]
> vfio: Force nested if iommu requires it). I know that it has been
> pending for a while and it is still hypothetical, but setting up 2stage
> will require specific treatments in the vfio common.c code: opting in
> to the 2stage mode and registering specific iommu mr notifiers. Using
> the IOMMU MR attribute allows me to detect which kind of VFIO/IOMMU
> integration I need to set up.

Hmm, I'm certainly more on board with that use case.  I guess the
question is whether the problem statement presented here justifies what
seems to be a parallel solution to what we have today, or could have
with some enhancements.  Thanks,

Alex

> >>
> >> So let's fail on that case.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>
> >> ---
> >>
> >> v3 -> v4:
> >> - use IOMMU_ATTR_HW_NESTED_PAGING
> >> - do not abort anymore but jump to fail
> >> ---
> >>  hw/vfio/common.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index 3e03c495d8..e8c009d019 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -606,9 +606,19 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >>      if (memory_region_is_iommu(section->mr)) {
> >>          VFIOGuestIOMMU *giommu;
> >>          IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> >> +        bool nested;
> >>          int iommu_idx;
> >>
> >>          trace_vfio_listener_region_add_iommu(iova, end);
> >> +
> >> +        if (!memory_region_iommu_get_attr(iommu_mr,
> >> +                                          IOMMU_ATTR_NEED_HW_NESTED_PAGING,
> >> +                                          (void *)&nested) && nested) {
> >> +            error_report("VFIO/vIOMMU integration based on HW nested paging "
> >> +                         "is not yet supported");
> >> +            ret = -EINVAL;
> >> +            goto fail;
> >> +        }
> >>          /*
> >>           * FIXME: For VFIO iommu types which have KVM acceleration to
> >>           * avoid bouncing all map/unmaps through qemu this way, this
> >

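The alternative raised in this reply, letting notifier registration itself reject an incompatible flag change so the error can travel back to the caller, might look roughly like the toy model below. It is purely illustrative: ToyIOMMU, the TOY_NOTIFIER_* flags and both functions are hypothetical, and the int-returning registration call is an assumption made for the sketch rather than the interface discussed in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy notifier flags, loosely modelled on UNMAP/MAP notifications. */
enum {
    TOY_NOTIFIER_NONE  = 0,
    TOY_NOTIFIER_UNMAP = 1,
    TOY_NOTIFIER_MAP   = 2,
};

typedef struct ToyIOMMU ToyIOMMU;
struct ToyIOMMU {
    unsigned flags; /* union of all registered notifier flags */
    /* Hypothetical hook: may veto a flag change with a negative errno. */
    int (*notify_flag_changed)(ToyIOMMU *iommu,
                               unsigned old_flags, unsigned new_flags);
};

/* A vIOMMU that cannot deliver MAP notifications refuses them up front. */
static int toy_smmu_flag_changed(ToyIOMMU *iommu,
                                 unsigned old_flags, unsigned new_flags)
{
    (void)iommu;
    (void)old_flags;
    return (new_flags & TOY_NOTIFIER_MAP) ? -EINVAL : 0;
}

/*
 * Registration propagates the veto to the caller instead of only
 * warning; the flags are committed only when the change is accepted.
 */
static int toy_register_notifier(ToyIOMMU *iommu, unsigned notifier_flags)
{
    unsigned new_flags;
    int ret = 0;

    assert(iommu != NULL);
    new_flags = iommu->flags | notifier_flags;
    if (iommu->notify_flag_changed != NULL && new_flags != iommu->flags) {
        ret = iommu->notify_flag_changed(iommu, iommu->flags, new_flags);
    }
    if (ret == 0) {
        iommu->flags = new_flags;
    }
    return ret;
}
```

In this model a caller such as a hotplug path can turn the negative return into a rejected device rather than a process exit, which is the distinction the two replies are weighing.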