From: Auger Eric <eric.auger@redhat.com>
To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
	"eric.auger.pro@gmail.com" <eric.auger.pro@gmail.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	"mst@redhat.com" <mst@redhat.com>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"jacob.jun.pan@linux.intel.com" <jacob.jun.pan@linux.intel.com>,
	"yi.l.liu@intel.com" <yi.l.liu@intel.com>
Cc: "jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	"tnowicki@marvell.com" <tnowicki@marvell.com>,
	"maz@kernel.org" <maz@kernel.org>,
	"zhangfei.gao@foxmail.com" <zhangfei.gao@foxmail.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"zhangfei.gao@linaro.org" <zhangfei.gao@linaro.org>,
	"bbhushan2@marvell.com" <bbhushan2@marvell.com>,
	"will@kernel.org" <will@kernel.org>
Subject: Re: [RFC v6 00/24] vSMMUv3/pSMMUv3 2 stage VFIO integration
Date: Fri, 3 Apr 2020 12:45:17 +0200
Message-ID: <93dccfc9-774c-9976-15ad-b484f0c5956c@redhat.com>
In-Reply-To: <779801971e964109bc46120dda541078@huawei.com>

Hi Shameer,

On 3/25/20 12:35 PM, Shameerali Kolothum Thodi wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Eric Auger [mailto:eric.auger@redhat.com]
>> Sent: 20 March 2020 16:58
>> To: eric.auger.pro@gmail.com; eric.auger@redhat.com;
>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; peter.maydell@linaro.org;
>> mst@redhat.com; alex.williamson@redhat.com;
>> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com
>> Cc: peterx@redhat.com; jean-philippe@linaro.org; will@kernel.org;
>> tnowicki@marvell.com; Shameerali Kolothum Thodi
>> <shameerali.kolothum.thodi@huawei.com>; zhangfei.gao@foxmail.com;
>> zhangfei.gao@linaro.org; maz@kernel.org; bbhushan2@marvell.com
>> Subject: [RFC v6 00/24] vSMMUv3/pSMMUv3 2 stage VFIO integration
>>
>> Up to now vSMMUv3 has not been integrated with VFIO. VFIO
>> integration requires programming the physical IOMMU consistently
>> with the guest mappings. However, as opposed to VTD, SMMUv3 has
>> no "Caching Mode" which allows easy trapping of guest mappings.
>> This means the vSMMUv3 cannot use the same VFIO integration as VTD.
>>
>> However, SMMUv3 has 2 translation stages. This was devised with the
>> virtualization use case in mind, where stage 1 is "owned" by the
>> guest whereas the host uses stage 2 for VM isolation.
>>
>> This series sets up this nested translation stage. It only works
>> if there is one physical SMMUv3 used along with QEMU vSMMUv3 (in
>> other words, it does not work if there is a physical SMMUv2).
> 
> I was testing this series on one of our hardware boards with SMMUv3. I did
> observe an issue while trying to bring up a guest with and without the vSMMUv3.

I am currently investigating, but so far I have failed to reproduce this on my end.
> 
> Steps are like below,
> 
> 1. start a guest with "iommu=smmuv3" and a n/w vf device.
> 
> 2. Exit the VM.
How do you exit the VM?
> 
> 3. start the guest again without "iommu=smmuv3"
> 
> This time qemu crashes with,
> 
> [ 0.447830] hns3 0000:00:01.0: enabling device (0000 -> 0002)
> /home/shameer/qemu-eric/qemu/hw/vfio/pci.c:2851:vfio_dma_fault_notifier_handler:
> Object 0xaaaaeeb47c00 is not an instance of type
So I think I understand the qemu crash. At the moment vfio-pci
registers a fault handler even if we are not in nested mode. The smmuv3
host driver calls any registered fault handler when it encounters an
error in !nested mode. So the eventfd is triggered to userspace but qemu
does not expect that. However, the root cause is that we got some physical
faults on the second run.
> qemu:iommu-memory-region
> ./qemu_run-vsmmu-hns: line 9: 13609 Aborted                 (core
> dumped) ./qemu-system-aarch64-vsmmuv3v10 -machine
> virt,kernel_irqchip=on,gic-version=3 -cpu host -smp cpus=1 -kernel
> Image-ericv10-uacce -initrd rootfs-iperf.cpio -bios
Just to double check with you:
host: will-arm-smmu-updates-2stage-v10
qemu: v4.2.0-2stage-rfcv6
Which guest version?
> QEMU_EFI_Dec2018.fd -device vfio-pci,host=0000:7d:02.1 -net none -m
Do you assign exactly the same VF as during the 1st run?
> 4096 -nographic -D -d -enable-kvm -append "console=ttyAMA0
> root=/dev/vda -m 4096 rw earlycon=pl011,0x9000000"
> 
> And you can see that host kernel receives smmuv3 C_BAD_STE event,
> 
> [10499.379288] vfio-pci 0000:7d:02.1: enabling device (0000 -> 0002)
> [10501.943881] arm-smmu-v3 arm-smmu-v3.2.auto: event 0x04 received:
> [10501.943884] arm-smmu-v3 arm-smmu-v3.2.auto: 0x00007d1100000004
> [10501.943886] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000100800000080
> [10501.943887] arm-smmu-v3 arm-smmu-v3.2.auto: 0x00000000fe040000
> [10501.943889] arm-smmu-v3 arm-smmu-v3.2.auto: 0x000000007e04c440
I will try to prepare a kernel branch with additional traces.
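
For reference, the first 64-bit word of the event record quoted above can be decoded by hand. The following is a minimal sketch, not code from the kernel or from QEMU: the helper name `decode_evt_word0` is hypothetical, the field layout (event type in bits [7:0], StreamID in bits [63:32]) is taken from the SMMUv3 event record format, and it assumes that on this platform the StreamID equals the PCI requester ID, which is board specific.

```python
# Hypothetical helper (not part of the kernel or QEMU): decode the first
# 64-bit word of an SMMUv3 event record as printed in the host log.

# Only the event type named in this thread is listed here.
EVENT_NAMES = {0x04: "C_BAD_STE"}


def decode_evt_word0(word0: int) -> dict:
    """Split word 0 of an event record into event type and StreamID."""
    event_id = word0 & 0xFF                   # bits [7:0]: event type
    stream_id = (word0 >> 32) & 0xFFFFFFFF    # bits [63:32]: StreamID

    # Assumption: StreamID == PCI requester ID (bus[15:8], dev[7:3], fn[2:0]).
    bus = (stream_id >> 8) & 0xFF
    dev = (stream_id >> 3) & 0x1F
    fn = stream_id & 0x7

    return {
        "event_id": event_id,
        "event_name": EVENT_NAMES.get(event_id, "unknown"),
        "stream_id": stream_id,
        "bdf": f"{bus:02x}:{dev:02x}.{fn}",
    }


if __name__ == "__main__":
    # First word of the record reported by arm-smmu-v3.2.auto above.
    print(decode_evt_word0(0x00007D1100000004))
```

Under that assumption, the record decodes to event 0x04 with StreamID 0x7d11, that is 7d:02.1, which matches the VF passed with -device vfio-pci,host=0000:7d:02.1.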

Thanks

Eric
> 
> So I suspect we didn't clear the nested stage configuration and that affects the
> translation in the second run. I tried to issue (force) a vfio_detach_pasid_table() but
> that didn't solve the problem.
> 
> Maybe I am missing something. Could you please take a look and let me know?
> 
> Thanks,
> Shameer
> 
>> - We force the host to use stage 2 instead of stage 1, when we
>>   detect a vSMMUv3 is behind a VFIO device. For a VFIO device
>>   without any virtual IOMMU, we still use stage 1 as many existing
>>   SMMUs expect this behavior.
>> - We use PCIPASIDOps to propagate guest stage 1 config changes on
>>   STE (Stream Table Entry) changes.
>> - We implement a specific UNMAP notifier that conveys guest
>>   IOTLB invalidations to the host.
>> - We register MSI IOVA/GPA bindings to the host so that the latter
>>   can build a nested stage translation.
>> - As the legacy MAP notifier is not called anymore, we must make
>>   sure stage 2 mappings are set. This is achieved through another
>>   prereg memory listener.
>> - Physical SMMU stage 1 related faults are reported to the guest
>>   via an eventfd mechanism and exposed through a dedicated VFIO-PCI
>>   region. Then they are reinjected into the guest.
>>
>> Best Regards
>>
>> Eric
>>
>> This series can be found at:
>> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
>>
>> Kernel Dependencies:
>> [1] [PATCH v10 00/11] SMMUv3 Nested Stage Setup (VFIO part)
>> [2] [PATCH v10 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
>> branch at:
>> https://github.com/eauger/linux/tree/will-arm-smmu-updates-2stage-v10
>>
>> History:
>>
>> v5 -> v6:
>> - just rebase work
>>
>> v4 -> v5:
>> - Use PCIPASIDOps for config update notifications
>> - removal of notification for MSI binding which is not needed
>>   anymore
>> - Use a single fault region
>> - use the specific interrupt index
>>
>> v3 -> v4:
>> - adapt to changes in uapi (asid cache invalidation)
>> - check VFIO_PCI_DMA_FAULT_IRQ_INDEX is supported at kernel level
>>   before attempting to set signaling for it.
>> - sync on 5.2-rc1 kernel headers + Drew's patch that imports sve_context.h
>> - fix MSI binding for MSI (not MSIX)
>> - fix mingw compilation
>>
>> v2 -> v3:
>> - rework fault handling
>> - MSI binding registration done in vfio-pci. MSI binding tear down called
>>   on container cleanup path
>> - leaf parameter propagated
>>
>> v1 -> v2:
>> - Fixed dual assignment (asid now correctly propagated on TLB invalidations)
>> - Integrated fault reporting
>>
>>
>> Eric Auger (23):
>>   update-linux-headers: Import iommu.h
>>   header update against 5.6.0-rc3 and IOMMU/VFIO nested stage APIs
>>   memory: Add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute
>>   memory: Add IOMMU_ATTR_MSI_TRANSLATE IOMMU memory region attribute
>>   memory: Introduce IOMMU Memory Region inject_faults API
>>   memory: Add arch_id and leaf fields in IOTLBEntry
>>   iommu: Introduce generic header
>>   vfio: Force nested if iommu requires it
>>   vfio: Introduce hostwin_from_range helper
>>   vfio: Introduce helpers to DMA map/unmap a RAM section
>>   vfio: Set up nested stage mappings
>>   vfio: Pass stage 1 MSI bindings to the host
>>   vfio: Helper to get IRQ info including capabilities
>>   vfio/pci: Register handler for iommu fault
>>   vfio/pci: Set up the DMA FAULT region
>>   vfio/pci: Implement the DMA fault handler
>>   hw/arm/smmuv3: Advertise MSI_TRANSLATE attribute
>>   hw/arm/smmuv3: Store the PASID table GPA in the translation config
>>   hw/arm/smmuv3: Fill the IOTLBEntry arch_id on NH_VA invalidation
>>   hw/arm/smmuv3: Fill the IOTLBEntry leaf field on NH_VA invalidation
>>   hw/arm/smmuv3: Pass stage 1 configurations to the host
>>   hw/arm/smmuv3: Implement fault injection
>>   hw/arm/smmuv3: Allow MAP notifiers
>>
>> Liu Yi L (1):
>>   pci: introduce PCIPASIDOps to PCIDevice
>>
>>  hw/arm/smmuv3.c                 | 189 ++++++++++--
>>  hw/arm/trace-events             |   3 +-
>>  hw/pci/pci.c                    |  34 +++
>>  hw/vfio/common.c                | 506 +++++++++++++++++++++++++-------
>>  hw/vfio/pci.c                   | 267 ++++++++++++++++-
>>  hw/vfio/pci.h                   |   9 +
>>  hw/vfio/trace-events            |   9 +-
>>  include/exec/memory.h           |  49 +++-
>>  include/hw/arm/smmu-common.h    |   1 +
>>  include/hw/iommu/iommu.h        |  28 ++
>>  include/hw/pci/pci.h            |  11 +
>>  include/hw/vfio/vfio-common.h   |  16 +
>>  linux-headers/COPYING           |   2 +
>>  linux-headers/asm-x86/kvm.h     |   1 +
>>  linux-headers/linux/iommu.h     | 375 +++++++++++++++++++++++
>>  linux-headers/linux/vfio.h      | 109 ++++++-
>>  memory.c                        |  10 +
>>  scripts/update-linux-headers.sh |   2 +-
>>  18 files changed, 1478 insertions(+), 143 deletions(-)
>>  create mode 100644 include/hw/iommu/iommu.h
>>  create mode 100644 linux-headers/linux/iommu.h
>>
>> --
>> 2.20.1
> 


