KVM Archive on lore.kernel.org
 help / color / Atom feed
From: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
To: Auger Eric <eric.auger@redhat.com>,
	"eric.auger.pro@gmail.com" <eric.auger.pro@gmail.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"jacob.jun.pan@linux.intel.com" <jacob.jun.pan@linux.intel.com>,
	"yi.l.liu@intel.com" <yi.l.liu@intel.com>,
	"jean-philippe.brucker@arm.com" <jean-philippe.brucker@arm.com>,
	"will.deacon@arm.com" <will.deacon@arm.com>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>
Cc: "kevin.tian@intel.com" <kevin.tian@intel.com>,
	"vincent.stehle@arm.com" <vincent.stehle@arm.com>,
	"ashok.raj@intel.com" <ashok.raj@intel.com>,
	"marc.zyngier@arm.com" <marc.zyngier@arm.com>,
	"tina.zhang@intel.com" <tina.zhang@intel.com>,
	Linuxarm <linuxarm@huawei.com>, "xuwei (O)" <xuwei5@huawei.com>
Subject: RE: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
Date: Tue, 12 Nov 2019 13:06:13 +0000
Message-ID: <76d9dc0274414887b04e11b9b6bda257@huawei.com> (raw)
In-Reply-To: <ebaded3e-8a5c-73dd-b3f7-7533a6e80146@redhat.com>

Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 12 November 2019 11:29
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> eric.auger.pro@gmail.com; iommu@lists.linux-foundation.org;
> linux-kernel@vger.kernel.org; kvm@vger.kernel.org;
> kvmarm@lists.cs.columbia.edu; joro@8bytes.org;
> alex.williamson@redhat.com; jacob.jun.pan@linux.intel.com;
> yi.l.liu@intel.com; jean-philippe.brucker@arm.com; will.deacon@arm.com;
> robin.murphy@arm.com
> Cc: kevin.tian@intel.com; vincent.stehle@arm.com; ashok.raj@intel.com;
> marc.zyngier@arm.com; tina.zhang@intel.com; Linuxarm
> <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> Subject: Re: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> 
> Hi Shameer,
> On 11/12/19 12:08 PM, Shameerali Kolothum Thodi wrote:
> > Hi Eric,
> >
> >> -----Original Message-----
> >> From: kvmarm-bounces@lists.cs.columbia.edu
> >> [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Eric Auger
> >> Sent: 11 July 2019 14:56
> >> To: eric.auger.pro@gmail.com; eric.auger@redhat.com;
> >> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> >> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; joro@8bytes.org;
> >> alex.williamson@redhat.com; jacob.jun.pan@linux.intel.com;
> >> yi.l.liu@intel.com; jean-philippe.brucker@arm.com; will.deacon@arm.com;
> >> robin.murphy@arm.com
> >> Cc: kevin.tian@intel.com; vincent.stehle@arm.com; ashok.raj@intel.com;
> >> marc.zyngier@arm.com; tina.zhang@intel.com
> >> Subject: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> >>
> >> This series brings the VFIO part of HW nested paging support
> >> in the SMMUv3.
> >>
> >> The series depends on:
> >> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
> >> (https://www.spinics.net/lists/kernel/msg3187714.html)
> >>
> >> 3 new IOCTLs are introduced that allow the userspace to
> >> 1) pass the guest stage 1 configuration
> >> 2) pass stage 1 MSI bindings
> >> 3) invalidate stage 1 related caches
> >>
> >> They map onto the related new IOMMU API functions.
> >>
> >> We introduce the capability to register specific interrupt
> >> indexes (see [1]). A new DMA_FAULT interrupt index allows to register
> >> an eventfd to be signaled whenever a stage 1 related fault
> >> is detected at physical level. Also a specific region allows
> >> to expose the fault records to the user space.
> >
> > I am trying to get this running on one of our platform that has smmuv3 dual
> > stage support. I am seeing some issues with this when an ixgbe vf dev is
> > made pass-through and is behind a vSMMUv3 in Guest.
> >
> > Kernel used : https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
> > Qemu: https://github.com/eauger/qemu/tree/v4.1.0-rc0-2stage-rfcv5
> >
> > And this is my Qemu cmd line,
> >
> > ./qemu-system-aarch64
> > -machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 -cpu host \
> > -kernel Image \
> > -drive if=none,file=ubuntu,id=fs \
> > -device virtio-blk-device,drive=fs \
> > -device vfio-pci,host=0000:01:10.1 \
> > -bios QEMU_EFI.fd \
> > -net none \
> > -m 4G \
> > -nographic -D -d -enable-kvm \
> > -append "console=ttyAMA0 root=/dev/vda rw acpi=force"
> >
> > The basic ping from Guest works fine,
> > root@ubuntu:~# ping 10.202.225.185
> > PING 10.202.225.185 (10.202.225.185) 56(84) bytes of data.
> > 64 bytes from 10.202.225.185: icmp_seq=2 ttl=64 time=0.207 ms
> > 64 bytes from 10.202.225.185: icmp_seq=3 ttl=64 time=0.203 ms
> > ...
> >
> > But if I increase ping packet size,
> >
> > root@ubuntu:~# ping -s 1024 10.202.225.185
> > PING 10.202.225.185 (10.202.225.185) 1024(1052) bytes of data.
> > 1032 bytes from 10.202.225.185: icmp_seq=22 ttl=64 time=0.292 ms
> > 1032 bytes from 10.202.225.185: icmp_seq=23 ttl=64 time=0.207 ms
> > From 10.202.225.169 icmp_seq=66 Destination Host Unreachable
> > From 10.202.225.169 icmp_seq=67 Destination Host Unreachable
> > From 10.202.225.169 icmp_seq=68 Destination Host Unreachable
> > From 10.202.225.169 icmp_seq=69 Destination Host Unreachable
> >
> > And from Host kernel I get,
> > [  819.970742] ixgbe 0000:01:00.1 enp1s0f1: 3 Spoofed packets detected
> > [  824.002707] ixgbe 0000:01:00.1 enp1s0f1: 1 Spoofed packets detected
> > [  828.034683] ixgbe 0000:01:00.1 enp1s0f1: 1 Spoofed packets detected
> > [  830.050673] ixgbe 0000:01:00.1 enp1s0f1: 4 Spoofed packets detected
> > [  832.066659] ixgbe 0000:01:00.1 enp1s0f1: 1 Spoofed packets detected
> > [  834.082640] ixgbe 0000:01:00.1 enp1s0f1: 3 Spoofed packets detected
> >
> > Also noted that iperf cannot work as it fails to establish the connection with
> iperf
> > server.
> >
> > Please find attached the trace logs(vfio*, smmuv3*) from Qemu for your
> reference.
> > I haven't debugged this further yet and thought of checking with you if this is
> > something you have seen already or not. Or maybe I am missing something
> here?
> 
> Please can you try to edit and modify hw/vfio/common.c, function
> vfio_iommu_unmap_notify
> 
> 
> /*
>     if (size <= 0x10000) {
>         ustruct.info.cache = IOMMU_CACHE_INV_TYPE_IOTLB;
>         ustruct.info.granularity = IOMMU_INV_GRANU_ADDR;
>         ustruct.info.addr_info.flags = IOMMU_INV_ADDR_FLAGS_ARCHID;
>         if (iotlb->leaf) {
>             ustruct.info.addr_info.flags |=
> IOMMU_INV_ADDR_FLAGS_LEAF;
>         }
>         ustruct.info.addr_info.archid = iotlb->arch_id;
>         ustruct.info.addr_info.addr = start;
>         ustruct.info.addr_info.granule_size = size;
>         ustruct.info.addr_info.nb_granules = 1;
>         trace_vfio_iommu_addr_inv_iotlb(iotlb->arch_id, start, size, 1,
>                                         iotlb->leaf);
>     } else {
> */
>         ustruct.info.cache = IOMMU_CACHE_INV_TYPE_IOTLB;
>         ustruct.info.granularity = IOMMU_INV_GRANU_PASID;
>         ustruct.info.pasid_info.archid = iotlb->arch_id;
>         ustruct.info.pasid_info.flags = IOMMU_INV_PASID_FLAGS_ARCHID;
>         trace_vfio_iommu_asid_inv_iotlb(iotlb->arch_id);
> //    }
> 
> This modification leads to invalidate the whole asid each time we get a
> guest TLBI instead of invalidating the single IOVA (TLBI). On my end, I
> saw this was the cause of such kind of issues. Please let me know if it
> fixes your perf issues

Yes, this seems to fix the issue.

root@ubuntu:~# iperf -c 10.202.225.185
------------------------------------------------------------
Client connecting to 10.202.225.185, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.202.225.169 port 47996 connected with 10.202.225.185 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2.27 GBytes  1.95 Gbits/sec
root@ubuntu:~#

But the performance seems to be very poor as this is a 10Gbps interface(Of course
invalidating the whole asid may not be very helpful). It is interesting that why the
single iova invalidation is not working.

 and then we may discuss further about the test
> configuration.

Sure. Please let me know.

Cheers,
Shameer 

> Thanks
> 
> Eric
> 
> 
> 
> >
> > Please let me know.
> >
> > Thanks,
> > Shameer
> >
> >> Best Regards
> >>
> >> Eric
> >>
> >> This series can be found at:
> >> https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
> >>
> >> It series includes Tina's patch steming from
> >> [1] "[RFC PATCH v2 1/3] vfio: Use capability chains to handle device
> >> specific irq" plus patches originally contributed by Yi.
> >>
> >> History:
> >>
> >> v8 -> v9:
> >> - introduce specific irq framework
> >> - single fault region
> >> - iommu_unregister_device_fault_handler failure case not handled
> >>   yet.
> >>
> >> v7 -> v8:
> >> - rebase on top of v5.2-rc1 and especially
> >>   8be39a1a04c1  iommu/arm-smmu-v3: Add a master->domain pointer
> >> - dynamic alloc of s1_cfg/s2_cfg
> >> - __arm_smmu_tlb_inv_asid/s1_range_nosync
> >> - check there is no HW MSI regions
> >> - asid invalidation using pasid extended struct (change in the uapi)
> >> - add s1_live/s2_live checks
> >> - move check about support of nested stages in domain finalise
> >> - fixes in error reporting according to the discussion with Robin
> >> - reordered the patches to have first iommu/smmuv3 patches and then
> >>   VFIO patches
> >>
> >> v6 -> v7:
> >> - removed device handle from bind/unbind_guest_msi
> >> - added "iommu/smmuv3: Nested mode single MSI doorbell per domain
> >>   enforcement"
> >> - added few uapi comments as suggested by Jean, Jacop and Alex
> >>
> >> v5 -> v6:
> >> - Fix compilation issue when CONFIG_IOMMU_API is unset
> >>
> >> v4 -> v5:
> >> - fix bug reported by Vincent: fault handler unregistration now happens in
> >>   vfio_pci_release
> >> - IOMMU_FAULT_PERM_* moved outside of struct definition + small
> >>   uapi changes suggested by Kean-Philippe (except fetch_addr)
> >> - iommu: introduce device fault report API: removed the PRI part.
> >> - see individual logs for more details
> >> - reset the ste abort flag on detach
> >>
> >> v3 -> v4:
> >> - took into account Alex, jean-Philippe and Robin's comments on v3
> >> - rework of the smmuv3 driver integration
> >> - add tear down ops for msi binding and PASID table binding
> >> - fix S1 fault propagation
> >> - put fault reporting patches at the beginning of the series following
> >>   Jean-Philippe's request
> >> - update of the cache invalidate and fault API uapis
> >> - VFIO fault reporting rework with 2 separate regions and one mmappable
> >>   segment for the fault queue
> >> - moved to PATCH
> >>
> >> v2 -> v3:
> >> - When registering the S1 MSI binding we now store the device handle. This
> >>   addresses Robin's comment about discimination of devices beonging to
> >>   different S1 groups and using different physical MSI doorbells.
> >> - Change the fault reporting API: use VFIO_PCI_DMA_FAULT_IRQ_INDEX to
> >>   set the eventfd and expose the faults through an mmappable fault region
> >>
> >> v1 -> v2:
> >> - Added the fault reporting capability
> >> - asid properly passed on invalidation (fix assignment of multiple
> >>   devices)
> >> - see individual change logs for more info
> >>
> >>
> >> Eric Auger (8):
> >>   vfio: VFIO_IOMMU_SET_MSI_BINDING
> >>   vfio/pci: Add VFIO_REGION_TYPE_NESTED region type
> >>   vfio/pci: Register an iommu fault handler
> >>   vfio/pci: Allow to mmap the fault queue
> >>   vfio: Add new IRQ for DMA fault reporting
> >>   vfio/pci: Add framework for custom interrupt indices
> >>   vfio/pci: Register and allow DMA FAULT IRQ signaling
> >>   vfio: Document nested stage control
> >>
> >> Liu, Yi L (2):
> >>   vfio: VFIO_IOMMU_SET_PASID_TABLE
> >>   vfio: VFIO_IOMMU_CACHE_INVALIDATE
> >>
> >> Tina Zhang (1):
> >>   vfio: Use capability chains to handle device specific irq
> >>
> >>  Documentation/vfio.txt              |  77 ++++++++
> >>  drivers/vfio/pci/vfio_pci.c         | 283
> ++++++++++++++++++++++++++--
> >>  drivers/vfio/pci/vfio_pci_intrs.c   |  62 ++++++
> >>  drivers/vfio/pci/vfio_pci_private.h |  24 +++
> >>  drivers/vfio/pci/vfio_pci_rdwr.c    |  45 +++++
> >>  drivers/vfio/vfio_iommu_type1.c     | 166 ++++++++++++++++
> >>  include/uapi/linux/vfio.h           | 109 ++++++++++-
> >>  7 files changed, 747 insertions(+), 19 deletions(-)
> >>
> >> --
> >> 2.20.1
> >>
> >> _______________________________________________
> >> kvmarm mailing list
> >> kvmarm@lists.cs.columbia.edu
> >> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


  reply index

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-11 13:56 Eric Auger
2019-07-11 13:56 ` [PATCH v9 01/11] vfio: VFIO_IOMMU_SET_PASID_TABLE Eric Auger
2019-07-11 13:56 ` [PATCH v9 02/11] vfio: VFIO_IOMMU_CACHE_INVALIDATE Eric Auger
2019-07-11 13:56 ` [PATCH v9 03/11] vfio: VFIO_IOMMU_SET_MSI_BINDING Eric Auger
2019-07-11 13:56 ` [PATCH v9 04/11] vfio/pci: Add VFIO_REGION_TYPE_NESTED region type Eric Auger
2019-07-11 13:56 ` [PATCH v9 05/11] vfio/pci: Register an iommu fault handler Eric Auger
2019-07-11 13:56 ` [PATCH v9 06/11] vfio/pci: Allow to mmap the fault queue Eric Auger
2019-07-11 13:56 ` [PATCH v9 07/11] vfio: Use capability chains to handle device specific irq Eric Auger
2019-07-11 13:56 ` [PATCH v9 08/11] vfio: Add new IRQ for DMA fault reporting Eric Auger
2019-07-11 13:56 ` [PATCH v9 09/11] vfio/pci: Add framework for custom interrupt indices Eric Auger
2019-07-11 13:56 ` [PATCH v9 10/11] vfio/pci: Register and allow DMA FAULT IRQ signaling Eric Auger
2019-07-11 13:56 ` [PATCH v9 11/11] vfio: Document nested stage control Eric Auger
     [not found] ` <f5b4b97b197d4bab8f3703eba2e966c4@huawei.com>
2019-11-12 11:28   ` [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part) Auger Eric
2019-11-12 13:06     ` Shameerali Kolothum Thodi [this message]
2019-11-12 13:21       ` Auger Eric
2019-11-12 14:21         ` Shameerali Kolothum Thodi
2019-11-12 17:56         ` Shameerali Kolothum Thodi
2019-11-12 20:34           ` Auger Eric
2019-11-13 16:24             ` Shameerali Kolothum Thodi
2019-11-20  8:15 ` Tomasz Nowicki
2019-11-20 10:18   ` Auger Eric
2020-03-03 12:57     ` zhangfei
2020-03-03 13:14       ` Auger Eric

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=76d9dc0274414887b04e11b9b6bda257@huawei.com \
    --to=shameerali.kolothum.thodi@huawei.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=eric.auger.pro@gmail.com \
    --cc=eric.auger@redhat.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=jean-philippe.brucker@arm.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=marc.zyngier@arm.com \
    --cc=robin.murphy@arm.com \
    --cc=tina.zhang@intel.com \
    --cc=vincent.stehle@arm.com \
    --cc=will.deacon@arm.com \
    --cc=xuwei5@huawei.com \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git