From: Alex Williamson <alex.williamson@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, tianyu.lan@intel.com,
	kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com,
	jasowang@redhat.com, David Gibson <david@gibson.dropbear.id.au>,
	bd.aviv@gmail.com
Subject: Re: [Qemu-devel] [PATCH v7 00/17] VT-d: vfio enablement and misc enhances
Date: Mon, 20 Feb 2017 12:15:33 -0700
Message-ID: <20170220121533.6914307b@t450s.home>
In-Reply-To: <20170220074731.GD12693@pxdev.xzpeter.org>

On Mon, 20 Feb 2017 15:47:31 +0800
Peter Xu <peterx@redhat.com> wrote:

> On Fri, Feb 17, 2017 at 10:18:35AM -0700, Alex Williamson wrote:
> > On Tue,  7 Feb 2017 16:28:02 +0800
> > Peter Xu <peterx@redhat.com> wrote:
> >   
> > > This is v7 of vt-d vfio enablement series.  
> > [snip]  
> > > =========
> > > Test Done
> > > =========
> > > 
> > > Build test passed for x86_64/arm/ppc64.
> > > 
> > > Simply tested with x86_64, assigning two PCI devices to a single VM,
> > > boot the VM using:
> > > 
> > > bin=x86_64-softmmu/qemu-system-x86_64
> > > $bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \
> > >      -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > >      -netdev user,id=net0,hostfwd=tcp::5555-:22 \
> > >      -device virtio-net-pci,netdev=net0 \
> > >      -device vfio-pci,host=03:00.0 \
> > >      -device vfio-pci,host=02:00.0 \
> > >      -trace events=".trace.vfio" \
> > >      /var/lib/libvirt/images/vm1.qcow2
> > > 
> > > pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio
> > > vtd_page_walk*
> > > vtd_replay*
> > > vtd_inv_desc*
> > > 
> > > Then, in the guest, run the following tool:
> > > 
> > >   https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c
> > > 
> > > With parameter:
> > > 
> > >   ./vfio-bind-group 00:03.0 00:04.0
> > > 
> > > Check host side trace log, I can see pages are replayed and mapped in
> > > 00:04.0 device address space, like:
> > > 
> > > ...
> > > vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x401 lo 0x38fe1001
> > > vtd_page_walk Page walk for ce (0x401, 0x38fe1001) iova range 0x0 - 0x8000000000
> > > vtd_page_walk_level Page walk (base=0x38fe1000, level=3) iova range 0x0 - 0x8000000000
> > > vtd_page_walk_level Page walk (base=0x35d31000, level=2) iova range 0x0 - 0x40000000
> > > vtd_page_walk_level Page walk (base=0x34979000, level=1) iova range 0x0 - 0x200000
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x22dc3000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 0x22e25000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 0x22e12000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 0x22e2d000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 0x12a49000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 0x129bb000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 0x128db000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 0x12a80000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 0x12a7e000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 0x12b22000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 0x12b41000 mask 0xfff perm 3
> > > ...  
> > 
> > Hi Peter,
> > 
> > I'm trying to make use of this, with your vtd-vfio-enablement-v7 branch
> > (HEAD 0c1c4e738095).  I'm assigning an 82576 PF to a VM.  It works with
> > iommu=pt, but if I remove that option, the device does not work and
> > vfio_iommu_map_notify is never called.  Any suggestions?  My
> > commandline is below.  Thanks,
> > 
> > Alex
> > 
> > /usr/local/bin/qemu-system-x86_64 \
> >         -name guest=l1,debug-threads=on -S \
> >         -machine pc-q35-2.9,accel=kvm,usb=off,dump-guest-core=off,kernel-irqchip=split \
> >         -cpu host -m 10240 -realtime mlock=off -smp 4,sockets=1,cores=2,threads=2 \
> >         -no-user-config -nodefaults -monitor stdio -rtc base=utc,driftfix=slew \
> >         -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
> >         -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 \
> >         -boot strict=on \
> >         -device ioh3420,port=0x10,chassis=1,id=pci.1,bus=pcie.0,addr=0x2 \
> >         -device i82801b11-bridge,id=pci.2,bus=pcie.0,addr=0x1e \
> >         -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2,addr=0x0 \
> >         -device ioh3420,port=0x18,chassis=4,id=pci.4,bus=pcie.0,addr=0x3 \
> >         -device ioh3420,port=0x20,chassis=5,id=pci.5,bus=pcie.0,addr=0x4 \
> >         -device ioh3420,port=0x28,chassis=6,id=pci.6,bus=pcie.0,addr=0x5 \
> >         -device ioh3420,port=0x30,chassis=7,id=pci.7,bus=pcie.0,addr=0x6 \
> >         -device ioh3420,port=0x38,chassis=8,id=pci.8,bus=pcie.0,addr=0x7 \
> >         -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 \
> >         -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d \
> >         -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 \
> >         -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 \
> >         -device virtio-serial-pci,id=virtio-serial0,bus=pci.4,addr=0x0 \
> >         -drive file=/dev/vg_s20/lv_l1,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native \
> >         -device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> >         -netdev user,id=hostnet0 \
> >         -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:c2:62:30,bus=pci.1,addr=0x0 \
> >         -device usb-tablet,id=input0,bus=usb.0,port=1 \
> >         -vnc :0 -vga std \
> >         -device vfio-pci,host=01:00.0,id=hostdev0,bus=pci.8,addr=0x0 \
> >         -device intel-iommu,intremap=on,eim=off,caching-mode=on -trace events=/trace-events.txt -msg timestamp=on  
> 
> Alex,
> 
> Thanks for testing this series.
> 
> I think I reproduced it using my 10g nic as well. What I got is:
> 
> [   23.724787] ixgbe 0000:01:00.0 enp1s0: Detected Tx Unit Hang
> [   23.724787]   Tx Queue             <0>
> [   23.724787]   TDH, TDT             <0>, <1>
> [   23.724787]   next_to_use          <1>
> [   23.724787]   next_to_clean        <0>
> [   23.724787] tx_buffer_info[next_to_clean]
> [   23.724787]   time_stamp           <fffbb8bb>
> [   23.724787]   jiffies              <fffbc780>
> [   23.729580] ixgbe 0000:01:00.0 enp1s0: tx hang 1 detected on queue 0, resetting adapter
> [   23.730752] ixgbe 0000:01:00.0 enp1s0: initiating reset due to tx timeout
> [   23.731768] ixgbe 0000:01:00.0 enp1s0: Reset adapter
> 
> Is this the problem you have encountered? (adapter continuously reset)
> 
> Interestingly, I found that the problem solves itself after I move the
> "-device intel-iommu,..." line before all the other devices.
> 
> Put another way, here is a much shorter reproducer that triggers the bug:
> 
> $qemu   -machine q35,accel=kvm,kernel-irqchip=split \
>         -cpu host -smp 4 -m 2048 \
>         -nographic -nodefaults -serial stdio \
>         -device vfio-pci,host=05:00.0,bus=pci.1 \
>         -device intel-iommu,intremap=on,eim=off,caching-mode=on \
>         /images/fedora-25.qcow2
> 
> While this may possibly be okay at least on my host (switching the
> order of the two devices):
> 
> $qemu   -machine q35,accel=kvm,kernel-irqchip=split \
>         -cpu host -smp 4 -m 2048 \
>         -nographic -nodefaults -serial stdio \
>         -device intel-iommu,intremap=on,eim=off,caching-mode=on \
>         -device vfio-pci,host=05:00.0,bus=pci.1 \
>         /images/fedora-25.qcow2
> 
> I am not sure how the realization order of these two devices
> (intel-iommu, vfio-pci) affects the behavior. One thing I suspect is
> that in vfio_realize(), we have:
> 
>   group = vfio_get_group(groupid, pci_device_iommu_address_space(pdev), errp);
> 
> and at that point we are possibly getting &address_space_memory
> instead of the correct DMA address space, since the Intel IOMMU device
> has not yet been initialized...
> 
> Before I go deeper, any thoughts?


Sound theory, and it seems to be confirmed by Yi.  That makes it pretty
much impossible to test using libvirt <qemu:arg> support, which is how I
derived my VM commandline.  Thanks,

Alex
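
For anyone following along, here is a minimal standalone sketch of the
ordering theory quoted above.  It is a toy model, not QEMU code: the
types and the names vtd_find_as(), iommu_realize(), passthrough_realize()
and ordering_demo() are made up for illustration.  The one behavior it
assumes from QEMU is that the address space lookup falls back to system
memory when no IOMMU hook has been registered on the bus yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU types; not the real definitions. */
typedef struct AddressSpace { const char *name; } AddressSpace;

static AddressSpace address_space_memory = { "memory" };
static AddressSpace vtd_dma_as = { "vtd-dma" };

/* Hook an emulated IOMMU would register on the bus. */
typedef AddressSpace *(*IOMMUFn)(void *opaque, int devfn);

typedef struct Bus {
    IOMMUFn iommu_fn;      /* NULL until an IOMMU registers itself */
    void *iommu_opaque;
} Bus;

typedef struct Device {
    Bus *bus;
    int devfn;
    AddressSpace *dma_as;  /* cached once, at realize time */
} Device;

/* Toy model of the lookup: use the hook if present, else system memory. */
static AddressSpace *device_iommu_address_space(const Device *dev)
{
    if (dev->bus->iommu_fn) {
        return dev->bus->iommu_fn(dev->bus->iommu_opaque, dev->devfn);
    }
    return &address_space_memory;
}

/* Illustrative hook: every device gets the same translated space. */
static AddressSpace *vtd_find_as(void *opaque, int devfn)
{
    (void)opaque;
    (void)devfn;
    return &vtd_dma_as;
}

/* Realizing the IOMMU is what installs the hook on the bus. */
static void iommu_realize(Bus *bus)
{
    bus->iommu_fn = vtd_find_as;
    bus->iommu_opaque = NULL;
}

/* The assigned device samples its address space once and keeps it. */
static void passthrough_realize(Device *dev)
{
    dev->dma_as = device_iommu_address_space(dev);
}

/* Runs both orderings; returns 1 if they behave as the theory predicts. */
static int ordering_demo(void)
{
    Bus bad_bus = { NULL, NULL };
    Device bad_dev = { &bad_bus, 0, NULL };
    passthrough_realize(&bad_dev);   /* device first ...            */
    iommu_realize(&bad_bus);         /* ... IOMMU shows up too late */
    assert(bad_dev.dma_as == &address_space_memory);

    Bus good_bus = { NULL, NULL };
    Device good_dev = { &good_bus, 0, NULL };
    iommu_realize(&good_bus);        /* IOMMU first */
    passthrough_realize(&good_dev);
    assert(good_dev.dma_as == &vtd_dma_as);

    return 1;
}
```

In this model the device that is realized first keeps pointing at system
memory even after the IOMMU appears, so nothing would ever notify it of
guest IOMMU mappings.  That would be consistent with
vfio_iommu_map_notify never being called, but it is only a sketch of the
theory, not a confirmation of where the real code goes wrong.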
