From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mout0.freenet.de ([195.4.92.90]:41814 "EHLO mout0.freenet.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753279AbaJVQ3E (ORCPT ); Wed, 22 Oct 2014 12:29:04 -0400
Message-ID: <5447D9D9.9030909@maya.org>
Date: Wed, 22 Oct 2014 18:22:49 +0200
From: Andreas Hartmann
MIME-Version: 1.0
To: Alex Williamson, Andreas Hartmann
CC: Bjorn Helgaas, linux-pci
Subject: Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
References: <20140923210318.498dacbd@dualc.maya.org>
 <1411502866.24563.8.camel@ul30vt.home> <5437A958.3000201@maya.org>
 <5437F1F5.3010706@maya.org> <543804BC.3080307@maya.org>
 <20141011003219.560cca97@dualc.maya.org> <20141010225408.GA24493@google.com>
 <5438CC1E.3060407@maya.org> <1413360267.4202.70.camel@ul30vt.home>
 <54406B34.1050808@maya.org> <1413925580.4202.189.camel@ul30vt.home>
 <1413927152.4202.195.camel@ul30vt.home>
In-Reply-To: <1413927152.4202.195.camel@ul30vt.home>
Content-Type: text/plain; charset=UTF-8
Sender: linux-pci-owner@vger.kernel.org
List-ID:

Alex Williamson wrote:
> On Tue, 2014-10-21 at 15:06 -0600, Alex Williamson wrote:
>> Hi Andreas,
>>
>> On Fri, 2014-10-17 at 03:04 +0200, Andreas Hartmann wrote:
>>> Hello Alex,
>>>
>>> Alex Williamson wrote:
>>>> Hi Andreas,
>>> [...]
>>>> Sorry for the breakage. Is it possible to run lspci on the device in a
>>>> loop from the host and capture whether we're failing to restore some of
>>>> the VC bits to their previous state?
>>>
>>>> Does the problem also occur if you
>>>> unbind from host driver,
>>>
>>> The machine is booted w/ blacklisted ath9k. Then, the device is bound to
>>> vfio:
>>>
>>> echo "168c 0030" > /sys/bus/pci/drivers/vfio-pci/new_id
>>> echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
>>> echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
>>>
>>> afterwards the VM is started -> hang.
>>>
>>> W/o starting the VM, I can bind it to vfio and unbind it from vfio w/o
>>> any problem.
>>>
>>>> echo 1 > reset in pci-sysfs,
>>>
>>> echo 1 > /sys/bus/pci/devices/0000:03:00.0 works w/o any problem while
>>> bound to vfio. Even after unbinding from vfio and rebinding to vfio
>>> again ... .
>>>
>>>> and re-bind to the
>>>
>>> Do you mean loading ath9k in host system after unbinding from vfio? If
>>> yes: Works w/o any problem. It's even possible to reset it or do a
>>> ifconfig wlan0 up, ifconfig wlan0 down, rmmod ath9k, bind it to vfio
>>> again and reset it, ....
>>>
>>> Looks like the hang is only triggered by qemu-system-x86_64 on startup
>>> of the VM.
>
> Also, this might be because QEMU since 1.7 will favor doing a bus reset
> for a device over PM reset while the sysfs reset interface will only do
> a bus reset if there are no other methods available and there are no
> other devices on the bus. Can you reproduce the hang using the sysfs
> reset interface without QEMU if you modify the kernel like this:
>
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3308,15 +3308,15 @@ static int __pci_dev_reset(struct pci_dev *dev, int prob
>  	if (rc != -ENOTTY)
>  		goto done;
>  
> -	rc = pci_pm_reset(dev, probe);
> +	rc = pci_dev_reset_slot_function(dev, probe);
>  	if (rc != -ENOTTY)
>  		goto done;
>  
> -	rc = pci_dev_reset_slot_function(dev, probe);
> +	rc = pci_parent_bus_reset(dev, probe);
>  	if (rc != -ENOTTY)
>  		goto done;
>  
> -	rc = pci_parent_bus_reset(dev, probe);
> +	rc = pci_pm_reset(dev, probe);
>  done:
>  	return rc;
>  }

This way it's crashing with echo 1 > reset, too.

Regards,
Andreas
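
For anyone retracing the steps, the bind-and-reset sequence from this thread could be wrapped roughly as below. This is only a sketch, not something that was run as-is here: the function names and the DRY_RUN, BDF, IDS and SYSFS variables are made up for the example, the device address and IDs are the ones quoted above, and the reset file is assumed to be the per-device "reset" attribute in pci-sysfs. With DRY_RUN=1 (the default) it only prints the commands. Per the report above, the hang with a plain sysfs reset showed up only on the kernel modified as in the quoted diff.

```shell
#!/bin/sh
# Sketch only. Names below (sysfs_write, bind_vfio, reset_device, DRY_RUN)
# are hypothetical helpers for this example, not existing tools.

BDF="${BDF:-0000:03:00.0}"       # device address from the thread
IDS="${IDS:-168c 0030}"          # vendor/device ID from the thread
SYSFS="${SYSFS:-/sys/bus/pci}"
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands, 0 = really write

# Write a value to a sysfs file, or just print the command in dry-run mode.
sysfs_write() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'echo "%s" > %s\n' "$1" "$2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

# Same three steps as in the thread: register the ID with vfio-pci,
# unbind from the current driver, bind to vfio-pci.
# Note: with DRY_RUN=0 the unbind step fails if no driver is bound.
bind_vfio() {
    sysfs_write "$IDS" "$SYSFS/drivers/vfio-pci/new_id"
    sysfs_write "$BDF" "$SYSFS/devices/$BDF/driver/unbind"
    sysfs_write "$BDF" "$SYSFS/drivers/vfio-pci/bind"
}

# Trigger a function reset through pci-sysfs (needs root when DRY_RUN=0).
reset_device() {
    sysfs_write 1 "$SYSFS/devices/$BDF/reset"
}
```

Sourcing the file and calling bind_vfio followed by reset_device prints the four commands; setting DRY_RUN=0 as root would actually perform them, which on an affected kernel may lock up the host.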