From: Matthew Rosato <mjrosato@linux.ibm.com>
To: Niklas Schnelle <schnelle@linux.ibm.com>,
	alex.williamson@redhat.com, cohuck@redhat.com
Cc: pmorel@linux.ibm.com, borntraeger@de.ibm.com, hca@linux.ibm.com,
	gor@linux.ibm.com, gerald.schaefer@linux.ibm.com,
	linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/4] vfio-pci/zdev: Introduce the zPCI I/O vfio region
Date: Wed, 20 Jan 2021 12:10:53 -0500
Message-ID: <e26c2616-69cf-78f3-81db-972a0d461bc9@linux.ibm.com>
In-Reply-To: <90d99da8-02cf-e051-314b-2ab192f8fd57@linux.ibm.com>

On 1/20/21 8:21 AM, Niklas Schnelle wrote:
> 
> 
> On 1/19/21 9:02 PM, Matthew Rosato wrote:
>> Some s390 PCI devices (e.g. ISM) perform I/O operations that have very
>> ... snip ...
>> +
>> +static size_t vfio_pci_zdev_io_rw(struct vfio_pci_device *vdev,
>> +				  char __user *buf, size_t count,
>> +				  loff_t *ppos, bool iswrite)
>> +{
> ... snip ...
>> +	/*
>> +	 * For now, the largest allowed block I/O is advertised as PAGE_SIZE,
>> +	 * and cannot exceed a page boundary - so a single page is enough.  The
>> +	 * guest should have validated this but let's double-check that the
>> +	 * request will not cross a page boundary.
>> +	 */
>> +	if (((region->req.gaddr & ~PAGE_MASK)
>> +			+ region->req.len - 1) & PAGE_MASK) {
>> +		return -EIO;
>> +	}
>> +
>> +	mutex_lock(&zdev->lock);
> 
> I plan on using the zdev->lock to prevent concurrent zPCI device
> removal/configuration state changes between zPCI availability/error
> events and enable_slot()/disable_slot() and /sys/bus/pci/devices/<dev>/recover.
> 
> With that use in place, taking it here causes a deadlock when doing
> "echo 0 > /sys/bus/pci/slots/<fid>/power" from the host for an ISM device
> attached to a guest.
> 
> This is because the (soft) hot unplug will cause vfio to notify QEMU,
> which sends a deconfiguration request to the guest, which then tries to
> gracefully shut down the device.  During that shutdown the device will
> be accessed, running into this code path, which then blocks on the lock
> held by the disable_slot() code, which in turn waits for vfio to
> release the device.
> 

Oof, good catch.  The primary purpose of acquiring the zdev lock here
was to ensure that the region only processes one operation at a time;
when I wrote this initially, the zdev lock seemed like a reasonable
candidate :)
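
FWIW, the serialization I need could just as well come from a mutex
private to the region, without touching zdev->lock at all.  A rough
sketch (the struct, field, and helper names below are illustrative,
not what the patch actually uses):

	/*
	 * Hypothetical: serialize region requests with a lock that is
	 * private to the region, so we never contend with hotplug or
	 * recovery paths that take zdev->lock.
	 */
	struct vfio_zdev_io_data {
		struct vfio_region_zpci_io req;	/* request area */
		struct mutex lock;		/* serializes region users */
	};

	/* in vfio_pci_zdev_io_rw(), with data as the region private data: */
	mutex_lock(&data->lock);	/* was: mutex_lock(&zdev->lock) */
	ret = vfio_pci_zdev_do_io(vdev, data);	/* hypothetical helper */
	mutex_unlock(&data->lock);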

> Alex may correct me if I'm wrong, but I think vfio should instead be
> holding the PCI device lock via pci_device_lock(pdev).
> 

OK, I can have a look at this.  Thanks.
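
If the device lock turns out to be the right tool, a first attempt
might look like the sketch below (untested; this uses device_lock() on
the underlying struct device, and vfio_pci_zdev_do_io() is again a
hypothetical stand-in for the actual request processing).  One thing
to verify: the trace below shows &dev->mutex already held at
device_release_driver(), so this would need checking against the same
hot-unplug scenario:

	struct pci_dev *pdev = vdev->pdev;

	/* Untested: hold the device lock across the region operation. */
	device_lock(&pdev->dev);
	ret = vfio_pci_zdev_do_io(vdev, data);
	device_unlock(&pdev->dev);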


> The annotated trace with my test code looks as follows:
> 
> [  618.025091] Call Trace:
> [  618.025093]  [<00000007c1a139e0>] __schedule+0x360/0x960
> [  618.025104]  [<00000007c1a14760>] schedule_preempt_disabled+0x60/0x100
> [  618.025107]  [<00000007c1a16b48>] __mutex_lock+0x358/0x880
> [  618.025110]  [<00000007c1a170a2>] mutex_lock_nested+0x32/0x40
> [  618.025112]  [<000003ff805a3948>] vfio_pci_zdev_io_rw+0x168/0x310 [vfio_pci]
> [  618.025120]  [<000003ff8059b2b0>] vfio_pci_write+0xd0/0xe0 [vfio_pci]
> [  618.025124]  [<00000007c0fa5392>] __s390x_sys_pwrite64+0x112/0x360
> [  618.025129]  [<00000007c1a0aaf6>] __do_syscall+0x116/0x190
> [  618.025132]  [<00000007c1a1deda>] system_call+0x72/0x98
> [  618.025137] 1 lock held by qemu-system-s39/1315:
> [  618.025139]  #0: 000000008524b4e8 (&zdev->lock){....}-{3:3}, at: vfio_pci_zdev_io_rw+0x168/0x310 [vfio_pci]
> [  618.025151]
>                 Showing all locks held in the system:
> [  618.025166] 1 lock held by khungtaskd/99:
> [  618.025168]  #0: 00000007c1ed4748 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire.constprop.0+0x0/0x210
> [  618.025194] 6 locks held by zsh/1190:
> [  618.025196]  #0: 0000000095fc0488 (sb_writers#3){....}-{0:0}, at: __do_syscall+0x116/0x190
> [  618.025226]  #1: 00000000975bf090 (&of->mutex){....}-{3:3}, at: kernfs_fop_write+0x9a/0x240
> [  618.025236]  #2: 000000008584be78 (kn->active#245){....}-{0:0}, at: kernfs_fop_write+0xa6/0x240
> [  618.025243]  #3: 000000008524b4e8 (&zdev->lock){....}-{3:3}, at: disable_slot+0x32/0x130 <-------------------------------------|
> [  618.025252]  #4: 00000007c1f53468 (pci_rescan_remove_lock){....}-{3:3}, at: pci_stop_and_remove_bus_device_locked+0x26/0x240   |
> [  618.025260]  #5: 0000000085d8a1a0 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x32/0x1d0                              |
> [  618.025271] 1 lock held by qemu-system-s39/1312:                                                                               D
> [  618.025273] 1 lock held by qemu-system-s39/1313:                                                                               E
> [  618.025275]  #0: 00000000d47e80d0 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]                              A
> [  618.025322] 1 lock held by qemu-system-s39/1314:                                                                               D
> [  618.025324]  #0: 00000000d34700d0 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]                              |
> [  618.025345] 1 lock held by qemu-system-s39/1315:                                                                               |
> [  618.025347]  #0: 000000008524b4e8 (&zdev->lock){....}-{3:3}, at: vfio_pci_zdev_io_rw+0x168/0x310 [vfio_pci] <------------------|
> [  618.025355] 1 lock held by qemu-system-s39/1317:
> [  618.025357]  #0: 00000000d34480d0 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]
> [  618.025378] 1 lock held by qemu-system-s39/1318:
> [  618.025380]  #0: 00000000d34380d0 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]
> [  618.025400] 1 lock held by qemu-system-s39/1319:
> [  618.025403]  #0: 00000000d47e8a90 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]
> [  618.025424] 2 locks held by zsh/1391:
> [  618.025426]  #0: 00000000d4a708a0 (&tty->ldisc_sem){....}-{0:0}, at: tty_ldisc_ref_wait+0x34/0x70
> [  618.025435]  #1: 0000038002fc72f0 (&ldata->atomic_read_lock){....}-{3:3}, at: n_tty_read+0xc8/0xa50
> 
> 
>> +
>> +	ret = get_user_pages_fast(region->req.gaddr & PAGE_MASK, 1, 0, &gpage);
>> +	if (ret <= 0) {
>> +		count = -EIO;
> ... snip ...
> 


