kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Niklas Schnelle <schnelle@linux.ibm.com>
To: Matthew Rosato <mjrosato@linux.ibm.com>,
	alex.williamson@redhat.com, cohuck@redhat.com
Cc: pmorel@linux.ibm.com, borntraeger@de.ibm.com, hca@linux.ibm.com,
	gor@linux.ibm.com, gerald.schaefer@linux.ibm.com,
	linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/4] vfio-pci/zdev: Introduce the zPCI I/O vfio region
Date: Wed, 20 Jan 2021 18:28:22 +0100	[thread overview]
Message-ID: <bf7728b6-3c6d-5b71-8a4d-fc55b0013d9a@linux.ibm.com> (raw)
In-Reply-To: <e26c2616-69cf-78f3-81db-972a0d461bc9@linux.ibm.com>



On 1/20/21 6:10 PM, Matthew Rosato wrote:
> On 1/20/21 8:21 AM, Niklas Schnelle wrote:
>>
>>
>> On 1/19/21 9:02 PM, Matthew Rosato wrote:
>>> Some s390 PCI devices (e.g. ISM) perform I/O operations that have very
>> .. snip ...
>>> +
>>> +static size_t vfio_pci_zdev_io_rw(struct vfio_pci_device *vdev,
>>> +                  char __user *buf, size_t count,
>>> +                  loff_t *ppos, bool iswrite)
>>> +{
>> ... snip ...
>>> +    /*
>>> +     * For now, the largest allowed block I/O is advertised as PAGE_SIZE,
>>> +     * and cannot exceed a page boundary - so a single page is enough.  The
>>> +     * guest should have validated this but let's double-check that the
>>> +     * request will not cross a page boundary.
>>> +     */
>>> +    if (((region->req.gaddr & ~PAGE_MASK)
>>> +            + region->req.len - 1) & PAGE_MASK) {
>>> +        return -EIO;
>>> +    }
>>> +
>>> +    mutex_lock(&zdev->lock);
>>
>> I plan on using the zdev->lock for preventing concurrent zPCI devices
>> removal/configuration state changes between zPCI availability/error
>> events and enable_slot()/disable_slot() and /sys/bus/pci/devices/<dev>/recover.
>>
>> With that use in place using it here causes a deadlock when doing
>> "echo 0 > /sys/bus/pci/slots/<fid>/power from the host for an ISM device
>> attached to a guest.
>>
>> This is because the (soft) hot unplug will cause vfio to notify QEMU, which
>> sends a deconfiguration request to the guest, which then tries to
>> gracefully shutdown the device. During that shutdown the device will
>> be accessed, running into this code path which then blocks on
>> the lock held by the disable_slot() code which waits on vfio
>> releasing the device.
>>
> 
> Oof, good catch.  The primary purpose of acquiring the zdev lock here was to ensure that the region is only being used to process one operation at a time and at the time I wrote this initially the zdev lock seemed like a reasonable candidate :)

Oh ok, I thought it was to guard against removal. What's the reason
we can't have multiple concurrent users of the region though?
The PCISTB/PCILG should be able to be executed concurrently but
I'm probably missing something.

> 
>> Alex may correct me if I'm wrong but I think instead vfio should
>> be holding the PCI device lock via pci_device_lock(pdev).
>>
> 
> OK, I can have a look at this.  Thanks.

Hmm, ok that idea was based on my assumption that it guards
against removal. My bad, of course without my patch it wouldn't
and this came before..

> 
> 
>> The annotated trace with my test code looks as follows:
>>
>> [  618.025091] Call Trace:
>> [  618.025093]  [<00000007c1a139e0>] __schedule+0x360/0x960
>> [  618.025104]  [<00000007c1a14760>] schedule_preempt_disabled+0x60/0x100
>> [  618.025107]  [<00000007c1a16b48>] __mutex_lock+0x358/0x880
>> [  618.025110]  [<00000007c1a170a2>] mutex_lock_nested+0x32/0x40
>> [  618.025112]  [<000003ff805a3948>] vfio_pci_zdev_io_rw+0x168/0x310 [vfio_pci]
>> [  618.025120]  [<000003ff8059b2b0>] vfio_pci_write+0xd0/0xe0 [vfio_pci]
>> [  618.025124]  [<00000007c0fa5392>] __s390x_sys_pwrite64+0x112/0x360
>> [  618.025129]  [<00000007c1a0aaf6>] __do_syscall+0x116/0x190
>> [  618.025132]  [<00000007c1a1deda>] system_call+0x72/0x98
>> [  618.025137] 1 lock held by qemu-system-s39/1315:
>> [  618.025139]  #0: 000000008524b4e8 (&zdev->lock){....}-{3:3}, at: vfio_pci_zdev_io_rw+0x168/0x310 [vfio_pci]
>> [  618.025151]
>>                 Showing all locks held in the system:
>> [  618.025166] 1 lock held by khungtaskd/99:
>> [  618.025168]  #0: 00000007c1ed4748 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire.constprop.0+0x0/0x210
>> [  618.025194] 6 locks held by zsh/1190:
>> [  618.025196]  #0: 0000000095fc0488 (sb_writers#3){....}-{0:0}, at: __do_syscall+0x116/0x190
>> [  618.025226]  #1: 00000000975bf090 (&of->mutex){....}-{3:3}, at: kernfs_fop_write+0x9a/0x240
>> [  618.025236]  #2: 000000008584be78 (kn->active#245){....}-{0:0}, at: kernfs_fop_write+0xa6/0x240
>> [  618.025243]  #3: 000000008524b4e8 (&zdev->lock){....}-{3:3}, at: disable_slot+0x32/0x130 <-------------------------------------|
>> [  618.025252]  #4: 00000007c1f53468 (pci_rescan_remove_lock){....}-{3:3}, at: pci_stop_and_remove_bus_device_locked+0x26/0x240   |
>> [  618.025260]  #5: 0000000085d8a1a0 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x32/0x1d0                              |
>> [  618.025271] 1 lock held by qemu-system-s39/1312:                                                                               D
>> [  618.025273] 1 lock held by qemu-system-s39/1313:                                                                               E
>> [  618.025275]  #0: 00000000d47e80d0 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]                              A
>> [  618.025322] 1 lock held by qemu-system-s39/1314:                                                                               D
>> [  618.025324]  #0: 00000000d34700d0 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]                              |
>> [  618.025345] 1 lock held by qemu-system-s39/1315:                                                                               |
>> [  618.025347]  #0: 000000008524b4e8 (&zdev->lock){....}-{3:3}, at: vfio_pci_zdev_io_rw+0x168/0x310 [vfio_pci] <------------------|
>> [  618.025355] 1 lock held by qemu-system-s39/1317:
>> [  618.025357]  #0: 00000000d34480d0 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]
>> [  618.025378] 1 lock held by qemu-system-s39/1318:
>> [  618.025380]  #0: 00000000d34380d0 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]
>> [  618.025400] 1 lock held by qemu-system-s39/1319:
>> [  618.025403]  #0: 00000000d47e8a90 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x780 [kvm]
>> [  618.025424] 2 locks held by zsh/1391:
>> [  618.025426]  #0: 00000000d4a708a0 (&tty->ldisc_sem){....}-{0:0}, at: tty_ldisc_ref_wait+0x34/0x70
>> [  618.025435]  #1: 0000038002fc72f0 (&ldata->atomic_read_lock){....}-{3:3}, at: n_tty_read+0xc8/0xa50
>>
>>
>>> +
>>> +    ret = get_user_pages_fast(region->req.gaddr & PAGE_MASK, 1, 0, &gpage);
>>> +    if (ret <= 0) {
>>> +        count = -EIO;
>> ... snip ...
>>
> 

  reply	other threads:[~2021-01-20 18:43 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-19 20:02 [PATCH 0/4] vfio-pci/zdev: Fixing s390 vfio-pci ISM support Matthew Rosato
2021-01-19 20:02 ` [PATCH 1/4] s390/pci: track alignment/length strictness for zpci_dev Matthew Rosato
2021-01-19 20:02 ` [PATCH 2/4] vfio-pci/zdev: Pass the relaxed alignment flag Matthew Rosato
2021-01-19 20:02 ` [PATCH 3/4] s390/pci: Get hardware-reported max store block length Matthew Rosato
2021-01-19 20:02 ` [PATCH 4/4] vfio-pci/zdev: Introduce the zPCI I/O vfio region Matthew Rosato
2021-01-20 13:21   ` Niklas Schnelle
2021-01-20 17:10     ` Matthew Rosato
2021-01-20 17:28       ` Niklas Schnelle [this message]
2021-01-20 17:40         ` Matthew Rosato
2021-01-21 20:50     ` Alex Williamson
2021-01-21 10:01   ` Niklas Schnelle
2021-01-21 15:57     ` Matthew Rosato
2021-01-22 23:48   ` Alex Williamson
2021-01-25 14:40     ` Matthew Rosato
2021-01-25 15:42       ` Cornelia Huck
2021-01-25 15:52         ` Matthew Rosato
2021-01-26 23:18       ` Alex Williamson
2021-01-27 14:23         ` Matthew Rosato
2021-01-27 15:53           ` Alex Williamson
2021-01-27 17:45             ` Cornelia Huck
2021-01-27 18:27               ` Matthew Rosato
2021-01-19 20:50 ` [PATCH 0/4] vfio-pci/zdev: Fixing s390 vfio-pci ISM support Matthew Rosato
2021-01-20  9:02 ` Pierre Morel
2021-01-20 14:02   ` Matthew Rosato

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bf7728b6-3c6d-5b71-8a4d-fc55b0013d9a@linux.ibm.com \
    --to=schnelle@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=gerald.schaefer@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=pmorel@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).