All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Rosato <mjrosato@linux.ibm.com>
To: Christian Borntraeger <borntraeger@linux.ibm.com>,
	Jason Gunthorpe <jgg@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Tony Krowiak <akrowiak@linux.ibm.com>,
	"Jason J . Herne" <jjherne@linux.ibm.com>,
	Marc Hartmayer <mhartmay@linux.ibm.com>,
	Eric Farman <farman@linux.ibm.com>,
	Cornelia Huck <cohuck@redhat.com>,
	kvm@vger.kernel.org, Qian Cai <cai@lca.pw>,
	Joerg Roedel <jroedel@suse.de>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	linux-s390 <linux-s390@vger.kernel.org>
Subject: Re: [PATCH v2] vfio: Follow a strict lifetime for struct iommu_group
Date: Tue, 4 Oct 2022 14:22:10 -0400	[thread overview]
Message-ID: <33bc5258-5c95-99ee-a952-5b0b2826da3a@linux.ibm.com> (raw)
In-Reply-To: <0a0d7937-316a-a0e2-9d7d-df8f3f8a38e3@linux.ibm.com>

On 10/4/22 1:36 PM, Christian Borntraeger wrote:
> 
> 
> Am 04.10.22 um 18:28 schrieb Jason Gunthorpe:
>> On Tue, Oct 04, 2022 at 05:44:53PM +0200, Christian Borntraeger wrote:
>>
>>>> Does some userspace have the group FD open when it stucks like this,
>>>> eg what does fuser say?
>>>
>>> /proc/<virtnodedevd>/fd
>>> 51480 0 dr-x------. 2 root root  0  4. Okt 17:16 .
>>> 43593 0 dr-xr-xr-x. 9 root root  0  4. Okt 17:16 ..
>>> 65252 0 lr-x------. 1 root root 64  4. Okt 17:42 0 -> /dev/null
>>> 65253 0 lrwx------. 1 root root 64  4. Okt 17:42 1 -> 'socket:[51479]'
>>> 65261 0 lrwx------. 1 root root 64  4. Okt 17:42 10 -> 'anon_inode:[eventfd]'
>>> 65262 0 lrwx------. 1 root root 64  4. Okt 17:42 11 -> 'socket:[51485]'
>>> 65263 0 lrwx------. 1 root root 64  4. Okt 17:42 12 -> 'socket:[51487]'
>>> 65264 0 lrwx------. 1 root root 64  4. Okt 17:42 13 -> 'socket:[51486]'
>>> 65265 0 lrwx------. 1 root root 64  4. Okt 17:42 14 -> 'anon_inode:[eventfd]'
>>> 65266 0 lrwx------. 1 root root 64  4. Okt 17:42 15 -> 'socket:[60421]'
>>> 65267 0 lrwx------. 1 root root 64  4. Okt 17:42 16 -> 'anon_inode:[eventfd]'
>>> 65268 0 lrwx------. 1 root root 64  4. Okt 17:42 17 -> 'socket:[28008]'
>>> 65269 0 l-wx------. 1 root root 64  4. Okt 17:42 18 -> /run/libvirt/nodedev/driver.pid
>>> 65270 0 lrwx------. 1 root root 64  4. Okt 17:42 19 -> 'socket:[28818]'
>>> 65254 0 lrwx------. 1 root root 64  4. Okt 17:42 2 -> 'socket:[51479]'
>>> 65271 0 lr-x------. 1 root root 64  4. Okt 17:42 20 -> '/dev/vfio/3 (deleted)'
>>
>> Seems like a userspace bug to keep the group FD open after the /dev/
>> file has been deleted :|
>>
>> What do you think about this?
>>
>> commit a54a852b1484b1605917a8f4d80691db333b25ed
>> Author: Jason Gunthorpe <jgg@ziepe.ca>
>> Date:   Tue Oct 4 13:14:37 2022 -0300
>>
>>      vfio: Make the group FD disassociate from the iommu_group
>>           Allow the vfio_group struct to exist with a NULL iommu_group pointer. When
>>      the pointer is NULL the vfio_group users promise not to touch the
>>      iommu_group. This allows a driver to be hot unplugged while userspace is
>>      keeping the group FD open.
>>           SPAPR mode is excluded from this behavior because of how it wrongly hacks
>>      part of its iommu interface through KVM. Due to this we loose control over
>>      what it is doing and cannot revoke the iommu_group usage in the IOMMU
>>      layer via vfio_group_detach_container().
>>           Thus, for SPAPR the group FDs must still be closed before a device can be
>>      hot unplugged.
>>           This fixes a userspace regression where we learned that virtnodedevd
>>      leaves a group FD open even though the /dev/ node for it has been deleted
>>      and all the drivers for it unplugged.
>>           Fixes: ca5f21b25749 ("vfio: Follow a strict lifetime for struct iommu_group")
>>      Reported-by: Christian Borntraeger <borntraeger@linux.ibm.com>
>>      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> Almost :-)
> 
> drivers/vfio/vfio_main.c: In function 'vfio_file_is_group':
> drivers/vfio/vfio_main.c:1606:47: error: expected ')' before ';' token
>  1606 |         return (file->f_op == &vfio_group_fops;
>       |                ~                              ^
>       |                                               )
> drivers/vfio/vfio_main.c:1606:48: error: expected ';' before '}' token
>  1606 |         return (file->f_op == &vfio_group_fops;
>       |                                                ^
>       |                                                ;
>  1607 | }
>       | ~
> 
> 
> With that fixed I get:
> 
> ERROR: modpost: "vfio_file_is_group" [drivers/vfio/pci/vfio-pci-core.ko] undefined!
> 
> With that worked around (m -> y)


Looks like this can be solved with EXPORT_SYMBOL_GPL(vfio_file_is_group);

Also:

arch/s390/kvm/../../../virt/kvm/vfio.c:64:28: warning: ‘kvm_vfio_file_iommu_group’ defined but not used [-Wunused-function]
   64 | static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file)
      |                            ^~~~~~~~~~~~~~~~~~~~~~~~~

kvm_vfio_file_iommu_group looks like it is now SPAPR-only

> 
> 
> Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com>
> 
> At least the vfio-ap part

Nope, with this s390 vfio-pci at least breaks:

[  132.943389] kernel BUG at lib/list_debug.c:53!
[  132.943406] monitor event: 0040 ilc:2 [#1] SMP 
[  132.943410] Modules linked in: vfio_pci kvm vfio_pci_core irqbypass vfio_virqfd vhost_vsock vmw_vsock_virtio_transport_common vsock vhost vhost_iotlb nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defr
ag_ipv4 ip_set nf_tables nfnetlink sunrpc mlx5_ib ism smc ib_uverbs ib_core uvdevice s390_trng tape_3590 tape tape_class eadm_sch vfio_ccw mdev vfio_iommu_type1 vfio zcrypt_cex4 sch_fq_codel configfs ghash_s390 prng chacha_s390 libchacha mlx5_core aes_s390 des_s390 libdes sha3_512_s390 nvme sha3_256_s390 sha512_s390 sh
a256_s390 nvme_core sha1_s390 sha_common zfcp scsi_transport_fc pkey zcrypt rng_core autofs4 [last unloaded: vfio_pci]
[  132.943457] CPU: 12 PID: 4991 Comm: nose2 Tainted: G        W          6.0.0-rc4 #40
[  132.943460] Hardware name: IBM 3931 A01 782 (LPAR)
[  132.943462] Krnl PSW : 0704c00180000000 00000000cbc90568 (__list_del_entry_valid+0xd8/0xf0)
[  132.943469]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[  132.943474] Krnl GPRS: 8000000000000001 0000000900000027 000000000000004e 00000000ccc1ffe0
[  132.943477]            00000000fffeffff 00000009fc290000 0000000000000000 0000000000000080
[  132.943480]            00000000acc86438 0000000000000000 00000000acc86420 00000000a1492800
[  132.943483]            00000000922a0000 000003ffb9dce260 00000000cbc90564 0000038004a6b9f8
[  132.943489] Krnl Code: 00000000cbc90558: c0200045eff3        larl    %r2,00000000cc54e53e
[  132.943489]            00000000cbc9055e: c0e50022c7d9        brasl   %r14,00000000cc0e9510
[  132.943489]           #00000000cbc90564: af000000            mc      0,0
[  132.943489]           >00000000cbc90568: b9040032            lgr     %r3,%r2
[  132.943489]            00000000cbc9056c: c0200045efd4        larl    %r2,00000000cc54e514
[  132.943489]            00000000cbc90572: c0e50022c7cf        brasl   %r14,00000000cc0e9510
[  132.943489]            00000000cbc90578: af000000            mc      0,0
[  132.943489]            00000000cbc9057c: 0707                bcr     0,%r7
[  132.943510] Call Trace:
[  132.943512]  [<00000000cbc90568>] __list_del_entry_valid+0xd8/0xf0 
[  132.943515] ([<00000000cbc90564>] __list_del_entry_valid+0xd4/0xf0)
[  132.943518]  [<000003ff8011a1b8>] vfio_group_detach_container+0x88/0x170 [vfio] 
[  132.943524]  [<000003ff801176c0>] vfio_device_remove_group.isra.0+0xb0/0x1e0 [vfio] 
[  132.943529]  [<000003ff804f9e54>] vfio_pci_core_unregister_device+0x34/0x80 [vfio_pci_core] 
[  132.943535]  [<000003ff804ae1c4>] vfio_pci_remove+0x2c/0x40 [vfio_pci] 
[  132.943539]  [<00000000cbd58c3c>] pci_device_remove+0x3c/0x98 
[  132.943542]  [<00000000cbdbdbce>] device_release_driver_internal+0x1c6/0x288 
[  132.943545]  [<00000000cbd4e284>] pci_stop_bus_device+0x94/0xc0 
[  132.943549]  [<00000000cbd4e570>] pci_stop_and_remove_bus_device_locked+0x30/0x48 
[  132.943552]  [<00000000cb55d980>] zpci_bus_remove_device+0x68/0xa8 
[  132.943555]  [<00000000cb556e82>] zpci_deconfigure_device+0x3a/0xe0 
[  132.943558]  [<00000000cbd65d04>] power_write_file+0x7c/0x130 
[  132.943561]  [<00000000cb8fbc90>] kernfs_fop_write_iter+0x138/0x210 
[  132.943565]  [<00000000cb837344>] vfs_write+0x194/0x2e0 "
[  132.943568]  [<00000000cb8376fa>] ksys_write+0x6a/0xf8 
[  132.943571]  [<00000000cc0f918c>] __do_syscall+0x1d4/0x200 
[  132.943575]  [<00000000cc107e42>] system_call+0x82/0xb0 
[  132.943577] Last Breaking-Event-Address:
[  132.943579]  [<00000000cc0e955c>] _printk+0x4c/0x58
[  132.943585] Kernel panic - not syncing: Fatal exception: panic_on_oops

  parent reply	other threads:[~2022-10-04 18:22 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-23  0:06 [PATCH v2] vfio: Follow a strict lifetime for struct iommu_group Jason Gunthorpe
2022-09-26 17:03 ` Matthew Rosato
2022-09-27 20:05   ` Alex Williamson
2022-10-04 15:19     ` Christian Borntraeger
2022-10-04 15:40       ` Jason Gunthorpe
2022-10-04 15:44         ` Christian Borntraeger
2022-10-04 16:28           ` Jason Gunthorpe
2022-10-04 17:15             ` Christian Borntraeger
2022-10-04 17:22               ` Jason Gunthorpe
2022-10-04 17:36             ` Christian Borntraeger
2022-10-04 17:48               ` Christian Borntraeger
2022-10-04 18:22               ` Matthew Rosato [this message]
2022-10-04 18:56                 ` Eric Farman
2022-10-05 13:46                 ` Matthew Rosato
2022-10-05 13:57                   ` Jason Gunthorpe
2022-10-05 14:00                     ` Christian Borntraeger
2022-10-05 14:01                     ` Jason Gunthorpe
2022-10-05 14:19                       ` Christian Borntraeger
2022-10-06 11:55                         ` Christian Borntraeger
2022-10-05 14:21                       ` Matthew Rosato
2022-10-05 15:40                         ` Matthew Rosato
2022-10-05 14:01                     ` Matthew Rosato
2022-10-04 19:59   ` Matthew Rosato
2022-10-04 20:19     ` Alex Williamson
2022-10-04 22:30       ` Jason Gunthorpe
2022-09-27  6:34 ` Yi Liu
2022-09-27 13:12   ` Jason Gunthorpe
2022-09-28  3:51 ` Tian, Kevin
2022-09-28 15:26   ` Jason Gunthorpe
2022-09-28 23:54     ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=33bc5258-5c95-99ee-a952-5b0b2826da3a@linux.ibm.com \
    --to=mjrosato@linux.ibm.com \
    --cc=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=cai@lca.pw \
    --cc=cohuck@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=jgg@nvidia.com \
    --cc=jjherne@linux.ibm.com \
    --cc=jroedel@suse.de \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mhartmay@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.