From: Alex Williamson <alex.williamson@redhat.com> To: Jason Gunthorpe <jgg@nvidia.com> Cc: kvm@vger.kernel.org, mst@redhat.com, maorg@nvidia.com, virtualization@lists.linux-foundation.org, jiri@nvidia.com, leonro@nvidia.com Subject: Re: [PATCH V1 vfio 9/9] vfio/virtio: Introduce a vfio driver over virtio devices Date: Wed, 18 Oct 2023 12:29:25 -0600 [thread overview] Message-ID: <20231018122925.3fde9405.alex.williamson@redhat.com> (raw) In-Reply-To: <20231018163333.GZ3952@nvidia.com> On Wed, 18 Oct 2023 13:33:33 -0300 Jason Gunthorpe <jgg@nvidia.com> wrote: > On Tue, Oct 17, 2023 at 02:24:48PM -0600, Alex Williamson wrote: > > On Tue, 17 Oct 2023 16:42:17 +0300 > > Yishai Hadas <yishaih@nvidia.com> wrote: > > > +static int virtiovf_pci_probe(struct pci_dev *pdev, > > > + const struct pci_device_id *id) > > > +{ > > > + const struct vfio_device_ops *ops = &virtiovf_acc_vfio_pci_ops; > > > + struct virtiovf_pci_core_device *virtvdev; > > > + int ret; > > > + > > > + if (pdev->is_virtfn && virtiovf_support_legacy_access(pdev) && > > > + !virtiovf_bar0_exists(pdev) && pdev->msix_cap) > > > + ops = &virtiovf_acc_vfio_pci_tran_ops; > > > > This is still an issue for me, it's a very narrow use case where we > > have a modern device and want to enable legacy support. Implementing an > > IO BAR and mangling the device ID seems like it should be an opt-in, > > not standard behavior for any compatible device. Users should > > generally expect that the device they see in the host is the device > > they see in the guest. They might even rely on that principle. > > I think this should be configured when the VF is provisioned. If the > user does not want legacy IO bar support then the VFIO VF function > should not advertise the capability, and they won't get driver > support. > > I think that is a very reasonable way to approach this - it is how we > approached similar problems for mlx5. The provisioning interface is > what "profiles" the VF, regardless of if VFIO is driving it or not. It seems like a huge assumption that every device is going to allow this degree of specification in provisioning VFs. mlx5 is a vendor specific driver, it can make such assumptions in design philosophy. > > We can't use the argument that users wanting the default device should > > use vfio-pci rather than virtio-vfio-pci because we've already defined > > the algorithm by which libvirt should choose a variant driver for a > > device. libvirt will choose this driver for all virtio-net devices. > > Well, we can if the use case is niche. I think profiling a virtio VF > to support legacy IO bar emulation and then not wanting to use it is > a niche case. > > The same argument is going come with live migration. This same driver > will still bind and enable live migration if the virtio function is > profiled to support it. If you don't want that in your system then > don't profile the VF for migration support. What in the virtio or SR-IOV spec requires a vendor to make this configurable? > > This driver effectively has the option to expose two different profiles > > for the device, native or transitional. We've discussed profile > > support for variant drivers previously as an equivalent functionality > > to mdev types, but the only use case for this currently is out-of-tree. > > I think this might be the opportunity to define how device profiles are > > exposed and selected in a variant driver. > > Honestly, I've been trying to keep this out of VFIO... > > The function is profiled when it is created, by whatever created > it. As in the other thread we have a vast amount of variation in what > is required to provision the function in the first place. "Legacy IO > BAR emulation support" is just one thing. virtio-net needs to be > hooked up to real network and get a MAC, virtio-blk needs to be hooked > up to real storage and get a media. At a minimum. This is big and > complicated. > > It may not even be the x86 running VFIO that is doing this > provisioning, the PCI function may come pre-provisioned from a DPU. > > It feels better to keep that all in one place, in whatever external > thing is preparing the function before giving it to VFIO. VFIO is > concerned with operating a prepared function. > > When we get to SIOV it should not be VFIO that is > provisioning/creating functions. The owning driver should be doing > this and routing the function to VFIO (eg with an aux device or > otherwise) > > This gets back to the qemu thread on the grace patch where we need to > ask how does the libvirt world see this, given there is no good way to > generically handle all scenarios without a userspace driver to operate > elements. So nothing here is really "all in one place", it may be in the provisioning of the VF, outside of the scope of the host OS, it might be a collection of scripts or operators with device or interface specific tooling to configure the device. Sometimes this configuration will be before the device is probed by the vfio-pci variant driver, sometimes in between probing and opening the device. I don't see why it becomes out of scope if the variant driver itself provides some means for selecting a device profile. We have evidence both from mdev vGPUs and here (imo) that we can expect to see that behavior, so why wouldn't we want to attempt some basic shared interface for variant drivers to implement for selecting such a profile rather than add to this hodgepodge > > Jason had previously suggested a devlink interface for this, but I > > understand that path had been shot down by devlink developers. > > I think we go some things support but supporting all things was shot > down. > > > Another obvious option is sysfs, where we might imagine an optional > > "profiles" directory, perhaps under vfio-dev. Attributes of > > "available" and "current" could allow discovery and selection of a > > profile similar to mdev types. > > IMHO it is a far too complex problem for sysfs. Isn't it then just like devlink, not a silver bullet, but useful for some configuration? AIUI, devlink shot down a means to list available profiles for a device and a means to select one of those profiles. There are a variety of attributes in sysfs which perform this sort of behavior. Specifying a specific profile in sysfs can be difficult, and I'm not proposing sysfs profile support as a mandatory feature, but I'm also not a fan of the vendor specific sysfs approach that out of tree drivers have taken. The mdev type interface is certainly not perfect, but from it we've been able to develop mdevctl to allow persistent and complex configurations of mdev devices. I'd like to see the ability to do something like that with variant drivers that offer multiple profiles without always depending on vendor specific interfaces. Thanks, Alex _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
WARNING: multiple messages have this Message-ID (diff)
From: Alex Williamson <alex.williamson@redhat.com> To: Jason Gunthorpe <jgg@nvidia.com> Cc: Yishai Hadas <yishaih@nvidia.com>, mst@redhat.com, jasowang@redhat.com, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, parav@nvidia.com, feliu@nvidia.com, jiri@nvidia.com, kevin.tian@intel.com, joao.m.martins@oracle.com, si-wei.liu@oracle.com, leonro@nvidia.com, maorg@nvidia.com Subject: Re: [PATCH V1 vfio 9/9] vfio/virtio: Introduce a vfio driver over virtio devices Date: Wed, 18 Oct 2023 12:29:25 -0600 [thread overview] Message-ID: <20231018122925.3fde9405.alex.williamson@redhat.com> (raw) In-Reply-To: <20231018163333.GZ3952@nvidia.com> On Wed, 18 Oct 2023 13:33:33 -0300 Jason Gunthorpe <jgg@nvidia.com> wrote: > On Tue, Oct 17, 2023 at 02:24:48PM -0600, Alex Williamson wrote: > > On Tue, 17 Oct 2023 16:42:17 +0300 > > Yishai Hadas <yishaih@nvidia.com> wrote: > > > +static int virtiovf_pci_probe(struct pci_dev *pdev, > > > + const struct pci_device_id *id) > > > +{ > > > + const struct vfio_device_ops *ops = &virtiovf_acc_vfio_pci_ops; > > > + struct virtiovf_pci_core_device *virtvdev; > > > + int ret; > > > + > > > + if (pdev->is_virtfn && virtiovf_support_legacy_access(pdev) && > > > + !virtiovf_bar0_exists(pdev) && pdev->msix_cap) > > > + ops = &virtiovf_acc_vfio_pci_tran_ops; > > > > This is still an issue for me, it's a very narrow use case where we > > have a modern device and want to enable legacy support. Implementing an > > IO BAR and mangling the device ID seems like it should be an opt-in, > > not standard behavior for any compatible device. Users should > > generally expect that the device they see in the host is the device > > they see in the guest. They might even rely on that principle. > > I think this should be configured when the VF is provisioned. If the > user does not want legacy IO bar support then the VFIO VF function > should not advertise the capability, and they won't get driver > support. > > I think that is a very reasonable way to approach this - it is how we > approached similar problems for mlx5. The provisioning interface is > what "profiles" the VF, regardless of if VFIO is driving it or not. It seems like a huge assumption that every device is going to allow this degree of specification in provisioning VFs. mlx5 is a vendor specific driver, it can make such assumptions in design philosophy. > > We can't use the argument that users wanting the default device should > > use vfio-pci rather than virtio-vfio-pci because we've already defined > > the algorithm by which libvirt should choose a variant driver for a > > device. libvirt will choose this driver for all virtio-net devices. > > Well, we can if the use case is niche. I think profiling a virtio VF > to support legacy IO bar emulation and then not wanting to use it is > a niche case. > > The same argument is going come with live migration. This same driver > will still bind and enable live migration if the virtio function is > profiled to support it. If you don't want that in your system then > don't profile the VF for migration support. What in the virtio or SR-IOV spec requires a vendor to make this configurable? > > This driver effectively has the option to expose two different profiles > > for the device, native or transitional. We've discussed profile > > support for variant drivers previously as an equivalent functionality > > to mdev types, but the only use case for this currently is out-of-tree. > > I think this might be the opportunity to define how device profiles are > > exposed and selected in a variant driver. > > Honestly, I've been trying to keep this out of VFIO... > > The function is profiled when it is created, by whatever created > it. As in the other thread we have a vast amount of variation in what > is required to provision the function in the first place. "Legacy IO > BAR emulation support" is just one thing. virtio-net needs to be > hooked up to real network and get a MAC, virtio-blk needs to be hooked > up to real storage and get a media. At a minimum. This is big and > complicated. > > It may not even be the x86 running VFIO that is doing this > provisioning, the PCI function may come pre-provisioned from a DPU. > > It feels better to keep that all in one place, in whatever external > thing is preparing the function before giving it to VFIO. VFIO is > concerned with operating a prepared function. > > When we get to SIOV it should not be VFIO that is > provisioning/creating functions. The owning driver should be doing > this and routing the function to VFIO (eg with an aux device or > otherwise) > > This gets back to the qemu thread on the grace patch where we need to > ask how does the libvirt world see this, given there is no good way to > generically handle all scenarios without a userspace driver to operate > elements. So nothing here is really "all in one place", it may be in the provisioning of the VF, outside of the scope of the host OS, it might be a collection of scripts or operators with device or interface specific tooling to configure the device. Sometimes this configuration will be before the device is probed by the vfio-pci variant driver, sometimes in between probing and opening the device. I don't see why it becomes out of scope if the variant driver itself provides some means for selecting a device profile. We have evidence both from mdev vGPUs and here (imo) that we can expect to see that behavior, so why wouldn't we want to attempt some basic shared interface for variant drivers to implement for selecting such a profile rather than add to this hodgepodge > > Jason had previously suggested a devlink interface for this, but I > > understand that path had been shot down by devlink developers. > > I think we go some things support but supporting all things was shot > down. > > > Another obvious option is sysfs, where we might imagine an optional > > "profiles" directory, perhaps under vfio-dev. Attributes of > > "available" and "current" could allow discovery and selection of a > > profile similar to mdev types. > > IMHO it is a far too complex problem for sysfs. Isn't it then just like devlink, not a silver bullet, but useful for some configuration? AIUI, devlink shot down a means to list available profiles for a device and a means to select one of those profiles. There are a variety of attributes in sysfs which perform this sort of behavior. Specifying a specific profile in sysfs can be difficult, and I'm not proposing sysfs profile support as a mandatory feature, but I'm also not a fan of the vendor specific sysfs approach that out of tree drivers have taken. The mdev type interface is certainly not perfect, but from it we've been able to develop mdevctl to allow persistent and complex configurations of mdev devices. I'd like to see the ability to do something like that with variant drivers that offer multiple profiles without always depending on vendor specific interfaces. Thanks, Alex
next prev parent reply other threads:[~2023-10-18 18:29 UTC|newest] Thread overview: 100+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-10-17 13:42 [PATCH V1 vfio 0/9] Introduce a vfio driver over virtio devices Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 1/9] virtio-pci: Fix common config map for modern device Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 2/9] virtio: Define feature bit for administration virtqueue Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 3/9] virtio-pci: Introduce admin virtqueue Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 4/9] virtio-pci: Introduce admin command sending function Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 5/9] virtio-pci: Introduce admin commands Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 6/9] virtio-pci: Introduce APIs to execute legacy IO " Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 20:33 ` kernel test robot 2023-10-17 20:33 ` kernel test robot 2023-10-22 1:14 ` kernel test robot 2023-10-22 1:14 ` kernel test robot 2023-10-24 21:01 ` Michael S. Tsirkin 2023-10-24 21:01 ` Michael S. Tsirkin 2023-10-25 9:18 ` Yishai Hadas via Virtualization 2023-10-25 10:17 ` Michael S. Tsirkin 2023-10-25 10:17 ` Michael S. Tsirkin 2023-10-25 13:00 ` Yishai Hadas 2023-10-25 13:00 ` Yishai Hadas via Virtualization 2023-10-25 13:04 ` Michael S. Tsirkin 2023-10-25 13:04 ` Michael S. Tsirkin 2023-10-25 13:44 ` Michael S. Tsirkin 2023-10-25 13:44 ` Michael S. Tsirkin 2023-10-25 14:03 ` Yishai Hadas 2023-10-25 14:03 ` Yishai Hadas via Virtualization 2023-10-25 16:31 ` Michael S. Tsirkin 2023-10-25 16:31 ` Michael S. Tsirkin 2023-10-25 9:36 ` Yishai Hadas 2023-10-25 9:36 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 7/9] vfio/pci: Expose vfio_pci_core_setup_barmap() Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 8/9] vfio/pci: Expose vfio_pci_iowrite/read##size() Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 13:42 ` [PATCH V1 vfio 9/9] vfio/virtio: Introduce a vfio driver over virtio devices Yishai Hadas 2023-10-17 13:42 ` Yishai Hadas via Virtualization 2023-10-17 20:24 ` Alex Williamson 2023-10-17 20:24 ` Alex Williamson 2023-10-18 9:01 ` Yishai Hadas 2023-10-18 9:01 ` Yishai Hadas via Virtualization 2023-10-18 12:51 ` Alex Williamson 2023-10-18 12:51 ` Alex Williamson 2023-10-18 13:06 ` Parav Pandit 2023-10-18 13:06 ` Parav Pandit via Virtualization 2023-10-18 16:33 ` Jason Gunthorpe 2023-10-18 18:29 ` Alex Williamson [this message] 2023-10-18 18:29 ` Alex Williamson 2023-10-18 19:28 ` Jason Gunthorpe 2023-10-24 19:57 ` Alex Williamson 2023-10-24 19:57 ` Alex Williamson 2023-10-25 14:35 ` Yishai Hadas 2023-10-25 14:35 ` Yishai Hadas via Virtualization 2023-10-25 16:28 ` Michael S. Tsirkin 2023-10-25 16:28 ` Michael S. Tsirkin 2023-10-25 19:13 ` Alex Williamson 2023-10-25 19:13 ` Alex Williamson 2023-10-26 12:08 ` Yishai Hadas 2023-10-26 12:08 ` Yishai Hadas via Virtualization 2023-10-26 12:12 ` Michael S. Tsirkin 2023-10-26 12:12 ` Michael S. Tsirkin 2023-10-26 12:40 ` Parav Pandit 2023-10-26 12:40 ` Parav Pandit via Virtualization 2023-10-26 13:15 ` Michael S. Tsirkin 2023-10-26 13:15 ` Michael S. Tsirkin 2023-10-26 13:28 ` Parav Pandit 2023-10-26 13:28 ` Parav Pandit via Virtualization 2023-10-26 15:06 ` Michael S. Tsirkin 2023-10-26 15:06 ` Michael S. Tsirkin 2023-10-26 15:09 ` Parav Pandit 2023-10-26 15:09 ` Parav Pandit via Virtualization 2023-10-26 15:46 ` Michael S. Tsirkin 2023-10-26 15:46 ` Michael S. Tsirkin 2023-10-26 15:56 ` Parav Pandit 2023-10-26 15:56 ` Parav Pandit via Virtualization 2023-10-26 17:55 ` Alex Williamson 2023-10-26 17:55 ` Alex Williamson 2023-10-26 19:49 ` Michael S. Tsirkin 2023-10-26 19:49 ` Michael S. Tsirkin 2023-10-29 16:13 ` Yishai Hadas via Virtualization 2023-10-29 16:13 ` Yishai Hadas 2023-10-22 8:20 ` [PATCH V1 vfio 0/9] " Yishai Hadas 2023-10-22 8:20 ` Yishai Hadas via Virtualization 2023-10-22 9:12 ` Michael S. Tsirkin 2023-10-22 9:12 ` Michael S. Tsirkin 2023-10-23 15:33 ` Alex Williamson 2023-10-23 15:33 ` Alex Williamson 2023-10-23 15:42 ` Jason Gunthorpe 2023-10-23 16:09 ` Alex Williamson 2023-10-23 16:09 ` Alex Williamson 2023-10-23 16:20 ` Jason Gunthorpe 2023-10-23 16:45 ` Alex Williamson 2023-10-23 16:45 ` Alex Williamson 2023-10-23 17:27 ` Jason Gunthorpe 2023-10-25 8:34 ` Tian, Kevin 2023-10-25 8:34 ` Tian, Kevin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20231018122925.3fde9405.alex.williamson@redhat.com \ --to=alex.williamson@redhat.com \ --cc=jgg@nvidia.com \ --cc=jiri@nvidia.com \ --cc=kvm@vger.kernel.org \ --cc=leonro@nvidia.com \ --cc=maorg@nvidia.com \ --cc=mst@redhat.com \ --cc=virtualization@lists.linux-foundation.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.