All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Yishai Hadas <yishaih@nvidia.com>,
	bhelgaas@google.com, corbet@lwn.net, alex.williamson@redhat.com,
	diana.craciun@oss.nxp.com, kwankhede@nvidia.com,
	eric.auger@redhat.com, masahiroy@kernel.org,
	michal.lkml@markovi.net, linux-pci@vger.kernel.org,
	linux-doc@vger.kernel.org, kvm@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-kbuild@vger.kernel.org,
	mgurtovoy@nvidia.com, maorg@nvidia.com, leonro@nvidia.com
Subject: Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
Date: Thu, 12 Aug 2021 10:27:28 -0300	[thread overview]
Message-ID: <20210812132728.GB8367@nvidia.com> (raw)
In-Reply-To: <20210811190737.GA2378935@bjorn-Precision-5520>

On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
> On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 04, 2021 at 03:34:12PM -0500, Bjorn Helgaas wrote:
> > 
> > > > The first use will be to define a VFIO flag that indicates the PCI driver
> > > > is a VFIO driver.
> > >
> > > Is there such a thing as a "VFIO driver" today?  
> > 
> > Yes.
> > 
> > VFIO has long existed as a driver subsystem that binds drivers to
> > devices in various bus types. In the case of PCI the admin moves a PCI
> > device from normal operation to VFIO operation via something like:
> 
> What specifically makes a driver a "VFIO driver"?  

It is a device driver whose probe function instantiates a "struct
vfio_device" which binds it to the VFIO subsystem and triggers
creation of the char devs, ioctls, etc.

No different from every other subsystem, really. Eg a netdev driver
creates a struct ndev_device, a TPM driver creates struct tpm_chip,
etc.

> supports the VFIO ioctls in include/uapi/linux/vfio.h?  That by itself
> doesn't require special treatment by the kernel, so I think there's
> more here.

The unique thing about VFIO, compared to all other subsystems, is that
VFIO is a second choice for driver binding. A device will have a
natural kernel driver, eg mlx5 naturally creates netdevs, and it has a
VFIO driver option. Userspace selects if it wants the device to
operate in normal mode or VFIO mode.

The kernel should never move a device to VFIO mode automatically -
which is the special behavior compared to any other normal pci_driver.

> > echo vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
> > 
> > Other bus types (platform, acpi, etc) have a similar command to move
> > them to VFIO.
> 
> Do the other bus types have a flag analogous to
> PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
> other bus types, it'd be nice if the approach were similar.

They could, this series doesn't attempt it. I expect the approach to
be similar as driver_override was copied from PCI to other
busses. When this is completed I hope to take a look at it.

> > PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
> > 
> > Allow device drivers to include match entries in the modules.alias file
> > produced by kbuild that are not used for normal driver autoprobing and
> > module autoloading. Drivers using these match entries can be connected to
> > the PCI device manually, by userspace, using the existing driver_override
> > sysfs.
> 
> IIUC, the end result of this is basically a tweak to the existing
> sysfs driver_override functionality.

Yes..

> And I *think* (correct me if I'm wrong), this actually has nothing in
> particular to do with VFIO.  It's just that you want to expose some
> device IDs that are only used for binding when driver_override is set.

The general concept has nothing to do with VFIO but adding the "vfio_"
prefix to the modalias is obviously VFIO specific.

The entire point is to convay to userspace the information that the
modules.alias line is just for vfio.

We could imagine in future some other use for this, in which case the
future user would use their own prefix, not vfio.
 
> > When userspace wants to change a device to the VFIO subsystem userspace
> > can implement a generic algorithm:
> > 
> >    1) Identify the sysfs path to the device:
> >     /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
> > 
> >    2) Get the modalias string from the kernel:
> >     $ cat /sys/bus/pci/devices/0000:01:00.0/modalias
> >     pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
> 
> So far, I think this is all the existing behavior, unaffected by this
> patch.

Yes.
 
> >    3) Prefix it with vfio_:
> >     vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
> > 
> >    4) Search modules.alias for the above string and select the entry that
> >       has the fewest *'s:
> >     alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
> 
> And this patch basically adds this modules.alias entry.

Yes.
 
> Previously vfio_pci contained no vendor/device IDs, and the only way
> to bind it to a device was to either:
> 
>   - Modprobe the driver and write dynamic device IDs to the driver's
>     /sys/.../new_id.  This should directly bind the driver to all
>     devices that match the new IDs (see new_id_store()).
> 
> or
> 
>   - Write "vfio_pci" to the device's /sys/.../driver_override.
>     AFAICS, this won't bind anything (see driver_override_store()),
>     but if we call the driver's .probe() method via modprobe or
>     rescan, the driver_override will match any device regardless of
>     ID.

Yes

> IIUC, after this patch, you can add vendor/device IDs to a struct
> pci_driver with this new flag.  These IDs are advertised via
> modules.alias.

Yes
 
> For driver binding, IDs with the new flag are eligible to match only
> when driver_override is set to the matching driver.

Yes
 
> Setting a device's driver_override has *always* caused the matching
> driver to bind.  The only difference after this patch is that now we
> give the driver an ID from its .id_table instead of pci_device_id_any.

Almost - before a .id_table entried might be returned as well. The
difference here is that there are "hidden" entries in the id_table
that is only used by driver_overrride and we can return that hidden
entry.

> >    5) modprobe the matched module name:
> >     $ modprobe mlx5_vfio_pci
> 
> I assume somewhere in here you need to unbind mlx5_core before binding
> mlx5_vfio_pci?

Er, yes, I skipped some steps here where unbind/bind has to be done
 
> >    6) cat the matched module name to driver_override:
> >     echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
> 
> Don't you need something here to trigger the driver attach, i.e.,
> should step 5 and step 6 be swapped?  What if the driver is already
> loaded? 

The full sequence is more like:

     echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
     echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
     echo 0000:01:00.0 > /sys/bus/pci/drivers_probe

> Can you modprobe again to make it bind to a second device?

modprobe is a single-shot, it just loads the module and doesn't
trigger any driver binding. modprobing a second time is a NOP.

> I see drivers/vfio/platform/vfio_platform.c; is that what you mean?

Yes, look around vfio_platform_acpi_probe()

> I don't see any VFIO things with ACPI in their name, so maybe I'm
> looking the wrong place.  If this is purely *plans* for the future,
> maybe say something like "planned VFIO drivers ..."

Sure

Jason

  reply	other threads:[~2021-08-12 13:27 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
2021-07-21 16:15 ` [PATCH 01/12] vfio/pci: Rename vfio_pci.c to vfio_pci_core.c Yishai Hadas
2021-07-21 16:15 ` [PATCH 02/12] vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h Yishai Hadas
2021-07-21 16:16 ` [PATCH 03/12] vfio/pci: Rename vfio_pci_device to vfio_pci_core_device Yishai Hadas
2021-07-21 16:16 ` [PATCH 04/12] vfio/pci: Rename ops functions to fit core namings Yishai Hadas
2021-07-21 16:16 ` [PATCH 05/12] vfio/pci: Include vfio header in vfio_pci_core.h Yishai Hadas
2021-07-21 16:16 ` [PATCH 06/12] vfio/pci: Split the pci_driver code out of vfio_pci_core.c Yishai Hadas
2021-07-21 16:16 ` [PATCH 07/12] vfio/pci: Move igd initialization to vfio_pci.c Yishai Hadas
2021-07-21 16:16 ` [PATCH 08/12] vfio/pci: Move module parameters " Yishai Hadas
2021-07-21 16:16 ` [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id Yishai Hadas
2021-07-27 16:34   ` Alex Williamson
2021-07-27 17:14     ` Jason Gunthorpe
2021-07-27 23:02       ` Alex Williamson
2021-07-27 23:42         ` Jason Gunthorpe
2021-08-04 20:34   ` Bjorn Helgaas
2021-08-05 16:47     ` Max Gurtovoy
2021-08-06  0:23     ` Jason Gunthorpe
2021-08-11 12:22       ` Max Gurtovoy
2021-08-11 19:07       ` Bjorn Helgaas
2021-08-12 13:27         ` Jason Gunthorpe [this message]
2021-08-12 15:57           ` Bjorn Helgaas
2021-08-12 19:51             ` Jason Gunthorpe
2021-08-12 20:26               ` Bjorn Helgaas
2021-08-12 23:21                 ` Max Gurtovoy
2021-08-13 17:44                   ` Bjorn Helgaas
2021-08-14 23:27                     ` Max Gurtovoy
2021-08-16 17:21                       ` Bjorn Helgaas
2021-08-17 13:01                         ` Max Gurtovoy
2021-08-17 14:13                           ` Bjorn Helgaas
2021-08-17 14:44                             ` Max Gurtovoy
2021-08-12 15:42   ` Bjorn Helgaas
2021-07-21 16:16 ` [PATCH 10/12] vfio: Use select for eventfd Yishai Hadas
2021-07-21 16:16 ` [PATCH 11/12] vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on' Yishai Hadas
2021-07-21 16:16 ` [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko Yishai Hadas
2021-07-21 17:39   ` Leon Romanovsky
2021-07-22  9:06     ` Yishai Hadas
2021-07-22  9:22       ` Max Gurtovoy
2021-07-23 14:13         ` Leon Romanovsky
2021-07-25 10:45           ` Max Gurtovoy
2021-07-27 21:54   ` Alex Williamson
2021-07-27 23:09     ` Jason Gunthorpe
2021-07-28  4:56       ` Leon Romanovsky
2021-07-28  5:43       ` Christoph Hellwig
2021-07-28  7:04         ` Arnd Bergmann
2021-07-28  7:17           ` Leon Romanovsky
2021-07-28 12:03         ` Jason Gunthorpe
2021-07-28 12:12           ` Arnd Bergmann
2021-07-28 12:29           ` Christoph Hellwig
2021-07-28 12:47             ` Jason Gunthorpe
2021-07-28 12:55               ` Christoph Hellwig
2021-07-28 13:31                 ` Jason Gunthorpe
2021-07-28 13:08               ` Arnd Bergmann
2021-07-28 17:26                 ` Jason Gunthorpe
2021-08-04 13:41 ` [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
2021-08-04 15:27   ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210812132728.GB8367@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=corbet@lwn.net \
    --cc=diana.craciun@oss.nxp.com \
    --cc=eric.auger@redhat.com \
    --cc=helgaas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=maorg@nvidia.com \
    --cc=masahiroy@kernel.org \
    --cc=mgurtovoy@nvidia.com \
    --cc=michal.lkml@markovi.net \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.