* [PATCH 00/12] Introduce vfio_pci_core subsystem
@ 2021-07-21 16:15 Yishai Hadas
  2021-07-21 16:15 ` [PATCH 01/12] vfio/pci: Rename vfio_pci.c to vfio_pci_core.c Yishai Hadas
                   ` (12 more replies)
  0 siblings, 13 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:15 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

Prologue:

This is the second series of three to send the "mlx5_vfio_pci" driver
that has been discussed on the list for a while now. It comes on top of
the first series (i.e. Reorganize reflck to support splitting vfio_pci)
that was already sent and is pending merge [1].

 - Split vfio_pci into vfio_pci/vfio_pci_core and provide infrastructure
   for non-generic VFIO PCI drivers.
 - The new driver mlx5_vfio_pci that is a full implementation of
   suspend/resume functionality for mlx5 devices.

A preview of all the patches can be seen here:
https://github.com/jgunthorpe/linux/commits/mlx5_vfio_pci

[1] https://lore.kernel.org/dri-devel/0-v2-b6a5582525c9+ff96-vfio_reflck_jgg@nvidia.com/T/#t
=====================

From Max Gurtovoy:
====================
This series splits the vfio_pci driver into two parts, a PCI driver and
a subsystem driver that will also be a library of code. The main PCI
driver, vfio_pci.ko, will remain as before and it will use the library
module vfio_pci_core.ko to help create the vfio_device.

This series is intended to solve the issues that were raised in the
previous attempts to extend vfio-pci with device-specific
functionality:

1. https://lore.kernel.org/kvm/20200518024202.13996-1-yan.y.zhao@intel.com
   by Yan Zhao
2. https://lore.kernel.org/kvm/20210702095849.1610-1-shameerali.kolothum.thodi@huawei.com
   by Longfang Liu

It is also intended to support proposed future changes to virtio and
other common protocols that add migration support:

https://lists.oasis-open.org/archives/virtio-comment/202106/msg00044.html

This subsystem framework will also ease adding new device-specific
functionality to VFIO devices in the future by allowing another module
to provide the pci_driver that can set up a number of details before
registering with the VFIO subsystem, such as injecting its own
operations.

This series also extends the "driver_override" mechanism. A flag is
added that lets PCI drivers declare themselves "driver_override"
capable. The flag sends their match table to the modules.alias file but
otherwise leaves them outside of the normal driver core auto-binding
world, like vfio_pci.

To get the best match for "driver_override" drivers, a userspace
program can inspect the modules.alias file. An example can be found at:

https://github.com/maxgurtovoy/linux_tools/blob/main/vfio/bind_vfio_pci_driver.py

It finds the 'best match' according to a simple algorithm: "the
driver with the fewest '*' matches wins."

For example, the vfio-pci driver matches any PCI device, so it has the
maximum number of '*' matches.

When looking for a match for an mlx5-based device, both vfio-pci.ko and
mlx5-vfio-pci.ko will match. mlx5-vfio-pci.ko is preferred since it has
fewer '*' matches (the vendor and device IDs will probably match). This
will work in the future for NVMe/Virtio devices that can match
according to a class code or other criteria.
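
The "fewest '*' matches wins" rule can be illustrated with a minimal
sketch. This is not the script linked above. The "vfio_pci:" alias
prefix, the sample modules.alias lines, the module names and the helper
names parse_aliases() and best_match() are all assumptions made for
illustration only.

```python
from fnmatch import fnmatchcase

# Hypothetical modules.alias excerpt; the "vfio_pci:" prefix and the
# module names are assumptions, not taken from a real system.
SAMPLE_ALIASES = """\
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
alias vfio_pci:v000015B3d0000101Bsv*sd*bc*sc*i* mlx5_vfio_pci
alias pci:v000015B3d0000101Bsv*sd*bc*sc*i* mlx5_core
"""


def parse_aliases(text, prefix="vfio_pci:"):
    """Return (pattern, module) pairs for alias lines carrying `prefix`."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "alias" and parts[1].startswith(prefix):
            entries.append((parts[1][len(prefix):], parts[2]))
    return entries


def best_match(device_id, entries):
    """Pick the matching module whose pattern has the fewest '*' wildcards.

    Returns None when nothing matches. Ties are broken by module name,
    which is an arbitrary choice made for this sketch.
    """
    candidates = [
        (pattern.count("*"), module)
        for pattern, module in entries
        if fnmatchcase(device_id, pattern)
    ]
    if not candidates:
        return None
    return min(candidates)[1]


if __name__ == "__main__":
    # Modalias-style ID string for a hypothetical mlx5-based device.
    mlx5_dev = "v000015B3d0000101Bsv000015B3sd00000008bc02sc00i00"
    print(best_match(mlx5_dev, parse_aliases(SAMPLE_ALIASES)))
```

With the sample data, the generic vfio_pci pattern has seven wildcards
and the mlx5_vfio_pci pattern has five, so the device-specific driver
is chosen. A device from any other vendor matches only the generic
pattern and falls back to vfio_pci.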

Yishai


Jason Gunthorpe (2):
  vfio: Use select for eventfd
  vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on'

Max Gurtovoy (9):
  vfio/pci: Rename vfio_pci.c to vfio_pci_core.c
  vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h
  vfio/pci: Rename vfio_pci_device to vfio_pci_core_device
  vfio/pci: Rename ops functions to fit core namings
  vfio/pci: Include vfio header in vfio_pci_core.h
  vfio/pci: Split the pci_driver code out of vfio_pci_core.c
  vfio/pci: Move igd initialization to vfio_pci.c
  PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  vfio/pci: Introduce vfio_pci_core.ko

Yishai Hadas (1):
  vfio/pci: Move module parameters to vfio_pci.c

 Documentation/PCI/pci.rst                     |    1 +
 drivers/pci/pci-driver.c                      |   25 +-
 drivers/vfio/Kconfig                          |   29 +-
 drivers/vfio/fsl-mc/Kconfig                   |    3 +-
 drivers/vfio/mdev/Kconfig                     |    1 -
 drivers/vfio/pci/Kconfig                      |   39 +-
 drivers/vfio/pci/Makefile                     |    8 +-
 drivers/vfio/pci/vfio_pci.c                   | 2238 +----------------
 drivers/vfio/pci/vfio_pci_config.c            |   70 +-
 drivers/vfio/pci/vfio_pci_core.c              | 2138 ++++++++++++++++
 drivers/vfio/pci/vfio_pci_igd.c               |   19 +-
 drivers/vfio/pci/vfio_pci_intrs.c             |   42 +-
 drivers/vfio/pci/vfio_pci_rdwr.c              |   18 +-
 drivers/vfio/pci/vfio_pci_zdev.c              |    4 +-
 drivers/vfio/platform/Kconfig                 |    6 +-
 drivers/vfio/platform/reset/Kconfig           |    4 +-
 include/linux/mod_devicetable.h               |    7 +
 include/linux/pci.h                           |   27 +
 .../linux/vfio_pci_core.h                     |   89 +-
 scripts/mod/devicetable-offsets.c             |    1 +
 scripts/mod/file2alias.c                      |    8 +-
 21 files changed, 2496 insertions(+), 2281 deletions(-)
 create mode 100644 drivers/vfio/pci/vfio_pci_core.c
 rename drivers/vfio/pci/vfio_pci_private.h => include/linux/vfio_pci_core.h (56%)

-- 
2.18.1



* [PATCH 01/12] vfio/pci: Rename vfio_pci.c to vfio_pci_core.c
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
@ 2021-07-21 16:15 ` Yishai Hadas
  2021-07-21 16:15 ` [PATCH 02/12] vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h Yishai Hadas
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:15 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

This is a preparation patch for separating the vfio_pci driver into a
subsystem driver and a generic PCI driver. This patch doesn't change any
logic.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/Makefile                        | 2 +-
 drivers/vfio/pci/{vfio_pci.c => vfio_pci_core.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename drivers/vfio/pci/{vfio_pci.c => vfio_pci_core.c} (100%)

diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 3ff42093962f..66a40488e967 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
-vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-pci-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
 vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 vfio-pci-$(CONFIG_S390) += vfio_pci_zdev.o
 
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci_core.c
similarity index 100%
rename from drivers/vfio/pci/vfio_pci.c
rename to drivers/vfio/pci/vfio_pci_core.c
-- 
2.18.1



* [PATCH 02/12] vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
  2021-07-21 16:15 ` [PATCH 01/12] vfio/pci: Rename vfio_pci.c to vfio_pci_core.c Yishai Hadas
@ 2021-07-21 16:15 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 03/12] vfio/pci: Rename vfio_pci_device to vfio_pci_core_device Yishai Hadas
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:15 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

This is a preparation patch for separating the vfio_pci driver into a
subsystem driver and a generic PCI driver. This patch doesn't change any
logic.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/vfio_pci_config.c                       | 2 +-
 drivers/vfio/pci/vfio_pci_core.c                         | 2 +-
 drivers/vfio/pci/{vfio_pci_private.h => vfio_pci_core.h} | 6 +++---
 drivers/vfio/pci/vfio_pci_igd.c                          | 2 +-
 drivers/vfio/pci/vfio_pci_intrs.c                        | 2 +-
 drivers/vfio/pci/vfio_pci_rdwr.c                         | 2 +-
 drivers/vfio/pci/vfio_pci_zdev.c                         | 2 +-
 7 files changed, 9 insertions(+), 9 deletions(-)
 rename drivers/vfio/pci/{vfio_pci_private.h => vfio_pci_core.h} (98%)

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 70e28efbc51f..0bc269c0b03f 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -26,7 +26,7 @@
 #include <linux/vfio.h>
 #include <linux/slab.h>
 
-#include "vfio_pci_private.h"
+#include "vfio_pci_core.h"
 
 /* Fake capability ID for standard config space */
 #define PCI_CAP_ID_BASIC	0
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index d751d38f2175..51eb96375e98 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -28,7 +28,7 @@
 #include <linux/nospec.h>
 #include <linux/sched/mm.h>
 
-#include "vfio_pci_private.h"
+#include "vfio_pci_core.h"
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_core.h
similarity index 98%
rename from drivers/vfio/pci/vfio_pci_private.h
rename to drivers/vfio/pci/vfio_pci_core.h
index 70414b6c904d..ef26e781961d 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_core.h
@@ -15,8 +15,8 @@
 #include <linux/uuid.h>
 #include <linux/notifier.h>
 
-#ifndef VFIO_PCI_PRIVATE_H
-#define VFIO_PCI_PRIVATE_H
+#ifndef VFIO_PCI_CORE_H
+#define VFIO_PCI_CORE_H
 
 #define VFIO_PCI_OFFSET_SHIFT   40
 
@@ -205,4 +205,4 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_device *vdev,
 }
 #endif
 
-#endif /* VFIO_PCI_PRIVATE_H */
+#endif /* VFIO_PCI_CORE_H */
diff --git a/drivers/vfio/pci/vfio_pci_igd.c b/drivers/vfio/pci/vfio_pci_igd.c
index aa0a29fd2762..d57c409b4033 100644
--- a/drivers/vfio/pci/vfio_pci_igd.c
+++ b/drivers/vfio/pci/vfio_pci_igd.c
@@ -15,7 +15,7 @@
 #include <linux/uaccess.h>
 #include <linux/vfio.h>
 
-#include "vfio_pci_private.h"
+#include "vfio_pci_core.h"
 
 #define OPREGION_SIGNATURE	"IntelGraphicsMem"
 #define OPREGION_SIZE		(8 * 1024)
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 869dce5f134d..df1e8c8c274c 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -20,7 +20,7 @@
 #include <linux/wait.h>
 #include <linux/slab.h>
 
-#include "vfio_pci_private.h"
+#include "vfio_pci_core.h"
 
 /*
  * INTx
diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index a0b5fc8e46f4..667e82726e75 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -17,7 +17,7 @@
 #include <linux/vfio.h>
 #include <linux/vgaarb.h>
 
-#include "vfio_pci_private.h"
+#include "vfio_pci_core.h"
 
 #ifdef __LITTLE_ENDIAN
 #define vfio_ioread64	ioread64
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 7b011b62c766..ecae0c3d95a0 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -19,7 +19,7 @@
 #include <asm/pci_clp.h>
 #include <asm/pci_io.h>
 
-#include "vfio_pci_private.h"
+#include "vfio_pci_core.h"
 
 /*
  * Add the Base PCI Function information to the device info region.
-- 
2.18.1



* [PATCH 03/12] vfio/pci: Rename vfio_pci_device to vfio_pci_core_device
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
  2021-07-21 16:15 ` [PATCH 01/12] vfio/pci: Rename vfio_pci.c to vfio_pci_core.c Yishai Hadas
  2021-07-21 16:15 ` [PATCH 02/12] vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 04/12] vfio/pci: Rename ops functions to fit core namings Yishai Hadas
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

This is a preparation patch for separating the vfio_pci driver into a
subsystem driver and a generic PCI driver. This patch doesn't change any
logic.

The new vfio_pci_core_device structure will be the main structure of the
core driver, and later on the vfio_pci_device structure will be the main
structure of the generic vfio_pci driver.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/vfio_pci_config.c |  68 ++++++++--------
 drivers/vfio/pci/vfio_pci_core.c   | 123 +++++++++++++++--------------
 drivers/vfio/pci/vfio_pci_core.h   |  52 ++++++------
 drivers/vfio/pci/vfio_pci_igd.c    |  17 ++--
 drivers/vfio/pci/vfio_pci_intrs.c  |  40 +++++-----
 drivers/vfio/pci/vfio_pci_rdwr.c   |  16 ++--
 drivers/vfio/pci/vfio_pci_zdev.c   |   2 +-
 7 files changed, 160 insertions(+), 158 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 0bc269c0b03f..1f034f768a27 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -108,9 +108,9 @@ static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] = {
 struct perm_bits {
 	u8	*virt;		/* read/write virtual data, not hw */
 	u8	*write;		/* writeable bits */
-	int	(*readfn)(struct vfio_pci_device *vdev, int pos, int count,
+	int	(*readfn)(struct vfio_pci_core_device *vdev, int pos, int count,
 			  struct perm_bits *perm, int offset, __le32 *val);
-	int	(*writefn)(struct vfio_pci_device *vdev, int pos, int count,
+	int	(*writefn)(struct vfio_pci_core_device *vdev, int pos, int count,
 			   struct perm_bits *perm, int offset, __le32 val);
 };
 
@@ -171,7 +171,7 @@ static int vfio_user_config_write(struct pci_dev *pdev, int offset,
 	return ret;
 }
 
-static int vfio_default_config_read(struct vfio_pci_device *vdev, int pos,
+static int vfio_default_config_read(struct vfio_pci_core_device *vdev, int pos,
 				    int count, struct perm_bits *perm,
 				    int offset, __le32 *val)
 {
@@ -197,7 +197,7 @@ static int vfio_default_config_read(struct vfio_pci_device *vdev, int pos,
 	return count;
 }
 
-static int vfio_default_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_default_config_write(struct vfio_pci_core_device *vdev, int pos,
 				     int count, struct perm_bits *perm,
 				     int offset, __le32 val)
 {
@@ -244,7 +244,7 @@ static int vfio_default_config_write(struct vfio_pci_device *vdev, int pos,
 }
 
 /* Allow direct read from hardware, except for capability next pointer */
-static int vfio_direct_config_read(struct vfio_pci_device *vdev, int pos,
+static int vfio_direct_config_read(struct vfio_pci_core_device *vdev, int pos,
 				   int count, struct perm_bits *perm,
 				   int offset, __le32 *val)
 {
@@ -269,7 +269,7 @@ static int vfio_direct_config_read(struct vfio_pci_device *vdev, int pos,
 }
 
 /* Raw access skips any kind of virtualization */
-static int vfio_raw_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_raw_config_write(struct vfio_pci_core_device *vdev, int pos,
 				 int count, struct perm_bits *perm,
 				 int offset, __le32 val)
 {
@@ -282,7 +282,7 @@ static int vfio_raw_config_write(struct vfio_pci_device *vdev, int pos,
 	return count;
 }
 
-static int vfio_raw_config_read(struct vfio_pci_device *vdev, int pos,
+static int vfio_raw_config_read(struct vfio_pci_core_device *vdev, int pos,
 				int count, struct perm_bits *perm,
 				int offset, __le32 *val)
 {
@@ -296,7 +296,7 @@ static int vfio_raw_config_read(struct vfio_pci_device *vdev, int pos,
 }
 
 /* Virt access uses only virtualization */
-static int vfio_virt_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_virt_config_write(struct vfio_pci_core_device *vdev, int pos,
 				  int count, struct perm_bits *perm,
 				  int offset, __le32 val)
 {
@@ -304,7 +304,7 @@ static int vfio_virt_config_write(struct vfio_pci_device *vdev, int pos,
 	return count;
 }
 
-static int vfio_virt_config_read(struct vfio_pci_device *vdev, int pos,
+static int vfio_virt_config_read(struct vfio_pci_core_device *vdev, int pos,
 				 int count, struct perm_bits *perm,
 				 int offset, __le32 *val)
 {
@@ -396,7 +396,7 @@ static inline void p_setd(struct perm_bits *p, int off, u32 virt, u32 write)
 }
 
 /* Caller should hold memory_lock semaphore */
-bool __vfio_pci_memory_enabled(struct vfio_pci_device *vdev)
+bool __vfio_pci_memory_enabled(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u16 cmd = le16_to_cpu(*(__le16 *)&vdev->vconfig[PCI_COMMAND]);
@@ -413,7 +413,7 @@ bool __vfio_pci_memory_enabled(struct vfio_pci_device *vdev)
  * Restore the *real* BARs after we detect a FLR or backdoor reset.
  * (backdoor = some device specific technique that we didn't catch)
  */
-static void vfio_bar_restore(struct vfio_pci_device *vdev)
+static void vfio_bar_restore(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u32 *rbar = vdev->rbar;
@@ -460,7 +460,7 @@ static __le32 vfio_generate_bar_flags(struct pci_dev *pdev, int bar)
  * Pretend we're hardware and tweak the values of the *virtual* PCI BARs
  * to reflect the hardware capabilities.  This implements BAR sizing.
  */
-static void vfio_bar_fixup(struct vfio_pci_device *vdev)
+static void vfio_bar_fixup(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	int i;
@@ -514,7 +514,7 @@ static void vfio_bar_fixup(struct vfio_pci_device *vdev)
 	vdev->bardirty = false;
 }
 
-static int vfio_basic_config_read(struct vfio_pci_device *vdev, int pos,
+static int vfio_basic_config_read(struct vfio_pci_core_device *vdev, int pos,
 				  int count, struct perm_bits *perm,
 				  int offset, __le32 *val)
 {
@@ -536,7 +536,7 @@ static int vfio_basic_config_read(struct vfio_pci_device *vdev, int pos,
 }
 
 /* Test whether BARs match the value we think they should contain */
-static bool vfio_need_bar_restore(struct vfio_pci_device *vdev)
+static bool vfio_need_bar_restore(struct vfio_pci_core_device *vdev)
 {
 	int i = 0, pos = PCI_BASE_ADDRESS_0, ret;
 	u32 bar;
@@ -552,7 +552,7 @@ static bool vfio_need_bar_restore(struct vfio_pci_device *vdev)
 	return false;
 }
 
-static int vfio_basic_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_basic_config_write(struct vfio_pci_core_device *vdev, int pos,
 				   int count, struct perm_bits *perm,
 				   int offset, __le32 val)
 {
@@ -692,7 +692,7 @@ static int __init init_pci_cap_basic_perm(struct perm_bits *perm)
 	return 0;
 }
 
-static int vfio_pm_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_pm_config_write(struct vfio_pci_core_device *vdev, int pos,
 				int count, struct perm_bits *perm,
 				int offset, __le32 val)
 {
@@ -747,7 +747,7 @@ static int __init init_pci_cap_pm_perm(struct perm_bits *perm)
 	return 0;
 }
 
-static int vfio_vpd_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_vpd_config_write(struct vfio_pci_core_device *vdev, int pos,
 				 int count, struct perm_bits *perm,
 				 int offset, __le32 val)
 {
@@ -829,7 +829,7 @@ static int __init init_pci_cap_pcix_perm(struct perm_bits *perm)
 	return 0;
 }
 
-static int vfio_exp_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_exp_config_write(struct vfio_pci_core_device *vdev, int pos,
 				 int count, struct perm_bits *perm,
 				 int offset, __le32 val)
 {
@@ -913,7 +913,7 @@ static int __init init_pci_cap_exp_perm(struct perm_bits *perm)
 	return 0;
 }
 
-static int vfio_af_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_af_config_write(struct vfio_pci_core_device *vdev, int pos,
 				int count, struct perm_bits *perm,
 				int offset, __le32 val)
 {
@@ -1072,7 +1072,7 @@ int __init vfio_pci_init_perm_bits(void)
 	return ret;
 }
 
-static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
+static int vfio_find_cap_start(struct vfio_pci_core_device *vdev, int pos)
 {
 	u8 cap;
 	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
@@ -1089,7 +1089,7 @@ static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
 	return pos;
 }
 
-static int vfio_msi_config_read(struct vfio_pci_device *vdev, int pos,
+static int vfio_msi_config_read(struct vfio_pci_core_device *vdev, int pos,
 				int count, struct perm_bits *perm,
 				int offset, __le32 *val)
 {
@@ -1109,7 +1109,7 @@ static int vfio_msi_config_read(struct vfio_pci_device *vdev, int pos,
 	return vfio_default_config_read(vdev, pos, count, perm, offset, val);
 }
 
-static int vfio_msi_config_write(struct vfio_pci_device *vdev, int pos,
+static int vfio_msi_config_write(struct vfio_pci_core_device *vdev, int pos,
 				 int count, struct perm_bits *perm,
 				 int offset, __le32 val)
 {
@@ -1189,7 +1189,7 @@ static int init_pci_cap_msi_perm(struct perm_bits *perm, int len, u16 flags)
 }
 
 /* Determine MSI CAP field length; initialize msi_perms on 1st call per vdev */
-static int vfio_msi_cap_len(struct vfio_pci_device *vdev, u8 pos)
+static int vfio_msi_cap_len(struct vfio_pci_core_device *vdev, u8 pos)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	int len, ret;
@@ -1222,7 +1222,7 @@ static int vfio_msi_cap_len(struct vfio_pci_device *vdev, u8 pos)
 }
 
 /* Determine extended capability length for VC (2 & 9) and MFVC */
-static int vfio_vc_cap_len(struct vfio_pci_device *vdev, u16 pos)
+static int vfio_vc_cap_len(struct vfio_pci_core_device *vdev, u16 pos)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u32 tmp;
@@ -1263,7 +1263,7 @@ static int vfio_vc_cap_len(struct vfio_pci_device *vdev, u16 pos)
 	return len;
 }
 
-static int vfio_cap_len(struct vfio_pci_device *vdev, u8 cap, u8 pos)
+static int vfio_cap_len(struct vfio_pci_core_device *vdev, u8 cap, u8 pos)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u32 dword;
@@ -1338,7 +1338,7 @@ static int vfio_cap_len(struct vfio_pci_device *vdev, u8 cap, u8 pos)
 	return 0;
 }
 
-static int vfio_ext_cap_len(struct vfio_pci_device *vdev, u16 ecap, u16 epos)
+static int vfio_ext_cap_len(struct vfio_pci_core_device *vdev, u16 ecap, u16 epos)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u8 byte;
@@ -1412,7 +1412,7 @@ static int vfio_ext_cap_len(struct vfio_pci_device *vdev, u16 ecap, u16 epos)
 	return 0;
 }
 
-static int vfio_fill_vconfig_bytes(struct vfio_pci_device *vdev,
+static int vfio_fill_vconfig_bytes(struct vfio_pci_core_device *vdev,
 				   int offset, int size)
 {
 	struct pci_dev *pdev = vdev->pdev;
@@ -1459,7 +1459,7 @@ static int vfio_fill_vconfig_bytes(struct vfio_pci_device *vdev,
 	return ret;
 }
 
-static int vfio_cap_init(struct vfio_pci_device *vdev)
+static int vfio_cap_init(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u8 *map = vdev->pci_config_map;
@@ -1549,7 +1549,7 @@ static int vfio_cap_init(struct vfio_pci_device *vdev)
 	return 0;
 }
 
-static int vfio_ecap_init(struct vfio_pci_device *vdev)
+static int vfio_ecap_init(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u8 *map = vdev->pci_config_map;
@@ -1669,7 +1669,7 @@ static const struct pci_device_id known_bogus_vf_intx_pin[] = {
  * for each area requiring emulated bits, but the array of pointers
  * would be comparable in size (at least for standard config space).
  */
-int vfio_config_init(struct vfio_pci_device *vdev)
+int vfio_config_init(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u8 *map, *vconfig;
@@ -1773,7 +1773,7 @@ int vfio_config_init(struct vfio_pci_device *vdev)
 	return pcibios_err_to_errno(ret);
 }
 
-void vfio_config_free(struct vfio_pci_device *vdev)
+void vfio_config_free(struct vfio_pci_core_device *vdev)
 {
 	kfree(vdev->vconfig);
 	vdev->vconfig = NULL;
@@ -1790,7 +1790,7 @@ void vfio_config_free(struct vfio_pci_device *vdev)
  * Find the remaining number of bytes in a dword that match the given
  * position.  Stop at either the end of the capability or the dword boundary.
  */
-static size_t vfio_pci_cap_remaining_dword(struct vfio_pci_device *vdev,
+static size_t vfio_pci_cap_remaining_dword(struct vfio_pci_core_device *vdev,
 					   loff_t pos)
 {
 	u8 cap = vdev->pci_config_map[pos];
@@ -1802,7 +1802,7 @@ static size_t vfio_pci_cap_remaining_dword(struct vfio_pci_device *vdev,
 	return i;
 }
 
-static ssize_t vfio_config_do_rw(struct vfio_pci_device *vdev, char __user *buf,
+static ssize_t vfio_config_do_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 				 size_t count, loff_t *ppos, bool iswrite)
 {
 	struct pci_dev *pdev = vdev->pdev;
@@ -1885,7 +1885,7 @@ static ssize_t vfio_config_do_rw(struct vfio_pci_device *vdev, char __user *buf,
 	return ret;
 }
 
-ssize_t vfio_pci_config_rw(struct vfio_pci_device *vdev, char __user *buf,
+ssize_t vfio_pci_config_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 			   size_t count, loff_t *ppos, bool iswrite)
 {
 	size_t done = 0;
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 51eb96375e98..6f95cd842545 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -121,7 +121,7 @@ static bool vfio_pci_is_denylisted(struct pci_dev *pdev)
  */
 static unsigned int vfio_pci_set_vga_decode(void *opaque, bool single_vga)
 {
-	struct vfio_pci_device *vdev = opaque;
+	struct vfio_pci_core_device *vdev = opaque;
 	struct pci_dev *tmp = NULL, *pdev = vdev->pdev;
 	unsigned char max_busnr;
 	unsigned int decodes;
@@ -155,7 +155,7 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
 	return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
 }
 
-static void vfio_pci_probe_mmaps(struct vfio_pci_device *vdev)
+static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev)
 {
 	struct resource *res;
 	int i;
@@ -224,9 +224,9 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device *vdev)
 }
 
 struct vfio_pci_group_info;
-static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev);
-static void vfio_pci_disable(struct vfio_pci_device *vdev);
-static int vfio_hot_reset_device_set(struct vfio_pci_device *vdev,
+static void vfio_pci_try_bus_reset(struct vfio_pci_core_device *vdev);
+static void vfio_pci_disable(struct vfio_pci_core_device *vdev);
+static int vfio_hot_reset_device_set(struct vfio_pci_core_device *vdev,
 				     struct vfio_pci_group_info *groups);
 
 /*
@@ -260,7 +260,7 @@ static bool vfio_pci_nointx(struct pci_dev *pdev)
 	return false;
 }
 
-static void vfio_pci_probe_power_state(struct vfio_pci_device *vdev)
+static void vfio_pci_probe_power_state(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	u16 pmcsr;
@@ -280,7 +280,7 @@ static void vfio_pci_probe_power_state(struct vfio_pci_device *vdev)
  * by PM capability emulation and separately from pci_dev internal saved state
  * to avoid it being overwritten and consumed around other resets.
  */
-int vfio_pci_set_power_state(struct vfio_pci_device *vdev, pci_power_t state)
+int vfio_pci_set_power_state(struct vfio_pci_core_device *vdev, pci_power_t state)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	bool needs_restore = false, needs_save = false;
@@ -311,7 +311,7 @@ int vfio_pci_set_power_state(struct vfio_pci_device *vdev, pci_power_t state)
 	return ret;
 }
 
-static int vfio_pci_enable(struct vfio_pci_device *vdev)
+static int vfio_pci_enable(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	int ret;
@@ -399,7 +399,7 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
 	return ret;
 }
 
-static void vfio_pci_disable(struct vfio_pci_device *vdev)
+static void vfio_pci_disable(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	struct vfio_pci_dummy_resource *dummy_res, *tmp;
@@ -500,7 +500,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 
 static struct pci_driver vfio_pci_driver;
 
-static struct vfio_pci_device *get_pf_vdev(struct vfio_pci_device *vdev)
+static struct vfio_pci_core_device *get_pf_vdev(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *physfn = pci_physfn(vdev->pdev);
 	struct vfio_device *pf_dev;
@@ -517,12 +517,12 @@ static struct vfio_pci_device *get_pf_vdev(struct vfio_pci_device *vdev)
 		return NULL;
 	}
 
-	return container_of(pf_dev, struct vfio_pci_device, vdev);
+	return container_of(pf_dev, struct vfio_pci_core_device, vdev);
 }
 
-static void vfio_pci_vf_token_user_add(struct vfio_pci_device *vdev, int val)
+static void vfio_pci_vf_token_user_add(struct vfio_pci_core_device *vdev, int val)
 {
-	struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev);
+	struct vfio_pci_core_device *pf_vdev = get_pf_vdev(vdev);
 
 	if (!pf_vdev)
 		return;
@@ -537,8 +537,8 @@ static void vfio_pci_vf_token_user_add(struct vfio_pci_device *vdev, int val)
 
 static void vfio_pci_close_device(struct vfio_device *core_vdev)
 {
-	struct vfio_pci_device *vdev =
-		container_of(core_vdev, struct vfio_pci_device, vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 
 	vfio_pci_vf_token_user_add(vdev, -1);
 	vfio_spapr_pci_eeh_release(vdev->pdev);
@@ -558,8 +558,8 @@ static void vfio_pci_close_device(struct vfio_device *core_vdev)
 
 static int vfio_pci_open_device(struct vfio_device *core_vdev)
 {
-	struct vfio_pci_device *vdev =
-		container_of(core_vdev, struct vfio_pci_device, vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 	int ret = 0;
 
 	ret = vfio_pci_enable(vdev);
@@ -571,7 +571,7 @@ static int vfio_pci_open_device(struct vfio_device *core_vdev)
 	return 0;
 }
 
-static int vfio_pci_get_irq_count(struct vfio_pci_device *vdev, int irq_type)
+static int vfio_pci_get_irq_count(struct vfio_pci_core_device *vdev, int irq_type)
 {
 	if (irq_type == VFIO_PCI_INTX_IRQ_INDEX) {
 		u8 pin;
@@ -692,7 +692,7 @@ static int vfio_pci_for_each_slot_or_bus(struct pci_dev *pdev,
 	return walk.ret;
 }
 
-static int msix_mmappable_cap(struct vfio_pci_device *vdev,
+static int msix_mmappable_cap(struct vfio_pci_core_device *vdev,
 			      struct vfio_info_cap *caps)
 {
 	struct vfio_info_cap_header header = {
@@ -703,7 +703,7 @@ static int msix_mmappable_cap(struct vfio_pci_device *vdev,
 	return vfio_info_add_capability(caps, &header, sizeof(header));
 }
 
-int vfio_pci_register_dev_region(struct vfio_pci_device *vdev,
+int vfio_pci_register_dev_region(struct vfio_pci_core_device *vdev,
 				 unsigned int type, unsigned int subtype,
 				 const struct vfio_pci_regops *ops,
 				 size_t size, u32 flags, void *data)
@@ -732,8 +732,8 @@ int vfio_pci_register_dev_region(struct vfio_pci_device *vdev,
 static long vfio_pci_ioctl(struct vfio_device *core_vdev,
 			   unsigned int cmd, unsigned long arg)
 {
-	struct vfio_pci_device *vdev =
-		container_of(core_vdev, struct vfio_pci_device, vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 	unsigned long minsz;
 
 	if (cmd == VFIO_DEVICE_GET_INFO) {
@@ -1273,7 +1273,7 @@ static long vfio_pci_ioctl(struct vfio_device *core_vdev,
 	return -ENOTTY;
 }
 
-static ssize_t vfio_pci_rw(struct vfio_pci_device *vdev, char __user *buf,
+static ssize_t vfio_pci_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 			   size_t count, loff_t *ppos, bool iswrite)
 {
 	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
@@ -1307,8 +1307,8 @@ static ssize_t vfio_pci_rw(struct vfio_pci_device *vdev, char __user *buf,
 static ssize_t vfio_pci_read(struct vfio_device *core_vdev, char __user *buf,
 			     size_t count, loff_t *ppos)
 {
-	struct vfio_pci_device *vdev =
-		container_of(core_vdev, struct vfio_pci_device, vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 
 	if (!count)
 		return 0;
@@ -1319,8 +1319,8 @@ static ssize_t vfio_pci_read(struct vfio_device *core_vdev, char __user *buf,
 static ssize_t vfio_pci_write(struct vfio_device *core_vdev, const char __user *buf,
 			      size_t count, loff_t *ppos)
 {
-	struct vfio_pci_device *vdev =
-		container_of(core_vdev, struct vfio_pci_device, vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 
 	if (!count)
 		return 0;
@@ -1329,7 +1329,7 @@ static ssize_t vfio_pci_write(struct vfio_device *core_vdev, const char __user *
 }
 
 /* Return 1 on zap and vma_lock acquired, 0 on contention (only with @try) */
-static int vfio_pci_zap_and_vma_lock(struct vfio_pci_device *vdev, bool try)
+static int vfio_pci_zap_and_vma_lock(struct vfio_pci_core_device *vdev, bool try)
 {
 	struct vfio_pci_mmap_vma *mmap_vma, *tmp;
 
@@ -1417,14 +1417,14 @@ static int vfio_pci_zap_and_vma_lock(struct vfio_pci_device *vdev, bool try)
 	}
 }
 
-void vfio_pci_zap_and_down_write_memory_lock(struct vfio_pci_device *vdev)
+void vfio_pci_zap_and_down_write_memory_lock(struct vfio_pci_core_device *vdev)
 {
 	vfio_pci_zap_and_vma_lock(vdev, false);
 	down_write(&vdev->memory_lock);
 	mutex_unlock(&vdev->vma_lock);
 }
 
-u16 vfio_pci_memory_lock_and_enable(struct vfio_pci_device *vdev)
+u16 vfio_pci_memory_lock_and_enable(struct vfio_pci_core_device *vdev)
 {
 	u16 cmd;
 
@@ -1437,14 +1437,14 @@ u16 vfio_pci_memory_lock_and_enable(struct vfio_pci_device *vdev)
 	return cmd;
 }
 
-void vfio_pci_memory_unlock_and_restore(struct vfio_pci_device *vdev, u16 cmd)
+void vfio_pci_memory_unlock_and_restore(struct vfio_pci_core_device *vdev, u16 cmd)
 {
 	pci_write_config_word(vdev->pdev, PCI_COMMAND, cmd);
 	up_write(&vdev->memory_lock);
 }
 
 /* Caller holds vma_lock */
-static int __vfio_pci_add_vma(struct vfio_pci_device *vdev,
+static int __vfio_pci_add_vma(struct vfio_pci_core_device *vdev,
 			      struct vm_area_struct *vma)
 {
 	struct vfio_pci_mmap_vma *mmap_vma;
@@ -1470,7 +1470,7 @@ static void vfio_pci_mmap_open(struct vm_area_struct *vma)
 
 static void vfio_pci_mmap_close(struct vm_area_struct *vma)
 {
-	struct vfio_pci_device *vdev = vma->vm_private_data;
+	struct vfio_pci_core_device *vdev = vma->vm_private_data;
 	struct vfio_pci_mmap_vma *mmap_vma;
 
 	mutex_lock(&vdev->vma_lock);
@@ -1487,7 +1487,7 @@ static void vfio_pci_mmap_close(struct vm_area_struct *vma)
 static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
-	struct vfio_pci_device *vdev = vma->vm_private_data;
+	struct vfio_pci_core_device *vdev = vma->vm_private_data;
 	struct vfio_pci_mmap_vma *mmap_vma;
 	vm_fault_t ret = VM_FAULT_NOPAGE;
 
@@ -1537,8 +1537,8 @@ static const struct vm_operations_struct vfio_pci_mmap_ops = {
 
 static int vfio_pci_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
 {
-	struct vfio_pci_device *vdev =
-		container_of(core_vdev, struct vfio_pci_device, vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 	struct pci_dev *pdev = vdev->pdev;
 	unsigned int index;
 	u64 phys_len, req_len, pgoff, req_start;
@@ -1608,8 +1608,8 @@ static int vfio_pci_mmap(struct vfio_device *core_vdev, struct vm_area_struct *v
 
 static void vfio_pci_request(struct vfio_device *core_vdev, unsigned int count)
 {
-	struct vfio_pci_device *vdev =
-		container_of(core_vdev, struct vfio_pci_device, vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 	struct pci_dev *pdev = vdev->pdev;
 
 	mutex_lock(&vdev->igate);
@@ -1628,7 +1628,7 @@ static void vfio_pci_request(struct vfio_device *core_vdev, unsigned int count)
 	mutex_unlock(&vdev->igate);
 }
 
-static int vfio_pci_validate_vf_token(struct vfio_pci_device *vdev,
+static int vfio_pci_validate_vf_token(struct vfio_pci_core_device *vdev,
 				      bool vf_token, uuid_t *uuid)
 {
 	/*
@@ -1660,7 +1660,7 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_device *vdev,
 		return 0; /* No VF token provided or required */
 
 	if (vdev->pdev->is_virtfn) {
-		struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev);
+		struct vfio_pci_core_device *pf_vdev = get_pf_vdev(vdev);
 		bool match;
 
 		if (!pf_vdev) {
@@ -1724,8 +1724,8 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_device *vdev,
 
 static int vfio_pci_match(struct vfio_device *core_vdev, char *buf)
 {
-	struct vfio_pci_device *vdev =
-		container_of(core_vdev, struct vfio_pci_device, vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 	bool vf_token = false;
 	uuid_t uuid;
 	int ret;
@@ -1787,8 +1787,8 @@ static const struct vfio_device_ops vfio_pci_ops = {
 static int vfio_pci_bus_notifier(struct notifier_block *nb,
 				 unsigned long action, void *data)
 {
-	struct vfio_pci_device *vdev = container_of(nb,
-						    struct vfio_pci_device, nb);
+	struct vfio_pci_core_device *vdev = container_of(nb,
+						    struct vfio_pci_core_device, nb);
 	struct device *dev = data;
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct pci_dev *physfn = pci_physfn(pdev);
@@ -1812,7 +1812,7 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb,
 	return 0;
 }
 
-static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
+static int vfio_pci_vf_init(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	int ret;
@@ -1836,7 +1836,7 @@ static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
 	return 0;
 }
 
-static void vfio_pci_vf_uninit(struct vfio_pci_device *vdev)
+static void vfio_pci_vf_uninit(struct vfio_pci_core_device *vdev)
 {
 	if (!vdev->vf_token)
 		return;
@@ -1847,7 +1847,7 @@ static void vfio_pci_vf_uninit(struct vfio_pci_device *vdev)
 	kfree(vdev->vf_token);
 }
 
-static int vfio_pci_vga_init(struct vfio_pci_device *vdev)
+static int vfio_pci_vga_init(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	int ret;
@@ -1862,7 +1862,7 @@ static int vfio_pci_vga_init(struct vfio_pci_device *vdev)
 	return 0;
 }
 
-static void vfio_pci_vga_uninit(struct vfio_pci_device *vdev)
+static void vfio_pci_vga_uninit(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 
@@ -1876,7 +1876,7 @@ static void vfio_pci_vga_uninit(struct vfio_pci_device *vdev)
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
-	struct vfio_pci_device *vdev;
+	struct vfio_pci_core_device *vdev;
 	struct iommu_group *group;
 	int ret;
 
@@ -1974,7 +1974,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 static void vfio_pci_remove(struct pci_dev *pdev)
 {
-	struct vfio_pci_device *vdev = dev_get_drvdata(&pdev->dev);
+	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
 
 	pci_disable_sriov(pdev);
 
@@ -1998,14 +1998,14 @@ static void vfio_pci_remove(struct pci_dev *pdev)
 static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 						  pci_channel_state_t state)
 {
-	struct vfio_pci_device *vdev;
+	struct vfio_pci_core_device *vdev;
 	struct vfio_device *device;
 
 	device = vfio_device_get_from_dev(&pdev->dev);
 	if (device == NULL)
 		return PCI_ERS_RESULT_DISCONNECT;
 
-	vdev = container_of(device, struct vfio_pci_device, vdev);
+	vdev = container_of(device, struct vfio_pci_core_device, vdev);
 
 	mutex_lock(&vdev->igate);
 
@@ -2069,7 +2069,7 @@ static int vfio_pci_check_all_devices_bound(struct pci_dev *pdev, void *data)
 	return -EBUSY;
 }
 
-static bool vfio_dev_in_groups(struct vfio_pci_device *vdev,
+static bool vfio_dev_in_groups(struct vfio_pci_core_device *vdev,
 			       struct vfio_pci_group_info *groups)
 {
 	unsigned int i;
@@ -2085,19 +2085,20 @@ static bool vfio_dev_in_groups(struct vfio_pci_device *vdev,
  * therefore we need to zap and hold the vma_lock for each device, and only then
  * get each memory_lock.
  */
-static int vfio_hot_reset_device_set(struct vfio_pci_device *vdev,
+static int vfio_hot_reset_device_set(struct vfio_pci_core_device *vdev,
 				     struct vfio_pci_group_info *groups)
 {
 	struct vfio_device_set *dev_set = vdev->vdev.dev_set;
-	struct vfio_pci_device *cur_mem;
-	struct vfio_pci_device *cur_vma;
-	struct vfio_pci_device *cur;
+	struct vfio_pci_core_device *cur_mem;
+	struct vfio_pci_core_device *cur_vma;
+	struct vfio_pci_core_device *cur;
 	bool is_mem = true;
 	int ret;
 
 	mutex_lock(&dev_set->lock);
 	cur_mem = list_first_entry(&dev_set->device_list,
-				   struct vfio_pci_device, vdev.dev_set_list);
+				   struct vfio_pci_core_device,
+				   vdev.dev_set_list);
 
 	/* All devices in the group to be reset need VFIO devices */
 	if (vfio_pci_for_each_slot_or_bus(
@@ -2170,11 +2171,11 @@ static int vfio_hot_reset_device_set(struct vfio_pci_device *vdev,
  * to be bound to vfio_pci since that's the only way we can be sure they
  * stay put.
  */
-static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
+static void vfio_pci_try_bus_reset(struct vfio_pci_core_device *vdev)
 {
 	struct vfio_device_set *dev_set = vdev->vdev.dev_set;
-	struct vfio_pci_device *to_reset = NULL;
-	struct vfio_pci_device *cur;
+	struct vfio_pci_core_device *to_reset = NULL;
+	struct vfio_pci_core_device *cur;
 	int ret;
 
 	if (pci_probe_reset_slot(vdev->pdev->slot) &&
diff --git a/drivers/vfio/pci/vfio_pci_core.h b/drivers/vfio/pci/vfio_pci_core.h
index ef26e781961d..2ceaa6e4ca25 100644
--- a/drivers/vfio/pci/vfio_pci_core.h
+++ b/drivers/vfio/pci/vfio_pci_core.h
@@ -33,7 +33,7 @@
 
 struct vfio_pci_ioeventfd {
 	struct list_head	next;
-	struct vfio_pci_device	*vdev;
+	struct vfio_pci_core_device	*vdev;
 	struct virqfd		*virqfd;
 	void __iomem		*addr;
 	uint64_t		data;
@@ -52,18 +52,18 @@ struct vfio_pci_irq_ctx {
 	struct irq_bypass_producer	producer;
 };
 
-struct vfio_pci_device;
+struct vfio_pci_core_device;
 struct vfio_pci_region;
 
 struct vfio_pci_regops {
-	ssize_t	(*rw)(struct vfio_pci_device *vdev, char __user *buf,
+	ssize_t (*rw)(struct vfio_pci_core_device *vdev, char __user *buf,
 		      size_t count, loff_t *ppos, bool iswrite);
-	void	(*release)(struct vfio_pci_device *vdev,
+	void	(*release)(struct vfio_pci_core_device *vdev,
 			   struct vfio_pci_region *region);
-	int	(*mmap)(struct vfio_pci_device *vdev,
+	int	(*mmap)(struct vfio_pci_core_device *vdev,
 			struct vfio_pci_region *region,
 			struct vm_area_struct *vma);
-	int	(*add_capability)(struct vfio_pci_device *vdev,
+	int	(*add_capability)(struct vfio_pci_core_device *vdev,
 				  struct vfio_pci_region *region,
 				  struct vfio_info_cap *caps);
 };
@@ -94,7 +94,7 @@ struct vfio_pci_mmap_vma {
 	struct list_head	vma_next;
 };
 
-struct vfio_pci_device {
+struct vfio_pci_core_device {
 	struct vfio_device	vdev;
 	struct pci_dev		*pdev;
 	void __iomem		*barmap[PCI_STD_NUM_BARS];
@@ -144,61 +144,61 @@ struct vfio_pci_device {
 #define is_irq_none(vdev) (!(is_intx(vdev) || is_msi(vdev) || is_msix(vdev)))
 #define irq_is(vdev, type) (vdev->irq_type == type)
 
-extern void vfio_pci_intx_mask(struct vfio_pci_device *vdev);
-extern void vfio_pci_intx_unmask(struct vfio_pci_device *vdev);
+extern void vfio_pci_intx_mask(struct vfio_pci_core_device *vdev);
+extern void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev);
 
-extern int vfio_pci_set_irqs_ioctl(struct vfio_pci_device *vdev,
+extern int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev,
 				   uint32_t flags, unsigned index,
 				   unsigned start, unsigned count, void *data);
 
-extern ssize_t vfio_pci_config_rw(struct vfio_pci_device *vdev,
+extern ssize_t vfio_pci_config_rw(struct vfio_pci_core_device *vdev,
 				  char __user *buf, size_t count,
 				  loff_t *ppos, bool iswrite);
 
-extern ssize_t vfio_pci_bar_rw(struct vfio_pci_device *vdev, char __user *buf,
+extern ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 			       size_t count, loff_t *ppos, bool iswrite);
 
-extern ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user *buf,
+extern ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 			       size_t count, loff_t *ppos, bool iswrite);
 
-extern long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset,
+extern long vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset,
 			       uint64_t data, int count, int fd);
 
 extern int vfio_pci_init_perm_bits(void);
 extern void vfio_pci_uninit_perm_bits(void);
 
-extern int vfio_config_init(struct vfio_pci_device *vdev);
-extern void vfio_config_free(struct vfio_pci_device *vdev);
+extern int vfio_config_init(struct vfio_pci_core_device *vdev);
+extern void vfio_config_free(struct vfio_pci_core_device *vdev);
 
-extern int vfio_pci_register_dev_region(struct vfio_pci_device *vdev,
+extern int vfio_pci_register_dev_region(struct vfio_pci_core_device *vdev,
 					unsigned int type, unsigned int subtype,
 					const struct vfio_pci_regops *ops,
 					size_t size, u32 flags, void *data);
 
-extern int vfio_pci_set_power_state(struct vfio_pci_device *vdev,
+extern int vfio_pci_set_power_state(struct vfio_pci_core_device *vdev,
 				    pci_power_t state);
 
-extern bool __vfio_pci_memory_enabled(struct vfio_pci_device *vdev);
-extern void vfio_pci_zap_and_down_write_memory_lock(struct vfio_pci_device
+extern bool __vfio_pci_memory_enabled(struct vfio_pci_core_device *vdev);
+extern void vfio_pci_zap_and_down_write_memory_lock(struct vfio_pci_core_device
 						    *vdev);
-extern u16 vfio_pci_memory_lock_and_enable(struct vfio_pci_device *vdev);
-extern void vfio_pci_memory_unlock_and_restore(struct vfio_pci_device *vdev,
+extern u16 vfio_pci_memory_lock_and_enable(struct vfio_pci_core_device *vdev);
+extern void vfio_pci_memory_unlock_and_restore(struct vfio_pci_core_device *vdev,
 					       u16 cmd);
 
 #ifdef CONFIG_VFIO_PCI_IGD
-extern int vfio_pci_igd_init(struct vfio_pci_device *vdev);
+extern int vfio_pci_igd_init(struct vfio_pci_core_device *vdev);
 #else
-static inline int vfio_pci_igd_init(struct vfio_pci_device *vdev)
+static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
 {
 	return -ENODEV;
 }
 #endif
 
 #ifdef CONFIG_S390
-extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_device *vdev,
+extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 				       struct vfio_info_cap *caps);
 #else
-static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_device *vdev,
+static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 					      struct vfio_info_cap *caps)
 {
 	return -ENODEV;
diff --git a/drivers/vfio/pci/vfio_pci_igd.c b/drivers/vfio/pci/vfio_pci_igd.c
index d57c409b4033..a324ca7e6b5a 100644
--- a/drivers/vfio/pci/vfio_pci_igd.c
+++ b/drivers/vfio/pci/vfio_pci_igd.c
@@ -25,8 +25,9 @@
 #define OPREGION_RVDS		0x3c2
 #define OPREGION_VERSION	0x16
 
-static ssize_t vfio_pci_igd_rw(struct vfio_pci_device *vdev, char __user *buf,
-			       size_t count, loff_t *ppos, bool iswrite)
+static ssize_t vfio_pci_igd_rw(struct vfio_pci_core_device *vdev,
+			       char __user *buf, size_t count, loff_t *ppos,
+			       bool iswrite)
 {
 	unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) - VFIO_PCI_NUM_REGIONS;
 	void *base = vdev->region[i].data;
@@ -45,7 +46,7 @@ static ssize_t vfio_pci_igd_rw(struct vfio_pci_device *vdev, char __user *buf,
 	return count;
 }
 
-static void vfio_pci_igd_release(struct vfio_pci_device *vdev,
+static void vfio_pci_igd_release(struct vfio_pci_core_device *vdev,
 				 struct vfio_pci_region *region)
 {
 	memunmap(region->data);
@@ -56,7 +57,7 @@ static const struct vfio_pci_regops vfio_pci_igd_regops = {
 	.release	= vfio_pci_igd_release,
 };
 
-static int vfio_pci_igd_opregion_init(struct vfio_pci_device *vdev)
+static int vfio_pci_igd_opregion_init(struct vfio_pci_core_device *vdev)
 {
 	__le32 *dwordp = (__le32 *)(vdev->vconfig + OPREGION_PCI_ADDR);
 	u32 addr, size;
@@ -160,7 +161,7 @@ static int vfio_pci_igd_opregion_init(struct vfio_pci_device *vdev)
 	return ret;
 }
 
-static ssize_t vfio_pci_igd_cfg_rw(struct vfio_pci_device *vdev,
+static ssize_t vfio_pci_igd_cfg_rw(struct vfio_pci_core_device *vdev,
 				   char __user *buf, size_t count, loff_t *ppos,
 				   bool iswrite)
 {
@@ -253,7 +254,7 @@ static ssize_t vfio_pci_igd_cfg_rw(struct vfio_pci_device *vdev,
 	return count;
 }
 
-static void vfio_pci_igd_cfg_release(struct vfio_pci_device *vdev,
+static void vfio_pci_igd_cfg_release(struct vfio_pci_core_device *vdev,
 				     struct vfio_pci_region *region)
 {
 	struct pci_dev *pdev = region->data;
@@ -266,7 +267,7 @@ static const struct vfio_pci_regops vfio_pci_igd_cfg_regops = {
 	.release	= vfio_pci_igd_cfg_release,
 };
 
-static int vfio_pci_igd_cfg_init(struct vfio_pci_device *vdev)
+static int vfio_pci_igd_cfg_init(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *host_bridge, *lpc_bridge;
 	int ret;
@@ -314,7 +315,7 @@ static int vfio_pci_igd_cfg_init(struct vfio_pci_device *vdev)
 	return 0;
 }
 
-int vfio_pci_igd_init(struct vfio_pci_device *vdev)
+int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
 {
 	int ret;
 
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index df1e8c8c274c..945ddbdf4d11 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -27,13 +27,13 @@
  */
 static void vfio_send_intx_eventfd(void *opaque, void *unused)
 {
-	struct vfio_pci_device *vdev = opaque;
+	struct vfio_pci_core_device *vdev = opaque;
 
 	if (likely(is_intx(vdev) && !vdev->virq_disabled))
 		eventfd_signal(vdev->ctx[0].trigger, 1);
 }
 
-void vfio_pci_intx_mask(struct vfio_pci_device *vdev)
+void vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	unsigned long flags;
@@ -73,7 +73,7 @@ void vfio_pci_intx_mask(struct vfio_pci_device *vdev)
  */
 static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 {
-	struct vfio_pci_device *vdev = opaque;
+	struct vfio_pci_core_device *vdev = opaque;
 	struct pci_dev *pdev = vdev->pdev;
 	unsigned long flags;
 	int ret = 0;
@@ -107,7 +107,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 	return ret;
 }
 
-void vfio_pci_intx_unmask(struct vfio_pci_device *vdev)
+void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev)
 {
 	if (vfio_pci_intx_unmask_handler(vdev, NULL) > 0)
 		vfio_send_intx_eventfd(vdev, NULL);
@@ -115,7 +115,7 @@ void vfio_pci_intx_unmask(struct vfio_pci_device *vdev)
 
 static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 {
-	struct vfio_pci_device *vdev = dev_id;
+	struct vfio_pci_core_device *vdev = dev_id;
 	unsigned long flags;
 	int ret = IRQ_NONE;
 
@@ -139,7 +139,7 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 	return ret;
 }
 
-static int vfio_intx_enable(struct vfio_pci_device *vdev)
+static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
 {
 	if (!is_irq_none(vdev))
 		return -EINVAL;
@@ -168,7 +168,7 @@ static int vfio_intx_enable(struct vfio_pci_device *vdev)
 	return 0;
 }
 
-static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
+static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	unsigned long irqflags = IRQF_SHARED;
@@ -223,7 +223,7 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 	return 0;
 }
 
-static void vfio_intx_disable(struct vfio_pci_device *vdev)
+static void vfio_intx_disable(struct vfio_pci_core_device *vdev)
 {
 	vfio_virqfd_disable(&vdev->ctx[0].unmask);
 	vfio_virqfd_disable(&vdev->ctx[0].mask);
@@ -244,7 +244,7 @@ static irqreturn_t vfio_msihandler(int irq, void *arg)
 	return IRQ_HANDLED;
 }
 
-static int vfio_msi_enable(struct vfio_pci_device *vdev, int nvec, bool msix)
+static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	unsigned int flag = msix ? PCI_IRQ_MSIX : PCI_IRQ_MSI;
@@ -285,7 +285,7 @@ static int vfio_msi_enable(struct vfio_pci_device *vdev, int nvec, bool msix)
 	return 0;
 }
 
-static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
+static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 				      int vector, int fd, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
@@ -364,7 +364,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 	return 0;
 }
 
-static int vfio_msi_set_block(struct vfio_pci_device *vdev, unsigned start,
+static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start,
 			      unsigned count, int32_t *fds, bool msix)
 {
 	int i, j, ret = 0;
@@ -385,7 +385,7 @@ static int vfio_msi_set_block(struct vfio_pci_device *vdev, unsigned start,
 	return ret;
 }
 
-static void vfio_msi_disable(struct vfio_pci_device *vdev, bool msix)
+static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	int i;
@@ -417,7 +417,7 @@ static void vfio_msi_disable(struct vfio_pci_device *vdev, bool msix)
 /*
  * IOCTL support
  */
-static int vfio_pci_set_intx_unmask(struct vfio_pci_device *vdev,
+static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev,
 				    unsigned index, unsigned start,
 				    unsigned count, uint32_t flags, void *data)
 {
@@ -444,7 +444,7 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_device *vdev,
 	return 0;
 }
 
-static int vfio_pci_set_intx_mask(struct vfio_pci_device *vdev,
+static int vfio_pci_set_intx_mask(struct vfio_pci_core_device *vdev,
 				  unsigned index, unsigned start,
 				  unsigned count, uint32_t flags, void *data)
 {
@@ -464,7 +464,7 @@ static int vfio_pci_set_intx_mask(struct vfio_pci_device *vdev,
 	return 0;
 }
 
-static int vfio_pci_set_intx_trigger(struct vfio_pci_device *vdev,
+static int vfio_pci_set_intx_trigger(struct vfio_pci_core_device *vdev,
 				     unsigned index, unsigned start,
 				     unsigned count, uint32_t flags, void *data)
 {
@@ -507,7 +507,7 @@ static int vfio_pci_set_intx_trigger(struct vfio_pci_device *vdev,
 	return 0;
 }
 
-static int vfio_pci_set_msi_trigger(struct vfio_pci_device *vdev,
+static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
 				    unsigned index, unsigned start,
 				    unsigned count, uint32_t flags, void *data)
 {
@@ -613,7 +613,7 @@ static int vfio_pci_set_ctx_trigger_single(struct eventfd_ctx **ctx,
 	return -EINVAL;
 }
 
-static int vfio_pci_set_err_trigger(struct vfio_pci_device *vdev,
+static int vfio_pci_set_err_trigger(struct vfio_pci_core_device *vdev,
 				    unsigned index, unsigned start,
 				    unsigned count, uint32_t flags, void *data)
 {
@@ -624,7 +624,7 @@ static int vfio_pci_set_err_trigger(struct vfio_pci_device *vdev,
 					       count, flags, data);
 }
 
-static int vfio_pci_set_req_trigger(struct vfio_pci_device *vdev,
+static int vfio_pci_set_req_trigger(struct vfio_pci_core_device *vdev,
 				    unsigned index, unsigned start,
 				    unsigned count, uint32_t flags, void *data)
 {
@@ -635,11 +635,11 @@ static int vfio_pci_set_req_trigger(struct vfio_pci_device *vdev,
 					       count, flags, data);
 }
 
-int vfio_pci_set_irqs_ioctl(struct vfio_pci_device *vdev, uint32_t flags,
+int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
 			    unsigned index, unsigned start, unsigned count,
 			    void *data)
 {
-	int (*func)(struct vfio_pci_device *vdev, unsigned index,
+	int (*func)(struct vfio_pci_core_device *vdev, unsigned index,
 		    unsigned start, unsigned count, uint32_t flags,
 		    void *data) = NULL;
 
diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index 667e82726e75..8fff4689dd44 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -38,7 +38,7 @@
 #define vfio_iowrite8	iowrite8
 
 #define VFIO_IOWRITE(size) \
-static int vfio_pci_iowrite##size(struct vfio_pci_device *vdev,		\
+static int vfio_pci_iowrite##size(struct vfio_pci_core_device *vdev,		\
 			bool test_mem, u##size val, void __iomem *io)	\
 {									\
 	if (test_mem) {							\
@@ -65,7 +65,7 @@ VFIO_IOWRITE(64)
 #endif
 
 #define VFIO_IOREAD(size) \
-static int vfio_pci_ioread##size(struct vfio_pci_device *vdev,		\
+static int vfio_pci_ioread##size(struct vfio_pci_core_device *vdev,		\
 			bool test_mem, u##size *val, void __iomem *io)	\
 {									\
 	if (test_mem) {							\
@@ -94,7 +94,7 @@ VFIO_IOREAD(32)
  * reads with -1.  This is intended for handling MSI-X vector tables and
  * leftover space for ROM BARs.
  */
-static ssize_t do_io_rw(struct vfio_pci_device *vdev, bool test_mem,
+static ssize_t do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
 			void __iomem *io, char __user *buf,
 			loff_t off, size_t count, size_t x_start,
 			size_t x_end, bool iswrite)
@@ -200,7 +200,7 @@ static ssize_t do_io_rw(struct vfio_pci_device *vdev, bool test_mem,
 	return done;
 }
 
-static int vfio_pci_setup_barmap(struct vfio_pci_device *vdev, int bar)
+static int vfio_pci_setup_barmap(struct vfio_pci_core_device *vdev, int bar)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	int ret;
@@ -224,7 +224,7 @@ static int vfio_pci_setup_barmap(struct vfio_pci_device *vdev, int bar)
 	return 0;
 }
 
-ssize_t vfio_pci_bar_rw(struct vfio_pci_device *vdev, char __user *buf,
+ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 			size_t count, loff_t *ppos, bool iswrite)
 {
 	struct pci_dev *pdev = vdev->pdev;
@@ -288,7 +288,7 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_device *vdev, char __user *buf,
 	return done;
 }
 
-ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user *buf,
+ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 			       size_t count, loff_t *ppos, bool iswrite)
 {
 	int ret;
@@ -384,7 +384,7 @@ static void vfio_pci_ioeventfd_do_write(struct vfio_pci_ioeventfd *ioeventfd,
 static int vfio_pci_ioeventfd_handler(void *opaque, void *unused)
 {
 	struct vfio_pci_ioeventfd *ioeventfd = opaque;
-	struct vfio_pci_device *vdev = ioeventfd->vdev;
+	struct vfio_pci_core_device *vdev = ioeventfd->vdev;
 
 	if (ioeventfd->test_mem) {
 		if (!down_read_trylock(&vdev->memory_lock))
@@ -410,7 +410,7 @@ static void vfio_pci_ioeventfd_thread(void *opaque, void *unused)
 	vfio_pci_ioeventfd_do_write(ioeventfd, ioeventfd->test_mem);
 }
 
-long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset,
+long vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset,
 			uint64_t data, int count, int fd)
 {
 	struct pci_dev *pdev = vdev->pdev;
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index ecae0c3d95a0..2ffbdc11f089 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -114,7 +114,7 @@ static int zpci_pfip_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
 /*
  * Add all supported capabilities to the VFIO_DEVICE_GET_INFO capability chain.
  */
-int vfio_pci_info_zdev_add_caps(struct vfio_pci_device *vdev,
+int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 				struct vfio_info_cap *caps)
 {
 	struct zpci_dev *zdev = to_zpci(vdev->pdev);
-- 
2.18.1



* [PATCH 04/12] vfio/pci: Rename ops functions to fit core namings
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (2 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 03/12] vfio/pci: Rename vfio_pci_device to vfio_pci_core_device Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 05/12] vfio/pci: Include vfio header in vfio_pci_core.h Yishai Hadas
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

This is another preparation patch for separating the vfio_pci driver into
a subsystem driver and a generic PCI driver. It renames the vfio_device_ops
callbacks from vfio_pci_* to vfio_pci_core_* and doesn't change any logic.
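
For readers unfamiliar with the pattern, here is a minimal userspace sketch
of what the renamed callbacks do: each one receives the generic device and
uses container_of() to get back to the structure that embeds it. All demo_*
names are hypothetical stand-ins and not part of this patch, and the
container_of() below is a simplified local copy without the kernel's type
checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified local stand-in for the kernel's container_of() (assumption). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_device;

/* Hypothetical, trimmed-down analogue of struct vfio_device_ops. */
struct demo_device_ops {
	const char *name;
	int  (*open_device)(struct demo_device *core_vdev);
	void (*close_device)(struct demo_device *core_vdev);
};

/* Analogue of struct vfio_device: the generic object the framework sees. */
struct demo_device {
	const struct demo_device_ops *ops;
};

/* Analogue of struct vfio_pci_core_device: embeds the generic object. */
struct demo_pci_core_device {
	struct demo_device vdev;
	int open_count;
};

static int demo_pci_core_open_device(struct demo_device *core_vdev)
{
	/* Recover the embedding structure from the generic pointer. */
	struct demo_pci_core_device *vdev =
		container_of(core_vdev, struct demo_pci_core_device, vdev);

	vdev->open_count++;
	return 0;
}

static void demo_pci_core_close_device(struct demo_device *core_vdev)
{
	struct demo_pci_core_device *vdev =
		container_of(core_vdev, struct demo_pci_core_device, vdev);

	vdev->open_count--;
}

/* The ops table keeps its name; only the callbacks carry the _core_ prefix. */
static const struct demo_device_ops demo_pci_ops = {
	.name		= "demo-pci",
	.open_device	= demo_pci_core_open_device,
	.close_device	= demo_pci_core_close_device,
};

/* Drive the callbacks through the ops table, as a framework would. */
int demo_open_close(int opens, int closes)
{
	struct demo_pci_core_device dev = {
		.vdev = { .ops = &demo_pci_ops },
		.open_count = 0,
	};
	int i;

	assert(closes <= opens);

	for (i = 0; i < opens; i++)
		if (dev.vdev.ops->open_device(&dev.vdev))
			return -1;
	for (i = 0; i < closes; i++)
		dev.vdev.ops->close_device(&dev.vdev);

	return dev.open_count;
}
```

Callers only ever reach the callbacks through the ops table, so a rename of
this kind does not affect them.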

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/vfio_pci_core.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 6f95cd842545..ab22b0db064a 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -535,7 +535,7 @@ static void vfio_pci_vf_token_user_add(struct vfio_pci_core_device *vdev, int va
 	vfio_device_put(&pf_vdev->vdev);
 }
 
-static void vfio_pci_close_device(struct vfio_device *core_vdev)
+static void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -556,7 +556,7 @@ static void vfio_pci_close_device(struct vfio_device *core_vdev)
 	mutex_unlock(&vdev->igate);
 }
 
-static int vfio_pci_open_device(struct vfio_device *core_vdev)
+static int vfio_pci_core_open_device(struct vfio_device *core_vdev)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -729,7 +729,7 @@ int vfio_pci_register_dev_region(struct vfio_pci_core_device *vdev,
 	return 0;
 }
 
-static long vfio_pci_ioctl(struct vfio_device *core_vdev,
+static long vfio_pci_core_ioctl(struct vfio_device *core_vdev,
 			   unsigned int cmd, unsigned long arg)
 {
 	struct vfio_pci_core_device *vdev =
@@ -1304,7 +1304,7 @@ static ssize_t vfio_pci_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 	return -EINVAL;
 }
 
-static ssize_t vfio_pci_read(struct vfio_device *core_vdev, char __user *buf,
+static ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
 			     size_t count, loff_t *ppos)
 {
 	struct vfio_pci_core_device *vdev =
@@ -1316,7 +1316,7 @@ static ssize_t vfio_pci_read(struct vfio_device *core_vdev, char __user *buf,
 	return vfio_pci_rw(vdev, buf, count, ppos, false);
 }
 
-static ssize_t vfio_pci_write(struct vfio_device *core_vdev, const char __user *buf,
+static ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
 			      size_t count, loff_t *ppos)
 {
 	struct vfio_pci_core_device *vdev =
@@ -1535,7 +1535,7 @@ static const struct vm_operations_struct vfio_pci_mmap_ops = {
 	.fault = vfio_pci_mmap_fault,
 };
 
-static int vfio_pci_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
+static int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1606,7 +1606,7 @@ static int vfio_pci_mmap(struct vfio_device *core_vdev, struct vm_area_struct *v
 	return 0;
 }
 
-static void vfio_pci_request(struct vfio_device *core_vdev, unsigned int count)
+static void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1722,7 +1722,7 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_core_device *vdev,
 
 #define VF_TOKEN_ARG "vf_token="
 
-static int vfio_pci_match(struct vfio_device *core_vdev, char *buf)
+static int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1774,14 +1774,14 @@ static int vfio_pci_match(struct vfio_device *core_vdev, char *buf)
 
 static const struct vfio_device_ops vfio_pci_ops = {
 	.name		= "vfio-pci",
-	.open_device	= vfio_pci_open_device,
-	.close_device	= vfio_pci_close_device,
-	.ioctl		= vfio_pci_ioctl,
-	.read		= vfio_pci_read,
-	.write		= vfio_pci_write,
-	.mmap		= vfio_pci_mmap,
-	.request	= vfio_pci_request,
-	.match		= vfio_pci_match,
+	.open_device	= vfio_pci_core_open_device,
+	.close_device	= vfio_pci_core_close_device,
+	.ioctl		= vfio_pci_core_ioctl,
+	.read		= vfio_pci_core_read,
+	.write		= vfio_pci_core_write,
+	.mmap		= vfio_pci_core_mmap,
+	.request	= vfio_pci_core_request,
+	.match		= vfio_pci_core_match,
 };
 
 static int vfio_pci_bus_notifier(struct notifier_block *nb,
-- 
2.18.1



* [PATCH 05/12] vfio/pci: Include vfio header in vfio_pci_core.h
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (3 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 04/12] vfio/pci: Rename ops functions to fit core namings Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 06/12] vfio/pci: Split the pci_driver code out of vfio_pci_core.c Yishai Hadas
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

The vfio_device structure is embedded in the vfio_pci_core_device
structure, so vfio_pci_core.h needs its full definition. Include
<linux/vfio.h> from that header instead of from vfio_pci_core.c.
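
For readers who want to see why an embedded member forces the include,
here is a minimal userspace sketch. It is not the kernel code: both
structures are cut-down stand-ins, container_of() is re-implemented
with offsetof(), and to_core_device() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down stand-in for struct vfio_device from <linux/vfio.h>. */
struct vfio_device {
	const char *name;
};

/*
 * Cut-down stand-in for struct vfio_pci_core_device. The member is
 * embedded by value, so the compiler needs the complete definition of
 * struct vfio_device here; a forward declaration would not be enough.
 */
struct vfio_pci_core_device {
	struct vfio_device vdev;
	int irq_type;
};

/* Hypothetical helper: recover the outer object from the embedded member. */
static struct vfio_pci_core_device *to_core_device(struct vfio_device *core_vdev)
{
	assert(core_vdev != NULL);
	return container_of(core_vdev, struct vfio_pci_core_device, vdev);
}
```

Because the outer structure's size and layout depend on the embedded
one, any file that includes the core header would otherwise have to
remember to include the vfio header first.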

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/vfio_pci_core.c | 1 -
 drivers/vfio/pci/vfio_pci_core.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index ab22b0db064a..99f579c23ddd 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -23,7 +23,6 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/uaccess.h>
-#include <linux/vfio.h>
 #include <linux/vgaarb.h>
 #include <linux/nospec.h>
 #include <linux/sched/mm.h>
diff --git a/drivers/vfio/pci/vfio_pci_core.h b/drivers/vfio/pci/vfio_pci_core.h
index 2ceaa6e4ca25..17ad048752b6 100644
--- a/drivers/vfio/pci/vfio_pci_core.h
+++ b/drivers/vfio/pci/vfio_pci_core.h
@@ -10,6 +10,7 @@
 
 #include <linux/mutex.h>
 #include <linux/pci.h>
+#include <linux/vfio.h>
 #include <linux/irqbypass.h>
 #include <linux/types.h>
 #include <linux/uuid.h>
-- 
2.18.1



* [PATCH 06/12] vfio/pci: Split the pci_driver code out of vfio_pci_core.c
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (4 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 05/12] vfio/pci: Include vfio header in vfio_pci_core.h Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 07/12] vfio/pci: Move igd initialization to vfio_pci.c Yishai Hadas
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

Split the vfio_pci driver into two logical parts: the 'struct
pci_driver' (vfio_pci.c), which implements "Generic VFIO support for any
PCI device", and a library of code (vfio_pci_core.c) that helps
implement a struct vfio_device on top of a PCI device.
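
As a rough illustration of the resulting ownership split, the sketch
below models the probe/remove flow in userspace. It assumes a
simplified device with two state flags; every demo_* name and the
fail_register test hook are hypothetical, and calloc()/free() stand in
for kzalloc()/kfree().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Simplified stand-in for the core device structure. */
struct demo_core_device {
	int initialized;
	int registered;
	int fail_register;	/* test hook: force registration to fail */
};

/* "Library" side: set up and tear down state the core owns. */
static void demo_core_init_device(struct demo_core_device *vdev)
{
	vdev->initialized = 1;
}

static void demo_core_uninit_device(struct demo_core_device *vdev)
{
	vdev->initialized = 0;
}

static int demo_core_register_device(struct demo_core_device *vdev)
{
	if (vdev->fail_register)
		return -EINVAL;
	vdev->registered = 1;
	return 0;
}

static void demo_core_unregister_device(struct demo_core_device *vdev)
{
	vdev->registered = 0;
}

/* "Driver" side: owns the allocation and the order of the calls. */
static int demo_probe(struct demo_core_device **out, int fail_register)
{
	struct demo_core_device *vdev;
	int ret;

	vdev = calloc(1, sizeof(*vdev));
	if (!vdev)
		return -ENOMEM;
	vdev->fail_register = fail_register;
	demo_core_init_device(vdev);

	ret = demo_core_register_device(vdev);
	if (ret)
		goto out_free;
	*out = vdev;
	return 0;

out_free:
	/* Undo init before freeing, mirroring the error path in probe. */
	demo_core_uninit_device(vdev);
	free(vdev);
	return ret;
}

static void demo_remove(struct demo_core_device *vdev)
{
	assert(vdev != NULL);
	demo_core_unregister_device(vdev);
	demo_core_uninit_device(vdev);
	free(vdev);
}
```

The point of the shape is that the library never allocates or frees
the device, so a driver could later embed the core structure inside a
larger, driver-private one.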

vfio_pci.ko continues to present the same interface under sysfs and this
change should have no functional impact.

Following patches will turn vfio_pci and vfio_pci_core into separate
modules.

This is preparation for allowing another module to provide the
pci_driver, customize how VFIO is set up, inject its own operations,
and easily extend vendor-specific functionality.
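
To make "inject its own operations" concrete, here is a small
userspace sketch of the pattern. It is only a model: the demo_* types
and functions, the command numbers and the return values are all
invented for illustration and do not correspond to real VFIO ioctls.

```c
#include <assert.h>
#include <stddef.h>

struct demo_device;

/* Cut-down model of an ops table such as struct vfio_device_ops. */
struct demo_device_ops {
	const char *name;
	int (*open_device)(struct demo_device *dev);
	long (*ioctl)(struct demo_device *dev, unsigned int cmd);
};

struct demo_device {
	const struct demo_device_ops *ops;
	int opened;
};

/* "Library" half: generic implementations any driver may reuse. */
static int demo_core_open_device(struct demo_device *dev)
{
	dev->opened = 1;
	return 0;
}

static long demo_core_ioctl(struct demo_device *dev, unsigned int cmd)
{
	(void)dev;
	return cmd == 1 ? 100 : -1;	/* the core only knows command 1 */
}

static void demo_core_init_device(struct demo_device *dev,
				  const struct demo_device_ops *ops)
{
	assert(ops != NULL);
	dev->ops = ops;
	dev->opened = 0;
}

/* Generic driver: every callback points straight at the library. */
static const struct demo_device_ops demo_generic_ops = {
	.name		= "demo-generic",
	.open_device	= demo_core_open_device,
	.ioctl		= demo_core_ioctl,
};

/* Hypothetical vendor driver: handle one extra command, defer the rest. */
static long demo_vendor_ioctl(struct demo_device *dev, unsigned int cmd)
{
	if (cmd == 2)
		return 200;
	return demo_core_ioctl(dev, cmd);
}

static const struct demo_device_ops demo_vendor_ops = {
	.name		= "demo-vendor",
	.open_device	= demo_core_open_device,
	.ioctl		= demo_vendor_ioctl,
};
```

The vendor table replaces only the callback it cares about, and its
wrapper falls through to the library for everything else, so the
generic behaviour stays in one place.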

At this point vfio_pci_core still contains a lot of vfio_pci
functionality. Following patches will move more of the large-scale
items out, but another cleanup series will be needed to move the
rest.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/Makefile        |   2 +-
 drivers/vfio/pci/vfio_pci.c      | 227 ++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_core.c | 264 +++++++------------------------
 drivers/vfio/pci/vfio_pci_core.h |  23 +++
 4 files changed, 305 insertions(+), 211 deletions(-)
 create mode 100644 drivers/vfio/pci/vfio_pci.c

diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 66a40488e967..8aa517b4b671 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
-vfio-pci-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-pci-y := vfio_pci.o vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
 vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 vfio-pci-$(CONFIG_S390) += vfio_pci_zdev.o
 
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
new file mode 100644
index 000000000000..4ccfbac0797a
--- /dev/null
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -0,0 +1,227 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ *
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ *     Author: Alex Williamson <alex.williamson@redhat.com>
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ * Author: Tom Lyon, pugs@cisco.com
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/device.h>
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/notifier.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+
+#include "vfio_pci_core.h"
+
+#define DRIVER_VERSION  "0.2"
+#define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
+#define DRIVER_DESC     "VFIO PCI - User Level meta-driver"
+
+static char ids[1024] __initdata;
+module_param_string(ids, ids, sizeof(ids), 0);
+MODULE_PARM_DESC(ids, "Initial PCI IDs to add to the vfio driver, format is \"vendor:device[:subvendor[:subdevice[:class[:class_mask]]]]\" and multiple comma separated entries can be specified");
+
+static bool enable_sriov;
+#ifdef CONFIG_PCI_IOV
+module_param(enable_sriov, bool, 0644);
+MODULE_PARM_DESC(enable_sriov, "Enable support for SR-IOV configuration.  Enabling SR-IOV on a PF typically requires support of the userspace PF driver, enabling VFs without such support may result in non-functional VFs or PF.");
+#endif
+
+static bool disable_denylist;
+module_param(disable_denylist, bool, 0444);
+MODULE_PARM_DESC(disable_denylist, "Disable use of device denylist. Disabling the denylist allows binding to devices with known errata that may lead to exploitable stability or security issues when accessed by untrusted users.");
+
+static bool vfio_pci_dev_in_denylist(struct pci_dev *pdev)
+{
+	switch (pdev->vendor) {
+	case PCI_VENDOR_ID_INTEL:
+		switch (pdev->device) {
+		case PCI_DEVICE_ID_INTEL_QAT_C3XXX:
+		case PCI_DEVICE_ID_INTEL_QAT_C3XXX_VF:
+		case PCI_DEVICE_ID_INTEL_QAT_C62X:
+		case PCI_DEVICE_ID_INTEL_QAT_C62X_VF:
+		case PCI_DEVICE_ID_INTEL_QAT_DH895XCC:
+		case PCI_DEVICE_ID_INTEL_QAT_DH895XCC_VF:
+			return true;
+		default:
+			return false;
+		}
+	}
+
+	return false;
+}
+
+static bool vfio_pci_is_denylisted(struct pci_dev *pdev)
+{
+	if (!vfio_pci_dev_in_denylist(pdev))
+		return false;
+
+	if (disable_denylist) {
+		pci_warn(pdev,
+			 "device denylist disabled - allowing device %04x:%04x.\n",
+			 pdev->vendor, pdev->device);
+		return false;
+	}
+
+	pci_warn(pdev, "%04x:%04x exists in vfio-pci device denylist, driver probing disallowed.\n",
+		 pdev->vendor, pdev->device);
+
+	return true;
+}
+
+static const struct vfio_device_ops vfio_pci_ops = {
+	.name		= "vfio-pci",
+	.open_device	= vfio_pci_core_open_device,
+	.close_device	= vfio_pci_core_close_device,
+	.ioctl		= vfio_pci_core_ioctl,
+	.read		= vfio_pci_core_read,
+	.write		= vfio_pci_core_write,
+	.mmap		= vfio_pci_core_mmap,
+	.request	= vfio_pci_core_request,
+	.match		= vfio_pci_core_match,
+};
+
+static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct vfio_pci_core_device *vdev;
+	int ret;
+
+	if (vfio_pci_is_denylisted(pdev))
+		return -EINVAL;
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev)
+		return -ENOMEM;
+	vfio_pci_core_init_device(vdev, pdev, &vfio_pci_ops);
+
+	ret = vfio_pci_core_register_device(vdev);
+	if (ret)
+		goto out_free;
+	return 0;
+
+out_free:
+	vfio_pci_core_uninit_device(vdev);
+	kfree(vdev);
+	return ret;
+}
+
+static void vfio_pci_remove(struct pci_dev *pdev)
+{
+	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
+
+	vfio_pci_core_unregister_device(vdev);
+	vfio_pci_core_uninit_device(vdev);
+	kfree(vdev);
+}
+
+static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
+{
+	might_sleep();
+
+	if (!enable_sriov)
+		return -ENOENT;
+
+	return vfio_pci_core_sriov_configure(pdev, nr_virtfn);
+}
+
+static struct pci_driver vfio_pci_driver = {
+	.name			= "vfio-pci",
+	.id_table		= NULL, /* only dynamic ids */
+	.probe			= vfio_pci_probe,
+	.remove			= vfio_pci_remove,
+	.sriov_configure	= vfio_pci_sriov_configure,
+	.err_handler		= &vfio_pci_core_err_handlers,
+};
+
+static void __init vfio_pci_fill_ids(void)
+{
+	char *p, *id;
+	int rc;
+
+	/* no ids passed actually */
+	if (ids[0] == '\0')
+		return;
+
+	/* add ids specified in the module parameter */
+	p = ids;
+	while ((id = strsep(&p, ","))) {
+		unsigned int vendor, device, subvendor = PCI_ANY_ID,
+			subdevice = PCI_ANY_ID, class = 0, class_mask = 0;
+		int fields;
+
+		if (!strlen(id))
+			continue;
+
+		fields = sscanf(id, "%x:%x:%x:%x:%x:%x",
+				&vendor, &device, &subvendor, &subdevice,
+				&class, &class_mask);
+
+		if (fields < 2) {
+			pr_warn("invalid id string \"%s\"\n", id);
+			continue;
+		}
+
+		rc = pci_add_dynid(&vfio_pci_driver, vendor, device,
+				   subvendor, subdevice, class, class_mask, 0);
+		if (rc)
+			pr_warn("failed to add dynamic id [%04x:%04x[%04x:%04x]] class %#08x/%08x (%d)\n",
+				vendor, device, subvendor, subdevice,
+				class, class_mask, rc);
+		else
+			pr_info("add [%04x:%04x[%04x:%04x]] class %#08x/%08x\n",
+				vendor, device, subvendor, subdevice,
+				class, class_mask);
+	}
+}
+
+static int __init vfio_pci_init(void)
+{
+	int ret;
+
+	ret = vfio_pci_core_init();
+	if (ret)
+		return ret;
+
+	/* Register and scan for devices */
+	ret = pci_register_driver(&vfio_pci_driver);
+	if (ret)
+		goto out;
+
+	vfio_pci_fill_ids();
+
+	if (disable_denylist)
+		pr_warn("device denylist disabled.\n");
+
+	return 0;
+
+out:
+	vfio_pci_core_cleanup();
+	return ret;
+}
+module_init(vfio_pci_init);
+
+static void __exit vfio_pci_cleanup(void)
+{
+	pci_unregister_driver(&vfio_pci_driver);
+	vfio_pci_core_cleanup();
+}
+module_exit(vfio_pci_cleanup);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 99f579c23ddd..8323acc5d3b7 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -8,8 +8,6 @@
  * Author: Tom Lyon, pugs@cisco.com
  */
 
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
 #include <linux/device.h>
 #include <linux/eventfd.h>
 #include <linux/file.h>
@@ -29,14 +27,6 @@
 
 #include "vfio_pci_core.h"
 
-#define DRIVER_VERSION  "0.2"
-#define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
-#define DRIVER_DESC     "VFIO PCI - User Level meta-driver"
-
-static char ids[1024] __initdata;
-module_param_string(ids, ids, sizeof(ids), 0);
-MODULE_PARM_DESC(ids, "Initial PCI IDs to add to the vfio driver, format is \"vendor:device[:subvendor[:subdevice[:class[:class_mask]]]]\" and multiple comma separated entries can be specified");
-
 static bool nointxmask;
 module_param_named(nointxmask, nointxmask, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(nointxmask,
@@ -53,16 +43,6 @@ module_param(disable_idle_d3, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(disable_idle_d3,
 		 "Disable using the PCI D3 low power state for idle, unused devices");
 
-static bool enable_sriov;
-#ifdef CONFIG_PCI_IOV
-module_param(enable_sriov, bool, 0644);
-MODULE_PARM_DESC(enable_sriov, "Enable support for SR-IOV configuration.  Enabling SR-IOV on a PF typically requires support of the userspace PF driver, enabling VFs without such support may result in non-functional VFs or PF.");
-#endif
-
-static bool disable_denylist;
-module_param(disable_denylist, bool, 0444);
-MODULE_PARM_DESC(disable_denylist, "Disable use of device denylist. Disabling the denylist allows binding to devices with known errata that may lead to exploitable stability or security issues when accessed by untrusted users.");
-
 static inline bool vfio_vga_disabled(void)
 {
 #ifdef CONFIG_VFIO_PCI_VGA
@@ -72,44 +52,6 @@ static inline bool vfio_vga_disabled(void)
 #endif
 }
 
-static bool vfio_pci_dev_in_denylist(struct pci_dev *pdev)
-{
-	switch (pdev->vendor) {
-	case PCI_VENDOR_ID_INTEL:
-		switch (pdev->device) {
-		case PCI_DEVICE_ID_INTEL_QAT_C3XXX:
-		case PCI_DEVICE_ID_INTEL_QAT_C3XXX_VF:
-		case PCI_DEVICE_ID_INTEL_QAT_C62X:
-		case PCI_DEVICE_ID_INTEL_QAT_C62X_VF:
-		case PCI_DEVICE_ID_INTEL_QAT_DH895XCC:
-		case PCI_DEVICE_ID_INTEL_QAT_DH895XCC_VF:
-			return true;
-		default:
-			return false;
-		}
-	}
-
-	return false;
-}
-
-static bool vfio_pci_is_denylisted(struct pci_dev *pdev)
-{
-	if (!vfio_pci_dev_in_denylist(pdev))
-		return false;
-
-	if (disable_denylist) {
-		pci_warn(pdev,
-			 "device denylist disabled - allowing device %04x:%04x.\n",
-			 pdev->vendor, pdev->device);
-		return false;
-	}
-
-	pci_warn(pdev, "%04x:%04x exists in vfio-pci device denylist, driver probing disallowed.\n",
-		 pdev->vendor, pdev->device);
-
-	return true;
-}
-
 /*
  * Our VGA arbiter participation is limited since we don't know anything
  * about the device itself.  However, if the device is the only VGA device
@@ -497,8 +439,6 @@ static void vfio_pci_disable(struct vfio_pci_core_device *vdev)
 		vfio_pci_set_power_state(vdev, PCI_D3hot);
 }
 
-static struct pci_driver vfio_pci_driver;
-
 static struct vfio_pci_core_device *get_pf_vdev(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *physfn = pci_physfn(vdev->pdev);
@@ -511,7 +451,7 @@ static struct vfio_pci_core_device *get_pf_vdev(struct vfio_pci_core_device *vde
 	if (!pf_dev)
 		return NULL;
 
-	if (pci_dev_driver(physfn) != &vfio_pci_driver) {
+	if (pci_dev_driver(physfn) != pci_dev_driver(vdev->pdev)) {
 		vfio_device_put(pf_dev);
 		return NULL;
 	}
@@ -534,7 +474,7 @@ static void vfio_pci_vf_token_user_add(struct vfio_pci_core_device *vdev, int va
 	vfio_device_put(&pf_vdev->vdev);
 }
 
-static void vfio_pci_core_close_device(struct vfio_device *core_vdev)
+void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -555,7 +495,7 @@ static void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 	mutex_unlock(&vdev->igate);
 }
 
-static int vfio_pci_core_open_device(struct vfio_device *core_vdev)
+int vfio_pci_core_open_device(struct vfio_device *core_vdev)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -728,8 +668,8 @@ int vfio_pci_register_dev_region(struct vfio_pci_core_device *vdev,
 	return 0;
 }
 
-static long vfio_pci_core_ioctl(struct vfio_device *core_vdev,
-			   unsigned int cmd, unsigned long arg)
+long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
+		unsigned long arg)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1303,8 +1243,8 @@ static ssize_t vfio_pci_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 	return -EINVAL;
 }
 
-static ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
-			     size_t count, loff_t *ppos)
+ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
+		size_t count, loff_t *ppos)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1315,8 +1255,8 @@ static ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *bu
 	return vfio_pci_rw(vdev, buf, count, ppos, false);
 }
 
-static ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
-			      size_t count, loff_t *ppos)
+ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
+		size_t count, loff_t *ppos)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1534,7 +1474,7 @@ static const struct vm_operations_struct vfio_pci_mmap_ops = {
 	.fault = vfio_pci_mmap_fault,
 };
 
-static int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
+int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1605,7 +1545,7 @@ static int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_stru
 	return 0;
 }
 
-static void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count)
+void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1721,7 +1661,7 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_core_device *vdev,
 
 #define VF_TOKEN_ARG "vf_token="
 
-static int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf)
+int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
@@ -1771,18 +1711,6 @@ static int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf)
 	return 1; /* Match */
 }
 
-static const struct vfio_device_ops vfio_pci_ops = {
-	.name		= "vfio-pci",
-	.open_device	= vfio_pci_core_open_device,
-	.close_device	= vfio_pci_core_close_device,
-	.ioctl		= vfio_pci_core_ioctl,
-	.read		= vfio_pci_core_read,
-	.write		= vfio_pci_core_write,
-	.mmap		= vfio_pci_core_mmap,
-	.request	= vfio_pci_core_request,
-	.match		= vfio_pci_core_match,
-};
-
 static int vfio_pci_bus_notifier(struct notifier_block *nb,
 				 unsigned long action, void *data)
 {
@@ -1797,12 +1725,12 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb,
 		pci_info(vdev->pdev, "Captured SR-IOV VF %s driver_override\n",
 			 pci_name(pdev));
 		pdev->driver_override = kasprintf(GFP_KERNEL, "%s",
-						  vfio_pci_ops.name);
+						  vdev->vdev.ops->name);
 	} else if (action == BUS_NOTIFY_BOUND_DRIVER &&
 		   pdev->is_virtfn && physfn == vdev->pdev) {
 		struct pci_driver *drv = pci_dev_driver(pdev);
 
-		if (drv && drv != &vfio_pci_driver)
+		if (drv && drv != pci_dev_driver(vdev->pdev))
 			pci_warn(vdev->pdev,
 				 "VF %s bound to driver %s while PF bound to vfio-pci\n",
 				 pci_name(pdev), drv->name);
@@ -1873,15 +1801,39 @@ static void vfio_pci_vga_uninit(struct vfio_pci_core_device *vdev)
 					      VGA_RSRC_LEGACY_MEM);
 }
 
-static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
+			       struct pci_dev *pdev,
+			       const struct vfio_device_ops *vfio_pci_ops)
 {
-	struct vfio_pci_core_device *vdev;
+	vfio_init_group_dev(&vdev->vdev, &pdev->dev, vfio_pci_ops);
+	vdev->pdev = pdev;
+	vdev->irq_type = VFIO_PCI_NUM_IRQS;
+	mutex_init(&vdev->igate);
+	spin_lock_init(&vdev->irqlock);
+	mutex_init(&vdev->ioeventfds_lock);
+	INIT_LIST_HEAD(&vdev->dummy_resources_list);
+	INIT_LIST_HEAD(&vdev->ioeventfds_list);
+	mutex_init(&vdev->vma_lock);
+	INIT_LIST_HEAD(&vdev->vma_list);
+	init_rwsem(&vdev->memory_lock);
+}
+
+void vfio_pci_core_uninit_device(struct vfio_pci_core_device *vdev)
+{
+	mutex_destroy(&vdev->igate);
+	mutex_destroy(&vdev->ioeventfds_lock);
+	mutex_destroy(&vdev->vma_lock);
+	vfio_uninit_group_dev(&vdev->vdev);
+	kfree(vdev->region);
+	kfree(vdev->pm_save);
+}
+
+int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
+{
+	struct pci_dev *pdev = vdev->pdev;
 	struct iommu_group *group;
 	int ret;
 
-	if (vfio_pci_is_denylisted(pdev))
-		return -EINVAL;
-
 	if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
 		return -EINVAL;
 
@@ -1902,24 +1854,6 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (!group)
 		return -EINVAL;
 
-	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
-	if (!vdev) {
-		ret = -ENOMEM;
-		goto out_group_put;
-	}
-
-	vfio_init_group_dev(&vdev->vdev, &pdev->dev, &vfio_pci_ops);
-	vdev->pdev = pdev;
-	vdev->irq_type = VFIO_PCI_NUM_IRQS;
-	mutex_init(&vdev->igate);
-	spin_lock_init(&vdev->irqlock);
-	mutex_init(&vdev->ioeventfds_lock);
-	INIT_LIST_HEAD(&vdev->dummy_resources_list);
-	INIT_LIST_HEAD(&vdev->ioeventfds_list);
-	mutex_init(&vdev->vma_lock);
-	INIT_LIST_HEAD(&vdev->vma_list);
-	init_rwsem(&vdev->memory_lock);
-
 	if (pci_is_root_bus(pdev->bus))
 		ret = vfio_assign_device_set(&vdev->vdev, vdev);
 	else if (!pci_probe_reset_slot(pdev->slot))
@@ -1927,10 +1861,10 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	else
 		ret = vfio_assign_device_set(&vdev->vdev, pdev->bus);
 	if (ret)
-		goto out_uninit;
+		goto out_group_put;
 	ret = vfio_pci_vf_init(vdev);
 	if (ret)
-		goto out_uninit;
+		goto out_group_put;
 	ret = vfio_pci_vga_init(vdev);
 	if (ret)
 		goto out_vf;
@@ -1962,36 +1896,26 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		vfio_pci_set_power_state(vdev, PCI_D0);
 out_vf:
 	vfio_pci_vf_uninit(vdev);
-out_uninit:
-	vfio_uninit_group_dev(&vdev->vdev);
-	kfree(vdev->pm_save);
-	kfree(vdev);
 out_group_put:
 	vfio_iommu_group_put(group, &pdev->dev);
 	return ret;
 }
 
-static void vfio_pci_remove(struct pci_dev *pdev)
+void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev)
 {
-	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
+	struct pci_dev *pdev = vdev->pdev;
 
 	pci_disable_sriov(pdev);
 
 	vfio_unregister_group_dev(&vdev->vdev);
 
 	vfio_pci_vf_uninit(vdev);
-	vfio_uninit_group_dev(&vdev->vdev);
 	vfio_pci_vga_uninit(vdev);
 
 	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
 
 	if (!disable_idle_d3)
 		vfio_pci_set_power_state(vdev, PCI_D0);
-
-	mutex_destroy(&vdev->ioeventfds_lock);
-	kfree(vdev->region);
-	kfree(vdev->pm_save);
-	kfree(vdev);
 }
 
 static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
@@ -2018,16 +1942,13 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 	return PCI_ERS_RESULT_CAN_RECOVER;
 }
 
-static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
+int vfio_pci_core_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
 {
 	struct vfio_device *device;
 	int ret = 0;
 
 	might_sleep();
 
-	if (!enable_sriov)
-		return -ENOENT;
-
 	device = vfio_device_get_from_dev(&pdev->dev);
 	if (!device)
 		return -ENODEV;
@@ -2042,19 +1963,10 @@ static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
 	return ret < 0 ? ret : nr_virtfn;
 }
 
-static const struct pci_error_handlers vfio_err_handlers = {
+const struct pci_error_handlers vfio_pci_core_err_handlers = {
 	.error_detected = vfio_pci_aer_err_detected,
 };
 
-static struct pci_driver vfio_pci_driver = {
-	.name			= "vfio-pci",
-	.id_table		= NULL, /* only dynamic ids */
-	.probe			= vfio_pci_probe,
-	.remove			= vfio_pci_remove,
-	.sriov_configure	= vfio_pci_sriov_configure,
-	.err_handler		= &vfio_err_handlers,
-};
-
 static int vfio_pci_check_all_devices_bound(struct pci_dev *pdev, void *data)
 {
 	struct vfio_device_set *dev_set = data;
@@ -2216,83 +2128,15 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_core_device *vdev)
 	}
 }
 
-static void __exit vfio_pci_cleanup(void)
+/* This will become the __exit function of vfio_pci_core.ko */
+void vfio_pci_core_cleanup(void)
 {
-	pci_unregister_driver(&vfio_pci_driver);
 	vfio_pci_uninit_perm_bits();
 }
 
-static void __init vfio_pci_fill_ids(void)
-{
-	char *p, *id;
-	int rc;
-
-	/* no ids passed actually */
-	if (ids[0] == '\0')
-		return;
-
-	/* add ids specified in the module parameter */
-	p = ids;
-	while ((id = strsep(&p, ","))) {
-		unsigned int vendor, device, subvendor = PCI_ANY_ID,
-			subdevice = PCI_ANY_ID, class = 0, class_mask = 0;
-		int fields;
-
-		if (!strlen(id))
-			continue;
-
-		fields = sscanf(id, "%x:%x:%x:%x:%x:%x",
-				&vendor, &device, &subvendor, &subdevice,
-				&class, &class_mask);
-
-		if (fields < 2) {
-			pr_warn("invalid id string \"%s\"\n", id);
-			continue;
-		}
-
-		rc = pci_add_dynid(&vfio_pci_driver, vendor, device,
-				   subvendor, subdevice, class, class_mask, 0);
-		if (rc)
-			pr_warn("failed to add dynamic id [%04x:%04x[%04x:%04x]] class %#08x/%08x (%d)\n",
-				vendor, device, subvendor, subdevice,
-				class, class_mask, rc);
-		else
-			pr_info("add [%04x:%04x[%04x:%04x]] class %#08x/%08x\n",
-				vendor, device, subvendor, subdevice,
-				class, class_mask);
-	}
-}
-
-static int __init vfio_pci_init(void)
+/* This will become the __init function of vfio_pci_core.ko */
+int __init vfio_pci_core_init(void)
 {
-	int ret;
-
 	/* Allocate shared config space permission data used by all devices */
-	ret = vfio_pci_init_perm_bits();
-	if (ret)
-		return ret;
-
-	/* Register and scan for devices */
-	ret = pci_register_driver(&vfio_pci_driver);
-	if (ret)
-		goto out_driver;
-
-	vfio_pci_fill_ids();
-
-	if (disable_denylist)
-		pr_warn("device denylist disabled.\n");
-
-	return 0;
-
-out_driver:
-	vfio_pci_uninit_perm_bits();
-	return ret;
+	return vfio_pci_init_perm_bits();
 }
-
-module_init(vfio_pci_init);
-module_exit(vfio_pci_cleanup);
-
-MODULE_VERSION(DRIVER_VERSION);
-MODULE_LICENSE("GPL v2");
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/drivers/vfio/pci/vfio_pci_core.h b/drivers/vfio/pci/vfio_pci_core.h
index 17ad048752b6..7dbdd4dda5c0 100644
--- a/drivers/vfio/pci/vfio_pci_core.h
+++ b/drivers/vfio/pci/vfio_pci_core.h
@@ -206,4 +206,27 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 }
 #endif
 
+/* Will be exported for vfio pci drivers usage */
+void vfio_pci_core_cleanup(void);
+int vfio_pci_core_init(void);
+void vfio_pci_core_close_device(struct vfio_device *core_vdev);
+int vfio_pci_core_open_device(struct vfio_device *core_vdev);
+void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
+			       struct pci_dev *pdev,
+			       const struct vfio_device_ops *vfio_pci_ops);
+int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev);
+void vfio_pci_core_uninit_device(struct vfio_pci_core_device *vdev);
+void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev);
+int vfio_pci_core_sriov_configure(struct pci_dev *pdev, int nr_virtfn);
+extern const struct pci_error_handlers vfio_pci_core_err_handlers;
+long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
+		unsigned long arg);
+ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
+		size_t count, loff_t *ppos);
+ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
+		size_t count, loff_t *ppos);
+int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma);
+void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count);
+int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf);
+
 #endif /* VFIO_PCI_CORE_H */
-- 
2.18.1



* [PATCH 07/12] vfio/pci: Move igd initialization to vfio_pci.c
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (5 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 06/12] vfio/pci: Split the pci_driver code out of vfio_pci_core.c Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 08/12] vfio/pci: Move module parameters " Yishai Hadas
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

IGD support is specific to the vfio_pci pci_driver implementation, so
move its initialization out of vfio_pci_core.c and into vfio_pci.c.

This is preparation for splitting vfio_pci.ko into two drivers.
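
The open path that results has three steps: core enable, an optional
driver-specific setup, then core finish. A userspace sketch of that
control flow follows. It is a model under assumptions: the demo_*
helpers stand in for vfio_pci_core_enable(), vfio_pci_igd_init(),
vfio_pci_core_disable() and vfio_pci_core_finish_enable(), and
quirk_ret is a hypothetical knob that picks what the device-specific
step returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_dev {
	int enabled;
	int finished;
	int quirk_ret;	/* value the device-specific init will return */
};

static int demo_core_enable(struct demo_dev *d)
{
	d->enabled = 1;
	return 0;
}

static void demo_core_disable(struct demo_dev *d)
{
	d->enabled = 0;
}

static void demo_core_finish_enable(struct demo_dev *d)
{
	d->finished = 1;
}

/* Stand-in for a device-specific step such as the IGD region setup. */
static int demo_quirk_init(struct demo_dev *d)
{
	return d->quirk_ret;
}

static int demo_open_device(struct demo_dev *d)
{
	int ret;

	assert(d != NULL);

	ret = demo_core_enable(d);
	if (ret)
		return ret;

	/* -ENODEV means "nothing to set up here", so it is not fatal. */
	ret = demo_quirk_init(d);
	if (ret && ret != -ENODEV) {
		demo_core_disable(d);	/* roll back the core enable */
		return ret;
	}

	demo_core_finish_enable(d);
	return 0;
}
```

Splitting enable from finish gives the driver a window in which the
device is enabled but not yet fully set up, and a single rollback
call if its own step fails.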

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/vfio_pci.c      | 29 +++++++++++++++++++++++-
 drivers/vfio/pci/vfio_pci_core.c | 39 ++++----------------------------
 drivers/vfio/pci/vfio_pci_core.h |  9 +++++++-
 3 files changed, 41 insertions(+), 36 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 4ccfbac0797a..801f66454e70 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -83,9 +83,36 @@ static bool vfio_pci_is_denylisted(struct pci_dev *pdev)
 	return true;
 }
 
+static int vfio_pci_open_device(struct vfio_device *core_vdev)
+{
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
+	struct pci_dev *pdev = vdev->pdev;
+	int ret;
+
+	ret = vfio_pci_core_enable(vdev);
+	if (ret)
+		return ret;
+
+	if (vfio_pci_is_vga(pdev) &&
+	    pdev->vendor == PCI_VENDOR_ID_INTEL &&
+	    IS_ENABLED(CONFIG_VFIO_PCI_IGD)) {
+		ret = vfio_pci_igd_init(vdev);
+		if (ret && ret != -ENODEV) {
+			pci_warn(pdev, "Failed to setup Intel IGD regions\n");
+			vfio_pci_core_disable(vdev);
+			return ret;
+		}
+	}
+
+	vfio_pci_core_finish_enable(vdev);
+
+	return 0;
+}
+
 static const struct vfio_device_ops vfio_pci_ops = {
 	.name		= "vfio-pci",
-	.open_device	= vfio_pci_core_open_device,
+	.open_device	= vfio_pci_open_device,
 	.close_device	= vfio_pci_core_close_device,
 	.ioctl		= vfio_pci_core_ioctl,
 	.read		= vfio_pci_core_read,
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 8323acc5d3b7..811601425798 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -91,11 +91,6 @@ static unsigned int vfio_pci_set_vga_decode(void *opaque, bool single_vga)
 	return decodes;
 }
 
-static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
-{
-	return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
-}
-
 static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev)
 {
 	struct resource *res;
@@ -166,7 +161,6 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev)
 
 struct vfio_pci_group_info;
 static void vfio_pci_try_bus_reset(struct vfio_pci_core_device *vdev);
-static void vfio_pci_disable(struct vfio_pci_core_device *vdev);
 static int vfio_hot_reset_device_set(struct vfio_pci_core_device *vdev,
 				     struct vfio_pci_group_info *groups);
 
@@ -252,7 +246,7 @@ int vfio_pci_set_power_state(struct vfio_pci_core_device *vdev, pci_power_t stat
 	return ret;
 }
 
-static int vfio_pci_enable(struct vfio_pci_core_device *vdev)
+int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	int ret;
@@ -321,26 +315,11 @@ static int vfio_pci_enable(struct vfio_pci_core_device *vdev)
 	if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev))
 		vdev->has_vga = true;
 
-	if (vfio_pci_is_vga(pdev) &&
-	    pdev->vendor == PCI_VENDOR_ID_INTEL &&
-	    IS_ENABLED(CONFIG_VFIO_PCI_IGD)) {
-		ret = vfio_pci_igd_init(vdev);
-		if (ret && ret != -ENODEV) {
-			pci_warn(pdev, "Failed to setup Intel IGD regions\n");
-			goto disable_exit;
-		}
-	}
-
-	vfio_pci_probe_mmaps(vdev);
 
 	return 0;
-
-disable_exit:
-	vfio_pci_disable(vdev);
-	return ret;
 }
 
-static void vfio_pci_disable(struct vfio_pci_core_device *vdev)
+void vfio_pci_core_disable(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	struct vfio_pci_dummy_resource *dummy_res, *tmp;
@@ -481,7 +460,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 
 	vfio_pci_vf_token_user_add(vdev, -1);
 	vfio_spapr_pci_eeh_release(vdev->pdev);
-	vfio_pci_disable(vdev);
+	vfio_pci_core_disable(vdev);
 
 	mutex_lock(&vdev->igate);
 	if (vdev->err_trigger) {
@@ -495,19 +474,11 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 	mutex_unlock(&vdev->igate);
 }
 
-int vfio_pci_core_open_device(struct vfio_device *core_vdev)
+void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
 {
-	struct vfio_pci_core_device *vdev =
-		container_of(core_vdev, struct vfio_pci_core_device, vdev);
-	int ret = 0;
-
-	ret = vfio_pci_enable(vdev);
-	if (ret)
-		return ret;
-
+	vfio_pci_probe_mmaps(vdev);
 	vfio_spapr_pci_eeh_open(vdev->pdev);
 	vfio_pci_vf_token_user_add(vdev, 1);
-	return 0;
 }
 
 static int vfio_pci_get_irq_count(struct vfio_pci_core_device *vdev, int irq_type)
diff --git a/drivers/vfio/pci/vfio_pci_core.h b/drivers/vfio/pci/vfio_pci_core.h
index 7dbdd4dda5c0..ffaf544f35db 100644
--- a/drivers/vfio/pci/vfio_pci_core.h
+++ b/drivers/vfio/pci/vfio_pci_core.h
@@ -210,7 +210,6 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 void vfio_pci_core_cleanup(void);
 int vfio_pci_core_init(void);
 void vfio_pci_core_close_device(struct vfio_device *core_vdev);
-int vfio_pci_core_open_device(struct vfio_device *core_vdev);
 void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
 			       struct pci_dev *pdev,
 			       const struct vfio_device_ops *vfio_pci_ops);
@@ -228,5 +227,13 @@ ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *bu
 int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma);
 void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count);
 int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf);
+int vfio_pci_core_enable(struct vfio_pci_core_device *vdev);
+void vfio_pci_core_disable(struct vfio_pci_core_device *vdev);
+void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev);
+
+static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
+{
+	return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
+}
 
 #endif /* VFIO_PCI_CORE_H */
-- 
2.18.1



* [PATCH 08/12] vfio/pci: Move module parameters to vfio_pci.c
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (6 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 07/12] vfio/pci: Move igd initialization to vfio_pci.c Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id Yishai Hadas
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

This is preparation for splitting vfio_pci.ko into two modules.

As module parameters are a kind of uAPI, they need to stay in vfio_pci.ko
to avoid a user-visible impact.

For now, continue to keep the implementation of these options in
vfio_pci_core.c. Arguably they are vfio_pci functionality, but further
splitting of vfio_pci_core.c will be better done in another series.
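
A userspace model of the hand-off is sketched here. The mock_* names and
the has_vga_config field are hypothetical; has_vga_config stands in for the
CONFIG_VFIO_PCI_VGA #ifdef in the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* "Core" side: private copies, as kept in vfio_pci_core.c. */
static bool core_nointxmask;
static bool core_disable_vga;
static bool core_disable_idle_d3;

/* Models vfio_pci_core_set_params(): the only way values reach the core. */
static void mock_core_set_params(bool is_nointxmask, bool is_disable_vga,
				 bool is_disable_idle_d3)
{
	core_nointxmask = is_nointxmask;
	core_disable_vga = is_disable_vga;
	core_disable_idle_d3 = is_disable_idle_d3;
}

static bool mock_core_nointxmask(void)
{
	return core_nointxmask;
}

static bool mock_core_vga_disabled(void)
{
	return core_disable_vga;
}

static bool mock_core_idle_d3_disabled(void)
{
	return core_disable_idle_d3;
}

/* "Driver" side: owns the user-visible knobs. */
struct mock_driver_params {
	bool nointxmask;
	bool has_vga_config;	/* models CONFIG_VFIO_PCI_VGA being set */
	bool disable_vga;
	bool disable_idle_d3;
};

/* Models the start of vfio_pci_init(): snapshot the knobs into the core. */
static void mock_driver_init(const struct mock_driver_params *p)
{
	/* Without VGA support built in, VGA access is treated as disabled. */
	bool is_disable_vga = true;

	assert(p != 0);

	if (p->has_vga_config)
		is_disable_vga = p->disable_vga;

	mock_core_set_params(p->nointxmask, is_disable_vga,
			     p->disable_idle_d3);
}

/* Usage helper: initialize with the given knobs, report the VGA state. */
static bool mock_vga_disabled_after_init(bool has_vga_config, bool disable_vga)
{
	struct mock_driver_params p = {
		.has_vga_config = has_vga_config,
		.disable_vga = disable_vga,
	};

	mock_driver_init(&p);
	return mock_core_vga_disabled();
}
```

One consequence visible in this model, and apparently in the diff as well:
the values are copied once at init time, so a later write to a parameter
declared writable would not reach the core's copy.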

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/pci/vfio_pci.c      | 23 +++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_core.c | 20 ++++++++------------
 drivers/vfio/pci/vfio_pci_core.h |  2 ++
 3 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 801f66454e70..0272b95d9c5f 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -35,6 +35,22 @@ static char ids[1024] __initdata;
 module_param_string(ids, ids, sizeof(ids), 0);
 MODULE_PARM_DESC(ids, "Initial PCI IDs to add to the vfio driver, format is \"vendor:device[:subvendor[:subdevice[:class[:class_mask]]]]\" and multiple comma separated entries can be specified");
 
+static bool nointxmask;
+module_param_named(nointxmask, nointxmask, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(nointxmask,
+		  "Disable support for PCI 2.3 style INTx masking.  If this resolves problems for specific devices, report lspci -vvvxxx to linux-pci@vger.kernel.org so the device can be fixed automatically via the broken_intx_masking flag.");
+
+#ifdef CONFIG_VFIO_PCI_VGA
+static bool disable_vga;
+module_param(disable_vga, bool, S_IRUGO);
+MODULE_PARM_DESC(disable_vga, "Disable VGA resource access through vfio-pci");
+#endif
+
+static bool disable_idle_d3;
+module_param(disable_idle_d3, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(disable_idle_d3,
+		 "Disable using the PCI D3 low power state for idle, unused devices");
+
 static bool enable_sriov;
 #ifdef CONFIG_PCI_IOV
 module_param(enable_sriov, bool, 0644);
@@ -218,6 +234,13 @@ static void __init vfio_pci_fill_ids(void)
 static int __init vfio_pci_init(void)
 {
 	int ret;
+	bool is_disable_vga = true;
+
+#ifdef CONFIG_VFIO_PCI_VGA
+	is_disable_vga = disable_vga;
+#endif
+
+	vfio_pci_core_set_params(nointxmask, is_disable_vga, disable_idle_d3);
 
 	ret = vfio_pci_core_init();
 	if (ret)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 811601425798..e65b154f17c3 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -28,20 +28,8 @@
 #include "vfio_pci_core.h"
 
 static bool nointxmask;
-module_param_named(nointxmask, nointxmask, bool, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(nointxmask,
-		  "Disable support for PCI 2.3 style INTx masking.  If this resolves problems for specific devices, report lspci -vvvxxx to linux-pci@vger.kernel.org so the device can be fixed automatically via the broken_intx_masking flag.");
-
-#ifdef CONFIG_VFIO_PCI_VGA
 static bool disable_vga;
-module_param(disable_vga, bool, S_IRUGO);
-MODULE_PARM_DESC(disable_vga, "Disable VGA resource access through vfio-pci");
-#endif
-
 static bool disable_idle_d3;
-module_param(disable_idle_d3, bool, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(disable_idle_d3,
-		 "Disable using the PCI D3 low power state for idle, unused devices");
 
 static inline bool vfio_vga_disabled(void)
 {
@@ -2099,6 +2087,14 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_core_device *vdev)
 	}
 }
 
+void vfio_pci_core_set_params(bool is_nointxmask, bool is_disable_vga,
+			      bool is_disable_idle_d3)
+{
+	nointxmask = is_nointxmask;
+	disable_vga = is_disable_vga;
+	disable_idle_d3 = is_disable_idle_d3;
+}
+
 /* This will become the __exit function of vfio_pci_core.ko */
 void vfio_pci_core_cleanup(void)
 {
diff --git a/drivers/vfio/pci/vfio_pci_core.h b/drivers/vfio/pci/vfio_pci_core.h
index ffaf544f35db..7a2da1e14de3 100644
--- a/drivers/vfio/pci/vfio_pci_core.h
+++ b/drivers/vfio/pci/vfio_pci_core.h
@@ -209,6 +209,8 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 /* Will be exported for vfio pci drivers usage */
 void vfio_pci_core_cleanup(void);
 int vfio_pci_core_init(void);
+void vfio_pci_core_set_params(bool nointxmask, bool is_disable_vga,
+			      bool is_disable_idle_d3);
 void vfio_pci_core_close_device(struct vfio_device *core_vdev);
 void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
 			       struct pci_dev *pdev,
-- 
2.18.1



* [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (7 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 08/12] vfio/pci: Move module parameters " Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-27 16:34   ` Alex Williamson
                     ` (2 more replies)
  2021-07-21 16:16 ` [PATCH 10/12] vfio: Use select for eventfd Yishai Hadas
                   ` (3 subsequent siblings)
  12 siblings, 3 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

The new flags field allows PCI drivers to signal the core code during
driver matching and when generating the modules.alias information.

The first use will be to define a VFIO flag that indicates the PCI driver
is a VFIO driver.

VFIO drivers have a few special properties compared to normal PCI drivers:
 - They do not automatically bind. VFIO drivers are used to swap out the
   normal driver for a device and convert the PCI device to the VFIO
   subsystem.

   The admin must make this choice and, following the current uAPI, this is
   usually done through the driver_override sysfs attribute.

 - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
   them with 'vfio_pci:' instead of the normal 'pci:'.

   This allows the userspace machinery that switches devices to VFIO to
   know which kernel drivers support which devices and allows it to trigger
   the proper driver_override.

As existing tools do not recognize the "vfio_pci:" mod-alias prefix, this
keeps today's behavior the same. VFIO remains on the side, is never
autoloaded, and can only be activated by direct admin action.

This patch adds the infrastructure that exposes this information to
userspace through modules.alias and enables it for vfio-pci, currently the
only VFIO PCI driver. Later series will introduce additional HW specific
VFIO PCI drivers.
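
The matching rule and the alias prefix can be modelled in userspace as
below. This is a simplified sketch, not the kernel code: the mock_* names
and the example tables are hypothetical, only vendor/device are compared,
and has_override means "driver_override is set and names this driver".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_ANY_ID			0xffffffffu
#define MOCK_F_VFIO_DRIVER_OVERRIDE	(1u << 0)

/* Reduced model of struct pci_device_id with the new flags field. */
struct mock_id {
	uint32_t vendor, device, flags;
};

/* First table entry matching the device; an all-zero entry ends the table. */
static const struct mock_id *mock_match_id(const struct mock_id *ids,
					   uint32_t vendor, uint32_t device)
{
	for (; ids->vendor || ids->device || ids->flags; ids++) {
		if ((ids->vendor == MOCK_ANY_ID || ids->vendor == vendor) &&
		    (ids->device == MOCK_ANY_ID || ids->device == device))
			return ids;
	}
	return NULL;
}

/* Flagged entries only count when a driver_override is in effect. */
static const struct mock_id *mock_match_device(const struct mock_id *ids,
					       uint32_t vendor, uint32_t device,
					       bool has_override)
{
	const struct mock_id *found;

	while ((found = mock_match_id(ids, vendor, device))) {
		if ((found->flags & MOCK_F_VFIO_DRIVER_OVERRIDE) &&
		    !has_override)
			ids = found + 1;	/* skip it, keep searching */
		else
			break;
	}
	return found;
}

/* Models the prefix choice made when generating modules.alias entries. */
static const char *mock_alias_prefix(uint32_t flags)
{
	return (flags & MOCK_F_VFIO_DRIVER_OVERRIDE) ? "vfio_pci:" : "pci:";
}

/* Example: a match-all VFIO table, like the one added to vfio_pci.c. */
static const struct mock_id mock_vfio_table[] = {
	{ MOCK_ANY_ID, MOCK_ANY_ID, MOCK_F_VFIO_DRIVER_OVERRIDE },
	{ 0, 0, 0 },
};

/* Example: a made-up table mixing a flagged and a normal entry. */
static const struct mock_id mock_mixed_table[] = {
	{ 0x8086, 0x1234, MOCK_F_VFIO_DRIVER_OVERRIDE },
	{ 0x8086, 0x1234, 0 },
	{ 0, 0, 0 },
};
```

With the match-all table, a device without driver_override finds no entry,
which is what keeps the driver from binding automatically. In the mixed
table the search steps past the flagged entry and lands on the normal one.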

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 Documentation/PCI/pci.rst         |  1 +
 drivers/pci/pci-driver.c          | 25 +++++++++++++++++++++----
 drivers/vfio/pci/vfio_pci.c       |  9 ++++++++-
 include/linux/mod_devicetable.h   |  7 +++++++
 include/linux/pci.h               | 27 +++++++++++++++++++++++++++
 scripts/mod/devicetable-offsets.c |  1 +
 scripts/mod/file2alias.c          |  8 ++++++--
 7 files changed, 71 insertions(+), 7 deletions(-)

diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
index fa651e25d98c..24e70a386887 100644
--- a/Documentation/PCI/pci.rst
+++ b/Documentation/PCI/pci.rst
@@ -103,6 +103,7 @@ need pass only as many optional fields as necessary:
   - subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
   - class and classmask fields default to 0
   - driver_data defaults to 0UL.
+  - flags field defaults to 0.
 
 Note that driver_data must match the value used by any of the pci_device_id
 entries defined in the driver. This makes the driver_data field mandatory
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 3a72352aa5cf..1ed8a4ab96f1 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -136,7 +136,7 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
 						    struct pci_dev *dev)
 {
 	struct pci_dynid *dynid;
-	const struct pci_device_id *found_id = NULL;
+	const struct pci_device_id *found_id = NULL, *ids;
 
 	/* When driver_override is set, only bind to the matching driver */
 	if (dev->driver_override && strcmp(dev->driver_override, drv->name))
@@ -152,10 +152,27 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
 	}
 	spin_unlock(&drv->dynids.lock);
 
-	if (!found_id)
-		found_id = pci_match_id(drv->id_table, dev);
+	if (found_id)
+		return found_id;
+
+	ids = drv->id_table;
+	while ((found_id = pci_match_id(ids, dev))) {
+		/*
+		 * The match table is split based on driver_override. Check the
+		 * flags as well so that any matching
+		 * PCI_ID_F_VFIO_DRIVER_OVERRIDE entry is returned.
+		 */
+		if ((found_id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) &&
+		    !dev->driver_override)
+			ids = found_id + 1;
+		else
+			break;
+	}
 
-	/* driver_override will always match, send a dummy id */
+	/*
+	 * if no static match, driver_override will always match, send a dummy
+	 * id.
+	 */
 	if (!found_id && dev->driver_override)
 		found_id = &pci_device_id_any;
 
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 0272b95d9c5f..7a43edbe8618 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -181,9 +181,16 @@ static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
 	return vfio_pci_core_sriov_configure(pdev, nr_virtfn);
 }
 
+static const struct pci_device_id vfio_pci_table[] = {
+	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_ANY_ID, PCI_ANY_ID) }, /* match all by default */
+	{ 0, }
+};
+
+MODULE_DEVICE_TABLE(pci, vfio_pci_table);
+
 static struct pci_driver vfio_pci_driver = {
 	.name			= "vfio-pci",
-	.id_table		= NULL, /* only dynamic ids */
+	.id_table		= vfio_pci_table,
 	.probe			= vfio_pci_probe,
 	.remove			= vfio_pci_remove,
 	.sriov_configure	= vfio_pci_sriov_configure,
diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
index 8e291cfdaf06..cd256d9c60d2 100644
--- a/include/linux/mod_devicetable.h
+++ b/include/linux/mod_devicetable.h
@@ -16,6 +16,11 @@ typedef unsigned long kernel_ulong_t;
 
 #define PCI_ANY_ID (~0)
 
+
+enum pci_id_flags {
+	PCI_ID_F_VFIO_DRIVER_OVERRIDE	= 1 << 0,
+};
+
 /**
  * struct pci_device_id - PCI device ID structure
  * @vendor:		Vendor ID to match (or PCI_ANY_ID)
@@ -34,12 +39,14 @@ typedef unsigned long kernel_ulong_t;
  *			Best practice is to use driver_data as an index
  *			into a static list of equivalent device types,
  *			instead of using it as a pointer.
+ * @flags:		PCI flags of the driver. Bitmap of pci_id_flags enum.
  */
 struct pci_device_id {
 	__u32 vendor, device;		/* Vendor and device ID or PCI_ANY_ID*/
 	__u32 subvendor, subdevice;	/* Subsystem ID's or PCI_ANY_ID */
 	__u32 class, class_mask;	/* (class,subclass,prog-if) triplet */
 	kernel_ulong_t driver_data;	/* Data private to the driver */
+	__u32 flags;
 };
 
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 540b377ca8f6..fd84609ff06b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -901,6 +901,33 @@ struct pci_driver {
 	.vendor = (vend), .device = (dev), \
 	.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID
 
+/**
+ * PCI_DEVICE_FLAGS - macro used to describe a PCI device with specific flags.
+ * @vend: the 16 bit PCI Vendor ID
+ * @dev: the 16 bit PCI Device ID
+ * @fl: PCI Device flags as a bitmap of pci_id_flags enum
+ *
+ * This macro is used to create a struct pci_device_id that matches a
+ * specific device. The subvendor and subdevice fields will be set to
+ * PCI_ANY_ID.
+ */
+#define PCI_DEVICE_FLAGS(vend, dev, fl) \
+	.vendor = (vend), .device = (dev), .subvendor = PCI_ANY_ID, \
+	.subdevice = PCI_ANY_ID, .flags = (fl)
+
+/**
+ * PCI_DRIVER_OVERRIDE_DEVICE_VFIO - macro used to describe a VFIO
+ *                                   "driver_override" PCI device.
+ * @vend: the 16 bit PCI Vendor ID
+ * @dev: the 16 bit PCI Device ID
+ *
+ * This macro is used to create a struct pci_device_id that matches a
+ * specific device. The subvendor and subdevice fields will be set to
+ * PCI_ANY_ID and the flags will be set to PCI_ID_F_VFIO_DRIVER_OVERRIDE.
+ */
+#define PCI_DRIVER_OVERRIDE_DEVICE_VFIO(vend, dev) \
+	PCI_DEVICE_FLAGS(vend, dev, PCI_ID_F_VFIO_DRIVER_OVERRIDE)
+
 /**
  * PCI_DEVICE_SUB - macro used to describe a specific PCI device with subsystem
  * @vend: the 16 bit PCI Vendor ID
diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-offsets.c
index 9bb6c7edccc4..b927c36b8333 100644
--- a/scripts/mod/devicetable-offsets.c
+++ b/scripts/mod/devicetable-offsets.c
@@ -42,6 +42,7 @@ int main(void)
 	DEVID_FIELD(pci_device_id, subdevice);
 	DEVID_FIELD(pci_device_id, class);
 	DEVID_FIELD(pci_device_id, class_mask);
+	DEVID_FIELD(pci_device_id, flags);
 
 	DEVID(ccw_device_id);
 	DEVID_FIELD(ccw_device_id, match_flags);
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
index 7c97fa8e36bc..f53b38e8f696 100644
--- a/scripts/mod/file2alias.c
+++ b/scripts/mod/file2alias.c
@@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
 	return 1;
 }
 
-/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
+/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
 static int do_pci_entry(const char *filename,
 			void *symval, char *alias)
 {
@@ -440,8 +440,12 @@ static int do_pci_entry(const char *filename,
 	DEF_FIELD(symval, pci_device_id, subdevice);
 	DEF_FIELD(symval, pci_device_id, class);
 	DEF_FIELD(symval, pci_device_id, class_mask);
+	DEF_FIELD(symval, pci_device_id, flags);
 
-	strcpy(alias, "pci:");
+	if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
+		strcpy(alias, "vfio_pci:");
+	else
+		strcpy(alias, "pci:");
 	ADD(alias, "v", vendor != PCI_ANY_ID, vendor);
 	ADD(alias, "d", device != PCI_ANY_ID, device);
 	ADD(alias, "sv", subvendor != PCI_ANY_ID, subvendor);
-- 
2.18.1



* [PATCH 10/12] vfio: Use select for eventfd
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (8 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 11/12] vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on' Yishai Hadas
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Jason Gunthorpe <jgg@nvidia.com>

If VFIO_VIRQFD is required then turn on eventfd automatically.
The majority of Kconfig users of EVENTFD use 'select' rather than
'depends on'.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/Kconfig          | 3 ++-
 drivers/vfio/fsl-mc/Kconfig   | 3 ++-
 drivers/vfio/pci/Kconfig      | 2 +-
 drivers/vfio/platform/Kconfig | 2 +-
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index e44bf736e2b2..698ca35b3f03 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -16,7 +16,8 @@ config VFIO_SPAPR_EEH
 
 config VFIO_VIRQFD
 	tristate
-	depends on VFIO && EVENTFD
+	depends on VFIO
+	select EVENTFD
 	default n
 
 menuconfig VFIO
diff --git a/drivers/vfio/fsl-mc/Kconfig b/drivers/vfio/fsl-mc/Kconfig
index b1a527d6b6f2..6df66813c882 100644
--- a/drivers/vfio/fsl-mc/Kconfig
+++ b/drivers/vfio/fsl-mc/Kconfig
@@ -1,6 +1,7 @@
 config VFIO_FSL_MC
 	tristate "VFIO support for QorIQ DPAA2 fsl-mc bus devices"
-	depends on VFIO && FSL_MC_BUS && EVENTFD
+	depends on VFIO && FSL_MC_BUS
+	select EVENTFD
 	help
 	  Driver to enable support for the VFIO QorIQ DPAA2 fsl-mc
 	  (Management Complex) devices. This is required to passthrough
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 5e2e1b9a9fd3..d208a95a2767 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VFIO_PCI
 	tristate "VFIO support for PCI devices"
-	depends on VFIO && PCI && EVENTFD
+	depends on VFIO && PCI
 	depends on MMU
 	select VFIO_VIRQFD
 	select IRQ_BYPASS_MANAGER
diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
index ab341108a0be..7f78eb96a5d5 100644
--- a/drivers/vfio/platform/Kconfig
+++ b/drivers/vfio/platform/Kconfig
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VFIO_PLATFORM
 	tristate "VFIO support for platform devices"
-	depends on VFIO && EVENTFD && (ARM || ARM64 || COMPILE_TEST)
+	depends on VFIO && (ARM || ARM64 || COMPILE_TEST)
 	select VFIO_VIRQFD
 	help
 	  Support for platform devices with VFIO. This is required to make
-- 
2.18.1



* [PATCH 11/12] vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on'
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (9 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 10/12] vfio: Use select for eventfd Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 16:16 ` [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko Yishai Hadas
  2021-08-04 13:41 ` [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
  12 siblings, 0 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Jason Gunthorpe <jgg@nvidia.com>

This results in less kconfig wordage and a simpler understanding of the
required "depends on" to create the menu structure.

The next patch increases the nesting level a lot so this is a nice
preparatory simplification.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/Kconfig                | 28 ++++++++++++++--------------
 drivers/vfio/fsl-mc/Kconfig         |  2 +-
 drivers/vfio/mdev/Kconfig           |  1 -
 drivers/vfio/pci/Kconfig            | 11 ++++++-----
 drivers/vfio/platform/Kconfig       |  6 ++++--
 drivers/vfio/platform/reset/Kconfig |  4 +---
 6 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 698ca35b3f03..6130d00252ed 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -1,12 +1,22 @@
 # SPDX-License-Identifier: GPL-2.0-only
+menuconfig VFIO
+	tristate "VFIO Non-Privileged userspace driver framework"
+	select IOMMU_API
+	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
+	help
+	  VFIO provides a framework for secure userspace device drivers.
+	  See Documentation/driver-api/vfio.rst for more details.
+
+	  If you don't know what to do here, say N.
+
+if VFIO
 config VFIO_IOMMU_TYPE1
 	tristate
-	depends on VFIO
 	default n
 
 config VFIO_IOMMU_SPAPR_TCE
 	tristate
-	depends on VFIO && SPAPR_TCE_IOMMU
+	depends on SPAPR_TCE_IOMMU
 	default VFIO
 
 config VFIO_SPAPR_EEH
@@ -16,23 +26,11 @@ config VFIO_SPAPR_EEH
 
 config VFIO_VIRQFD
 	tristate
-	depends on VFIO
 	select EVENTFD
 	default n
 
-menuconfig VFIO
-	tristate "VFIO Non-Privileged userspace driver framework"
-	select IOMMU_API
-	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
-	help
-	  VFIO provides a framework for secure userspace device drivers.
-	  See Documentation/driver-api/vfio.rst for more details.
-
-	  If you don't know what to do here, say N.
-
 config VFIO_NOIOMMU
 	bool "VFIO No-IOMMU support"
-	depends on VFIO
 	help
 	  VFIO is built on the ability to isolate devices using the IOMMU.
 	  Only with an IOMMU can userspace access to DMA capable devices be
@@ -49,4 +47,6 @@ source "drivers/vfio/pci/Kconfig"
 source "drivers/vfio/platform/Kconfig"
 source "drivers/vfio/mdev/Kconfig"
 source "drivers/vfio/fsl-mc/Kconfig"
+endif
+
 source "virt/lib/Kconfig"
diff --git a/drivers/vfio/fsl-mc/Kconfig b/drivers/vfio/fsl-mc/Kconfig
index 6df66813c882..597d338c5c8a 100644
--- a/drivers/vfio/fsl-mc/Kconfig
+++ b/drivers/vfio/fsl-mc/Kconfig
@@ -1,6 +1,6 @@
 config VFIO_FSL_MC
 	tristate "VFIO support for QorIQ DPAA2 fsl-mc bus devices"
-	depends on VFIO && FSL_MC_BUS
+	depends on FSL_MC_BUS
 	select EVENTFD
 	help
 	  Driver to enable support for the VFIO QorIQ DPAA2 fsl-mc
diff --git a/drivers/vfio/mdev/Kconfig b/drivers/vfio/mdev/Kconfig
index 763c877a1318..646dbed44eb2 100644
--- a/drivers/vfio/mdev/Kconfig
+++ b/drivers/vfio/mdev/Kconfig
@@ -2,7 +2,6 @@
 
 config VFIO_MDEV
 	tristate "Mediated device driver framework"
-	depends on VFIO
 	default n
 	help
 	  Provides a framework to virtualize devices.
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index d208a95a2767..afdab7d71e98 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VFIO_PCI
 	tristate "VFIO support for PCI devices"
-	depends on VFIO && PCI
+	depends on PCI
 	depends on MMU
 	select VFIO_VIRQFD
 	select IRQ_BYPASS_MANAGER
@@ -11,9 +11,10 @@ config VFIO_PCI
 
 	  If you don't know what to do here, say N.
 
+if VFIO_PCI
 config VFIO_PCI_VGA
 	bool "VFIO PCI support for VGA devices"
-	depends on VFIO_PCI && X86 && VGA_ARB
+	depends on X86 && VGA_ARB
 	help
 	  Support for VGA extension to VFIO PCI.  This exposes an additional
 	  region on VGA devices for accessing legacy VGA addresses used by
@@ -22,16 +23,14 @@ config VFIO_PCI_VGA
 	  If you don't know what to do here, say N.
 
 config VFIO_PCI_MMAP
-	depends on VFIO_PCI
 	def_bool y if !S390
 
 config VFIO_PCI_INTX
-	depends on VFIO_PCI
 	def_bool y if !S390
 
 config VFIO_PCI_IGD
 	bool "VFIO PCI extensions for Intel graphics (GVT-d)"
-	depends on VFIO_PCI && X86
+	depends on X86
 	default y
 	help
 	  Support for Intel IGD specific extensions to enable direct
@@ -40,3 +39,5 @@ config VFIO_PCI_IGD
 	  and LPC bridge config space.
 
 	  To enable Intel IGD assignment through vfio-pci, say Y.
+
+endif
diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
index 7f78eb96a5d5..331a5920f5ab 100644
--- a/drivers/vfio/platform/Kconfig
+++ b/drivers/vfio/platform/Kconfig
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VFIO_PLATFORM
 	tristate "VFIO support for platform devices"
-	depends on VFIO && (ARM || ARM64 || COMPILE_TEST)
+	depends on ARM || ARM64 || COMPILE_TEST
 	select VFIO_VIRQFD
 	help
 	  Support for platform devices with VFIO. This is required to make
@@ -10,9 +10,10 @@ config VFIO_PLATFORM
 
 	  If you don't know what to do here, say N.
 
+if VFIO_PLATFORM
 config VFIO_AMBA
 	tristate "VFIO support for AMBA devices"
-	depends on VFIO_PLATFORM && (ARM_AMBA || COMPILE_TEST)
+	depends on ARM_AMBA || COMPILE_TEST
 	help
 	  Support for ARM AMBA devices with VFIO. This is required to make
 	  use of ARM AMBA devices present on the system using the VFIO
@@ -21,3 +22,4 @@ config VFIO_AMBA
 	  If you don't know what to do here, say N.
 
 source "drivers/vfio/platform/reset/Kconfig"
+endif
diff --git a/drivers/vfio/platform/reset/Kconfig b/drivers/vfio/platform/reset/Kconfig
index 1edbe9ee7356..12f5f3d80387 100644
--- a/drivers/vfio/platform/reset/Kconfig
+++ b/drivers/vfio/platform/reset/Kconfig
@@ -1,7 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VFIO_PLATFORM_CALXEDAXGMAC_RESET
 	tristate "VFIO support for calxeda xgmac reset"
-	depends on VFIO_PLATFORM
 	help
 	  Enables the VFIO platform driver to handle reset for Calxeda xgmac
 
@@ -9,7 +8,6 @@ config VFIO_PLATFORM_CALXEDAXGMAC_RESET
 
 config VFIO_PLATFORM_AMDXGBE_RESET
 	tristate "VFIO support for AMD XGBE reset"
-	depends on VFIO_PLATFORM
 	help
 	  Enables the VFIO platform driver to handle reset for AMD XGBE
 
@@ -17,7 +15,7 @@ config VFIO_PLATFORM_AMDXGBE_RESET
 
 config VFIO_PLATFORM_BCMFLEXRM_RESET
 	tristate "VFIO support for Broadcom FlexRM reset"
-	depends on VFIO_PLATFORM && (ARCH_BCM_IPROC || COMPILE_TEST)
+	depends on ARCH_BCM_IPROC || COMPILE_TEST
 	default ARCH_BCM_IPROC
 	help
 	  Enables the VFIO platform driver to handle reset for Broadcom FlexRM
-- 
2.18.1



* [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (10 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 11/12] vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on' Yishai Hadas
@ 2021-07-21 16:16 ` Yishai Hadas
  2021-07-21 17:39   ` Leon Romanovsky
  2021-07-27 21:54   ` Alex Williamson
  2021-08-04 13:41 ` [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
  12 siblings, 2 replies; 55+ messages in thread
From: Yishai Hadas @ 2021-07-21 16:16 UTC (permalink / raw)
  To: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, yishaih, maorg, leonro

From: Max Gurtovoy <mgurtovoy@nvidia.com>

Now that vfio_pci has been split into two source modules, one focusing
on the "struct pci_driver" (vfio_pci.c) and a toolbox library of code
(vfio_pci_core.c), complete the split and move them into two different
kernel modules.

As before, vfio_pci.ko continues to present the same interface under
sysfs, and this change will have no functional impact.

Splitting into another module and adding exports allows creating new HW
specific VFIO PCI drivers that can implement device specific
functionality, such as VFIO migration interfaces or specialized device
requirements.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/Kconfig                      | 30 ++++++++------
 drivers/vfio/pci/Makefile                     |  8 ++--
 drivers/vfio/pci/vfio_pci.c                   | 14 ++-----
 drivers/vfio/pci/vfio_pci_config.c            |  2 +-
 drivers/vfio/pci/vfio_pci_core.c              | 41 ++++++++++++++++---
 drivers/vfio/pci/vfio_pci_igd.c               |  2 +-
 drivers/vfio/pci/vfio_pci_intrs.c             |  2 +-
 drivers/vfio/pci/vfio_pci_rdwr.c              |  2 +-
 drivers/vfio/pci/vfio_pci_zdev.c              |  2 +-
 .../pci => include/linux}/vfio_pci_core.h     |  2 -
 10 files changed, 66 insertions(+), 39 deletions(-)
 rename {drivers/vfio/pci => include/linux}/vfio_pci_core.h (99%)

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index afdab7d71e98..18898ae49919 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -1,19 +1,31 @@
 # SPDX-License-Identifier: GPL-2.0-only
-config VFIO_PCI
+config VFIO_PCI_CORE
 	tristate "VFIO support for PCI devices"
 	depends on PCI
 	depends on MMU
 	select VFIO_VIRQFD
 	select IRQ_BYPASS_MANAGER
 	help
-	  Support for the PCI VFIO bus driver.  This is required to make
-	  use of PCI drivers using the VFIO framework.
+	  Support for using PCI devices with VFIO.
+
+if VFIO_PCI_CORE
+config VFIO_PCI_MMAP
+	def_bool y if !S390
+
+config VFIO_PCI_INTX
+	def_bool y if !S390
+
+config VFIO_PCI
+	tristate "Generic VFIO support for any PCI device"
+	help
+	  Support for the generic PCI VFIO bus driver which can connect any
+	  PCI device to the VFIO framework.
 
 	  If you don't know what to do here, say N.
 
 if VFIO_PCI
 config VFIO_PCI_VGA
-	bool "VFIO PCI support for VGA devices"
+	bool "Generic VFIO PCI support for VGA devices"
 	depends on X86 && VGA_ARB
 	help
 	  Support for VGA extension to VFIO PCI.  This exposes an additional
@@ -22,14 +34,8 @@ config VFIO_PCI_VGA
 
 	  If you don't know what to do here, say N.
 
-config VFIO_PCI_MMAP
-	def_bool y if !S390
-
-config VFIO_PCI_INTX
-	def_bool y if !S390
-
 config VFIO_PCI_IGD
-	bool "VFIO PCI extensions for Intel graphics (GVT-d)"
+	bool "Generic VFIO PCI extensions for Intel graphics (GVT-d)"
 	depends on X86
 	default y
 	help
@@ -39,5 +45,5 @@ config VFIO_PCI_IGD
 	  and LPC bridge config space.
 
 	  To enable Intel IGD assignment through vfio-pci, say Y.
-
+endif
 endif
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 8aa517b4b671..349d68d242b4 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,7 +1,9 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
-vfio-pci-y := vfio_pci.o vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
-vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
-vfio-pci-$(CONFIG_S390) += vfio_pci_zdev.o
+vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-pci-core-$(CONFIG_S390) += vfio_pci_zdev.o
+obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o
 
+vfio-pci-y := vfio_pci.o
+vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 7a43edbe8618..41b4742aef20 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -25,7 +25,7 @@
 #include <linux/types.h>
 #include <linux/uaccess.h>
 
-#include "vfio_pci_core.h"
+#include <linux/vfio_pci_core.h>
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
@@ -154,6 +154,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	ret = vfio_pci_core_register_device(vdev);
 	if (ret)
 		goto out_free;
+	dev_set_drvdata(&pdev->dev, vdev);
 	return 0;
 
 out_free:
@@ -249,14 +250,10 @@ static int __init vfio_pci_init(void)
 
 	vfio_pci_core_set_params(nointxmask, is_disable_vga, disable_idle_d3);
 
-	ret = vfio_pci_core_init();
-	if (ret)
-		return ret;
-
 	/* Register and scan for devices */
 	ret = pci_register_driver(&vfio_pci_driver);
 	if (ret)
-		goto out;
+		return ret;
 
 	vfio_pci_fill_ids();
 
@@ -264,17 +261,12 @@ static int __init vfio_pci_init(void)
 		pr_warn("device denylist disabled.\n");
 
 	return 0;
-
-out:
-	vfio_pci_core_cleanup();
-	return ret;
 }
 module_init(vfio_pci_init);
 
 static void __exit vfio_pci_cleanup(void)
 {
 	pci_unregister_driver(&vfio_pci_driver);
-	vfio_pci_core_cleanup();
 }
 module_exit(vfio_pci_cleanup);
 
diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 1f034f768a27..6e58b4bf7a60 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -26,7 +26,7 @@
 #include <linux/vfio.h>
 #include <linux/slab.h>
 
-#include "vfio_pci_core.h"
+#include <linux/vfio_pci_core.h>
 
 /* Fake capability ID for standard config space */
 #define PCI_CAP_ID_BASIC	0
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index e65b154f17c3..3a5b6d889a69 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -8,6 +8,8 @@
  * Author: Tom Lyon, pugs@cisco.com
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/device.h>
 #include <linux/eventfd.h>
 #include <linux/file.h>
@@ -25,7 +27,11 @@
 #include <linux/nospec.h>
 #include <linux/sched/mm.h>
 
-#include "vfio_pci_core.h"
+#include <linux/vfio_pci_core.h>
+
+#define DRIVER_VERSION  "0.2"
+#define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
+#define DRIVER_DESC "core driver for VFIO based PCI devices"
 
 static bool nointxmask;
 static bool disable_vga;
@@ -306,6 +312,7 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_enable);
 
 void vfio_pci_core_disable(struct vfio_pci_core_device *vdev)
 {
@@ -405,6 +412,7 @@ void vfio_pci_core_disable(struct vfio_pci_core_device *vdev)
 	if (!disable_idle_d3)
 		vfio_pci_set_power_state(vdev, PCI_D3hot);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_disable);
 
 static struct vfio_pci_core_device *get_pf_vdev(struct vfio_pci_core_device *vdev)
 {
@@ -461,6 +469,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 	}
 	mutex_unlock(&vdev->igate);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
 
 void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
 {
@@ -468,6 +477,7 @@ void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
 	vfio_spapr_pci_eeh_open(vdev->pdev);
 	vfio_pci_vf_token_user_add(vdev, 1);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_finish_enable);
 
 static int vfio_pci_get_irq_count(struct vfio_pci_core_device *vdev, int irq_type)
 {
@@ -626,6 +636,7 @@ int vfio_pci_register_dev_region(struct vfio_pci_core_device *vdev,
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(vfio_pci_register_dev_region);
 
 long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 		unsigned long arg)
@@ -1170,6 +1181,7 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 
 	return -ENOTTY;
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_ioctl);
 
 static ssize_t vfio_pci_rw(struct vfio_pci_core_device *vdev, char __user *buf,
 			   size_t count, loff_t *ppos, bool iswrite)
@@ -1213,6 +1225,7 @@ ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
 
 	return vfio_pci_rw(vdev, buf, count, ppos, false);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_read);
 
 ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
 		size_t count, loff_t *ppos)
@@ -1225,6 +1238,7 @@ ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *bu
 
 	return vfio_pci_rw(vdev, (char __user *)buf, count, ppos, true);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_write);
 
 /* Return 1 on zap and vma_lock acquired, 0 on contention (only with @try) */
 static int vfio_pci_zap_and_vma_lock(struct vfio_pci_core_device *vdev, bool try)
@@ -1503,6 +1517,7 @@ int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_mmap);
 
 void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count)
 {
@@ -1525,6 +1540,7 @@ void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count)
 
 	mutex_unlock(&vdev->igate);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_request);
 
 static int vfio_pci_validate_vf_token(struct vfio_pci_core_device *vdev,
 				      bool vf_token, uuid_t *uuid)
@@ -1669,6 +1685,7 @@ int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf)
 
 	return 1; /* Match */
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_match);
 
 static int vfio_pci_bus_notifier(struct notifier_block *nb,
 				 unsigned long action, void *data)
@@ -1776,6 +1793,7 @@ void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
 	INIT_LIST_HEAD(&vdev->vma_list);
 	init_rwsem(&vdev->memory_lock);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_init_device);
 
 void vfio_pci_core_uninit_device(struct vfio_pci_core_device *vdev)
 {
@@ -1786,6 +1804,7 @@ void vfio_pci_core_uninit_device(struct vfio_pci_core_device *vdev)
 	kfree(vdev->region);
 	kfree(vdev->pm_save);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_uninit_device);
 
 int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
 {
@@ -1847,7 +1866,6 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
 	ret = vfio_register_group_dev(&vdev->vdev);
 	if (ret)
 		goto out_power;
-	dev_set_drvdata(&pdev->dev, vdev);
 	return 0;
 
 out_power:
@@ -1859,6 +1877,7 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
 	vfio_iommu_group_put(group, &pdev->dev);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_register_device);
 
 void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev)
 {
@@ -1876,6 +1895,7 @@ void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev)
 	if (!disable_idle_d3)
 		vfio_pci_set_power_state(vdev, PCI_D0);
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_unregister_device);
 
 static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 						  pci_channel_state_t state)
@@ -1921,10 +1941,12 @@ int vfio_pci_core_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
 
 	return ret < 0 ? ret : nr_virtfn;
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_sriov_configure);
 
 const struct pci_error_handlers vfio_pci_core_err_handlers = {
 	.error_detected = vfio_pci_aer_err_detected,
 };
+EXPORT_SYMBOL_GPL(vfio_pci_core_err_handlers);
 
 static int vfio_pci_check_all_devices_bound(struct pci_dev *pdev, void *data)
 {
@@ -2094,16 +2116,23 @@ void vfio_pci_core_set_params(bool is_nointxmask, bool is_disable_vga,
 	disable_vga = is_disable_vga;
 	disable_idle_d3 = is_disable_idle_d3;
 }
+EXPORT_SYMBOL_GPL(vfio_pci_core_set_params);
 
-/* This will become the __exit function of vfio_pci_core.ko */
-void vfio_pci_core_cleanup(void)
+static void vfio_pci_core_cleanup(void)
 {
 	vfio_pci_uninit_perm_bits();
 }
 
-/* This will become the __init function of vfio_pci_core.ko */
-int __init vfio_pci_core_init(void)
+static int __init vfio_pci_core_init(void)
 {
 	/* Allocate shared config space permission data used by all devices */
 	return vfio_pci_init_perm_bits();
 }
+
+module_init(vfio_pci_core_init);
+module_exit(vfio_pci_core_cleanup);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/drivers/vfio/pci/vfio_pci_igd.c b/drivers/vfio/pci/vfio_pci_igd.c
index a324ca7e6b5a..7ca4109bba48 100644
--- a/drivers/vfio/pci/vfio_pci_igd.c
+++ b/drivers/vfio/pci/vfio_pci_igd.c
@@ -15,7 +15,7 @@
 #include <linux/uaccess.h>
 #include <linux/vfio.h>
 
-#include "vfio_pci_core.h"
+#include <linux/vfio_pci_core.h>
 
 #define OPREGION_SIGNATURE	"IntelGraphicsMem"
 #define OPREGION_SIZE		(8 * 1024)
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 945ddbdf4d11..6069a11fb51a 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -20,7 +20,7 @@
 #include <linux/wait.h>
 #include <linux/slab.h>
 
-#include "vfio_pci_core.h"
+#include <linux/vfio_pci_core.h>
 
 /*
  * INTx
diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index 8fff4689dd44..57d3b2cbbd8e 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -17,7 +17,7 @@
 #include <linux/vfio.h>
 #include <linux/vgaarb.h>
 
-#include "vfio_pci_core.h"
+#include <linux/vfio_pci_core.h>
 
 #ifdef __LITTLE_ENDIAN
 #define vfio_ioread64	ioread64
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 2ffbdc11f089..fe4def9ffffb 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -19,7 +19,7 @@
 #include <asm/pci_clp.h>
 #include <asm/pci_io.h>
 
-#include "vfio_pci_core.h"
+#include <linux/vfio_pci_core.h>
 
 /*
  * Add the Base PCI Function information to the device info region.
diff --git a/drivers/vfio/pci/vfio_pci_core.h b/include/linux/vfio_pci_core.h
similarity index 99%
rename from drivers/vfio/pci/vfio_pci_core.h
rename to include/linux/vfio_pci_core.h
index 7a2da1e14de3..ef9a44b6cf5d 100644
--- a/drivers/vfio/pci/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -207,8 +207,6 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 #endif
 
 /* Will be exported for vfio pci drivers usage */
-void vfio_pci_core_cleanup(void);
-int vfio_pci_core_init(void);
 void vfio_pci_core_set_params(bool nointxmask, bool is_disable_vga,
 			      bool is_disable_idle_d3);
 void vfio_pci_core_close_device(struct vfio_device *core_vdev);
-- 
2.18.1


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-21 16:16 ` [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko Yishai Hadas
@ 2021-07-21 17:39   ` Leon Romanovsky
  2021-07-22  9:06     ` Yishai Hadas
  2021-07-27 21:54   ` Alex Williamson
  1 sibling, 1 reply; 55+ messages in thread
From: Leon Romanovsky @ 2021-07-21 17:39 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, mgurtovoy, jgg, maorg

On Wed, Jul 21, 2021 at 07:16:09PM +0300, Yishai Hadas wrote:
> From: Max Gurtovoy <mgurtovoy@nvidia.com>
> 
> Now that vfio_pci has been split into two source modules, one focusing
> on the "struct pci_driver" (vfio_pci.c) and a toolbox library of code
> (vfio_pci_core.c), complete the split and move them into two different
> kernel modules.
> 
> As before vfio_pci.ko continues to present the same interface under
> sysfs and this change will have no functional impact.
> 
> Splitting into another module and adding exports allows creating new HW
> specific VFIO PCI drivers that can implement device specific
> functionality, such as VFIO migration interfaces or specialized device
> requirements.
> 
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  drivers/vfio/pci/Kconfig                      | 30 ++++++++------
>  drivers/vfio/pci/Makefile                     |  8 ++--
>  drivers/vfio/pci/vfio_pci.c                   | 14 ++-----
>  drivers/vfio/pci/vfio_pci_config.c            |  2 +-
>  drivers/vfio/pci/vfio_pci_core.c              | 41 ++++++++++++++++---
>  drivers/vfio/pci/vfio_pci_igd.c               |  2 +-
>  drivers/vfio/pci/vfio_pci_intrs.c             |  2 +-
>  drivers/vfio/pci/vfio_pci_rdwr.c              |  2 +-
>  drivers/vfio/pci/vfio_pci_zdev.c              |  2 +-
>  .../pci => include/linux}/vfio_pci_core.h     |  2 -
>  10 files changed, 66 insertions(+), 39 deletions(-)
>  rename {drivers/vfio/pci => include/linux}/vfio_pci_core.h (99%)

<...>

> -#include "vfio_pci_core.h"
> +#include <linux/vfio_pci_core.h>
> +
> +#define DRIVER_VERSION  "0.2"

<...>

> +MODULE_VERSION(DRIVER_VERSION);

Please don't add driver versions to the upstream kernel; they are useless.

Thanks

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-21 17:39   ` Leon Romanovsky
@ 2021-07-22  9:06     ` Yishai Hadas
  2021-07-22  9:22       ` Max Gurtovoy
  0 siblings, 1 reply; 55+ messages in thread
From: Yishai Hadas @ 2021-07-22  9:06 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, mgurtovoy, jgg, maorg

On 7/21/2021 8:39 PM, Leon Romanovsky wrote:
> On Wed, Jul 21, 2021 at 07:16:09PM +0300, Yishai Hadas wrote:
>> From: Max Gurtovoy <mgurtovoy@nvidia.com>
>>
>> Now that vfio_pci has been split into two source modules, one focusing
>> on the "struct pci_driver" (vfio_pci.c) and a toolbox library of code
>> (vfio_pci_core.c), complete the split and move them into two different
>> kernel modules.
>>
>> As before vfio_pci.ko continues to present the same interface under
>> sysfs and this change will have no functional impact.
>>
>> Splitting into another module and adding exports allows creating new HW
>> specific VFIO PCI drivers that can implement device specific
>> functionality, such as VFIO migration interfaces or specialized device
>> requirements.
>>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>> ---
>>   drivers/vfio/pci/Kconfig                      | 30 ++++++++------
>>   drivers/vfio/pci/Makefile                     |  8 ++--
>>   drivers/vfio/pci/vfio_pci.c                   | 14 ++-----
>>   drivers/vfio/pci/vfio_pci_config.c            |  2 +-
>>   drivers/vfio/pci/vfio_pci_core.c              | 41 ++++++++++++++++---
>>   drivers/vfio/pci/vfio_pci_igd.c               |  2 +-
>>   drivers/vfio/pci/vfio_pci_intrs.c             |  2 +-
>>   drivers/vfio/pci/vfio_pci_rdwr.c              |  2 +-
>>   drivers/vfio/pci/vfio_pci_zdev.c              |  2 +-
>>   .../pci => include/linux}/vfio_pci_core.h     |  2 -
>>   10 files changed, 66 insertions(+), 39 deletions(-)
>>   rename {drivers/vfio/pci => include/linux}/vfio_pci_core.h (99%)
> <...>
>
>> -#include "vfio_pci_core.h"
>> +#include <linux/vfio_pci_core.h>
>> +
>> +#define DRIVER_VERSION  "0.2"
> <...>
>
>> +MODULE_VERSION(DRIVER_VERSION);
> Please don't add driver versions to the upstream kernel; they are useless.
>
> Thanks

This just preserves the code for driver/module version that was in 
vfio_pci.ko before the split.

However, this can be removed in V2 if needed.

Yishai


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-22  9:06     ` Yishai Hadas
@ 2021-07-22  9:22       ` Max Gurtovoy
  2021-07-23 14:13         ` Leon Romanovsky
  0 siblings, 1 reply; 55+ messages in thread
From: Max Gurtovoy @ 2021-07-22  9:22 UTC (permalink / raw)
  To: Yishai Hadas, Leon Romanovsky
  Cc: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, jgg, maorg


On 7/22/2021 12:06 PM, Yishai Hadas wrote:
> On 7/21/2021 8:39 PM, Leon Romanovsky wrote:
>> On Wed, Jul 21, 2021 at 07:16:09PM +0300, Yishai Hadas wrote:
>>> From: Max Gurtovoy <mgurtovoy@nvidia.com>
>>>
>>> Now that vfio_pci has been split into two source modules, one focusing
>>> on the "struct pci_driver" (vfio_pci.c) and a toolbox library of code
>>> (vfio_pci_core.c), complete the split and move them into two different
>>> kernel modules.
>>>
>>> As before vfio_pci.ko continues to present the same interface under
>>> sysfs and this change will have no functional impact.
>>>
>>> Splitting into another module and adding exports allows creating new HW
>>> specific VFIO PCI drivers that can implement device specific
>>> functionality, such as VFIO migration interfaces or specialized device
>>> requirements.
>>>
>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>>> ---
>>>   drivers/vfio/pci/Kconfig                      | 30 ++++++++------
>>>   drivers/vfio/pci/Makefile                     |  8 ++--
>>>   drivers/vfio/pci/vfio_pci.c                   | 14 ++-----
>>>   drivers/vfio/pci/vfio_pci_config.c            |  2 +-
>>>   drivers/vfio/pci/vfio_pci_core.c              | 41 ++++++++++++++++---
>>>   drivers/vfio/pci/vfio_pci_igd.c               |  2 +-
>>>   drivers/vfio/pci/vfio_pci_intrs.c             |  2 +-
>>>   drivers/vfio/pci/vfio_pci_rdwr.c              |  2 +-
>>>   drivers/vfio/pci/vfio_pci_zdev.c              |  2 +-
>>>   .../pci => include/linux}/vfio_pci_core.h     |  2 -
>>>   10 files changed, 66 insertions(+), 39 deletions(-)
>>>   rename {drivers/vfio/pci => include/linux}/vfio_pci_core.h (99%)
>> <...>
>>
>>> -#include "vfio_pci_core.h"
>>> +#include <linux/vfio_pci_core.h>
>>> +
>>> +#define DRIVER_VERSION  "0.2"
>> <...>
>>
>>> +MODULE_VERSION(DRIVER_VERSION);
>> Please don't add driver versions to the upstream kernel; they are useless.
>>
>> Thanks
>
> This just preserves the code for driver/module version that was in 
> vfio_pci.ko before the split.
>
> However, this can be removed in V2 if needed.

Right, we already agreed to preserve the vfio_pci versioning scheme, and
we won't add it to the new mlx5_vfio_pci or future drivers.


>
> Yishai
>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-22  9:22       ` Max Gurtovoy
@ 2021-07-23 14:13         ` Leon Romanovsky
  2021-07-25 10:45           ` Max Gurtovoy
  0 siblings, 1 reply; 55+ messages in thread
From: Leon Romanovsky @ 2021-07-23 14:13 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, jgg, maorg

On Thu, Jul 22, 2021 at 12:22:05PM +0300, Max Gurtovoy wrote:
> 
> On 7/22/2021 12:06 PM, Yishai Hadas wrote:
> > On 7/21/2021 8:39 PM, Leon Romanovsky wrote:
> > > On Wed, Jul 21, 2021 at 07:16:09PM +0300, Yishai Hadas wrote:
> > > > From: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > > 
> > > > Now that vfio_pci has been split into two source modules, one focusing
> > > > on the "struct pci_driver" (vfio_pci.c) and a toolbox library of code
> > > > (vfio_pci_core.c), complete the split and move them into two different
> > > > kernel modules.
> > > > 
> > > > As before vfio_pci.ko continues to present the same interface under
> > > > sysfs and this change will have no functional impact.
> > > > 
> > > > Splitting into another module and adding exports allows creating new HW
> > > > specific VFIO PCI drivers that can implement device specific
> > > > functionality, such as VFIO migration interfaces or specialized device
> > > > requirements.
> > > > 
> > > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > > Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> > > > ---
> > > >   drivers/vfio/pci/Kconfig                      | 30 ++++++++------
> > > >   drivers/vfio/pci/Makefile                     |  8 ++--
> > > >   drivers/vfio/pci/vfio_pci.c                   | 14 ++-----
> > > >   drivers/vfio/pci/vfio_pci_config.c            |  2 +-
> > > >   drivers/vfio/pci/vfio_pci_core.c              | 41 ++++++++++++++++---
> > > >   drivers/vfio/pci/vfio_pci_igd.c               |  2 +-
> > > >   drivers/vfio/pci/vfio_pci_intrs.c             |  2 +-
> > > >   drivers/vfio/pci/vfio_pci_rdwr.c              |  2 +-
> > > >   drivers/vfio/pci/vfio_pci_zdev.c              |  2 +-
> > > >   .../pci => include/linux}/vfio_pci_core.h     |  2 -
> > > >   10 files changed, 66 insertions(+), 39 deletions(-)
> > > >   rename {drivers/vfio/pci => include/linux}/vfio_pci_core.h (99%)
> > > <...>
> > > 
> > > > -#include "vfio_pci_core.h"
> > > > +#include <linux/vfio_pci_core.h>
> > > > +
> > > > +#define DRIVER_VERSION  "0.2"
> > > <...>
> > > 
> > > > +MODULE_VERSION(DRIVER_VERSION);
> > > Please don't add driver versions to the upstream kernel; they are useless.
> > > 
> > > Thanks
> > 
> > This just preserves the code for driver/module version that was in
> > vfio_pci.ko before the split.
> > 
> > However, this can be removed in V2 if needed.
> 
> Right, we already agreed to preserve the vfio_pci versioning scheme, and
> we won't add it to the new mlx5_vfio_pci or future drivers.

There is nothing to preserve. Instead of keeping this useless code, just
delete it.

https://lore.kernel.org/ksummit-discuss/CA+55aFx9A=5cc0QZ7CySC4F2K7eYaEfzkdYEc9JaNgCcV25=rg@mail.gmail.com/

Thanks

> 
> 
> > 
> > Yishai
> > 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-23 14:13         ` Leon Romanovsky
@ 2021-07-25 10:45           ` Max Gurtovoy
  0 siblings, 0 replies; 55+ messages in thread
From: Max Gurtovoy @ 2021-07-25 10:45 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, jgg, maorg


On 7/23/2021 5:13 PM, Leon Romanovsky wrote:
> On Thu, Jul 22, 2021 at 12:22:05PM +0300, Max Gurtovoy wrote:
>> On 7/22/2021 12:06 PM, Yishai Hadas wrote:
>>> On 7/21/2021 8:39 PM, Leon Romanovsky wrote:
>>>> On Wed, Jul 21, 2021 at 07:16:09PM +0300, Yishai Hadas wrote:
>>>>> From: Max Gurtovoy <mgurtovoy@nvidia.com>
>>>>>
>>>>> Now that vfio_pci has been split into two source modules, one focusing
>>>>> on the "struct pci_driver" (vfio_pci.c) and a toolbox library of code
>>>>> (vfio_pci_core.c), complete the split and move them into two different
>>>>> kernel modules.
>>>>>
>>>>> As before vfio_pci.ko continues to present the same interface under
>>>>> sysfs and this change will have no functional impact.
>>>>>
>>>>> Splitting into another module and adding exports allows creating new HW
>>>>> specific VFIO PCI drivers that can implement device specific
>>>>> functionality, such as VFIO migration interfaces or specialized device
>>>>> requirements.
>>>>>
>>>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>>>>> ---
>>>>>    drivers/vfio/pci/Kconfig                      | 30 ++++++++------
>>>>>    drivers/vfio/pci/Makefile                     |  8 ++--
>>>>>    drivers/vfio/pci/vfio_pci.c                   | 14 ++-----
>>>>>    drivers/vfio/pci/vfio_pci_config.c            |  2 +-
>>>>>    drivers/vfio/pci/vfio_pci_core.c              | 41 ++++++++++++++++---
>>>>>    drivers/vfio/pci/vfio_pci_igd.c               |  2 +-
>>>>>    drivers/vfio/pci/vfio_pci_intrs.c             |  2 +-
>>>>>    drivers/vfio/pci/vfio_pci_rdwr.c              |  2 +-
>>>>>    drivers/vfio/pci/vfio_pci_zdev.c              |  2 +-
>>>>>    .../pci => include/linux}/vfio_pci_core.h     |  2 -
>>>>>    10 files changed, 66 insertions(+), 39 deletions(-)
>>>>>    rename {drivers/vfio/pci => include/linux}/vfio_pci_core.h (99%)
>>>> <...>
>>>>
>>>>> -#include "vfio_pci_core.h"
>>>>> +#include <linux/vfio_pci_core.h>
>>>>> +
>>>>> +#define DRIVER_VERSION  "0.2"
>>>> <...>
>>>>
>>>>> +MODULE_VERSION(DRIVER_VERSION);
>>>> Please don't add driver versions to the upstream kernel; they are useless.
>>>>
>>>> Thanks
>>> This just preserves the code for driver/module version that was in
>>> vfio_pci.ko before the split.
>>>
>>> However, this can be removed in V2 if needed.
>> Right, we already agreed to preserve the vfio_pci versioning scheme, and
>> we won't add it to the new mlx5_vfio_pci or future drivers.
> There is nothing to preserve. Instead of keeping this useless code, just
> delete it.

OK, I guess we can do it since vfio_pci_core.ko is a new module.

We'll remove it in V2.

>
> https://lore.kernel.org/ksummit-discuss/CA+55aFx9A=5cc0QZ7CySC4F2K7eYaEfzkdYEc9JaNgCcV25=rg@mail.gmail.com/
>
> Thanks
>
>>
>>> Yishai
>>>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-07-21 16:16 ` [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id Yishai Hadas
@ 2021-07-27 16:34   ` Alex Williamson
  2021-07-27 17:14     ` Jason Gunthorpe
  2021-08-04 20:34   ` Bjorn Helgaas
  2021-08-12 15:42   ` Bjorn Helgaas
  2 siblings, 1 reply; 55+ messages in thread
From: Alex Williamson @ 2021-07-27 16:34 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: bhelgaas, corbet, diana.craciun, kwankhede, eric.auger,
	masahiroy, michal.lkml, linux-pci, linux-doc, kvm, linux-s390,
	linux-kbuild, mgurtovoy, jgg, maorg, leonro

On Wed, 21 Jul 2021 19:16:06 +0300
Yishai Hadas <yishaih@nvidia.com> wrote:

> From: Max Gurtovoy <mgurtovoy@nvidia.com>
> 
> The new flag field will be used to allow PCI drivers to signal the core code
> during driver matching and when generating the modules.alias information.
> 
> The first use will be to define a VFIO flag that indicates the PCI driver
> is a VFIO driver.
> 
> VFIO drivers have a few special properties compared to normal PCI drivers:
>  - They do not automatically bind. VFIO drivers are used to swap out the
>    normal driver for a device and convert the PCI device to the VFIO
>    subsystem.
> 
>    The admin must make this choice and following the current uAPI this is
>    usually done by using the driver_override sysfs.
> 
>  - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
>    them with 'vfio_pci:' instead of the normal 'pci:'.
> 
>    This allows the userspace machinery that switches devices to VFIO to
>    know what kernel drivers support what devices and allows it to trigger
>    the proper driver_override.
> 
> As existing tools do not recognize the "vfio_pci:" mod-alias prefix, this
> keeps today's behavior the same. VFIO remains on the side, is never
> autoloaded and can only be activated by direct admin action.
> 
> This patch is the infrastructure to provide the information in the
> modules.alias to userspace and enable the only PCI VFIO driver. Later
> series introduce additional HW specific VFIO PCI drivers.

I don't really understand why we're combining the above "special
properties" into a single flag.  For instance, why wouldn't we create a
flag that just indicates a match entry is only for driver override?  Or
if we're only using this for full wildcard matches, we could detect
that even without a flag.

Then, how does the "vfio_pci:" alias extend to other drivers?  Is this
expected to be the only driver that would use an alias ever or would
other drivers use new bits of the flag?  Seems some documentation is
necessary; the comment on PCI_DRIVER_OVERRIDE_DEVICE_VFIO doesn't
really help, "This macro is used to create a struct pci_device_id that
matches a specific device", then we proceed to use it with PCI_ANY_ID.

vfio-pci has always tried (as much as possible) to be "just another
PCI" driver to avoid all the nasty issues that used to exist with
legacy KVM device assignment, so I cringe at seeing these vfio specific
hooks in PCI-core.  Thanks,

Alex
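
For readers following the "vfio_pci:" alias discussion above, here is a
minimal userspace sketch of how a tool might test whether a modules.alias
pattern covers a given device. It assumes (this is not confirmed by the
patch) that userspace rewrites the device's sysfs modalias from the "pci:"
prefix to "vfio_pci:" and then glob-matches it with POSIX fnmatch(3). The
helper names to_vfio_modalias() and vfio_alias_matches() are hypothetical
and exist only for this illustration.

```c
#include <fnmatch.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turn a "pci:..." modalias (as read from sysfs)
 * into the assumed "vfio_pci:..." form. Returns 0 on success, -1 if the
 * input is not a PCI modalias or does not fit in the output buffer.
 */
static int to_vfio_modalias(const char *pci_modalias, char *out, size_t len)
{
	static const char prefix[] = "pci:";
	int n;

	if (strncmp(pci_modalias, prefix, sizeof(prefix) - 1) != 0)
		return -1;

	n = snprintf(out, len, "vfio_pci:%s",
		     pci_modalias + sizeof(prefix) - 1);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: returns 1 if the alias pattern (one entry from
 * modules.alias) matches the device's rewritten modalias, 0 otherwise.
 */
static int vfio_alias_matches(const char *pattern, const char *pci_modalias)
{
	char buf[256];

	if (to_vfio_modalias(pci_modalias, buf, sizeof(buf)))
		return 0;

	/* modules.alias patterns are shell-style globs */
	return fnmatch(pattern, buf, 0) == 0;
}
```

A tool could walk every "vfio_pci:" entry in modules.alias with
vfio_alias_matches() and write the matching module name to the device's
driver_override attribute. How a device-specific match should be
preferred over a wildcard entry is left open in this sketch.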


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-07-27 16:34   ` Alex Williamson
@ 2021-07-27 17:14     ` Jason Gunthorpe
  2021-07-27 23:02       ` Alex Williamson
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Gunthorpe @ 2021-07-27 17:14 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yishai Hadas, bhelgaas, corbet, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, mgurtovoy, maorg, leonro

On Tue, Jul 27, 2021 at 10:34:18AM -0600, Alex Williamson wrote:
> On Wed, 21 Jul 2021 19:16:06 +0300
> Yishai Hadas <yishaih@nvidia.com> wrote:
> 
> > From: Max Gurtovoy <mgurtovoy@nvidia.com>
> > 
> > The new flag field will be used to allow PCI drivers to signal the core code
> > during driver matching and when generating the modules.alias information.
> > 
> > The first use will be to define a VFIO flag that indicates the PCI driver
> > is a VFIO driver.
> > 
> > VFIO drivers have a few special properties compared to normal PCI drivers:
> >  - They do not automatically bind. VFIO drivers are used to swap out the
> >    normal driver for a device and convert the PCI device to the VFIO
> >    subsystem.
> > 
> >    The admin must make this choice and following the current uAPI this is
> >    usually done by using the driver_override sysfs.
> > 
> >  - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
> >    them with 'vfio_pci:' instead of the normal 'pci:'.
> > 
> >    This allows the userspace machinery that switches devices to VFIO to
> >    know what kernel drivers support what devices and allows it to trigger
> >    the proper driver_override.
> > 
> > As existing tools do not recognize the "vfio_pci:" mod-alias prefix this
> > keeps today's behavior the same. VFIO remains on the side, is never
> > autoloaded and can only be activated by direct admin action.
> > 
> > This patch is the infrastructure to provide the information in the
> > modules.alias to userspace and enable the only PCI VFIO driver. Later
> > series introduce additional HW specific VFIO PCI drivers.
> 
> I don't really understand why we're combining the above "special
> properties" into a single flag. 

Currently I can't think of any reason to have two flags. We always
need both behaviors together. It is trivial for someone to change down
the road, so I prefer to keep the flag bit usage to a minimum.

> For instance, why wouldn't we create a flag that just indicates a
> match entry is only for driver override?

We still need to signal the generation of vfio_pci: string in the
modules.alias.

> Or if we're only using this for full wildcard matches, we could
> detect that even without a flag.

The mlx/hns/etc drivers will not use wildcard matches. This series is
the prep and the only driver we have right at this point is the
wildcard vfio_pci generic driver.

> Then, how does the "vfio_pci:" alias extend to other drivers?  

After the HW drivers are merged we have a list of things in the
modules.alias file. Eg we might have something like:

alias vfio_pci:v000015B3d00001011sv*sd*bc*sc*i* mlx5_vfio_pci
alias vfio_pci:v0000abc1d0000abcdsv*sd*bc*sc*i* hns_vfio_pci
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci

This flag, and the vfio_pci string, is only for the VFIO subsystem. If
someday another subsystem wants to use driver_override then it will
provide its own subsystem name here instead.

This is solving the problem you had at the start - that userspace must
be able to self identify the drivers.  Starting with a PCI BDF
userspace can match the modules.alias for vfio_pci: prefixes and
determine which string to put into the driver_override sysfs. This is
instead of having userspace hardwire vfio_pci.
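
To illustrate why the prefix leaves today's autoload behavior untouched,
here is a minimal Python sketch. It assumes modules.alias patterns behave
like shell-style globs (fnmatch is only a stand-in for what the module
tools do), the alias entries are the hypothetical ones listed above, and
the subsystem IDs in the device modalias are made up for the example:

```python
from fnmatch import fnmatchcase

# Hypothetical alias entries, taken from the example list above.
ALIASES = [
    ("vfio_pci:v000015B3d00001011sv*sd*bc*sc*i*", "mlx5_vfio_pci"),
    ("vfio_pci:v*d*sv*sd*bc*sc*i*", "vfio_pci"),
]

def modules_for(lookup):
    """Return every module whose alias pattern matches the lookup string."""
    return [module for pattern, module in ALIASES if fnmatchcase(lookup, pattern)]

# Made-up modalias for a device with vendor 15B3, device 1011.
modalias = "pci:v000015B3d00001011sv000015B3sd00000001bc02sc00i00"

# A normal "pci:" lookup (what autoloading uses) finds no VFIO driver.
print(modules_for(modalias))            # []

# Only a lookup that adds the "vfio_" prefix sees the VFIO drivers.
print(modules_for("vfio_" + modalias))  # ['mlx5_vfio_pci', 'vfio_pci']
```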

> Is this expected to be the only driver that would use an alias ever
> or would other drivers use new bits of the flag?

Not sure what you mean by "only driver"? As above every driver
implementing VFIO on top of PCI will use this flag. If another
subsystem wants to use driver_override it will define its own flag,
and its userspace will look for othersubsystem_pci: tags in
modules.alias when it wants to change a PCI device over.

> Seems some documentation is necessary; the comment on
> PCI_DRIVER_OVERRIDE_DEVICE_VFIO doesn't really help, "This macro is
> used to create a struct pci_device_id that matches a specific
> device", then we proceed to use it with PCI_ANY_ID.

Fair enough, this is meant in the broader context, the generic vfio_pci
is just special.

> vfio-pci has always tried (as much as possible) to be "just another
> PCI" driver to avoid all the nasty issues that used to exist with
> legacy KVM device assignment, so I cringe at seeing these vfio specific
> hooks in PCI-core.  Thanks,

It has always had very special behavior - a PCI driver without a
match table is not "just another PCI" driver.

While this is not entirely elegant, considering where we have ended up
and the historical ABI that has to be preserved, it is the best idea
so far anyone has presented.

Jason


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-21 16:16 ` [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko Yishai Hadas
  2021-07-21 17:39   ` Leon Romanovsky
@ 2021-07-27 21:54   ` Alex Williamson
  2021-07-27 23:09     ` Jason Gunthorpe
  1 sibling, 1 reply; 55+ messages in thread
From: Alex Williamson @ 2021-07-27 21:54 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: bhelgaas, corbet, diana.craciun, kwankhede, eric.auger,
	masahiroy, michal.lkml, linux-pci, linux-doc, kvm, linux-s390,
	linux-kbuild, mgurtovoy, jgg, maorg, leonro

On Wed, 21 Jul 2021 19:16:09 +0300
Yishai Hadas <yishaih@nvidia.com> wrote:

> From: Max Gurtovoy <mgurtovoy@nvidia.com>
> 
> Now that vfio_pci has been split into two source modules, one focusing
> on the "struct pci_driver" (vfio_pci.c) and a toolbox library of code
> (vfio_pci_core.c), complete the split and move them into two different
> kernel modules.
> 
> As before vfio_pci.ko continues to present the same interface under
> sysfs and this change will have no functional impact.
> 
> Splitting into another module and adding exports allows creating new HW
> specific VFIO PCI drivers that can implement device specific
> functionality, such as VFIO migration interfaces or specialized device
> requirements.
> 
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  drivers/vfio/pci/Kconfig                      | 30 ++++++++------
>  drivers/vfio/pci/Makefile                     |  8 ++--
>  drivers/vfio/pci/vfio_pci.c                   | 14 ++-----
>  drivers/vfio/pci/vfio_pci_config.c            |  2 +-
>  drivers/vfio/pci/vfio_pci_core.c              | 41 ++++++++++++++++---
>  drivers/vfio/pci/vfio_pci_igd.c               |  2 +-
>  drivers/vfio/pci/vfio_pci_intrs.c             |  2 +-
>  drivers/vfio/pci/vfio_pci_rdwr.c              |  2 +-
>  drivers/vfio/pci/vfio_pci_zdev.c              |  2 +-
>  .../pci => include/linux}/vfio_pci_core.h     |  2 -
>  10 files changed, 66 insertions(+), 39 deletions(-)
>  rename {drivers/vfio/pci => include/linux}/vfio_pci_core.h (99%)
> 
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index afdab7d71e98..18898ae49919 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -1,19 +1,31 @@
>  # SPDX-License-Identifier: GPL-2.0-only
> -config VFIO_PCI
> +config VFIO_PCI_CORE
>  	tristate "VFIO support for PCI devices"
>  	depends on PCI
>  	depends on MMU
>  	select VFIO_VIRQFD
>  	select IRQ_BYPASS_MANAGER
>  	help
> -	  Support for the PCI VFIO bus driver.  This is required to make
> -	  use of PCI drivers using the VFIO framework.
> +	  Support for using PCI devices with VFIO.
> +
> +if VFIO_PCI_CORE
> +config VFIO_PCI_MMAP
> +	def_bool y if !S390
> +
> +config VFIO_PCI_INTX
> +	def_bool y if !S390
> +
> +config VFIO_PCI
> +	tristate "Generic VFIO support for any PCI device"
> +	help
> +	  Support for the generic PCI VFIO bus driver which can connect any
> +	  PCI device to the VFIO framework.
>  
>  	  If you don't know what to do here, say N.
>  

I'm still not happy with how this is likely to break users and even
downstreams when upgrading to a Kconfig with this change.  A previously
selected VFIO_PCI comes out disabled unless the user is keen enough to
enable VFIO_PCI_CORE.  I think I'd prefer to sacrifice the purity of
the menus to pull VFIO_PCI out of the if block and have it select
VFIO_PCI_CORE.  Thanks,

Alex



* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-07-27 17:14     ` Jason Gunthorpe
@ 2021-07-27 23:02       ` Alex Williamson
  2021-07-27 23:42         ` Jason Gunthorpe
  0 siblings, 1 reply; 55+ messages in thread
From: Alex Williamson @ 2021-07-27 23:02 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, bhelgaas, corbet, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, mgurtovoy, maorg, leonro

On Tue, 27 Jul 2021 14:14:58 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jul 27, 2021 at 10:34:18AM -0600, Alex Williamson wrote:
> > On Wed, 21 Jul 2021 19:16:06 +0300
> > Yishai Hadas <yishaih@nvidia.com> wrote:
> >   
> > > From: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > 
> > > The new flag field is to be used to allow PCI drivers to signal the core code
> > > during driver matching and when generating the modules.alias information.
> > > 
> > > The first use will be to define a VFIO flag that indicates the PCI driver
> > > is a VFIO driver.
> > > 
> > > VFIO drivers have a few special properties compared to normal PCI drivers:
> > >  - They do not automatically bind. VFIO drivers are used to swap out the
> > >    normal driver for a device and convert the PCI device to the VFIO
> > >    subsystem.
> > > 
> > >    The admin must make this choice and following the current uAPI this is
> > >    usually done by using the driver_override sysfs.
> > > 
> > >  - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
> > >    them with 'vfio_pci:' instead of the normal 'pci:'.
> > > 
> > >    This allows the userspace machinery that switches devices to VFIO to
> > >    know what kernel drivers support what devices and allows it to trigger
> > >    the proper driver_override.
> > > 
> > > As existing tools do not recognize the "vfio_pci:" mod-alias prefix this
> > > keeps today's behavior the same. VFIO remains on the side, is never
> > > autoloaded and can only be activated by direct admin action.
> > > 
> > > This patch is the infrastructure to provide the information in the
> > > modules.alias to userspace and enable the only PCI VFIO driver. Later
> > > series introduce additional HW specific VFIO PCI drivers.  
> > 
> > I don't really understand why we're combining the above "special
> > properties" into a single flag.   
> 
> Currently I can't think of any reason to have two flags. We always
> need both behaviors together. It is trivial for someone to change down
> the road, so I prefer to keep the flag bit usage to a minimum.
> 
> > For instance, why wouldn't we create a flag that just indicates a
> > match entry is only for driver override?  
> 
> We still need to signal the generation of vfio_pci: string in the
> modules.alias.
> 
> > Or if we're only using this for full wildcard matches, we could
> > detect that even without a flag.  
> 
> The mlx/hns/etc drivers will not use wildcard matches. This series is
> the prep and the only driver we have right at this point is the
> wildcard vfio_pci generic driver.
> 
> > Then, how does the "vfio_pci:" alias extend to other drivers?    
> 
> After the HW drivers are merged we have a list of things in the
> modules.alias file. Eg we might have something like:
> 
> alias vfio_pci:v000015B3d00001011sv*sd*bc*sc*i* mlx5_vfio_pci
> alias vfio_pci:v0000abc1d0000abcdsv*sd*bc*sc*i* hns_vfio_pci
> alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
> 
> This flag, and the vfio_pci string, is only for the VFIO subsystem. If
> someday another subsystem wants to use driver_override then it will
> provide its own subsystem name here instead.
> 
> This is solving the problem you had at the start - that userspace must
> be able to self identify the drivers.  Starting with a PCI BDF
> userspace can match the modules.alias for vfio_pci: prefixes and
> determine which string to put into the driver_override sysfs. This is
> instead of having userspace hardwire vfio_pci.
> 
> > Is this expected to be the only driver that would use an alias ever
> > or would other drivers use new bits of the flag?  
> 
> Not sure what you mean by "only driver"? As above every driver
> implementing VFIO on top of PCI will use this flag. If another
> subsystem wants to use driver_override it will define its own flag,
> and its userspace will look for othersubsystem_pci: tags in
> modules.alias when it wants to change a PCI device over.
> 
> > Seems some documentation is necessary; the comment on
> > PCI_DRIVER_OVERRIDE_DEVICE_VFIO doesn't really help, "This macro is
> > used to create a struct pci_device_id that matches a specific
> > device", then we proceed to use it with PCI_ANY_ID.  
> 
> Fair enough, this is meant in the broader context, the generic vfio_pci
> is just special.
> 
> > vfio-pci has always tried (as much as possible) to be "just another
> > PCI" driver to avoid all the nasty issues that used to exist with
> > legacy KVM device assignment, so I cringe at seeing these vfio specific
> > hooks in PCI-core.  Thanks,  
> 
> It has always had very special behavior - a PCI driver without a
> match table is not "just another PCI" driver.
> 
> While this is not entirely elegant, considering where we have ended up
> and the historical ABI that has to be preserved, it is the best idea
> so far anyone has presented.

In general I think my confusion is lack of documentation and examples.
There's good information here and in the cover letter, but reviewing
the patch itself I'm not sure if vfio_pci: is meant to indicate the
vfio_pci driver or the vfio_pci device api or as I've finally decided,
just prepending "vfio_" to the modalias for a device to indicate the
class of stuff, ie. no automatic binding but discoverable by userspace
as a "vfio" driver suitable for this device.

I think we need libvirt folks onboard and maybe a clearer idea what
userspace helpers might be available.  For example would driverctl have
an option to choose a vfio class driver for a device?

I can also imagine that if the flag only covered the
matching/driver_override aspect and pci_device_id further included an
optional modalias prefix, we could do this without littering pci-core
with vfio eccentricities.  I'll be interested to see Bjorn's thoughts on
this.  Thanks,

Alex



* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-27 21:54   ` Alex Williamson
@ 2021-07-27 23:09     ` Jason Gunthorpe
  2021-07-28  4:56       ` Leon Romanovsky
  2021-07-28  5:43       ` Christoph Hellwig
  0 siblings, 2 replies; 55+ messages in thread
From: Jason Gunthorpe @ 2021-07-27 23:09 UTC (permalink / raw)
  To: Alex Williamson, Christoph Hellwig, Arnd Bergmann
  Cc: Yishai Hadas, bhelgaas, corbet, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, mgurtovoy, maorg, leonro

On Tue, Jul 27, 2021 at 03:54:40PM -0600, Alex Williamson wrote:

> I'm still not happy with how this is likely to break users and even
> downstreams when upgrading to a Kconfig with this change.

I've never heard of Kconfig as stable ABI. Christoph/Arnd, have you
heard of any cases where we want to keep it stable?

As far as I know we should change kconfig to keep it working properly,
eg by having correct menu structure and sane kconfig names.

In any event, upgrades work in a reasonable way. Starting from this
.config fragment:

CONFIG_VFIO_IOMMU_TYPE1=y
CONFIG_VFIO_VIRQFD=y
CONFIG_VFIO=y
CONFIG_VFIO_NOIOMMU=y
CONFIG_VFIO_PCI=y
CONFIG_VFIO_PCI_VGA=y
CONFIG_VFIO_PCI_MMAP=y
CONFIG_VFIO_PCI_INTX=y
CONFIG_VFIO_PCI_IGD=y
CONFIG_VFIO_PLATFORM=y
CONFIG_VFIO_AMBA=y
CONFIG_VFIO_PLATFORM_CALXEDAXGMAC_RESET=y
CONFIG_VFIO_PLATFORM_AMDXGBE_RESET=y
CONFIG_VFIO_PLATFORM_BCMFLEXRM_RESET=y
CONFIG_VFIO_MDEV=y
CONFIG_VFIO_FSL_MC=y
CONFIG_IRQ_BYPASS_MANAGER=y

Which might reasonably be from an old kernel. 'make oldconfig' prompts:

VFIO Non-Privileged userspace driver framework (VFIO) [Y/n/m/?] y
  VFIO No-IOMMU support (VFIO_NOIOMMU) [Y/n/?] y
  VFIO support for PCI devices (VFIO_PCI_CORE) [N/m/y/?] (NEW) 

Which is completely fine, IMHO.

The menu structure ends up looking like this, which is pretty good:

  --- VFIO Non-Privileged userspace driver framework
  [*]   VFIO No-IOMMU support
  <*>   VFIO support for PCI devices
  <*>     Generic VFIO support for any PCI device
  [*]       Generic VFIO PCI support for VGA devices
  [*]       Generic VFIO PCI extensions for Intel graphics (GVT-d)
  <*>     VFIO support for MLX5 PCI devices (NEW)
  <*>   VFIO support for platform devices
  <*>     VFIO support for AMBA devices
  <*>     VFIO support for calxeda xgmac reset
  <*>     VFIO support for AMD XGBE reset
  <*>     VFIO support for Broadcom FlexRM reset
  <*>   Mediated device driver framework
  <*>   VFIO support for QorIQ DPAA2 fsl-mc bus devices

Jason


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-07-27 23:02       ` Alex Williamson
@ 2021-07-27 23:42         ` Jason Gunthorpe
  0 siblings, 0 replies; 55+ messages in thread
From: Jason Gunthorpe @ 2021-07-27 23:42 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yishai Hadas, bhelgaas, corbet, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, mgurtovoy, maorg, leonro

On Tue, Jul 27, 2021 at 05:02:02PM -0600, Alex Williamson wrote:

> In general I think my confusion is lack of documentation and examples.
> There's good information here and in the cover letter, but reviewing
> the patch itself I'm not sure if vfio_pci: is meant to indicate the
> vfio_pci driver or the vfio_pci device api or as I've finally decided,
> just prepending "vfio_" to the modalias for a device to indicate the
> class of stuff, ie. no automatic binding but discoverable by userspace
> as a "vfio" driver suitable for this device.

Yes, the "vfio_" prefix is meant to be a generic prefix that any bus
type could use to signify the modalias entry is for the vfio flavour
of driver_override devices.

The userspace algorithm is pretty simple.

1) Identify the sysfs path to the device:
  /sys/bus/pci/devices/0000:01:00.0/modalias

2) Get the modalias string from the kernel:
 $ cat /sys/bus/pci/devices/0000:01:00.0/modalias
pci:v000015B3d00001017sv000015B3sd00000001bc02sc00i00

3) Prefix it with vfio_:
vfio_pci:v000015B3d00001017sv000015B3sd00000001bc02sc00i00

4) Search modules.alias for the above string, select the entry that
   has the fewest *'s. See Max's sample script.

5) modprobe the matched module name

6) Write the matched module name from modules.alias to
   /sys/bus/pci/devices/0000\:01\:00.0/driver_override
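
The steps above can be sketched roughly as follows. This is only an
illustration, not Max's script: the function name pick_vfio_driver is
hypothetical, fnmatch is assumed to be a close enough stand-in for
modules.alias glob matching, and the alias table is the example from
earlier in this thread. Steps 5 and 6 are left as comments since they
need root and real hardware.

```python
from fnmatch import fnmatchcase

# Example modules.alias content, copied from the earlier mail in this thread.
MODULES_ALIAS = """\
alias vfio_pci:v000015B3d00001011sv*sd*bc*sc*i* mlx5_vfio_pci
alias vfio_pci:v0000abc1d0000abcdsv*sd*bc*sc*i* hns_vfio_pci
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
"""

def pick_vfio_driver(modalias, alias_text):
    """Return the matching module with the fewest '*' in its alias, or None."""
    wanted = "vfio_" + modalias                # step 3: add the prefix
    best = None                                # (star_count, module)
    for line in alias_text.splitlines():       # step 4: search modules.alias
        parts = line.split()
        if len(parts) != 3 or parts[0] != "alias":
            continue
        _, pattern, module = parts
        if fnmatchcase(wanted, pattern):
            stars = pattern.count("*")
            if best is None or stars < best[0]:
                best = (stars, module)
    return best[1] if best else None

# Step 2 output from the mail above (device ID 1017): only the generic
# wildcard entry matches, so the generic driver is chosen.
dev_1017 = "pci:v000015B3d00001017sv000015B3sd00000001bc02sc00i00"
print(pick_vfio_driver(dev_1017, MODULES_ALIAS))   # vfio_pci

# Same string with device ID 1011: two entries match, the one with fewer
# wildcards wins.
dev_1011 = "pci:v000015B3d00001011sv000015B3sd00000001bc02sc00i00"
print(pick_vfio_driver(dev_1011, MODULES_ALIAS))   # mlx5_vfio_pci

# Step 5 would be "modprobe <module>", and step 6 would write <module>
# to the device's driver_override file in sysfs.
```

Counting '*' characters is the simplest reading of "fewest *'s"; a real
tool may want a stricter notion of the most specific match.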

Further patches can make this work universally for all the current and
future vfio bus types, eg platform, fsl, etc.

Something like driverctl or libvirt can implement this algorithm and
remove all the hardwired behavior of load vfio_fsl for this or
vfio_pci for that.

I'll add the above sequence to the commit message of this patch, since
I think it makes it really clear.

> I think we need libvirt folks onboard and maybe a clearer idea what
> userspace helpers might be available.  For example would driverctl have
> an option to choose a vfio class driver for a device?

Max wrote a demo script that shows how this can work, it is linked in
the cover letter.

At the end of the day there are only two ideas that survived scrutiny:

1) This patch which makes everything dynamic and driven by
   modules.alias,

2) We continue to hardwire the driver and module names into
   libvirt/etc and just add mlx, hns, etc.

> I can also imagine that if the flag only covered the
> matching/driver_override aspect and pci_device_id further included an
> optional modalias prefix, we could do this without littering pci-core
> with vfio eccentricities.  I'll be interested to see Bjorn's thoughts on
> this.  Thanks,

This is more elegant, but we didn't do this because the pci match
struct is widely used in the kernel and bloating it further doesn't
seem to make a lot of sense at this point. Due to the macros it would
be easy to change to this scheme if it was appropriate.

Jason


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-27 23:09     ` Jason Gunthorpe
@ 2021-07-28  4:56       ` Leon Romanovsky
  2021-07-28  5:43       ` Christoph Hellwig
  1 sibling, 0 replies; 55+ messages in thread
From: Leon Romanovsky @ 2021-07-28  4:56 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Christoph Hellwig, Arnd Bergmann, Yishai Hadas,
	bhelgaas, corbet, diana.craciun, kwankhede, eric.auger,
	masahiroy, michal.lkml, linux-pci, linux-doc, kvm, linux-s390,
	linux-kbuild, mgurtovoy, maorg

On Tue, Jul 27, 2021 at 08:09:41PM -0300, Jason Gunthorpe wrote:
> On Tue, Jul 27, 2021 at 03:54:40PM -0600, Alex Williamson wrote:
> 
> > I'm still not happy with how this is likely to break users and even
> > downstreams when upgrading to a Kconfig with this change.
> 
> I've never heard of Kconfig as stable ABI. Christoph/Arnd, have you
> heard of any cases where we want to keep it stable?

Of course not, otherwise we won't be able to do ANY cleanup in the kernel.

> 
> As far as I know we should change kconfig to keep it working properly,
> eg by having correct menu structure and sane kconfig names.

Everyone who upgrades through source code and needs to rebuild the kernel
should be proficient enough to enable/disable kernel config options.

Thanks


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-27 23:09     ` Jason Gunthorpe
  2021-07-28  4:56       ` Leon Romanovsky
@ 2021-07-28  5:43       ` Christoph Hellwig
  2021-07-28  7:04         ` Arnd Bergmann
  2021-07-28 12:03         ` Jason Gunthorpe
  1 sibling, 2 replies; 55+ messages in thread
From: Christoph Hellwig @ 2021-07-28  5:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Christoph Hellwig, Arnd Bergmann, Yishai Hadas,
	bhelgaas, corbet, diana.craciun, kwankhede, eric.auger,
	masahiroy, michal.lkml, linux-pci, linux-doc, kvm, linux-s390,
	linux-kbuild, mgurtovoy, maorg, leonro

On Tue, Jul 27, 2021 at 08:09:41PM -0300, Jason Gunthorpe wrote:
> On Tue, Jul 27, 2021 at 03:54:40PM -0600, Alex Williamson wrote:
> 
> > I'm still not happy with how this is likely to break users and even
> > downstreams when upgrading to a Kconfig with this change.
> 
> I've never heard of Kconfig as stable ABI. Christoph/Arnd, have you
> heard of any cases where we want to keep it stable?

It isn't an ABI, but we really do try to avoid breaking if we can and
I remember Linus shouting at people if they did that for common options.

However lately for example the completely silly s/THUNDERBOLT/USB4/
change did slip through and did break my test setup with a vfio passed
through external nvme drive :(

> Which might reasonably be from an old kernel. 'make oldconfig' prompts:
> 
> VFIO Non-Privileged userspace driver framework (VFIO) [Y/n/m/?] y
>   VFIO No-IOMMU support (VFIO_NOIOMMU) [Y/n/?] y
>   VFIO support for PCI devices (VFIO_PCI_CORE) [N/m/y/?] (NEW) 
> 
> Which is completely fine, IMHO.

Why do we need to have VFIO_PCI_CORE as a user visible option?
I'd just select it.


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28  5:43       ` Christoph Hellwig
@ 2021-07-28  7:04         ` Arnd Bergmann
  2021-07-28  7:17           ` Leon Romanovsky
  2021-07-28 12:03         ` Jason Gunthorpe
  1 sibling, 1 reply; 55+ messages in thread
From: Arnd Bergmann @ 2021-07-28  7:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Gunthorpe, Alex Williamson, Arnd Bergmann, Yishai Hadas,
	Bjorn Helgaas, Jonathan Corbet, diana.craciun, kwankhede,
	Eric Auger, Masahiro Yamada, Michal Marek, linux-pci,
	open list:DOCUMENTATION, kvm list, linux-s390,
	Linux Kbuild mailing list, mgurtovoy, maorg, leonro

On Wed, Jul 28, 2021 at 7:43 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Tue, Jul 27, 2021 at 08:09:41PM -0300, Jason Gunthorpe wrote:
> > On Tue, Jul 27, 2021 at 03:54:40PM -0600, Alex Williamson wrote:
> >
> > > I'm still not happy with how this is likely to break users and even
> > > downstreams when upgrading to a Kconfig with this change.
> >
> > I've never heard of Kconfig as stable ABI. Christoph/Arnd, have you
> > heard of any cases where we want to keep it stable?
>
> It isn't an ABI, but we really do try to avoid breaking if we can and
> I remember Linus shouting at people if they did that for common options.

This is handled in very different ways depending on the maintainers,
some people go to great lengths to avoid breaking 'make oldconfig'
or 'make defconfig', others don't seem to mind at all.

CONFIG_USB_EHCI_TEGRA is an example of an option that was
left in place to help users of old config files, another one is
CONFIG_EXT3_FS. In both cases the idea is that the original
code was changed, but the old option left in place to point to
the replacement.

I think doing this is generally a good idea, but I would not consider
this a stable ABI in the sense that we can never break it. Most users
should have migrated to the new option after a few kernel releases,
and then I would remove the old one.

If a user upgrades across multiple kernel releases at once, usually
all hope of reusing an old .config is lost anyway.

> However lately for example the completely silly s/THUNDERBOLT/USB4/
> change did slip through and did break my test setup with a vfio passed
> through external nvme drive :(

Another recent example is CONFIG_FB no longer being selected by
the DRM subsystem, which broke a lot of defconfigs.

        Arnd


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28  7:04         ` Arnd Bergmann
@ 2021-07-28  7:17           ` Leon Romanovsky
  0 siblings, 0 replies; 55+ messages in thread
From: Leon Romanovsky @ 2021-07-28  7:17 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Jason Gunthorpe, Alex Williamson,
	Yishai Hadas, Bjorn Helgaas, Jonathan Corbet, diana.craciun,
	kwankhede, Eric Auger, Masahiro Yamada, Michal Marek, linux-pci,
	open list:DOCUMENTATION, kvm list, linux-s390,
	Linux Kbuild mailing list, mgurtovoy, maorg

On Wed, Jul 28, 2021 at 09:04:34AM +0200, Arnd Bergmann wrote:
> On Wed, Jul 28, 2021 at 7:43 AM Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Tue, Jul 27, 2021 at 08:09:41PM -0300, Jason Gunthorpe wrote:
> > > On Tue, Jul 27, 2021 at 03:54:40PM -0600, Alex Williamson wrote:
> > >
> > > > I'm still not happy with how this is likely to break users and even
> > > > downstreams when upgrading to a Kconfig with this change.
> > >
> > > I've never heard of Kconfig as stable ABI. Christoph/Arnd, have you
> > > heard of any cases where we want to keep it stable?
> >
> > It isn't an ABI, but we really do try to avoid breaking if we can and
> > > I remember Linus shouting at people if they did that for common options.
> 
> This is handled in very different ways depending on the maintainers,
> some people go to great lengths to avoid breaking 'make oldconfig'
> or 'make defconfig', others don't seem to mind at all.
> 
> CONFIG_USB_EHCI_TEGRA is an example of an option that was
> left in place to help users of old config files, another one is
> CONFIG_EXT3_FS. In both cases the idea is that the original
> code was changed, but the old option left in place to point to
> the replacement.

And here starts the problem, when people treat their obscure config
options as first class citizens. The exposure of CONFIG_EXT3_FS is
orders of magnitude larger than that of CONFIG_USB_EHCI_TEGRA.

This is why I think it is generally a bad idea to leave old config
options in place; most of the time such options will rot for years until
someone actually notices and deletes them.

Thanks


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28  5:43       ` Christoph Hellwig
  2021-07-28  7:04         ` Arnd Bergmann
@ 2021-07-28 12:03         ` Jason Gunthorpe
  2021-07-28 12:12           ` Arnd Bergmann
  2021-07-28 12:29           ` Christoph Hellwig
  1 sibling, 2 replies; 55+ messages in thread
From: Jason Gunthorpe @ 2021-07-28 12:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Arnd Bergmann, Yishai Hadas, bhelgaas, corbet,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	maorg, leonro

On Wed, Jul 28, 2021 at 07:43:06AM +0200, Christoph Hellwig wrote:

> > Which might reasonably be from an old kernel. 'make oldconfig' prompts:
> > 
> > VFIO Non-Privileged userspace driver framework (VFIO) [Y/n/m/?] y
> >   VFIO No-IOMMU support (VFIO_NOIOMMU) [Y/n/?] y
> >   VFIO support for PCI devices (VFIO_PCI_CORE) [N/m/y/?] (NEW) 
> > 
> > Which is completely fine, IMHO.
> 
> Why do we need to have VFIO_PCI_CORE as a user visible option?
> I'd just select it.

I'm not great with kconfig, but AFAIK:

- It controls building a module so it needs to be a tristate

- tristates need to be exposed in the menu structure

- As it builds a module it also has depends on other things

- Select should not be used to target tristates

- Select should not be used to target options in the menu tree

- Select should not be used to target options that have depends

Which leaves us with this arrangement unless we delete the
vfio_pci_core.ko module - which seems like a bad direction just for
kconfig backwards compatibility.

Jason


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28 12:03         ` Jason Gunthorpe
@ 2021-07-28 12:12           ` Arnd Bergmann
  2021-07-28 12:29           ` Christoph Hellwig
  1 sibling, 0 replies; 55+ messages in thread
From: Arnd Bergmann @ 2021-07-28 12:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Arnd Bergmann, Yishai Hadas,
	Bjorn Helgaas, Jonathan Corbet, diana.craciun, kwankhede,
	Eric Auger, Masahiro Yamada, Michal Marek, linux-pci,
	open list:DOCUMENTATION, kvm list, linux-s390,
	Linux Kbuild mailing list, mgurtovoy, maorg, leonro

On Wed, Jul 28, 2021 at 2:03 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Jul 28, 2021 at 07:43:06AM +0200, Christoph Hellwig wrote:
>
> > > Which might reasonably be from an old kernel. 'make oldconfig' prompts:
> > >
> > > VFIO Non-Privileged userspace driver framework (VFIO) [Y/n/m/?] y
> > >   VFIO No-IOMMU support (VFIO_NOIOMMU) [Y/n/?] y
> > >   VFIO support for PCI devices (VFIO_PCI_CORE) [N/m/y/?] (NEW)
> > >
> > > Which is completely fine, IMHO.
> >
> > Why do we need to have VFIO_PCI_CORE as a user visible option?
> > I'd just select it.
>
> I'm not great with kconfig, but AFAIK:
>
> - It controls building a module so it needs to be a tristate
>
> - tristates need to be exposed in the menu structure
>
> - As it builds a module it also has depends on other things
>
> - Select should not be used to target tristates
>
> - Select should not be used to target options in the menu tree
>
> - Select should not be used to target options that have depends
>
> Which leaves us with this arrangement unless we delete the
> vfio_pci_core.ko module - which seems like a bad direction just for
> kconfig backwards compatibility.

I have not looked at the requirements for this particular patch, but
generally speaking there is no problem with using 'select' on
a tristate symbol.

The other points are correct though: you cannot 'select' a symbol
that has dependencies, unless the symbol selecting it already
depends on those same options, and you should not 'select' user
visible options or other subsystems.
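
A toy model of that dependency rule, for illustration only. It is not
how Kconfig is implemented, it ignores dependencies inherited from
enclosing 'if' or 'menu' blocks, and the function name unsafe_selects is
hypothetical. The symbol names and the PCI/MMU dependencies come from
the Kconfig hunk in this patch.

```python
def unsafe_selects(symbols):
    """List (selector, target, missing_deps) where a select bypasses a dependency."""
    problems = []
    for name, sym in symbols.items():
        own_deps = set(sym.get("depends", ()))
        for target in sym.get("select", ()):
            # Dependencies of the target that the selector does not share.
            missing = set(symbols[target].get("depends", ())) - own_deps
            if missing:
                problems.append((name, target, sorted(missing)))
    return problems

core = {"depends": ["PCI", "MMU"]}

# VFIO_PCI selects the core without repeating its dependencies: flagged.
bare = {"VFIO_PCI_CORE": core,
        "VFIO_PCI": {"select": ["VFIO_PCI_CORE"]}}
print(unsafe_selects(bare))   # [('VFIO_PCI', 'VFIO_PCI_CORE', ['MMU', 'PCI'])]

# VFIO_PCI depends on the same options before selecting: nothing flagged.
safe = {"VFIO_PCI_CORE": core,
        "VFIO_PCI": {"depends": ["PCI", "MMU"], "select": ["VFIO_PCI_CORE"]}}
print(unsafe_selects(safe))   # []
```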

One common mistake is to have a reverse dependency, where
A uses 'select B' or 'depends on B', but then exports an ELF
symbol that is consumed by B, as opposed to the other way round.
I don't think that is a problem here though.

            Arnd


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28 12:03         ` Jason Gunthorpe
  2021-07-28 12:12           ` Arnd Bergmann
@ 2021-07-28 12:29           ` Christoph Hellwig
  2021-07-28 12:47             ` Jason Gunthorpe
  1 sibling, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2021-07-28 12:29 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Arnd Bergmann, Yishai Hadas,
	bhelgaas, corbet, diana.craciun, kwankhede, eric.auger,
	masahiroy, michal.lkml, linux-pci, linux-doc, kvm, linux-s390,
	linux-kbuild, mgurtovoy, maorg, leonro

On Wed, Jul 28, 2021 at 09:03:26AM -0300, Jason Gunthorpe wrote:
> I'm not great with kconfig, but AFAIK:
> 
> - It controls building a module so it needs to be a tristate
> 
> - tristates need to be exposed in the menu structure

select can be used on tristates perfectly fine.

> - As it builds a module it also has depends on other things

So the dependencies are:

 - VFIO - duh, yeah, anything vfio related needs to select that.
   But this is a perfectly fine transitive select
 - PCI - yeah.  But we can expect everything that selects VFIO_PCI_CORE
   to select PCI.  Or a transitive select would be fine again
 - EVENTFD - this is another classic transitive one that should just be
   selected instead of a user asking why it is not set
 - MMU: I suspect all of VFIO and thus the menuconfig really should
   depend on that

So not really an issue here.  VFIO_PCI_CORE really is underlying
infrastructure a user should not care about.
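
A minimal sketch of that arrangement, with made-up symbol names
(EXAMPLE_CORE and EXAMPLE_DRV are hypothetical; EVENTFD is the existing
symbol mentioned above): a tristate without a prompt string is not shown
in the menu and only gets its value from the symbols that select it, and
its own selects are applied transitively.

```kconfig
# Hypothetical symbols, for illustration only.
# No prompt string after 'tristate', so the user is never asked about it.
config EXAMPLE_CORE
	tristate
	select EVENTFD

# The only user-visible question. Answering y or m here sets
# EXAMPLE_CORE to at least the same value, which in turn selects EVENTFD.
config EXAMPLE_DRV
	tristate "Example device driver"
	select EXAMPLE_CORE
```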

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28 12:29           ` Christoph Hellwig
@ 2021-07-28 12:47             ` Jason Gunthorpe
  2021-07-28 12:55               ` Christoph Hellwig
  2021-07-28 13:08               ` Arnd Bergmann
  0 siblings, 2 replies; 55+ messages in thread
From: Jason Gunthorpe @ 2021-07-28 12:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Arnd Bergmann, Yishai Hadas, bhelgaas, corbet,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	maorg, leonro

On Wed, Jul 28, 2021 at 02:29:56PM +0200, Christoph Hellwig wrote:

> So not really an issue here.  VFIO_PCI_CORE really is underlying
> infrastructure a user should not care about.

So then we can write it like below? Unfortunately it deletes the nice
menu structure that groups all the PCI drivers together like platform
(and mdev in future). Not sure this loss is worth the backwards compat

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 318116d03c21a4..2611d7d91ddbd5 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -1,14 +1,12 @@
 # SPDX-License-Identifier: GPL-2.0-only
+if PCI || MMU
 config VFIO_PCI_CORE
-	tristate "VFIO support for PCI devices"
-	depends on PCI
-	depends on MMU
+	tristate
 	select VFIO_VIRQFD
 	select IRQ_BYPASS_MANAGER
 	help
 	  Support for using PCI devices with VFIO.
 
-if VFIO_PCI_CORE
 config VFIO_PCI_MMAP
 	def_bool y if !S390
 
@@ -17,6 +15,7 @@ config VFIO_PCI_INTX
 
 config VFIO_PCI
 	tristate "Generic VFIO support for any PCI device"
+	select VFIO_PCI_CORE
 	help
 	  Support for the generic PCI VFIO bus driver which can connect any
 	  PCI device to the VFIO framework.
@@ -50,6 +49,7 @@ endif
 config MLX5_VFIO_PCI
 	tristate "VFIO support for MLX5 PCI devices"
 	depends on MLX5_CORE
+	select VFIO_PCI_CORE
 	help
 	  This provides a PCI support for MLX5 devices using the VFIO
 	  framework. The device specific driver supports suspend/resume

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28 12:47             ` Jason Gunthorpe
@ 2021-07-28 12:55               ` Christoph Hellwig
  2021-07-28 13:31                 ` Jason Gunthorpe
  2021-07-28 13:08               ` Arnd Bergmann
  1 sibling, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2021-07-28 12:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Arnd Bergmann, Yishai Hadas,
	bhelgaas, corbet, diana.craciun, kwankhede, eric.auger,
	masahiroy, michal.lkml, linux-pci, linux-doc, kvm, linux-s390,
	linux-kbuild, mgurtovoy, maorg, leonro

On Wed, Jul 28, 2021 at 09:47:55AM -0300, Jason Gunthorpe wrote:
> So then we can write it like below?

> +if PCI || MMU

The || here should be &&.  But otherwise, yes.
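
To spell out the difference in a sketch (the EXAMPLE_OPT_* symbols are
made up): an 'if' block using || is satisfied when either symbol is
enabled, so the enclosed entries could still be offered on a kernel with
PCI but without MMU, whereas the original 'depends on PCI' plus
'depends on MMU' required both.

```kconfig
# Hypothetical symbols, for illustration only.

# Offered when PCI or MMU is enabled (either one is enough).
if PCI || MMU
config EXAMPLE_OPT_EITHER
	bool "Example option, either dependency"
endif

# Offered only when PCI and MMU are both enabled, which matches
# two separate 'depends on' lines.
if PCI && MMU
config EXAMPLE_OPT_BOTH
	bool "Example option, both dependencies"
endif
```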

> Unfortunately it deletes the nice
> menu structure that groups all the PCI drivers together like platform
> (and mdev in future). Not sure this loss is worth the backwards compat

All the other visible options should depend on VFIO_PCI, not VFIO_PCI_CORE.
So we can still keep the same menu structure.


* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28 12:47             ` Jason Gunthorpe
  2021-07-28 12:55               ` Christoph Hellwig
@ 2021-07-28 13:08               ` Arnd Bergmann
  2021-07-28 17:26                 ` Jason Gunthorpe
  1 sibling, 1 reply; 55+ messages in thread
From: Arnd Bergmann @ 2021-07-28 13:08 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Arnd Bergmann, Yishai Hadas,
	Bjorn Helgaas, Jonathan Corbet, diana.craciun, kwankhede,
	Eric Auger, Masahiro Yamada, Michal Marek, linux-pci,
	open list:DOCUMENTATION, kvm list, linux-s390,
	Linux Kbuild mailing list, mgurtovoy, maorg, leonro

On Wed, Jul 28, 2021 at 2:47 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Jul 28, 2021 at 02:29:56PM +0200, Christoph Hellwig wrote:
>
> > So not really an issue here.  VFIO_PCI_CORE really is underlying
> > infrastructure a user should not care about.
>
> So then we can write it like below? Unfortunately it deletes the nice
> menu structure that groups all the PCI drivers together like platform
> (and mdev in future). Not sure this loss is worth the backwards compat

I think you can get back some structure by adding a 'menu "VFIO PCI drivers"'
and 'endmenu' around it.
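
Something along these lines, as an untested sketch (EXAMPLE_DRV_A and
EXAMPLE_DRV_B are placeholder names, not symbols from the series):

```kconfig
menu "VFIO PCI drivers"

# Placeholder entries; the real driver options would go here.
config EXAMPLE_DRV_A
	tristate "Example driver A"

config EXAMPLE_DRV_B
	tristate "Example driver B"

endmenu
```

The entries between 'menu' and 'endmenu' are then shown together on
their own screen in menuconfig.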

> @@ -17,6 +15,7 @@ config VFIO_PCI_INTX
>
>  config VFIO_PCI
>         tristate "Generic VFIO support for any PCI device"
> +       select VFIO_PCI_CORE
>         help
>           Support for the generic PCI VFIO bus driver which can connect any
>           PCI device to the VFIO framework.
> @@ -50,6 +49,7 @@ endif
>  config MLX5_VFIO_PCI
>         tristate "VFIO support for MLX5 PCI devices"
>         depends on MLX5_CORE
> +       select VFIO_PCI_CORE
>         help

These two now have to get a 'depends on MMU' if they don't already inherit
that from elsewhere.

       Arnd

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28 12:55               ` Christoph Hellwig
@ 2021-07-28 13:31                 ` Jason Gunthorpe
  0 siblings, 0 replies; 55+ messages in thread
From: Jason Gunthorpe @ 2021-07-28 13:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Arnd Bergmann, Yishai Hadas, bhelgaas, corbet,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	maorg, leonro

On Wed, Jul 28, 2021 at 02:55:05PM +0200, Christoph Hellwig wrote:
> On Wed, Jul 28, 2021 at 09:47:55AM -0300, Jason Gunthorpe wrote:
> > So then we can write it like below?
>
> > +if PCI || MMU
>
> The || here should be &&.  But otherwise, yes.

Woops, thanks

> > Unfortunately it deletes the nice
> > menu structure that groups all the PCI drivers together like platform
> > (and mdev in future). Not sure this loss is worth the backwards compat
>
> All the other visible options should depend on VFIO_PCI, not VFIO_PCI_CORE.
> So we can still keep the same menu structure.

It is the upcoming VFIO PCI drivers I'm looking at, eg the
MLX5_VFIO_PCI

It goes from this:

  --- VFIO Non-Privileged userspace driver framework
  [*]   VFIO No-IOMMU support
  <*>   VFIO support for PCI devices
  <*>     Generic VFIO support for any PCI device
  [*]       Generic VFIO PCI support for VGA devices
  [*]       Generic VFIO PCI extensions for Intel graphics (GVT-d)
  <*>     VFIO support for MLX5 PCI devices (NEW)
  <*>   VFIO support for platform devices
  <*>     VFIO support for AMBA devices
  <*>     VFIO support for calxeda xgmac reset
  <*>     VFIO support for AMD XGBE reset
  <*>     VFIO support for Broadcom FlexRM reset
  <*>   Mediated device driver framework
  <*>   VFIO support for QorIQ DPAA2 fsl-mc bus devices

To this:

  --- VFIO Non-Privileged userspace driver framework
  [*]   VFIO No-IOMMU support
  <*>   Generic VFIO support for any PCI device
  [*]     Generic VFIO PCI support for VGA devices
  [*]     Generic VFIO PCI extensions for Intel graphics (GVT-d)
  < >   VFIO support for MLX5 PCI devices (NEW)
  <*>   VFIO support for platform devices
  <*>     VFIO support for AMBA devices
  <*>     VFIO support for calxeda xgmac reset
  <*>     VFIO support for AMD XGBE reset
  <*>     VFIO support for Broadcom FlexRM reset
  <*>   Mediated device driver framework
  <*>   VFIO support for QorIQ DPAA2 fsl-mc bus devices

Look at how "VFIO support for MLX5 PCI devices" has changed its
position.

Arnd's suggestion to add a menu seems OK, it gives another screen in
kconfig but it does group them.

My preference is the first version because it is simplest. Adding menu
seems like something to do if we get > 5 more drivers. Just hiding it
preserves back compat but hurts the UI.

Honestly, I don't care very much. If Alex values kconfig backcompat
higher than the UI, then let's use the diff in my last email.

Jason

* Re: [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko
  2021-07-28 13:08               ` Arnd Bergmann
@ 2021-07-28 17:26                 ` Jason Gunthorpe
  0 siblings, 0 replies; 55+ messages in thread
From: Jason Gunthorpe @ 2021-07-28 17:26 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Alex Williamson, Yishai Hadas, Bjorn Helgaas,
	Jonathan Corbet, diana.craciun, kwankhede, Eric Auger,
	Masahiro Yamada, Michal Marek, linux-pci,
	open list:DOCUMENTATION, kvm list, linux-s390,
	Linux Kbuild mailing list, mgurtovoy, maorg, leonro

On Wed, Jul 28, 2021 at 03:08:08PM +0200, Arnd Bergmann wrote:

> > @@ -17,6 +15,7 @@ config VFIO_PCI_INTX
> >
> >  config VFIO_PCI
> >         tristate "Generic VFIO support for any PCI device"
> > +       select VFIO_PCI_CORE
> >         help
> >           Support for the generic PCI VFIO bus driver which can connect any
> >           PCI device to the VFIO framework.
> > @@ -50,6 +49,7 @@ endif
> >  config MLX5_VFIO_PCI
> >         tristate "VFIO support for MLX5 PCI devices"
> >         depends on MLX5_CORE
> > +       select VFIO_PCI_CORE
> >         help
>
> These two now have to get a 'depends on MMU' if they don't already inherit
> that from elsewhere.

Just so I understand this remark properly, I added this at the top of
the file:

if PCI && MMU

And when I check CONFIG_MLX5_VFIO_PCI I see:

 Defined at drivers/vfio/pci/Kconfig:51
   Prompt: VFIO support for MLX5 PCI devices
   Depends on: VFIO [=y] && PCI [=y] && MMU [=y] && MLX5_CORE [=y]

So this is doing what you mean, right?
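
For reference, my understanding of why that works, as a sketch with a
made-up symbol (EXAMPLE_DRV is hypothetical): the expression of an 'if'
block is added to the dependencies of every entry inside it, so these two
forms should be equivalent.

```kconfig
# Hypothetical symbol, for illustration only.

# Form 1: dependency supplied by the enclosing block.
if PCI && MMU
config EXAMPLE_DRV
	tristate "Example driver"
endif

# Form 2: the same dependency written out on the entry itself.
# (Shown as an alternative; a real Kconfig would use only one form.)
config EXAMPLE_DRV
	tristate "Example driver"
	depends on PCI && MMU
```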

I've attached the whole thing below just for clarity

Thanks,
Jason

# SPDX-License-Identifier: GPL-2.0-only
if PCI && MMU
config VFIO_PCI_CORE
	tristate
	select VFIO_VIRQFD
	select IRQ_BYPASS_MANAGER
	help
	  Support for using PCI devices with VFIO.

config VFIO_PCI_MMAP
	def_bool y if !S390

config VFIO_PCI_INTX
	def_bool y if !S390

menu "VFIO PCI Drivers"

config VFIO_PCI
	tristate "Generic VFIO support for any PCI device"
	select VFIO_PCI_CORE
	help
	  Support for the generic PCI VFIO bus driver which can connect any
	  PCI device to the VFIO framework.

	  If you don't know what to do here, say N.

if VFIO_PCI
config VFIO_PCI_VGA
	bool "Generic VFIO PCI support for VGA devices"
	depends on X86 && VGA_ARB
	help
	  Support for VGA extension to VFIO PCI.  This exposes an additional
	  region on VGA devices for accessing legacy VGA addresses used by
	  BIOS and generic video drivers.

	  If you don't know what to do here, say N.

config VFIO_PCI_IGD
	bool "Generic VFIO PCI extensions for Intel graphics (GVT-d)"
	depends on X86
	default y
	help
	  Support for Intel IGD specific extensions to enable direct
	  assignment to virtual machines.  This includes exposing an IGD
	  specific firmware table and read-only copies of the host bridge
	  and LPC bridge config space.

	  To enable Intel IGD assignment through vfio-pci, say Y.
endif

config MLX5_VFIO_PCI
	tristate "VFIO support for MLX5 PCI devices"
	depends on MLX5_CORE
	select VFIO_PCI_CORE
	help
	  This provides a PCI support for MLX5 devices using the VFIO
	  framework. The device specific driver supports suspend/resume
	  of the MLX5 device.

	  If you don't know what to do here, say N.
endmenu
endif

* Re: [PATCH 00/12] Introduce vfio_pci_core subsystem
  2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
                   ` (11 preceding siblings ...)
  2021-07-21 16:16 ` [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko Yishai Hadas
@ 2021-08-04 13:41 ` Yishai Hadas
  2021-08-04 15:27   ` Alex Williamson
  12 siblings, 1 reply; 55+ messages in thread
From: Yishai Hadas @ 2021-08-04 13:41 UTC (permalink / raw)
  To: alex.williamson
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, maorg, corbet, michal.lkml, bhelgaas, diana.craciun,
	kwankhede, eric.auger, masahiroy, Leon Romanovsky

On 7/21/2021 7:15 PM, Yishai Hadas wrote:
> Prologue:
>
> This is the second series of three to send the "mlx5_vfio_pci" driver
> that has been discussed on the list for a while now. It comes on top of
> the first series (i.e. Reorganize reflck to support splitting vfio_pci)
> that was sent already and pending merge [1].
>
>   - Split vfio_pci into vfio_pci/vfio_pci_core and provide infrastructure
>     for non-generic VFIO PCI drivers.
>   - The new driver mlx5_vfio_pci that is a full implementation of
>     suspend/resume functionality for mlx5 devices.
>
> A preview of all the patches can be seen here:
> https://github.com/jgunthorpe/linux/commits/mlx5_vfio_pci
>
> [1] https://lore.kernel.org/dri-devel/0-v2-b6a5582525c9+ff96-vfio_reflck_jgg@nvidia.com/T/#t
> =====================
>
>  From Max Gurtovoy:
> ====================
> This series splits the vfio_pci driver into two parts, a PCI driver and
> a subsystem driver that will also be library of code. The main PCI
> driver, vfio_pci.ko, will remain as before and it will use the library
> module vfio_pci_core.ko to help create the vfio_device.
>
> This series is intended to solve the issues that were raised in the
> previous attempts for extending vfio-pci for device specific
> functionality:
>
> 1. https://lore.kernel.org/kvm/20200518024202.13996-1-yan.y.zhao@intel.com
>     by Yan Zhao
> 2. https://lore.kernel.org/kvm/20210702095849.1610-1-shameerali.kolothum.thodi@huawei.com
>     by Longfang Liu
>
> Also to support proposed future changes to virtio and other common
> protocols to support migration:
>
> https://lists.oasis-open.org/archives/virtio-comment/202106/msg00044.html
>
> This subsystem framework will also ease adding new device specific
> functionality to VFIO devices in the future by allowing another module
> to provide the pci_driver that can setup a number of details before
> registering to the VFIO subsystem, such as injecting its own operations.
>
> This series also extends the "driver_override" mechanism. A flag is
> added for PCI drivers that will declare themselves as "driver_override"
> capable which sends their match table to the modules.alias file but
> otherwise leaves them outside of the normal driver core auto-binding
> world, like vfio_pci.
>
> In order to get the best match for "driver_override" drivers, one can
> create a userspace program to inspect the modules.alias, an example can
> be found at:
>
> https://github.com/maxgurtovoy/linux_tools/blob/main/vfio/bind_vfio_pci_driver.py
>
> Which finds the 'best match' according to a simple algorithm: "the
> driver with the fewest '*' matches wins."
>
> For example, the vfio-pci driver will match to any pci device. So it
> will have the maximal '*' matches.
>
> In case we are looking for a match to a mlx5 based device, we'll have a
> match to vfio-pci.ko and mlx5-vfio-pci.ko. We'll prefer mlx5-vfio-pci.ko
> since it will have less '*' matches (probably vendor and device IDs will
> match). This will work in the future for NVMe/Virtio devices that can
> match according to a class code or other criteria.
>
> Yishai
>
>
> Jason Gunthorpe (2):
>    vfio: Use select for eventfd
>    vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on'
>
> Max Gurtovoy (9):
>    vfio/pci: Rename vfio_pci.c to vfio_pci_core.c
>    vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h
>    vfio/pci: Rename vfio_pci_device to vfio_pci_core_device
>    vfio/pci: Rename ops functions to fit core namings
>    vfio/pci: Include vfio header in vfio_pci_core.h
>    vfio/pci: Split the pci_driver code out of vfio_pci_core.c
>    vfio/pci: Move igd initialization to vfio_pci.c
>    PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
>    vfio/pci: Introduce vfio_pci_core.ko
>
> Yishai Hadas (1):
>    vfio/pci: Move module parameters to vfio_pci.c
>
>   Documentation/PCI/pci.rst                     |    1 +
>   drivers/pci/pci-driver.c                      |   25 +-
>   drivers/vfio/Kconfig                          |   29 +-
>   drivers/vfio/fsl-mc/Kconfig                   |    3 +-
>   drivers/vfio/mdev/Kconfig                     |    1 -
>   drivers/vfio/pci/Kconfig                      |   39 +-
>   drivers/vfio/pci/Makefile                     |    8 +-
>   drivers/vfio/pci/vfio_pci.c                   | 2238 +----------------
>   drivers/vfio/pci/vfio_pci_config.c            |   70 +-
>   drivers/vfio/pci/vfio_pci_core.c              | 2138 ++++++++++++++++
>   drivers/vfio/pci/vfio_pci_igd.c               |   19 +-
>   drivers/vfio/pci/vfio_pci_intrs.c             |   42 +-
>   drivers/vfio/pci/vfio_pci_rdwr.c              |   18 +-
>   drivers/vfio/pci/vfio_pci_zdev.c              |    4 +-
>   drivers/vfio/platform/Kconfig                 |    6 +-
>   drivers/vfio/platform/reset/Kconfig           |    4 +-
>   include/linux/mod_devicetable.h               |    7 +
>   include/linux/pci.h                           |   27 +
>   .../linux/vfio_pci_core.h                     |   89 +-
>   scripts/mod/devicetable-offsets.c             |    1 +
>   scripts/mod/file2alias.c                      |    8 +-
>   21 files changed, 2496 insertions(+), 2281 deletions(-)
>   create mode 100644 drivers/vfio/pci/vfio_pci_core.c
>   rename drivers/vfio/pci/vfio_pci_private.h => include/linux/vfio_pci_core.h (56%)
>
Hi Alex,

Based on the feedback that we got so far on this series, no functional 
changes are expected in V2.

It may include the below minor changes:

- Drop DRIVER_VERSION as it's useless and not required any more. 
(Patches #6, #12).

- Add the sequence of commands/algorithm that is required by userspace 
to discover the matching driver to the commit message of patch #9.

Do we need to wait for more feedback, or are we fine to send V2?

Yishai


* Re: [PATCH 00/12] Introduce vfio_pci_core subsystem
  2021-08-04 13:41 ` [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
@ 2021-08-04 15:27   ` Alex Williamson
  0 siblings, 0 replies; 55+ messages in thread
From: Alex Williamson @ 2021-08-04 15:27 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy,
	jgg, maorg, corbet, michal.lkml, bhelgaas, diana.craciun,
	kwankhede, eric.auger, masahiroy, Leon Romanovsky

On Wed, 4 Aug 2021 16:41:34 +0300
Yishai Hadas <yishaih@nvidia.com> wrote:

> Hi Alex,
> 
> Based on the feedback that we got so far on this series, no functional 
> changes are expected in V2.
> 
> It may include the below minor changes:
> 
> - Drop DRIVER_VERSION as it's useless and not required any more. 
> (Patches #6, #12).
> 
> - Add the sequence of commands/algorithm that is required by userspace 
> to discover the matching driver to the commit message of patch #9.
> 
> Do we need to wait for more feedback, or are we fine to send V2?

 - Resolve Kconfig compatibility in patch 12

Patch 9 also depends on an ack from Bjorn, so whether you want to try
to get his buy-in before or after that patch gets updated to clarify
what it's trying to do and why, is up to you.  Thanks,

Alex


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-07-21 16:16 ` [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id Yishai Hadas
  2021-07-27 16:34   ` Alex Williamson
@ 2021-08-04 20:34   ` Bjorn Helgaas
  2021-08-05 16:47     ` Max Gurtovoy
  2021-08-06  0:23     ` Jason Gunthorpe
  2021-08-12 15:42   ` Bjorn Helgaas
  2 siblings, 2 replies; 55+ messages in thread
From: Bjorn Helgaas @ 2021-08-04 20:34 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, mgurtovoy, jgg, maorg, leonro

On Wed, Jul 21, 2021 at 07:16:06PM +0300, Yishai Hadas wrote:
> From: Max Gurtovoy <mgurtovoy@nvidia.com>
> 
> The new flag field is be used to allow PCI drivers to signal the core code
> during driver matching and when generating the modules.alias information.

This needs to read as a complete idea even without the subject line.
The subject is the *title*; it's not the first sentence of the essay.

It's OK to repeat the subject line in the commit log, but I don't
think that would solve the problem here because "signal core code" and
"when generating ..." doesn't get to the point of the patch.

What's the objective here?

> The first use will be to define a VFIO flag that indicates the PCI driver
> is a VFIO driver.

Is there such a thing as a "VFIO driver" today?  Maybe this patch is
introducing that concept?  If so, maybe lead off by motivating and
defining what it is, then follow up with the details that go into
implementing it.

> VFIO drivers have a few special properties compared to normal PCI drivers:
>  - They do not automatically bind. VFIO drivers are used to swap out the
>    normal driver for a device and convert the PCI device to the VFIO
>    subsystem.

The comment below says "... any matching PCI_ID_F_DRIVER_OVERRIDE
[sic] entry is returned," which sounds like the opposite of "do not
automatically bind."  Might be exposing my VFIO ignorance here.

>    The admin must make this choice and following the current uAPI this is
>    usually done by using the driver_override sysfs.

I'm not sure "converting PCI device to the VFIO subsystem" is the
right way to phrase this, but whatever it is, make this idea specific,
e.g., by "echo pci-stub > /sys/.../driver_override" or whatever.

>  - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
>    them with 'vfio_pci:' instead of the normal 'pci:'.
> 
>    This allows the userspace machinery that switches devices to VFIO to
>    know what kernel drivers support what devices and allows it to trigger
>    the proper device_override.

What does "switch device to VFIO" mean?  I could be reading this too
literally (in my defense, I'm not a VFIO expert), but AFAICT this is
not something you do to the *device*.  I guess maybe this is something
like "prevent the normal driver from claiming the device so we can use
VFIO instead"?  Does "using VFIO" mean getting vfio-pci to claim the
device?

> As existing tools do not recognize the "vfio_pci:" mod-alias prefix this
> keeps todays behavior the same. VFIO remains on the side, is never
> autoloaded and can only be activated by direct admin action.

s/todays/today's/

> This patch is the infrastructure to provide the information in the
> modules.alias to userspace and enable the only PCI VFIO driver. Later
> series introduce additional HW specific VFIO PCI drivers.

s/the only/only the/ ?  (Not sure what you intend, but "the only"
doesn't seem right)

Sorry, I know I'm totally missing the point here.

> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  Documentation/PCI/pci.rst         |  1 +
>  drivers/pci/pci-driver.c          | 25 +++++++++++++++++++++----
>  drivers/vfio/pci/vfio_pci.c       |  9 ++++++++-
>  include/linux/mod_devicetable.h   |  7 +++++++
>  include/linux/pci.h               | 27 +++++++++++++++++++++++++++
>  scripts/mod/devicetable-offsets.c |  1 +
>  scripts/mod/file2alias.c          |  8 ++++++--
>  7 files changed, 71 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
> index fa651e25d98c..24e70a386887 100644
> --- a/Documentation/PCI/pci.rst
> +++ b/Documentation/PCI/pci.rst
> @@ -103,6 +103,7 @@ need pass only as many optional fields as necessary:
>    - subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
>    - class and classmask fields default to 0
>    - driver_data defaults to 0UL.
> +  - flags field defaults to 0.
>  
>  Note that driver_data must match the value used by any of the pci_device_id
>  entries defined in the driver. This makes the driver_data field mandatory
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3a72352aa5cf..1ed8a4ab96f1 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -136,7 +136,7 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>  						    struct pci_dev *dev)
>  {
>  	struct pci_dynid *dynid;
> -	const struct pci_device_id *found_id = NULL;
> +	const struct pci_device_id *found_id = NULL, *ids;
>  
>  	/* When driver_override is set, only bind to the matching driver */
>  	if (dev->driver_override && strcmp(dev->driver_override, drv->name))
> @@ -152,10 +152,27 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>  	}
>  	spin_unlock(&drv->dynids.lock);
>  
> -	if (!found_id)
> -		found_id = pci_match_id(drv->id_table, dev);
> +	if (found_id)
> +		return found_id;
> +
> +	ids = drv->id_table;
> +	while ((found_id = pci_match_id(ids, dev))) {
> +		/*
> +		 * The match table is split based on driver_override. Check the
> +		 * flags as well so that any matching PCI_ID_F_DRIVER_OVERRIDE

s/PCI_ID_F_DRIVER_OVERRIDE/PCI_ID_F_VFIO_DRIVER_OVERRIDE/ ?

> +		 * entry is returned.
> +		 */
> +		if ((found_id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) &&
> +		    !dev->driver_override)
> +			ids = found_id + 1;
> +		else
> +			break;

Isn't this break the same as "return found_id"?

> +	}
>  
> -	/* driver_override will always match, send a dummy id */
> +	/*
> +	 * if no static match, driver_override will always match, send a dummy

AFAICT this patch did not change dynamic matching, so I don't know why
you changed this comment.  Previously driver_override matched if there
was no dynamic or static match.  Now it's the same except that we skip
static matches with PCI_ID_F_VFIO_DRIVER_OVERRIDE.

> +	 * id.
> +	 */
>  	if (!found_id && dev->driver_override)
>  		found_id = &pci_device_id_any;
>  
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 0272b95d9c5f..7a43edbe8618 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -181,9 +181,16 @@ static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
>  	return vfio_pci_core_sriov_configure(pdev, nr_virtfn);
>  }
>  
> +static const struct pci_device_id vfio_pci_table[] = {
> +	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_ANY_ID, PCI_ANY_ID) }, /* match all by default */
> +	{ 0, }
> +};
> +
> +MODULE_DEVICE_TABLE(pci, vfio_pci_table);
> +
>  static struct pci_driver vfio_pci_driver = {
>  	.name			= "vfio-pci",
> -	.id_table		= NULL, /* only dynamic ids */
> +	.id_table		= vfio_pci_table,
>  	.probe			= vfio_pci_probe,
>  	.remove			= vfio_pci_remove,
>  	.sriov_configure	= vfio_pci_sriov_configure,
> diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
> index 8e291cfdaf06..cd256d9c60d2 100644
> --- a/include/linux/mod_devicetable.h
> +++ b/include/linux/mod_devicetable.h
> @@ -16,6 +16,11 @@ typedef unsigned long kernel_ulong_t;
>  
>  #define PCI_ANY_ID (~0)
>  
> +

Spurious blank line.

> +enum pci_id_flags {
> +	PCI_ID_F_VFIO_DRIVER_OVERRIDE	= 1 << 0,
> +};

Why an enum?  Is the enum and the name following some similar style
elsewhere?

> +
>  /**
>   * struct pci_device_id - PCI device ID structure
>   * @vendor:		Vendor ID to match (or PCI_ANY_ID)
> @@ -34,12 +39,14 @@ typedef unsigned long kernel_ulong_t;
>   *			Best practice is to use driver_data as an index
>   *			into a static list of equivalent device types,
>   *			instead of using it as a pointer.
> + * @flags:		PCI flags of the driver. Bitmap of pci_id_flags enum.
>   */
>  struct pci_device_id {
>  	__u32 vendor, device;		/* Vendor and device ID or PCI_ANY_ID*/
>  	__u32 subvendor, subdevice;	/* Subsystem ID's or PCI_ANY_ID */
>  	__u32 class, class_mask;	/* (class,subclass,prog-if) triplet */
>  	kernel_ulong_t driver_data;	/* Data private to the driver */
> +	__u32 flags;
>  };
>  
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 540b377ca8f6..fd84609ff06b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -901,6 +901,33 @@ struct pci_driver {
>  	.vendor = (vend), .device = (dev), \
>  	.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID
>  
> +/**
> + * PCI_DEVICE_FLAGS - macro used to describe a PCI device with specific flags.
> + * @vend: the 16 bit PCI Vendor ID
> + * @dev: the 16 bit PCI Device ID
> + * @fl: PCI Device flags as a bitmap of pci_id_flags enum
> + *
> + * This macro is used to create a struct pci_device_id that matches a
> + * specific device. The subvendor and subdevice fields will be set to
> + * PCI_ANY_ID.
> + */
> +#define PCI_DEVICE_FLAGS(vend, dev, fl) \
> +	.vendor = (vend), .device = (dev), .subvendor = PCI_ANY_ID, \
> +	.subdevice = PCI_ANY_ID, .flags = (fl)
> +
> +/**
> + * PCI_DRIVER_OVERRIDE_DEVICE_VFIO - macro used to describe a VFIO
> + *                                   "driver_override" PCI device.
> + * @vend: the 16 bit PCI Vendor ID
> + * @dev: the 16 bit PCI Device ID
> + *
> + * This macro is used to create a struct pci_device_id that matches a
> + * specific device. The subvendor and subdevice fields will be set to
> + * PCI_ANY_ID and the flags will be set to PCI_ID_F_VFIO_DRIVER_OVERRIDE.
> + */
> +#define PCI_DRIVER_OVERRIDE_DEVICE_VFIO(vend, dev) \
> +	PCI_DEVICE_FLAGS(vend, dev, PCI_ID_F_VFIO_DRIVER_OVERRIDE)
> +
>  /**
>   * PCI_DEVICE_SUB - macro used to describe a specific PCI device with subsystem
>   * @vend: the 16 bit PCI Vendor ID
> diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-offsets.c
> index 9bb6c7edccc4..b927c36b8333 100644
> --- a/scripts/mod/devicetable-offsets.c
> +++ b/scripts/mod/devicetable-offsets.c
> @@ -42,6 +42,7 @@ int main(void)
>  	DEVID_FIELD(pci_device_id, subdevice);
>  	DEVID_FIELD(pci_device_id, class);
>  	DEVID_FIELD(pci_device_id, class_mask);
> +	DEVID_FIELD(pci_device_id, flags);
>  
>  	DEVID(ccw_device_id);
>  	DEVID_FIELD(ccw_device_id, match_flags);
> diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
> index 7c97fa8e36bc..f53b38e8f696 100644
> --- a/scripts/mod/file2alias.c
> +++ b/scripts/mod/file2alias.c
> @@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
>  	return 1;
>  }
>  
> -/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
> +/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
>  static int do_pci_entry(const char *filename,
>  			void *symval, char *alias)
>  {
> @@ -440,8 +440,12 @@ static int do_pci_entry(const char *filename,
>  	DEF_FIELD(symval, pci_device_id, subdevice);
>  	DEF_FIELD(symval, pci_device_id, class);
>  	DEF_FIELD(symval, pci_device_id, class_mask);
> +	DEF_FIELD(symval, pci_device_id, flags);

I'm a little bit wary of adding a new field to this kernel/user
interface just for this single bit.  Maybe it's justified but feels
like it's worth being careful.

> -	strcpy(alias, "pci:");
> +	if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
> +		strcpy(alias, "vfio_pci:");
> +	else
> +		strcpy(alias, "pci:");
>  	ADD(alias, "v", vendor != PCI_ANY_ID, vendor);
>  	ADD(alias, "d", device != PCI_ANY_ID, device);
>  	ADD(alias, "sv", subvendor != PCI_ANY_ID, subvendor);
> -- 
> 2.18.1
> 


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-04 20:34   ` Bjorn Helgaas
@ 2021-08-05 16:47     ` Max Gurtovoy
  2021-08-06  0:23     ` Jason Gunthorpe
  1 sibling, 0 replies; 55+ messages in thread
From: Max Gurtovoy @ 2021-08-05 16:47 UTC (permalink / raw)
  To: Bjorn Helgaas, Yishai Hadas
  Cc: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, jgg, maorg, leonro


On 8/4/2021 11:34 PM, Bjorn Helgaas wrote:
> On Wed, Jul 21, 2021 at 07:16:06PM +0300, Yishai Hadas wrote:
>> From: Max Gurtovoy <mgurtovoy@nvidia.com>
>>
>> The new flag field is be used to allow PCI drivers to signal the core code
>> during driver matching and when generating the modules.alias information.
> This needs to read as a complete idea even without the subject line.
> The subject is the *title*; it's not the first sentence of the essay.
>
> It's OK to repeat the subject line in the commit log, but I don't
> think that would solve the problem here because "signal core code" and
> "when generating ..." doesn't get to the point of the patch.
>
> What's the objective here?

We're creating a framework for adding vendor/protocol specific vfio_pci
drivers. Today we have vfio_pci, which can match all PCI devices and
implements generic PCI functionality.

For features like live migration and other goodies we'll use the
new vendor drivers, since these features are not generic.

In this patch, we're providing the ability for userspace to identify
these drivers and match them to devices.

We also want to prevent autoloading via udev facilities (by adding
new aliases for vfio pci devices), and we don't want vfio pci vendor
drivers to race with the original pci driver (e.g. mlx5_core).

Thus, we enforce that mlx5_vfio_pci will match a device (an id_table
will be added to vendor drivers) only if an admin uses driver_override.
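
The rule described above can be sketched as a small userspace model. This is
only an illustration, not the kernel code: struct model_id, struct model_dev,
model_match() and the MODEL_* constants are hypothetical stand-ins for
struct pci_device_id, struct pci_dev, pci_match_device() and the PCI_*
definitions, and both dynamic IDs and the "dummy id" fallback for
driver_override are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for PCI_ANY_ID and PCI_ID_F_VFIO_DRIVER_OVERRIDE. */
#define MODEL_ANY_ID			0xffffffffu
#define MODEL_F_VFIO_DRIVER_OVERRIDE	(1u << 0)

struct model_id {
	uint32_t vendor, device;
	uint32_t flags;
};

struct model_dev {
	uint32_t vendor, device;
	const char *driver_override;	/* NULL when the admin set nothing */
};

/*
 * Walk a table terminated by an entry with vendor == 0 and return the
 * first entry this driver may bind with, or NULL if there is none.
 */
static const struct model_id *model_match(const char *drv_name,
					  const struct model_id *ids,
					  const struct model_dev *dev)
{
	assert(drv_name && ids && dev);

	/* When driver_override is set, only the named driver may bind. */
	if (dev->driver_override && strcmp(dev->driver_override, drv_name))
		return NULL;

	for (; ids->vendor; ids++) {
		if (ids->vendor != MODEL_ANY_ID && ids->vendor != dev->vendor)
			continue;
		if (ids->device != MODEL_ANY_ID && ids->device != dev->device)
			continue;
		/* Flagged entries only count once driver_override is set. */
		if ((ids->flags & MODEL_F_VFIO_DRIVER_OVERRIDE) &&
		    !dev->driver_override)
			continue;
		return ids;
	}
	return NULL;
}
```

With driver_override unset, the flagged entry is skipped, so in this model a
vendor VFIO driver never competes with the original driver during normal
probing.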

>
>> The first use will be to define a VFIO flag that indicates the PCI driver
>> is a VFIO driver.
> Is there such a thing as a "VFIO driver" today?  Maybe this patch is
> introducing that concept?  If so, maybe lead off by motivating and
> defining what it is, then follow up with the details that go into
> implementing it.
>
>> VFIO drivers have a few special properties compared to normal PCI drivers:
>>   - They do not automatically bind. VFIO drivers are used to swap out the
>>     normal driver for a device and convert the PCI device to the VFIO
>>     subsystem.
> The comment below says "... any matching PCI_ID_F_DRIVER_OVERRIDE
> [sic] entry is returned," which sounds like the opposite of "do not
> automatically bind."  Might be exposing my VFIO ignorance here.
>
>>     The admin must make this choice and following the current uAPI this is
>>     usually done by using the driver_override sysfs.
> I'm not sure "converting PCI device to the VFIO subsystem" is the
> right way to phrase this, but whatever it is, make this idea specific,
> e.g., by "echo pci-stub > /sys/.../driver_override" or whatever.
>
>>   - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
>>     them with 'vfio_pci:' instead of the normal 'pci:'.
>>
>>     This allows the userspace machinery that switches devices to VFIO to
>>     know what kernel drivers support what devices and allows it to trigger
>>     the proper device_override.
> What does "switch device to VFIO" mean?  I could be reading this too
> literally (in my defense, I'm not a VFIO expert), but AFAICT this is
> not something you do to the *device*.  I guess maybe this is something
> like "prevent the normal driver from claiming the device so we can use
> VFIO instead"?  Does "using VFIO" mean getting vfio-pci to claim the
> device?

Hope the above explanation makes this clearer.

We'll have vendor_vfio_pci drivers in the next patchsets, not only
vfio_pci.ko.

mlx5 and hns will be the first two drivers to implement vendor-specific
functionality in the vfio/pci subsystem.

We want to use these drivers to drive our devices rather than vfio_pci.ko,
which doesn't have the logic for migrating mlx5/hns devices.


We'll improve the commit message for the next version and add the 
algorithm Jason proposed in his previous answer.

>
>> As existing tools do not recognize the "vfio_pci:" mod-alias prefix this
>> keeps todays behavior the same. VFIO remains on the side, is never
>> autoloaded and can only be activated by direct admin action.
> s/todays/today's/
>
>> This patch is the infrastructure to provide the information in the
>> modules.alias to userspace and enable the only PCI VFIO driver. Later
>> series introduce additional HW specific VFIO PCI drivers.
> s/the only/only the/ ?  (Not sure what you intend, but "the only"
> doesn't seem right)
>
> Sorry, I know I'm totally missing the point here.
>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>> ---
>>   Documentation/PCI/pci.rst         |  1 +
>>   drivers/pci/pci-driver.c          | 25 +++++++++++++++++++++----
>>   drivers/vfio/pci/vfio_pci.c       |  9 ++++++++-
>>   include/linux/mod_devicetable.h   |  7 +++++++
>>   include/linux/pci.h               | 27 +++++++++++++++++++++++++++
>>   scripts/mod/devicetable-offsets.c |  1 +
>>   scripts/mod/file2alias.c          |  8 ++++++--
>>   7 files changed, 71 insertions(+), 7 deletions(-)
>>
>> diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
>> index fa651e25d98c..24e70a386887 100644
>> --- a/Documentation/PCI/pci.rst
>> +++ b/Documentation/PCI/pci.rst
>> @@ -103,6 +103,7 @@ need pass only as many optional fields as necessary:
>>     - subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
>>     - class and classmask fields default to 0
>>     - driver_data defaults to 0UL.
>> +  - flags field defaults to 0.
>>   
>>   Note that driver_data must match the value used by any of the pci_device_id
>>   entries defined in the driver. This makes the driver_data field mandatory
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 3a72352aa5cf..1ed8a4ab96f1 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -136,7 +136,7 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>>   						    struct pci_dev *dev)
>>   {
>>   	struct pci_dynid *dynid;
>> -	const struct pci_device_id *found_id = NULL;
>> +	const struct pci_device_id *found_id = NULL, *ids;
>>   
>>   	/* When driver_override is set, only bind to the matching driver */
>>   	if (dev->driver_override && strcmp(dev->driver_override, drv->name))
>> @@ -152,10 +152,27 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>>   	}
>>   	spin_unlock(&drv->dynids.lock);
>>   
>> -	if (!found_id)
>> -		found_id = pci_match_id(drv->id_table, dev);
>> +	if (found_id)
>> +		return found_id;
>> +
>> +	ids = drv->id_table;
>> +	while ((found_id = pci_match_id(ids, dev))) {
>> +		/*
>> +		 * The match table is split based on driver_override. Check the
>> +		 * flags as well so that any matching PCI_ID_F_DRIVER_OVERRIDE
> s/PCI_ID_F_DRIVER_OVERRIDE/PCI_ID_F_VFIO_DRIVER_OVERRIDE/ ?

Sorry, this is a leftover from the last version.

>
>> +		 * entry is returned.
>> +		 */
>> +		if ((found_id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) &&
>> +		    !dev->driver_override)
>> +			ids = found_id + 1;
>> +		else
>> +			break;
> Isn't this break the same as "return found_id"?

the same.

Will update in next version.


>
>> +	}
>>   
>> -	/* driver_override will always match, send a dummy id */
>> +	/*
>> +	 * if no static match, driver_override will always match, send a dummy
> AFAICT this patch did not change dynamic matching, so I don't know why
> you changed this comment.  Previously driver_override matched if there
> was no dynamic or static match.  Now it's the same except that we skip
> static matches with PCI_ID_F_VFIO_DRIVER_OVERRIDE.

we'll keep the old comment.


>> +	 * id.
>> +	 */
>>   	if (!found_id && dev->driver_override)
>>   		found_id = &pci_device_id_any;
>>   
>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>> index 0272b95d9c5f..7a43edbe8618 100644
>> --- a/drivers/vfio/pci/vfio_pci.c
>> +++ b/drivers/vfio/pci/vfio_pci.c
>> @@ -181,9 +181,16 @@ static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
>>   	return vfio_pci_core_sriov_configure(pdev, nr_virtfn);
>>   }
>>   
>> +static const struct pci_device_id vfio_pci_table[] = {
>> +	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_ANY_ID, PCI_ANY_ID) }, /* match all by default */
>> +	{ 0, }
>> +};
>> +
>> +MODULE_DEVICE_TABLE(pci, vfio_pci_table);
>> +
>>   static struct pci_driver vfio_pci_driver = {
>>   	.name			= "vfio-pci",
>> -	.id_table		= NULL, /* only dynamic ids */
>> +	.id_table		= vfio_pci_table,
>>   	.probe			= vfio_pci_probe,
>>   	.remove			= vfio_pci_remove,
>>   	.sriov_configure	= vfio_pci_sriov_configure,
>> diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
>> index 8e291cfdaf06..cd256d9c60d2 100644
>> --- a/include/linux/mod_devicetable.h
>> +++ b/include/linux/mod_devicetable.h
>> @@ -16,6 +16,11 @@ typedef unsigned long kernel_ulong_t;
>>   
>>   #define PCI_ANY_ID (~0)
>>   
>> +
> Spurious blank line.

good catch, thanks.


>> +enum pci_id_flags {
>> +	PCI_ID_F_VFIO_DRIVER_OVERRIDE	= 1 << 0,
>> +};
> Why an enum?  Is the enum and the name following some similar style
> elsewhere?

We might want to add more flags in the future. I'll remove the enum name 
but let's keep the enum for future extensions.


>
>> +
>>   /**
>>    * struct pci_device_id - PCI device ID structure
>>    * @vendor:		Vendor ID to match (or PCI_ANY_ID)
>> @@ -34,12 +39,14 @@ typedef unsigned long kernel_ulong_t;
>>    *			Best practice is to use driver_data as an index
>>    *			into a static list of equivalent device types,
>>    *			instead of using it as a pointer.
>> + * @flags:		PCI flags of the driver. Bitmap of pci_id_flags enum.
>>    */
>>   struct pci_device_id {
>>   	__u32 vendor, device;		/* Vendor and device ID or PCI_ANY_ID*/
>>   	__u32 subvendor, subdevice;	/* Subsystem ID's or PCI_ANY_ID */
>>   	__u32 class, class_mask;	/* (class,subclass,prog-if) triplet */
>>   	kernel_ulong_t driver_data;	/* Data private to the driver */
>> +	__u32 flags;
>>   };
>>   
>>   
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 540b377ca8f6..fd84609ff06b 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -901,6 +901,33 @@ struct pci_driver {
>>   	.vendor = (vend), .device = (dev), \
>>   	.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID
>>   
>> +/**
>> + * PCI_DEVICE_FLAGS - macro used to describe a PCI device with specific flags.
>> + * @vend: the 16 bit PCI Vendor ID
>> + * @dev: the 16 bit PCI Device ID
>> + * @fl: PCI Device flags as a bitmap of pci_id_flags enum
>> + *
>> + * This macro is used to create a struct pci_device_id that matches a
>> + * specific device. The subvendor and subdevice fields will be set to
>> + * PCI_ANY_ID.
>> + */
>> +#define PCI_DEVICE_FLAGS(vend, dev, fl) \
>> +	.vendor = (vend), .device = (dev), .subvendor = PCI_ANY_ID, \
>> +	.subdevice = PCI_ANY_ID, .flags = (fl)
>> +
>> +/**
>> + * PCI_DRIVER_OVERRIDE_DEVICE_VFIO - macro used to describe a VFIO
>> + *                                   "driver_override" PCI device.
>> + * @vend: the 16 bit PCI Vendor ID
>> + * @dev: the 16 bit PCI Device ID
>> + *
>> + * This macro is used to create a struct pci_device_id that matches a
>> + * specific device. The subvendor and subdevice fields will be set to
>> + * PCI_ANY_ID and the flags will be set to PCI_ID_F_VFIO_DRIVER_OVERRIDE.
>> + */
>> +#define PCI_DRIVER_OVERRIDE_DEVICE_VFIO(vend, dev) \
>> +	PCI_DEVICE_FLAGS(vend, dev, PCI_ID_F_VFIO_DRIVER_OVERRIDE)
>> +
>>   /**
>>    * PCI_DEVICE_SUB - macro used to describe a specific PCI device with subsystem
>>    * @vend: the 16 bit PCI Vendor ID
>> diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-offsets.c
>> index 9bb6c7edccc4..b927c36b8333 100644
>> --- a/scripts/mod/devicetable-offsets.c
>> +++ b/scripts/mod/devicetable-offsets.c
>> @@ -42,6 +42,7 @@ int main(void)
>>   	DEVID_FIELD(pci_device_id, subdevice);
>>   	DEVID_FIELD(pci_device_id, class);
>>   	DEVID_FIELD(pci_device_id, class_mask);
>> +	DEVID_FIELD(pci_device_id, flags);
>>   
>>   	DEVID(ccw_device_id);
>>   	DEVID_FIELD(ccw_device_id, match_flags);
>> diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
>> index 7c97fa8e36bc..f53b38e8f696 100644
>> --- a/scripts/mod/file2alias.c
>> +++ b/scripts/mod/file2alias.c
>> @@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
>>   	return 1;
>>   }
>>   
>> -/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
>> +/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
>>   static int do_pci_entry(const char *filename,
>>   			void *symval, char *alias)
>>   {
>> @@ -440,8 +440,12 @@ static int do_pci_entry(const char *filename,
>>   	DEF_FIELD(symval, pci_device_id, subdevice);
>>   	DEF_FIELD(symval, pci_device_id, class);
>>   	DEF_FIELD(symval, pci_device_id, class_mask);
>> +	DEF_FIELD(symval, pci_device_id, flags);
> I'm a little bit wary of adding a new field to this kernel/user
> interface just for this single bit.  Maybe it's justified but feels
> like it's worth being careful.

Old applications are not aware of these flags.

What worries you?

>
>> -	strcpy(alias, "pci:");
>> +	if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
>> +		strcpy(alias, "vfio_pci:");
>> +	else
>> +		strcpy(alias, "pci:");
>>   	ADD(alias, "v", vendor != PCI_ANY_ID, vendor);
>>   	ADD(alias, "d", device != PCI_ANY_ID, device);
>>   	ADD(alias, "sv", subvendor != PCI_ANY_ID, subvendor);
>> -- 
>> 2.18.1
>>


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-04 20:34   ` Bjorn Helgaas
  2021-08-05 16:47     ` Max Gurtovoy
@ 2021-08-06  0:23     ` Jason Gunthorpe
  2021-08-11 12:22       ` Max Gurtovoy
  2021-08-11 19:07       ` Bjorn Helgaas
  1 sibling, 2 replies; 55+ messages in thread
From: Jason Gunthorpe @ 2021-08-06  0:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy, maorg,
	leonro

On Wed, Aug 04, 2021 at 03:34:12PM -0500, Bjorn Helgaas wrote:

> > The first use will be to define a VFIO flag that indicates the PCI driver
> > is a VFIO driver.
>
> Is there such a thing as a "VFIO driver" today?  

Yes.

VFIO has long existed as a driver subsystem that binds drivers to
devices in various bus types. In the case of PCI the admin moves a PCI
device from normal operation to VFIO operation via something like:

echo vfio-pci > /sys/bus/pci/devices/0000:01:00.0/driver_override

Other bus types (platform, acpi, etc) have a similar command to move
them to VFIO.

> > VFIO drivers have a few special properties compared to normal PCI drivers:
> >  - They do not automatically bind. VFIO drivers are used to swap out the
> >    normal driver for a device and convert the PCI device to the VFIO
> >    subsystem.
> 
> The comment below says "... any matching PCI_ID_F_DRIVER_OVERRIDE
> [sic] entry is returned," which sounds like the opposite of "do not
> automatically bind."  Might be exposing my VFIO ignorance here.

The comment is in error
 
> >    The admin must make this choice and following the current uAPI this is
> >    usually done by using the driver_override sysfs.
> 
> I'm not sure "converting PCI device to the VFIO subsystem" is the
> right way to phrase this, but whatever it is, make this idea specific,
> e.g., by "echo pci-stub > /sys/.../driver_override" or whatever.

The next version will include the sequence we worked out with Alex in
the other branch of this thread. See below

> >  - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
> >    them with 'vfio_pci:' instead of the normal 'pci:'.
> > 
> >    This allows the userspace machinery that switches devices to VFIO to
> >    know what kernel drivers support what devices and allows it to trigger
> >    the proper device_override.
> 
> What does "switch device to VFIO" mean?  I could be reading this too
> literally (in my defense, I'm not a VFIO expert), but AFAICT this is
> not something you do to the *device*.  

It means change the struct device_driver bound to the struct device -
which is an operation that the admin does on the device object.

> I guess maybe this is something like "prevent the normal driver from
> claiming the device so we can use VFIO instead"?

no..

> Does "using VFIO" mean getting vfio-pci to claim the device?

If by claim you mean bind a pci_driver to the pci_dev, then yes.

> > As existing tools do not recognize the "vfio_pci:" mod-alias prefix this
> > keeps todays behavior the same. VFIO remains on the side, is never
> > autoloaded and can only be activated by direct admin action.
> 
> s/todays/today's/
> 
> > This patch is the infrastructure to provide the information in the
> > modules.alias to userspace and enable the only PCI VFIO driver. Later
> > series introduce additional HW specific VFIO PCI drivers.
> 
> s/the only/only the/ ?  (Not sure what you intend, but "the only"
> doesn't seem right)

"the only" is correct, at this point in the sequence there is only one
pci_driver that uses this, vfio_pci.ko

> Sorry, I know I'm totally missing the point here.

Let's try again..

PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id

Allow device drivers to include match entries in the modules.alias file
produced by kbuild that are not used for normal driver autoprobing and
module autoloading. Drivers using these match entries can be connected to
the PCI device manually, by userspace, using the existing driver_override
sysfs.

Add the flag PCI_ID_F_VFIO_DRIVER_OVERRIDE to indicate that the match
entry is for the VFIO subsystem. These match entries are prefixed with
"vfio_" in the modules.alias.

For example the resulting modules.alias may have:

  alias pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_core
  alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
  alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci

In this example mlx5_core and mlx5_vfio_pci match to the same PCI
device. The kernel will autoload and autobind to mlx5_core but the kernel
and udev mechanisms will ignore mlx5_vfio_pci.

When userspace wants to change a device to the VFIO subsystem userspace
can implement a generic algorithm:

   1) Identify the sysfs path to the device:
    /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0

   2) Get the modalias string from the kernel:
    $ cat /sys/bus/pci/devices/0000:01:00.0/modalias
    pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00

   3) Prefix it with vfio_:
    vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00

   4) Search modules.alias for the above string and select the entry that
      has the fewest *'s:
    alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci

   5) modprobe the matched module name:
    $ modprobe mlx5_vfio_pci

   6) Write the matched module name to driver_override:
    echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override

The algorithm is independent of bus type. In the future, other buses with
VFIO device drivers, like platform and ACPI, can use this algorithm as
well.
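
Steps 3 and 4 of the algorithm above can be sketched in userspace roughly as
follows. This is only an illustration under stated assumptions: the
modules.alias lines are taken as already parsed into an array, the patterns
are treated as shell-style globs and matched with POSIX fnmatch(), and
struct alias_entry, count_stars() and pick_vfio_module() are hypothetical
names rather than part of any existing tool.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <stdio.h>

/* One parsed "alias <pattern> <module>" line from modules.alias. */
struct alias_entry {
	const char *pattern;
	const char *module;
};

/* Count the '*' wildcards in a pattern; fewer means more specific. */
static size_t count_stars(const char *s)
{
	size_t n = 0;

	for (; *s; s++)
		if (*s == '*')
			n++;
	return n;
}

/*
 * Prefix the device modalias with "vfio_" (step 3), then return the module
 * of the matching entry with the fewest wildcards (step 4), or NULL.
 */
static const char *pick_vfio_module(const struct alias_entry *tbl, size_t n,
				    const char *modalias)
{
	char want[256];
	const char *best = NULL;
	size_t best_stars = 0;
	int len;

	assert(tbl && modalias);

	len = snprintf(want, sizeof(want), "vfio_%s", modalias);
	if (len < 0 || (size_t)len >= sizeof(want))
		return NULL;	/* modalias too long for this sketch */

	for (size_t i = 0; i < n; i++) {
		size_t stars;

		if (fnmatch(tbl[i].pattern, want, 0) != 0)
			continue;
		stars = count_stars(tbl[i].pattern);
		if (!best || stars < best_stars) {
			best = tbl[i].module;
			best_stars = stars;
		}
	}
	return best;
}
```

Fed the three example alias lines and the modalias from step 2, this selects
mlx5_vfio_pci; a device that only matches the catch-all entry falls back to
vfio_pci. The returned name is what steps 5 and 6 would then pass to modprobe
and driver_override.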

This patch is the infrastructure to provide the information in the
modules.alias to userspace. Convert the only VFIO pci_driver, which
results in one new line in the modules.alias:

  alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci

Later series introduce additional HW specific VFIO PCI drivers, such as
mlx5_vfio_pci.

> > diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
> > index 7c97fa8e36bc..f53b38e8f696 100644
> > +++ b/scripts/mod/file2alias.c
> > @@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
> >  	return 1;
> >  }
> >  
> > -/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
> > +/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
> >  static int do_pci_entry(const char *filename,
> >  			void *symval, char *alias)
> >  {
> > @@ -440,8 +440,12 @@ static int do_pci_entry(const char *filename,
> >  	DEF_FIELD(symval, pci_device_id, subdevice);
> >  	DEF_FIELD(symval, pci_device_id, class);
> >  	DEF_FIELD(symval, pci_device_id, class_mask);
> > +	DEF_FIELD(symval, pci_device_id, flags);
> 
> I'm a little bit wary of adding a new field to this kernel/user
> interface just for this single bit.  Maybe it's justified but feels
> like it's worth being careful.

A couple of us looked at this in one of the RFC threads..

As far as we could tell this is not a kernel/user interface. It is an
interface within kbuild between gcc and file2alias and is not used or
really exported beyond the kernel build sequence.

Debian code search didn't find anything, for instance.

modules.alias, as output by file2alias during kbuild, is the canonical
"kernel/user" interface here. Everything that needs this data should
be using that.
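
To make the file2alias side concrete, here is a rough userspace model of how
the alias prefix follows from the flag. It is a sketch, not the do_pci_entry()
code: struct model_pci_id, add_field(), model_pci_alias() and the MODEL_*
constants are hypothetical, and class matching is simplified to always emit
wildcards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for PCI_ANY_ID and PCI_ID_F_VFIO_DRIVER_OVERRIDE. */
#define MODEL_ANY_ID			0xffffffffu
#define MODEL_F_VFIO_DRIVER_OVERRIDE	(1u << 0)

struct model_pci_id {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t flags;
};

/* Append "<sep><8 hex digits>", or "<sep>*" for a wildcard field. */
static void add_field(char *alias, size_t len, const char *sep, uint32_t val)
{
	size_t used = strlen(alias);

	if (val == MODEL_ANY_ID)
		snprintf(alias + used, len - used, "%s*", sep);
	else
		snprintf(alias + used, len - used, "%s%08X", sep,
			 (unsigned int)val);
}

/* Build the alias string; only the prefix depends on the flags field. */
static void model_pci_alias(const struct model_pci_id *id, char *alias,
			    size_t len)
{
	size_t used;

	assert(id && alias && len > 0);

	snprintf(alias, len, "%s",
		 (id->flags & MODEL_F_VFIO_DRIVER_OVERRIDE) ?
		 "vfio_pci:" : "pci:");
	add_field(alias, len, "v", id->vendor);
	add_field(alias, len, "d", id->device);
	add_field(alias, len, "sv", id->subvendor);
	add_field(alias, len, "sd", id->subdevice);

	/* Simplification: class, subclass and prog-if are always wildcards. */
	used = strlen(alias);
	snprintf(alias + used, len - used, "bc*sc*i*");
}
```

In this model the flag never appears in the output as a field of its own; it
only selects the prefix, which is why tools that read modules.alias and look
for "pci:" entries see no change.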

Thanks,
Jason


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-06  0:23     ` Jason Gunthorpe
@ 2021-08-11 12:22       ` Max Gurtovoy
  2021-08-11 19:07       ` Bjorn Helgaas
  1 sibling, 0 replies; 55+ messages in thread
From: Max Gurtovoy @ 2021-08-11 12:22 UTC (permalink / raw)
  To: Jason Gunthorpe, Bjorn Helgaas
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, maorg, leonro

Hi Bjorn,

On 8/6/2021 3:23 AM, Jason Gunthorpe wrote:
> On Wed, Aug 04, 2021 at 03:34:12PM -0500, Bjorn Helgaas wrote:
>
>>> The first use will be to define a VFIO flag that indicates the PCI driver
>>> is a VFIO driver.
>> Is there such a thing as a "VFIO driver" today?
> Yes.
>
> VFIO has long existed as a driver subsystem that binds drivers to
> devices in various bus types. In the case of PCI the admin moves a PCI
> device from normal operation to VFIO operation via something like:
>
> echo vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
>
> Other bus types (platform, acpi, etc) have a similar command to move
> them to VFIO.
>
>>> VFIO drivers have a few special properties compared to normal PCI drivers:
>>>   - They do not automatically bind. VFIO drivers are used to swap out the
>>>     normal driver for a device and convert the PCI device to the VFIO
>>>     subsystem.
>> The comment below says "... any matching PCI_ID_F_DRIVER_OVERRIDE
>> [sic] entry is returned," which sounds like the opposite of "do not
>> automatically bind."  Might be exposing my VFIO ignorance here.
> The comment is in error
>   
>>>     The admin must make this choice and following the current uAPI this is
>>>     usually done by using the driver_override sysfs.
>> I'm not sure "converting PCI device to the VFIO subsystem" is the
>> right way to phrase this, but whatever it is, make this idea specific,
>> e.g., by "echo pci-stub > /sys/.../driver_override" or whatever.
> The next version will include the sequence we worked out with Alex in
> the other branch of this thread. See below
>
>>>   - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
>>>     them with 'vfio_pci:' instead of the normal 'pci:'.
>>>
>>>     This allows the userspace machinery that switches devices to VFIO to
>>>     know what kernel drivers support what devices and allows it to trigger
>>>     the proper device_override.
>> What does "switch device to VFIO" mean?  I could be reading this too
>> literally (in my defense, I'm not a VFIO expert), but AFAICT this is
>> not something you do to the *device*.
> It means change the struct device_driver bound to the struct device -
> which is an operation that the admin does on the device object.
>
>> I guess maybe this is something like "prevent the normal driver from
>> claiming the device so we can use VFIO instead"?
> no..
>
>> Does "using VFIO" mean getting vfio-pci to claim the device?
> If by claim you mean bind a pci_driver to the pci_dev, then yes.
>
>>> As existing tools do not recognize the "vfio_pci:" mod-alias prefix this
>>> keeps todays behavior the same. VFIO remains on the side, is never
>>> autoloaded and can only be activated by direct admin action.
>> s/todays/today's/
>>
>>> This patch is the infrastructure to provide the information in the
>>> modules.alias to userspace and enable the only PCI VFIO driver. Later
>>> series introduce additional HW specific VFIO PCI drivers.
>> s/the only/only the/ ?  (Not sure what you intend, but "the only"
>> doesn't seem right)
> "the only" is correct, at this point in the sequence there is only one
> pci_driver that uses this, vfio_pci.ko
>
>> Sorry, I know I'm totally missing the point here.
> Lets try again..
>
> PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
>
> Allow device drivers to include match entries in the modules.alias file
> produced by kbuild that are not used for normal driver autoprobing and
> module autoloading. Drivers using these match entries can be connected to
> the PCI device manually, by userspace, using the existing driver_override
> sysfs.
>
> Add the flag PCI_ID_F_VFIO_DRIVER_OVERRIDE to indicate that the match
> entry is for the VFIO subsystem. These match entries are prefixed with
> "vfio_" in the modules.alias.
>
> For example the resulting modules.alias may have:
>
>    alias pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_core
>    alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
>    alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
>
> In this example mlx5_core and mlx5_vfio_pci match to the same PCI
> device. The kernel will autoload and autobind to mlx5_core but the kernel
> and udev mechanisms will ignore mlx5_vfio_pci.
>
> When userspace wants to change a device to the VFIO subsystem userspace
> can implement a generic algorithm:
>
>     1) Identify the sysfs path to the device:
>      /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
>
>     2) Get the modalias string from the kernel:
>      $ cat /sys/bus/pci/devices/0000:01:00.0/modalias
>      pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
>
>     3) Prefix it with vfio_:
>      vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
>
>     4) Search modules.alias for the above string and select the entry that
>        has the fewest *'s:
>      alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
>
>     5) modprobe the matched module name:
>      $ modprobe mlx5_vfio_pci
>
>     6) cat the matched module name to driver_override:
>      echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
>
> The algorithm is independent of bus type. In future the other buses's with
> VFIO device drivers, like platform and ACPI, can use this algorithm as
> well.
>
> This patch is the infrastructure to provide the information in the
> modules.alias to userspace. Convert the only VFIO pci_driver which
> results in one new line in the modules.alias:
>
>    alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
>
> Later series introduce additional HW specific VFIO PCI drivers, such as
> mlx5_vfio_pci.
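
Steps 2 to 4 of the algorithm quoted above could look roughly like the
shell sketch below. It is only an illustration: the function name
vfio_pick_driver and its arguments are hypothetical, and it assumes a
modules.alias file that already contains "vfio_pci:" entries as proposed
in this patch.

```shell
# Print the module whose "vfio_" alias best matches a device modalias.
# $1: modalias as read from sysfs, e.g. pci:v000015B3d00001021sv...
# $2: path to a modules.alias file
# (Both the function name and the argument layout are assumptions.)
vfio_pick_driver() {
    want="vfio_$1"
    best_mod=
    best_stars=
    while read -r kw pattern mod; do
        [ "$kw" = alias ] || continue
        # The alias field is a glob, so an unquoted case pattern matches it.
        case $want in
            $pattern) ;;
            *) continue ;;
        esac
        # Count the wildcards; the entry with the fewest is the most specific.
        only_stars=$(printf '%s' "$pattern" | tr -cd '*')
        stars=${#only_stars}
        if [ -z "$best_mod" ] || [ "$stars" -lt "$best_stars" ]; then
            best_mod=$mod
            best_stars=$stars
        fi
    done < "$2"
    [ -n "$best_mod" ] || return 1
    printf '%s\n' "$best_mod"
}
```

With the three example alias lines above, a device whose modalias starts
with pci:v000015B3d00001021 would select mlx5_vfio_pci, and any other PCI
device would fall back to vfio_pci.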

Are we good with this commit message?

And with the code logic?

We would like to send V2 with the proposed fixes and the above commit
message, and get your ack on it.

Our goal is to merge this series and the first preparation series,
"Provide core infrastructure for managing open/release" sent by Jason, for
kernel 5.15.

The first series is in the final review phase, but this series mostly
depends on this patch. For the other patches we have some kind of agreement.

Hopefully we can collect more "Reviewed-by" tags before sending V2.


>>> diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
>>> index 7c97fa8e36bc..f53b38e8f696 100644
>>> +++ b/scripts/mod/file2alias.c
>>> @@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
>>>   	return 1;
>>>   }
>>>   
>>> -/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
>>> +/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
>>>   static int do_pci_entry(const char *filename,
>>>   			void *symval, char *alias)
>>>   {
>>> @@ -440,8 +440,12 @@ static int do_pci_entry(const char *filename,
>>>   	DEF_FIELD(symval, pci_device_id, subdevice);
>>>   	DEF_FIELD(symval, pci_device_id, class);
>>>   	DEF_FIELD(symval, pci_device_id, class_mask);
>>> +	DEF_FIELD(symval, pci_device_id, flags);
>> I'm a little bit wary of adding a new field to this kernel/user
>> interface just for this single bit.  Maybe it's justified but feels
>> like it's worth being careful.
> A couple of us looked at this in one of the RFC threads..
>
> As far as we could tell this is not a kernel/user interface. It is an
> interface within kbuild between gcc and file2alias and is not used or
> really exported beyond the kernel build sequence.
>
> Debian code search didn't find anything, for instance.
>
> modules.alias, as output by file2alias during kbuild, is the canonical
> "kernel/user" interface here. Everything that needs this data should
> be using that.
>
> Thanks,
> Jason


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-06  0:23     ` Jason Gunthorpe
  2021-08-11 12:22       ` Max Gurtovoy
@ 2021-08-11 19:07       ` Bjorn Helgaas
  2021-08-12 13:27         ` Jason Gunthorpe
  1 sibling, 1 reply; 55+ messages in thread
From: Bjorn Helgaas @ 2021-08-11 19:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy, maorg,
	leonro

On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 04, 2021 at 03:34:12PM -0500, Bjorn Helgaas wrote:
> 
> > > The first use will be to define a VFIO flag that indicates the PCI driver
> > > is a VFIO driver.
> >
> > Is there such a thing as a "VFIO driver" today?  
> 
> Yes.
> 
> VFIO has long existed as a driver subsystem that binds drivers to
> devices in various bus types. In the case of PCI the admin moves a PCI
> device from normal operation to VFIO operation via something like:

What specifically makes a driver a "VFIO driver"?  Maybe that it
supports the VFIO ioctls in include/uapi/linux/vfio.h?  That by itself
doesn't require special treatment by the kernel, so I think there's
more here.

> echo vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
> 
> Other bus types (platform, acpi, etc) have a similar command to move
> them to VFIO.

Do the other bus types have a flag analogous to
PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
other bus types, it'd be nice if the approach were similar.

> > > This patch is the infrastructure to provide the information in the
> > > modules.alias to userspace and enable the only PCI VFIO driver. Later
> > > series introduce additional HW specific VFIO PCI drivers.
> > 
> > s/the only/only the/ ?  (Not sure what you intend, but "the only"
> > doesn't seem right)
> 
> "the only" is correct, at this point in the sequence there is only one
> pci_driver that uses this, vfio_pci.ko

Can we just name the specific driver instead of obliquely referring to
"the only such driver", e.g., something like "... add a modules.alias
entry for vfio_pci.ko, currently the only PCI VFIO driver"?

> > Sorry, I know I'm totally missing the point here.
> 
> Lets try again..
> 
> PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
> 
> Allow device drivers to include match entries in the modules.alias file
> produced by kbuild that are not used for normal driver autoprobing and
> module autoloading. Drivers using these match entries can be connected to
> the PCI device manually, by userspace, using the existing driver_override
> sysfs.

IIUC, the end result of this is basically a tweak to the existing
sysfs driver_override functionality.

And I *think* (correct me if I'm wrong), this actually has nothing in
particular to do with VFIO.  It's just that you want to expose some
device IDs that are only used for binding when driver_override is set.

> Add the flag PCI_ID_F_VFIO_DRIVER_OVERRIDE to indicate that the match
> entry is for the VFIO subsystem. These match entries are prefixed with
> "vfio_" in the modules.alias.
> 
> For example the resulting modules.alias may have:
> 
>   alias pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_core
>   alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
>   alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
> 
> In this example mlx5_core and mlx5_vfio_pci match to the same PCI
> device. The kernel will autoload and autobind to mlx5_core but the kernel
> and udev mechanisms will ignore mlx5_vfio_pci.
> 
> When userspace wants to change a device to the VFIO subsystem userspace
> can implement a generic algorithm:
> 
>    1) Identify the sysfs path to the device:
>     /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
> 
>    2) Get the modalias string from the kernel:
>     $ cat /sys/bus/pci/devices/0000:01:00.0/modalias
>     pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00

So far, I think this is all the existing behavior, unaffected by this
patch.

>    3) Prefix it with vfio_:
>     vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
> 
>    4) Search modules.alias for the above string and select the entry that
>       has the fewest *'s:
>     alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci

And this patch basically adds this modules.alias entry.

Previously vfio_pci contained no vendor/device IDs, and the only way
to bind it to a device was to either:

  - Modprobe the driver and write dynamic device IDs to the driver's
    /sys/.../new_id.  This should directly bind the driver to all
    devices that match the new IDs (see new_id_store()).

or

  - Write "vfio_pci" to the device's /sys/.../driver_override.
    AFAICS, this won't bind anything (see driver_override_store()),
    but if we call the driver's .probe() method via modprobe or
    rescan, the driver_override will match any device regardless of
    ID.

IIUC, after this patch, you can add vendor/device IDs to a struct
pci_driver with this new flag.  These IDs are advertised via
modules.alias.

For driver binding, IDs with the new flag are eligible to match only
when driver_override is set to the matching driver.

Setting a device's driver_override has *always* caused the matching
driver to bind.  The only difference after this patch is that now we
give the driver an ID from its .id_table instead of pci_device_id_any.

>    5) modprobe the matched module name:
>     $ modprobe mlx5_vfio_pci

I assume somewhere in here you need to unbind mlx5_core before binding
mlx5_vfio_pci?

>    6) cat the matched module name to driver_override:
>     echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override

Don't you need something here to trigger the driver attach, i.e.,
should step 5 and step 6 be swapped?  What if the driver is already
loaded?  Can you modprobe again to make it bind to a second device?

> The algorithm is independent of bus type. In future the other buses's with
> VFIO device drivers, like platform and ACPI, can use this algorithm as
> well.

s/buses's/buses/

I see drivers/vfio/platform/vfio_platform.c; is that what you mean?  I
don't see any VFIO things with ACPI in their name, so maybe I'm
looking in the wrong place.  If this is purely *plans* for the future,
maybe say something like "planned VFIO drivers ..."

> This patch is the infrastructure to provide the information in the
> modules.alias to userspace. Convert the only VFIO pci_driver which
> results in one new line in the modules.alias:

"Convert vfio_pci, currently the only VFIO PCI driver, which ..." ?

>   alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
> 
> Later series introduce additional HW specific VFIO PCI drivers, such as
> mlx5_vfio_pci.
> 
> > > diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
> > > index 7c97fa8e36bc..f53b38e8f696 100644
> > > +++ b/scripts/mod/file2alias.c
> > > @@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
> > >  	return 1;
> > >  }
> > >  
> > > -/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
> > > +/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
> > >  static int do_pci_entry(const char *filename,
> > >  			void *symval, char *alias)
> > >  {
> > > @@ -440,8 +440,12 @@ static int do_pci_entry(const char *filename,
> > >  	DEF_FIELD(symval, pci_device_id, subdevice);
> > >  	DEF_FIELD(symval, pci_device_id, class);
> > >  	DEF_FIELD(symval, pci_device_id, class_mask);
> > > +	DEF_FIELD(symval, pci_device_id, flags);
> > 
> > I'm a little bit wary of adding a new field to this kernel/user
> > interface just for this single bit.  Maybe it's justified but feels
> > like it's worth being careful.
> 
> A couple of us looked at this in one of the RFC threads..
> 
> As far as we could tell this is not a kernel/user interface. It is an
> interface within kbuild between gcc and file2alias and is not used or
> really exported beyond the kernel build sequence.
> 
> Debian code search didn't find anything, for instance.
> 
> modules.alias, as output by file2alias during kbuild, is the canonical
> "kernel/user" interface here. Everything that needs this data should
> be using that.

Ah, thanks.  I was thinking this added something to /sys/.../modalias,
but sounds like that's not the case.


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-11 19:07       ` Bjorn Helgaas
@ 2021-08-12 13:27         ` Jason Gunthorpe
  2021-08-12 15:57           ` Bjorn Helgaas
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Gunthorpe @ 2021-08-12 13:27 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy, maorg,
	leonro

On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
> On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 04, 2021 at 03:34:12PM -0500, Bjorn Helgaas wrote:
> > 
> > > > The first use will be to define a VFIO flag that indicates the PCI driver
> > > > is a VFIO driver.
> > >
> > > Is there such a thing as a "VFIO driver" today?  
> > 
> > Yes.
> > 
> > VFIO has long existed as a driver subsystem that binds drivers to
> > devices in various bus types. In the case of PCI the admin moves a PCI
> > device from normal operation to VFIO operation via something like:
> 
> What specifically makes a driver a "VFIO driver"?  

It is a device driver whose probe function instantiates a "struct
vfio_device" which binds it to the VFIO subsystem and triggers
creation of the char devs, ioctls, etc.

No different from every other subsystem, really. E.g. a netdev driver
creates a struct net_device, a TPM driver creates a struct tpm_chip,
etc.

> supports the VFIO ioctls in include/uapi/linux/vfio.h?  That by itself
> doesn't require special treatment by the kernel, so I think there's
> more here.

The unique thing about VFIO, compared to all other subsystems, is that
VFIO is a second choice for driver binding. A device will have a
natural kernel driver, eg mlx5 naturally creates netdevs, and it has a
VFIO driver option. Userspace selects if it wants the device to
operate in normal mode or VFIO mode.

The kernel should never move a device to VFIO mode automatically -
which is the special behavior compared to any other normal pci_driver.

> > echo vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
> > 
> > Other bus types (platform, acpi, etc) have a similar command to move
> > them to VFIO.
> 
> Do the other bus types have a flag analogous to
> PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
> other bus types, it'd be nice if the approach were similar.

They could; this series doesn't attempt it. I expect the approach to
be similar, as driver_override was copied from PCI to other
buses. When this is completed I hope to take a look at it.

> > PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
> > 
> > Allow device drivers to include match entries in the modules.alias file
> > produced by kbuild that are not used for normal driver autoprobing and
> > module autoloading. Drivers using these match entries can be connected to
> > the PCI device manually, by userspace, using the existing driver_override
> > sysfs.
> 
> IIUC, the end result of this is basically a tweak to the existing
> sysfs driver_override functionality.

Yes..

> And I *think* (correct me if I'm wrong), this actually has nothing in
> particular to do with VFIO.  It's just that you want to expose some
> device IDs that are only used for binding when driver_override is set.

The general concept has nothing to do with VFIO but adding the "vfio_"
prefix to the modalias is obviously VFIO specific.

The entire point is to convey to userspace the information that the
modules.alias line is just for VFIO.

We could imagine in future some other use for this, in which case the
future user would use their own prefix, not vfio.
 
> > When userspace wants to change a device to the VFIO subsystem userspace
> > can implement a generic algorithm:
> > 
> >    1) Identify the sysfs path to the device:
> >     /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
> > 
> >    2) Get the modalias string from the kernel:
> >     $ cat /sys/bus/pci/devices/0000:01:00.0/modalias
> >     pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
> 
> So far, I think this is all the existing behavior, unaffected by this
> patch.

Yes.
 
> >    3) Prefix it with vfio_:
> >     vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
> > 
> >    4) Search modules.alias for the above string and select the entry that
> >       has the fewest *'s:
> >     alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
> 
> And this patch basically adds this modules.alias entry.

Yes.
 
> Previously vfio_pci contained no vendor/device IDs, and the only way
> to bind it to a device was to either:
> 
>   - Modprobe the driver and write dynamic device IDs to the driver's
>     /sys/.../new_id.  This should directly bind the driver to all
>     devices that match the new IDs (see new_id_store()).
> 
> or
> 
>   - Write "vfio_pci" to the device's /sys/.../driver_override.
>     AFAICS, this won't bind anything (see driver_override_store()),
>     but if we call the driver's .probe() method via modprobe or
>     rescan, the driver_override will match any device regardless of
>     ID.

Yes

> IIUC, after this patch, you can add vendor/device IDs to a struct
> pci_driver with this new flag.  These IDs are advertised via
> modules.alias.

Yes
 
> For driver binding, IDs with the new flag are eligible to match only
> when driver_override is set to the matching driver.

Yes
 
> Setting a device's driver_override has *always* caused the matching
> driver to bind.  The only difference after this patch is that now we
> give the driver an ID from its .id_table instead of pci_device_id_any.

Almost: before this patch an .id_table entry might be returned as well.
The difference here is that there are "hidden" entries in the id_table
that are only used by driver_override, and we can return such a hidden
entry.

> >    5) modprobe the matched module name:
> >     $ modprobe mlx5_vfio_pci
> 
> I assume somewhere in here you need to unbind mlx5_core before binding
> mlx5_vfio_pci?

Er, yes, I skipped the steps where the unbind/bind has to be done.
 
> >    6) cat the matched module name to driver_override:
> >     echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
> 
> Don't you need something here to trigger the driver attach, i.e.,
> should step 5 and step 6 be swapped?  What if the driver is already
> loaded? 

The full sequence is more like:

     echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
     echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
     echo 0000:01:00.0 > /sys/bus/pci/drivers_probe
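
As a rough sketch, the three writes above can be wrapped in a small shell
function. The name pci_rebind and the optional third argument (a sysfs
root, so the function can be tried against a scratch directory) are
assumptions for illustration only, and the target module is assumed to be
loaded already.

```shell
# Rebind one PCI device to a named driver using driver_override.
# $1: PCI address such as 0000:01:00.0
# $2: driver name such as mlx5_vfio_pci
# $3: sysfs mount point (assumed parameter, defaults to /sys)
pci_rebind() {
    bdf=$1
    drv=$2
    sysfs=${3:-/sys}
    dev="$sysfs/bus/pci/devices/$bdf"

    [ -d "$dev" ] || { echo "no such device: $bdf" >&2; return 1; }

    # 1) Restrict matching for this device to the chosen driver.
    echo "$drv" > "$dev/driver_override" || return 1

    # 2) Detach the currently bound driver, if there is one.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$bdf" > "$dev/driver/unbind" || return 1
    fi

    # 3) Ask the PCI bus to probe the device again.
    echo "$bdf" > "$sysfs/bus/pci/drivers_probe"
}
```

On a real system this would be run as root, e.g.
"pci_rebind 0000:01:00.0 mlx5_vfio_pci".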

> Can you modprobe again to make it bind to a second device?

modprobe is single-shot: it just loads the module and doesn't
trigger any driver binding. Running modprobe a second time is a no-op.

> I see drivers/vfio/platform/vfio_platform.c; is that what you mean?

Yes, look around vfio_platform_acpi_probe()

> I don't see any VFIO things with ACPI in their name, so maybe I'm
> looking the wrong place.  If this is purely *plans* for the future,
> maybe say something like "planned VFIO drivers ..."

Sure

Jason


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-07-21 16:16 ` [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id Yishai Hadas
  2021-07-27 16:34   ` Alex Williamson
  2021-08-04 20:34   ` Bjorn Helgaas
@ 2021-08-12 15:42   ` Bjorn Helgaas
  2 siblings, 0 replies; 55+ messages in thread
From: Bjorn Helgaas @ 2021-08-12 15:42 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: bhelgaas, corbet, alex.williamson, diana.craciun, kwankhede,
	eric.auger, masahiroy, michal.lkml, linux-pci, linux-doc, kvm,
	linux-s390, linux-kbuild, mgurtovoy, jgg, maorg, leonro

On Wed, Jul 21, 2021 at 07:16:06PM +0300, Yishai Hadas wrote:
> From: Max Gurtovoy <mgurtovoy@nvidia.com>
> 
> The new flag field is be used to allow PCI drivers to signal the core code
> during driver matching and when generating the modules.alias information.
> ...

> @@ -152,10 +152,27 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>  	}
>  	spin_unlock(&drv->dynids.lock);
>  
> -	if (!found_id)
> -		found_id = pci_match_id(drv->id_table, dev);
> +	if (found_id)
> +		return found_id;
> +
> +	ids = drv->id_table;
> +	while ((found_id = pci_match_id(ids, dev))) {
> +		/*
> +		 * The match table is split based on driver_override. Check the
> +		 * flags as well so that any matching PCI_ID_F_DRIVER_OVERRIDE
> +		 * entry is returned.
> +		 */
> +		if ((found_id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) &&
> +		    !dev->driver_override)
> +			ids = found_id + 1;
> +		else
> +			break;
> +	}
>  
> -	/* driver_override will always match, send a dummy id */
> +	/*
> +	 * if no static match, driver_override will always match, send a dummy
> +	 * id.
> +	 */
>  	if (!found_id && dev->driver_override)
>  		found_id = &pci_device_id_any;

Possibly more readable:

  while ((found_id = pci_match_id(ids, dev))) {

    /*
     * PCI_ID_F_VFIO_DRIVER_OVERRIDE entries only match when
     * driver_override matches this driver.
     */
    if (found_id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) {
      if (dev->driver_override)
        return found_id;
      else
        ids = found_id + 1;
    } else {
      return found_id;
    }
  }

  /* Driver_override will always match; send a dummy ID */
  if (dev->driver_override)
    return &pci_device_id_any;

  return NULL;
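
The matching rule can also be tried outside the kernel. The shell model
below is only a sketch of the behavior discussed here, not the kernel
code: the function name pci_match_model, the "vendor:device flag" table
format and the "any" result standing in for pci_device_id_any are all
assumptions made for the illustration.

```shell
# Model of the id_table walk for one driver.
# $1: device ID as "vendor:device", e.g. 15b3:1021
# $2: "yes" when the device's driver_override names this driver, else "no"
#     (an override naming a different driver is rejected before this walk
#     and is not modeled here)
# stdin: one table entry per line, "<vendor>:<device> <flag>", where the ID
#        may use "*" as a wildcard and <flag> is "override" or "-"
pci_match_model() {
    while read -r id flag; do
        # The entry ID is a glob, so an unquoted case pattern matches it.
        case $1 in
            $id) ;;
            *) continue ;;
        esac
        if [ "$flag" = override ] && [ "$2" != yes ]; then
            # Flagged entries are skipped unless driver_override is set.
            continue
        fi
        printf '%s %s\n' "$id" "$flag"
        return 0
    done
    # No table entry matched: driver_override still matches, with a dummy ID.
    if [ "$2" = yes ]; then
        echo any
        return 0
    fi
    return 1
}
```

For example, a table holding only "15b3:1021 override" matches device
15b3:1021 when the second argument is "yes" and matches nothing when it
is "no".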


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-12 13:27         ` Jason Gunthorpe
@ 2021-08-12 15:57           ` Bjorn Helgaas
  2021-08-12 19:51             ` Jason Gunthorpe
  0 siblings, 1 reply; 55+ messages in thread
From: Bjorn Helgaas @ 2021-08-12 15:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy, maorg,
	leonro

On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
> > On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:

> > Do the other bus types have a flag analogous to
> > PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
> > other bus types, it'd be nice if the approach were similar.
> 
> They could, this series doesn't attempt it. I expect the approach to
> be similar as driver_override was copied from PCI to other
> busses. When this is completed I hope to take a look at it.

I think this would make more sense as two patches:

  - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
    since nothing in PCI depends on the VFIO-ness of drivers that use
    the flag.  The only point here is that driver id_table entries
    with this flag only match when driver_override matches the driver.

  - Update file2alias.c to export the flags and the "vfio_pci:" alias.
    This seems to be the only place where VFIO comes into play, and
    putting it in a separate patch will make it much smaller and it
    will be clear how it could be extended for other buses.

> > I assume somewhere in here you need to unbind mlx5_core before binding
> > mlx5_vfio_pci?
> 
> Er, yes, I skipped some steps here where unbind/bind has to be done
>  
> > >    6) cat the matched module name to driver_override:
> > >     echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
> > 
> > Don't you need something here to trigger the driver attach, i.e.,
> > should step 5 and step 6 be swapped?  What if the driver is already
> > loaded? 
> 
> The full sequence is more like:
> 
>      echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
>      echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
>      echo 0000:01:00.0 > /sys/bus/pci/drivers_probe

Thanks a lot for this!  I didn't know about drivers_probe (see
drivers_probe_store()), and it doesn't seem to be documented anywhere
except sysfs-bus-usb, where it's only incidental to USB.

Bjorn


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-12 15:57           ` Bjorn Helgaas
@ 2021-08-12 19:51             ` Jason Gunthorpe
  2021-08-12 20:26               ` Bjorn Helgaas
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Gunthorpe @ 2021-08-12 19:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy, maorg,
	leonro

On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
> On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
> > > On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
> 
> > > Do the other bus types have a flag analogous to
> > > PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
> > > other bus types, it'd be nice if the approach were similar.
> > 
> > They could, this series doesn't attempt it. I expect the approach to
> > be similar as driver_override was copied from PCI to other
> > busses. When this is completed I hope to take a look at it.
> 
> I think this would make more sense as two patches:
> 
>   - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
>     since nothing in PCI depends on the VFIO-ness of drivers that use
>     the flag.  The only point here is that driver id_table entries
>     with this flag only match when driver_override matches the driver.

This would require using two flags, one to indicate the above to the
PCI code and another to indicate the vfio_pci string to
file2alias. This doesn't seem justified at this point, IMHO.

>   - Update file2alias.c to export the flags and the "vfio_pci:" alias.
>     This seems to be the only place where VFIO comes into play, and
>     putting it in a separate patch will make it much smaller and it
>     will be clear how it could be extended for other buses.

Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
to the string "vfio_pci", that is just really confusing.

Other busses need to copy pretty much the entire patch, there isn't
really any sharing here. I don't see splitting as good here..

What this logically wants is the match entry to have a

  const char *file2alias_prefix

Which would be set to "vfio_", but I'm not keen to bloat the match
entry further to do that..

> > The full sequence is more like:
> > 
> >      echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
> >      echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
> >      echo 0000:01:00.0 > /sys/bus/pci/drivers_probe
> 
> Thanks a lot for this!  I didn't know about drivers_probe (see
> drivers_probe_store()), and it doesn't seem to be documented anywhere
> except sysfs-bus-usb, where it's only incidental to USB.

Okay, let's make the changes in the commit message; it does help.

Jason


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-12 19:51             ` Jason Gunthorpe
@ 2021-08-12 20:26               ` Bjorn Helgaas
  2021-08-12 23:21                 ` Max Gurtovoy
  0 siblings, 1 reply; 55+ messages in thread
From: Bjorn Helgaas @ 2021-08-12 20:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, mgurtovoy, maorg,
	leonro

On Thu, Aug 12, 2021 at 04:51:26PM -0300, Jason Gunthorpe wrote:
> On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
> > On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
> > > On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
> > > > On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
> > 
> > > > Do the other bus types have a flag analogous to
> > > > PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
> > > > other bus types, it'd be nice if the approach were similar.
> > > 
> > > They could, this series doesn't attempt it. I expect the approach to
> > > be similar as driver_override was copied from PCI to other
> > > busses. When this is completed I hope to take a look at it.
> > 
> > I think this would make more sense as two patches:
> > 
> >   - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
> >     since nothing in PCI depends on the VFIO-ness of drivers that use
> >     the flag.  The only point here is that driver id_table entries
> >     with this flag only match when driver_override matches the driver.
> 
> This would require using two flags, one to indicate the above to the
> PCI code and another to indicate the vfio_pci string to
> file2alias. This doesn't seem justified at this point, IMHO.

I don't think it requires two flags.  do_pci_entry() has:

  if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
    strcpy(alias, "vfio_pci:");

I'm just proposing a rename:

s/PCI_ID_F_VFIO_DRIVER_OVERRIDE/PCI_ID_DRIVER_OVERRIDE/
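
To make the effect on modules.alias concrete, here is a rough shell
imitation of the alias string for a flagged and an unflagged entry. It is
not file2alias itself: the function names are hypothetical, only the
vendor and device fields are filled in, and the remaining fields are
always emitted as wildcards.

```shell
# Print one alias field: the tag followed by 8 hex digits, or "*" for "any".
alias_field() {
    if [ "$2" = any ]; then
        printf '%s*' "$1"
    else
        printf '%s%08X' "$1" "$2"
    fi
}

# Print a PCI alias pattern in the style shown in this thread.
# $1: vendor, $2: device (numbers such as 0x15B3, or the word "any")
# $3: "override" if the entry carries the driver-override flag
pci_alias() {
    if [ "${3:-}" = override ]; then
        printf 'vfio_pci:'
    else
        printf 'pci:'
    fi
    alias_field v "$1"
    alias_field d "$2"
    # Subsystem IDs and class fields are left as wildcards in this sketch.
    printf 'sv*sd*bc*sc*i*\n'
}
```

Renaming the flag would not change this output; only the flag value
decides whether the "vfio_" prefix appears.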

> >   - Update file2alias.c to export the flags and the "vfio_pci:" alias.
> >     This seems to be the only place where VFIO comes into play, and
> >     putting it in a separate patch will make it much smaller and it
> >     will be clear how it could be extended for other buses.
> 
> Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
> to the string "vfio_pci", that is just really confusing.

Hahaha, I see, that's fair :)  It confused me for a long time why you
wanted "VFIO" in the flag name because from the kernel's point of
view, the flag is not related to any VFIO-ness.  It's only related to
a special variety of driver_override, and VFIO happens to be one user
of it.

I think a separate patch that maps the flag to "vfio_pci" would be
less confusing because without the distractions of the PCI core
changes, it will be obvious that "vfio_" is a file2alias thing that's
there for userspace convenience, not for kernel reasons.

Do you envision any other prefixes in the future?  I hope we don't
have to clutter pci_match_device() with checking multiple flags.
Maybe the problem is that the modules.alias entry includes "vfio_" --
maybe we need a more generic prefix with just the idea of an
"alternate" driver.

Bjorn


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-12 20:26               ` Bjorn Helgaas
@ 2021-08-12 23:21                 ` Max Gurtovoy
  2021-08-13 17:44                   ` Bjorn Helgaas
  0 siblings, 1 reply; 55+ messages in thread
From: Max Gurtovoy @ 2021-08-12 23:21 UTC (permalink / raw)
  To: Bjorn Helgaas, Jason Gunthorpe
  Cc: Yishai Hadas, bhelgaas, corbet, alex.williamson, diana.craciun,
	kwankhede, eric.auger, masahiroy, michal.lkml, linux-pci,
	linux-doc, kvm, linux-s390, linux-kbuild, maorg, leonro


On 8/12/2021 11:26 PM, Bjorn Helgaas wrote:
> On Thu, Aug 12, 2021 at 04:51:26PM -0300, Jason Gunthorpe wrote:
>> On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
>>> On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
>>>> On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
>>>>> On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
>>>>> Do the other bus types have a flag analogous to
>>>>> PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
>>>>> other bus types, it'd be nice if the approach were similar.
>>>> They could, this series doesn't attempt it. I expect the approach to
>>>> be similar as driver_override was copied from PCI to other
>>>> busses. When this is completed I hope to take a look at it.
>>> I think this would make more sense as two patches:
>>>
>>>    - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
>>>      since nothing in PCI depends on the VFIO-ness of drivers that use
>>>      the flag.  The only point here is that driver id_table entries
>>>      with this flag only match when driver_override matches the driver.
>> This would require using two flags, one to indicate the above to the
>> PCI code and another to indicate the vfio_pci string to
>> file2alias. This doesn't seem justified at this point, IMHO.
> I don't think it requires two flags.  do_pci_entry() has:
>
>    if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
>      strcpy(alias, "vfio_pci:");
>
> I'm just proposing a rename:
>
> s/PCI_ID_F_VFIO_DRIVER_OVERRIDE/PCI_ID_DRIVER_OVERRIDE/
>
>>>    - Update file2alias.c to export the flags and the "vfio_pci:" alias.
>>>      This seems to be the only place where VFIO comes into play, and
>>>      putting it in a separate patch will make it much smaller and it
>>>      will be clear how it could be extended for other buses.
>> Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
>> to the string "vfio_pci", that is just really confusing.
> Hahaha, I see, that's fair :)  It confused me for a long time why you
> wanted "VFIO" in the flag name because from the kernel's point of
> view, the flag is not related to any VFIO-ness.  It's only related to
> a special variety of driver_override, and VFIO happens to be one user
> of it.

In my original patch I used

#define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE

and in the pci core code I used PCI_ID_DRIVER_OVERRIDE in the "if" clause.

So maybe we can do that, leaving the option of updating the define in
the future without changing the core code.

In the future we can have something like:

#define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE | \
                                PCI_ID_F_MY_BUS_DRIVER_OVERRIDE)

file2alias.c still has to use the exact PCI_ID_F_VFIO_DRIVER_OVERRIDE
flag to add the "vfio_" prefix.

Is that better?
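
A rough sketch of the split I have in mind, as standalone C. The second
flag and both helper names are hypothetical and the bit values are
assumed; the point is only that the core tests the combined mask while
the alias generator tests the exact VFIO bit.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    PCI_ID_F_VFIO_DRIVER_OVERRIDE   = 1 << 0,  /* assumed value */
    PCI_ID_F_MY_BUS_DRIVER_OVERRIDE = 1 << 1,  /* hypothetical future flag */
};

/* Mask of every "bind only via driver_override" flag. */
#define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE | \
                                PCI_ID_F_MY_BUS_DRIVER_OVERRIDE)

static_assert((PCI_ID_F_VFIO_DRIVER_OVERRIDE &
               PCI_ID_F_MY_BUS_DRIVER_OVERRIDE) == 0,
              "override flags must not share bits");

/* What the PCI core would ask: is this entry override-only at all? */
static bool core_is_override_only(unsigned int flags)
{
    return (flags & PCI_ID_DRIVER_OVERRIDE) != 0;
}

/* What file2alias would ask: does this entry get the "vfio_" prefix? */
static bool alias_needs_vfio_prefix(unsigned int flags)
{
    return (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) != 0;
}
```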

>
> I think a separate patch that maps the flag to "vfio_pci" would be
> less confusing because without the distractions of the PCI core
> changes, it will be obvious that "vfio_" is a file2alias thing that's
> there for userspace convenience, not for kernel reasons.
>
> Do you envision any other prefixes in the future?  I hope we don't
> have to clutter pci_match_device() with checking multiple flags.
> Maybe the problem is that the modules.alias entry includes "vfio_" --
> maybe we need a more generic prefix with just the idea of an
> "alternate" driver.
>
> Bjorn


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-12 23:21                 ` Max Gurtovoy
@ 2021-08-13 17:44                   ` Bjorn Helgaas
  2021-08-14 23:27                     ` Max Gurtovoy
  0 siblings, 1 reply; 55+ messages in thread
From: Bjorn Helgaas @ 2021-08-13 17:44 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Jason Gunthorpe, Yishai Hadas, bhelgaas, corbet, alex.williamson,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, maorg,
	leonro

On Fri, Aug 13, 2021 at 02:21:41AM +0300, Max Gurtovoy wrote:
> 
> On 8/12/2021 11:26 PM, Bjorn Helgaas wrote:
> > On Thu, Aug 12, 2021 at 04:51:26PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
> > > > On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
> > > > > On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
> > > > > > On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
> > > > > > Do the other bus types have a flag analogous to
> > > > > > PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
> > > > > > other bus types, it'd be nice if the approach were similar.
> > > > > They could, this series doesn't attempt it. I expect the approach to
> > > > > be similar as driver_override was copied from PCI to other
> > > > > busses. When this is completed I hope to take a look at it.
> > > > I think this would make more sense as two patches:
> > > > 
> > > >    - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
> > > >      since nothing in PCI depends on the VFIO-ness of drivers that use
> > > >      the flag.  The only point here is that driver id_table entries
> > > >      with this flag only match when driver_override matches the driver.
> > > This would require using two flags, one to indicate the above to the
> > > PCI code and another to indicate the vfio_pci string to
> > > file2alias. This doesn't seem justified at this point, IMHO.
> > I don't think it requires two flags.  do_pci_entry() has:
> > 
> >    if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
> >      strcpy(alias, "vfio_pci:");
> > 
> > I'm just proposing a rename:
> > 
> > s/PCI_ID_F_VFIO_DRIVER_OVERRIDE/PCI_ID_DRIVER_OVERRIDE/
> > 
> > > >    - Update file2alias.c to export the flags and the "vfio_pci:" alias.
> > > >      This seems to be the only place where VFIO comes into play, and
> > > >      putting it in a separate patch will make it much smaller and it
> > > >      will be clear how it could be extended for other buses.
> > > Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
> > > to the string "vfio_pci", that is just really confusing.
> > Hahaha, I see, that's fair :)  It confused me for a long time why you
> > wanted "VFIO" in the flag name because from the kernel's point of
> > view, the flag is not related to any VFIO-ness.  It's only related to
> > a special variety of driver_override, and VFIO happens to be one user
> > of it.
> 
> In my original patch I used
> 
> #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
> 
> and in the pci core code I used PCI_ID_DRIVER_OVERRIDE in the "if" clause.
> 
> So we can maybe do that and leave the option to future update of the define
> without changing the core code.
> 
> In the future we can have something like:
> 
> #define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
> PCI_ID_F_MY_BUS_DRIVER_OVERRIDE)
> 
> The file2alias.c still have to use the exact PCI_ID_F_VFIO_DRIVER_OVERRIDE
> flag to add "vfio_" prefix.
> 
> Is that better ?

I don't think it's worth having two separate #defines.  If we need
more in the future, we can add them when we need them.

What if we renamed "flags" to be specifically for this override case,
e.g., "override_only"?  Then the flag could be
PCI_ID_F_VFIO_DRIVER_OVERRIDE, which would trigger a "vfio_" prefix in
file2alias.c, but pci_match_device() could just check for it being
non-zero, without caring whether the reason is VFIO or something else,
e.g.,

  pci_match_device(...)
  {
    ...
    if (found_id->override_only) {
      if (dev->driver_override)
        return found_id;
      ...
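
Spelled out as a standalone model (user-space C, not kernel code), the
idea might look like the sketch below. struct id_entry and
entry_may_bind() are hypothetical names, and the second override_only
value in the usage note is invented purely to show that the check does
not care which bit is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one id_table entry that already matched by ID. */
struct id_entry {
    unsigned int override_only;  /* non-zero: bind only via driver_override */
};

/*
 * Returns true when the entry may be used to bind drv_name to a device
 * whose driver_override is the given string (NULL when unset).
 */
static bool entry_may_bind(const struct id_entry *id,
                           const char *driver_override,
                           const char *drv_name)
{
    assert(id != NULL && drv_name != NULL);

    /* A set driver_override restricts binding to the named driver. */
    if (driver_override && strcmp(driver_override, drv_name) != 0)
        return false;

    /* Any non-zero value counts; the reason (VFIO or not) is irrelevant. */
    if (id->override_only)
        return driver_override != NULL;

    return true;
}
```

With this shape, an entry carrying override_only = 1 and one carrying
override_only = 2 behave identically in the matching code; only
file2alias.c would look at the specific value.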

Bjorn


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-13 17:44                   ` Bjorn Helgaas
@ 2021-08-14 23:27                     ` Max Gurtovoy
  2021-08-16 17:21                       ` Bjorn Helgaas
  0 siblings, 1 reply; 55+ messages in thread
From: Max Gurtovoy @ 2021-08-14 23:27 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jason Gunthorpe, Yishai Hadas, bhelgaas, corbet, alex.williamson,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, maorg,
	leonro


On 8/13/2021 8:44 PM, Bjorn Helgaas wrote:
> On Fri, Aug 13, 2021 at 02:21:41AM +0300, Max Gurtovoy wrote:
>> On 8/12/2021 11:26 PM, Bjorn Helgaas wrote:
>>> On Thu, Aug 12, 2021 at 04:51:26PM -0300, Jason Gunthorpe wrote:
>>>> On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
>>>>> On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
>>>>>> On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
>>>>>>> On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
>>>>>>> Do the other bus types have a flag analogous to
>>>>>>> PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
>>>>>>> other bus types, it'd be nice if the approach were similar.
>>>>>> They could, this series doesn't attempt it. I expect the approach to
>>>>>> be similar as driver_override was copied from PCI to other
>>>>>> busses. When this is completed I hope to take a look at it.
>>>>> I think this would make more sense as two patches:
>>>>>
>>>>>     - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
>>>>>       since nothing in PCI depends on the VFIO-ness of drivers that use
>>>>>       the flag.  The only point here is that driver id_table entries
>>>>>       with this flag only match when driver_override matches the driver.
>>>> This would require using two flags, one to indicate the above to the
>>>> PCI code and another to indicate the vfio_pci string to
>>>> file2alias. This doesn't seem justified at this point, IMHO.
>>> I don't think it requires two flags.  do_pci_entry() has:
>>>
>>>     if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
>>>       strcpy(alias, "vfio_pci:");
>>>
>>> I'm just proposing a rename:
>>>
>>> s/PCI_ID_F_VFIO_DRIVER_OVERRIDE/PCI_ID_DRIVER_OVERRIDE/
>>>
>>>>>     - Update file2alias.c to export the flags and the "vfio_pci:" alias.
>>>>>       This seems to be the only place where VFIO comes into play, and
>>>>>       putting it in a separate patch will make it much smaller and it
>>>>>       will be clear how it could be extended for other buses.
>>>> Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
>>>> to the string "vfio_pci", that is just really confusing.
>>> Hahaha, I see, that's fair :)  It confused me for a long time why you
>>> wanted "VFIO" in the flag name because from the kernel's point of
>>> view, the flag is not related to any VFIO-ness.  It's only related to
>>> a special variety of driver_override, and VFIO happens to be one user
>>> of it.
>> In my original patch I used
>>
>> #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
>>
>> and in the pci core code I used PCI_ID_DRIVER_OVERRIDE in the "if" clause.
>>
>> So we can maybe do that and leave the option to future update of the define
>> without changing the core code.
>>
>> In the future we can have something like:
>>
>> #define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
>> PCI_ID_F_MY_BUS_DRIVER_OVERRIDE)
>>
>> The file2alias.c still have to use the exact PCI_ID_F_VFIO_DRIVER_OVERRIDE
>> flag to add "vfio_" prefix.
>>
>> Is that better ?
> I don't think it's worth having two separate #defines.  If we need
> more in the future, we can add them when we need them.

I meant 1 #define and 1 enum:

enum {
     PCI_ID_F_VFIO_DRIVER_OVERRIDE    = 1 << 0,
};

#define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE

>
> What if we renamed "flags" to be specifically for this override case,
> e.g., "override_only"?  Then the flag could be
> PCI_ID_F_VFIO_DRIVER_OVERRIDE, which would trigger a "vfio_" prefix in
> file2alias.c, but pci_match_device() could just check for it being
> non-zero, without caring whether the reason is VFIO or something else,
> e.g.,
>
>    pci_match_device(...)
>    {
>      ...
>      if (found_id->override_only) {
>        if (dev->driver_override)
>          return found_id;
>        ...

Jason suggested something like this:


static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
                             struct pci_dev *dev)
{
     struct pci_dynid *dynid;
     const struct pci_device_id *found_id = NULL, *ids;

     /* When driver_override is set, only bind to the matching driver */
     if (dev->driver_override && strcmp(dev->driver_override, drv->name))
         return NULL;

     /* Look at the dynamic ids first, before the static ones */
     spin_lock(&drv->dynids.lock);
     list_for_each_entry(dynid, &drv->dynids.list, node) {
         if (pci_match_one_device(&dynid->id, dev)) {
             found_id = &dynid->id;
             break;
         }
     }
     spin_unlock(&drv->dynids.lock);

     if (found_id)
         return found_id;

     for (ids = drv->id_table; (found_id = pci_match_id(ids, dev));
          ids = found_id + 1) {
         /*
          * The match table is split based on driver_override. Check the
          * flags as well so that any matching
          * PCI_ID_F_VFIO_DRIVER_OVERRIDE entry is returned.
          */
         if (!(found_id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) ||
             dev->driver_override)
             return found_id;
     }

     /*
      * if no static match, driver_override will always match, send a dummy
      * id.
      */
     if (dev->driver_override)
         return &pci_device_id_any;
     return NULL;
}
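
For anyone who wants to poke at the matching rules outside the kernel,
here is a simplified user-space model of the static-table part of the
function above. It is only a sketch: the model_* names are hypothetical,
dynamic IDs and locking are left out, matching is reduced to vendor and
device, and the flag value is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PCI_ID_F_VFIO_DRIVER_OVERRIDE (1u << 0)  /* assumed value */

struct model_id {
    unsigned int vendor, device, flags;  /* vendor == 0 ends the table */
};

struct model_dev {
    unsigned int vendor, device;
    const char *driver_override;         /* NULL when unset */
};

/* Dummy entry returned when driver_override forces a match. */
static const struct model_id model_id_any = { 0, 0, 0 };

static const struct model_id *
model_match_device(const char *drv_name, const struct model_id *table,
                   const struct model_dev *dev)
{
    const struct model_id *id;

    assert(drv_name != NULL && table != NULL && dev != NULL);

    /* When driver_override is set, only the named driver may bind. */
    if (dev->driver_override && strcmp(dev->driver_override, drv_name) != 0)
        return NULL;

    for (id = table; id->vendor != 0; id++) {
        if (id->vendor != dev->vendor || id->device != dev->device)
            continue;
        /* Flagged entries are usable only when driver_override is set. */
        if (!(id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) ||
            dev->driver_override)
            return id;
    }

    /* No static match: driver_override still matches via the dummy id. */
    return dev->driver_override ? &model_id_any : NULL;
}
```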


It looks good to me as well.

I prefer the "flags" naming since it's more generic and easier to extend.

Can we continue with the above suggestion for v2?

It's really a matter of taste.

> Bjorn


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-14 23:27                     ` Max Gurtovoy
@ 2021-08-16 17:21                       ` Bjorn Helgaas
  2021-08-17 13:01                         ` Max Gurtovoy
  0 siblings, 1 reply; 55+ messages in thread
From: Bjorn Helgaas @ 2021-08-16 17:21 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Jason Gunthorpe, Yishai Hadas, bhelgaas, corbet, alex.williamson,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, maorg,
	leonro

On Sun, Aug 15, 2021 at 02:27:13AM +0300, Max Gurtovoy wrote:
> On 8/13/2021 8:44 PM, Bjorn Helgaas wrote:
> > On Fri, Aug 13, 2021 at 02:21:41AM +0300, Max Gurtovoy wrote:
> > > On 8/12/2021 11:26 PM, Bjorn Helgaas wrote:
> > > > On Thu, Aug 12, 2021 at 04:51:26PM -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
> > > > > > On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
> > > > > > > On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
> > > > > > > > On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
> > > > > > > > Do the other bus types have a flag analogous to
> > > > > > > > PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
> > > > > > > > other bus types, it'd be nice if the approach were similar.
> > > > > > > They could, this series doesn't attempt it. I expect the approach to
> > > > > > > be similar as driver_override was copied from PCI to other
> > > > > > > busses. When this is completed I hope to take a look at it.
> > > > > > I think this would make more sense as two patches:
> > > > > > 
> > > > > >     - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
> > > > > >       since nothing in PCI depends on the VFIO-ness of drivers that use
> > > > > >       the flag.  The only point here is that driver id_table entries
> > > > > >       with this flag only match when driver_override matches the driver.
> > > > > This would require using two flags, one to indicate the above to the
> > > > > PCI code and another to indicate the vfio_pci string to
> > > > > file2alias. This doesn't seem justified at this point, IMHO.
> > > > I don't think it requires two flags.  do_pci_entry() has:
> > > > 
> > > >     if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
> > > >       strcpy(alias, "vfio_pci:");
> > > > 
> > > > I'm just proposing a rename:
> > > > 
> > > > s/PCI_ID_F_VFIO_DRIVER_OVERRIDE/PCI_ID_DRIVER_OVERRIDE/
> > > > 
> > > > > >     - Update file2alias.c to export the flags and the "vfio_pci:" alias.
> > > > > >       This seems to be the only place where VFIO comes into play, and
> > > > > >       putting it in a separate patch will make it much smaller and it
> > > > > >       will be clear how it could be extended for other buses.
> > > > > Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
> > > > > to the string "vfio_pci", that is just really confusing.
> > > > Hahaha, I see, that's fair :)  It confused me for a long time why you
> > > > wanted "VFIO" in the flag name because from the kernel's point of
> > > > view, the flag is not related to any VFIO-ness.  It's only related to
> > > > a special variety of driver_override, and VFIO happens to be one user
> > > > of it.
> > > In my original patch I used
> > > 
> > > #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
> > > 
> > > and in the pci core code I used PCI_ID_DRIVER_OVERRIDE in the "if" clause.
> > > 
> > > So we can maybe do that and leave the option to future update of the define
> > > without changing the core code.
> > > 
> > > In the future we can have something like:
> > > 
> > > #define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
> > > PCI_ID_F_MY_BUS_DRIVER_OVERRIDE)
> > > 
> > > The file2alias.c still have to use the exact PCI_ID_F_VFIO_DRIVER_OVERRIDE
> > > flag to add "vfio_" prefix.
> > > 
> > > Is that better ?
> > I don't think it's worth having two separate #defines.  If we need
> > more in the future, we can add them when we need them.
> 
> I meant 1 #define and 1 enum:
> 
> enum {
>     PCI_ID_F_VFIO_DRIVER_OVERRIDE    = 1 << 0,
> };
> 
> #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE

Basically the same thing.  Doesn't seem worthwhile to me to have both.
When reading the code, it's not at all obvious why you would define a
new name for PCI_ID_F_VFIO_DRIVER_OVERRIDE.

> > What if we renamed "flags" to be specifically for this override case,
> > e.g., "override_only"?  Then the flag could be
> > PCI_ID_F_VFIO_DRIVER_OVERRIDE, which would trigger a "vfio_" prefix in
> > file2alias.c, but pci_match_device() could just check for it being
> > non-zero, without caring whether the reason is VFIO or something else,
> > e.g.,
> > 
> >    pci_match_device(...)
> >    {
> >      ...
> >      if (found_id->override_only) {
> >        if (dev->driver_override)
> >          return found_id;
> >        ...
> 
> Jason suggested something like this:
> 
> static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>                             struct pci_dev *dev)
> {
>     struct pci_dynid *dynid;
>     const struct pci_device_id *found_id = NULL, *ids;
> 
>     /* When driver_override is set, only bind to the matching driver */
>     if (dev->driver_override && strcmp(dev->driver_override, drv->name))
>         return NULL;
> 
>     /* Look at the dynamic ids first, before the static ones */
>     spin_lock(&drv->dynids.lock);
>     list_for_each_entry(dynid, &drv->dynids.list, node) {
>         if (pci_match_one_device(&dynid->id, dev)) {
>             found_id = &dynid->id;
>             break;
>         }
>     }
>     spin_unlock(&drv->dynids.lock);
> 
>     if (found_id)
>         return found_id;
> 
>     for (ids = drv->id_table; (found_id = pci_match_id(ids, dev));
>          ids = found_id + 1) {
>         /*
>          * The match table is split based on driver_override. Check the
>          * flags as well so that any matching
>          * PCI_ID_F_VFIO_DRIVER_OVERRIDE entry is returned.
>          */
>         if (!(found_id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) ||
>             dev->driver_override)
>             return found_id;
>     }
> 
>     /*
>      * if no static match, driver_override will always match, send a dummy
>      * id.
>      */
>     if (dev->driver_override)
>         return &pci_device_id_any;
>     return NULL;
> }
> 
> 
> It looks good to me as well.

I missed your point.  Isn't the above basically the 09/12 patch [1] we're
talking about?

Yes, I see the code structure is slightly different, but the question
we're talking about here is the name of the "flags" field and the enum
or #define for the VFIO bit.

> I prefer the "flags" naming since its more generic and easy to extend.

We don't need to worry about "flags" being generic or extensible until
we need to extend it.  It's easy to fiddle with it at that point.

> can we continue with the above suggestion for V2 ?

I don't see what really changed with the above suggestion.

The point I'm trying to make is that using PCI_ID_F_VFIO_DRIVER_OVERRIDE 
in pci_match_device() suggests that the code there has some connection
or dependency on VFIO, but it does not.

[1] https://lore.kernel.org/r/20210721161609.68223-10-yishaih@nvidia.com


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-16 17:21                       ` Bjorn Helgaas
@ 2021-08-17 13:01                         ` Max Gurtovoy
  2021-08-17 14:13                           ` Bjorn Helgaas
  0 siblings, 1 reply; 55+ messages in thread
From: Max Gurtovoy @ 2021-08-17 13:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jason Gunthorpe, Yishai Hadas, bhelgaas, corbet, alex.williamson,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, maorg,
	leonro


On 8/16/2021 8:21 PM, Bjorn Helgaas wrote:
> On Sun, Aug 15, 2021 at 02:27:13AM +0300, Max Gurtovoy wrote:
>> On 8/13/2021 8:44 PM, Bjorn Helgaas wrote:
>>> On Fri, Aug 13, 2021 at 02:21:41AM +0300, Max Gurtovoy wrote:
>>>> On 8/12/2021 11:26 PM, Bjorn Helgaas wrote:
>>>>> On Thu, Aug 12, 2021 at 04:51:26PM -0300, Jason Gunthorpe wrote:
>>>>>> On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
>>>>>>> On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
>>>>>>>> On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
>>>>>>>>> On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
>>>>>>>>> Do the other bus types have a flag analogous to
>>>>>>>>> PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
>>>>>>>>> other bus types, it'd be nice if the approach were similar.
>>>>>>>> They could, this series doesn't attempt it. I expect the approach to
>>>>>>>> be similar as driver_override was copied from PCI to other
>>>>>>>> busses. When this is completed I hope to take a look at it.
>>>>>>> I think this would make more sense as two patches:
>>>>>>>
>>>>>>>      - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
>>>>>>>        since nothing in PCI depends on the VFIO-ness of drivers that use
>>>>>>>        the flag.  The only point here is that driver id_table entries
>>>>>>>        with this flag only match when driver_override matches the driver.
>>>>>> This would require using two flags, one to indicate the above to the
>>>>>> PCI code and another to indicate the vfio_pci string to
>>>>>> file2alias. This doesn't seem justified at this point, IMHO.
>>>>> I don't think it requires two flags.  do_pci_entry() has:
>>>>>
>>>>>      if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
>>>>>        strcpy(alias, "vfio_pci:");
>>>>>
>>>>> I'm just proposing a rename:
>>>>>
>>>>> s/PCI_ID_F_VFIO_DRIVER_OVERRIDE/PCI_ID_DRIVER_OVERRIDE/
>>>>>
>>>>>>>      - Update file2alias.c to export the flags and the "vfio_pci:" alias.
>>>>>>>        This seems to be the only place where VFIO comes into play, and
>>>>>>>        putting it in a separate patch will make it much smaller and it
>>>>>>>        will be clear how it could be extended for other buses.
>>>>>> Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
>>>>>> to the string "vfio_pci", that is just really confusing.
>>>>> Hahaha, I see, that's fair :)  It confused me for a long time why you
>>>>> wanted "VFIO" in the flag name because from the kernel's point of
>>>>> view, the flag is not related to any VFIO-ness.  It's only related to
>>>>> a special variety of driver_override, and VFIO happens to be one user
>>>>> of it.
>>>> In my original patch I used
>>>>
>>>> #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
>>>>
>>>> and in the pci core code I used PCI_ID_DRIVER_OVERRIDE in the "if" clause.
>>>>
>>>> So we can maybe do that and leave the option to future update of the define
>>>> without changing the core code.
>>>>
>>>> In the future we can have something like:
>>>>
>>>> #define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
>>>> PCI_ID_F_MY_BUS_DRIVER_OVERRIDE)
>>>>
>>>> The file2alias.c still have to use the exact PCI_ID_F_VFIO_DRIVER_OVERRIDE
>>>> flag to add "vfio_" prefix.
>>>>
>>>> Is that better ?
>>> I don't think it's worth having two separate #defines.  If we need
>>> more in the future, we can add them when we need them.
>> I meant 1 #define and 1 enum:
>>
>> enum {
>>      PCI_ID_F_VFIO_DRIVER_OVERRIDE    = 1 << 0,
>> };
>>
>> #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
> Basically the same thing.  Doesn't seem worthwhile to me to have both.
> When reading the code, it's not at all obvious why you would define a
> new name for PCI_ID_F_VFIO_DRIVER_OVERRIDE.

Because we need the "vfio_" prefix in the alias.

And the match can use PCI_ID_DRIVER_OVERRIDE, which in the future can be
defined as (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
PCI_ID_F_SOME_OTHER_ALIAS_DRIVER_OVERRIDE).

>>> What if we renamed "flags" to be specifically for this override case,
>>> e.g., "override_only"?  Then the flag could be
>>> PCI_ID_F_VFIO_DRIVER_OVERRIDE, which would trigger a "vfio_" prefix in
>>> file2alias.c, but pci_match_device() could just check for it being
>>> non-zero, without caring whether the reason is VFIO or something else,
>>> e.g.,
>>>
>>>     pci_match_device(...)
>>>     {
>>>       ...
>>>       if (found_id->override_only) {
>>>         if (dev->driver_override)
>>>           return found_id;
>>>         ...
>> Jason suggested something like this:
>>
>> static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>>                              struct pci_dev *dev)
>> {
>>      struct pci_dynid *dynid;
>>      const struct pci_device_id *found_id = NULL, *ids;
>>
>>      /* When driver_override is set, only bind to the matching driver */
>>      if (dev->driver_override && strcmp(dev->driver_override, drv->name))
>>          return NULL;
>>
>>      /* Look at the dynamic ids first, before the static ones */
>>      spin_lock(&drv->dynids.lock);
>>      list_for_each_entry(dynid, &drv->dynids.list, node) {
>>          if (pci_match_one_device(&dynid->id, dev)) {
>>              found_id = &dynid->id;
>>              break;
>>          }
>>      }
>>      spin_unlock(&drv->dynids.lock);
>>
>>      if (found_id)
>>          return found_id;
>>
>>      for (ids = drv->id_table; (found_id = pci_match_id(ids, dev));
>>           ids = found_id + 1) {
>>          /*
>>           * The match table is split based on driver_override. Check the
>>           * flags as well so that any matching
>>           * PCI_ID_F_VFIO_DRIVER_OVERRIDE entry is returned.
>>           */
>>          if (!(found_id->flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE) ||
>>              dev->driver_override)
>>              return found_id;
>>      }
>>
>>      /*
>>       * if no static match, driver_override will always match, send a dummy
>>       * id.
>>       */
>>      if (dev->driver_override)
>>          return &pci_device_id_any;
>>      return NULL;
>> }
>>
>>
>> It looks good to me as well.
> I missed your point.  Isn't the above basically the 09/12 patch [1] we're
> talking about?
>
> Yes, I see the code structure is slightly different, but the question
> we're talking about here is the name of the "flags" field and the enum
> or #define for the VFIO bit.

I guess the renaming of "__u32 flags" to "__u32 driver_override" is ok 
from my perspective.

The enum for vfio should stay.

The prefix we want in the alias is "vfio_" and not "driver_override_".

This will allow a clean uAPI. A "driver_override_" prefix would be too
generic for userspace tools like libvirt, which want to find a *VFIO*
driver and not something else.

Thus we need the alias prefix to be "vfio_".

In the future, if some other driver uses this flag, it will create an
alias as well. With your suggestion, the alias would be the same and the
userspace tool wouldn't be able to distinguish between the two.

But in the original solution, drivers that use driver override for
something other than VFIO can add a new enum value,
PCI_ID_F_SOME_OTHER_ALIAS_DRIVER_OVERRIDE, with its own alias prefix for
recognition, e.g. "my_prefix_".
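
To illustrate the userspace side, a tool could pick out VFIO variant
drivers with a plain prefix test on modules.alias lines. This is only a
sketch: is_vfio_pci_alias() is a hypothetical helper, not something
libvirt has today, and it assumes the usual "alias <pattern> <module>"
line layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true when a modules.alias line describes a "vfio_pci:" alias.
 * Lines with "pci:" or any other prefix are rejected, which is what
 * lets a tool tell VFIO drivers apart from other override-only drivers.
 */
static bool is_vfio_pci_alias(const char *line)
{
    static const char prefix[] = "alias vfio_pci:";

    assert(line != NULL);
    return strncmp(line, prefix, sizeof(prefix) - 1) == 0;
}
```

With a generic "driver_override_" prefix, the same test could not tell a
VFIO driver from any other driver that uses the flag.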

>
>> I prefer the "flags" naming since its more generic and easy to extend.
> We don't need to worry about "flags" being generic or extensible until
> we need to extend it.  It's easy to fiddle with it at that point.
>
>> can we continue with the above suggestion for V2 ?
> I don't see what really changed with the above suggestion.
>
> The point I'm trying to make is that using PCI_ID_F_VFIO_DRIVER_OVERRIDE
> in pci_match_device() suggests that the code there has some connection
> or dependency on VFIO, but it does not.

This is why I suggested a "#define PCI_ID_DRIVER_OVERRIDE 
PCI_ID_F_VFIO_DRIVER_OVERRIDE"


>
> [1] https://lore.kernel.org/r/20210721161609.68223-10-yishaih@nvidia.com


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-17 13:01                         ` Max Gurtovoy
@ 2021-08-17 14:13                           ` Bjorn Helgaas
  2021-08-17 14:44                             ` Max Gurtovoy
  0 siblings, 1 reply; 55+ messages in thread
From: Bjorn Helgaas @ 2021-08-17 14:13 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Jason Gunthorpe, Yishai Hadas, bhelgaas, corbet, alex.williamson,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, maorg,
	leonro

On Tue, Aug 17, 2021 at 04:01:49PM +0300, Max Gurtovoy wrote:
> On 8/16/2021 8:21 PM, Bjorn Helgaas wrote:
> > On Sun, Aug 15, 2021 at 02:27:13AM +0300, Max Gurtovoy wrote:
> > > On 8/13/2021 8:44 PM, Bjorn Helgaas wrote:
> > > > On Fri, Aug 13, 2021 at 02:21:41AM +0300, Max Gurtovoy wrote:
> > > > > On 8/12/2021 11:26 PM, Bjorn Helgaas wrote:
> > > > > > On Thu, Aug 12, 2021 at 04:51:26PM -0300, Jason Gunthorpe wrote:
> > > > > > > On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
> > > > > > > > On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
> > > > > > > > > On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
> > > > > > > > > > On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
> > > > > > > > > > Do the other bus types have a flag analogous to
> > > > > > > > > > PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
> > > > > > > > > > other bus types, it'd be nice if the approach were similar.
> > > > > > > > > They could, this series doesn't attempt it. I expect the approach to
> > > > > > > > > be similar as driver_override was copied from PCI to other
> > > > > > > > > busses. When this is completed I hope to take a look at it.
> > > > > > > > I think this would make more sense as two patches:
> > > > > > > > 
> > > > > > > >      - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
> > > > > > > >        since nothing in PCI depends on the VFIO-ness of drivers that use
> > > > > > > >        the flag.  The only point here is that driver id_table entries
> > > > > > > >        with this flag only match when driver_override matches the driver.
> > > > > > > This would require using two flags, one to indicate the above to the
> > > > > > > PCI code and another to indicate the vfio_pci string to
> > > > > > > file2alias. This doesn't seem justified at this point, IMHO.
> > > > > > I don't think it requires two flags.  do_pci_entry() has:
> > > > > > 
> > > > > >      if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
> > > > > >        strcpy(alias, "vfio_pci:");
> > > > > > 
> > > > > > I'm just proposing a rename:
> > > > > > 
> > > > > > s/PCI_ID_F_VFIO_DRIVER_OVERRIDE/PCI_ID_DRIVER_OVERRIDE/
> > > > > > 
> > > > > > > >      - Update file2alias.c to export the flags and the "vfio_pci:" alias.
> > > > > > > >        This seems to be the only place where VFIO comes into play, and
> > > > > > > >        putting it in a separate patch will make it much smaller and it
> > > > > > > >        will be clear how it could be extended for other buses.
> > > > > > > Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
> > > > > > > to the string "vfio_pci", that is just really confusing.
> > > > > > Hahaha, I see, that's fair :)  It confused me for a long time why you
> > > > > > wanted "VFIO" in the flag name because from the kernel's point of
> > > > > > view, the flag is not related to any VFIO-ness.  It's only related to
> > > > > > a special variety of driver_override, and VFIO happens to be one user
> > > > > > of it.
> > > > > In my original patch I used
> > > > > 
> > > > > #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
> > > > > 
> > > > > and in the pci core code I used PCI_ID_DRIVER_OVERRIDE in the "if" clause.
> > > > > 
> > > > > So we can maybe do that and leave the option to future update of the define
> > > > > without changing the core code.
> > > > > 
> > > > > In the future we can have something like:
> > > > > 
> > > > > #define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
> > > > > PCI_ID_F_MY_BUS_DRIVER_OVERRIDE)
> > > > > 
> > > > > > file2alias.c still has to use the exact PCI_ID_F_VFIO_DRIVER_OVERRIDE
> > > > > > flag to add the "vfio_" prefix.
> > > > > 
> > > > > Is that better ?
> > > > I don't think it's worth having two separate #defines.  If we need
> > > > more in the future, we can add them when we need them.
> > > I meant 1 #define and 1 enum:
> > > 
> > > enum {
> > >      PCI_ID_F_VFIO_DRIVER_OVERRIDE    = 1 << 0,
> > > };
> > > 
> > > #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
> > Basically the same thing.  Doesn't seem worthwhile to me to have both.
> > When reading the code, it's not at all obvious why you would define a
> > new name for PCI_ID_F_VFIO_DRIVER_OVERRIDE.
> 
> because we need the "vfio_" prefix in the alias.
> 
> And the match can use PCI_ID_DRIVER_OVERRIDE, which in the future can be
> #define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
> PCI_ID_F_SOME_OTHER_ALIAS_DRIVER_OVERRIDE)

Read this again:
https://lore.kernel.org/r/20210813174459.GA2594783@bjorn-Precision-5520

That gives you a "vfio_" prefix without the unnecessary VFIO
connection in pci_match_device.
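
The linked message is not reproduced on this page, so the sketch below is
only one possible reading of the point above, not the code that was
proposed. It assumes a per-entry field (named override_only here, after
the field used in the reply that follows) whose value file2alias
compares against the VFIO flag, while the PCI core only checks whether
the field is non-zero. All demo_* names and the flag value are
hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical value; the thread names the flag but not its final form. */
#define DEMO_ID_F_VFIO_DRIVER_OVERRIDE 1u

/*
 * file2alias side: the only place that looks at *which* override this is.
 * Writes the alias prefix into alias and returns 0, or -1 on an unknown
 * value or a buffer that is too small.
 */
static int demo_alias_prefix(unsigned int override_only, char *alias,
                             size_t len)
{
    const char *prefix;

    assert(alias != NULL);

    switch (override_only) {
    case 0:
        prefix = "pci:";
        break;
    case DEMO_ID_F_VFIO_DRIVER_OVERRIDE:
        prefix = "vfio_pci:";
        break;
    default:
        return -1;
    }

    if (strlen(prefix) + 1 > len)
        return -1;
    strcpy(alias, prefix);
    return 0;
}

/*
 * PCI core side: no VFIO knowledge, only "is the field set at all?".
 * An override-only entry is usable only when driver_override is set.
 */
static int demo_entry_usable(unsigned int override_only,
                             int have_driver_override)
{
    return !override_only || have_driver_override;
}
```

With this split, adding a second alias prefix later would touch only the
switch in the alias helper, not the matching check.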


* Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id
  2021-08-17 14:13                           ` Bjorn Helgaas
@ 2021-08-17 14:44                             ` Max Gurtovoy
  0 siblings, 0 replies; 55+ messages in thread
From: Max Gurtovoy @ 2021-08-17 14:44 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jason Gunthorpe, Yishai Hadas, bhelgaas, corbet, alex.williamson,
	diana.craciun, kwankhede, eric.auger, masahiroy, michal.lkml,
	linux-pci, linux-doc, kvm, linux-s390, linux-kbuild, maorg,
	leonro


On 8/17/2021 5:13 PM, Bjorn Helgaas wrote:
> On Tue, Aug 17, 2021 at 04:01:49PM +0300, Max Gurtovoy wrote:
>> On 8/16/2021 8:21 PM, Bjorn Helgaas wrote:
>>> On Sun, Aug 15, 2021 at 02:27:13AM +0300, Max Gurtovoy wrote:
>>>> On 8/13/2021 8:44 PM, Bjorn Helgaas wrote:
>>>>> On Fri, Aug 13, 2021 at 02:21:41AM +0300, Max Gurtovoy wrote:
>>>>>> On 8/12/2021 11:26 PM, Bjorn Helgaas wrote:
>>>>>>> On Thu, Aug 12, 2021 at 04:51:26PM -0300, Jason Gunthorpe wrote:
>>>>>>>> On Thu, Aug 12, 2021 at 10:57:07AM -0500, Bjorn Helgaas wrote:
>>>>>>>>> On Thu, Aug 12, 2021 at 10:27:28AM -0300, Jason Gunthorpe wrote:
>>>>>>>>>> On Wed, Aug 11, 2021 at 02:07:37PM -0500, Bjorn Helgaas wrote:
>>>>>>>>>>> On Thu, Aug 05, 2021 at 09:23:57PM -0300, Jason Gunthorpe wrote:
>>>>>>>>>>> Do the other bus types have a flag analogous to
>>>>>>>>>>> PCI_ID_F_VFIO_DRIVER_OVERRIDE?  If we're doing something similar to
>>>>>>>>>>> other bus types, it'd be nice if the approach were similar.
>>>>>>>>>> They could, this series doesn't attempt it. I expect the approach to
>>>>>>>>>> be similar as driver_override was copied from PCI to other
>>>>>>>>>> busses. When this is completed I hope to take a look at it.
>>>>>>>>> I think this would make more sense as two patches:
>>>>>>>>>
>>>>>>>>>       - Add a "PCI_ID_DRIVER_OVERRIDE" flag.  This is not VFIO-specific,
>>>>>>>>>         since nothing in PCI depends on the VFIO-ness of drivers that use
>>>>>>>>>         the flag.  The only point here is that driver id_table entries
>>>>>>>>>         with this flag only match when driver_override matches the driver.
>>>>>>>> This would require using two flags, one to indicate the above to the
>>>>>>>> PCI code and another to indicate the vfio_pci string to
>>>>>>>> file2alias. This doesn't seem justified at this point, IMHO.
>>>>>>> I don't think it requires two flags.  do_pci_entry() has:
>>>>>>>
>>>>>>>       if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
>>>>>>>         strcpy(alias, "vfio_pci:");
>>>>>>>
>>>>>>> I'm just proposing a rename:
>>>>>>>
>>>>>>> s/PCI_ID_F_VFIO_DRIVER_OVERRIDE/PCI_ID_DRIVER_OVERRIDE/
>>>>>>>
>>>>>>>>>       - Update file2alias.c to export the flags and the "vfio_pci:" alias.
>>>>>>>>>         This seems to be the only place where VFIO comes into play, and
>>>>>>>>>         putting it in a separate patch will make it much smaller and it
>>>>>>>>>         will be clear how it could be extended for other buses.
>>>>>>>> Well, I don't want to see a flag called PCI_ID_DRIVER_OVERRIDE mapped
>>>>>>>> to the string "vfio_pci", that is just really confusing.
>>>>>>> Hahaha, I see, that's fair :)  It confused me for a long time why you
>>>>>>> wanted "VFIO" in the flag name because from the kernel's point of
>>>>>>> view, the flag is not related to any VFIO-ness.  It's only related to
>>>>>>> a special variety of driver_override, and VFIO happens to be one user
>>>>>>> of it.
>>>>>> In my original patch I used
>>>>>>
>>>>>> #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
>>>>>>
>>>>>> and in the pci core code I used PCI_ID_DRIVER_OVERRIDE in the "if" clause.
>>>>>>
>>>>>> So we can maybe do that and leave the option to future update of the define
>>>>>> without changing the core code.
>>>>>>
>>>>>> In the future we can have something like:
>>>>>>
>>>>>> #define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
>>>>>> PCI_ID_F_MY_BUS_DRIVER_OVERRIDE)
>>>>>>
>>>>>> file2alias.c still has to use the exact PCI_ID_F_VFIO_DRIVER_OVERRIDE
>>>>>> flag to add the "vfio_" prefix.
>>>>>>
>>>>>> Is that better ?
>>>>> I don't think it's worth having two separate #defines.  If we need
>>>>> more in the future, we can add them when we need them.
>>>> I meant 1 #define and 1 enum:
>>>>
>>>> enum {
>>>>       PCI_ID_F_VFIO_DRIVER_OVERRIDE    = 1 << 0,
>>>> };
>>>>
>>>> #define PCI_ID_DRIVER_OVERRIDE PCI_ID_F_VFIO_DRIVER_OVERRIDE
>>> Basically the same thing.  Doesn't seem worthwhile to me to have both.
>>> When reading the code, it's not at all obvious why you would define a
>>> new name for PCI_ID_F_VFIO_DRIVER_OVERRIDE.
>> because we need the "vfio_" prefix in the alias.
>>
>> And the match can use PCI_ID_DRIVER_OVERRIDE, which in the future can be
>> #define PCI_ID_DRIVER_OVERRIDE (PCI_ID_F_VFIO_DRIVER_OVERRIDE |
>> PCI_ID_F_SOME_OTHER_ALIAS_DRIVER_OVERRIDE)
> Read this again:
> https://lore.kernel.org/r/20210813174459.GA2594783@bjorn-Precision-5520
>
> That gives you a "vfio_" prefix without the unnecessary VFIO
> connection in pci_match_device.

I see.

So I guess the following code should be fine:


static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
                             struct pci_dev *dev)
{
     struct pci_dynid *dynid;
     const struct pci_device_id *found_id = NULL, *ids;

     /* When driver_override is set, only bind to the matching driver */
     if (dev->driver_override && strcmp(dev->driver_override, drv->name))
         return NULL;

     /* Look at the dynamic ids first, before the static ones */
     spin_lock(&drv->dynids.lock);
     list_for_each_entry(dynid, &drv->dynids.list, node) {
         if (pci_match_one_device(&dynid->id, dev)) {
             found_id = &dynid->id;
             break;
         }
     }
     spin_unlock(&drv->dynids.lock);

     if (found_id)
         return found_id;

     for (ids = drv->id_table; (found_id = pci_match_id(ids, dev));
          ids = found_id + 1) {
         /*
          * The match table is split based on driver_override.
          * An entry with override_only set matches only when
          * driver_override is set for this device.
          */
         if (!found_id->override_only || dev->driver_override)
             return found_id;
     }

     /*
      * No static id was usable. A driver_override naming this driver
      * always matches, so return a dummy id.
      */
     if (dev->driver_override)
         return &pci_device_id_any;
     return NULL;
}
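
To see how the override_only check behaves, the static-table part of the
function above can be modelled in userspace. This is a minimal sketch
under stated assumptions: the model_* structures are hypothetical,
trimmed-down stand-ins for the kernel ones, the dynamic id list and
locking are left out, and ids are compared on vendor and device only.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct model_id {
    unsigned int vendor;        /* 0 terminates the table */
    unsigned int device;
    unsigned int override_only; /* non-zero: needs driver_override */
};

struct model_dev {
    unsigned int vendor;
    unsigned int device;
    const char *driver_override; /* NULL when not set */
};

struct model_drv {
    const char *name;
    const struct model_id *id_table;
};

/* Dummy id returned when driver_override alone decides the match. */
static const struct model_id model_id_any = { 0xffff, 0xffff, 0 };

/* Return the first entry at or after ids that matches dev, or NULL. */
static const struct model_id *model_match_id(const struct model_id *ids,
                                             const struct model_dev *dev)
{
    for (; ids && ids->vendor; ids++)
        if (ids->vendor == dev->vendor && ids->device == dev->device)
            return ids;
    return NULL;
}

static const struct model_id *model_match_device(const struct model_drv *drv,
                                                 const struct model_dev *dev)
{
    const struct model_id *found, *ids;

    assert(drv != NULL && dev != NULL);

    /* When driver_override is set, only the named driver may bind. */
    if (dev->driver_override && strcmp(dev->driver_override, drv->name))
        return NULL;

    /* Keep scanning past override_only entries that are not usable. */
    for (ids = drv->id_table; (found = model_match_id(ids, dev));
         ids = found + 1) {
        if (!found->override_only || dev->driver_override)
            return found;
    }

    /* No usable static entry: driver_override still forces a match. */
    if (dev->driver_override)
        return &model_id_any;
    return NULL;
}
```

In this model a device that matches only an override_only entry gets no
match until driver_override names the driver, while ordinary entries
match as before.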





Thread overview: 55+ messages
2021-07-21 16:15 [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
2021-07-21 16:15 ` [PATCH 01/12] vfio/pci: Rename vfio_pci.c to vfio_pci_core.c Yishai Hadas
2021-07-21 16:15 ` [PATCH 02/12] vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h Yishai Hadas
2021-07-21 16:16 ` [PATCH 03/12] vfio/pci: Rename vfio_pci_device to vfio_pci_core_device Yishai Hadas
2021-07-21 16:16 ` [PATCH 04/12] vfio/pci: Rename ops functions to fit core namings Yishai Hadas
2021-07-21 16:16 ` [PATCH 05/12] vfio/pci: Include vfio header in vfio_pci_core.h Yishai Hadas
2021-07-21 16:16 ` [PATCH 06/12] vfio/pci: Split the pci_driver code out of vfio_pci_core.c Yishai Hadas
2021-07-21 16:16 ` [PATCH 07/12] vfio/pci: Move igd initialization to vfio_pci.c Yishai Hadas
2021-07-21 16:16 ` [PATCH 08/12] vfio/pci: Move module parameters " Yishai Hadas
2021-07-21 16:16 ` [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id Yishai Hadas
2021-07-27 16:34   ` Alex Williamson
2021-07-27 17:14     ` Jason Gunthorpe
2021-07-27 23:02       ` Alex Williamson
2021-07-27 23:42         ` Jason Gunthorpe
2021-08-04 20:34   ` Bjorn Helgaas
2021-08-05 16:47     ` Max Gurtovoy
2021-08-06  0:23     ` Jason Gunthorpe
2021-08-11 12:22       ` Max Gurtovoy
2021-08-11 19:07       ` Bjorn Helgaas
2021-08-12 13:27         ` Jason Gunthorpe
2021-08-12 15:57           ` Bjorn Helgaas
2021-08-12 19:51             ` Jason Gunthorpe
2021-08-12 20:26               ` Bjorn Helgaas
2021-08-12 23:21                 ` Max Gurtovoy
2021-08-13 17:44                   ` Bjorn Helgaas
2021-08-14 23:27                     ` Max Gurtovoy
2021-08-16 17:21                       ` Bjorn Helgaas
2021-08-17 13:01                         ` Max Gurtovoy
2021-08-17 14:13                           ` Bjorn Helgaas
2021-08-17 14:44                             ` Max Gurtovoy
2021-08-12 15:42   ` Bjorn Helgaas
2021-07-21 16:16 ` [PATCH 10/12] vfio: Use select for eventfd Yishai Hadas
2021-07-21 16:16 ` [PATCH 11/12] vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on' Yishai Hadas
2021-07-21 16:16 ` [PATCH 12/12] vfio/pci: Introduce vfio_pci_core.ko Yishai Hadas
2021-07-21 17:39   ` Leon Romanovsky
2021-07-22  9:06     ` Yishai Hadas
2021-07-22  9:22       ` Max Gurtovoy
2021-07-23 14:13         ` Leon Romanovsky
2021-07-25 10:45           ` Max Gurtovoy
2021-07-27 21:54   ` Alex Williamson
2021-07-27 23:09     ` Jason Gunthorpe
2021-07-28  4:56       ` Leon Romanovsky
2021-07-28  5:43       ` Christoph Hellwig
2021-07-28  7:04         ` Arnd Bergmann
2021-07-28  7:17           ` Leon Romanovsky
2021-07-28 12:03         ` Jason Gunthorpe
2021-07-28 12:12           ` Arnd Bergmann
2021-07-28 12:29           ` Christoph Hellwig
2021-07-28 12:47             ` Jason Gunthorpe
2021-07-28 12:55               ` Christoph Hellwig
2021-07-28 13:31                 ` Jason Gunthorpe
2021-07-28 13:08               ` Arnd Bergmann
2021-07-28 17:26                 ` Jason Gunthorpe
2021-08-04 13:41 ` [PATCH 00/12] Introduce vfio_pci_core subsystem Yishai Hadas
2021-08-04 15:27   ` Alex Williamson
