* [RFC PATCH V2 00/18]  vfio/pci: Back guest interrupts from Interrupt Message Store (IMS)
@ 2023-10-06 16:40 Reinette Chatre
  2023-10-06 16:40 ` [RFC PATCH V2 01/18] PCI/MSI: Provide stubs for IMS functions Reinette Chatre
                   ` (17 more replies)
  0 siblings, 18 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:40 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

Changes since RFC V1:
- RFC V1: https://lore.kernel.org/lkml/cover.1692892275.git.reinette.chatre@intel.com/
- This is a complete rewrite based on feedback from Jason and Kevin.
  The main change is that IMS becomes a new backend for MSI-X
  emulation: VFIO PCI becomes an interrupt management frontend, with
  the existing interrupt management for PCI passthrough devices as
  one backend and IMS interrupt management introduced as a new
  backend.
  The first part of the series splits VFIO PCI interrupt management
  into a "frontend" and "backend", with the existing PCI interrupt
  management as the first backend. The second part of the series
  adds IMS interrupt management as a new backend.
  This is a significant change from RFC V1, both in approach and in
  its impact on existing VFIO PCI. It was done in response to
  feedback that I hope I understood as intended. If I did not get it
  right, please do point out where I went astray and I'd be happy
  to rewrite. Of course, suggestions for improvement will be much
  appreciated.

Hi Everybody,

With Interrupt Message Store (IMS) support introduced in
commit 0194425af0c8 ("PCI/MSI: Provide IMS (Interrupt Message Store)
support") a device can create a secondary interrupt domain that works
side by side with MSI-X on the same device. IMS allows for
implementation-specific interrupt storage that is managed by the
implementation-specific interrupt chip associated with the IMS domain
when the domain is created for the device via pci_create_ims_domain().

An example usage of IMS is for devices that can have their resources
assigned to guests with varying granularity. For example, an
accelerator device may support many workqueues and a single workqueue
can be composed into a virtual device for use by a guest. Using
IMS interrupts for the guest preserves MSI-X for host usage while
allowing a significantly larger number of interrupt vectors than
MSI-X allows, all while enabling use of the same device driver
within the host and guest.

This series introduces IMS support to VFIO PCI for use by
virtual devices that support MSI-X interrupts that are backed by IMS
interrupts on the host. Specifically, when the virtual device's
VFIO_DEVICE_SET_IRQS ioctl() receives a "trigger interrupt"
(VFIO_IRQ_SET_ACTION_TRIGGER) request for an MSI-X index, VFIO PCI IMS
allocates or frees an IMS interrupt on the host.

VFIO PCI assumes that it is managing interrupts of a passthrough PCI
device. VFIO PCI is split into a "frontend" and "backend" to support
interrupt management for virtual devices that are not passthrough PCI
devices. The VFIO PCI frontend directs guest requests to the
appropriate backend. Existing interrupt management for passthrough PCI
devices is the first backend; guest MSI-X interrupts backed by IMS
interrupts on the host are handled by the new backend (VFIO PCI IMS).

An IMS interrupt is allocated via pci_ims_alloc_irq(), which requires
an implementation-specific cookie that is opaque to VFIO PCI IMS. This
can be a PASID, queue ID, pointer, etc. During initialization
VFIO PCI IMS learns which PCI device to operate on and what the
default cookie should be for any new interrupt allocation. VFIO PCI
IMS can also associate a unique cookie with each vector; to maintain
this association the backend keeps interrupt contexts for the virtual
device's lifetime.
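
To illustrate, here is a hypothetical sketch (not code from this
series) of backing one guest MSI-X vector with a host IMS interrupt;
the cookie content (queue_id) and the handler name are made up:

  union msi_instance_cookie icookie = { .value = queue_id };
  struct msi_map map;
  int ret;

  /* Allocate a host IMS interrupt; the cookie is opaque to VFIO PCI IMS. */
  map = pci_ims_alloc_irq(pdev, &icookie, NULL);
  if (map.index < 0)
          return map.index;

  /* my_ims_handler() would forward the interrupt to the guest's vector. */
  ret = request_irq(map.virq, my_ims_handler, 0, "vfio-ims", trigger_ctx);
  if (ret) {
          pci_ims_free_irq(pdev, map);
          return ret;
  }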

Guests may access a virtual device via both 'direct-path', where the
guest interacts directly with the underlying hardware, and 'intercepted
path', where the virtual device emulates operations. VFIO PCI
supports emulated interrupts (better naming suggestions are welcome) to
handle 'intercepted path' operations where completion interrupts are
signaled from the virtual device, not the underlying hardware. Backend
support is required for emulated interrupts and only the VFIO PCI IMS
backend supports them in this series.
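
For an emulated interrupt the completion is signaled by the virtual
device driver itself. A hypothetical sketch (helper and struct names
invented for illustration):

  /*
   * Completion of an 'intercepted path' operation: the virtual device
   * driver, not the hardware, signals the guest's MSI-X vector by
   * kicking the eventfd the guest registered for that vector.
   */
  static void my_vdev_signal_completion(struct my_vdev *mvdev, unsigned int vector)
  {
          struct eventfd_ctx *trigger;

          trigger = my_vdev_get_trigger(mvdev, vector);   /* hypothetical lookup */
          if (trigger)
                  eventfd_signal(trigger, 1);
  }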

This has been tested with a yet-to-be-published VFIO driver for the
Intel Data Accelerators (IDXD) present in Intel Xeon CPUs.

While this series contains a working implementation, it is presented
as an RFC with the goal of obtaining feedback on whether VFIO PCI IMS
is appropriate for inclusion into VFIO and whether it is
(or could be adapted to be) appropriate for supporting other
planned IMS usages you may be aware of.

Any feedback will be greatly appreciated.

Reinette

Reinette Chatre (18):
  PCI/MSI: Provide stubs for IMS functions
  vfio/pci: Move PCI specific check from wrapper to PCI function
  vfio/pci: Use unsigned int instead of unsigned
  vfio/pci: Make core interrupt callbacks accessible to all virtual
    devices
  vfio/pci: Split PCI interrupt management into front and backend
  vfio/pci: Separate MSI and MSI-X handling
  vfio/pci: Move interrupt eventfd to interrupt context
  vfio/pci: Move mutex acquisition into function
  vfio/pci: Move interrupt contexts to generic interrupt struct
  vfio/pci: Move IRQ type to generic interrupt context
  vfio/pci: Split interrupt context initialization
  vfio/pci: Provide interrupt context to generic ops
  vfio/pci: Make vfio_pci_set_irqs_ioctl() available
  vfio/pci: Add core IMS support
  vfio/pci: Support emulated interrupts
  vfio/pci: Support emulated interrupts in IMS backend
  vfio/pci: Add accessor for IMS index
  vfio/pci: Support IMS cookie modification

 drivers/vfio/pci/vfio_pci_config.c |   2 +-
 drivers/vfio/pci/vfio_pci_core.c   |  50 +--
 drivers/vfio/pci/vfio_pci_intrs.c  | 658 ++++++++++++++++++++++++++---
 drivers/vfio/pci/vfio_pci_priv.h   |   2 +-
 include/linux/pci.h                |  31 +-
 include/linux/vfio_pci_core.h      |  70 ++-
 6 files changed, 706 insertions(+), 107 deletions(-)


base-commit: 8a749fd1a8720d4619c91c8b6e7528c0a355c0aa
-- 
2.34.1


* [RFC PATCH V2 01/18] PCI/MSI: Provide stubs for IMS functions
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
@ 2023-10-06 16:40 ` Reinette Chatre
  2023-10-06 16:40 ` [RFC PATCH V2 02/18] vfio/pci: Move PCI specific check from wrapper to PCI function Reinette Chatre
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:40 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

The IMS-related functions (pci_create_ims_domain(),
pci_ims_alloc_irq(), and pci_ims_free_irq()) are not declared
when CONFIG_PCI_MSI is disabled.

Provide stub definitions of these functions so that callers
can compile when CONFIG_PCI_MSI is disabled.

This is a preparatory patch for the first caller of these
functions (VFIO).

Fixes: 0194425af0c8 ("PCI/MSI: Provide IMS (Interrupt Message Store) support")
Fixes: c9e5bea27383 ("PCI/MSI: Provide pci_ims_alloc/free_irq()")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org      # v6.2+
---
I plan to send this patch separately to PCI folks, pending the
outcome of this work.

 include/linux/pci.h | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8c7c2c3c6c65..68a52bc01864 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1624,6 +1624,8 @@ struct msix_entry {
 	u16	entry;	/* Driver uses to specify entry, OS writes */
 };
 
+struct msi_domain_template;
+
 #ifdef CONFIG_PCI_MSI
 int pci_msi_vec_count(struct pci_dev *dev);
 void pci_disable_msi(struct pci_dev *dev);
@@ -1656,6 +1658,11 @@ void pci_msix_free_irq(struct pci_dev *pdev, struct msi_map map);
 void pci_free_irq_vectors(struct pci_dev *dev);
 int pci_irq_vector(struct pci_dev *dev, unsigned int nr);
 const struct cpumask *pci_irq_get_affinity(struct pci_dev *pdev, int vec);
+bool pci_create_ims_domain(struct pci_dev *pdev, const struct msi_domain_template *template,
+			   unsigned int hwsize, void *data);
+struct msi_map pci_ims_alloc_irq(struct pci_dev *pdev, union msi_instance_cookie *icookie,
+				 const struct irq_affinity_desc *affdesc);
+void pci_ims_free_irq(struct pci_dev *pdev, struct msi_map map);
 
 #else
 static inline int pci_msi_vec_count(struct pci_dev *dev) { return -ENOSYS; }
@@ -1719,6 +1726,22 @@ static inline const struct cpumask *pci_irq_get_affinity(struct pci_dev *pdev,
 {
 	return cpu_possible_mask;
 }
+static inline bool pci_create_ims_domain(struct pci_dev *pdev,
+					 const struct msi_domain_template *template,
+					 unsigned int hwsize, void *data)
+{ return false; }
+static inline struct msi_map pci_ims_alloc_irq(struct pci_dev *pdev,
+					       union msi_instance_cookie *icookie,
+					       const struct irq_affinity_desc *affdesc)
+{
+	struct msi_map map = { .index = -ENOSYS, };
+
+	return map;
+}
+static inline void pci_ims_free_irq(struct pci_dev *pdev, struct msi_map map)
+{
+}
+
 #endif
 
 /**
@@ -2616,14 +2639,6 @@ static inline bool pci_is_thunderbolt_attached(struct pci_dev *pdev)
 void pci_uevent_ers(struct pci_dev *pdev, enum  pci_ers_result err_type);
 #endif
 
-struct msi_domain_template;
-
-bool pci_create_ims_domain(struct pci_dev *pdev, const struct msi_domain_template *template,
-			   unsigned int hwsize, void *data);
-struct msi_map pci_ims_alloc_irq(struct pci_dev *pdev, union msi_instance_cookie *icookie,
-				 const struct irq_affinity_desc *affdesc);
-void pci_ims_free_irq(struct pci_dev *pdev, struct msi_map map);
-
 #include <linux/dma-mapping.h>
 
 #define pci_printk(level, pdev, fmt, arg...) \
-- 
2.34.1


* [RFC PATCH V2 02/18] vfio/pci: Move PCI specific check from wrapper to PCI function
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
  2023-10-06 16:40 ` [RFC PATCH V2 01/18] PCI/MSI: Provide stubs for IMS functions Reinette Chatre
@ 2023-10-06 16:40 ` Reinette Chatre
  2023-10-06 16:40 ` [RFC PATCH V2 03/18] vfio/pci: Use unsigned int instead of unsigned Reinette Chatre
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:40 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

vfio_pci_set_irqs_ioctl() uses a PCI device-specific check to
determine if the PCI-specific vfio_pci_set_err_trigger() should be
called.

Move the PCI device-specific check into the PCI-specific
vfio_pci_set_err_trigger() to make it easier for
vfio_pci_set_irqs_ioctl() to become a frontend for interrupt
backends of both PCI devices and virtual devices.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index cbb4bcbfbf83..b5b1c09bef25 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -758,6 +758,9 @@ static int vfio_pci_set_err_trigger(struct vfio_pci_core_device *vdev,
 				    unsigned index, unsigned start,
 				    unsigned count, uint32_t flags, void *data)
 {
+	if (!pci_is_pcie(vdev->pdev))
+		return -ENOTTY;
+
 	if (index != VFIO_PCI_ERR_IRQ_INDEX || start != 0 || count > 1)
 		return -EINVAL;
 
@@ -813,8 +816,7 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
 	case VFIO_PCI_ERR_IRQ_INDEX:
 		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
 		case VFIO_IRQ_SET_ACTION_TRIGGER:
-			if (pci_is_pcie(vdev->pdev))
-				func = vfio_pci_set_err_trigger;
+			func = vfio_pci_set_err_trigger;
 			break;
 		}
 		break;
-- 
2.34.1


* [RFC PATCH V2 03/18] vfio/pci: Use unsigned int instead of unsigned
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
  2023-10-06 16:40 ` [RFC PATCH V2 01/18] PCI/MSI: Provide stubs for IMS functions Reinette Chatre
  2023-10-06 16:40 ` [RFC PATCH V2 02/18] vfio/pci: Move PCI specific check from wrapper to PCI function Reinette Chatre
@ 2023-10-06 16:40 ` Reinette Chatre
  2023-10-06 16:40 ` [RFC PATCH V2 04/18] vfio/pci: Make core interrupt callbacks accessible to all virtual devices Reinette Chatre
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:40 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

checkpatch.pl warns about usage of bare "unsigned".

Change "unsigned" to "unsigned int" as a preparatory change
to avoid checkpatch.pl producing several warnings as
the work adding support for backends to VFIO interrupt
management progresses.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 37 ++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index b5b1c09bef25..c49588c8f4a3 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -553,8 +553,9 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
  * IOCTL support
  */
 static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev,
-				    unsigned index, unsigned start,
-				    unsigned count, uint32_t flags, void *data)
+				    unsigned int index, unsigned int start,
+				    unsigned int count, uint32_t flags,
+				    void *data)
 {
 	if (!is_intx(vdev) || start != 0 || count != 1)
 		return -EINVAL;
@@ -584,8 +585,8 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev,
 }
 
 static int vfio_pci_set_intx_mask(struct vfio_pci_core_device *vdev,
-				  unsigned index, unsigned start,
-				  unsigned count, uint32_t flags, void *data)
+				  unsigned int index, unsigned int start,
+				  unsigned int count, uint32_t flags, void *data)
 {
 	if (!is_intx(vdev) || start != 0 || count != 1)
 		return -EINVAL;
@@ -604,8 +605,9 @@ static int vfio_pci_set_intx_mask(struct vfio_pci_core_device *vdev,
 }
 
 static int vfio_pci_set_intx_trigger(struct vfio_pci_core_device *vdev,
-				     unsigned index, unsigned start,
-				     unsigned count, uint32_t flags, void *data)
+				     unsigned int index, unsigned int start,
+				     unsigned int count, uint32_t flags,
+				     void *data)
 {
 	if (is_intx(vdev) && !count && (flags & VFIO_IRQ_SET_DATA_NONE)) {
 		vfio_intx_disable(vdev);
@@ -647,8 +649,9 @@ static int vfio_pci_set_intx_trigger(struct vfio_pci_core_device *vdev,
 }
 
 static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
-				    unsigned index, unsigned start,
-				    unsigned count, uint32_t flags, void *data)
+				    unsigned int index, unsigned int start,
+				    unsigned int count, uint32_t flags,
+				    void *data)
 {
 	struct vfio_pci_irq_ctx *ctx;
 	unsigned int i;
@@ -755,8 +758,9 @@ static int vfio_pci_set_ctx_trigger_single(struct eventfd_ctx **ctx,
 }
 
 static int vfio_pci_set_err_trigger(struct vfio_pci_core_device *vdev,
-				    unsigned index, unsigned start,
-				    unsigned count, uint32_t flags, void *data)
+				    unsigned int index, unsigned int start,
+				    unsigned int count, uint32_t flags,
+				    void *data)
 {
 	if (!pci_is_pcie(vdev->pdev))
 		return -ENOTTY;
@@ -769,8 +773,9 @@ static int vfio_pci_set_err_trigger(struct vfio_pci_core_device *vdev,
 }
 
 static int vfio_pci_set_req_trigger(struct vfio_pci_core_device *vdev,
-				    unsigned index, unsigned start,
-				    unsigned count, uint32_t flags, void *data)
+				    unsigned int index, unsigned int start,
+				    unsigned int count, uint32_t flags,
+				    void *data)
 {
 	if (index != VFIO_PCI_REQ_IRQ_INDEX || start != 0 || count > 1)
 		return -EINVAL;
@@ -780,11 +785,11 @@ static int vfio_pci_set_req_trigger(struct vfio_pci_core_device *vdev,
 }
 
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
-			    unsigned index, unsigned start, unsigned count,
-			    void *data)
+			    unsigned int index, unsigned int start,
+			    unsigned int count, void *data)
 {
-	int (*func)(struct vfio_pci_core_device *vdev, unsigned index,
-		    unsigned start, unsigned count, uint32_t flags,
+	int (*func)(struct vfio_pci_core_device *vdev, unsigned int index,
+		    unsigned int start, unsigned int count, uint32_t flags,
 		    void *data) = NULL;
 
 	switch (index) {
-- 
2.34.1


* [RFC PATCH V2 04/18] vfio/pci: Make core interrupt callbacks accessible to all virtual devices
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (2 preceding siblings ...)
  2023-10-06 16:40 ` [RFC PATCH V2 03/18] vfio/pci: Use unsigned int instead of unsigned Reinette Chatre
@ 2023-10-06 16:40 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 05/18] vfio/pci: Split PCI interrupt management into front and backend Reinette Chatre
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:40 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

The functions handling actions on interrupts for a virtual PCI device
triggered by the VFIO_DEVICE_SET_IRQS ioctl() expect to act on a passthrough
PCI device represented by a struct vfio_pci_core_device.

A virtual device can support MSI-X while not being a passthrough PCI
device and thus not be represented by a struct vfio_pci_core_device.

To support MSI-X in all virtual devices it needs to be possible for
their drivers to interact with the MSI-X interrupt management and
thus the interrupt management should not require struct
vfio_pci_core_device.

Introduce struct vfio_pci_intr_ctx that will contain the interrupt
context of a virtual device that can be managed by different backends.
The first supported backend is the existing PCI interrupt management.
The core VFIO PCI interrupt management functions are modified to expect
this structure. As the backend managing interrupts of passthrough PCI
devices, the existing VFIO PCI functions still need to operate on an
actual PCI device represented by struct vfio_pci_core_device that is
provided via a private pointer.

More members are added to struct vfio_pci_intr_ctx in later patches
as members unique to the interrupt context are moved from struct
vfio_pci_core_device.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_core.c  |  7 ++++---
 drivers/vfio/pci/vfio_pci_intrs.c | 29 ++++++++++++++++++++---------
 drivers/vfio/pci/vfio_pci_priv.h  |  2 +-
 include/linux/vfio_pci_core.h     |  9 +++++++++
 4 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 1929103ee59a..bb8181444c41 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -594,7 +594,7 @@ void vfio_pci_core_disable(struct vfio_pci_core_device *vdev)
 	/* Stop the device from further DMA */
 	pci_clear_master(pdev);
 
-	vfio_pci_set_irqs_ioctl(vdev, VFIO_IRQ_SET_DATA_NONE |
+	vfio_pci_set_irqs_ioctl(&vdev->intr_ctx, VFIO_IRQ_SET_DATA_NONE |
 				VFIO_IRQ_SET_ACTION_TRIGGER,
 				vdev->irq_type, 0, 0, NULL);
 
@@ -1216,8 +1216,8 @@ static int vfio_pci_ioctl_set_irqs(struct vfio_pci_core_device *vdev,
 
 	mutex_lock(&vdev->igate);
 
-	ret = vfio_pci_set_irqs_ioctl(vdev, hdr.flags, hdr.index, hdr.start,
-				      hdr.count, data);
+	ret = vfio_pci_set_irqs_ioctl(&vdev->intr_ctx, hdr.flags, hdr.index,
+				      hdr.start, hdr.count, data);
 
 	mutex_unlock(&vdev->igate);
 	kfree(data);
@@ -2166,6 +2166,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
 	INIT_LIST_HEAD(&vdev->sriov_pfs_item);
 	init_rwsem(&vdev->memory_lock);
 	xa_init(&vdev->ctx);
+	vdev->intr_ctx.priv = vdev;
 
 	return 0;
 }
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index c49588c8f4a3..6d09a82def87 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -552,11 +552,13 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 /*
  * IOCTL support
  */
-static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev,
+static int vfio_pci_set_intx_unmask(struct vfio_pci_intr_ctx *intr_ctx,
 				    unsigned int index, unsigned int start,
 				    unsigned int count, uint32_t flags,
 				    void *data)
 {
+	struct vfio_pci_core_device *vdev = intr_ctx->priv;
+
 	if (!is_intx(vdev) || start != 0 || count != 1)
 		return -EINVAL;
 
@@ -584,10 +586,12 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev,
 	return 0;
 }
 
-static int vfio_pci_set_intx_mask(struct vfio_pci_core_device *vdev,
+static int vfio_pci_set_intx_mask(struct vfio_pci_intr_ctx *intr_ctx,
 				  unsigned int index, unsigned int start,
 				  unsigned int count, uint32_t flags, void *data)
 {
+	struct vfio_pci_core_device *vdev = intr_ctx->priv;
+
 	if (!is_intx(vdev) || start != 0 || count != 1)
 		return -EINVAL;
 
@@ -604,11 +608,13 @@ static int vfio_pci_set_intx_mask(struct vfio_pci_core_device *vdev,
 	return 0;
 }
 
-static int vfio_pci_set_intx_trigger(struct vfio_pci_core_device *vdev,
+static int vfio_pci_set_intx_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 				     unsigned int index, unsigned int start,
 				     unsigned int count, uint32_t flags,
 				     void *data)
 {
+	struct vfio_pci_core_device *vdev = intr_ctx->priv;
+
 	if (is_intx(vdev) && !count && (flags & VFIO_IRQ_SET_DATA_NONE)) {
 		vfio_intx_disable(vdev);
 		return 0;
@@ -648,11 +654,12 @@ static int vfio_pci_set_intx_trigger(struct vfio_pci_core_device *vdev,
 	return 0;
 }
 
-static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
+static int vfio_pci_set_msi_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 				    unsigned int index, unsigned int start,
 				    unsigned int count, uint32_t flags,
 				    void *data)
 {
+	struct vfio_pci_core_device *vdev = intr_ctx->priv;
 	struct vfio_pci_irq_ctx *ctx;
 	unsigned int i;
 	bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false;
@@ -757,11 +764,13 @@ static int vfio_pci_set_ctx_trigger_single(struct eventfd_ctx **ctx,
 	return -EINVAL;
 }
 
-static int vfio_pci_set_err_trigger(struct vfio_pci_core_device *vdev,
+static int vfio_pci_set_err_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 				    unsigned int index, unsigned int start,
 				    unsigned int count, uint32_t flags,
 				    void *data)
 {
+	struct vfio_pci_core_device *vdev = intr_ctx->priv;
+
 	if (!pci_is_pcie(vdev->pdev))
 		return -ENOTTY;
 
@@ -772,11 +781,13 @@ static int vfio_pci_set_err_trigger(struct vfio_pci_core_device *vdev,
 					       count, flags, data);
 }
 
-static int vfio_pci_set_req_trigger(struct vfio_pci_core_device *vdev,
+static int vfio_pci_set_req_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 				    unsigned int index, unsigned int start,
 				    unsigned int count, uint32_t flags,
 				    void *data)
 {
+	struct vfio_pci_core_device *vdev = intr_ctx->priv;
+
 	if (index != VFIO_PCI_REQ_IRQ_INDEX || start != 0 || count > 1)
 		return -EINVAL;
 
@@ -784,11 +795,11 @@ static int vfio_pci_set_req_trigger(struct vfio_pci_core_device *vdev,
 					       count, flags, data);
 }
 
-int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
+int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned int index, unsigned int start,
 			    unsigned int count, void *data)
 {
-	int (*func)(struct vfio_pci_core_device *vdev, unsigned int index,
+	int (*func)(struct vfio_pci_intr_ctx *intr_ctx, unsigned int index,
 		    unsigned int start, unsigned int count, uint32_t flags,
 		    void *data) = NULL;
 
@@ -837,5 +848,5 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
 	if (!func)
 		return -ENOTTY;
 
-	return func(vdev, index, start, count, flags, data);
+	return func(intr_ctx, index, start, count, flags, data);
 }
diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
index 5e4fa69aee16..6dddcfe7ab19 100644
--- a/drivers/vfio/pci/vfio_pci_priv.h
+++ b/drivers/vfio/pci/vfio_pci_priv.h
@@ -26,7 +26,7 @@ struct vfio_pci_ioeventfd {
 bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev);
 void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev);
 
-int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
+int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned index, unsigned start, unsigned count,
 			    void *data);
 
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 562e8754869d..66bde5a60be7 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -49,6 +49,14 @@ struct vfio_pci_region {
 	u32				flags;
 };
 
+/*
+ * Interrupt context of virtual PCI device
+ * @priv:		Private data
+ */
+struct vfio_pci_intr_ctx {
+	void				*priv;
+};
+
 struct vfio_pci_core_device {
 	struct vfio_device	vdev;
 	struct pci_dev		*pdev;
@@ -96,6 +104,7 @@ struct vfio_pci_core_device {
 	struct mutex		vma_lock;
 	struct list_head	vma_list;
 	struct rw_semaphore	memory_lock;
+	struct vfio_pci_intr_ctx	intr_ctx;
 };
 
 /* Will be exported for vfio pci drivers usage */
-- 
2.34.1


* [RFC PATCH V2 05/18] vfio/pci: Split PCI interrupt management into front and backend
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (3 preceding siblings ...)
  2023-10-06 16:40 ` [RFC PATCH V2 04/18] vfio/pci: Make core interrupt callbacks accessible to all virtual devices Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 06/18] vfio/pci: Separate MSI and MSI-X handling Reinette Chatre
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

VFIO PCI interrupt management supports passthrough PCI devices
with an interrupt in the guest backed by the same type of
interrupt on the actual PCI device.

PCI interrupt management can be more flexible. An interrupt in
the guest may be backed by a different type of interrupt on the
host; for example, MSI-X in the guest can be backed by IMS on the host,
or not backed by a device at all when it needs to be emulated in
the virtual device driver.

The main entry point to guest interrupt management is the
VFIO_DEVICE_SET_IRQS ioctl(). By default the work is
passed to interrupt management for PCI devices, with the
PCI-specific functions called directly.

Make the ioctl() handling configurable to support different
interrupt management backends. This is accomplished
by introducing interrupt context specific callbacks that
are initialized by the virtual device driver and then
triggered via the ioctl().

The introduction of virtual device driver specific callbacks
requires their initialization. Create a dedicated interrupt context
initialization function to avoid mixing more interrupt
context initialization with general virtual device driver
initialization.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_core.c  |  2 +-
 drivers/vfio/pci/vfio_pci_intrs.c | 35 +++++++++++++++++++++++++------
 include/linux/vfio_pci_core.h     | 25 ++++++++++++++++++++++
 3 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index bb8181444c41..310259bbacae 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2166,7 +2166,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
 	INIT_LIST_HEAD(&vdev->sriov_pfs_item);
 	init_rwsem(&vdev->memory_lock);
 	xa_init(&vdev->ctx);
-	vdev->intr_ctx.priv = vdev;
+	vfio_pci_init_intr_ctx(vdev, &vdev->intr_ctx);
 
 	return 0;
 }
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 6d09a82def87..e2d39b7561b8 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -795,6 +795,23 @@ static int vfio_pci_set_req_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 					       count, flags, data);
 }
 
+struct vfio_pci_intr_ops vfio_pci_intr_ops = {
+	.set_intx_mask = vfio_pci_set_intx_mask,
+	.set_intx_unmask = vfio_pci_set_intx_unmask,
+	.set_intx_trigger = vfio_pci_set_intx_trigger,
+	.set_msi_trigger = vfio_pci_set_msi_trigger,
+	.set_err_trigger = vfio_pci_set_err_trigger,
+	.set_req_trigger = vfio_pci_set_req_trigger,
+};
+
+void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
+			    struct vfio_pci_intr_ctx *intr_ctx)
+{
+	intr_ctx->ops = &vfio_pci_intr_ops;
+	intr_ctx->priv = vdev;
+}
+EXPORT_SYMBOL_GPL(vfio_pci_init_intr_ctx);
+
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned int index, unsigned int start,
 			    unsigned int count, void *data)
@@ -807,13 +824,16 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 	case VFIO_PCI_INTX_IRQ_INDEX:
 		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
 		case VFIO_IRQ_SET_ACTION_MASK:
-			func = vfio_pci_set_intx_mask;
+			if (intr_ctx->ops->set_intx_mask)
+				func = intr_ctx->ops->set_intx_mask;
 			break;
 		case VFIO_IRQ_SET_ACTION_UNMASK:
-			func = vfio_pci_set_intx_unmask;
+			if (intr_ctx->ops->set_intx_unmask)
+				func = intr_ctx->ops->set_intx_unmask;
 			break;
 		case VFIO_IRQ_SET_ACTION_TRIGGER:
-			func = vfio_pci_set_intx_trigger;
+			if (intr_ctx->ops->set_intx_trigger)
+				func = intr_ctx->ops->set_intx_trigger;
 			break;
 		}
 		break;
@@ -825,21 +845,24 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			/* XXX Need masking support exported */
 			break;
 		case VFIO_IRQ_SET_ACTION_TRIGGER:
-			func = vfio_pci_set_msi_trigger;
+			if (intr_ctx->ops->set_msi_trigger)
+				func = intr_ctx->ops->set_msi_trigger;
 			break;
 		}
 		break;
 	case VFIO_PCI_ERR_IRQ_INDEX:
 		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
 		case VFIO_IRQ_SET_ACTION_TRIGGER:
-			func = vfio_pci_set_err_trigger;
+			if (intr_ctx->ops->set_err_trigger)
+				func = intr_ctx->ops->set_err_trigger;
 			break;
 		}
 		break;
 	case VFIO_PCI_REQ_IRQ_INDEX:
 		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
 		case VFIO_IRQ_SET_ACTION_TRIGGER:
-			func = vfio_pci_set_req_trigger;
+			if (intr_ctx->ops->set_req_trigger)
+				func = intr_ctx->ops->set_req_trigger;
 			break;
 		}
 		break;
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 66bde5a60be7..aba69669ec25 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -51,12 +51,35 @@ struct vfio_pci_region {
 
 /*
  * Interrupt context of virtual PCI device
+ * @ops:		Callbacks triggered via VFIO_DEVICE_SET_IRQS ioctl()
  * @priv:		Private data
  */
 struct vfio_pci_intr_ctx {
+	const struct vfio_pci_intr_ops	*ops;
 	void				*priv;
 };
 
+struct vfio_pci_intr_ops {
+	int (*set_intx_mask)(struct vfio_pci_intr_ctx *intr_ctx,
+			     unsigned int index, unsigned int start,
+			     unsigned int count, uint32_t flags, void *data);
+	int (*set_intx_unmask)(struct vfio_pci_intr_ctx *intr_ctx,
+			       unsigned int index, unsigned int start,
+			       unsigned int count, uint32_t flags, void *data);
+	int (*set_intx_trigger)(struct vfio_pci_intr_ctx *intr_ctx,
+				unsigned int index, unsigned int start,
+				unsigned int count, uint32_t flags, void *data);
+	int (*set_msi_trigger)(struct vfio_pci_intr_ctx *intr_ctx,
+			       unsigned int index, unsigned int start,
+			       unsigned int count, uint32_t flags, void *data);
+	int (*set_err_trigger)(struct vfio_pci_intr_ctx *intr_ctx,
+			       unsigned int index, unsigned int start,
+			       unsigned int count, uint32_t flags, void *data);
+	int (*set_req_trigger)(struct vfio_pci_intr_ctx *intr_ctx,
+			       unsigned int index, unsigned int start,
+			       unsigned int count, uint32_t flags, void *data);
+};
+
 struct vfio_pci_core_device {
 	struct vfio_device	vdev;
 	struct pci_dev		*pdev;
@@ -124,6 +147,8 @@ int vfio_pci_core_sriov_configure(struct vfio_pci_core_device *vdev,
 				  int nr_virtfn);
 long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 		unsigned long arg);
+void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
+			    struct vfio_pci_intr_ctx *intr_ctx);
 int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 				void __user *arg, size_t argsz);
 ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
-- 
2.34.1


* [RFC PATCH V2 06/18] vfio/pci: Separate MSI and MSI-X handling
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (4 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 05/18] vfio/pci: Split PCI interrupt management into front and backend Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 07/18] vfio/pci: Move interrupt eventfd to interrupt context Reinette Chatre
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

VFIO PCI interrupt management uses a single entry point for both
MSI and MSI-X management, with the called functions using a boolean
when necessary to distinguish between MSI and MSI-X. This remains
unchanged.

Virtual device interrupt management should not be required to
use the same callback for both MSI and MSI-X. It may be possible
for a virtual device to not support MSI at all and only
provide MSI-X interrupt management.

Separate the MSI and MSI-X interrupt management by allowing
different callbacks for each interrupt type. For PCI devices
the callback remains the same.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 14 +++++++++++++-
 include/linux/vfio_pci_core.h     |  3 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index e2d39b7561b8..76ec5af3681a 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -800,6 +800,7 @@ struct vfio_pci_intr_ops vfio_pci_intr_ops = {
 	.set_intx_unmask = vfio_pci_set_intx_unmask,
 	.set_intx_trigger = vfio_pci_set_intx_trigger,
 	.set_msi_trigger = vfio_pci_set_msi_trigger,
+	.set_msix_trigger = vfio_pci_set_msi_trigger,
 	.set_err_trigger = vfio_pci_set_err_trigger,
 	.set_req_trigger = vfio_pci_set_req_trigger,
 };
@@ -838,7 +839,6 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 		}
 		break;
 	case VFIO_PCI_MSI_IRQ_INDEX:
-	case VFIO_PCI_MSIX_IRQ_INDEX:
 		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
 		case VFIO_IRQ_SET_ACTION_MASK:
 		case VFIO_IRQ_SET_ACTION_UNMASK:
@@ -850,6 +850,18 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			break;
 		}
 		break;
+	case VFIO_PCI_MSIX_IRQ_INDEX:
+		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+		case VFIO_IRQ_SET_ACTION_MASK:
+		case VFIO_IRQ_SET_ACTION_UNMASK:
+			/* XXX Need masking support exported */
+			break;
+		case VFIO_IRQ_SET_ACTION_TRIGGER:
+			if (intr_ctx->ops->set_msix_trigger)
+				func = intr_ctx->ops->set_msix_trigger;
+			break;
+		}
+		break;
 	case VFIO_PCI_ERR_IRQ_INDEX:
 		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
 		case VFIO_IRQ_SET_ACTION_TRIGGER:
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index aba69669ec25..e9bfca9a0c0a 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -72,6 +72,9 @@ struct vfio_pci_intr_ops {
 	int (*set_msi_trigger)(struct vfio_pci_intr_ctx *intr_ctx,
 			       unsigned int index, unsigned int start,
 			       unsigned int count, uint32_t flags, void *data);
+	int (*set_msix_trigger)(struct vfio_pci_intr_ctx *intr_ctx,
+				unsigned int index, unsigned int start,
+				unsigned int count, uint32_t flags, void *data);
 	int (*set_err_trigger)(struct vfio_pci_intr_ctx *intr_ctx,
 			       unsigned int index, unsigned int start,
 			       unsigned int count, uint32_t flags, void *data);
-- 
2.34.1


* [RFC PATCH V2 07/18] vfio/pci: Move interrupt eventfd to interrupt context
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (5 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 06/18] vfio/pci: Separate MSI and MSI-X handling Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 08/18] vfio/pci: Move mutex acquisition into function Reinette Chatre
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

The eventfds associated with device request notification and
error IRQ are managed by VFIO PCI interrupt management as
triggered by the VFIO_DEVICE_SET_IRQS ioctl().

Move these eventfds as well as their mutex to the generic and
dedicated interrupt management context struct vfio_pci_intr_ctx
to enable another interrupt management backend to manage these
eventfds.

The igate mutex protects eventfd modification. With the eventfds
within the bigger scoped interrupt context, the mutex scope is
also expanded to mean that all members of struct
vfio_pci_intr_ctx are protected by it.

With this move vfio_pci_set_req_trigger() no longer requires
a struct vfio_pci_core_device; it operates just on the generic
struct vfio_pci_intr_ctx and is thus available for direct use
by other interrupt management backends.

This move introduces the first interrupt context related
cleanup call, for which vfio_pci_release_intr_ctx() is
created to match the existing vfio_pci_init_intr_ctx().

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_core.c  | 39 +++++++++++++++----------------
 drivers/vfio/pci/vfio_pci_intrs.c | 13 +++++++----
 include/linux/vfio_pci_core.h     | 10 +++++---
 3 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 310259bbacae..5c9bd5d2db53 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -700,16 +700,16 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 #endif
 	vfio_pci_core_disable(vdev);
 
-	mutex_lock(&vdev->igate);
-	if (vdev->err_trigger) {
-		eventfd_ctx_put(vdev->err_trigger);
-		vdev->err_trigger = NULL;
+	mutex_lock(&vdev->intr_ctx.igate);
+	if (vdev->intr_ctx.err_trigger) {
+		eventfd_ctx_put(vdev->intr_ctx.err_trigger);
+		vdev->intr_ctx.err_trigger = NULL;
 	}
-	if (vdev->req_trigger) {
-		eventfd_ctx_put(vdev->req_trigger);
-		vdev->req_trigger = NULL;
+	if (vdev->intr_ctx.req_trigger) {
+		eventfd_ctx_put(vdev->intr_ctx.req_trigger);
+		vdev->intr_ctx.req_trigger = NULL;
 	}
-	mutex_unlock(&vdev->igate);
+	mutex_unlock(&vdev->intr_ctx.igate);
 }
 EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
 
@@ -1214,12 +1214,12 @@ static int vfio_pci_ioctl_set_irqs(struct vfio_pci_core_device *vdev,
 			return PTR_ERR(data);
 	}
 
-	mutex_lock(&vdev->igate);
+	mutex_lock(&vdev->intr_ctx.igate);
 
 	ret = vfio_pci_set_irqs_ioctl(&vdev->intr_ctx, hdr.flags, hdr.index,
 				      hdr.start, hdr.count, data);
 
-	mutex_unlock(&vdev->igate);
+	mutex_unlock(&vdev->intr_ctx.igate);
 	kfree(data);
 
 	return ret;
@@ -1876,20 +1876,20 @@ void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count)
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 	struct pci_dev *pdev = vdev->pdev;
 
-	mutex_lock(&vdev->igate);
+	mutex_lock(&vdev->intr_ctx.igate);
 
-	if (vdev->req_trigger) {
+	if (vdev->intr_ctx.req_trigger) {
 		if (!(count % 10))
 			pci_notice_ratelimited(pdev,
 				"Relaying device request to user (#%u)\n",
 				count);
-		eventfd_signal(vdev->req_trigger, 1);
+		eventfd_signal(vdev->intr_ctx.req_trigger, 1);
 	} else if (count == 0) {
 		pci_warn(pdev,
 			"No device request channel registered, blocked until released by user\n");
 	}
 
-	mutex_unlock(&vdev->igate);
+	mutex_unlock(&vdev->intr_ctx.igate);
 }
 EXPORT_SYMBOL_GPL(vfio_pci_core_request);
 
@@ -2156,7 +2156,6 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
 
 	vdev->pdev = to_pci_dev(core_vdev->dev);
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
-	mutex_init(&vdev->igate);
 	spin_lock_init(&vdev->irqlock);
 	mutex_init(&vdev->ioeventfds_lock);
 	INIT_LIST_HEAD(&vdev->dummy_resources_list);
@@ -2177,7 +2176,7 @@ void vfio_pci_core_release_dev(struct vfio_device *core_vdev)
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 
-	mutex_destroy(&vdev->igate);
+	vfio_pci_release_intr_ctx(&vdev->intr_ctx);
 	mutex_destroy(&vdev->ioeventfds_lock);
 	mutex_destroy(&vdev->vma_lock);
 	kfree(vdev->region);
@@ -2300,12 +2299,12 @@ pci_ers_result_t vfio_pci_core_aer_err_detected(struct pci_dev *pdev,
 {
 	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
 
-	mutex_lock(&vdev->igate);
+	mutex_lock(&vdev->intr_ctx.igate);
 
-	if (vdev->err_trigger)
-		eventfd_signal(vdev->err_trigger, 1);
+	if (vdev->intr_ctx.err_trigger)
+		eventfd_signal(vdev->intr_ctx.err_trigger, 1);
 
-	mutex_unlock(&vdev->igate);
+	mutex_unlock(&vdev->intr_ctx.igate);
 
 	return PCI_ERS_RESULT_CAN_RECOVER;
 }
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 76ec5af3681a..b9c92ede3b6f 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -777,7 +777,7 @@ static int vfio_pci_set_err_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 	if (index != VFIO_PCI_ERR_IRQ_INDEX || start != 0 || count > 1)
 		return -EINVAL;
 
-	return vfio_pci_set_ctx_trigger_single(&vdev->err_trigger,
+	return vfio_pci_set_ctx_trigger_single(&intr_ctx->err_trigger,
 					       count, flags, data);
 }
 
@@ -786,12 +786,10 @@ static int vfio_pci_set_req_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 				    unsigned int count, uint32_t flags,
 				    void *data)
 {
-	struct vfio_pci_core_device *vdev = intr_ctx->priv;
-
 	if (index != VFIO_PCI_REQ_IRQ_INDEX || start != 0 || count > 1)
 		return -EINVAL;
 
-	return vfio_pci_set_ctx_trigger_single(&vdev->req_trigger,
+	return vfio_pci_set_ctx_trigger_single(&intr_ctx->req_trigger,
 					       count, flags, data);
 }
 
@@ -810,9 +808,16 @@ void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
 {
 	intr_ctx->ops = &vfio_pci_intr_ops;
 	intr_ctx->priv = vdev;
+	mutex_init(&intr_ctx->igate);
 }
 EXPORT_SYMBOL_GPL(vfio_pci_init_intr_ctx);
 
+void vfio_pci_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx)
+{
+	mutex_destroy(&intr_ctx->igate);
+}
+EXPORT_SYMBOL_GPL(vfio_pci_release_intr_ctx);
+
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned int index, unsigned int start,
 			    unsigned int count, void *data)
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index e9bfca9a0c0a..b1c299188bf5 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -53,10 +53,16 @@ struct vfio_pci_region {
  * Interrupt context of virtual PCI device
  * @ops:		Callbacks triggered via VFIO_DEVICE_SET_IRQS ioctl()
  * @priv:		Private data
+ * @igate:		Protects members of struct vfio_pci_intr_ctx
+ * @err_trigger:	Eventfd associated with error reporting IRQ
+ * @req_trigger:	Eventfd associated with device request notification
  */
 struct vfio_pci_intr_ctx {
 	const struct vfio_pci_intr_ops	*ops;
 	void				*priv;
+	struct mutex			igate;
+	struct eventfd_ctx		*err_trigger;
+	struct eventfd_ctx		*req_trigger;
 };
 
 struct vfio_pci_intr_ops {
@@ -92,7 +98,6 @@ struct vfio_pci_core_device {
 	u8			*vconfig;
 	struct perm_bits	*msi_perm;
 	spinlock_t		irqlock;
-	struct mutex		igate;
 	struct xarray		ctx;
 	int			irq_type;
 	int			num_regions;
@@ -117,8 +122,6 @@ struct vfio_pci_core_device {
 	struct pci_saved_state	*pci_saved_state;
 	struct pci_saved_state	*pm_save;
 	int			ioeventfds_nr;
-	struct eventfd_ctx	*err_trigger;
-	struct eventfd_ctx	*req_trigger;
 	struct eventfd_ctx	*pm_wake_eventfd_ctx;
 	struct list_head	dummy_resources_list;
 	struct mutex		ioeventfds_lock;
@@ -152,6 +155,7 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 		unsigned long arg);
 void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
 			    struct vfio_pci_intr_ctx *intr_ctx);
+void vfio_pci_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx);
 int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 				void __user *arg, size_t argsz);
 ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
-- 
2.34.1


* [RFC PATCH V2 08/18] vfio/pci: Move mutex acquisition into function
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (6 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 07/18] vfio/pci: Move interrupt eventfd to interrupt context Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 09/18] vfio/pci: Move interrupt contexts to generic interrupt struct Reinette Chatre
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

vfio_pci_set_irqs_ioctl() is the entrypoint for interrupt
management via the VFIO_DEVICE_SET_IRQS ioctl().
vfio_pci_set_irqs_ioctl() can be called from a virtual
device driver after its callbacks have been configured to
support the needed interrupt management.

The igate mutex is obtained before calling vfio_pci_set_irqs_ioctl()
to protect changes to the interrupt context. It should not be
necessary for all users of vfio_pci_set_irqs_ioctl() to
remember to take the mutex - the mutex can be
acquired and released within vfio_pci_set_irqs_ioctl().

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_core.c  |  2 --
 drivers/vfio/pci/vfio_pci_intrs.c | 10 ++++++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 5c9bd5d2db53..bf4de137ad2f 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1214,12 +1214,10 @@ static int vfio_pci_ioctl_set_irqs(struct vfio_pci_core_device *vdev,
 			return PTR_ERR(data);
 	}
 
-	mutex_lock(&vdev->intr_ctx.igate);
 
 	ret = vfio_pci_set_irqs_ioctl(&vdev->intr_ctx, hdr.flags, hdr.index,
 				      hdr.start, hdr.count, data);
 
-	mutex_unlock(&vdev->intr_ctx.igate);
 	kfree(data);
 
 	return ret;
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index b9c92ede3b6f..9fc0a568d392 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -825,7 +825,9 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 	int (*func)(struct vfio_pci_intr_ctx *intr_ctx, unsigned int index,
 		    unsigned int start, unsigned int count, uint32_t flags,
 		    void *data) = NULL;
+	int ret = -ENOTTY;
 
+	mutex_lock(&intr_ctx->igate);
 	switch (index) {
 	case VFIO_PCI_INTX_IRQ_INDEX:
 		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
@@ -886,7 +888,11 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 	}
 
 	if (!func)
-		return -ENOTTY;
+		goto out_unlock;
+
+	ret = func(intr_ctx, index, start, count, flags, data);
 
-	return func(intr_ctx, index, start, count, flags, data);
+out_unlock:
+	mutex_unlock(&intr_ctx->igate);
+	return ret;
 }
-- 
2.34.1


* [RFC PATCH V2 09/18] vfio/pci: Move interrupt contexts to generic interrupt struct
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (7 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 08/18] vfio/pci: Move mutex acquisition into function Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 10/18] vfio/pci: Move IRQ type to generic interrupt context Reinette Chatre
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

VFIO PCI interrupt management maintains per-interrupt
context within an xarray using the interrupt vector as index.

Move the per-interrupt context to the generic interrupt
context in struct vfio_pci_intr_ctx to enable this context to
be managed by a different backend.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_core.c  | 1 -
 drivers/vfio/pci/vfio_pci_intrs.c | 9 +++++----
 include/linux/vfio_pci_core.h     | 3 ++-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index bf4de137ad2f..cf303a9555f0 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2162,7 +2162,6 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
 	INIT_LIST_HEAD(&vdev->vma_list);
 	INIT_LIST_HEAD(&vdev->sriov_pfs_item);
 	init_rwsem(&vdev->memory_lock);
-	xa_init(&vdev->ctx);
 	vfio_pci_init_intr_ctx(vdev, &vdev->intr_ctx);
 
 	return 0;
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 9fc0a568d392..3c8fed88208c 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -52,13 +52,13 @@ static
 struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev,
 					  unsigned long index)
 {
-	return xa_load(&vdev->ctx, index);
+	return xa_load(&vdev->intr_ctx.ctx, index);
 }
 
 static void vfio_irq_ctx_free(struct vfio_pci_core_device *vdev,
 			      struct vfio_pci_irq_ctx *ctx, unsigned long index)
 {
-	xa_erase(&vdev->ctx, index);
+	xa_erase(&vdev->intr_ctx.ctx, index);
 	kfree(ctx);
 }
 
@@ -72,7 +72,7 @@ vfio_irq_ctx_alloc(struct vfio_pci_core_device *vdev, unsigned long index)
 	if (!ctx)
 		return NULL;
 
-	ret = xa_insert(&vdev->ctx, index, ctx, GFP_KERNEL_ACCOUNT);
+	ret = xa_insert(&vdev->intr_ctx.ctx, index, ctx, GFP_KERNEL_ACCOUNT);
 	if (ret) {
 		kfree(ctx);
 		return NULL;
@@ -529,7 +529,7 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 	unsigned long i;
 	u16 cmd;
 
-	xa_for_each(&vdev->ctx, i, ctx) {
+	xa_for_each(&vdev->intr_ctx.ctx, i, ctx) {
 		vfio_virqfd_disable(&ctx->unmask);
 		vfio_virqfd_disable(&ctx->mask);
 		vfio_msi_set_vector_signal(vdev, i, -1, msix);
@@ -809,6 +809,7 @@ void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
 	intr_ctx->ops = &vfio_pci_intr_ops;
 	intr_ctx->priv = vdev;
 	mutex_init(&intr_ctx->igate);
+	xa_init(&intr_ctx->ctx);
 }
 EXPORT_SYMBOL_GPL(vfio_pci_init_intr_ctx);
 
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index b1c299188bf5..46521dd82a6b 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -56,6 +56,7 @@ struct vfio_pci_region {
  * @igate:		Protects members of struct vfio_pci_intr_ctx
  * @err_trigger:	Eventfd associated with error reporting IRQ
  * @req_trigger:	Eventfd associated with device request notification
+ * @ctx:		Per-interrupt context indexed by vector
  */
 struct vfio_pci_intr_ctx {
 	const struct vfio_pci_intr_ops	*ops;
@@ -63,6 +64,7 @@ struct vfio_pci_intr_ctx {
 	struct mutex			igate;
 	struct eventfd_ctx		*err_trigger;
 	struct eventfd_ctx		*req_trigger;
+	struct xarray			ctx;
 };
 
 struct vfio_pci_intr_ops {
@@ -98,7 +100,6 @@ struct vfio_pci_core_device {
 	u8			*vconfig;
 	struct perm_bits	*msi_perm;
 	spinlock_t		irqlock;
-	struct xarray		ctx;
 	int			irq_type;
 	int			num_regions;
 	struct vfio_pci_region	*region;
-- 
2.34.1


* [RFC PATCH V2 10/18] vfio/pci: Move IRQ type to generic interrupt context
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (8 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 09/18] vfio/pci: Move interrupt contexts to generic interrupt struct Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 11/18] vfio/pci: Split interrupt context initialization Reinette Chatre
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

The type of interrupts within the guest is not unique to
PCI devices and is needed for other virtual devices supporting
interrupts.

Move interrupt type to the generic interrupt context struct
vfio_pci_intr_ctx.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Question for maintainers:
irq_type is accessed in the ioctl() flow as well as other
flows. It is not clear to me how it is protected against
concurrent access. Should accesses outside of the ioctl() flow
take the mutex?

 drivers/vfio/pci/vfio_pci_config.c |  2 +-
 drivers/vfio/pci/vfio_pci_core.c   |  5 ++---
 drivers/vfio/pci/vfio_pci_intrs.c  | 21 +++++++++++----------
 include/linux/vfio_pci_core.h      |  3 ++-
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 7e2e62ab0869..2535bdbc016d 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -1168,7 +1168,7 @@ static int vfio_msi_config_write(struct vfio_pci_core_device *vdev, int pos,
 		flags = le16_to_cpu(*pflags);
 
 		/* MSI is enabled via ioctl */
-		if  (vdev->irq_type != VFIO_PCI_MSI_IRQ_INDEX)
+		if  (vdev->intr_ctx.irq_type != VFIO_PCI_MSI_IRQ_INDEX)
 			flags &= ~PCI_MSI_FLAGS_ENABLE;
 
 		/* Check queue size */
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index cf303a9555f0..34109ed38454 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -427,7 +427,7 @@ static int vfio_pci_core_runtime_suspend(struct device *dev)
 	 * vfio_pci_intx_mask() will return false and in that case, INTx
 	 * should not be unmasked in the runtime resume.
 	 */
-	vdev->pm_intx_masked = ((vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX) &&
+	vdev->pm_intx_masked = ((vdev->intr_ctx.irq_type == VFIO_PCI_INTX_IRQ_INDEX) &&
 				vfio_pci_intx_mask(vdev));
 
 	return 0;
@@ -596,7 +596,7 @@ void vfio_pci_core_disable(struct vfio_pci_core_device *vdev)
 
 	vfio_pci_set_irqs_ioctl(&vdev->intr_ctx, VFIO_IRQ_SET_DATA_NONE |
 				VFIO_IRQ_SET_ACTION_TRIGGER,
-				vdev->irq_type, 0, 0, NULL);
+				vdev->intr_ctx.irq_type, 0, 0, NULL);
 
 	/* Device closed, don't need mutex here */
 	list_for_each_entry_safe(ioeventfd, ioeventfd_tmp,
@@ -2153,7 +2153,6 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
 
 	vdev->pdev = to_pci_dev(core_vdev->dev);
-	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	spin_lock_init(&vdev->irqlock);
 	mutex_init(&vdev->ioeventfds_lock);
 	INIT_LIST_HEAD(&vdev->dummy_resources_list);
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 3c8fed88208c..eb718787470f 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -33,19 +33,19 @@ struct vfio_pci_irq_ctx {
 
 static bool irq_is(struct vfio_pci_core_device *vdev, int type)
 {
-	return vdev->irq_type == type;
+	return vdev->intr_ctx.irq_type == type;
 }
 
 static bool is_intx(struct vfio_pci_core_device *vdev)
 {
-	return vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX;
+	return vdev->intr_ctx.irq_type == VFIO_PCI_INTX_IRQ_INDEX;
 }
 
 static bool is_irq_none(struct vfio_pci_core_device *vdev)
 {
-	return !(vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX ||
-		 vdev->irq_type == VFIO_PCI_MSI_IRQ_INDEX ||
-		 vdev->irq_type == VFIO_PCI_MSIX_IRQ_INDEX);
+	return !(vdev->intr_ctx.irq_type == VFIO_PCI_INTX_IRQ_INDEX ||
+		 vdev->intr_ctx.irq_type == VFIO_PCI_MSI_IRQ_INDEX ||
+		 vdev->intr_ctx.irq_type == VFIO_PCI_MSIX_IRQ_INDEX);
 }
 
 static
@@ -255,7 +255,7 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
 	if (vdev->pci_2_3)
 		pci_intx(vdev->pdev, !ctx->masked);
 
-	vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX;
+	vdev->intr_ctx.irq_type = VFIO_PCI_INTX_IRQ_INDEX;
 
 	return 0;
 }
@@ -331,7 +331,7 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev)
 		vfio_virqfd_disable(&ctx->mask);
 	}
 	vfio_intx_set_signal(vdev, -1);
-	vdev->irq_type = VFIO_PCI_NUM_IRQS;
+	vdev->intr_ctx.irq_type = VFIO_PCI_NUM_IRQS;
 	vfio_irq_ctx_free(vdev, ctx, 0);
 }
 
@@ -367,7 +367,7 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 	}
 	vfio_pci_memory_unlock_and_restore(vdev, cmd);
 
-	vdev->irq_type = msix ? VFIO_PCI_MSIX_IRQ_INDEX :
+	vdev->intr_ctx.irq_type = msix ? VFIO_PCI_MSIX_IRQ_INDEX :
 				VFIO_PCI_MSI_IRQ_INDEX;
 
 	if (!msix) {
@@ -546,7 +546,7 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 	if (vdev->nointx)
 		pci_intx(pdev, 0);
 
-	vdev->irq_type = VFIO_PCI_NUM_IRQS;
+	vdev->intr_ctx.irq_type = VFIO_PCI_NUM_IRQS;
 }
 
 /*
@@ -676,7 +676,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 		int32_t *fds = data;
 		int ret;
 
-		if (vdev->irq_type == index)
+		if (vdev->intr_ctx.irq_type == index)
 			return vfio_msi_set_block(vdev, start, count,
 						  fds, msix);
 
@@ -806,6 +806,7 @@ struct vfio_pci_intr_ops vfio_pci_intr_ops = {
 void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
 			    struct vfio_pci_intr_ctx *intr_ctx)
 {
+	intr_ctx->irq_type = VFIO_PCI_NUM_IRQS;
 	intr_ctx->ops = &vfio_pci_intr_ops;
 	intr_ctx->priv = vdev;
 	mutex_init(&intr_ctx->igate);
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 46521dd82a6b..893a36b5d163 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -57,6 +57,7 @@ struct vfio_pci_region {
  * @err_trigger:	Eventfd associated with error reporting IRQ
  * @req_trigger:	Eventfd associated with device request notification
  * @ctx:		Per-interrupt context indexed by vector
+ * @irq_type:		Type of interrupt from guest perspective
  */
 struct vfio_pci_intr_ctx {
 	const struct vfio_pci_intr_ops	*ops;
@@ -65,6 +66,7 @@ struct vfio_pci_intr_ctx {
 	struct eventfd_ctx		*err_trigger;
 	struct eventfd_ctx		*req_trigger;
 	struct xarray			ctx;
+	int				irq_type;
 };
 
 struct vfio_pci_intr_ops {
@@ -100,7 +102,6 @@ struct vfio_pci_core_device {
 	u8			*vconfig;
 	struct perm_bits	*msi_perm;
 	spinlock_t		irqlock;
-	int			irq_type;
 	int			num_regions;
 	struct vfio_pci_region	*region;
 	u8			msi_qmax;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V2 11/18] vfio/pci: Split interrupt context initialization
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (9 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 10/18] vfio/pci: Move IRQ type to generic interrupt context Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 12/18] vfio/pci: Provide interrupt context to generic ops Reinette Chatre
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

struct vfio_pci_intr_ctx is the context associated with interrupts
of a virtual device. Initializing the interrupt context involves
backend specific setup required by the particular interrupt
management backend as well as common setup required by all
interrupt management backends.

Split interrupt context initialization into a common call and
interrupt management backend specific calls. The entry point is
the initialization of a particular interrupt management backend,
which in turn calls the common initialization.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index eb718787470f..8c9d44e99e7b 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -793,6 +793,18 @@ static int vfio_pci_set_req_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 					       count, flags, data);
 }
 
+static void _vfio_pci_init_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx)
+{
+	intr_ctx->irq_type = VFIO_PCI_NUM_IRQS;
+	mutex_init(&intr_ctx->igate);
+	xa_init(&intr_ctx->ctx);
+}
+
+static void _vfio_pci_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx)
+{
+	mutex_destroy(&intr_ctx->igate);
+}
+
 struct vfio_pci_intr_ops vfio_pci_intr_ops = {
 	.set_intx_mask = vfio_pci_set_intx_mask,
 	.set_intx_unmask = vfio_pci_set_intx_unmask,
@@ -806,17 +818,15 @@ struct vfio_pci_intr_ops vfio_pci_intr_ops = {
 void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
 			    struct vfio_pci_intr_ctx *intr_ctx)
 {
-	intr_ctx->irq_type = VFIO_PCI_NUM_IRQS;
+	_vfio_pci_init_intr_ctx(intr_ctx);
 	intr_ctx->ops = &vfio_pci_intr_ops;
 	intr_ctx->priv = vdev;
-	mutex_init(&intr_ctx->igate);
-	xa_init(&intr_ctx->ctx);
 }
 EXPORT_SYMBOL_GPL(vfio_pci_init_intr_ctx);
 
 void vfio_pci_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx)
 {
-	mutex_destroy(&intr_ctx->igate);
+	_vfio_pci_release_intr_ctx(intr_ctx);
 }
 EXPORT_SYMBOL_GPL(vfio_pci_release_intr_ctx);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V2 12/18] vfio/pci: Provide interrupt context to generic ops
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (10 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 11/18] vfio/pci: Split interrupt context initialization Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 13/18] vfio/pci: Make vfio_pci_set_irqs_ioctl() available Reinette Chatre
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

The functions operating on the per-interrupt context were originally
created to support management of PCI device interrupts where the
interrupt context was maintained within the virtual PCI device's
struct vfio_pci_core_device. Now that the per-interrupt context
has been moved to a more generic struct vfio_pci_intr_ctx these utilities
can be changed to expect the generic structure instead. This enables
these utilities to be used in other interrupt management backends.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 41 ++++++++++++++++---------------
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 8c9d44e99e7b..0a741159368c 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -49,21 +49,21 @@ static bool is_irq_none(struct vfio_pci_core_device *vdev)
 }
 
 static
-struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev,
+struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_intr_ctx *intr_ctx,
 					  unsigned long index)
 {
-	return xa_load(&vdev->intr_ctx.ctx, index);
+	return xa_load(&intr_ctx->ctx, index);
 }
 
-static void vfio_irq_ctx_free(struct vfio_pci_core_device *vdev,
+static void vfio_irq_ctx_free(struct vfio_pci_intr_ctx *intr_ctx,
 			      struct vfio_pci_irq_ctx *ctx, unsigned long index)
 {
-	xa_erase(&vdev->intr_ctx.ctx, index);
+	xa_erase(&intr_ctx->ctx, index);
 	kfree(ctx);
 }
 
 static struct vfio_pci_irq_ctx *
-vfio_irq_ctx_alloc(struct vfio_pci_core_device *vdev, unsigned long index)
+vfio_irq_ctx_alloc(struct vfio_pci_intr_ctx *intr_ctx, unsigned long index)
 {
 	struct vfio_pci_irq_ctx *ctx;
 	int ret;
@@ -72,7 +72,7 @@ vfio_irq_ctx_alloc(struct vfio_pci_core_device *vdev, unsigned long index)
 	if (!ctx)
 		return NULL;
 
-	ret = xa_insert(&vdev->intr_ctx.ctx, index, ctx, GFP_KERNEL_ACCOUNT);
+	ret = xa_insert(&intr_ctx->ctx, index, ctx, GFP_KERNEL_ACCOUNT);
 	if (ret) {
 		kfree(ctx);
 		return NULL;
@@ -91,7 +91,7 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused)
 	if (likely(is_intx(vdev) && !vdev->virq_disabled)) {
 		struct vfio_pci_irq_ctx *ctx;
 
-		ctx = vfio_irq_ctx_get(vdev, 0);
+		ctx = vfio_irq_ctx_get(&vdev->intr_ctx, 0);
 		if (WARN_ON_ONCE(!ctx))
 			return;
 		eventfd_signal(ctx->trigger, 1);
@@ -120,7 +120,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
 		goto out_unlock;
 	}
 
-	ctx = vfio_irq_ctx_get(vdev, 0);
+	ctx = vfio_irq_ctx_get(&vdev->intr_ctx, 0);
 	if (WARN_ON_ONCE(!ctx))
 		goto out_unlock;
 
@@ -169,7 +169,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 		goto out_unlock;
 	}
 
-	ctx = vfio_irq_ctx_get(vdev, 0);
+	ctx = vfio_irq_ctx_get(&vdev->intr_ctx, 0);
 	if (WARN_ON_ONCE(!ctx))
 		goto out_unlock;
 
@@ -207,7 +207,7 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 	unsigned long flags;
 	int ret = IRQ_NONE;
 
-	ctx = vfio_irq_ctx_get(vdev, 0);
+	ctx = vfio_irq_ctx_get(&vdev->intr_ctx, 0);
 	if (WARN_ON_ONCE(!ctx))
 		return ret;
 
@@ -241,7 +241,7 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
 	if (!vdev->pdev->irq)
 		return -ENODEV;
 
-	ctx = vfio_irq_ctx_alloc(vdev, 0);
+	ctx = vfio_irq_ctx_alloc(&vdev->intr_ctx, 0);
 	if (!ctx)
 		return -ENOMEM;
 
@@ -269,7 +269,7 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd)
 	unsigned long flags;
 	int ret;
 
-	ctx = vfio_irq_ctx_get(vdev, 0);
+	ctx = vfio_irq_ctx_get(&vdev->intr_ctx, 0);
 	if (WARN_ON_ONCE(!ctx))
 		return -EINVAL;
 
@@ -324,7 +324,7 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev)
 {
 	struct vfio_pci_irq_ctx *ctx;
 
-	ctx = vfio_irq_ctx_get(vdev, 0);
+	ctx = vfio_irq_ctx_get(&vdev->intr_ctx, 0);
 	WARN_ON_ONCE(!ctx);
 	if (ctx) {
 		vfio_virqfd_disable(&ctx->unmask);
@@ -332,7 +332,7 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev)
 	}
 	vfio_intx_set_signal(vdev, -1);
 	vdev->intr_ctx.irq_type = VFIO_PCI_NUM_IRQS;
-	vfio_irq_ctx_free(vdev, ctx, 0);
+	vfio_irq_ctx_free(&vdev->intr_ctx, ctx, 0);
 }
 
 /*
@@ -421,7 +421,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	int irq = -EINVAL, ret;
 	u16 cmd;
 
-	ctx = vfio_irq_ctx_get(vdev, vector);
+	ctx = vfio_irq_ctx_get(&vdev->intr_ctx, vector);
 
 	if (ctx) {
 		irq_bypass_unregister_producer(&ctx->producer);
@@ -432,7 +432,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 		/* Interrupt stays allocated, will be freed at MSI-X disable. */
 		kfree(ctx->name);
 		eventfd_ctx_put(ctx->trigger);
-		vfio_irq_ctx_free(vdev, ctx, vector);
+		vfio_irq_ctx_free(&vdev->intr_ctx, ctx, vector);
 	}
 
 	if (fd < 0)
@@ -445,7 +445,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 			return irq;
 	}
 
-	ctx = vfio_irq_ctx_alloc(vdev, vector);
+	ctx = vfio_irq_ctx_alloc(&vdev->intr_ctx, vector);
 	if (!ctx)
 		return -ENOMEM;
 
@@ -499,7 +499,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 out_free_name:
 	kfree(ctx->name);
 out_free_ctx:
-	vfio_irq_ctx_free(vdev, ctx, vector);
+	vfio_irq_ctx_free(&vdev->intr_ctx, ctx, vector);
 	return ret;
 }
 
@@ -569,7 +569,8 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_intr_ctx *intr_ctx,
 		if (unmask)
 			vfio_pci_intx_unmask(vdev);
 	} else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) {
-		struct vfio_pci_irq_ctx *ctx = vfio_irq_ctx_get(vdev, 0);
+		struct vfio_pci_irq_ctx *ctx = vfio_irq_ctx_get(&vdev->intr_ctx,
+								0);
 		int32_t fd = *(int32_t *)data;
 
 		if (WARN_ON_ONCE(!ctx))
@@ -695,7 +696,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_intr_ctx *intr_ctx,
 		return -EINVAL;
 
 	for (i = start; i < start + count; i++) {
-		ctx = vfio_irq_ctx_get(vdev, i);
+		ctx = vfio_irq_ctx_get(&vdev->intr_ctx, i);
 		if (!ctx)
 			continue;
 		if (flags & VFIO_IRQ_SET_DATA_NONE) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V2 13/18] vfio/pci: Make vfio_pci_set_irqs_ioctl() available
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (11 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 12/18] vfio/pci: Provide interrupt context to generic ops Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 14/18] vfio/pci: Add core IMS support Reinette Chatre
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

vfio_pci_set_irqs_ioctl() is now a generic entry point that can
be configured to support different interrupt management backends.
No longer limited to PCI devices, it can be exported for use by
other virtual device drivers.
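
As a rough sketch of the intended use, a virtual device driver could
forward a validated VFIO_DEVICE_SET_IRQS request as shown below. The
surrounding driver structure (struct my_vdev) and the range check are
assumptions for illustration; only vfio_pci_set_irqs_ioctl() is
introduced by this series.

    /* Hypothetical handler in a virtual device driver. */
    static int my_vdev_set_irqs(struct my_vdev *mvdev,
                                struct vfio_irq_set *hdr, void *data)
    {
            /* Driver specific validation of index/start/count. */
            if (hdr->index != VFIO_PCI_MSIX_IRQ_INDEX ||
                hdr->start + hdr->count > mvdev->num_vectors)
                    return -EINVAL;

            return vfio_pci_set_irqs_ioctl(&mvdev->intr_ctx, hdr->flags,
                                           hdr->index, hdr->start,
                                           hdr->count, data);
    }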

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 1 +
 include/linux/vfio_pci_core.h     | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 0a741159368c..d04a4477c201 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -909,3 +909,4 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 	mutex_unlock(&intr_ctx->igate);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(vfio_pci_set_irqs_ioctl);
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 893a36b5d163..1dd55d98dce9 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -158,6 +158,9 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
 			    struct vfio_pci_intr_ctx *intr_ctx);
 void vfio_pci_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx);
+int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
+			    unsigned int index, unsigned int start,
+			    unsigned int count, void *data);
 int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 				void __user *arg, size_t argsz);
 ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V2 14/18] vfio/pci: Add core IMS support
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (12 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 13/18] vfio/pci: Make vfio_pci_set_irqs_ioctl() available Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-13  8:10   ` Tian, Kevin
  2023-10-06 16:41 ` [RFC PATCH V2 15/18] vfio/pci: Support emulated interrupts Reinette Chatre
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

Add a new interrupt management backend enabling a guest MSI-X
interrupt to be backed by an IMS interrupt on the host.

An IMS interrupt is allocated via pci_ims_alloc_irq() and requires
an implementation specific cookie that is opaque to the IMS backend.
This can be a PASID, queue ID, pointer etc. During initialization
the IMS backend learns which PCI device to operate on (and thus which
interrupt domain to allocate from) and what the default cookie should
be for any new interrupt allocation.
IMS can associate a unique cookie with each vector (more support in
later patches), and to maintain this association the backend keeps
interrupt contexts allocated for the virtual device's lifetime.

A virtual device driver starts by initializing the backend with
the new vfio_pci_ims_init_intr_ctx() and cleans up with the new
vfio_pci_ims_release_intr_ctx(). Once initialized, the virtual
device driver can call vfio_pci_set_irqs_ioctl() to handle the
VFIO_DEVICE_SET_IRQS ioctl() after it has validated that the
parameters are appropriate for the particular device.
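
A rough sketch of that lifecycle, where the enclosing driver
structure (struct my_vdev), its embedded struct vfio_device, and the
choice of default cookie are assumptions for illustration:

    /* Open: set up the IMS backend for this virtual device. */
    union msi_instance_cookie dcookie = { .value = mvdev->default_pasid };
    int ret;

    ret = vfio_pci_ims_init_intr_ctx(&mvdev->vfio_dev, &mvdev->intr_ctx,
                                     mvdev->parent_pdev, &dcookie);
    if (ret)
            return ret;

    /*
     * At runtime, VFIO_DEVICE_SET_IRQS requests are forwarded to
     * vfio_pci_set_irqs_ioctl() after driver specific validation.
     */

    /* Release: drop interrupt contexts and backend data. */
    vfio_pci_ims_release_intr_ctx(&mvdev->intr_ctx);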

To support the IMS backend the core utilities need to be aware of
which backend's interrupt context they interact with. The new
ims_backed_irq flag enables this: it is false for the PCI
passthrough backend and true for the IMS backend.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 275 ++++++++++++++++++++++++++++++
 include/linux/vfio_pci_core.h     |   6 +
 2 files changed, 281 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index d04a4477c201..7880fd4077a6 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -22,6 +22,21 @@
 
 #include "vfio_pci_priv.h"
 
+/*
+ * Interrupt Message Store (IMS) private interrupt context data
+ * @vdev:		Virtual device. Used for name of device in
+ *			request_irq().
+ * @pdev:		PCI device owning the IMS domain from where
+ *			interrupts are allocated.
+ * @default_cookie:	Default cookie used for IMS interrupts without unique
+ *			cookie.
+ */
+struct vfio_pci_ims {
+	struct vfio_device		*vdev;
+	struct pci_dev			*pdev;
+	union msi_instance_cookie	default_cookie;
+};
+
 struct vfio_pci_irq_ctx {
 	struct eventfd_ctx	*trigger;
 	struct virqfd		*unmask;
@@ -29,6 +44,9 @@ struct vfio_pci_irq_ctx {
 	char			*name;
 	bool			masked;
 	struct irq_bypass_producer	producer;
+	int			ims_id;
+	int			virq;
+	union msi_instance_cookie	icookie;
 };
 
 static bool irq_is(struct vfio_pci_core_device *vdev, int type)
@@ -72,6 +90,12 @@ vfio_irq_ctx_alloc(struct vfio_pci_intr_ctx *intr_ctx, unsigned long index)
 	if (!ctx)
 		return NULL;
 
+	if (intr_ctx->ims_backed_irq)  {
+		struct vfio_pci_ims *ims = intr_ctx->priv;
+
+		ctx->icookie = ims->default_cookie;
+	}
+
 	ret = xa_insert(&intr_ctx->ctx, index, ctx, GFP_KERNEL_ACCOUNT);
 	if (ret) {
 		kfree(ctx);
@@ -822,6 +846,7 @@ void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
 	_vfio_pci_init_intr_ctx(intr_ctx);
 	intr_ctx->ops = &vfio_pci_intr_ops;
 	intr_ctx->priv = vdev;
+	intr_ctx->ims_backed_irq = false;
 }
 EXPORT_SYMBOL_GPL(vfio_pci_init_intr_ctx);
 
@@ -831,6 +856,256 @@ void vfio_pci_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx)
 }
 EXPORT_SYMBOL_GPL(vfio_pci_release_intr_ctx);
 
+/* Guest MSI-X interrupts backed by IMS host interrupts */
+
+/*
+ * Free the IMS interrupt associated with @ctx.
+ *
+ * For an IMS interrupt the interrupt is freed from the underlying
+ * PCI device's IMS domain.
+ */
+static void vfio_pci_ims_irq_free(struct vfio_pci_intr_ctx *intr_ctx,
+				  struct vfio_pci_irq_ctx *ctx)
+{
+	struct vfio_pci_ims *ims = intr_ctx->priv;
+	struct msi_map irq_map = {};
+
+	irq_map.index = ctx->ims_id;
+	irq_map.virq = ctx->virq;
+	pci_ims_free_irq(ims->pdev, irq_map);
+	ctx->ims_id = -EINVAL;
+	ctx->virq = 0;
+}
+
+/*
+ * Allocate a host IMS interrupt for @ctx.
+ *
+ * For an IMS interrupt the interrupt is allocated from the underlying
+ * PCI device's IMS domain.
+ */
+static int vfio_pci_ims_irq_alloc(struct vfio_pci_intr_ctx *intr_ctx,
+				  struct vfio_pci_irq_ctx *ctx)
+{
+	struct vfio_pci_ims *ims = intr_ctx->priv;
+	struct msi_map irq_map = {};
+
+	irq_map = pci_ims_alloc_irq(ims->pdev, &ctx->icookie, NULL);
+	if (irq_map.index < 0)
+		return irq_map.index;
+
+	ctx->ims_id = irq_map.index;
+	ctx->virq = irq_map.virq;
+
+	return 0;
+}
+
+static int vfio_pci_ims_set_vector_signal(struct vfio_pci_intr_ctx *intr_ctx,
+					  unsigned int vector, int fd)
+{
+	struct vfio_pci_ims *ims = intr_ctx->priv;
+	struct device *dev = &ims->vdev->device;
+	struct vfio_pci_irq_ctx *ctx;
+	struct eventfd_ctx *trigger;
+	int ret;
+
+	ctx = vfio_irq_ctx_get(intr_ctx, vector);
+
+	if (ctx && ctx->trigger) {
+		irq_bypass_unregister_producer(&ctx->producer);
+		free_irq(ctx->virq, ctx->trigger);
+		vfio_pci_ims_irq_free(intr_ctx, ctx);
+		kfree(ctx->name);
+		ctx->name = NULL;
+		eventfd_ctx_put(ctx->trigger);
+		ctx->trigger = NULL;
+	}
+
+	if (fd < 0)
+		return 0;
+
+	/* Interrupt contexts remain allocated until shutdown. */
+	if (!ctx) {
+		ctx = vfio_irq_ctx_alloc(intr_ctx, vector);
+		if (!ctx)
+			return -ENOMEM;
+	}
+
+	ctx->name = kasprintf(GFP_KERNEL, "vfio-ims[%d](%s)", vector,
+			      dev_name(dev));
+	if (!ctx->name)
+		return -ENOMEM;
+
+	trigger = eventfd_ctx_fdget(fd);
+	if (IS_ERR(trigger)) {
+		ret = PTR_ERR(trigger);
+		goto out_free_name;
+	}
+
+	ctx->trigger = trigger;
+
+	ret = vfio_pci_ims_irq_alloc(intr_ctx, ctx);
+	if (ret < 0)
+		goto out_put_eventfd_ctx;
+
+	ret = request_irq(ctx->virq, vfio_msihandler, 0, ctx->name,
+			  ctx->trigger);
+	if (ret < 0)
+		goto out_free_irq;
+
+	ctx->producer.token = ctx->trigger;
+	ctx->producer.irq = ctx->virq;
+	ret = irq_bypass_register_producer(&ctx->producer);
+	if (unlikely(ret)) {
+		dev_info(&ims->vdev->device,
+			 "irq bypass producer (token %p) registration fails: %d\n",
+			 &ctx->producer.token, ret);
+		ctx->producer.token = NULL;
+	}
+
+	return 0;
+
+out_free_irq:
+	vfio_pci_ims_irq_free(intr_ctx, ctx);
+out_put_eventfd_ctx:
+	eventfd_ctx_put(ctx->trigger);
+	ctx->trigger = NULL;
+out_free_name:
+	kfree(ctx->name);
+	ctx->name = NULL;
+	return ret;
+}
+
+static int vfio_pci_ims_set_block(struct vfio_pci_intr_ctx *intr_ctx,
+				  unsigned int start, unsigned int count,
+				  int *fds)
+{
+	unsigned int i, j;
+	int ret = 0;
+
+	for (i = 0, j = start; i < count && !ret; i++, j++) {
+		int fd = fds ? fds[i] : -1;
+
+		ret = vfio_pci_ims_set_vector_signal(intr_ctx, j, fd);
+	}
+
+	if (ret) {
+		for (i = start; i < j; i++)
+			vfio_pci_ims_set_vector_signal(intr_ctx, i, -1);
+	}
+
+	return ret;
+}
+
+/*
+ * Manage Interrupt Message Store (IMS) or emulated interrupts on the
+ * host that are backing guest MSI-X vectors.
+ *
+ * @intr_ctx:	 Interrupt context
+ * @index:	 Type of guest vectors to set up.  Must be
+ *		 VFIO_PCI_MSIX_IRQ_INDEX.
+ * @start:	 First vector index.
+ * @count:	 Number of vectors.
+ * @flags:	 Type of data provided in @data.
+ * @data:	 Data as specified by @flags.
+ *
+ * Caller is required to validate provided range for @vdev.
+ *
+ * Context: Interrupt context must be initialized via
+ *	    vfio_pci_ims_init_intr_ctx()  before any interrupts can be allocated.
+ *
+ * Return: Error code on failure or 0 on success.
+ */
+static int vfio_pci_set_ims_trigger(struct vfio_pci_intr_ctx *intr_ctx,
+				    unsigned int index, unsigned int start,
+				    unsigned int count, u32 flags,
+				    void *data)
+{
+	struct vfio_pci_irq_ctx *ctx;
+	unsigned long i;
+
+	if (index != VFIO_PCI_MSIX_IRQ_INDEX)
+		return -EINVAL;
+
+	if (!count && (flags & VFIO_IRQ_SET_DATA_NONE)) {
+		xa_for_each(&intr_ctx->ctx, i, ctx)
+			vfio_pci_ims_set_vector_signal(intr_ctx, i, -1);
+		return 0;
+	}
+
+	if (flags & VFIO_IRQ_SET_DATA_EVENTFD)
+		return vfio_pci_ims_set_block(intr_ctx, start, count, (int *)data);
+
+	for (i = start; i < start + count; i++) {
+		ctx = vfio_irq_ctx_get(intr_ctx, i);
+		if (!ctx || !ctx->trigger)
+			continue;
+		if (flags & VFIO_IRQ_SET_DATA_NONE) {
+			eventfd_signal(ctx->trigger, 1);
+		} else if (flags & VFIO_IRQ_SET_DATA_BOOL) {
+			uint8_t *bools = data;
+
+			if (bools[i - start])
+				eventfd_signal(ctx->trigger, 1);
+		}
+	}
+
+	return 0;
+}
+
+struct vfio_pci_intr_ops vfio_pci_ims_intr_ops = {
+	.set_msix_trigger = vfio_pci_set_ims_trigger,
+	.set_req_trigger = vfio_pci_set_req_trigger,
+};
+
+int vfio_pci_ims_init_intr_ctx(struct vfio_device *vdev,
+			       struct vfio_pci_intr_ctx *intr_ctx,
+			       struct pci_dev *pdev,
+			       union msi_instance_cookie *default_cookie)
+{
+	struct vfio_pci_ims *ims;
+
+	ims = kzalloc(sizeof(*ims), GFP_KERNEL_ACCOUNT);
+	if (!ims)
+		return -ENOMEM;
+
+	ims->pdev = pdev;
+	ims->default_cookie = *default_cookie;
+	ims->vdev = vdev;
+
+	_vfio_pci_init_intr_ctx(intr_ctx);
+
+	intr_ctx->ops = &vfio_pci_ims_intr_ops;
+	intr_ctx->priv = ims;
+	intr_ctx->ims_backed_irq = true;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_pci_ims_init_intr_ctx);
+
+void vfio_pci_ims_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx)
+{
+	struct vfio_pci_ims *ims = intr_ctx->priv;
+	struct vfio_pci_irq_ctx *ctx;
+	unsigned long i;
+
+	/*
+	 * IMS backed MSI-X keeps interrupt context allocated after
+	 * interrupt is freed. Interrupt contexts need to be freed
+	 * separately.
+	 */
+	mutex_lock(&intr_ctx->igate);
+	xa_for_each(&intr_ctx->ctx, i, ctx) {
+		WARN_ON_ONCE(ctx->trigger);
+		WARN_ON_ONCE(ctx->name);
+		xa_erase(&intr_ctx->ctx, i);
+		kfree(ctx);
+	}
+	mutex_unlock(&intr_ctx->igate);
+	kfree(ims);
+	_vfio_pci_release_intr_ctx(intr_ctx);
+}
+EXPORT_SYMBOL_GPL(vfio_pci_ims_release_intr_ctx);
+
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned int index, unsigned int start,
 			    unsigned int count, void *data)
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 1dd55d98dce9..13b807848286 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -67,6 +67,7 @@ struct vfio_pci_intr_ctx {
 	struct eventfd_ctx		*req_trigger;
 	struct xarray			ctx;
 	int				irq_type;
+	bool				ims_backed_irq:1;
 };
 
 struct vfio_pci_intr_ops {
@@ -161,6 +162,11 @@ void vfio_pci_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx);
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned int index, unsigned int start,
 			    unsigned int count, void *data);
+int vfio_pci_ims_init_intr_ctx(struct vfio_device *vdev,
+			       struct vfio_pci_intr_ctx *intr_ctx,
+			       struct pci_dev *pdev,
+			       union msi_instance_cookie *default_cookie);
+void vfio_pci_ims_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx);
 int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 				void __user *arg, size_t argsz);
 ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V2 15/18] vfio/pci: Support emulated interrupts
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (13 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 14/18] vfio/pci: Add core IMS support Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 16/18] vfio/pci: Support emulated interrupts in IMS backend Reinette Chatre
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

Access from a guest to a virtual device may be either 'direct-path',
where the guest interacts directly with the underlying hardware,
or 'intercepted path' where the virtual device emulates operations.

Support emulated interrupts that can be used to handle 'intercepted
path' operations. For example, a virtual device may use 'intercepted
path' for configuration. In that case, configuration requests
intercepted by the virtual device driver are handled entirely within
the driver and completion is signaled to the guest without
interacting with the underlying hardware.

Add vfio_pci_set_emulated() and vfio_pci_send_signal() to the
VFIO PCI API. vfio_pci_set_emulated() configures a range of interrupts
to be emulated. A backend indicates support for emulated interrupts
with vfio_pci_intr_ctx::supports_emulated.

Any range of interrupts can be configured as emulated, provided the
backend supports emulated interrupts and no interrupt has previously
been allocated at any vector in the range. The virtual device driver
uses vfio_pci_send_signal() to trigger these interrupts in the guest.
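
A rough sketch of how a virtual device driver might use this pair,
with the vector number and surrounding structure (struct my_vdev)
being assumptions for illustration:

    /* During device setup: vector 0 is handled purely by emulation. */
    ret = vfio_pci_set_emulated(&mvdev->intr_ctx, 0, 1);
    if (ret)
            return ret;

    /*
     * Later, when an intercepted configuration request completes,
     * signal the guest on the emulated vector.
     */
    vfio_pci_send_signal(&mvdev->intr_ctx, 0);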

Originally-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 86 +++++++++++++++++++++++++++++++
 include/linux/vfio_pci_core.h     |  4 ++
 2 files changed, 90 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 7880fd4077a6..c6b213d52beb 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -38,6 +38,7 @@ struct vfio_pci_ims {
 };
 
 struct vfio_pci_irq_ctx {
+	bool			emulated:1;
 	struct eventfd_ctx	*trigger;
 	struct virqfd		*unmask;
 	struct virqfd		*mask;
@@ -847,6 +848,7 @@ void vfio_pci_init_intr_ctx(struct vfio_pci_core_device *vdev,
 	intr_ctx->ops = &vfio_pci_intr_ops;
 	intr_ctx->priv = vdev;
 	intr_ctx->ims_backed_irq = false;
+	intr_ctx->supports_emulated = false;
 }
 EXPORT_SYMBOL_GPL(vfio_pci_init_intr_ctx);
 
@@ -1076,6 +1078,7 @@ int vfio_pci_ims_init_intr_ctx(struct vfio_device *vdev,
 
 	intr_ctx->ops = &vfio_pci_ims_intr_ops;
 	intr_ctx->priv = ims;
+	intr_ctx->supports_emulated = false;
 	intr_ctx->ims_backed_irq = true;
 
 	return 0;
@@ -1106,6 +1109,89 @@ void vfio_pci_ims_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx)
 }
 EXPORT_SYMBOL_GPL(vfio_pci_ims_release_intr_ctx);
 
+/*
+ * vfio_pci_send_signal() - Send signal to the eventfd.
+ * @intr_ctx:	Interrupt context.
+ * @vector:	Vector for which interrupt will be signaled.
+ *
+ * Trigger signal to guest for emulated interrupts.
+ */
+void vfio_pci_send_signal(struct vfio_pci_intr_ctx *intr_ctx, unsigned int vector)
+{
+	struct vfio_pci_irq_ctx *ctx;
+
+	mutex_lock(&intr_ctx->igate);
+
+	if (!intr_ctx->supports_emulated)
+		goto out_unlock;
+
+	ctx = vfio_irq_ctx_get(intr_ctx, vector);
+
+	if (WARN_ON_ONCE(!ctx || !ctx->emulated || !ctx->trigger))
+		goto out_unlock;
+
+	eventfd_signal(ctx->trigger, 1);
+
+out_unlock:
+	mutex_unlock(&intr_ctx->igate);
+}
+EXPORT_SYMBOL_GPL(vfio_pci_send_signal);
+
+/*
+ * vfio_pci_set_emulated() - Set range of interrupts that will be emulated.
+ * @intr_ctx:	Interrupt context.
+ * @start:	First emulated interrupt vector.
+ * @count:	Number of emulated interrupts starting from @start.
+ *
+ * Emulated interrupts will not be backed by hardware interrupts but
+ * instead triggered by virtual device driver.
+ *
+ * Return: error code on failure (-EBUSY if the vector is not available,
+ * -ENOMEM on allocation failure), 0 on success. No partial success, on
+ * success entire range was set as emulated, on failure no interrupt in
+ * range was set as emulated.
+ */
+int vfio_pci_set_emulated(struct vfio_pci_intr_ctx *intr_ctx,
+			  unsigned int start, unsigned int count)
+{
+	struct vfio_pci_irq_ctx *ctx;
+	unsigned long i, j;
+	int ret = -EINVAL;
+
+	mutex_lock(&intr_ctx->igate);
+
+	if (!intr_ctx->supports_emulated)
+		goto out_unlock;
+
+	for (i = start; i < start + count; i++) {
+		ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
+		if (!ctx) {
+			ret = -ENOMEM;
+			goto out_err;
+		}
+		ctx->emulated = true;
+		ret = xa_insert(&intr_ctx->ctx, i, ctx, GFP_KERNEL_ACCOUNT);
+		if (ret) {
+			kfree(ctx);
+			goto out_err;
+		}
+	}
+
+	mutex_unlock(&intr_ctx->igate);
+	return 0;
+
+out_err:
+	for (j = start; j < i; j++) {
+		ctx = vfio_irq_ctx_get(intr_ctx, j);
+		vfio_irq_ctx_free(intr_ctx, ctx, j);
+	}
+out_unlock:
+	mutex_unlock(&intr_ctx->igate);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vfio_pci_set_emulated);
+
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned int index, unsigned int start,
 			    unsigned int count, void *data)
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 13b807848286..3c5df1d13e5d 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -67,6 +67,7 @@ struct vfio_pci_intr_ctx {
 	struct eventfd_ctx		*req_trigger;
 	struct xarray			ctx;
 	int				irq_type;
+	bool				supports_emulated:1;
 	bool				ims_backed_irq:1;
 };
 
@@ -167,6 +168,9 @@ int vfio_pci_ims_init_intr_ctx(struct vfio_device *vdev,
 			       struct pci_dev *pdev,
 			       union msi_instance_cookie *default_cookie);
 void vfio_pci_ims_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx);
+void vfio_pci_send_signal(struct vfio_pci_intr_ctx *intr_ctx, unsigned int vector);
+int vfio_pci_set_emulated(struct vfio_pci_intr_ctx *intr_ctx,
+			  unsigned int start, unsigned int count);
 int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 				void __user *arg, size_t argsz);
 ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V2 16/18] vfio/pci: Support emulated interrupts in IMS backend
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (14 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 15/18] vfio/pci: Support emulated interrupts Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 17/18] vfio/pci: Add accessor for IMS index Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 18/18] vfio/pci: Support IMS cookie modification Reinette Chatre
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

An emulated interrupt has an associated eventfd but is
not backed by a hardware interrupt.

Add support for emulated interrupts to the IMS backend. This
largely amounts to skipping the actions that touch hardware
configuration.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index c6b213d52beb..f96d7481094a 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -863,8 +863,8 @@ EXPORT_SYMBOL_GPL(vfio_pci_release_intr_ctx);
 /*
  * Free the IMS interrupt associated with @ctx.
  *
- * For an IMS interrupt the interrupt is freed from the underlying
- * PCI device's IMS domain.
+ * For an emulated interrupt there is nothing to do. For an IMS interrupt
+ * the interrupt is freed from the underlying PCI device's IMS domain.
  */
 static void vfio_pci_ims_irq_free(struct vfio_pci_intr_ctx *intr_ctx,
 				  struct vfio_pci_irq_ctx *ctx)
@@ -872,6 +872,9 @@ static void vfio_pci_ims_irq_free(struct vfio_pci_intr_ctx *intr_ctx,
 	struct vfio_pci_ims *ims = intr_ctx->priv;
 	struct msi_map irq_map = {};
 
+	if (ctx->emulated)
+		return;
+
 	irq_map.index = ctx->ims_id;
 	irq_map.virq = ctx->virq;
 	pci_ims_free_irq(ims->pdev, irq_map);
@@ -882,8 +885,8 @@ static void vfio_pci_ims_irq_free(struct vfio_pci_intr_ctx *intr_ctx,
 /*
  * Allocate a host IMS interrupt for @ctx.
  *
- * For an IMS interrupt the interrupt is allocated from the underlying
- * PCI device's IMS domain.
+ * For an emulated interrupt there is nothing to do. For an IMS interrupt
+ * the interrupt is allocated from the underlying PCI device's IMS domain.
  */
 static int vfio_pci_ims_irq_alloc(struct vfio_pci_intr_ctx *intr_ctx,
 				  struct vfio_pci_irq_ctx *ctx)
@@ -891,6 +894,9 @@ static int vfio_pci_ims_irq_alloc(struct vfio_pci_intr_ctx *intr_ctx,
 	struct vfio_pci_ims *ims = intr_ctx->priv;
 	struct msi_map irq_map = {};
 
+	if (ctx->emulated)
+		return -EINVAL;
+
 	irq_map = pci_ims_alloc_irq(ims->pdev, &ctx->icookie, NULL);
 	if (irq_map.index < 0)
 		return irq_map.index;
@@ -913,9 +919,11 @@ static int vfio_pci_ims_set_vector_signal(struct vfio_pci_intr_ctx *intr_ctx,
 	ctx = vfio_irq_ctx_get(intr_ctx, vector);
 
 	if (ctx && ctx->trigger) {
-		irq_bypass_unregister_producer(&ctx->producer);
-		free_irq(ctx->virq, ctx->trigger);
-		vfio_pci_ims_irq_free(intr_ctx, ctx);
+		if (!ctx->emulated) {
+			irq_bypass_unregister_producer(&ctx->producer);
+			free_irq(ctx->virq, ctx->trigger);
+			vfio_pci_ims_irq_free(intr_ctx, ctx);
+		}
 		kfree(ctx->name);
 		ctx->name = NULL;
 		eventfd_ctx_put(ctx->trigger);
@@ -945,6 +953,9 @@ static int vfio_pci_ims_set_vector_signal(struct vfio_pci_intr_ctx *intr_ctx,
 
 	ctx->trigger = trigger;
 
+	if (ctx->emulated)
+		return 0;
+
 	ret = vfio_pci_ims_irq_alloc(intr_ctx, ctx);
 	if (ret < 0)
 		goto out_put_eventfd_ctx;
@@ -1078,7 +1089,7 @@ int vfio_pci_ims_init_intr_ctx(struct vfio_device *vdev,
 
 	intr_ctx->ops = &vfio_pci_ims_intr_ops;
 	intr_ctx->priv = ims;
-	intr_ctx->supports_emulated = false;
+	intr_ctx->supports_emulated = true;
 	intr_ctx->ims_backed_irq = true;
 
 	return 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V2 17/18] vfio/pci: Add accessor for IMS index
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (15 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 16/18] vfio/pci: Support emulated interrupts in IMS backend Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  2023-10-06 16:41 ` [RFC PATCH V2 18/18] vfio/pci: Support IMS cookie modification Reinette Chatre
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

A virtual device driver needs to facilitate translation between the
guest's MSI-X interrupt and the backing IMS interrupt with which the
physical device is programmed. For example, the guest may need to
obtain from the virtual device driver the IMS index that must be
programmed into descriptors submitted to the device so that
completion interrupts are generated correctly.

Introduce vfio_pci_ims_hwirq() to the IMS backend as a helper
that returns the IMS interrupt index backing a provided MSI-X
interrupt index belonging to a guest.
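
For example (the guest_vector variable and how the result is reported
back are hypothetical; only vfio_pci_ims_hwirq() is from this patch),
the virtual device driver could translate a guest MSI-X vector on
request:

    int ims_index;

    ims_index = vfio_pci_ims_hwirq(&mvdev->intr_ctx, guest_vector);
    if (ims_index < 0)
            return ims_index;       /* Not IMS backed or not allocated. */

    /* Report ims_index to the guest for use in device descriptors. */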

Originally-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 25 +++++++++++++++++++++++++
 include/linux/vfio_pci_core.h     |  1 +
 2 files changed, 26 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index f96d7481094a..df458aed2175 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -1203,6 +1203,31 @@ int vfio_pci_set_emulated(struct vfio_pci_intr_ctx *intr_ctx,
 }
 EXPORT_SYMBOL_GPL(vfio_pci_set_emulated);
 
+/*
+ * Return IMS index of IMS interrupt backing MSI-X interrupt @vector
+ */
+int vfio_pci_ims_hwirq(struct vfio_pci_intr_ctx *intr_ctx, unsigned int vector)
+{
+	struct vfio_pci_irq_ctx *ctx;
+	int id = -EINVAL;
+
+	mutex_lock(&intr_ctx->igate);
+
+	if (!intr_ctx->ims_backed_irq)
+		goto out_unlock;
+
+	ctx = vfio_irq_ctx_get(intr_ctx, vector);
+	if (!ctx || ctx->emulated)
+		goto out_unlock;
+
+	id = ctx->ims_id;
+
+out_unlock:
+	mutex_unlock(&intr_ctx->igate);
+	return id;
+}
+EXPORT_SYMBOL_GPL(vfio_pci_ims_hwirq);
+
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned int index, unsigned int start,
 			    unsigned int count, void *data)
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 3c5df1d13e5d..c6e399b39e90 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -167,6 +167,7 @@ int vfio_pci_ims_init_intr_ctx(struct vfio_device *vdev,
 			       struct vfio_pci_intr_ctx *intr_ctx,
 			       struct pci_dev *pdev,
 			       union msi_instance_cookie *default_cookie);
+int vfio_pci_ims_hwirq(struct vfio_pci_intr_ctx *intr_ctx, unsigned int vector);
 void vfio_pci_ims_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx);
 void vfio_pci_send_signal(struct vfio_pci_intr_ctx *intr_ctx, unsigned int vector);
 int vfio_pci_set_emulated(struct vfio_pci_intr_ctx *intr_ctx,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V2 18/18] vfio/pci: Support IMS cookie modification
  2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
                   ` (16 preceding siblings ...)
  2023-10-06 16:41 ` [RFC PATCH V2 17/18] vfio/pci: Add accessor for IMS index Reinette Chatre
@ 2023-10-06 16:41 ` Reinette Chatre
  17 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-06 16:41 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, alex.williamson
  Cc: kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	reinette.chatre, linux-kernel, patches

IMS supports an implementation specific cookie that is associated
with each interrupt. By default the IMS interrupt allocation backend
will assign a default cookie to a new interrupt instance.

Add support for a virtual device driver to set the interrupt
instance specific cookie. For example, the virtual device driver may
intercept the guest's MMIO write that configures a new PASID for a
particular interrupt. Calling vfio_pci_ims_set_cookie() with the new
PASID value as the IMS cookie ensures that subsequent interrupts are
allocated with accurate data.
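
A rough sketch of such an intercepted write, where the MMIO decode
producing the pasid and guest_vector values is assumed:

    union msi_instance_cookie icookie = { .value = pasid };
    int ret;

    /* The guest programmed a new PASID for this MSI-X vector. */
    ret = vfio_pci_ims_set_cookie(&mvdev->intr_ctx, guest_vector,
                                  &icookie);
    if (ret)
            return ret;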

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 53 +++++++++++++++++++++++++++++++
 include/linux/vfio_pci_core.h     |  3 ++
 2 files changed, 56 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index df458aed2175..e9e46633af65 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -1228,6 +1228,59 @@ int vfio_pci_ims_hwirq(struct vfio_pci_intr_ctx *intr_ctx, unsigned int vector)
 }
 EXPORT_SYMBOL_GPL(vfio_pci_ims_hwirq);
 
+/*
+ * vfio_pci_ims_set_cookie() - Set unique cookie for vector.
+ * @intr_ctx:	Interrupt context.
+ * @vector:	Vector.
+ * @icookie:	New cookie for @vector.
+ *
+ * When new IMS interrupt is allocated for @vector it will be
+ * assigned @icookie.
+ */
+int vfio_pci_ims_set_cookie(struct vfio_pci_intr_ctx *intr_ctx,
+			    unsigned int vector,
+			    union msi_instance_cookie *icookie)
+{
+	struct vfio_pci_irq_ctx *ctx;
+	int ret = -EINVAL;
+
+	mutex_lock(&intr_ctx->igate);
+
+	if (!intr_ctx->ims_backed_irq)
+		goto out_unlock;
+
+	ctx = vfio_irq_ctx_get(intr_ctx, vector);
+	if (ctx) {
+		if (WARN_ON_ONCE(ctx->emulated)) {
+			ret = -EINVAL;
+			goto out_unlock;
+		}
+		ctx->icookie = *icookie;
+		ret = 0;
+		goto out_unlock;
+	}
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
+	if (!ctx) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	ctx->icookie = *icookie;
+	ret = xa_insert(&intr_ctx->ctx, vector, ctx, GFP_KERNEL_ACCOUNT);
+	if (ret) {
+		kfree(ctx);
+		goto out_unlock;
+	}
+
+	ret = 0;
+
+out_unlock:
+	mutex_unlock(&intr_ctx->igate);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vfio_pci_ims_set_cookie);
+
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_intr_ctx *intr_ctx, uint32_t flags,
 			    unsigned int index, unsigned int start,
 			    unsigned int count, void *data)
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index c6e399b39e90..32c2145ffdb5 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -168,6 +168,9 @@ int vfio_pci_ims_init_intr_ctx(struct vfio_device *vdev,
 			       struct pci_dev *pdev,
 			       union msi_instance_cookie *default_cookie);
 int vfio_pci_ims_hwirq(struct vfio_pci_intr_ctx *intr_ctx, unsigned int vector);
+int vfio_pci_ims_set_cookie(struct vfio_pci_intr_ctx *intr_ctx,
+			    unsigned int vector,
+			    union msi_instance_cookie *icookie);
 void vfio_pci_ims_release_intr_ctx(struct vfio_pci_intr_ctx *intr_ctx);
 void vfio_pci_send_signal(struct vfio_pci_intr_ctx *intr_ctx, unsigned int vector);
 int vfio_pci_set_emulated(struct vfio_pci_intr_ctx *intr_ctx,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* RE: [RFC PATCH V2 14/18] vfio/pci: Add core IMS support
  2023-10-06 16:41 ` [RFC PATCH V2 14/18] vfio/pci: Add core IMS support Reinette Chatre
@ 2023-10-13  8:10   ` Tian, Kevin
  2023-10-16 17:48     ` Reinette Chatre
  0 siblings, 1 reply; 21+ messages in thread
From: Tian, Kevin @ 2023-10-13  8:10 UTC (permalink / raw)
  To: Chatre, Reinette, jgg, yishaih, shameerali.kolothum.thodi,
	alex.williamson
  Cc: kvm, Jiang, Dave, Liu, Jing2, Raj, Ashok, Yu, Fenghua,
	tom.zanussi, linux-kernel, patches

> From: Chatre, Reinette <reinette.chatre@intel.com>
> Sent: Saturday, October 7, 2023 12:41 AM
> 
> A virtual device driver starts by initializing the backend
> using new vfio_pci_ims_init_intr_ctx(), cleanup using new
> vfio_pci_ims_release_intr_ctx(). Once initialized the virtual
> device driver can call vfio_pci_set_irqs_ioctl() to handle the
> VFIO_DEVICE_SET_IRQS ioctl() after it has validated the parameters
> to be appropriate for the particular device.

I wonder whether the code sharing can go deeper from
vfio_pci_set_irqs_ioctl() all the way down to set_vector_signal()
with proper abstraction. Then handle emulated interrupt in the
common code instead of ims specific path. intel gvt also uses
emulated interrupt, which could be converted to use this library
too.

There is some subtle difference between pci/ims backends
regarding to how set_vector_signal() is coded in this series. But
it is not intuitive to me whether such a difference is conceptual
or simply from a coding preference.

Would you mind doing an exercise whether that is achievable?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V2 14/18] vfio/pci: Add core IMS support
  2023-10-13  8:10   ` Tian, Kevin
@ 2023-10-16 17:48     ` Reinette Chatre
  0 siblings, 0 replies; 21+ messages in thread
From: Reinette Chatre @ 2023-10-16 17:48 UTC (permalink / raw)
  To: Tian, Kevin, jgg, yishaih, shameerali.kolothum.thodi, alex.williamson
  Cc: kvm, Jiang, Dave, Liu, Jing2, Raj, Ashok, Yu, Fenghua,
	tom.zanussi, linux-kernel, patches

Hi Kevin,

On 10/13/2023 1:10 AM, Tian, Kevin wrote:
>> From: Chatre, Reinette <reinette.chatre@intel.com>
>> Sent: Saturday, October 7, 2023 12:41 AM
>>
>> A virtual device driver starts by initializing the backend
>> using new vfio_pci_ims_init_intr_ctx(), cleanup using new
>> vfio_pci_ims_release_intr_ctx(). Once initialized the virtual
>> device driver can call vfio_pci_set_irqs_ioctl() to handle the
>> VFIO_DEVICE_SET_IRQS ioctl() after it has validated the parameters
>> to be appropriate for the particular device.
> 
> I wonder whether the code sharing can go deeper from
> vfio_pci_set_irqs_ioctl() all the way down to set_vector_signal()
> with proper abstraction. 

There is a foundational difference in the MSI and IMS interrupt
management that is handled by the separate set_vector_signal()
implementations.

For MSI interrupts the interrupts stay allocated but the individual
interrupt context is always freed and re-allocated.

For IMS the interrupts are always freed and re-allocated (to ensure that
any new cookie is taken into account) while the individual interrupt
context stays allocated (to not lose the cookie value associated
with the individual interrupt).

It may indeed be possible to accommodate this difference with further
abstraction. I will study the code more to explore how this
can be done.

> Then handle emulated interrupt in the
> common code instead of ims specific path. intel gvt also uses
> emulated interrupt, which could be converted to use this library
> too.

Thank you for pointing me to intel gvt. 

> There is some subtle difference between pci/ims backends
> regarding to how set_vector_signal() is coded in this series. But
> it is not intuitive to me whether such a difference is conceptual
> or simply from a coding preference.
> 
> Would you mind doing an exercise whether that is achievable?

I do not mind at all. Will do.

Thank you very much for taking a look and sharing your guidance.

Reinette 

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-10-16 17:49 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-06 16:40 [RFC PATCH V2 00/18] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS) Reinette Chatre
2023-10-06 16:40 ` [RFC PATCH V2 01/18] PCI/MSI: Provide stubs for IMS functions Reinette Chatre
2023-10-06 16:40 ` [RFC PATCH V2 02/18] vfio/pci: Move PCI specific check from wrapper to PCI function Reinette Chatre
2023-10-06 16:40 ` [RFC PATCH V2 03/18] vfio/pci: Use unsigned int instead of unsigned Reinette Chatre
2023-10-06 16:40 ` [RFC PATCH V2 04/18] vfio/pci: Make core interrupt callbacks accessible to all virtual devices Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 05/18] vfio/pci: Split PCI interrupt management into front and backend Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 06/18] vfio/pci: Separate MSI and MSI-X handling Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 07/18] vfio/pci: Move interrupt eventfd to interrupt context Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 08/18] vfio/pci: Move mutex acquisition into function Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 09/18] vfio/pci: Move interrupt contexts to generic interrupt struct Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 10/18] vfio/pci: Move IRQ type to generic interrupt context Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 11/18] vfio/pci: Split interrupt context initialization Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 12/18] vfio/pci: Provide interrupt context to generic ops Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 13/18] vfio/pci: Make vfio_pci_set_irqs_ioctl() available Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 14/18] vfio/pci: Add core IMS support Reinette Chatre
2023-10-13  8:10   ` Tian, Kevin
2023-10-16 17:48     ` Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 15/18] vfio/pci: Support emulated interrupts Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 16/18] vfio/pci: Support emulated interrupts in IMS backend Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 17/18] vfio/pci: Add accessor for IMS index Reinette Chatre
2023-10-06 16:41 ` [RFC PATCH V2 18/18] vfio/pci: Support IMS cookie modification Reinette Chatre
