DPDK-dev Archive on lore.kernel.org
 help / color / Atom feed
* Re: [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler
@ 2019-07-15 16:50 Jerin Jacob Kollanukkaran
  2019-07-16  5:58 ` Hyong Youb Kim (hyonkim)
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-15 16:50 UTC (permalink / raw)
  To: Hyong Youb Kim, David Marchand, Thomas Monjalon, Ferruh Yigit
  Cc: dev, John Daley, Shahed Shaikh, Nithin Kumar Dabilpuram

> -----Original Message-----
> From: Hyong Youb Kim <hyonkim@cisco.com>
> Sent: Monday, July 15, 2019 9:28 PM
> To: David Marchand <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
> Ferruh Yigit <ferruh.yigit@intel.com>
> Cc: dev@dpdk.org; John Daley <johndale@cisco.com>; Hyong Youb Kim
> <hyonkim@cisco.com>
> Subject: [EXT] [RFC PATCH] vfio: avoid re-installing irq handler
> 
> A rough patch for the approach mentioned earlier. It is only for discussion.
> http://mails.dpdk.org/archives/dev/2019-July/138113.html
> 
> To try this out, first revert the following then apply.
> commit 89aac60e0be9 ("vfio: fix interrupts race condition")

Yes. This patch has to be to reverted. It changes the existing interrupt behavior and does not address the MSIX case as well.

I think, The clean fix would be to introduce rte_intr_mask() and rte_intr_unmask() by abstracting the INTX and MSIX differences
And let qede driver call it as needed.

Thoughts?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler
  2019-07-15 16:50 [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler Jerin Jacob Kollanukkaran
@ 2019-07-16  5:58 ` Hyong Youb Kim (hyonkim)
  2019-07-16  6:47   ` Jerin Jacob Kollanukkaran
  2019-07-16  6:46 ` [dpdk-dev] [RFC PATCH] eal: add mask and unmask interrupt apis Nithin Dabilpuram
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 25+ messages in thread
From: Hyong Youb Kim (hyonkim) @ 2019-07-16  5:58 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, David Marchand, Thomas Monjalon,
	Ferruh Yigit, Alejandro Lucero, Anatoly Burakov
  Cc: dev, John Daley (johndale), Shahed Shaikh, Nithin Kumar Dabilpuram

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Tuesday, July 16, 2019 1:51 AM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>
> Cc: dev@dpdk.org; John Daley (johndale) <johndale@cisco.com>; Shahed
> Shaikh <shshaikh@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>
> Subject: RE: [RFC PATCH] vfio: avoid re-installing irq handler
> 
> > -----Original Message-----
> > From: Hyong Youb Kim <hyonkim@cisco.com>
> > Sent: Monday, July 15, 2019 9:28 PM
> > To: David Marchand <david.marchand@redhat.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
> > Ferruh Yigit <ferruh.yigit@intel.com>
> > Cc: dev@dpdk.org; John Daley <johndale@cisco.com>; Hyong Youb Kim
> > <hyonkim@cisco.com>
> > Subject: [EXT] [RFC PATCH] vfio: avoid re-installing irq handler
> >
> > A rough patch for the approach mentioned earlier. It is only for discussion.
> > http://mails.dpdk.org/archives/dev/2019-July/138113.html
> >
> > To try this out, first revert the following then apply.
> > commit 89aac60e0be9 ("vfio: fix interrupts race condition")
> 
> Yes. This patch has to be to reverted. It changes the existing interrupt
> behavior and does not address the MSIX case as well.
> 
> I think, The clean fix would be to introduce rte_intr_mask() and
> rte_intr_unmask() by abstracting the INTX and MSIX differences
> And let qede driver call it as needed.
> 
> Thoughts?

Hi,

You are proposing these?
- Add rte_intr_mask_intx, rte_intr_unmask_intx.
  No APIs for masking MSI/MSI-X as vfio-pci does not support that.
- Modify PMD irq handlers to use rte_intr_unmask_intx as necessary.

That might be too intrusive. And too much work for the sake of
INTx.. Anyone really using/needing INTx these days? :-)

The following drivers call rte_intr_enable from their irq handlers. So
with explicit rte_intr_unmask_intx, all these would need to do "if
using intx, unmask"?

atlantic, avp, axgbe, bnx2x, e1000, fm10k, ice, ixgbe, nfp, qede, sfc,
vmxnet3

And nfp seems to rely on rte_intr_enable to re-install irq handler to
unmask a vector in MSI-X Table?

        if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
                /* If MSI-X auto-masking is used, clear the entry */
                rte_wmb();
                rte_intr_enable(&pci_dev->intr_handle);

With David's patch and mine, this handler would have to first
rte_intr_disable() and then enable, if such unmasking is really
necessary..

As for the semantics of rte_intr_enable/disable, I am ok as is.
- "enable": put things in a state where NIC can send an interrupt, and
  PMD/app gets a callback.
  Whether this involves unmasking for INTx is hidden.
- "disable": put things in a state where NIC cannot send an interrupt.

Regardless of vfio changes, we should probably remove rte_intr_enable
from qede_interrupt_handler (the MSI/MSI-X interrupt handler), to make
usage/intention clear..

Thanks.
-Hyong


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [RFC PATCH] eal: add mask and unmask interrupt apis
  2019-07-15 16:50 [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler Jerin Jacob Kollanukkaran
  2019-07-16  5:58 ` Hyong Youb Kim (hyonkim)
@ 2019-07-16  6:46 ` Nithin Dabilpuram
  2019-07-16  7:01 ` [dpdk-dev] [RFC PATCH v2] " Nithin Dabilpuram
  2019-07-16 16:44 ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Nithin Dabilpuram
  3 siblings, 0 replies; 25+ messages in thread
From: Nithin Dabilpuram @ 2019-07-16  6:46 UTC (permalink / raw)
  To: Hyong Youb Kim, David Marchand, Thomas Monjalon, Ferruh Yigit
  Cc: jerinj, John Daley, Shahed Shaikh, dev, Nithin Dabilpuram

Add new mask and unmask interrupt api's specifically for
VFIO_LEGACY type of interrupts. For legacy/intx vfio-pci masks the
interrupt before posting the event and application has to unmask it
as a part of its interrupt handling. Hence these new api's can be used
for the light-weight operation of masking and unmasking.

In all the other VFIO interrupts i.e MSI & MSIx, vfio-pci doesn't
do anything like that and there is also no implementation for
masking and unmasking.
---
 lib/librte_eal/common/include/rte_interrupts.h |  23 ++++
 lib/librte_eal/linux/eal/eal_interrupts.c      | 155 +++++++++++++++++++++++++
 2 files changed, 178 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_interrupts.h b/lib/librte_eal/common/include/rte_interrupts.h
index c1e912c..b0675be 100644
--- a/lib/librte_eal/common/include/rte_interrupts.h
+++ b/lib/librte_eal/common/include/rte_interrupts.h
@@ -118,6 +118,29 @@ int rte_intr_enable(const struct rte_intr_handle *intr_handle);
  */
 int rte_intr_disable(const struct rte_intr_handle *intr_handle);
 
+/**
+ * It masks the interrupt for the specified handle.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_intr_mask(const struct rte_intr_handle *intr_handle);
+
+/**
+ * It unmasks the interrupt for the specified handle.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_intr_unmask(const struct rte_intr_handle *intr_handle);
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c b/lib/librte_eal/linux/eal/eal_interrupts.c
index 79ad5e8..d91022c 100644
--- a/lib/librte_eal/linux/eal/eal_interrupts.c
+++ b/lib/librte_eal/linux/eal/eal_interrupts.c
@@ -197,6 +197,63 @@ vfio_disable_intx(const struct rte_intr_handle *intr_handle) {
 	return 0;
 }
 
+/* unmask legacy (INTx) interrupts */
+static int
+vfio_unmask_intx(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	/* unmask INTx */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	memset(irq_set, 0, len);
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error unmasking INTx interrupts for fd %d\n",
+			intr_handle->fd);
+		return -1;
+	}
+	return 0;
+}
+
+/* mask legacy (INTx) interrupts */
+static int
+vfio_mask_intx(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	/* mask interrupts */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error masking INTx interrupts for fd %d\n",
+			intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
 /* enable MSI interrupts */
 static int
 vfio_enable_msi(const struct rte_intr_handle *intr_handle) {
@@ -694,6 +751,104 @@ rte_intr_enable(const struct rte_intr_handle *intr_handle)
 }
 
 int
+rte_intr_mask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type){
+	/* Both masking and disabling are same for UIO */
+	case RTE_INTR_HANDLE_UIO:
+		if (uio_intr_disable(intr_handle))
+			return -1;
+		break;
+	case RTE_INTR_HANDLE_UIO_INTX:
+		if (uio_intx_intr_disable(intr_handle))
+			return -1;
+		break;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+		return -1;
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		if (vfio_mask_intx(intr_handle))
+			return -1;
+		break;
+#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
+	case RTE_INTR_HANDLE_VFIO_REQ:
+		return -1;
+#endif
+#endif
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_intr_unmask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type){
+	/* Both unmasking and disabling are same for UIO */
+	case RTE_INTR_HANDLE_UIO:
+		if (uio_intr_enable(intr_handle))
+			return -1;
+		break;
+	case RTE_INTR_HANDLE_UIO_INTX:
+		if (uio_intx_intr_enable(intr_handle))
+			return -1;
+		break;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+		return -1;
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		if (vfio_unmask_intx(intr_handle))
+			return -1;
+		break;
+#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
+	case RTE_INTR_HANDLE_VFIO_REQ:
+		return -1;
+#endif
+#endif
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
 rte_intr_disable(const struct rte_intr_handle *intr_handle)
 {
 	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
-- 
2.8.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler
  2019-07-16  5:58 ` Hyong Youb Kim (hyonkim)
@ 2019-07-16  6:47   ` Jerin Jacob Kollanukkaran
  2019-07-16  7:49     ` Hyong Youb Kim (hyonkim)
  0 siblings, 1 reply; 25+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-16  6:47 UTC (permalink / raw)
  To: Hyong Youb Kim (hyonkim),
	David Marchand, Thomas Monjalon, Ferruh Yigit, Alejandro Lucero,
	Anatoly Burakov
  Cc: dev, John Daley (johndale), Shahed Shaikh, Nithin Kumar Dabilpuram

> -----Original Message-----
> From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> Sent: Tuesday, July 16, 2019 11:28 AM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Alejandro
> Lucero <alejandro.lucero@netronome.com>; Anatoly Burakov
> <anatoly.burakov@intel.com>
> Cc: dev@dpdk.org; John Daley (johndale) <johndale@cisco.com>; Shahed
> Shaikh <shshaikh@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>
> Subject: RE: [RFC PATCH] vfio: avoid re-installing irq handler
> 
> > > A rough patch for the approach mentioned earlier. It is only for
> discussion.
> > > http://mails.dpdk.org/archives/dev/2019-July/138113.html
> > >
> > > To try this out, first revert the following then apply.
> > > commit 89aac60e0be9 ("vfio: fix interrupts race condition")
> >
> > Yes. This patch has to be to reverted. It changes the existing
> > interrupt behavior and does not address the MSIX case as well.
> >
> > I think, The clean fix would be to introduce rte_intr_mask() and
> > rte_intr_unmask() by abstracting the INTX and MSIX differences And let
> > qede driver call it as needed.
> >
> > Thoughts?
> 
> Hi,

Hi Hyong,

> 
> You are proposing these?
> - Add rte_intr_mask_intx, rte_intr_unmask_intx.
>   No APIs for masking MSI/MSI-X as vfio-pci does not support that.
> - Modify PMD irq handlers to use rte_intr_unmask_intx as necessary.

No, introduce the rte_intr_mask() and rte_intr_unmask().
For MSIX + Linux VFIO, That API can return -ENOSUP as Linux VFIO+MSIX is not supporting.
Another platform/eal may support it.

Mask and unmask is operation is known to all IRQ controllers.
So, IMO, As far as abstraction is concerned it will be good fit.

> That might be too intrusive. And too much work for the sake of INTx..
> Anyone really using/needing INTx these days? :-)

Yup. Mask needs to called only for only qede INTx. Looks like qede
Has MSIX and INTX separate handler. So this mask can go to qede INTx

> 
> The following drivers call rte_intr_enable from their irq handlers. So with
> explicit rte_intr_unmask_intx, all these would need to do "if using intx,
> unmask"?
> 
> atlantic, avp, axgbe, bnx2x, e1000, fm10k, ice, ixgbe, nfp, qede, sfc,
> vmxnet3

No change on these PMDs.

> And nfp seems to rely on rte_intr_enable to re-install irq handler to unmask
> a vector in MSI-X Table?
> 
>         if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
>                 /* If MSI-X auto-masking is used, clear the entry */
>                 rte_wmb();
>                 rte_intr_enable(&pci_dev->intr_handle);
> 
> With David's patch and mine, this handler would have to first
> rte_intr_disable() and then enable, if such unmasking is really necessary..
> 
> As for the semantics of rte_intr_enable/disable, I am ok as is.
> - "enable": put things in a state where NIC can send an interrupt, and
>   PMD/app gets a callback.
>   Whether this involves unmasking for INTx is hidden.
> - "disable": put things in a state where NIC cannot send an interrupt.

It looks OK to me. My only thought was, Since mask and unmask
is a common irq controller operation. We may not need to add
A lot of common code(Introducing a state) to hide unmask INTx.
More over as you said, There is may only handful of devices uses INTX.

IMO, mask and unmask API is good fit as eal abstraction.
But Using a separate API or hide inside eal to solve this problem is good question.
May be more thoughts from another quys will be good.

We will try to send a version with mask/unmask API to see the changes required.

> 
> Regardless of vfio changes, we should probably remove rte_intr_enable
> from qede_interrupt_handler (the MSI/MSI-X interrupt handler), to make
> usage/intention clear..

Yes. Anyway this change is required.

> 
> Thanks.
> -Hyong


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [RFC PATCH v2] eal: add mask and unmask interrupt apis
  2019-07-15 16:50 [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler Jerin Jacob Kollanukkaran
  2019-07-16  5:58 ` Hyong Youb Kim (hyonkim)
  2019-07-16  6:46 ` [dpdk-dev] [RFC PATCH] eal: add mask and unmask interrupt apis Nithin Dabilpuram
@ 2019-07-16  7:01 ` " Nithin Dabilpuram
  2019-07-16 16:44 ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Nithin Dabilpuram
  3 siblings, 0 replies; 25+ messages in thread
From: Nithin Dabilpuram @ 2019-07-16  7:01 UTC (permalink / raw)
  To: Hyong Youb Kim, David Marchand, Thomas Monjalon, Ferruh Yigit
  Cc: jerinj, John Daley, Shahed Shaikh, dev, Nithin Dabilpuram

Add new mask and unmask interrupt api's specifically for
VFIO_LEGACY type of interrupts. For legacy/intx vfio-pci masks the
interrupt before posting the event and application has to unmask it
as a part of its interrupt handling. Hence these new api's can be used
for the light-weight operation of masking and unmasking.

In all the other VFIO interrupts i.e MSI & MSIx, vfio-pci doesn't
do anything like that and there is also no implementation for
masking and unmasking.

For qede intx handling, we can call the new api for unmasking interrupt.
We have not tested it on qede.
---
 drivers/net/qede/qede_ethdev.c                 |   6 +-
 lib/librte_eal/common/include/rte_interrupts.h |  23 ++++
 lib/librte_eal/linux/eal/eal_interrupts.c      | 155 +++++++++++++++++++++++++
 3 files changed, 180 insertions(+), 4 deletions(-)

diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
index 82363e6..20b7587 100644
--- a/drivers/net/qede/qede_ethdev.c
+++ b/drivers/net/qede/qede_ethdev.c
@@ -248,8 +248,8 @@ qede_interrupt_handler_intx(void *param)
 	if (status & 0x1) {
 		qede_interrupt_action(ECORE_LEADING_HWFN(edev));
 
-		if (rte_intr_enable(eth_dev->intr_handle))
-			DP_ERR(edev, "rte_intr_enable failed\n");
+		if (rte_intr_unmask(eth_dev->intr_handle))
+			DP_ERR(edev, "rte_intr_unmask failed\n");
 	}
 }
 
@@ -261,8 +261,6 @@ qede_interrupt_handler(void *param)
 	struct ecore_dev *edev = &qdev->edev;
 
 	qede_interrupt_action(ECORE_LEADING_HWFN(edev));
-	if (rte_intr_enable(eth_dev->intr_handle))
-		DP_ERR(edev, "rte_intr_enable failed\n");
 }
 
 static void
diff --git a/lib/librte_eal/common/include/rte_interrupts.h b/lib/librte_eal/common/include/rte_interrupts.h
index c1e912c..b0675be 100644
--- a/lib/librte_eal/common/include/rte_interrupts.h
+++ b/lib/librte_eal/common/include/rte_interrupts.h
@@ -118,6 +118,29 @@ int rte_intr_enable(const struct rte_intr_handle *intr_handle);
  */
 int rte_intr_disable(const struct rte_intr_handle *intr_handle);
 
+/**
+ * It masks the interrupt for the specified handle.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_intr_mask(const struct rte_intr_handle *intr_handle);
+
+/**
+ * It unmasks the interrupt for the specified handle.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_intr_unmask(const struct rte_intr_handle *intr_handle);
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c b/lib/librte_eal/linux/eal/eal_interrupts.c
index 79ad5e8..d91022c 100644
--- a/lib/librte_eal/linux/eal/eal_interrupts.c
+++ b/lib/librte_eal/linux/eal/eal_interrupts.c
@@ -197,6 +197,63 @@ vfio_disable_intx(const struct rte_intr_handle *intr_handle) {
 	return 0;
 }
 
+/* unmask legacy (INTx) interrupts */
+static int
+vfio_unmask_intx(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	/* unmask INTx */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	memset(irq_set, 0, len);
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error unmasking INTx interrupts for fd %d\n",
+			intr_handle->fd);
+		return -1;
+	}
+	return 0;
+}
+
+/* mask legacy (INTx) interrupts */
+static int
+vfio_mask_intx(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	/* mask interrupts */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error masking INTx interrupts for fd %d\n",
+			intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
 /* enable MSI interrupts */
 static int
 vfio_enable_msi(const struct rte_intr_handle *intr_handle) {
@@ -694,6 +751,104 @@ rte_intr_enable(const struct rte_intr_handle *intr_handle)
 }
 
 int
+rte_intr_mask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type){
+	/* Both masking and disabling are same for UIO */
+	case RTE_INTR_HANDLE_UIO:
+		if (uio_intr_disable(intr_handle))
+			return -1;
+		break;
+	case RTE_INTR_HANDLE_UIO_INTX:
+		if (uio_intx_intr_disable(intr_handle))
+			return -1;
+		break;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+		return -1;
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		if (vfio_mask_intx(intr_handle))
+			return -1;
+		break;
+#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
+	case RTE_INTR_HANDLE_VFIO_REQ:
+		return -1;
+#endif
+#endif
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_intr_unmask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type){
+	/* Both unmasking and disabling are same for UIO */
+	case RTE_INTR_HANDLE_UIO:
+		if (uio_intr_enable(intr_handle))
+			return -1;
+		break;
+	case RTE_INTR_HANDLE_UIO_INTX:
+		if (uio_intx_intr_enable(intr_handle))
+			return -1;
+		break;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+		return -1;
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		if (vfio_unmask_intx(intr_handle))
+			return -1;
+		break;
+#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
+	case RTE_INTR_HANDLE_VFIO_REQ:
+		return -1;
+#endif
+#endif
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
 rte_intr_disable(const struct rte_intr_handle *intr_handle)
 {
 	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
-- 
2.8.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler
  2019-07-16  6:47   ` Jerin Jacob Kollanukkaran
@ 2019-07-16  7:49     ` Hyong Youb Kim (hyonkim)
  2019-07-16  9:56       ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 25+ messages in thread
From: Hyong Youb Kim (hyonkim) @ 2019-07-16  7:49 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, David Marchand, Thomas Monjalon,
	Ferruh Yigit, Alejandro Lucero, Anatoly Burakov
  Cc: dev, John Daley (johndale), Shahed Shaikh, Nithin Kumar Dabilpuram

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
 [...]
> > > > A rough patch for the approach mentioned earlier. It is only for
> > discussion.
> > > > http://mails.dpdk.org/archives/dev/2019-July/138113.html
> > > >
> > > > To try this out, first revert the following then apply.
> > > > commit 89aac60e0be9 ("vfio: fix interrupts race condition")
> > >
> > > Yes. This patch has to be to reverted. It changes the existing
> > > interrupt behavior and does not address the MSIX case as well.
> > >
> > > I think, The clean fix would be to introduce rte_intr_mask() and
> > > rte_intr_unmask() by abstracting the INTX and MSIX differences And let
> > > qede driver call it as needed.
> > >
> > > Thoughts?
> >
> > Hi,
> 
> Hi Hyong,
> 
> >
> > You are proposing these?
> > - Add rte_intr_mask_intx, rte_intr_unmask_intx.
> >   No APIs for masking MSI/MSI-X as vfio-pci does not support that.
> > - Modify PMD irq handlers to use rte_intr_unmask_intx as necessary.
> 
> No, introduce the rte_intr_mask() and rte_intr_unmask().
> For MSIX + Linux VFIO, That API can return -ENOSUP as Linux VFIO+MSIX is
> not supporting.
> Another platform/eal may support it.
> 

These generic names would invite people to use API, only to see it
fail, since it only works with INTx..

> Mask and unmask is operation is known to all IRQ controllers.
> So, IMO, As far as abstraction is concerned it will be good fit.
> 
> > That might be too intrusive. And too much work for the sake of INTx..
> > Anyone really using/needing INTx these days? :-)
> 
> Yup. Mask needs to called only for only qede INTx. Looks like qede
> Has MSIX and INTX separate handler. So this mask can go to qede INTx
> 
> >
> > The following drivers call rte_intr_enable from their irq handlers. So with
> > explicit rte_intr_unmask_intx, all these would need to do "if using intx,
> > unmask"?
> >
> > atlantic, avp, axgbe, bnx2x, e1000, fm10k, ice, ixgbe, nfp, qede, sfc,
> > vmxnet3
> 
> No change on these PMDs.
> 

Why is that?

These drivers potentially have the same "lost" interrupt issue
mentioned in the original redhat bz (qede + MSI). I *think* this
observation led David to address them all through vfio changes, rather
than fixing qede alone.

You want to introduce unmask API and use it only for qede in this
cycle, and ask respective maintainers to fix their drivers in 19.11?

> > And nfp seems to rely on rte_intr_enable to re-install irq handler to
> unmask
> > a vector in MSI-X Table?
> >
> >         if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
> >                 /* If MSI-X auto-masking is used, clear the entry */
> >                 rte_wmb();
> >                 rte_intr_enable(&pci_dev->intr_handle);
> >
> > With David's patch and mine, this handler would have to first
> > rte_intr_disable() and then enable, if such unmasking is really necessary..
> >
> > As for the semantics of rte_intr_enable/disable, I am ok as is.
> > - "enable": put things in a state where NIC can send an interrupt, and
> >   PMD/app gets a callback.
> >   Whether this involves unmasking for INTx is hidden.
> > - "disable": put things in a state where NIC cannot send an interrupt.
> 
> It looks OK to me. My only thought was, Since mask and unmask
> is a common irq controller operation. We may not need to add
> A lot of common code(Introducing a state) to hide unmask INTx.
> More over as you said, There is may only handful of devices uses INTX.
> 
> IMO, mask and unmask API is good fit as eal abstraction.
> But Using a separate API or hide inside eal to solve this problem is good
> question.
> May be more thoughts from another quys will be good.
> 
> We will try to send a version with mask/unmask API to see the changes
> required.
> 
> >
> > Regardless of vfio changes, we should probably remove rte_intr_enable
> > from qede_interrupt_handler (the MSI/MSI-X interrupt handler), to make
> > usage/intention clear..
> 
> Yes. Anyway this change is required.
> 

That change fixes the immediate problem (redhat bz) that started all
this discussion. And allow us to kick the can down the road, for
potential issues in other PMDs :-) Not suggesting we do, but it
becomes an option..

Thanks.
-Hyong

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler
  2019-07-16  7:49     ` Hyong Youb Kim (hyonkim)
@ 2019-07-16  9:56       ` Jerin Jacob Kollanukkaran
  0 siblings, 0 replies; 25+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-16  9:56 UTC (permalink / raw)
  To: Hyong Youb Kim (hyonkim),
	David Marchand, Thomas Monjalon, Ferruh Yigit, Alejandro Lucero,
	Anatoly Burakov
  Cc: dev, John Daley (johndale), Shahed Shaikh, Nithin Kumar Dabilpuram

> -----Original Message-----
> From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> Sent: Tuesday, July 16, 2019 1:19 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Alejandro
> Lucero <alejandro.lucero@netronome.com>; Anatoly Burakov
> <anatoly.burakov@intel.com>
> Cc: dev@dpdk.org; John Daley (johndale) <johndale@cisco.com>; Shahed
> Shaikh <shshaikh@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>
> Subject: RE: [RFC PATCH] vfio: avoid re-installing irq handler
> 
> > -----Original Message-----
> > From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
>  [...]
> > > > > A rough patch for the approach mentioned earlier. It is only for
> > > discussion.
> > > > > http://mails.dpdk.org/archives/dev/2019-July/138113.html
> > > > >
> > > > > To try this out, first revert the following then apply.
> > > > > commit 89aac60e0be9 ("vfio: fix interrupts race condition")
> > > >
> > > > Yes. This patch has to be to reverted. It changes the existing
> > > > interrupt behavior and does not address the MSIX case as well.
> > > >
> > > > I think, The clean fix would be to introduce rte_intr_mask() and
> > > > rte_intr_unmask() by abstracting the INTX and MSIX differences And
> > > > let qede driver call it as needed.
> > > >
> > > > Thoughts?
> > >
> > > Hi,
> >
> > Hi Hyong,
> >
> > >
> > > You are proposing these?
> > > - Add rte_intr_mask_intx, rte_intr_unmask_intx.
> > >   No APIs for masking MSI/MSI-X as vfio-pci does not support that.
> > > - Modify PMD irq handlers to use rte_intr_unmask_intx as necessary.
> >
> > No, introduce the rte_intr_mask() and rte_intr_unmask().
> > For MSIX + Linux VFIO, That API can return -ENOSUP as Linux VFIO+MSIX
> > is not supporting.
> > Another platform/eal may support it.
> >
> 
> These generic names would invite people to use API, only to see it fail, since
> it only works with INTx..

It works for all non VFIO MSIx case. VFIO MSIx case it NOP(Yes, No need to return error in that case)


> 
> > Mask and unmask is operation is known to all IRQ controllers.
> > So, IMO, As far as abstraction is concerned it will be good fit.
> >
> > > That might be too intrusive. And too much work for the sake of INTx..
> > > Anyone really using/needing INTx these days? :-)
> >
> > Yup. Mask needs to called only for only qede INTx. Looks like qede Has
> > MSIX and INTX separate handler. So this mask can go to qede INTx
> >
> > >
> > > The following drivers call rte_intr_enable from their irq handlers.
> > > So with explicit rte_intr_unmask_intx, all these would need to do
> > > "if using intx, unmask"?
> > >
> > > atlantic, avp, axgbe, bnx2x, e1000, fm10k, ice, ixgbe, nfp, qede,
> > > sfc,
> > > vmxnet3
> >
> > No change on these PMDs.
> >
> 
> Why is that?
> 
> These drivers potentially have the same "lost" interrupt issue mentioned in
> the original redhat bz (qede + MSI). I *think* this observation led David to
> address them all through vfio changes, rather than fixing qede alone.
> 
> You want to introduce unmask API and use it only for qede in this cycle, and
> ask respective maintainers to fix their drivers in 19.11?

Changing the rte_intr_enable() to rte_intr_unmask() is  trivial on the places
Where existing drivers enable as unmask.

If we understand it correctly:

In case of non VFIO MSIX(INTx) and UIO
-------------------------------------------------
AK1) Kernel receives interrupt
AK2) Kernel _mask_ the interrupt
AK3) Kernel notify the use space

On usersapce:
AU1) Driver specific interrupt handler invoked
AU2) Handle the driver specific interrupt
AU3) Call rte_intr_enable(), it will intern call  VFIO_IRQ_SET_ACTION_UNMASK using VFIO_DEVICE_SET_IRQS to unmask the interrupt.

In case of  VFIO MSIX(INTx)
------------------------------------
BK1) Kernel receives interrupt.
BK2) Kernel notify the use space

On usersapce:
BU1) Driver specific interrupt handler invoked
BU2) Handle the driver specific interrupt
BU3) Call rte_intr_enable(), it will intern call  VFIO_IRQ_SET_ACTION_TRIGGER using VFIO_DEVICE_SET_IRQS to unmask the interrupt.

VFIO_IRQ_SET_ACTION_TRIGGER: is the nasty one, it will free the existing interrupt handler and request new handler etc.
Which is the source of the actual race conditional problem. 

Ideally BU3 can be just NOP. Since we need to keep the same interrupt handler for both UIO and MSIX(I *think*)
DPDK tends to use rte_intr_enable() which can work for AU3/BU3 as well.

So we need,
A light weight primitive, which unmask the AK2 incase of VFIO INTx by not overriding 
The meaning of normal rte_intr_enable() which suppose not use for MSIX interrupt in action due to racy behavior of VFIO_IRQ_SET_ACTION_TRIGGER

Replacing AU3 and BU3 as rte_intr_unmask() would fix problem. Where rte_intr_unmask() for BU3 is just NOP.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe
  2019-07-15 16:50 [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler Jerin Jacob Kollanukkaran
                   ` (2 preceding siblings ...)
  2019-07-16  7:01 ` [dpdk-dev] [RFC PATCH v2] " Nithin Dabilpuram
@ 2019-07-16 16:44 ` Nithin Dabilpuram
  2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs Nithin Dabilpuram
                     ` (2 more replies)
  3 siblings, 3 replies; 25+ messages in thread
From: Nithin Dabilpuram @ 2019-07-16 16:44 UTC (permalink / raw)
  To: Hyong Youb Kim, David Marchand, Thomas Monjalon, Ferruh Yigit,
	Anatoly Burakov
  Cc: jerinj, John Daley, Shahed Shaikh, dev, Nithin Dabilpuram

This reverts commit 89aac60e0be9ed95a87b16e3595f102f9faaffb4.
"vfio: fix interrupts race condition"

The above mentioned commit moves the interrupt's eventfd setup
to probe time but only enables one interrupt for all types of
interrupt handles i.e VFIO_MSI, VFIO_LEGACY, VFIO_MSIX, UIO.
It works fine with default case but breaks below cases specifically
for MSIX based interrupt handles.

* Applications like l3fwd-power that request rxq interrupts
  while ethdev setup.
* Drivers that need > 1 MSIx interrupts to be configured for
  functionality to work.

VFIO PCI for MSIx expects all the possible vectors to be setup up
when using VFIO_IRQ_SET_ACTION_TRIGGER so that they can be
allocated from kernel pci subsystem. Only way to increase the number
of vectors later is first free all by using VFIO_IRQ_SET_DATA_NONE
with action trigger and then enable new vector count.

Above commit changes the behavior of rte_intr_[enable|disable] to
only mask and unmask unlike earlier behavior and thereby
breaking above two scenarios.

Fixes: 89aac60e0be9 ("vfio: fix interrupts race condition")
Cc: david.marchand@redhat.com

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
---
 drivers/bus/pci/linux/pci_vfio.c          |  78 ++++++------
 lib/librte_eal/linux/eal/eal_interrupts.c | 201 +++++++++++++++++++++++-------
 2 files changed, 191 insertions(+), 88 deletions(-)

diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index ee31239..1ceb1c0 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -187,11 +187,8 @@ pci_vfio_set_bus_master(int dev_fd, bool op)
 static int
 pci_vfio_setup_interrupts(struct rte_pci_device *dev, int vfio_dev_fd)
 {
-	char irq_set_buf[sizeof(struct vfio_irq_set) + sizeof(int)];
-	struct vfio_irq_set *irq_set;
-	enum rte_intr_mode intr_mode;
 	int i, ret, intr_idx;
-	int fd;
+	enum rte_intr_mode intr_mode;
 
 	/* default to invalid index */
 	intr_idx = VFIO_PCI_NUM_IRQS;
@@ -223,6 +220,7 @@ pci_vfio_setup_interrupts(struct rte_pci_device *dev, int vfio_dev_fd)
 	/* start from MSI-X interrupt type */
 	for (i = VFIO_PCI_MSIX_IRQ_INDEX; i >= 0; i--) {
 		struct vfio_irq_info irq = { .argsz = sizeof(irq) };
+		int fd = -1;
 
 		/* skip interrupt modes we don't want */
 		if (intr_mode != RTE_INTR_MODE_NONE &&
@@ -238,51 +236,51 @@ pci_vfio_setup_interrupts(struct rte_pci_device *dev, int vfio_dev_fd)
 			return -1;
 		}
 
-		/* found a usable interrupt mode */
-		if ((irq.flags & VFIO_IRQ_INFO_EVENTFD) != 0)
-			break;
-
 		/* if this vector cannot be used with eventfd, fail if we explicitly
 		 * specified interrupt type, otherwise continue */
-		if (intr_mode != RTE_INTR_MODE_NONE) {
-			RTE_LOG(ERR, EAL, "  interrupt vector does not support eventfd!\n");
+		if ((irq.flags & VFIO_IRQ_INFO_EVENTFD) == 0) {
+			if (intr_mode != RTE_INTR_MODE_NONE) {
+				RTE_LOG(ERR, EAL,
+						"  interrupt vector does not support eventfd!\n");
+				return -1;
+			} else
+				continue;
+		}
+
+		/* set up an eventfd for interrupts */
+		fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "  cannot set up eventfd, "
+					"error %i (%s)\n", errno, strerror(errno));
 			return -1;
 		}
-	}
 
-	if (i < 0)
-		return -1;
+		dev->intr_handle.fd = fd;
+		dev->intr_handle.vfio_dev_fd = vfio_dev_fd;
 
-	fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
-	if (fd < 0) {
-		RTE_LOG(ERR, EAL, "  cannot set up eventfd, error %i (%s)\n",
-			errno, strerror(errno));
-		return -1;
-	}
+		switch (i) {
+		case VFIO_PCI_MSIX_IRQ_INDEX:
+			intr_mode = RTE_INTR_MODE_MSIX;
+			dev->intr_handle.type = RTE_INTR_HANDLE_VFIO_MSIX;
+			break;
+		case VFIO_PCI_MSI_IRQ_INDEX:
+			intr_mode = RTE_INTR_MODE_MSI;
+			dev->intr_handle.type = RTE_INTR_HANDLE_VFIO_MSI;
+			break;
+		case VFIO_PCI_INTX_IRQ_INDEX:
+			intr_mode = RTE_INTR_MODE_LEGACY;
+			dev->intr_handle.type = RTE_INTR_HANDLE_VFIO_LEGACY;
+			break;
+		default:
+			RTE_LOG(ERR, EAL, "  unknown interrupt type!\n");
+			return -1;
+		}
 
-	irq_set = (struct vfio_irq_set *)irq_set_buf;
-	irq_set->argsz = sizeof(irq_set_buf);
-	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = i;
-	irq_set->start = 0;
-	irq_set->count = 1;
-	memcpy(&irq_set->data, &fd, sizeof(int));
-	if (ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set) < 0) {
-		RTE_LOG(ERR, EAL, "  error configuring interrupt\n");
-		close(fd);
-		return -1;
+		return 0;
 	}
 
-	dev->intr_handle.fd = fd;
-	dev->intr_handle.vfio_dev_fd = vfio_dev_fd;
-	if (i == VFIO_PCI_MSIX_IRQ_INDEX)
-		dev->intr_handle.type = RTE_INTR_HANDLE_VFIO_MSIX;
-	else if (i == VFIO_PCI_MSI_IRQ_INDEX)
-		dev->intr_handle.type = RTE_INTR_HANDLE_VFIO_MSI;
-	else if (i == VFIO_PCI_INTX_IRQ_INDEX)
-		dev->intr_handle.type = RTE_INTR_HANDLE_VFIO_LEGACY;
-
-	return 0;
+	/* if we're here, we haven't found a suitable interrupt vector */
+	return -1;
 }
 
 #ifdef HAVE_VFIO_DEV_REQ_INTERFACE
diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c b/lib/librte_eal/linux/eal/eal_interrupts.c
index 27976b3..79ad5e8 100644
--- a/lib/librte_eal/linux/eal/eal_interrupts.c
+++ b/lib/librte_eal/linux/eal/eal_interrupts.c
@@ -109,19 +109,42 @@ static pthread_t intr_thread;
 
 /* enable legacy (INTx) interrupts */
 static int
-vfio_enable_intx(const struct rte_intr_handle *intr_handle)
-{
-	struct vfio_irq_set irq_set;
-	int ret;
-
-	/* unmask INTx */
-	irq_set.argsz = sizeof(irq_set);
-	irq_set.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK;
-	irq_set.index = VFIO_PCI_INTX_IRQ_INDEX;
-	irq_set.start = 0;
-	irq_set.count = 1;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+vfio_enable_intx(const struct rte_intr_handle *intr_handle) {
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+	int *fd_ptr;
+
+	len = sizeof(irq_set_buf);
+
+	/* enable INTx */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+	fd_ptr = (int *) &irq_set->data;
+	*fd_ptr = intr_handle->fd;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error enabling INTx interrupts for fd %d\n",
+						intr_handle->fd);
+		return -1;
+	}
+
+	/* unmask INTx after enabling */
+	memset(irq_set, 0, len);
+	len = sizeof(struct vfio_irq_set);
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
 
 	if (ret) {
 		RTE_LOG(ERR, EAL, "Error unmasking INTx interrupts for fd %d\n",
@@ -133,51 +156,128 @@ vfio_enable_intx(const struct rte_intr_handle *intr_handle)
 
 /* disable legacy (INTx) interrupts */
 static int
-vfio_disable_intx(const struct rte_intr_handle *intr_handle)
-{
-	struct vfio_irq_set irq_set;
-	int ret;
+vfio_disable_intx(const struct rte_intr_handle *intr_handle) {
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
 
-	/* mask interrupts */
-	irq_set.argsz = sizeof(irq_set);
-	irq_set.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK;
-	irq_set.index = VFIO_PCI_INTX_IRQ_INDEX;
-	irq_set.start = 0;
-	irq_set.count = 1;
+	len = sizeof(struct vfio_irq_set);
 
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+	/* mask interrupts before disabling */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
 
 	if (ret) {
 		RTE_LOG(ERR, EAL, "Error masking INTx interrupts for fd %d\n",
 						intr_handle->fd);
 		return -1;
 	}
+
+	/* disable INTx*/
+	memset(irq_set, 0, len);
+	irq_set->argsz = len;
+	irq_set->count = 0;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL,
+			"Error disabling INTx interrupts for fd %d\n", intr_handle->fd);
+		return -1;
+	}
 	return 0;
 }
 
+/* enable MSI interrupts */
+static int
+vfio_enable_msi(const struct rte_intr_handle *intr_handle) {
+	int len, ret;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	struct vfio_irq_set *irq_set;
+	int *fd_ptr;
+
+	len = sizeof(irq_set_buf);
+
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
+	irq_set->start = 0;
+	fd_ptr = (int *) &irq_set->data;
+	*fd_ptr = intr_handle->fd;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error enabling MSI interrupts for fd %d\n",
+						intr_handle->fd);
+		return -1;
+	}
+	return 0;
+}
+
+/* disable MSI interrupts */
+static int
+vfio_disable_msi(const struct rte_intr_handle *intr_handle) {
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 0;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret)
+		RTE_LOG(ERR, EAL,
+			"Error disabling MSI interrupts for fd %d\n", intr_handle->fd);
+
+	return ret;
+}
+
 /* enable MSI-X interrupts */
 static int
-vfio_enable_msix(const struct rte_intr_handle *intr_handle)
-{
+vfio_enable_msix(const struct rte_intr_handle *intr_handle) {
+	int len, ret;
 	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	struct vfio_irq_set *irq_set;
-	int len, ret;
-
-	if (intr_handle->nb_efd == 0)
-		return 0;
+	int *fd_ptr;
 
 	len = sizeof(irq_set_buf);
 
 	irq_set = (struct vfio_irq_set *) irq_set_buf;
 	irq_set->argsz = len;
+	/* 0 < irq_set->count < RTE_MAX_RXTX_INTR_VEC_ID + 1 */
+	irq_set->count = intr_handle->max_intr ?
+		(intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID + 1 ?
+		RTE_MAX_RXTX_INTR_VEC_ID + 1 : intr_handle->max_intr) : 1;
 	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
 	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = RTE_INTR_VEC_RXTX_OFFSET;
-	irq_set->count = intr_handle->nb_efd;
-	memcpy(&irq_set->data, intr_handle->efds,
-	       sizeof(*intr_handle->efds) * intr_handle->nb_efd);
+	irq_set->start = 0;
+	fd_ptr = (int *) &irq_set->data;
+	/* INTR vector offset 0 reserve for non-efds mapping */
+	fd_ptr[RTE_INTR_VEC_ZERO_OFFSET] = intr_handle->fd;
+	memcpy(&fd_ptr[RTE_INTR_VEC_RXTX_OFFSET], intr_handle->efds,
+		sizeof(*intr_handle->efds) * intr_handle->nb_efd);
 
 	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
 	if (ret) {
 		RTE_LOG(ERR, EAL, "Error enabling MSI-X interrupts for fd %d\n",
 						intr_handle->fd);
@@ -189,21 +289,22 @@ vfio_enable_msix(const struct rte_intr_handle *intr_handle)
 
 /* disable MSI-X interrupts */
 static int
-vfio_disable_msix(const struct rte_intr_handle *intr_handle)
-{
-	struct vfio_irq_set irq_set;
-	int ret;
+vfio_disable_msix(const struct rte_intr_handle *intr_handle) {
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
+	int len, ret;
 
-	if (intr_handle->nb_efd == 0)
-		return 0;
+	len = sizeof(struct vfio_irq_set);
 
-	irq_set.argsz = sizeof(irq_set);
-	irq_set.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set.index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set.start = RTE_INTR_VEC_RXTX_OFFSET;
-	irq_set.count = intr_handle->nb_efd;
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 0;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
 
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 	if (ret)
 		RTE_LOG(ERR, EAL,
 			"Error disabling MSI-X interrupts for fd %d\n", intr_handle->fd);
@@ -564,7 +665,9 @@ rte_intr_enable(const struct rte_intr_handle *intr_handle)
 			return -1;
 		break;
 	case RTE_INTR_HANDLE_VFIO_MSI:
-		return 0;
+		if (vfio_enable_msi(intr_handle))
+			return -1;
+		break;
 	case RTE_INTR_HANDLE_VFIO_LEGACY:
 		if (vfio_enable_intx(intr_handle))
 			return -1;
@@ -618,7 +721,9 @@ rte_intr_disable(const struct rte_intr_handle *intr_handle)
 			return -1;
 		break;
 	case RTE_INTR_HANDLE_VFIO_MSI:
-		return 0;
+		if (vfio_disable_msi(intr_handle))
+			return -1;
+		break;
 	case RTE_INTR_HANDLE_VFIO_LEGACY:
 		if (vfio_disable_intx(intr_handle))
 			return -1;
-- 
2.8.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-16 16:44 ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Nithin Dabilpuram
@ 2019-07-16 16:44   ` Nithin Dabilpuram
  2019-07-17  5:55     ` Hyong Youb Kim (hyonkim)
  2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt handlers Nithin Dabilpuram
  2019-07-16 20:06   ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Stephen Hemminger
  2 siblings, 1 reply; 25+ messages in thread
From: Nithin Dabilpuram @ 2019-07-16 16:44 UTC (permalink / raw)
  To: Hyong Youb Kim, David Marchand, Thomas Monjalon, Ferruh Yigit,
	Bruce Richardson
  Cc: jerinj, John Daley, Shahed Shaikh, dev, Nithin Dabilpuram

Add new mask and unmask interrupt APIs to avoid using
VFIO_IRQ_SET_ACTION_TRIGGER for masking/unmasking purpose for VFIO
based handlers. This implementation is specific to Linux.

Using action trigger for masking and unmasking has below issues

 * Time consuming to do for every interrupt received as it will
   free_irq() followed by request_irq() and all other initializations
 * A race condition because of a window between free_irq() and
   request_irq() with packet reception still on and device still
   enabled and would throw warning messages like below.
   [158764.159833] do_IRQ: 9.34 No irq handler for vector

In this patch, mask/unmask is a no-op for VFIO_MSIX/VFIO_MSI interrupts
as they are edge triggered and kernel would not mask the interrupt before
delivering the event to userspace.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
---
v3:
* Re-org patch to move driver change to 3/3
* Add stub implementation for freebsd.
* Fix version map file for new apis.
v2:
Make change in qede driver for unmask
 lib/librte_eal/common/include/rte_interrupts.h |  23 ++++
 lib/librte_eal/freebsd/eal/eal_interrupts.c    |  54 +++++++++
 lib/librte_eal/linux/eal/eal_interrupts.c      | 155 +++++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map             |   2 +
 4 files changed, 234 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_interrupts.h b/lib/librte_eal/common/include/rte_interrupts.h
index c1e912c..b0675be 100644
--- a/lib/librte_eal/common/include/rte_interrupts.h
+++ b/lib/librte_eal/common/include/rte_interrupts.h
@@ -118,6 +118,29 @@ int rte_intr_enable(const struct rte_intr_handle *intr_handle);
  */
 int rte_intr_disable(const struct rte_intr_handle *intr_handle);
 
+/**
+ * It masks the interrupt for the specified handle.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_intr_mask(const struct rte_intr_handle *intr_handle);
+
+/**
+ * It unmasks the interrupt for the specified handle.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_intr_unmask(const struct rte_intr_handle *intr_handle);
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/freebsd/eal/eal_interrupts.c b/lib/librte_eal/freebsd/eal/eal_interrupts.c
index 10375bd..c2f487a 100644
--- a/lib/librte_eal/freebsd/eal/eal_interrupts.c
+++ b/lib/librte_eal/freebsd/eal/eal_interrupts.c
@@ -387,6 +387,60 @@ rte_intr_disable(const struct rte_intr_handle *intr_handle)
 	return 0;
 }
 
+int
+rte_intr_unmask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type) {
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_intr_mask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type) {
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
 static void
 eal_intr_process_interrupts(struct kevent *events, int nfds)
 {
diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c b/lib/librte_eal/linux/eal/eal_interrupts.c
index 79ad5e8..f619981 100644
--- a/lib/librte_eal/linux/eal/eal_interrupts.c
+++ b/lib/librte_eal/linux/eal/eal_interrupts.c
@@ -197,6 +197,63 @@ vfio_disable_intx(const struct rte_intr_handle *intr_handle) {
 	return 0;
 }
 
+/* unmask legacy (INTx) interrupts */
+static int
+vfio_unmask_intx(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	/* unmask INTx */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	memset(irq_set, 0, len);
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error unmasking INTx interrupts for fd %d\n",
+			intr_handle->fd);
+		return -1;
+	}
+	return 0;
+}
+
+/* mask legacy (INTx) interrupts */
+static int
+vfio_mask_intx(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	/* mask interrupts */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error masking INTx interrupts for fd %d\n",
+			intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
 /* enable MSI interrupts */
 static int
 vfio_enable_msi(const struct rte_intr_handle *intr_handle) {
@@ -694,6 +751,104 @@ rte_intr_enable(const struct rte_intr_handle *intr_handle)
 }
 
 int
+rte_intr_mask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type){
+	/* Both masking and disabling are same for UIO */
+	case RTE_INTR_HANDLE_UIO:
+		if (uio_intr_disable(intr_handle))
+			return -1;
+		break;
+	case RTE_INTR_HANDLE_UIO_INTX:
+		if (uio_intx_intr_disable(intr_handle))
+			return -1;
+		break;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+		return 0;
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		if (vfio_mask_intx(intr_handle))
+			return -1;
+		break;
+#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
+	case RTE_INTR_HANDLE_VFIO_REQ:
+		return -1;
+#endif
+#endif
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_intr_unmask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type){
+	/* Both unmasking and disabling are same for UIO */
+	case RTE_INTR_HANDLE_UIO:
+		if (uio_intr_enable(intr_handle))
+			return -1;
+		break;
+	case RTE_INTR_HANDLE_UIO_INTX:
+		if (uio_intx_intr_enable(intr_handle))
+			return -1;
+		break;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+		return 0;
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		if (vfio_unmask_intx(intr_handle))
+			return -1;
+		break;
+#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
+	case RTE_INTR_HANDLE_VFIO_REQ:
+		return -1;
+#endif
+#endif
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
 rte_intr_disable(const struct rte_intr_handle *intr_handle)
 {
 	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 1892d9e..b3b2df4 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -309,6 +309,8 @@ DPDK_19.08 {
 	rte_service_lcore_attr_reset_all;
 	rte_service_may_be_active;
 	rte_srand;
+	rte_intr_unmask;
+	rte_intr_mask;
 
 } DPDK_19.05;
 
-- 
2.8.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt handlers
  2019-07-16 16:44 ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Nithin Dabilpuram
  2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs Nithin Dabilpuram
@ 2019-07-16 16:44   ` Nithin Dabilpuram
  2019-07-17  6:01     ` Hyong Youb Kim (hyonkim)
  2019-07-16 20:06   ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Stephen Hemminger
  2 siblings, 1 reply; 25+ messages in thread
From: Nithin Dabilpuram @ 2019-07-16 16:44 UTC (permalink / raw)
  To: Hyong Youb Kim, David Marchand, Thomas Monjalon, Ferruh Yigit,
	Igor Russkikh, Pavel Belous, Allain Legacy, Matt Peters,
	Ravi Kumar, Rasesh Mody, Shahed Shaikh, Wenzhuo Lu, Qi Zhang,
	Xiao Wang, Beilei Xing, Jingjing Wu, Qiming Yang,
	Konstantin Ananyev, Alejandro Lucero, Andrew Rybchenko,
	Maxime Coquelin, Tiwei Bie, Zhihong Wang, Yong Wang
  Cc: jerinj, John Daley, dev, Nithin Dabilpuram

Replace rte_intr_enable() with rte_intr_unmask() API
for unmasking in interrupt handlers and rx_queue_intr_enable()
in callbacks of PMD's whose original intent was to unmask interrupts
after handling is completed if device is backed by UIO, IGB_UIO or
VFIO(with INTx).

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
---

v3:
* Change all PMD's that use rte_intr_enable() in
  rx_queue_intr_enable() or in irq handler to use
  new unmask api.
  
 drivers/net/atlantic/atl_ethdev.c    |  2 +-
 drivers/net/avp/avp_ethdev.c         |  2 +-
 drivers/net/axgbe/axgbe_ethdev.c     |  4 ++--
 drivers/net/bnx2x/bnx2x_ethdev.c     |  2 +-
 drivers/net/e1000/em_ethdev.c        |  4 ++--
 drivers/net/e1000/igb_ethdev.c       |  6 +++---
 drivers/net/fm10k/fm10k_ethdev.c     |  6 +++---
 drivers/net/i40e/i40e_ethdev.c       |  2 +-
 drivers/net/iavf/iavf_ethdev.c       |  2 +-
 drivers/net/ice/ice_ethdev.c         |  4 ++--
 drivers/net/ixgbe/ixgbe_ethdev.c     |  6 +++---
 drivers/net/nfp/nfp_net.c            |  2 +-
 drivers/net/qede/qede_ethdev.c       |  8 ++++----
 drivers/net/sfc/sfc_intr.c           |  4 ++--
 drivers/net/virtio/virtio_ethdev.c   | 16 +++++++++++++++-
 drivers/net/vmxnet3/vmxnet3_ethdev.c |  2 +-
 16 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/drivers/net/atlantic/atl_ethdev.c b/drivers/net/atlantic/atl_ethdev.c
index fdc0a7f..7d7cae1 100644
--- a/drivers/net/atlantic/atl_ethdev.c
+++ b/drivers/net/atlantic/atl_ethdev.c
@@ -1394,7 +1394,7 @@ atl_dev_interrupt_action(struct rte_eth_dev *dev,
 	}
 done:
 	atl_enable_intr(dev);
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 
 	return 0;
 }
diff --git a/drivers/net/avp/avp_ethdev.c b/drivers/net/avp/avp_ethdev.c
index 47b96ec..268316a 100644
--- a/drivers/net/avp/avp_ethdev.c
+++ b/drivers/net/avp/avp_ethdev.c
@@ -713,7 +713,7 @@ avp_dev_interrupt_handler(void *data)
 			    status);
 
 	/* re-enable UIO interrupt handling */
-	ret = rte_intr_enable(&pci_dev->intr_handle);
+	ret = rte_intr_unmask(&pci_dev->intr_handle);
 	if (ret < 0) {
 		PMD_DRV_LOG(ERR, "Failed to re-enable UIO interrupts, ret=%d\n",
 			    ret);
diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index cfb1720..e25e323 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -136,8 +136,8 @@ axgbe_dev_interrupt_handler(void *param)
 					   DMA_CH_SR, dma_ch_isr);
 		}
 	}
-	/* Enable interrupts since disabled after generation*/
-	rte_intr_enable(&pdata->pci_dev->intr_handle);
+	/* Unmask interrupts since disabled after generation*/
+	rte_intr_unmask(&pdata->pci_dev->intr_handle);
 }
 
 /*
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 10b4fdb..7ce6800 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -133,7 +133,7 @@ bnx2x_interrupt_handler(void *param)
 	PMD_DEBUG_PERIODIC_LOG(INFO, sc, "Interrupt handled");
 
 	bnx2x_interrupt_action(dev, 1);
-	rte_intr_enable(&sc->pci_dev->intr_handle);
+	rte_intr_unmask(&sc->pci_dev->intr_handle);
 }
 
 static void bnx2x_periodic_start(void *param)
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index dc88661..7bff3d2 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -1001,7 +1001,7 @@ eth_em_rx_queue_intr_enable(struct rte_eth_dev *dev, __rte_unused uint16_t queue
 	struct rte_intr_handle *intr_handle = &pci_dev->intr_handle;
 
 	em_rxq_intr_enable(hw);
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 
 	return 0;
 }
@@ -1568,7 +1568,7 @@ eth_em_interrupt_action(struct rte_eth_dev *dev,
 		return -1;
 
 	intr->flags &= ~E1000_FLAG_NEED_LINK_UPDATE;
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 
 	/* set get_link_status to check register later */
 	hw->mac.get_link_status = 1;
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 3ee28cf..b8ac00d 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -2876,7 +2876,7 @@ eth_igb_interrupt_action(struct rte_eth_dev *dev,
 	}
 
 	igb_intr_enable(dev);
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 
 	if (intr->flags & E1000_FLAG_NEED_LINK_UPDATE) {
 		intr->flags &= ~E1000_FLAG_NEED_LINK_UPDATE;
@@ -2987,7 +2987,7 @@ eth_igbvf_interrupt_action(struct rte_eth_dev *dev, struct rte_intr_handle *intr
 	}
 
 	igbvf_intr_enable(dev);
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 
 	return 0;
 }
@@ -5500,7 +5500,7 @@ eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
 	E1000_WRITE_FLUSH(hw);
 
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 
 	return 0;
 }
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index a1e3836..4b30c52 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -2381,7 +2381,7 @@ fm10k_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 	else
 		FM10K_WRITE_REG(hw, FM10K_VFITR(Q2V(pdev, queue_id)),
 			FM10K_ITR_AUTOMASK | FM10K_ITR_MASK_CLEAR);
-	rte_intr_enable(&pdev->intr_handle);
+	rte_intr_unmask(&pdev->intr_handle);
 	return 0;
 }
 
@@ -2680,7 +2680,7 @@ fm10k_dev_interrupt_handler_pf(void *param)
 	FM10K_WRITE_REG(hw, FM10K_ITR(0), FM10K_ITR_AUTOMASK |
 					FM10K_ITR_MASK_CLEAR);
 	/* Re-enable interrupt from host side */
-	rte_intr_enable(dev->intr_handle);
+	rte_intr_unmask(dev->intr_handle);
 }
 
 /**
@@ -2760,7 +2760,7 @@ fm10k_dev_interrupt_handler_vf(void *param)
 	FM10K_WRITE_REG(hw, FM10K_VFITR(0), FM10K_ITR_AUTOMASK |
 					FM10K_ITR_MASK_CLEAR);
 	/* Re-enable interrupt from host side */
-	rte_intr_enable(dev->intr_handle);
+	rte_intr_unmask(dev->intr_handle);
 }
 
 /* Mailbox message handler in VF */
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 2b9fc45..14af840 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -11646,7 +11646,7 @@ i40e_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 			       I40E_PFINT_DYN_CTLN_ITR_INDX_MASK);
 
 	I40E_WRITE_FLUSH(hw);
-	rte_intr_enable(&pci_dev->intr_handle);
+	rte_intr_unmask(&pci_dev->intr_handle);
 
 	return 0;
 }
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 53dc05c..fe97a26 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -1098,7 +1098,7 @@ iavf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 
 	IAVF_WRITE_FLUSH(hw);
 
-	rte_intr_enable(&pci_dev->intr_handle);
+	rte_intr_unmask(&pci_dev->intr_handle);
 
 	return 0;
 }
diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index 9ce730c..c070f20 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -1118,7 +1118,7 @@ ice_interrupt_handler(void *param)
 done:
 	/* Enable interrupt */
 	ice_pf_enable_irq0(hw);
-	rte_intr_enable(dev->intr_handle);
+	rte_intr_unmask(dev->intr_handle);
 }
 
 /*  Initialize SW parameters of PF */
@@ -3002,7 +3002,7 @@ static int ice_rx_queue_intr_enable(struct rte_eth_dev *dev,
 	val &= ~GLINT_DYN_CTL_WB_ON_ITR_M;
 
 	ICE_WRITE_REG(hw, GLINT_DYN_CTL(msix_intr), val);
-	rte_intr_enable(&pci_dev->intr_handle);
+	rte_intr_unmask(&pci_dev->intr_handle);
 
 	return 0;
 }
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 22c5b2c..be7472e 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -4502,7 +4502,7 @@ ixgbe_dev_interrupt_delayed_handler(void *param)
 
 	PMD_DRV_LOG(DEBUG, "enable intr in delayed handler S[%08x]", eicr);
 	ixgbe_enable_intr(dev);
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 }
 
 /**
@@ -5763,7 +5763,7 @@ ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 	RTE_SET_USED(queue_id);
 	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, intr->mask);
 
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 
 	return 0;
 }
@@ -5812,7 +5812,7 @@ ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 		mask &= (1 << (queue_id - 32));
 		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
 	}
-	rte_intr_enable(intr_handle);
+	rte_intr_unmask(intr_handle);
 
 	return 0;
 }
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 1a7aa17..5484d31 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1412,7 +1412,7 @@ nfp_net_irq_unmask(struct rte_eth_dev *dev)
 	if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
 		/* If MSI-X auto-masking is used, clear the entry */
 		rte_wmb();
-		rte_intr_enable(&pci_dev->intr_handle);
+		rte_intr_unmask(&pci_dev->intr_handle);
 	} else {
 		/* Make sure all updates are written before un-masking */
 		rte_wmb();
diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
index 82363e6..6648139 100644
--- a/drivers/net/qede/qede_ethdev.c
+++ b/drivers/net/qede/qede_ethdev.c
@@ -248,8 +248,8 @@ qede_interrupt_handler_intx(void *param)
 	if (status & 0x1) {
 		qede_interrupt_action(ECORE_LEADING_HWFN(edev));
 
-		if (rte_intr_enable(eth_dev->intr_handle))
-			DP_ERR(edev, "rte_intr_enable failed\n");
+		if (rte_intr_unmask(eth_dev->intr_handle))
+			DP_ERR(edev, "rte_intr_unmask failed\n");
 	}
 }
 
@@ -261,8 +261,8 @@ qede_interrupt_handler(void *param)
 	struct ecore_dev *edev = &qdev->edev;
 
 	qede_interrupt_action(ECORE_LEADING_HWFN(edev));
-	if (rte_intr_enable(eth_dev->intr_handle))
-		DP_ERR(edev, "rte_intr_enable failed\n");
+	if (rte_intr_unmask(eth_dev->intr_handle))
+		DP_ERR(edev, "rte_intr_unmask failed\n");
 }
 
 static void
diff --git a/drivers/net/sfc/sfc_intr.c b/drivers/net/sfc/sfc_intr.c
index 1f4969b..08f2057 100644
--- a/drivers/net/sfc/sfc_intr.c
+++ b/drivers/net/sfc/sfc_intr.c
@@ -79,7 +79,7 @@ sfc_intr_line_handler(void *cb_arg)
 	if (qmask & (1 << sa->mgmt_evq_index))
 		sfc_intr_handle_mgmt_evq(sa);
 
-	if (rte_intr_enable(&pci_dev->intr_handle) != 0)
+	if (rte_intr_unmask(&pci_dev->intr_handle) != 0)
 		sfc_err(sa, "cannot reenable interrupts");
 
 	sfc_log_init(sa, "done");
@@ -123,7 +123,7 @@ sfc_intr_message_handler(void *cb_arg)
 
 	sfc_intr_handle_mgmt_evq(sa);
 
-	if (rte_intr_enable(&pci_dev->intr_handle) != 0)
+	if (rte_intr_unmask(&pci_dev->intr_handle) != 0)
 		sfc_err(sa, "cannot reenable interrupts");
 
 	sfc_log_init(sa, "done");
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 04aecb7..81e4a48 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1265,6 +1265,20 @@ virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 }
 
 static int
+virtio_intr_unmask(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw = dev->data->dev_private;
+
+	if (rte_intr_unmask(dev->intr_handle) < 0)
+		return -1;
+
+	if (!hw->virtio_user_dev)
+		hw->use_msix = vtpci_msix_detect(RTE_ETH_DEV_TO_PCI(dev));
+
+	return 0;
+}
+
+static int
 virtio_intr_enable(struct rte_eth_dev *dev)
 {
 	struct virtio_hw *hw = dev->data->dev_private;
@@ -1457,7 +1471,7 @@ virtio_interrupt_handler(void *param)
 	isr = vtpci_isr(hw);
 	PMD_DRV_LOG(INFO, "interrupt status = %#x", isr);
 
-	if (virtio_intr_enable(dev) < 0)
+	if (virtio_intr_unmask(dev) < 0)
 		PMD_DRV_LOG(ERR, "interrupt enable failed");
 
 	if (isr & VIRTIO_PCI_ISR_CONFIG) {
diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 2b1e915..641c846 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -1426,7 +1426,7 @@ vmxnet3_interrupt_handler(void *param)
 
 	vmxnet3_process_events(dev);
 
-	if (rte_intr_enable(&pci_dev->intr_handle) < 0)
+	if (rte_intr_unmask(&pci_dev->intr_handle) < 0)
 		PMD_DRV_LOG(ERR, "interrupt enable failed");
 }
 
-- 
2.8.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe
  2019-07-16 16:44 ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Nithin Dabilpuram
  2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs Nithin Dabilpuram
  2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt handlers Nithin Dabilpuram
@ 2019-07-16 20:06   ` Stephen Hemminger
  2 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2019-07-16 20:06 UTC (permalink / raw)
  To: Nithin Dabilpuram
  Cc: Hyong Youb Kim, David Marchand, Thomas Monjalon, Ferruh Yigit,
	Anatoly Burakov, jerinj, John Daley, Shahed Shaikh, dev

On Tue, 16 Jul 2019 22:14:22 +0530
Nithin Dabilpuram <ndabilpuram@marvell.com> wrote:

> This reverts commit 89aac60e0be9ed95a87b16e3595f102f9faaffb4.
> "vfio: fix interrupts race condition"
> 
> The above mentioned commit moves the interrupt's eventfd setup
> to probe time but only enables one interrupt for all types of
> interrupt handles i.e VFIO_MSI, VFIO_LEGACY, VFIO_MSIX, UIO.
> It works fine with default case but breaks below cases specifically
> for MSIX based interrupt handles.
> 
> * Applications like l3fwd-power that request rxq interrupts
>   while ethdev setup.
> * Drivers that need > 1 MSIx interrupts to be configured for
>   functionality to work.
> 
> VFIO PCI for MSIx expects all the possible vectors to be setup up
> when using VFIO_IRQ_SET_ACTION_TRIGGER so that they can be
> allocated from kernel pci subsystem. Only way to increase the number
> of vectors later is first free all by using VFIO_IRQ_SET_DATA_NONE
> with action trigger and then enable new vector count.
> 
> Above commit changes the behavior of rte_intr_[enable|disable] to
> only mask and unmask unlike earlier behavior and thereby
> breaking above two scenarios.
> 
> Fixes: 89aac60e0be9 ("vfio: fix interrupts race condition")
> Cc: david.marchand@redhat.com
> 
> Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>

This fixes the problem I am seeing with MSI-X.
Even with testpmd...

EAL: Error enabling MSI-X interrupts for fd 39

Tested-by: Stephen Hemminger <stephen@networkplumber.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs Nithin Dabilpuram
@ 2019-07-17  5:55     ` Hyong Youb Kim (hyonkim)
  2019-07-17  6:14       ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 25+ messages in thread
From: Hyong Youb Kim (hyonkim) @ 2019-07-17  5:55 UTC (permalink / raw)
  To: Nithin Dabilpuram, David Marchand, Thomas Monjalon, Ferruh Yigit,
	Bruce Richardson
  Cc: jerinj, John Daley (johndale), Shahed Shaikh, dev

> -----Original Message-----
> From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> Sent: Wednesday, July 17, 2019 1:44 AM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: jerinj@marvell.com; John Daley (johndale) <johndale@cisco.com>;
> Shahed Shaikh <shshaikh@marvell.com>; dev@dpdk.org; Nithin Dabilpuram
> <ndabilpuram@marvell.com>
> Subject: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> Add new mask and unmask interrupt APIs to avoid using
> VFIO_IRQ_SET_ACTION_TRIGGER for masking/unmasking purpose for VFIO
> based handlers. This implementation is specific to Linux.
> 
> Using action trigger for masking and unmasking has below issues
> 
>  * Time consuming to do for every interrupt received as it will
>    free_irq() followed by request_irq() and all other initializations
>  * A race condition because of a window between free_irq() and
>    request_irq() with packet reception still on and device still
>    enabled and would throw warning messages like below.
>    [158764.159833] do_IRQ: 9.34 No irq handler for vector
> 
> In this patch, mask/unmask is a no-op for VFIO_MSIX/VFIO_MSI interrupts
> as they are edge triggered and kernel would not mask the interrupt before
> delivering the event to userspace.
> 
> Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> ---
> v3:
> * Re-org patch to move driver change to 3/3
> * Add stub implementation for freebsd.
> * Fix version map file for new apis.
> v2:
> Make change in qede driver for unmask
>  lib/librte_eal/common/include/rte_interrupts.h |  23 ++++
>  lib/librte_eal/freebsd/eal/eal_interrupts.c    |  54 +++++++++
>  lib/librte_eal/linux/eal/eal_interrupts.c      | 155
> +++++++++++++++++++++++++
>  lib/librte_eal/rte_eal_version.map             |   2 +
>  4 files changed, 234 insertions(+)
> 
> diff --git a/lib/librte_eal/common/include/rte_interrupts.h
> b/lib/librte_eal/common/include/rte_interrupts.h
> index c1e912c..b0675be 100644
> --- a/lib/librte_eal/common/include/rte_interrupts.h
> +++ b/lib/librte_eal/common/include/rte_interrupts.h
> @@ -118,6 +118,29 @@ int rte_intr_enable(const struct rte_intr_handle
> *intr_handle);
>   */
>  int rte_intr_disable(const struct rte_intr_handle *intr_handle);
> 
> +/**
> + * It masks the interrupt for the specified handle.
> + *
> + * @param intr_handle
> + *  pointer to the interrupt handle.
> + *
> + * @return
> + *  - On success, zero.
> + *  - On failure, a negative value.
> + */
> +int rte_intr_mask(const struct rte_intr_handle *intr_handle);
> +
> +/**
> + * It unmasks the interrupt for the specified handle.
> + *
> + * @param intr_handle
> + *  pointer to the interrupt handle.
> + *
> + * @return
> + *  - On success, zero.
> + *  - On failure, a negative value.
> + */
> +int rte_intr_unmask(const struct rte_intr_handle *intr_handle);
>  #ifdef __cplusplus
>  }
>  #endif
[...]
> diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c
> b/lib/librte_eal/linux/eal/eal_interrupts.c
> index 79ad5e8..f619981 100644
> --- a/lib/librte_eal/linux/eal/eal_interrupts.c
> +++ b/lib/librte_eal/linux/eal/eal_interrupts.c
[...]
>  int
> +rte_intr_mask(const struct rte_intr_handle *intr_handle)
> +{
> +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> +		return 0;
> +
> +	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
> +		return -1;
> +
> +	switch (intr_handle->type){
> +	/* Both masking and disabling are same for UIO */
> +	case RTE_INTR_HANDLE_UIO:
> +		if (uio_intr_disable(intr_handle))
> +			return -1;
> +		break;
> +	case RTE_INTR_HANDLE_UIO_INTX:
> +		if (uio_intx_intr_disable(intr_handle))
> +			return -1;
> +		break;
> +	/* not used at this moment */
> +	case RTE_INTR_HANDLE_ALARM:
> +		return -1;
> +#ifdef VFIO_PRESENT
> +	case RTE_INTR_HANDLE_VFIO_MSIX:
> +	case RTE_INTR_HANDLE_VFIO_MSI:
> +		return 0;

Isn't this a little confusing? It returns success, but irq is not
masked.

As is, return code 0 means...
- igb_uio: irq is masked for INTx, MSI, MSI-X
- vfio-pci + INTx: irq is masked
- vfio-pci + MSI/MSI-X: no changes

Masking is useful only for INTx, IMO...

Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
Table) has no practical use for drivers. Handshaking/masking/unmasking
is done via device/vendor specific ways, as needed. See all those
ack/block/unblock/credit/... mechanisms used in various drivers/NICs
to control interrupts their own way.

A long time ago in early PCIe days, the linux kernel did auto-masking
for MSI/MSI-X (i.e. mask before calling netdev irq handler). It was
soon removed as it was unnecessary overhead (expensive PIOs to NIC for
every interrupt). Windows and FreeBSD do not do auto-masking either.

Thanks.
-Hyong

> +	case RTE_INTR_HANDLE_VFIO_LEGACY:
> +		if (vfio_mask_intx(intr_handle))
> +			return -1;
> +		break;
> +#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
> +	case RTE_INTR_HANDLE_VFIO_REQ:
> +		return -1;
> +#endif
> +#endif
> +	/* not used at this moment */
> +	case RTE_INTR_HANDLE_DEV_EVENT:
> +		return -1;
> +	/* unknown handle type */
> +	default:
> +		RTE_LOG(ERR, EAL,
> +			"Unknown handle type of fd %d\n",
> +					intr_handle->fd);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +int
> +rte_intr_unmask(const struct rte_intr_handle *intr_handle)
> +{
> +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> +		return 0;
> +
> +	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
> +		return -1;
> +
> +	switch (intr_handle->type){
> +	/* Both unmasking and disabling are same for UIO */
> +	case RTE_INTR_HANDLE_UIO:
> +		if (uio_intr_enable(intr_handle))
> +			return -1;
> +		break;
> +	case RTE_INTR_HANDLE_UIO_INTX:
> +		if (uio_intx_intr_enable(intr_handle))
> +			return -1;
> +		break;
> +	/* not used at this moment */
> +	case RTE_INTR_HANDLE_ALARM:
> +		return -1;
> +#ifdef VFIO_PRESENT
> +	case RTE_INTR_HANDLE_VFIO_MSIX:
> +	case RTE_INTR_HANDLE_VFIO_MSI:
> +		return 0;
> +	case RTE_INTR_HANDLE_VFIO_LEGACY:
> +		if (vfio_unmask_intx(intr_handle))
> +			return -1;
> +		break;
> +#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
> +	case RTE_INTR_HANDLE_VFIO_REQ:
> +		return -1;
> +#endif
> +#endif
> +	/* not used at this moment */
> +	case RTE_INTR_HANDLE_DEV_EVENT:
> +		return -1;
> +	/* unknown handle type */
> +	default:
> +		RTE_LOG(ERR, EAL,
> +			"Unknown handle type of fd %d\n",
> +					intr_handle->fd);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +int
>  rte_intr_disable(const struct rte_intr_handle *intr_handle)
>  {
>  	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> diff --git a/lib/librte_eal/rte_eal_version.map
> b/lib/librte_eal/rte_eal_version.map
> index 1892d9e..b3b2df4 100644
> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -309,6 +309,8 @@ DPDK_19.08 {
>  	rte_service_lcore_attr_reset_all;
>  	rte_service_may_be_active;
>  	rte_srand;
> +	rte_intr_unmask;
> +	rte_intr_mask;
> 
>  } DPDK_19.05;
> 
> --
> 2.8.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt handlers
  2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt handlers Nithin Dabilpuram
@ 2019-07-17  6:01     ` Hyong Youb Kim (hyonkim)
  2019-07-17  7:47       ` Nithin Kumar Dabilpuram
  0 siblings, 1 reply; 25+ messages in thread
From: Hyong Youb Kim (hyonkim) @ 2019-07-17  6:01 UTC (permalink / raw)
  To: Nithin Dabilpuram, David Marchand, Thomas Monjalon, Ferruh Yigit,
	Igor Russkikh, Pavel Belous, Allain Legacy, Matt Peters,
	Ravi Kumar, Rasesh Mody, Shahed Shaikh, Wenzhuo Lu, Qi Zhang,
	Xiao Wang, Beilei Xing, Jingjing Wu, Qiming Yang,
	Konstantin Ananyev, Alejandro Lucero, Andrew Rybchenko,
	Maxime Coquelin, Tiwei Bie, Zhihong Wang, Yong Wang
  Cc: jerinj, John Daley (johndale), dev

> -----Original Message-----
> From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> Sent: Wednesday, July 17, 2019 1:44 AM
[...]
> Subject: [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt
> handlers
> 
> Replace rte_intr_enable() with rte_intr_unmask() API
> for unmasking in interrupt handlers and rx_queue_intr_enable()
> in callbacks of PMD's whose original intent was to unmask interrupts
> after handling is completed if device is backed by UIO, IGB_UIO or
> VFIO(with INTx).
> 
> Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> ---
> 
> v3:
> * Change all PMD's that use rte_intr_enable() in
>   rx_queue_intr_enable() or in irq handler to use
>   new unmask api.
> 
>  drivers/net/atlantic/atl_ethdev.c    |  2 +-
>  drivers/net/avp/avp_ethdev.c         |  2 +-
>  drivers/net/axgbe/axgbe_ethdev.c     |  4 ++--
>  drivers/net/bnx2x/bnx2x_ethdev.c     |  2 +-
>  drivers/net/e1000/em_ethdev.c        |  4 ++--
>  drivers/net/e1000/igb_ethdev.c       |  6 +++---
>  drivers/net/fm10k/fm10k_ethdev.c     |  6 +++---
>  drivers/net/i40e/i40e_ethdev.c       |  2 +-
>  drivers/net/iavf/iavf_ethdev.c       |  2 +-
>  drivers/net/ice/ice_ethdev.c         |  4 ++--
>  drivers/net/ixgbe/ixgbe_ethdev.c     |  6 +++---
>  drivers/net/nfp/nfp_net.c            |  2 +-
>  drivers/net/qede/qede_ethdev.c       |  8 ++++----
>  drivers/net/sfc/sfc_intr.c           |  4 ++--
>  drivers/net/virtio/virtio_ethdev.c   | 16 +++++++++++++++-
>  drivers/net/vmxnet3/vmxnet3_ethdev.c |  2 +-
>  16 files changed, 43 insertions(+), 29 deletions(-)
> 
[...]
> diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> index 1a7aa17..5484d31 100644
> --- a/drivers/net/nfp/nfp_net.c
> +++ b/drivers/net/nfp/nfp_net.c
> @@ -1412,7 +1412,7 @@ nfp_net_irq_unmask(struct rte_eth_dev *dev)
>  	if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
>  		/* If MSI-X auto-masking is used, clear the entry */
>  		rte_wmb();
> -		rte_intr_enable(&pci_dev->intr_handle);
> +		rte_intr_unmask(&pci_dev->intr_handle);

I mentioned this part earlier. The comment suggests the driver really
needs to clear Mask bit in MSI-X Table, if MSI-X is used?
rte_intr_unmask() + vfio-pci does not do that. The maintainers should
say something..

[...]
> diff --git a/drivers/net/qede/qede_ethdev.c
> b/drivers/net/qede/qede_ethdev.c
> index 82363e6..6648139 100644
> --- a/drivers/net/qede/qede_ethdev.c
> +++ b/drivers/net/qede/qede_ethdev.c
> @@ -248,8 +248,8 @@ qede_interrupt_handler_intx(void *param)
>  	if (status & 0x1) {
>  		qede_interrupt_action(ECORE_LEADING_HWFN(edev));
> 
> -		if (rte_intr_enable(eth_dev->intr_handle))
> -			DP_ERR(edev, "rte_intr_enable failed\n");
> +		if (rte_intr_unmask(eth_dev->intr_handle))
> +			DP_ERR(edev, "rte_intr_unmask failed\n");
>  	}
>  }
> 
> @@ -261,8 +261,8 @@ qede_interrupt_handler(void *param)
>  	struct ecore_dev *edev = &qdev->edev;
> 
>  	qede_interrupt_action(ECORE_LEADING_HWFN(edev));
> -	if (rte_intr_enable(eth_dev->intr_handle))
> -		DP_ERR(edev, "rte_intr_enable failed\n");
> +	if (rte_intr_unmask(eth_dev->intr_handle))
> +		DP_ERR(edev, "rte_intr_unmask failed\n");
>  }

I thought you were going to remove rte_intr_enable from this handler
altogether.

[...]

Thanks.
-Hyong


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17  5:55     ` Hyong Youb Kim (hyonkim)
@ 2019-07-17  6:14       ` Jerin Jacob Kollanukkaran
  2019-07-17  7:09         ` Hyong Youb Kim (hyonkim)
  0 siblings, 1 reply; 25+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-17  6:14 UTC (permalink / raw)
  To: Hyong Youb Kim (hyonkim),
	Nithin Kumar Dabilpuram, David Marchand, Thomas Monjalon,
	Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> -----Original Message-----
> From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> Sent: Wednesday, July 17, 2019 11:26 AM
> To: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; John Daley (johndale)
> <johndale@cisco.com>; Shahed Shaikh <shshaikh@marvell.com>;
> dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> > +rte_intr_mask(const struct rte_intr_handle *intr_handle) {
> > +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> > +		return 0;
> > +
> > +	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd <
> 0)
> > +		return -1;
> > +
> > +	switch (intr_handle->type){
> > +	/* Both masking and disabling are same for UIO */
> > +	case RTE_INTR_HANDLE_UIO:
> > +		if (uio_intr_disable(intr_handle))
> > +			return -1;
> > +		break;
> > +	case RTE_INTR_HANDLE_UIO_INTX:
> > +		if (uio_intx_intr_disable(intr_handle))
> > +			return -1;
> > +		break;
> > +	/* not used at this moment */
> > +	case RTE_INTR_HANDLE_ALARM:
> > +		return -1;
> > +#ifdef VFIO_PRESENT
> > +	case RTE_INTR_HANDLE_VFIO_MSIX:
> > +	case RTE_INTR_HANDLE_VFIO_MSI:
> > +		return 0;
> 
> Isn't this a little confusing? It returns success, but irq is not masked.

Yes. How about changing the API to rte_intr_ack()(Acknowledge the interrupt)
Or something similar? i.e replace rte_intr_unmask() with rte_intr_ack() for this use
case.
 
> As is, return code 0 means...
> - igb_uio: irq is masked for INTx, MSI, MSI-X
> - vfio-pci + INTx: irq is masked
> - vfio-pci + MSI/MSI-X: no changes
> 
> Masking is useful only for INTx, IMO...
> 
> Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
> Table) has no practical use for drivers. Handshaking/masking/unmasking is
> done via device/vendor specific ways, as needed. See all those
> ack/block/unblock/credit/... mechanisms used in various drivers/NICs to
> control interrupts their own way.
> 
> A long time ago in early PCIe days, the linux kernel did auto-masking for
> MSI/MSI-X (i.e. mask before calling netdev irq handler). It was soon removed
> as it was unnecessary overhead (expensive PIOs to NIC for every interrupt).
> Windows and FreeBSD do not do auto-masking either.

rte_intr_ack() can abstract FreeBSD and Windows difference.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17  6:14       ` Jerin Jacob Kollanukkaran
@ 2019-07-17  7:09         ` Hyong Youb Kim (hyonkim)
  2019-07-17  8:03           ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 25+ messages in thread
From: Hyong Youb Kim (hyonkim) @ 2019-07-17  7:09 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Nithin Kumar Dabilpuram,
	David Marchand, Thomas Monjalon, Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Wednesday, July 17, 2019 3:15 PM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > -----Original Message-----
> > From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> > Sent: Wednesday, July 17, 2019 11:26 AM
> > To: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; David
> Marchand
> > <david.marchand@redhat.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> > Richardson <bruce.richardson@intel.com>
> > Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; John Daley (johndale)
> > <johndale@cisco.com>; Shahed Shaikh <shshaikh@marvell.com>;
> > dev@dpdk.org
> > Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> > > +rte_intr_mask(const struct rte_intr_handle *intr_handle) {
> > > +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> > > +		return 0;
> > > +
> > > +	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd <
> > 0)
> > > +		return -1;
> > > +
> > > +	switch (intr_handle->type){
> > > +	/* Both masking and disabling are same for UIO */
> > > +	case RTE_INTR_HANDLE_UIO:
> > > +		if (uio_intr_disable(intr_handle))
> > > +			return -1;
> > > +		break;
> > > +	case RTE_INTR_HANDLE_UIO_INTX:
> > > +		if (uio_intx_intr_disable(intr_handle))
> > > +			return -1;
> > > +		break;
> > > +	/* not used at this moment */
> > > +	case RTE_INTR_HANDLE_ALARM:
> > > +		return -1;
> > > +#ifdef VFIO_PRESENT
> > > +	case RTE_INTR_HANDLE_VFIO_MSIX:
> > > +	case RTE_INTR_HANDLE_VFIO_MSI:
> > > +		return 0;
> >
> > Isn't this a little confusing? It returns success, but irq is not masked.
> 
> Yes. How about changing the API to rte_intr_ack()(Acknowledge the
> interrupt)
> Or something similar? i.e replace rte_intr_unmask() with rte_intr_ack() for
> this use
> case.
> 

Not sure. I do not have a good suggestion here :-) Like to hear from
David when he comes back, as he spent most time on this issue..

Why not return -1 and let the caller deal with it?

Optimist view:
Maintainers will see the error as vfio-pci + MSI/MSI_X is on
everyone's test list. And it forces them to confront the issue. Do I
really need unmask here, etc.

Pessimist view:
Wastes a lot of people's time. Potentially duplicate code like this
everywhere.

  if (INTx) unmask();

BTW, are you targeting 19.08 or 19.11? Not sure how much change we can
tolerate in 19.08.

Requirements for 19.08 seem to be...
- Must fix the redhat bz (lost interrupt issue with qede + MSI/MSI-X)
- Fix potentially similar issues in other drivers too?

Thanks..
-Hyong

> > As is, return code 0 means...
> > - igb_uio: irq is masked for INTx, MSI, MSI-X
> > - vfio-pci + INTx: irq is masked
> > - vfio-pci + MSI/MSI-X: no changes
> >
> > Masking is useful only for INTx, IMO...
> >
> > Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
> > Table) has no practical use for drivers. Handshaking/masking/unmasking is
> > done via device/vendor specific ways, as needed. See all those
> > ack/block/unblock/credit/... mechanisms used in various drivers/NICs to
> > control interrupts their own way.
> >
> > A long time ago in early PCIe days, the linux kernel did auto-masking for
> > MSI/MSI-X (i.e. mask before calling netdev irq handler). It was soon
> removed
> > as it was unnecessary overhead (expensive PIOs to NIC for every interrupt).
> > Windows and FreeBSD do not do auto-masking either.
> 
> rte_intr_ack() can abstract FreeBSD and Windows difference.
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt handlers
  2019-07-17  6:01     ` Hyong Youb Kim (hyonkim)
@ 2019-07-17  7:47       ` Nithin Kumar Dabilpuram
  0 siblings, 0 replies; 25+ messages in thread
From: Nithin Kumar Dabilpuram @ 2019-07-17  7:47 UTC (permalink / raw)
  To: Hyong Youb Kim (hyonkim)
  Cc: David Marchand, Thomas Monjalon, Ferruh Yigit, Igor Russkikh,
	Pavel Belous, Allain Legacy, Matt Peters, Ravi Kumar,
	Rasesh Mody, Shahed Shaikh, Wenzhuo Lu, Qi Zhang, Xiao Wang,
	Beilei Xing, Jingjing Wu, Qiming Yang, Konstantin Ananyev,
	Alejandro Lucero, Andrew Rybchenko, Maxime Coquelin, Tiwei Bie,
	Zhihong Wang, Yong Wang, Jerin Jacob Kollanukkaran,
	John Daley (johndale),
	dev

On Wed, Jul 17, 2019 at 06:01:47AM +0000, Hyong Youb Kim (hyonkim) wrote:
> > -----Original Message-----
> > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > Sent: Wednesday, July 17, 2019 1:44 AM
> [...]
> > Subject: [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt
> > handlers
> > 
> > Replace rte_intr_enable() with rte_intr_unmask() API
> > for unmasking in interrupt handlers and rx_queue_intr_enable()
> > in callbacks of PMD's whose original intent was to unmask interrupts
> > after handling is completed if device is backed by UIO, IGB_UIO or
> > VFIO(with INTx).
> > 
> > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > ---
> > 
> > v3:
> > * Change all PMD's that use rte_intr_enable() in
> >   rx_queue_intr_enable() or in irq handler to use
> >   new unmask api.
> > 
> >  drivers/net/atlantic/atl_ethdev.c    |  2 +-
> >  drivers/net/avp/avp_ethdev.c         |  2 +-
> >  drivers/net/axgbe/axgbe_ethdev.c     |  4 ++--
> >  drivers/net/bnx2x/bnx2x_ethdev.c     |  2 +-
> >  drivers/net/e1000/em_ethdev.c        |  4 ++--
> >  drivers/net/e1000/igb_ethdev.c       |  6 +++---
> >  drivers/net/fm10k/fm10k_ethdev.c     |  6 +++---
> >  drivers/net/i40e/i40e_ethdev.c       |  2 +-
> >  drivers/net/iavf/iavf_ethdev.c       |  2 +-
> >  drivers/net/ice/ice_ethdev.c         |  4 ++--
> >  drivers/net/ixgbe/ixgbe_ethdev.c     |  6 +++---
> >  drivers/net/nfp/nfp_net.c            |  2 +-
> >  drivers/net/qede/qede_ethdev.c       |  8 ++++----
> >  drivers/net/sfc/sfc_intr.c           |  4 ++--
> >  drivers/net/virtio/virtio_ethdev.c   | 16 +++++++++++++++-
> >  drivers/net/vmxnet3/vmxnet3_ethdev.c |  2 +-
> >  16 files changed, 43 insertions(+), 29 deletions(-)
> > 
> [...]
> > diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> > index 1a7aa17..5484d31 100644
> > --- a/drivers/net/nfp/nfp_net.c
> > +++ b/drivers/net/nfp/nfp_net.c
> > @@ -1412,7 +1412,7 @@ nfp_net_irq_unmask(struct rte_eth_dev *dev)
> >  	if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
> >  		/* If MSI-X auto-masking is used, clear the entry */
> >  		rte_wmb();
> > -		rte_intr_enable(&pci_dev->intr_handle);
> > +		rte_intr_unmask(&pci_dev->intr_handle);
> 
> I mentioned this part earlier. The comment suggests the driver really
> needs to clear Mask bit in MSI-X Table, if MSI-X is used?
> rte_intr_unmask() + vfio-pci does not do that. The maintainers should
> say something..
> 

Ok, if this comment meant that rte_intr_enable() is a must even with
VFIO-MSIX, then I can leave it as it is with rte_intr_unmask().
If it for MSIX interrupts via IGB_UIO, then rte_intr_unmask() will work.


> [...]
> > diff --git a/drivers/net/qede/qede_ethdev.c
> > b/drivers/net/qede/qede_ethdev.c
> > index 82363e6..6648139 100644
> > --- a/drivers/net/qede/qede_ethdev.c
> > +++ b/drivers/net/qede/qede_ethdev.c
> > @@ -248,8 +248,8 @@ qede_interrupt_handler_intx(void *param)
> >  	if (status & 0x1) {
> >  		qede_interrupt_action(ECORE_LEADING_HWFN(edev));
> > 
> > -		if (rte_intr_enable(eth_dev->intr_handle))
> > -			DP_ERR(edev, "rte_intr_enable failed\n");
> > +		if (rte_intr_unmask(eth_dev->intr_handle))
> > +			DP_ERR(edev, "rte_intr_unmask failed\n");
> >  	}
> >  }
> > 
> > @@ -261,8 +261,8 @@ qede_interrupt_handler(void *param)
> >  	struct ecore_dev *edev = &qdev->edev;
> > 
> >  	qede_interrupt_action(ECORE_LEADING_HWFN(edev));
> > -	if (rte_intr_enable(eth_dev->intr_handle))
> > -		DP_ERR(edev, "rte_intr_enable failed\n");
> > +	if (rte_intr_unmask(eth_dev->intr_handle))
> > +		DP_ERR(edev, "rte_intr_unmask failed\n");
> >  }
> 
> I thought you were going to remove rte_intr_enable from this handler
> altogether.
> 

Here again, if it was done for case of IGB_UIO with MSIx, then
I think it is needed. If maintainer confirms that it is not needed in this
i.e qede_interrupt_handler(), then I can remove this unmask.


> [...]
> 
> Thanks.
> -Hyong
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17  7:09         ` Hyong Youb Kim (hyonkim)
@ 2019-07-17  8:03           ` Jerin Jacob Kollanukkaran
  2019-07-17  8:45             ` Hyong Youb Kim (hyonkim)
  0 siblings, 1 reply; 25+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-17  8:03 UTC (permalink / raw)
  To: Hyong Youb Kim (hyonkim),
	Nithin Kumar Dabilpuram, David Marchand, Thomas Monjalon,
	Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> > > -----Original Message-----
> > > From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> > > Sent: Wednesday, July 17, 2019 11:26 AM
> > > To: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; David
> > Marchand
> > > <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>;
> > > Ferruh Yigit <ferruh.yigit@intel.com>; Bruce Richardson
> > > <bruce.richardson@intel.com>
> > > Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; John Daley
> > > (johndale) <johndale@cisco.com>; Shahed Shaikh
> > > <shshaikh@marvell.com>; dev@dpdk.org
> > > Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt
> > > APIs
> > > > +rte_intr_mask(const struct rte_intr_handle *intr_handle) {
> > > > +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> > > > +		return 0;
> > > > +
> > > > +	if (!intr_handle || intr_handle->fd < 0 ||
> > > > +intr_handle->uio_cfg_fd <
> > > 0)
> > > > +		return -1;
> > > > +
> > > > +	switch (intr_handle->type){
> > > > +	/* Both masking and disabling are same for UIO */
> > > > +	case RTE_INTR_HANDLE_UIO:
> > > > +		if (uio_intr_disable(intr_handle))
> > > > +			return -1;
> > > > +		break;
> > > > +	case RTE_INTR_HANDLE_UIO_INTX:
> > > > +		if (uio_intx_intr_disable(intr_handle))
> > > > +			return -1;
> > > > +		break;
> > > > +	/* not used at this moment */
> > > > +	case RTE_INTR_HANDLE_ALARM:
> > > > +		return -1;
> > > > +#ifdef VFIO_PRESENT
> > > > +	case RTE_INTR_HANDLE_VFIO_MSIX:
> > > > +	case RTE_INTR_HANDLE_VFIO_MSI:
> > > > +		return 0;
> > >
> > > Isn't this a little confusing? It returns success, but irq is not masked.
> >
> > Yes. How about changing the API to rte_intr_ack()(Acknowledge the
> > interrupt)
> > Or something similar? i.e replace rte_intr_unmask() with
> > rte_intr_ack() for this use case.
> >
> 
> Not sure. I do not have a good suggestion here :-) Like to hear from
> David when he comes back, as he spent most time on this issue..

Sure. He is on vacation.
Any reason for thinking, rte_intr_ack()  may not be semantically correct?
I think, it is very much correct if there are no better suggestions.
Anyway it the name, From VFIO perspective, we know what is expected so I think it is fine.

> 
> Why not return -1 and let the caller deal with it?

If we make it as rte_intr_ack() no need to return -1 for MSIX-VFIO+Linux
as it is semantically correct.

> 
> Optimist view:
> Maintainers will see the error as vfio-pci + MSI/MSI_X is on
> everyone's test list. And it forces them to confront the issue. Do I
> really need unmask here, etc.

If we make it as ack then it fine as driver does not need to know the fine 
details.

> 
> Pessimist view:
> Wastes a lot of people's time. Potentially duplicate code like this
> everywhere.
> 
>   if (INTx) unmask();
> 
> BTW, are you targeting 19.08 or 19.11? Not sure how much change we can
> tolerate in 19.08.


19.08 as fundamentally it correct. Finer adjustment can made by existing
drivers if required in the testing phase.

It is trivial change as scope is limited to interrupt hander rte_intr_enable()
replacement with rte_intr_ack(). For MSIX case, it should be real NOP,
so I don't think there issue. It should be much better than the existing
state, where almost everything broken.

> Requirements for 19.08 seem to be...
> - Must fix the redhat bz (lost interrupt issue with qede + MSI/MSI-X)
> - Fix potentially similar issues in other drivers too?

Proposed patch will fix the above mentioned issues.

> 
> Thanks..
> -Hyong
> 
> > > As is, return code 0 means...
> > > - igb_uio: irq is masked for INTx, MSI, MSI-X
> > > - vfio-pci + INTx: irq is masked
> > > - vfio-pci + MSI/MSI-X: no changes
> > >
> > > Masking is useful only for INTx, IMO...
> > >
> > > Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
> > > Table) has no practical use for drivers. Handshaking/masking/unmasking
> is
> > > done via device/vendor specific ways, as needed. See all those
> > > ack/block/unblock/credit/... mechanisms used in various drivers/NICs to
> > > control interrupts their own way.
> > >
> > > A long time ago in early PCIe days, the linux kernel did auto-masking for
> > > MSI/MSI-X (i.e. mask before calling netdev irq handler). It was soon
> > removed
> > > as it was unnecessary overhead (expensive PIOs to NIC for every
> interrupt).
> > > Windows and FreeBSD do not do auto-masking either.
> >
> > rte_intr_ack() can abstract FreeBSD and Windows difference.
> >


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17  8:03           ` Jerin Jacob Kollanukkaran
@ 2019-07-17  8:45             ` Hyong Youb Kim (hyonkim)
  2019-07-17  9:20               ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 25+ messages in thread
From: Hyong Youb Kim (hyonkim) @ 2019-07-17  8:45 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Nithin Kumar Dabilpuram,
	David Marchand, Thomas Monjalon, Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Wednesday, July 17, 2019 5:03 PM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > > > -----Original Message-----
> > > > From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> > > > Sent: Wednesday, July 17, 2019 11:26 AM
> > > > To: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; David
> > > Marchand
> > > > <david.marchand@redhat.com>; Thomas Monjalon
> > <thomas@monjalon.net>;
> > > > Ferruh Yigit <ferruh.yigit@intel.com>; Bruce Richardson
> > > > <bruce.richardson@intel.com>
> > > > Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; John Daley
> > > > (johndale) <johndale@cisco.com>; Shahed Shaikh
> > > > <shshaikh@marvell.com>; dev@dpdk.org
> > > > Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt
> > > > APIs
> > > > > +rte_intr_mask(const struct rte_intr_handle *intr_handle) {
> > > > > +	if (intr_handle && intr_handle->type ==
> RTE_INTR_HANDLE_VDEV)
> > > > > +		return 0;
> > > > > +
> > > > > +	if (!intr_handle || intr_handle->fd < 0 ||
> > > > > +intr_handle->uio_cfg_fd <
> > > > 0)
> > > > > +		return -1;
> > > > > +
> > > > > +	switch (intr_handle->type){
> > > > > +	/* Both masking and disabling are same for UIO */
> > > > > +	case RTE_INTR_HANDLE_UIO:
> > > > > +		if (uio_intr_disable(intr_handle))
> > > > > +			return -1;
> > > > > +		break;
> > > > > +	case RTE_INTR_HANDLE_UIO_INTX:
> > > > > +		if (uio_intx_intr_disable(intr_handle))
> > > > > +			return -1;
> > > > > +		break;
> > > > > +	/* not used at this moment */
> > > > > +	case RTE_INTR_HANDLE_ALARM:
> > > > > +		return -1;
> > > > > +#ifdef VFIO_PRESENT
> > > > > +	case RTE_INTR_HANDLE_VFIO_MSIX:
> > > > > +	case RTE_INTR_HANDLE_VFIO_MSI:
> > > > > +		return 0;
> > > >
> > > > Isn't this a little confusing? It returns success, but irq is not masked.
> > >
> > > Yes. How about changing the API to rte_intr_ack()(Acknowledge the
> > > interrupt)
> > > Or something similar? i.e replace rte_intr_unmask() with
> > > rte_intr_ack() for this use case.
> > >
> >
> > Not sure. I do not have a good suggestion here :-) Like to hear from
> > David when he comes back, as he spent most time on this issue..
> 
> Sure. He is on vacation.
> Any reason for thinking, rte_intr_ack()  may not be semantically correct?
> I think, it is very much correct if there are no better suggestions.
> Anyway it the name, From VFIO perspective, we know what is expected so I
> think it is fine.
> 
> >
> > Why not return -1 and let the caller deal with it?
> 
> If we make it as rte_intr_ack() no need to return -1 for MSIX-VFIO+Linux
> as it is semantically correct.
> 

Ack can be ambiguous. For INTx, ack usually means PIO to a NIC
register, saying "I got your interrupt, please de-assert irq".

Besides the name, are we agreeing that we want these?
- Unmask if INTx
- Nothing if MSI/MSI-X

So, really just "unmask if INTx". I am ok with rte_intr_unmask() if we
make this intention clear. rte_unmask_if_intx() looks messy.

Thanks..
-Hyong

> >
> > Optimist view:
> > Maintainers will see the error as vfio-pci + MSI/MSI_X is on
> > everyone's test list. And it forces them to confront the issue. Do I
> > really need unmask here, etc.
> 
> If we make it as ack then it fine as driver does not need to know the fine
> details.
> 
> >
> > Pessimist view:
> > Wastes a lot of people's time. Potentially duplicate code like this
> > everywhere.
> >
> >   if (INTx) unmask();
> >
> > BTW, are you targeting 19.08 or 19.11? Not sure how much change we can
> > tolerate in 19.08.
> 
> 
> 19.08 as fundamentally it correct. Finer adjustment can made by existing
> drivers if required in the testing phase.
> 
> It is trivial change as scope is limited to interrupt hander rte_intr_enable()
> replacement with rte_intr_ack(). For MSIX case, it should be real NOP,
> so I don't think there issue. It should be much better than the existing
> state, where almost everything broken.
> 
> > Requirements for 19.08 seem to be...
> > - Must fix the redhat bz (lost interrupt issue with qede + MSI/MSI-X)
> > - Fix potentially similar issues in other drivers too?
> 
> Proposed patch will fix the above mentioned issues.
> 
> >
> > Thanks..
> > -Hyong
> >
> > > > As is, return code 0 means...
> > > > - igb_uio: irq is masked for INTx, MSI, MSI-X
> > > > - vfio-pci + INTx: irq is masked
> > > > - vfio-pci + MSI/MSI-X: no changes
> > > >
> > > > Masking is useful only for INTx, IMO...
> > > >
> > > > Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
> > > > Table) has no practical use for drivers.
> Handshaking/masking/unmasking
> > is
> > > > done via device/vendor specific ways, as needed. See all those
> > > > ack/block/unblock/credit/... mechanisms used in various drivers/NICs
> to
> > > > control interrupts their own way.
> > > >
> > > > A long time ago in early PCIe days, the linux kernel did auto-masking for
> > > > MSI/MSI-X (i.e. mask before calling netdev irq handler). It was soon
> > > removed
> > > > as it was unnecessary overhead (expensive PIOs to NIC for every
> > interrupt).
> > > > Windows and FreeBSD do not do auto-masking either.
> > >
> > > rte_intr_ack() can abstract FreeBSD and Windows difference.
> > >


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17  8:45             ` Hyong Youb Kim (hyonkim)
@ 2019-07-17  9:20               ` Jerin Jacob Kollanukkaran
  2019-07-17  9:51                 ` Hyong Youb Kim (hyonkim)
  0 siblings, 1 reply; 25+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-17  9:20 UTC (permalink / raw)
  To: Hyong Youb Kim (hyonkim),
	Nithin Kumar Dabilpuram, David Marchand, Thomas Monjalon,
	Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> > > Not sure. I do not have a good suggestion here :-) Like to hear from
> > > David when he comes back, as he spent most time on this issue..
> >
> > Sure. He is on vacation.
> > Any reason for thinking, rte_intr_ack()  may not be semantically correct?
> > I think, it is very much correct if there are no better suggestions.
> > Anyway it the name, From VFIO perspective, we know what is expected so
> > I think it is fine.
> >
> > >
> > > Why not return -1 and let the caller deal with it?
> >
> > If we make it as rte_intr_ack() no need to return -1 for
> > MSIX-VFIO+Linux as it is semantically correct.
> >
> 
> Ack can be ambiguous. For INTx, ack usually means PIO to a NIC register,
> saying "I got your interrupt, please de-assert irq".

I think, it vary from the perspective of IRQ Chip(or controller) vs NIC register(Source) PoV.
Since the API starts from rte_intr_* it is more of controller so _ack_ make sense
Other reason for ack:
1) It will enforce that it needs to be called form ISR
2) It would be have been really correct to unmask if VFIO+MSIx+Linux supports
it
3) if it is ack, no need to add unmask counterpart, the _mask_ API

> 
> Besides the name, are we agreeing that we want these?
> - Unmask if INTx

Yes

> - Nothing if MSI/MSI-X
Yes for MSI over VFIO
No for MSI over UIO/igb_uio

I don't  have very strong opinion unmask vs ack. I prefer to have ack due the reasons stated above.
If you really have strong opinion on using unmask, we will stick with that to make forward progress.
Let us know.


> 
> So, really just "unmask if INTx". I am ok with rte_intr_unmask() if we make
> this intention clear. rte_unmask_if_intx() looks messy.
> 
> Thanks..
> -Hyong

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17  9:20               ` Jerin Jacob Kollanukkaran
@ 2019-07-17  9:51                 ` Hyong Youb Kim (hyonkim)
  2019-07-17 10:43                   ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 25+ messages in thread
From: Hyong Youb Kim (hyonkim) @ 2019-07-17  9:51 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Nithin Kumar Dabilpuram,
	David Marchand, Thomas Monjalon, Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Wednesday, July 17, 2019 6:21 PM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > > > Not sure. I do not have a good suggestion here :-) Like to hear from
> > > > David when he comes back, as he spent most time on this issue..
> > >
> > > Sure. He is on vacation.
> > > Any reason for thinking, rte_intr_ack()  may not be semantically correct?
> > > I think, it is very much correct if there are no better suggestions.
> > > Anyway it the name, From VFIO perspective, we know what is expected
> so
> > > I think it is fine.
> > >
> > > >
> > > > Why not return -1 and let the caller deal with it?
> > >
> > > If we make it as rte_intr_ack() no need to return -1 for
> > > MSIX-VFIO+Linux as it is semantically correct.
> > >
> >
> > Ack can be ambiguous. For INTx, ack usually means PIO to a NIC register,
> > saying "I got your interrupt, please de-assert irq".
> 
> I think, it vary from the perspective of IRQ Chip(or controller) vs NIC
> register(Source) PoV.
> Since the API starts from rte_intr_* it is more of controller so _ack_ make
> sense
> Other reason for ack:
> 1) It will enforce that it needs to be called form ISR
> 2) It would be have been really correct to unmask if VFIO+MSIx+Linux
> supports
> it
> 3) if it is ack, no need to add unmask counterpart, the _mask_ API
> 

Just curious, what you mean by irq controller? Ack/mask/unmask PIOs
all go to the NIC. It is the NIC that asserts/de-asserts irq..

> >
> > Besides the name, are we agreeing that we want these?
> > - Unmask if INTx
> 
> Yes
> 
> > - Nothing if MSI/MSI-X
> Yes for MSI over VFIO
> No for MSI over UIO/igb_uio
> 

I guess I was not clear. For MSI/MSI-X, we do not want to do
mask/unmask regardless of vfio-pci/igb_uio.  Below is my comment about
linux/windows/freebsd from an earlier email. Do you disagree? I am
sure there are plenty of kernel NIC driver guys here. Please correct
me if I am mistaken...

---
Masking is useful only for INTx, IMO...

Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
Table) has no practical use for drivers. Handshaking/masking/unmasking
is done via device/vendor specific ways, as needed. See all those
ack/block/unblock/credit/... mechanisms used in various drivers/NICs
to control interrupts their own way.

A long time ago in early PCIe days, the linux kernel did auto-masking
for MSI/MSI-X (i.e. mask before calling netdev irq handler). It was
soon removed as it was unnecessary overhead (expensive PIOs to NIC for
every interrupt). Windows and FreeBSD do not do auto-masking either.
---

Most drivers have a single irq callback.

handler() {
  do_action()
  rte_intr_umask/ack()
}

Suppose MSI/MSI-X is used (super likely since it is the default).
With igb_uio, rte_intr_umask/ack() will actually do PIO writes to the
NIC to unmask. This is unnecessary overhead.

> I don't  have very strong opinion unmask vs ack. I prefer to have ack due the
> reasons stated above.
> If you really have strong opinion on using unmask, we will stick with that to
> make forward progress.
> Let us know.
> 

I have no strong opinion either.

Thanks..
-Hyong


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17  9:51                 ` Hyong Youb Kim (hyonkim)
@ 2019-07-17 10:43                   ` Jerin Jacob Kollanukkaran
  2019-07-17 11:06                     ` Hyong Youb Kim (hyonkim)
  0 siblings, 1 reply; 25+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-17 10:43 UTC (permalink / raw)
  To: Hyong Youb Kim (hyonkim),
	Nithin Kumar Dabilpuram, David Marchand, Thomas Monjalon,
	Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> > I think, it vary from the perspective of IRQ Chip(or controller) vs
> > NIC
> > register(Source) PoV.
> > Since the API starts from rte_intr_* it is more of controller so _ack_
> > make sense Other reason for ack:
> > 1) It will enforce that it needs to be called form ISR
> > 2) It would be have been really correct to unmask if VFIO+MSIx+Linux
> > supports it
> > 3) if it is ack, no need to add unmask counterpart, the _mask_ API
> >
> 
> Just curious, what you mean by irq controller? Ack/mask/unmask PIOs all go

Programmable Interrupt Controller. Like Intel 8259A, GIC from ARM etc
The drivers in linux/drivers/irqchip/

> to the NIC. It is the NIC that asserts/de-asserts irq..
> 
> > >
> > > Besides the name, are we agreeing that we want these?
> > > - Unmask if INTx
> >
> > Yes
> >
> > > - Nothing if MSI/MSI-X
> > Yes for MSI over VFIO
> > No for MSI over UIO/igb_uio
> >
> 
> I guess I was not clear. For MSI/MSI-X, we do not want to do mask/unmask
> regardless of vfio-pci/igb_uio.  Below is my comment about
> linux/windows/freebsd from an earlier email. Do you disagree? I am sure
> there are plenty of kernel NIC driver guys here. Please correct me if I am
> mistaken...


For some reason, igb_uio kernel driver mask the interrupt for MSIx.
We need to ack or unmask if needs to work with MSIX + IGB_UIO.

See 
pci_uio_alloc_resource()
        if (dev->kdrv == RTE_KDRV_IGB_UIO)
                dev->intr_handle.type = RTE_INTR_HANDLE_UIO; 
        else {
                dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;

igbuio_pci_irqcontrol() for masking in kernel.

So it is more of making inline with igb_uio kernel driver AND not break
The existing drivers which was using rte_intr_enable in ISR with MSIX+IGB_UIO

I do agree with that for edge trigged interrupt mask may not require from kernel.
But I am not sure why it is added in igb_uio kernel driver. May  be it is just legacy.
Anyway this wont change schematics, when igb_uio kenrel fixed then the counter
Part can be changed in rte_intr_ack(). Ie. it is transparent to drivers.

> 
> > I don't  have very strong opinion unmask vs ack. I prefer to have ack
> > due the reasons stated above.
> > If you really have strong opinion on using unmask, we will stick with
> > that to make forward progress.
> > Let us know.
> >
> 
> I have no strong opinion either.

OK. Lets stick with rte_intr_ack().

> 
> Thanks..
> -Hyong


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17 10:43                   ` Jerin Jacob Kollanukkaran
@ 2019-07-17 11:06                     ` Hyong Youb Kim (hyonkim)
  2019-07-17 11:16                       ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 25+ messages in thread
From: Hyong Youb Kim (hyonkim) @ 2019-07-17 11:06 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Nithin Kumar Dabilpuram,
	David Marchand, Thomas Monjalon, Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Wednesday, July 17, 2019 7:44 PM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > > I think, it vary from the perspective of IRQ Chip(or controller) vs
> > > NIC
> > > register(Source) PoV.
> > > Since the API starts from rte_intr_* it is more of controller so _ack_
> > > make sense Other reason for ack:
> > > 1) It will enforce that it needs to be called form ISR
> > > 2) It would be have been really correct to unmask if VFIO+MSIx+Linux
> > > supports it
> > > 3) if it is ack, no need to add unmask counterpart, the _mask_ API
> > >
> >
> > Just curious, what you mean by irq controller? Ack/mask/unmask PIOs all
> go
> 
> Programmable Interrupt Controller. Like Intel 8259A, GIC from ARM etc
> The drivers in linux/drivers/irqchip/
> 
> > to the NIC. It is the NIC that asserts/de-asserts irq..
> >
> > > >
> > > > Besides the name, are we agreeing that we want these?
> > > > - Unmask if INTx
> > >
> > > Yes
> > >
> > > > - Nothing if MSI/MSI-X
> > > Yes for MSI over VFIO
> > > No for MSI over UIO/igb_uio
> > >
> >
> > I guess I was not clear. For MSI/MSI-X, we do not want to do mask/unmask
> > regardless of vfio-pci/igb_uio.  Below is my comment about
> > linux/windows/freebsd from an earlier email. Do you disagree? I am sure
> > there are plenty of kernel NIC driver guys here. Please correct me if I am
> > mistaken...
> 
> 
> For some reason, igb_uio kernel driver mask the interrupt for MSIx.
> We need to ack or unmask if needs to work with MSIX + IGB_UIO.
> 
> See
> pci_uio_alloc_resource()
>         if (dev->kdrv == RTE_KDRV_IGB_UIO)
>                 dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
>         else {
>                 dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
> 
> igbuio_pci_irqcontrol() for masking in kernel.
> 

igb_uio does not auto-mask MSI/MSI-X.

static irqreturn_t
igbuio_pci_irqhandler(int irq, void *dev_id)
{
        struct rte_uio_pci_dev *udev = (struct rte_uio_pci_dev *)dev_id;
        struct uio_info *info = &udev->info;

        /* Legacy mode need to mask in hardware */
        if (udev->mode == RTE_INTR_MODE_LEGACY &&
            !pci_check_and_mask_intx(udev->pdev))
                return IRQ_NONE;

        uio_event_notify(info);

        /* Message signal mode, no share IRQ and automasked */
        return IRQ_HANDLED;
}

Also tested just now with igb_uio. The driver does not need to call
rte_intr_enable(), and it keeps getting interrupts without any issues.

Am I missing something?

-Hyong

> So it is more of making inline with igb_uio kernel driver AND not break
> The existing drivers which was using rte_intr_enable in ISR with
> MSIX+IGB_UIO
> 
> I do agree with that for edge trigged interrupt mask may not require from
> kernel.
> But I am not sure why it is added in igb_uio kernel driver. May  be it is just
> legacy.
> Anyway this wont change schematics, when igb_uio kenrel fixed then the
> counter
> Part can be changed in rte_intr_ack(). Ie. it is transparent to drivers.
> 
> >
> > > I don't  have very strong opinion unmask vs ack. I prefer to have ack
> > > due the reasons stated above.
> > > If you really have strong opinion on using unmask, we will stick with
> > > that to make forward progress.
> > > Let us know.
> > >
> >
> > I have no strong opinion either.
> 
> OK. Lets stick with rte_intr_ack().
> 
> >
> > Thanks..
> > -Hyong


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17 11:06                     ` Hyong Youb Kim (hyonkim)
@ 2019-07-17 11:16                       ` Jerin Jacob Kollanukkaran
  2019-07-17 12:04                         ` Nithin Kumar Dabilpuram
  0 siblings, 1 reply; 25+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-17 11:16 UTC (permalink / raw)
  To: Hyong Youb Kim (hyonkim),
	Nithin Kumar Dabilpuram, David Marchand, Thomas Monjalon,
	Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev

> -----Original Message-----
> From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> Sent: Wednesday, July 17, 2019 4:36 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > -----Original Message-----
> > From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> > Sent: Wednesday, July 17, 2019 7:44 PM
> > To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> > Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> > <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>;
> > Ferruh Yigit <ferruh.yigit@intel.com>; Bruce Richardson
> > <bruce.richardson@intel.com>
> > Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> > <shshaikh@marvell.com>; dev@dpdk.org
> > Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt
> > APIs
> >
> > > > I think, it vary from the perspective of IRQ Chip(or controller)
> > > > vs NIC
> > > > register(Source) PoV.
> > > > Since the API starts from rte_intr_* it is more of controller so
> > > > _ack_ make sense Other reason for ack:
> > > > 1) It will enforce that it needs to be called form ISR
> > > > 2) It would be have been really correct to unmask if
> > > > VFIO+MSIx+Linux supports it
> > > > 3) if it is ack, no need to add unmask counterpart, the _mask_ API
> > > >
> > >
> > > Just curious, what you mean by irq controller? Ack/mask/unmask PIOs
> > > all
> > go
> >
> > Programmable Interrupt Controller. Like Intel 8259A, GIC from ARM etc
> > The drivers in linux/drivers/irqchip/
> >
> > > to the NIC. It is the NIC that asserts/de-asserts irq..
> > >
> > > > >
> > > > > Besides the name, are we agreeing that we want these?
> > > > > - Unmask if INTx
> > > >
> > > > Yes
> > > >
> > > > > - Nothing if MSI/MSI-X
> > > > Yes for MSI over VFIO
> > > > No for MSI over UIO/igb_uio
> > > >
> > >
> > > I guess I was not clear. For MSI/MSI-X, we do not want to do
> > > mask/unmask regardless of vfio-pci/igb_uio.  Below is my comment
> > > about linux/windows/freebsd from an earlier email. Do you disagree?
> > > I am sure there are plenty of kernel NIC driver guys here. Please
> > > correct me if I am mistaken...
> >
> >
> > For some reason, igb_uio kernel driver mask the interrupt for MSIx.
> > We need to ack or unmask if needs to work with MSIX + IGB_UIO.
> >
> > See
> > pci_uio_alloc_resource()
> >         if (dev->kdrv == RTE_KDRV_IGB_UIO)
> >                 dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
> >         else {
> >                 dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
> >
> > igbuio_pci_irqcontrol() for masking in kernel.
> >
> 
> igb_uio does not auto-mask MSI/MSI-X.

I have not tested igbuio as we don't specific NIC + IGB_UIO platform.

The observation based on following code. see code under HAVE_PCI_MSI_MASK_IRQ

static int
igbuio_pci_irqcontrol(struct uio_info *info, s32 irq_state)
{
        struct rte_uio_pci_dev *udev = info->priv;
        struct pci_dev *pdev = udev->pdev;

#ifdef HAVE_PCI_MSI_MASK_IRQ
        struct irq_data *irq = irq_get_irq_data(udev->info.irq);
#endif

        pci_cfg_access_lock(pdev);

        if (udev->mode == RTE_INTR_MODE_MSIX || udev->mode == RTE_INTR_MODE_MSI) {
#ifdef HAVE_PCI_MSI_MASK_IRQ
                if (irq_state == 1)
                        pci_msi_unmask_irq(irq);
                else
                        pci_msi_mask_irq(irq);
#else
                igbuio_mask_irq(pdev, udev->mode, irq_state);
#endif
        }

        if (udev->mode == RTE_INTR_MODE_LEGACY)
                pci_intx(pdev, !!irq_state);

        pci_cfg_access_unlock(pdev);

        return 0;
}

> 
> static irqreturn_t
> igbuio_pci_irqhandler(int irq, void *dev_id) {
>         struct rte_uio_pci_dev *udev = (struct rte_uio_pci_dev *)dev_id;
>         struct uio_info *info = &udev->info;
> 
>         /* Legacy mode need to mask in hardware */
>         if (udev->mode == RTE_INTR_MODE_LEGACY &&
>             !pci_check_and_mask_intx(udev->pdev))
>                 return IRQ_NONE;
> 
>         uio_event_notify(info);
> 
>         /* Message signal mode, no share IRQ and automasked */
>         return IRQ_HANDLED;
> }
> 
> Also tested just now with igb_uio. The driver does not need to call
> rte_intr_enable(), and it keeps getting interrupts without any issues.

 If you are sure, we can make MSIX+IGB_UIO as NOP in rte_intr_ack()

> Am I missing something?
> 
> -Hyong
> 
> > So it is more of making inline with igb_uio kernel driver AND not
> > break The existing drivers which was using rte_intr_enable in ISR with
> > MSIX+IGB_UIO
> >
> > I do agree with that for edge trigged interrupt mask may not require
> > from kernel.
> > But I am not sure why it is added in igb_uio kernel driver. May  be it
> > is just legacy.
> > Anyway this wont change schematics, when igb_uio kenrel fixed then the
> > counter Part can be changed in rte_intr_ack(). Ie. it is transparent
> > to drivers.
> >
> > >
> > > > I don't  have very strong opinion unmask vs ack. I prefer to have
> > > > ack due the reasons stated above.
> > > > If you really have strong opinion on using unmask, we will stick
> > > > with that to make forward progress.
> > > > Let us know.
> > > >
> > >
> > > I have no strong opinion either.
> >
> > OK. Lets stick with rte_intr_ack().
> >
> > >
> > > Thanks..
> > > -Hyong


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
  2019-07-17 11:16                       ` Jerin Jacob Kollanukkaran
@ 2019-07-17 12:04                         ` Nithin Kumar Dabilpuram
  0 siblings, 0 replies; 25+ messages in thread
From: Nithin Kumar Dabilpuram @ 2019-07-17 12:04 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Hyong Youb Kim (hyonkim),
	David Marchand, Thomas Monjalon, Ferruh Yigit, Bruce Richardson
  Cc: John Daley (johndale), Shahed Shaikh, dev


On 7/17/2019 4:46 PM, Jerin Jacob Kollanukkaran wrote:
>> -----Original Message-----
>> From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
>> Sent: Wednesday, July 17, 2019 4:36 PM
>> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Nithin Kumar
>> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
>> <david.marchand@redhat.com>; Thomas Monjalon
>> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
>> Richardson <bruce.richardson@intel.com>
>> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
>> <shshaikh@marvell.com>; dev@dpdk.org
>> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
>>
>>> -----Original Message-----
>>> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
>>> Sent: Wednesday, July 17, 2019 7:44 PM
>>> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
>>> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
>>> <david.marchand@redhat.com>; Thomas Monjalon
>> <thomas@monjalon.net>;
>>> Ferruh Yigit <ferruh.yigit@intel.com>; Bruce Richardson
>>> <bruce.richardson@intel.com>
>>> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
>>> <shshaikh@marvell.com>; dev@dpdk.org
>>> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt
>>> APIs
>>>
>>>>> I think, it vary from the perspective of IRQ Chip(or controller)
>>>>> vs NIC
>>>>> register(Source) PoV.
>>>>> Since the API starts from rte_intr_* it is more of controller so
>>>>> _ack_ make sense Other reason for ack:
>>>>> 1) It will enforce that it needs to be called form ISR
>>>>> 2) It would be have been really correct to unmask if
>>>>> VFIO+MSIx+Linux supports it
>>>>> 3) if it is ack, no need to add unmask counterpart, the _mask_ API
>>>>>
>>>> Just curious, what you mean by irq controller? Ack/mask/unmask PIOs
>>>> all
>>> go
>>>
>>> Programmable Interrupt Controller. Like Intel 8259A, GIC from ARM etc
>>> The drivers in linux/drivers/irqchip/
>>>
>>>> to the NIC. It is the NIC that asserts/de-asserts irq..
>>>>
>>>>>> Besides the name, are we agreeing that we want these?
>>>>>> - Unmask if INTx
>>>>> Yes
>>>>>
>>>>>> - Nothing if MSI/MSI-X
>>>>> Yes for MSI over VFIO
>>>>> No for MSI over UIO/igb_uio
>>>>>
>>>> I guess I was not clear. For MSI/MSI-X, we do not want to do
>>>> mask/unmask regardless of vfio-pci/igb_uio.  Below is my comment
>>>> about linux/windows/freebsd from an earlier email. Do you disagree?
>>>> I am sure there are plenty of kernel NIC driver guys here. Please
>>>> correct me if I am mistaken...
>>>
>>> For some reason, igb_uio kernel driver mask the interrupt for MSIx.
>>> We need to ack or unmask if needs to work with MSIX + IGB_UIO.
>>>
>>> See
>>> pci_uio_alloc_resource()
>>>          if (dev->kdrv == RTE_KDRV_IGB_UIO)
>>>                  dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
>>>          else {
>>>                  dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
>>>
>>> igbuio_pci_irqcontrol() for masking in kernel.
>>>
>> igb_uio does not auto-mask MSI/MSI-X.
> I have not tested igbuio as we don't specific NIC + IGB_UIO platform.
>
> The observation based on following code. see code under HAVE_PCI_MSI_MASK_IRQ
>
> static int
> igbuio_pci_irqcontrol(struct uio_info *info, s32 irq_state)
> {
>          struct rte_uio_pci_dev *udev = info->priv;
>          struct pci_dev *pdev = udev->pdev;
>
> #ifdef HAVE_PCI_MSI_MASK_IRQ
>          struct irq_data *irq = irq_get_irq_data(udev->info.irq);
> #endif
>
>          pci_cfg_access_lock(pdev);
>
>          if (udev->mode == RTE_INTR_MODE_MSIX || udev->mode == RTE_INTR_MODE_MSI) {
> #ifdef HAVE_PCI_MSI_MASK_IRQ
>                  if (irq_state == 1)
>                          pci_msi_unmask_irq(irq);
>                  else
>                          pci_msi_mask_irq(irq);
> #else
>                  igbuio_mask_irq(pdev, udev->mode, irq_state);
> #endif
>          }
>
>          if (udev->mode == RTE_INTR_MODE_LEGACY)
>                  pci_intx(pdev, !!irq_state);
>
>          pci_cfg_access_unlock(pdev);
>
>          return 0;
> }
>
>> static irqreturn_t
>> igbuio_pci_irqhandler(int irq, void *dev_id) {
>>          struct rte_uio_pci_dev *udev = (struct rte_uio_pci_dev *)dev_id;
>>          struct uio_info *info = &udev->info;
>>
>>          /* Legacy mode need to mask in hardware */
>>          if (udev->mode == RTE_INTR_MODE_LEGACY &&
>>              !pci_check_and_mask_intx(udev->pdev))
>>                  return IRQ_NONE;
>>
>>          uio_event_notify(info);
>>
>>          /* Message signal mode, no share IRQ and automasked */
>>          return IRQ_HANDLED;
>> }
>>
>> Also tested just now with igb_uio. The driver does not need to call
>> rte_intr_enable(), and it keeps getting interrupts without any issues.
>   If you are sure, we can make MSIX+IGB_UIO as NOP in rte_intr_ack()

Ok. Another problem is that we might not be able to distinguish in case 
of IGB_UIO
at rte_intr_ack() level if underlying interrupt is a INTx or MSIx. See 
igbuio_pci_enable_interrupts() that
finds and stores that mode in uio->mode.

So we think leaving the behavior as earlier is needed and simpler as it 
meets the current expectation.

>
>> Am I missing something?
>>
>> -Hyong
>>
>>> So it is more of making inline with igb_uio kernel driver AND not
>>> break The existing drivers which was using rte_intr_enable in ISR with
>>> MSIX+IGB_UIO
>>>
>>> I do agree with that for edge trigged interrupt mask may not require
>>> from kernel.
>>> But I am not sure why it is added in igb_uio kernel driver. May  be it
>>> is just legacy.
>>> Anyway this wont change schematics, when igb_uio kenrel fixed then the
>>> counter Part can be changed in rte_intr_ack(). Ie. it is transparent
>>> to drivers.
>>>
>>>>> I don't  have very strong opinion unmask vs ack. I prefer to have
>>>>> ack due the reasons stated above.
>>>>> If you really have strong opinion on using unmask, we will stick
>>>>> with that to make forward progress.
>>>>> Let us know.
>>>>>
>>>> I have no strong opinion either.
>>> OK. Lets stick with rte_intr_ack().
>>>
>>>> Thanks..
>>>> -Hyong

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler
@ 2019-07-15 15:58 Hyong Youb Kim
  0 siblings, 0 replies; 25+ messages in thread
From: Hyong Youb Kim @ 2019-07-15 15:58 UTC (permalink / raw)
  To: David Marchand, Thomas Monjalon, Jerin Jacob Kollanukkaran, Ferruh Yigit
  Cc: dev, John Daley, Hyong Youb Kim

A rough patch for the approach mentioned earlier. It is only for
discussion.
http://mails.dpdk.org/archives/dev/2019-July/138113.html

To try this out, first revert the following then apply.
commit 89aac60e0be9 ("vfio: fix interrupts race condition")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
---
 .../common/include/rte_eal_interrupts.h       |  1 +
 lib/librte_eal/linux/eal/eal_interrupts.c     | 68 ++++++++++++++++++-
 2 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_eal_interrupts.h b/lib/librte_eal/common/include/rte_eal_interrupts.h
index b370c0d26..d509967d3 100644
--- a/lib/librte_eal/common/include/rte_eal_interrupts.h
+++ b/lib/librte_eal/common/include/rte_eal_interrupts.h
@@ -76,6 +76,7 @@ struct rte_intr_handle {
 	enum rte_intr_handle_type type;  /**< handle type */
 	uint32_t max_intr;             /**< max interrupt requested */
 	uint32_t nb_efd;               /**< number of available efd(event fd) */
+	uint8_t vfio_efd_assigned;
 	uint8_t efd_counter_size;      /**< size of efd counter, used for vdev */
 	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
 	struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID];
diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c b/lib/librte_eal/linux/eal/eal_interrupts.c
index 79ad5e8d7..5b03388dd 100644
--- a/lib/librte_eal/linux/eal/eal_interrupts.c
+++ b/lib/librte_eal/linux/eal/eal_interrupts.c
@@ -107,9 +107,21 @@ static pthread_t intr_thread;
 #define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
 			      sizeof(int) * (RTE_MAX_RXTX_INTR_VEC_ID + 1))
 
-/* enable legacy (INTx) interrupts */
+#ifndef __INTEL_COMPILER
+#pragma GCC diagnostic ignored "-Wcast-qual"
+#endif
+
+static void
+set_vfio_efd_assigned(const struct rte_intr_handle *intr_handle, uint8_t v) {
+	((struct rte_intr_handle *)intr_handle)->vfio_efd_assigned = v;
+}
+
+#ifndef __INTEL_COMPILER
+#pragma GCC diagnostic error "-Wcast-qual"
+#endif
+
 static int
-vfio_enable_intx(const struct rte_intr_handle *intr_handle) {
+vfio_assign_efd_intx(const struct rte_intr_handle *intr_handle) {
 	struct vfio_irq_set *irq_set;
 	char irq_set_buf[IRQ_SET_BUF_LEN];
 	int len, ret;
@@ -134,8 +146,36 @@ vfio_enable_intx(const struct rte_intr_handle *intr_handle) {
 						intr_handle->fd);
 		return -1;
 	}
+	/*
+	 * In the kernel, irq has been set up, its handler installed (request_irq),
+	 * and efd is assigned to it.  Remember this so we can avoid
+	 * re-doing irq setup later.
+	 */
+	set_vfio_efd_assigned(intr_handle, 1);
+	return 0;
+}
+
+/* enable legacy (INTx) interrupts */
+static int
+vfio_enable_intx(const struct rte_intr_handle *intr_handle) {
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	/*
+	 * Assign efd to irq only if it's not been done, to avoid
+	 * re-installing irq handler in the kernel.
+	 */
+	if (!intr_handle->vfio_efd_assigned) {
+		ret = vfio_assign_efd_intx(intr_handle);
+		if (ret)
+			return ret;
+	}
+
+	len = sizeof(irq_set_buf);
 
 	/* unmask INTx after enabling */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
 	memset(irq_set, 0, len);
 	len = sizeof(struct vfio_irq_set);
 	irq_set->argsz = len;
@@ -194,6 +234,12 @@ vfio_disable_intx(const struct rte_intr_handle *intr_handle) {
 			"Error disabling INTx interrupts for fd %d\n", intr_handle->fd);
 		return -1;
 	}
+	/*
+	 * Irq handler's been removed in the kernel. The next time we
+	 * enable intr_handle, we need to assign efd to irq. Remember
+	 * that.
+	 */
+	set_vfio_efd_assigned(intr_handle, 0);
 	return 0;
 }
 
@@ -205,6 +251,13 @@ vfio_enable_msi(const struct rte_intr_handle *intr_handle) {
 	struct vfio_irq_set *irq_set;
 	int *fd_ptr;
 
+	/*
+	 * vfio does not support MSI masking/unmaking. So if irq has
+	 * been set up previously, nothing to do.
+	 */
+	if (intr_handle->vfio_efd_assigned)
+		return 0;
+
 	len = sizeof(irq_set_buf);
 
 	irq_set = (struct vfio_irq_set *) irq_set_buf;
@@ -223,6 +276,7 @@ vfio_enable_msi(const struct rte_intr_handle *intr_handle) {
 						intr_handle->fd);
 		return -1;
 	}
+	set_vfio_efd_assigned(intr_handle, 1);
 	return 0;
 }
 
@@ -248,6 +302,7 @@ vfio_disable_msi(const struct rte_intr_handle *intr_handle) {
 		RTE_LOG(ERR, EAL,
 			"Error disabling MSI interrupts for fd %d\n", intr_handle->fd);
 
+	set_vfio_efd_assigned(intr_handle, 0);
 	return ret;
 }
 
@@ -259,6 +314,13 @@ vfio_enable_msix(const struct rte_intr_handle *intr_handle) {
 	struct vfio_irq_set *irq_set;
 	int *fd_ptr;
 
+	/*
+	 * vfio does not support MSI-X masking/unmaking. So if irq has
+	 * been set up previously, nothing to do.
+	 */
+	if (intr_handle->vfio_efd_assigned)
+		return 0;
+
 	len = sizeof(irq_set_buf);
 
 	irq_set = (struct vfio_irq_set *) irq_set_buf;
@@ -284,6 +346,7 @@ vfio_enable_msix(const struct rte_intr_handle *intr_handle) {
 		return -1;
 	}
 
+	set_vfio_efd_assigned(intr_handle, 1);
 	return 0;
 }
 
@@ -309,6 +372,7 @@ vfio_disable_msix(const struct rte_intr_handle *intr_handle) {
 		RTE_LOG(ERR, EAL,
 			"Error disabling MSI-X interrupts for fd %d\n", intr_handle->fd);
 
+	set_vfio_efd_assigned(intr_handle, 0);
 	return ret;
 }
 
-- 
2.22.0


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, back to index

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-15 16:50 [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler Jerin Jacob Kollanukkaran
2019-07-16  5:58 ` Hyong Youb Kim (hyonkim)
2019-07-16  6:47   ` Jerin Jacob Kollanukkaran
2019-07-16  7:49     ` Hyong Youb Kim (hyonkim)
2019-07-16  9:56       ` Jerin Jacob Kollanukkaran
2019-07-16  6:46 ` [dpdk-dev] [RFC PATCH] eal: add mask and unmask interrupt apis Nithin Dabilpuram
2019-07-16  7:01 ` [dpdk-dev] [RFC PATCH v2] " Nithin Dabilpuram
2019-07-16 16:44 ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Nithin Dabilpuram
2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs Nithin Dabilpuram
2019-07-17  5:55     ` Hyong Youb Kim (hyonkim)
2019-07-17  6:14       ` Jerin Jacob Kollanukkaran
2019-07-17  7:09         ` Hyong Youb Kim (hyonkim)
2019-07-17  8:03           ` Jerin Jacob Kollanukkaran
2019-07-17  8:45             ` Hyong Youb Kim (hyonkim)
2019-07-17  9:20               ` Jerin Jacob Kollanukkaran
2019-07-17  9:51                 ` Hyong Youb Kim (hyonkim)
2019-07-17 10:43                   ` Jerin Jacob Kollanukkaran
2019-07-17 11:06                     ` Hyong Youb Kim (hyonkim)
2019-07-17 11:16                       ` Jerin Jacob Kollanukkaran
2019-07-17 12:04                         ` Nithin Kumar Dabilpuram
2019-07-16 16:44   ` [dpdk-dev] [RFC PATCH v3 3/3] drivers/net: use unmask API in interrupt handlers Nithin Dabilpuram
2019-07-17  6:01     ` Hyong Youb Kim (hyonkim)
2019-07-17  7:47       ` Nithin Kumar Dabilpuram
2019-07-16 20:06   ` [dpdk-dev] [RFC PATCH v3 1/3] vfio: revert change that does intr eventfd setup at probe Stephen Hemminger
  -- strict thread matches above, loose matches on Subject: below --
2019-07-15 15:58 [dpdk-dev] [RFC PATCH] vfio: avoid re-installing irq handler Hyong Youb Kim

DPDK-dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/dpdk-dev/0 dpdk-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dpdk-dev dpdk-dev/ https://lore.kernel.org/dpdk-dev \
		dev@dpdk.org dpdk-dev@archiver.kernel.org
	public-inbox-index dpdk-dev


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox