* [PATCH] net/failsafe: add Rx interrupts
@ 2017-12-11 12:41 Moti Haimovsky
  2017-12-12  1:34 ` Stephen Hemminger
  2018-01-04 15:01 ` [PATCH v2] " Moti Haimovsky
  0 siblings, 2 replies; 29+ messages in thread
From: Moti Haimovsky @ 2017-12-11 12:41 UTC (permalink / raw)
  To: gaetan.rivet; +Cc: dev, Moti Haimovsky

This patch adds support for registering and waiting for Rx
interrupts in failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents the application with a facade of a single
device while internally managing several devices on its behalf,
including packet transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of Rx
interrupt vectors representing the failsafe Rx queues, while internally
it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned a
    Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and will
    allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx events
        from the sub-devices.
      o For each Rx event received the proxy service will
         - Retrieve the pointer to failsafe Rx queue that handles this
           subdevice Rx queue from the user info returned by the EAL.
         - Trigger a failsafe Rx event on that queue by writing to the
           event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable and
    rx_queue_intr_disable routines that will enable and disable Rx
    interrupts, respectively, on both the failsafe and its subdevices.
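
The proxy step described in the list above can be sketched with plain
Linux eventfd and epoll calls. This is only an illustration under stated
assumptions, not the driver code: it uses raw epoll instead of the DPDK
rte_epoll wrappers, and struct demo_rxq, demo_proxy_add() and
demo_proxy_poll() are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for a failsafe Rx queue (not the driver's rxq). */
struct demo_rxq {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* non-zero while Rx interrupts are enabled */
};

/*
 * Register a subdevice fd on the private epoll fd and tag it with the
 * failsafe queue that should receive the proxied event.
 */
static int
demo_proxy_add(int epfd, int sub_fd, struct demo_rxq *rxq)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.ptr = rxq };

	assert(rxq != NULL);
	return epoll_ctl(epfd, EPOLL_CTL_ADD, sub_fd, &ev);
}

/*
 * One pass of the proxy: wait up to timeout_ms and forward each event to
 * the owning queue unless events are disabled for it. The subdevice fd is
 * not drained here. Returns the number of events forwarded, -1 on error.
 */
static int
demo_proxy_poll(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	uint64_t one = 1;
	int forwarded = 0;
	int n = epoll_wait(epfd, evs, 8, timeout_ms);

	if (n < 0)
		return -1;
	for (int i = 0; i < n; i++) {
		struct demo_rxq *rxq = evs[i].data.ptr;

		if (!rxq->enable_events || rxq->event_fd < 0)
			continue;
		if (write(rxq->event_fd, &one, sizeof(one)) !=
		    (ssize_t)sizeof(one))
			return -1;
		forwarded++;
	}
	return forwarded;
}
```

In this sketch the user-data pointer stored at registration time plays
the role of the failsafe Rx queue pointer handed to the EAL, so the proxy
never needs to look up which queue a subdevice event belongs to.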

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 596 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  21 ++
 drivers/net/failsafe/failsafe_private.h |  44 +++
 7 files changed, 668 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status          = Y
 Link status event    = Y
+Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 Promiscuous mode     = Y
diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 6bc5aba..3b5e059 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -239,6 +239,10 @@
 		mac->addr_bytes[2], mac->addr_bytes[3],
 		mac->addr_bytes[4], mac->addr_bytes[5]);
 	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+	PRIV(dev)->intr_handle = (struct rte_intr_handle){
+		.fd = -1,
+		.type = RTE_INTR_HANDLE_EXT,
+	};
 	return 0;
 free_args:
 	failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 21392e5..80741ba 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
 		return;
 	switch (sdev->state) {
 	case DEV_STARTED:
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_ACTIVE;
 		/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 0000000..2e395db
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,596 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2017 Mellanox
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of the copyright holder nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include <sys/eventfd.h>
+#include <sys/epoll.h>
+#include <unistd.h>
+
+#include <rte_alarm.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include <rte_interrupts.h>
+#include <rte_io.h>
+#include <rte_service_component.h>
+
+#include "failsafe_private.h"
+
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+/**
+ * Failsafe Rx event proxy service callback.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param data
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service(void *data)
+{
+	struct fs_priv *priv = data;
+	struct rxq *rxq;
+	struct rte_epoll_event *events = priv->rxp.evec;
+	uint64_t u64 = 1;
+	int i, n, rc = 0;
+
+	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+	for (i = 0; i < n; i++) {
+		rxq = (struct rxq *)events[i].epdata.data;
+		if (rxq->enable_events && rxq->event_fd != -1) {
+			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+			    sizeof(u64)) {
+				ERROR("failed to proxy Rx event to eventfd %d",
+				       rxq->event_fd);
+				rc = -EIO;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
+{
+	/* Unregister the event service. */
+	switch (priv->rxp.sstate) {
+	case SS_RUNNING:
+		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
+		/* fall through */
+	case SS_READY:
+		rte_service_runstate_set(priv->rxp.sid, 0);
+		rte_service_set_stats_enable(priv->rxp.sid, 0);
+		rte_service_component_runstate_set(priv->rxp.sid, 0);
+		/* fall through */
+	case SS_REGISTERED:
+		rte_service_component_unregister(priv->rxp.sid);
+		/* fall through */
+	default:
+		break;
+	}
+}
+
+/**
+ * Install the failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service_install(struct fs_priv *priv)
+{
+	struct rte_service_spec service;
+	int32_t num_service_cores = rte_service_lcore_count();
+	int ret = 0;
+
+	if (!num_service_cores) {
+		ERROR("Failed to install Rx interrupts, "
+		      "no service core found");
+		return -ENOTSUP;
+	}
+	/* prepare service info */
+	memset(&service, 0, sizeof(struct rte_service_spec));
+	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
+		 priv->dev->device->name);
+	service.socket_id = priv->dev->data->numa_node;
+	service.callback = fs_rx_event_proxy_service;
+	service.callback_userdata = (void *)priv;
+
+	if (priv->rxp.sstate == SS_NO_SREVICE) {
+		uint32_t service_core_list[num_service_cores];
+
+		/* get a service core to work with */
+		ret = rte_service_lcore_list(service_core_list,
+					     num_service_cores);
+		if (ret <= 0) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core list empty or corrupted");
+			return -ENOTSUP;
+		}
+		priv->rxp.scid = service_core_list[0];
+		ret = rte_service_lcore_add(priv->rxp.scid);
+		if (ret && ret != -EALREADY) {
+			ERROR("Failed adding service core");
+			return ret;
+		}
+		/* service core may be in "stopped" state, start it */
+		ret = rte_service_lcore_start(priv->rxp.scid);
+		if (ret && (ret != -EALREADY)) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core not started");
+			return ret;
+		}
+		/* register our service */
+		ret = rte_service_component_register(&service,
+						     &priv->rxp.sid);
+		if (ret) {
+			ERROR("service register() failed");
+			return -ENOEXEC;
+		}
+		priv->rxp.sstate = SS_REGISTERED;
+		/* run the service */
+		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed Setting component runstate\n");
+			return ret;
+		}
+		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed enabling stats\n");
+			return ret;
+		}
+		ret = rte_service_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed to run service\n");
+			return ret;
+		}
+		priv->rxp.sstate = SS_READY;
+		/* map the service with the service core */
+		ret = rte_service_map_lcore_set(priv->rxp.sid,
+						priv->rxp.scid, 1);
+		if (ret) {
+			ERROR("Failed to install Rx interrupts, "
+			      "could not map service core");
+			return ret;
+		}
+		priv->rxp.sstate = SS_RUNNING;
+	}
+	return 0;
+}
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+	int rc = 0;
+
+	/* create the epoll to wait on for Rx events from subdevices */
+	priv->rxp.efd = epoll_create1(0);
+	if (priv->rxp.efd < 0) {
+		rte_errno = errno;
+		ERROR("failed to create epoll,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	/* allocate memory for receiving the Rx events from the subdevices. */
+	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
+	if (priv->rxp.evec == NULL) {
+		ERROR("failed to allocate memory for event vectors,"
+		      " Rx interrupts will not be supported");
+		rc = -ENOMEM;
+		goto error;
+	}
+	/* The service installer returns a negative errno directly. */
+	rc = fs_rx_event_proxy_service_install(priv);
+	if (rc < 0)
+		goto error;
+	return 0;
+error:
+	if (priv->rxp.efd >= 0)
+		close(priv->rxp.efd);
+	if (priv->rxp.evec)
+		free(priv->rxp.evec);
+	rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_uninstall(struct fs_priv *priv)
+{
+	fs_rx_event_proxy_service_uninstall(priv);
+	if (priv->rxp.evec) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	if (priv->rxp.efd > 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+}
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	if (intr_handle->intr_vec) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+	unsigned int i;
+	unsigned int rxqs_n = priv->dev->data->nb_rx_queues;
+	unsigned int n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+	unsigned int count = 0;
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+	if (intr_handle->intr_vec == NULL) {
+		fs_rx_intr_vec_uninstall(priv);
+		rte_errno = ENOMEM;
+		ERROR("failed to allocate memory for interrupt vector,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	for (i = 0; i < n; i++) {
+		struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+		/* Skip queues that cannot request interrupts. */
+		if (!rxq || rxq->event_fd < 0) {
+			/* Use invalid intr_vec[] index to disable entry. */
+			intr_handle->intr_vec[i] =
+				RTE_INTR_VEC_RXTX_OFFSET +
+				RTE_MAX_RXTX_INTR_VEC_ID;
+			continue;
+		}
+		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
+			rte_errno = E2BIG;
+			ERROR("too many Rx queues for interrupt vector size"
+			      " (%d), Rx interrupts cannot be enabled",
+			      RTE_MAX_RXTX_INTR_VEC_ID);
+			fs_rx_intr_vec_uninstall(priv);
+			return -rte_errno;
+		}
+		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
+		intr_handle->efds[count] = rxq->event_fd;
+		count++;
+	}
+	if (!count)
+		fs_rx_intr_vec_uninstall(priv);
+	else
+		intr_handle->nb_efd = count;
+	return 0;
+}
+
+/**
+ * RX Interrupt control per subdevice.
+ *
+ * @param sdev
+ *   Pointer to sub-device structure.
+ * @param op
+ *   The operation to be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int
+failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *fsdev;
+	int epfd;
+	uint16_t pid;
+	uint16_t qid;
+	struct rxq *fsrxq;
+	int rc;
+	int ret = 0;
+
+	if ((sdev == NULL) || (ETH(sdev) == NULL) ||
+	    (sdev->fs_dev == NULL) || (PRIV(sdev->fs_dev) == NULL)) {
+		ERROR("Called with invalid arguments");
+		return -EINVAL;
+	}
+	dev = ETH(sdev);
+	fsdev = sdev->fs_dev;
+	epfd = PRIV(sdev->fs_dev)->rxp.efd;
+	pid = PORT_ID(sdev);
+
+	if (epfd <= 0) {
+		if (op == RTE_INTR_EVENT_ADD) {
+			ERROR("proxy events are not initialized");
+			return -EBADFD;
+		} else {
+			return 0;
+		}
+	}
+	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
+		ERROR("subdevice has too many queues,"
+		      " Interrupts will not be enabled");
+		return -E2BIG;
+	}
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		fsrxq = fsdev->data->rx_queues[qid];
+		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
+					       op, (void *)fsrxq);
+		if (rc) {
+			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
+			      "port %d  queue %d, epfd %d, error %d",
+			      pid, qid, epfd, rc);
+			ret = rc;
+		}
+	}
+	return ret;
+}
+
+/**
+ * Install Rx interrupts subsystem for a subdevice.
+ * This supports dynamically adding subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
+{
+	int rc;
+	int qid;
+	struct rte_eth_dev *fsdev = sdev->fs_dev;
+	struct rxq **rxq = (struct rxq **)fsdev->data->rx_queues;
+	const struct rte_intr_conf *const intr_conf =
+				&ETH(sdev)->data->dev_conf.intr_conf;
+
+	if (!intr_conf->rxq)
+		return 0;
+	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
+	if (rc)
+		return rc;
+	/* enable interrupts on already-enabled queues */
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (rxq[qid]->enable_events) {
+			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
+							     qid);
+			if (ret && (ret != -ENOTSUP)) {
+				ERROR("Failed to enable interrupts on "
+				      "port %d queue %d", PORT_ID(sdev), qid);
+				rc = ret;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall Rx interrupts subsystem for a subdevice.
+ * This supports dynamically removing subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
+{
+	int qid;
+
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
+		rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);
+	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
+}
+
+/**
+ * Uninstall failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void
+failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	dev->intr_handle = NULL;
+	rte_intr_free_epoll_fd(intr_handle);
+	fs_rx_event_proxy_uninstall(priv);
+	if (intr_handle->intr_vec) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+
+/**
+ * Install failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_install(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	const struct rte_intr_conf *const intr_conf =
+			&priv->dev->data->dev_conf.intr_conf;
+
+	if (!intr_conf->rxq || priv->intr_handle.intr_vec != NULL)
+		return 0;
+	if (fs_rx_intr_vec_install(priv) < 0)
+		return -rte_errno;
+	if (fs_rx_event_proxy_install(priv) < 0) {
+		fs_rx_intr_vec_uninstall(priv);
+		return -rte_errno;
+	}
+	priv->intr_handle.efd_counter_size = sizeof(uint64_t);
+	dev->intr_handle = &priv->intr_handle;
+	return 0;
+}
+
+
+/**
+ * DPDK callback for Rx queue interrupt disable.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param idx
+ *   Rx queue index.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq = dev->data->rx_queues[idx];
+	struct sub_device *sdev;
+	uint64_t u64;
+	uint8_t i;
+	int rc = 0;
+	int ret;
+
+	if (!rxq || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
+		if ((ret != -ENOTSUP) && (ret != -ENODEV))
+			rc = ret;
+	}
+	/* Clear pending events */
+	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
+		;
+	if (rc)
+		rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * DPDK callback for Rx queue interrupt enable.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param idx
+ *   Rx queue index.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq = dev->data->rx_queues[idx];
+	struct sub_device *sdev;
+	uint8_t i;
+	int rc = 0;
+	int ret;
+
+	if (!rxq || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	/* Let the proxy service run. */
+	if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
+		ERROR("failsafe interrupt services are not running");
+		rte_errno = EAGAIN;
+		return -rte_errno;
+	}
+	rxq->enable_events = 1;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
+		if ((ret != -ENOTSUP) && (ret != -ENODEV))
+			rc = ret;
+	}
+	if (rc)
+		rte_errno = -rc;
+	return rc;
+}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index e16a590..d4839a9 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -32,6 +32,7 @@
  */
 
 #include <stdint.h>
+#include <unistd.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -158,6 +159,10 @@
 	uint8_t i;
 	int ret;
 
+	ret = failsafe_rx_intr_install(dev);
+	if (ret)
+		return ret;
+
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_ACTIVE)
 			continue;
@@ -165,6 +170,11 @@
 		ret = rte_eth_dev_start(PORT_ID(sdev));
 		if (ret)
 			return ret;
+		ret = failsafe_rx_intr_install_subdevice(sdev);
+		if (ret) {
+			rte_eth_dev_stop(PORT_ID(sdev));
+			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -181,9 +191,11 @@
 
 	PRIV(dev)->state = DEV_STARTED - 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_STARTED - 1;
 	}
+	failsafe_rx_intr_uninstall(dev);
 }
 
 static int
@@ -254,6 +266,8 @@
 	if (queue == NULL)
 		return;
 	rxq = queue;
+	if (rxq->event_fd > 0)
+		close(rxq->event_fd);
 	dev = rxq->priv->dev;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
 		SUBOPS(sdev, rx_queue_release)
@@ -294,6 +308,11 @@
 	rxq->info.conf = *rx_conf;
 	rxq->info.nb_desc = nb_rx_desc;
 	rxq->priv = PRIV(dev);
+	rxq->event_fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (rxq->event_fd < 0) {
+		ERROR("Rx event_fd error, %s", strerror(errno));
+		return -errno;
+	}
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
@@ -767,4 +786,6 @@
 	.mac_addr_add = fs_mac_addr_add,
 	.mac_addr_set = fs_mac_addr_set,
 	.filter_ctrl = fs_filter_ctrl,
+	.rx_queue_intr_enable = failsafe_rx_intr_enable,
+	.rx_queue_intr_disable = failsafe_rx_intr_disable,
 };
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3c..29ad2f1 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -34,6 +34,7 @@
 #ifndef _RTE_ETH_FAILSAFE_PRIVATE_H_
 #define _RTE_ETH_FAILSAFE_PRIVATE_H_
 
+#include <sys/eventfd.h>
 #include <sys/queue.h>
 
 #include <rte_atomic.h>
@@ -57,6 +58,13 @@
 #define FAILSAFE_MAX_ETHPORTS 2
 #define FAILSAFE_MAX_ETHADDR 128
 
+enum rxp_service_state {
+	SS_NO_SREVICE = 0,
+	SS_REGISTERED,
+	SS_READY,
+	SS_RUNNING,
+};
+
 /* TYPES */
 
 struct rxq {
@@ -65,10 +73,25 @@ struct rxq {
 	/* id of last sub_device polled */
 	uint8_t last_polled;
 	unsigned int socket_id;
+	int event_fd;
+	unsigned int enable_events:1;
 	struct rte_eth_rxq_info info;
 	rte_atomic64_t refcnt[];
 };
 
+struct rx_proxy {
+	/* epoll file descriptor */
+	int efd;
+	/* event vector to be used by epoll */
+	struct rte_epoll_event *evec;
+	/* rte service id */
+	uint32_t sid;
+	/* service core id */
+	uint32_t scid;
+	enum rxp_service_state sstate;
+
+};
+
 struct txq {
 	struct fs_priv *priv;
 	uint16_t qid;
@@ -139,6 +162,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_intr_handle intr_handle; /* Port interrupt handle. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
@@ -151,8 +175,28 @@ struct fs_priv {
 	unsigned int pending_alarm:1; /* An alarm is pending */
 	/* flow isolation state */
 	int flow_isolated:1;
+	/*
+	 * Rx interrupts/events proxy.
+	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
+	 *  it does that by registering event queues to the EAL. Each such
+	 *  queue represents a failsafe Rx queue. A PMD service thread listens
+	 *  to all the Rx events of all the failsafe subdevices.
+	 *  When an Rx event is issued by a subdevice Rx queue it will be
+	 *  caught by the service and delivered by it to the appropriate
+	 *  failsafe event queue.
+	 */
+	struct rx_proxy rxp;
 };
 
+/* FAILSAFE_INTR */
+
+int failsafe_rx_intr_install(struct rte_eth_dev *dev);
+void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+int failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx);
+int failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx);
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
+
 /* MISC */
 
 int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);
-- 
1.8.3.1


* Re: [PATCH] net/failsafe: add Rx interrupts
  2017-12-11 12:41 [PATCH] net/failsafe: add Rx interrupts Moti Haimovsky
@ 2017-12-12  1:34 ` Stephen Hemminger
  2017-12-13 13:12   ` Mordechay Haimovsky
  2018-01-04 15:01 ` [PATCH v2] " Moti Haimovsky
  1 sibling, 1 reply; 29+ messages in thread
From: Stephen Hemminger @ 2017-12-12  1:34 UTC (permalink / raw)
  To: Moti Haimovsky; +Cc: gaetan.rivet, dev

On Mon, 11 Dec 2017 14:41:47 +0200
Moti Haimovsky <motih@mellanox.com> wrote:

> +	for (i = 0; i < n; i++) {
> +		rxq = (struct rxq *)events[i].epdata.data;

Minor nit. events[i].epdata.data is "void *" therefore cast is unnecessary.
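
As a standalone illustration of the point about the cast (a minimal
sketch; struct demo_queue and demo_queue_id() are hypothetical names, not
taken from the patch), C converts a "void *" to any object pointer type
implicitly, so the assignment compiles cleanly without a cast:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue type used only for this illustration. */
struct demo_queue {
	int id;
};

/* Recover the typed pointer from opaque user data, as a callback would. */
static int
demo_queue_id(void *user_data)
{
	/* No cast needed: void * converts implicitly in C (not in C++). */
	struct demo_queue *q = user_data;

	assert(q != NULL);
	return q->id;
}
```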


* Re: [PATCH] net/failsafe: add Rx interrupts
  2017-12-12  1:34 ` Stephen Hemminger
@ 2017-12-13 13:12   ` Mordechay Haimovsky
  0 siblings, 0 replies; 29+ messages in thread
From: Mordechay Haimovsky @ 2017-12-13 13:12 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: gaetan.rivet, dev

Thank you, Stephen.

I will gather more review input and send a fix.

Moti

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Tuesday, December 12, 2017 3:34 AM
> To: Mordechay Haimovsky <motih@mellanox.com>
> Cc: gaetan.rivet@6wind.com; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] net/failsafe: add Rx interrupts
> 
> On Mon, 11 Dec 2017 14:41:47 +0200
> Moti Haimovsky <motih@mellanox.com> wrote:
> 
> > +	for (i = 0; i < n; i++) {
> > +		rxq = (struct rxq *)events[i].epdata.data;
> 
> Minor nit. events[i].epdata.data is "void *" therefore cast is unnecessary.


* [PATCH v2] net/failsafe: add Rx interrupts
  2017-12-11 12:41 [PATCH] net/failsafe: add Rx interrupts Moti Haimovsky
  2017-12-12  1:34 ` Stephen Hemminger
@ 2018-01-04 15:01 ` Moti Haimovsky
  2018-01-17 12:54   ` [PATCH V3] " Moti Haimovsky
  1 sibling, 1 reply; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-04 15:01 UTC (permalink / raw)
  To: gaetan.rivet, stephen; +Cc: dev, Moti Haimovsky

This patch adds support for registering and waiting for Rx
interrupts in failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents the application with a facade of a single
device while internally managing several devices on its behalf,
including packet transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of Rx
interrupt vectors representing the failsafe Rx queues, while internally
it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned a
    Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and will
    allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx events
        from the sub-devices.
      o For each Rx event received the proxy service will
         - Retrieve the pointer to failsafe Rx queue that handles this
           subdevice Rx queue from the user info returned by the EAL.
         - Trigger a failsafe Rx event on that queue by writing to the
           event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable and
    rx_queue_intr_disable routines that will enable and disable Rx
    interrupts, respectively, on both the failsafe and its subdevices.
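
When interrupts are disabled on a queue, events already written to its
eventfd have to be discarded so that a later wait does not wake up on
stale data. A minimal sketch of that draining step, assuming the eventfd
was created with EFD_NONBLOCK (demo_eventfd_drain() is a hypothetical
helper, not a function from the patch):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Discard everything pending on a non-blocking eventfd.
 * Returns the sum of the counter values discarded, or -1 on an
 * unexpected error. On a blocking eventfd this would never return
 * once the counter reaches zero.
 */
static int64_t
demo_eventfd_drain(int efd)
{
	uint64_t val;
	int64_t total = 0;

	assert(efd >= 0);
	for (;;) {
		ssize_t r = read(efd, &val, sizeof(val));

		if (r == (ssize_t)sizeof(val)) {
			/* A read returns the counter and resets it to 0. */
			total += (int64_t)val;
			continue;
		}
		if (r < 0 && errno == EINTR)
			continue;
		if (r < 0 && errno == EAGAIN)
			return total; /* nothing left to discard */
		return -1;
	}
}
```

Without EFD_SEMAPHORE a single read normally empties the counter, so the
loop mostly guards against writes that race with the drain.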

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
 V2:
 Modifications according to input from Stephen Hemminger:
 * Removed the unneeded cast of the (void *) event data.
 * Fixed a coding style warning.
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 596 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  21 ++
 drivers/net/failsafe/failsafe_private.h |  44 +++
 7 files changed, 668 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status          = Y
 Link status event    = Y
+Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 Promiscuous mode     = Y
diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 6bc5aba..3b5e059 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -239,6 +239,10 @@
 		mac->addr_bytes[2], mac->addr_bytes[3],
 		mac->addr_bytes[4], mac->addr_bytes[5]);
 	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+	PRIV(dev)->intr_handle = (struct rte_intr_handle){
+		.fd = -1,
+		.type = RTE_INTR_HANDLE_EXT,
+	};
 	return 0;
 free_args:
 	failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 21392e5..80741ba 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
 		return;
 	switch (sdev->state) {
 	case DEV_STARTED:
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_ACTIVE;
 		/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 0000000..580c002
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,596 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2017 Mellanox
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of the copyright holder nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include <sys/eventfd.h>
+#include <sys/epoll.h>
+#include <unistd.h>
+
+#include <rte_alarm.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include <rte_interrupts.h>
+#include <rte_io.h>
+#include <rte_service_component.h>
+
+#include "failsafe_private.h"
+
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+/**
+ * Rx event proxy service routine.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param data
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service(void *data)
+{
+	struct fs_priv *priv = data;
+	struct rxq *rxq;
+	struct rte_epoll_event *events = priv->rxp.evec;
+	uint64_t u64 = 1;
+	int i, n, rc = 0;
+
+	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+	for (i = 0; i < n; i++) {
+		rxq = events[i].epdata.data;
+		if (rxq->enable_events && rxq->event_fd != -1) {
+			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+			    sizeof(u64)) {
+				ERROR("failed to proxy Rx event to event fd %d",
+				       rxq->event_fd);
+				rc = -EIO;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
+{
+	/* Unregister the event service. */
+	switch (priv->rxp.sstate) {
+	case SS_RUNNING:
+		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
+		/* fall through */
+	case SS_READY:
+		rte_service_runstate_set(priv->rxp.sid, 0);
+		rte_service_set_stats_enable(priv->rxp.sid, 0);
+		rte_service_component_runstate_set(priv->rxp.sid, 0);
+		/* fall through */
+	case SS_REGISTERED:
+		rte_service_component_unregister(priv->rxp.sid);
+		/* fall through */
+	default:
+		break;
+	}
+}
+
+/**
+ * Install the failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service_install(struct fs_priv *priv)
+{
+	struct rte_service_spec service;
+	int32_t num_service_cores = rte_service_lcore_count();
+	int ret = 0;
+
+	if (!num_service_cores) {
+		ERROR("Failed to install Rx interrupts, "
+		      "no service core found");
+		return -ENOTSUP;
+	}
+	/* prepare service info */
+	memset(&service, 0, sizeof(struct rte_service_spec));
+	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
+		 priv->dev->device->name);
+	service.socket_id = priv->dev->data->numa_node;
+	service.callback = fs_rx_event_proxy_service;
+	service.callback_userdata = (void *)priv;
+
+	if (priv->rxp.sstate == SS_NO_SREVICE) {
+		uint32_t service_core_list[num_service_cores];
+
+		/* get a service core to work with */
+		ret = rte_service_lcore_list(service_core_list,
+					     num_service_cores);
+		if (ret <= 0) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core list empty or corrupted");
+			return -ENOTSUP;
+		}
+		priv->rxp.scid = service_core_list[0];
+		ret = rte_service_lcore_add(priv->rxp.scid);
+		if (ret && ret != -EALREADY) {
+			ERROR("Failed adding service core");
+			return ret;
+		}
+		/* service core may be in "stopped" state, start it */
+		ret = rte_service_lcore_start(priv->rxp.scid);
+		if (ret && (ret != -EALREADY)) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core not started");
+			return ret;
+		}
+		/* register our service */
+		ret = rte_service_component_register(&service,
+						     &priv->rxp.sid);
+		if (ret) {
+			ERROR("service register() failed");
+			return -ENOEXEC;
+		}
+		priv->rxp.sstate = SS_REGISTERED;
+		/* run the service */
+		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed setting component runstate");
+			return ret;
+		}
+		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed enabling stats");
+			return ret;
+		}
+		ret = rte_service_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed to run service");
+			return ret;
+		}
+		priv->rxp.sstate = SS_READY;
+		/* map the service with the service core */
+		ret = rte_service_map_lcore_set(priv->rxp.sid,
+						priv->rxp.scid, 1);
+		if (ret) {
+			ERROR("Failed to install Rx interrupts, "
+			      "could not map service core");
+			return ret;
+		}
+		priv->rxp.sstate = SS_RUNNING;
+	}
+	return 0;
+}
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+	int rc = 0;
+
+	/* create the epoll to wait on for Rx events from subdevices */
+	priv->rxp.efd = epoll_create1(0);
+	if (priv->rxp.efd < 0) {
+		rte_errno = errno;
+		ERROR("failed to create epoll,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	/* allocate memory for receiving the Rx events from the subdevices. */
+	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
+	if (priv->rxp.evec == NULL) {
+		ERROR("failed to allocate memory for event vectors,"
+		      " Rx interrupts will not be supported");
+		rc = -ENOMEM;
+		goto error;
+	}
+	/* The service install routine returns a negative errno itself. */
+	rc = fs_rx_event_proxy_service_install(priv);
+	if (rc < 0)
+		goto error;
+	return 0;
+error:
+	if (priv->rxp.efd >= 0)
+		close(priv->rxp.efd);
+	if (priv->rxp.evec)
+		free(priv->rxp.evec);
+	rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_uninstall(struct fs_priv *priv)
+{
+	fs_rx_event_proxy_service_uninstall(priv);
+	if (priv->rxp.evec) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	if (priv->rxp.efd > 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+}
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	if (intr_handle->intr_vec) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+	unsigned int i;
+	unsigned int rxqs_n = priv->dev->data->nb_rx_queues;
+	unsigned int n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+	unsigned int count = 0;
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+	if (intr_handle->intr_vec == NULL) {
+		fs_rx_intr_vec_uninstall(priv);
+		rte_errno = ENOMEM;
+		ERROR("failed to allocate memory for interrupt vector,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	for (i = 0; i < n; i++) {
+		struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+		/* Skip queues that cannot request interrupts. */
+		if (!rxq || rxq->event_fd < 0) {
+			/* Use invalid intr_vec[] index to disable entry. */
+			intr_handle->intr_vec[i] =
+				RTE_INTR_VEC_RXTX_OFFSET +
+				RTE_MAX_RXTX_INTR_VEC_ID;
+			continue;
+		}
+		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
+			rte_errno = E2BIG;
+			ERROR("too many Rx queues for interrupt vector size"
+			      " (%d), Rx interrupts cannot be enabled",
+			      RTE_MAX_RXTX_INTR_VEC_ID);
+			fs_rx_intr_vec_uninstall(priv);
+			return -rte_errno;
+		}
+		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
+		intr_handle->efds[count] = rxq->event_fd;
+		count++;
+	}
+	if (!count)
+		fs_rx_intr_vec_uninstall(priv);
+	else
+		intr_handle->nb_efd = count;
+	return 0;
+}
+
+/**
+ * RX Interrupt control per subdevice.
+ *
+ * @param sdev
+ *   Pointer to sub-device structure.
+ * @param op
+ *   The operation to be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int
+failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *fsdev;
+	int epfd;
+	uint16_t pid;
+	uint16_t qid;
+	struct rxq *fsrxq;
+	int rc;
+	int ret = 0;
+
+	if (sdev == NULL || (ETH(sdev) == NULL) ||
+	    sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
+		ERROR("Called with invalid arguments");
+		return -EINVAL;
+	}
+	dev = ETH(sdev);
+	fsdev = sdev->fs_dev;
+	epfd = PRIV(sdev->fs_dev)->rxp.efd;
+	pid = PORT_ID(sdev);
+
+	if (epfd <= 0) {
+		if (op == RTE_INTR_EVENT_ADD) {
+			ERROR("proxy events are not initialized");
+			return -EBADFD;
+		} else {
+			return 0;
+		}
+	}
+	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
+		ERROR("subdevice has too many queues,"
+		      " Interrupts will not be enabled");
+		return -E2BIG;
+	}
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		fsrxq = fsdev->data->rx_queues[qid];
+		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
+					       op, (void *)fsrxq);
+		if (rc) {
+			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
+			      "port %d queue %d, epfd %d, error %d",
+			      pid, qid, epfd, rc);
+			ret = rc;
+		}
+	}
+	return ret;
+}
+
+/**
+ * Install Rx interrupts subsystem for a subdevice.
+ * This supports dynamically adding subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
+{
+	int rc;
+	int qid;
+	struct rte_eth_dev *fsdev = sdev->fs_dev;
+	struct rxq **rxq = (struct rxq **)fsdev->data->rx_queues;
+	const struct rte_intr_conf *const intr_conf =
+				&ETH(sdev)->data->dev_conf.intr_conf;
+
+	if (!intr_conf->rxq)
+		return 0;
+	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
+	if (rc)
+		return rc;
+	/* enable interrupts on already-enabled queues */
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (rxq[qid]->enable_events) {
+			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
+							     qid);
+			if (ret && (ret != -ENOTSUP)) {
+				ERROR("Failed to enable interrupts on "
+				      "port %d queue %d", PORT_ID(sdev), qid);
+				rc = ret;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall Rx interrupts subsystem for a subdevice.
+ * This supports dynamically removing subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
+{
+	int qid;
+
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
+		rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);
+	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
+}
+
+/**
+ * Uninstall failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void
+failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	dev->intr_handle = NULL;
+	rte_intr_free_epoll_fd(intr_handle);
+	fs_rx_event_proxy_uninstall(priv);
+	if (intr_handle->intr_vec) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+
+/**
+ * Install failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_install(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	const struct rte_intr_conf *const intr_conf =
+			&priv->dev->data->dev_conf.intr_conf;
+
+	if (!intr_conf->rxq || priv->intr_handle.intr_vec != NULL)
+		return 0;
+	if (fs_rx_intr_vec_install(priv) < 0)
+		return -rte_errno;
+	if (fs_rx_event_proxy_install(priv) < 0) {
+		fs_rx_intr_vec_uninstall(priv);
+		return -rte_errno;
+	}
+	priv->intr_handle.efd_counter_size = sizeof(uint64_t);
+	dev->intr_handle = &priv->intr_handle;
+	return 0;
+}
+
+
+/**
+ * DPDK callback for Rx queue interrupt disable.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param idx
+ *   Rx queue index.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq = dev->data->rx_queues[idx];
+	struct sub_device *sdev;
+	uint64_t u64;
+	uint8_t i;
+	int rc = 0;
+	int ret;
+
+	if (!rxq || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
+		if ((ret != -ENOTSUP) && (ret != -ENODEV))
+			rc = ret;
+	}
+	/* Clear pending events */
+	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) > 0)
+		;
+	if (rc)
+		rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * DPDK callback for Rx queue interrupt enable.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param idx
+ *   Rx queue index.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq = dev->data->rx_queues[idx];
+	struct sub_device *sdev;
+	uint8_t i;
+	int rc = 0;
+	int ret;
+
+	if (!rxq || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	/* Let the proxy service run. */
+	if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
+		ERROR("failsafe interrupt services are not running");
+		rte_errno = EAGAIN;
+		return -rte_errno;
+	}
+	rxq->enable_events = 1;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
+		if ((ret != -ENOTSUP) && (ret != -ENODEV))
+			rc = ret;
+	}
+	if (rc)
+		rte_errno = -rc;
+	return rc;
+}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index e16a590..d4839a9 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -32,6 +32,7 @@
  */
 
 #include <stdint.h>
+#include <unistd.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -158,6 +159,10 @@
 	uint8_t i;
 	int ret;
 
+	ret = failsafe_rx_intr_install(dev);
+	if (ret)
+		return ret;
+
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_ACTIVE)
 			continue;
@@ -165,6 +170,11 @@
 		ret = rte_eth_dev_start(PORT_ID(sdev));
 		if (ret)
 			return ret;
+		ret = failsafe_rx_intr_install_subdevice(sdev);
+		if (ret) {
+			rte_eth_dev_stop(PORT_ID(sdev));
+			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -181,9 +191,11 @@
 
 	PRIV(dev)->state = DEV_STARTED - 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_STARTED - 1;
 	}
+	failsafe_rx_intr_uninstall(dev);
 }
 
 static int
@@ -254,6 +266,8 @@
 	if (queue == NULL)
 		return;
 	rxq = queue;
+	if (rxq->event_fd > 0)
+		close(rxq->event_fd);
 	dev = rxq->priv->dev;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
 		SUBOPS(sdev, rx_queue_release)
@@ -294,6 +308,11 @@
 	rxq->info.conf = *rx_conf;
 	rxq->info.nb_desc = nb_rx_desc;
 	rxq->priv = PRIV(dev);
+	rxq->event_fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (rxq->event_fd < 0) {
+		ERROR("Rx event_fd error, %s", strerror(errno));
+		return -errno;
+	}
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
@@ -767,4 +786,6 @@
 	.mac_addr_add = fs_mac_addr_add,
 	.mac_addr_set = fs_mac_addr_set,
 	.filter_ctrl = fs_filter_ctrl,
+	.rx_queue_intr_enable = failsafe_rx_intr_enable,
+	.rx_queue_intr_disable = failsafe_rx_intr_disable,
 };
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3c..29ad2f1 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -34,6 +34,7 @@
 #ifndef _RTE_ETH_FAILSAFE_PRIVATE_H_
 #define _RTE_ETH_FAILSAFE_PRIVATE_H_
 
+#include <sys/eventfd.h>
 #include <sys/queue.h>
 
 #include <rte_atomic.h>
@@ -57,6 +58,13 @@
 #define FAILSAFE_MAX_ETHPORTS 2
 #define FAILSAFE_MAX_ETHADDR 128
 
+enum rxp_service_state {
+	SS_NO_SREVICE = 0,
+	SS_REGISTERED,
+	SS_READY,
+	SS_RUNNING,
+};
+
 /* TYPES */
 
 struct rxq {
@@ -65,10 +73,25 @@ struct rxq {
 	/* id of last sub_device polled */
 	uint8_t last_polled;
 	unsigned int socket_id;
+	int event_fd;
+	unsigned int enable_events:1;
 	struct rte_eth_rxq_info info;
 	rte_atomic64_t refcnt[];
 };
 
+struct rx_proxy {
+	/* epoll file descriptor */
+	int efd;
+	/* event vector to be used by epoll */
+	struct rte_epoll_event *evec;
+	/* rte service id */
+	uint32_t sid;
+	/* service core id */
+	uint32_t scid;
+	enum rxp_service_state sstate;
+
+};
+
 struct txq {
 	struct fs_priv *priv;
 	uint16_t qid;
@@ -139,6 +162,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_intr_handle intr_handle; /* Port interrupt handle. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
@@ -151,8 +175,28 @@ struct fs_priv {
 	unsigned int pending_alarm:1; /* An alarm is pending */
 	/* flow isolation state */
 	int flow_isolated:1;
+	/*
+	 * Rx interrupts/events proxy.
+	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
+	 *  it does that by registering event queues to the EAL. Each such
+	 *  queue represents a failsafe Rx queue. A PMD service thread listens
+	 *  to all the Rx events of all the failsafe subdevices.
+	 *  When an Rx event is issued by a subdevice Rx queue it will be
+	 *  caught by the service and delivered by it to the appropriate
+	 *  failsafe event queue.
+	 */
+	struct rx_proxy rxp;
 };
 
+/* FAILSAFE_INTR */
+
+int failsafe_rx_intr_install(struct rte_eth_dev *dev);
+void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+int failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx);
+int failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx);
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
+
 /* MISC */
 
 int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);
-- 
1.8.3.1

* [PATCH V3] net/failsafe: add Rx interrupts
  2018-01-04 15:01 ` [PATCH v2] " Moti Haimovsky
@ 2018-01-17 12:54   ` Moti Haimovsky
  2018-01-19  9:32     ` [PATCH v4] " Moti Haimovsky
  0 siblings, 1 reply; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-17 12:54 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch adds support for registering and waiting for Rx
interrupts in failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application including packets
transmission and reception.
The Proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of Rx
interrupt vectors representing the failsafe Rx queues, while internally
it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned a
    Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and will
    allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx events
        from the subdevices.
      o For each Rx event received, the proxy service will:
         - Retrieve the pointer to the failsafe Rx queue that handles
           this subdevice Rx queue from the user info returned by the
           EAL.
         - Trigger a failsafe Rx event on that queue by writing to the
           event fd, unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable and
    rx_queue_intr_disable routines, which will enable and disable Rx
    interrupts, respectively, on both the failsafe device and its
    subdevices.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
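
Note for readers: the enable/disable behaviour from the description can
be pictured on a single queue with a non-blocking eventfd. This is a
hedged sketch, not the driver code: struct sketch_queue,
sketch_intr_enable() and sketch_intr_disable() are hypothetical names,
and the calls that the patch forwards to the subdevices are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>	/* callers create the fd with eventfd() */
#include <unistd.h>

/* Hypothetical stand-in for a failsafe Rx queue (not the driver's rxq). */
struct sketch_queue {
	int event_fd;		/* must be created with EFD_NONBLOCK */
	int enable_events;	/* checked by the proxy before forwarding */
};

/* Allow the proxy to forward Rx events to this queue again. */
static int
sketch_intr_enable(struct sketch_queue *q)
{
	if (q == NULL || q->event_fd < 0)
		return -EINVAL;
	q->enable_events = 1;
	return 0;
}

/*
 * Stop forwarding first, then discard events that were already queued,
 * so a later wait does not wake up on stale notifications.
 */
static int
sketch_intr_disable(struct sketch_queue *q)
{
	uint64_t cnt;

	if (q == NULL || q->event_fd < 0)
		return -EINVAL;
	q->enable_events = 0;
	/*
	 * On a non-blocking eventfd, read() returns the counter and
	 * resets it; once the counter is zero it fails with EAGAIN,
	 * which ends the loop. A blocking eventfd would hang here.
	 */
	while (read(q->event_fd, &cnt, sizeof(cnt)) > 0)
		;
	return 0;
}
```

Clearing the flag before draining matters: if the order were reversed,
the proxy could write a new event between the drain and the flag update
and leave a stale notification behind.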
---

 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 595 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  28 ++
 drivers/net/failsafe/failsafe_private.h |  44 +++
 7 files changed, 674 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status          = Y
 Link status event    = Y
+Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 Promiscuous mode     = Y
diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index b767352..35fcdc2 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -244,6 +244,10 @@
 		mac->addr_bytes[2], mac->addr_bytes[3],
 		mac->addr_bytes[4], mac->addr_bytes[5]);
 	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+	PRIV(dev)->intr_handle = (struct rte_intr_handle){
+		.fd = -1,
+		.type = RTE_INTR_HANDLE_EXT,
+	};
 	return 0;
 free_args:
 	failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..0f1630e 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
 		return;
 	switch (sdev->state) {
 	case DEV_STARTED:
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_ACTIVE;
 		/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 0000000..9173f09
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,595 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2017 Mellanox
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of the copyright holder nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include <sys/epoll.h>
+#include <unistd.h>
+
+#include <rte_alarm.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include <rte_interrupts.h>
+#include <rte_io.h>
+#include <rte_service_component.h>
+
+#include "failsafe_private.h"
+
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+/**
+ * Rx event proxy service routine.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param data
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service(void *data)
+{
+	struct fs_priv *priv = data;
+	struct rxq *rxq;
+	struct rte_epoll_event *events = priv->rxp.evec;
+	uint64_t u64 = 1;
+	int i, n, rc = 0;
+
+	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+	for (i = 0; i < n; i++) {
+		rxq = events[i].epdata.data;
+		if (rxq->enable_events && rxq->event_fd != -1) {
+			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+			    sizeof(u64)) {
+				ERROR("failed to proxy Rx event to event fd %d",
+				       rxq->event_fd);
+				rc = -EIO;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
+{
+	/* Unregister the event service. */
+	switch (priv->rxp.sstate) {
+	case SS_RUNNING:
+		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
+		/* fall through */
+	case SS_READY:
+		rte_service_runstate_set(priv->rxp.sid, 0);
+		rte_service_set_stats_enable(priv->rxp.sid, 0);
+		rte_service_component_runstate_set(priv->rxp.sid, 0);
+		/* fall through */
+	case SS_REGISTERED:
+		rte_service_component_unregister(priv->rxp.sid);
+		/* fall through */
+	default:
+		break;
+	}
+}
+
+/**
+ * Install the failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service_install(struct fs_priv *priv)
+{
+	struct rte_service_spec service;
+	int32_t num_service_cores = rte_service_lcore_count();
+	int ret = 0;
+
+	if (!num_service_cores) {
+		ERROR("Failed to install Rx interrupts, "
+		      "no service core found");
+		return -ENOTSUP;
+	}
+	/* prepare service info */
+	memset(&service, 0, sizeof(struct rte_service_spec));
+	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
+		 priv->dev->device->name);
+	service.socket_id = priv->dev->data->numa_node;
+	service.callback = fs_rx_event_proxy_service;
+	service.callback_userdata = (void *)priv;
+
+	if (priv->rxp.sstate == SS_NO_SREVICE) {
+		uint32_t service_core_list[num_service_cores];
+
+		/* get a service core to work with */
+		ret = rte_service_lcore_list(service_core_list,
+					     num_service_cores);
+		if (ret <= 0) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core list empty or corrupted");
+			return -ENOTSUP;
+		}
+		priv->rxp.scid = service_core_list[0];
+		ret = rte_service_lcore_add(priv->rxp.scid);
+		if (ret && ret != -EALREADY) {
+			ERROR("Failed adding service core");
+			return ret;
+		}
+		/* service core may be in "stopped" state, start it */
+		ret = rte_service_lcore_start(priv->rxp.scid);
+		if (ret && (ret != -EALREADY)) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core not started");
+			return ret;
+		}
+		/* register our service */
+		ret = rte_service_component_register(&service,
+						     &priv->rxp.sid);
+		if (ret) {
+			ERROR("service register() failed");
+			return -ENOEXEC;
+		}
+		priv->rxp.sstate = SS_REGISTERED;
+		/* run the service */
+		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed setting component runstate");
+			return ret;
+		}
+		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed enabling stats");
+			return ret;
+		}
+		ret = rte_service_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed to run service");
+			return ret;
+		}
+		priv->rxp.sstate = SS_READY;
+		/* map the service with the service core */
+		ret = rte_service_map_lcore_set(priv->rxp.sid,
+						priv->rxp.scid, 1);
+		if (ret) {
+			ERROR("Failed to install Rx interrupts, "
+			      "could not map service core");
+			return ret;
+		}
+		priv->rxp.sstate = SS_RUNNING;
+	}
+	return 0;
+}
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+	int rc = 0;
+
+	/* create the epoll to wait on for Rx events from subdevices */
+	priv->rxp.efd = epoll_create1(0);
+	if (priv->rxp.efd < 0) {
+		rte_errno = errno;
+		ERROR("failed to create epoll,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	/* allocate memory for receiving the Rx events from the subdevices. */
+	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
+	if (priv->rxp.evec == NULL) {
+		ERROR("failed to allocate memory for event vectors,"
+		      " Rx interrupts will not be supported");
+		rc = -ENOMEM;
+		goto error;
+	}
+	/* The service installer returns -errno and leaves rte_errno unset. */
+	rc = fs_rx_event_proxy_service_install(priv);
+	if (rc < 0)
+		goto error;
+	return 0;
+error:
+	if (priv->rxp.efd >= 0)
+		close(priv->rxp.efd);
+	if (priv->rxp.evec)
+		free(priv->rxp.evec);
+	rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_uninstall(struct fs_priv *priv)
+{
+	fs_rx_event_proxy_service_uninstall(priv);
+	if (priv->rxp.evec) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	if (priv->rxp.efd > 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+}
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	if (intr_handle->intr_vec) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+	unsigned int i;
+	unsigned int rxqs_n = priv->dev->data->nb_rx_queues;
+	unsigned int n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+	unsigned int count = 0;
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+	if (intr_handle->intr_vec == NULL) {
+		fs_rx_intr_vec_uninstall(priv);
+		rte_errno = ENOMEM;
+		ERROR("failed to allocate memory for interrupt vector,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	for (i = 0; i < n; i++) {
+		struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+		/* Skip queues that cannot request interrupts. */
+		if (!rxq || rxq->event_fd < 0) {
+			/* Use invalid intr_vec[] index to disable entry. */
+			intr_handle->intr_vec[i] =
+				RTE_INTR_VEC_RXTX_OFFSET +
+				RTE_MAX_RXTX_INTR_VEC_ID;
+			continue;
+		}
+		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
+			rte_errno = E2BIG;
+			ERROR("too many Rx queues for interrupt vector size"
+			      " (%d), Rx interrupts cannot be enabled",
+			      RTE_MAX_RXTX_INTR_VEC_ID);
+			fs_rx_intr_vec_uninstall(priv);
+			return -rte_errno;
+		}
+		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
+		intr_handle->efds[count] = rxq->event_fd;
+		count++;
+	}
+	if (!count)
+		fs_rx_intr_vec_uninstall(priv);
+	else
+		intr_handle->nb_efd = count;
+	return 0;
+}
+
+/**
+ * RX Interrupt control per subdevice.
+ *
+ * @param sdev
+ *   Pointer to sub-device structure.
+ * @param op
+ *   The operation to be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int
+failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *fsdev;
+	int epfd;
+	uint16_t pid;
+	uint16_t qid;
+	struct rxq *fsrxq;
+	int rc;
+	int ret = 0;
+
+	if (sdev == NULL || (ETH(sdev) == NULL) ||
+	    sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
+		ERROR("Called with invalid arguments");
+		return -EINVAL;
+	}
+	dev = ETH(sdev);
+	fsdev = sdev->fs_dev;
+	epfd = PRIV(sdev->fs_dev)->rxp.efd;
+	pid = PORT_ID(sdev);
+
+	if (epfd <= 0) {
+		if (op == RTE_INTR_EVENT_ADD) {
+			ERROR("proxy events are not initialized");
+			return -EBADFD;
+		} else {
+			return 0;
+		}
+	}
+	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
+		ERROR("subdevice has too many queues,"
+		      " Interrupts will not be enabled");
+		return -E2BIG;
+	}
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		fsrxq = fsdev->data->rx_queues[qid];
+		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
+					       op, (void *)fsrxq);
+		if (rc) {
+			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
+			      "port %d  queue %d, epfd %d, error %d",
+			      pid, qid, epfd, rc);
+			ret = rc;
+		}
+	}
+	return ret;
+}
+
+/**
+ * Install Rx interrupts subsystem for a subdevice.
+ * This supports dynamically adding subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
+{
+	int rc;
+	int qid;
+	struct rte_eth_dev *fsdev = sdev->fs_dev;
+	struct rxq **rxq = (struct rxq **)fsdev->data->rx_queues;
+	const struct rte_intr_conf *const intr_conf =
+				&ETH(sdev)->data->dev_conf.intr_conf;
+
+	if (!intr_conf->rxq)
+		return 0;
+	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
+	if (rc)
+		return rc;
+	/* enable interrupts on already-enabled queues */
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (rxq[qid]->enable_events) {
+			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
+							     qid);
+			if (ret && (ret != -ENOTSUP)) {
+				ERROR("Failed to enable interrupts on "
+				      "port %d queue %d", PORT_ID(sdev), qid);
+				rc = ret;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall Rx interrupts subsystem for a subdevice.
+ * This supports dynamically removing subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
+{
+	int qid;
+
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
+		rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);
+	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
+}
+
+/**
+ * Uninstall failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void
+failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	dev->intr_handle = NULL;
+	rte_intr_free_epoll_fd(intr_handle);
+	fs_rx_event_proxy_uninstall(priv);
+	if (intr_handle->intr_vec) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+
+/**
+ * Install failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_install(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	const struct rte_intr_conf *const intr_conf =
+			&priv->dev->data->dev_conf.intr_conf;
+
+	if (!intr_conf->rxq || priv->intr_handle.intr_vec != NULL)
+		return 0;
+	if (fs_rx_intr_vec_install(priv) < 0)
+		return -rte_errno;
+	if (fs_rx_event_proxy_install(priv) < 0) {
+		fs_rx_intr_vec_uninstall(priv);
+		return -rte_errno;
+	}
+	priv->intr_handle.efd_counter_size = sizeof(uint64_t);
+	dev->intr_handle = &priv->intr_handle;
+	return 0;
+}
+
+
+/**
+ * DPDK callback for Rx queue interrupt disable.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param idx
+ *   Rx queue index.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq = dev->data->rx_queues[idx];
+	struct sub_device *sdev;
+	uint64_t u64;
+	uint8_t i;
+	int rc = 0;
+	int ret;
+
+	if (!rxq || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
+		if (ret && (ret != -ENOTSUP) && (ret != -ENODEV))
+			rc = ret;
+	}
+	/* Clear pending events */
+	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) > 0)
+		;
+	if (rc)
+		rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * DPDK callback for Rx queue interrupt enable.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param idx
+ *   Rx queue index.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq = dev->data->rx_queues[idx];
+	struct sub_device *sdev;
+	uint8_t i;
+	int rc = 0;
+	int ret;
+
+	if (!rxq || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	/* Let the proxy service run. */
+	if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
+		ERROR("failsafe interrupt services are not running");
+		rte_errno = EAGAIN;
+		return -rte_errno;
+	}
+	rxq->enable_events = 1;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
+		if (ret && (ret != -ENOTSUP) && (ret != -ENODEV))
+			rc = ret;
+	}
+	if (rc)
+		rte_errno = -rc;
+	return rc;
+}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index fe957ad..483c434 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -32,6 +32,7 @@
  */
 
 #include <stdint.h>
+#include <unistd.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -158,6 +159,10 @@
 	uint8_t i;
 	int ret;
 
+	ret = failsafe_rx_intr_install(dev);
+	if (ret)
+		return ret;
+
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_ACTIVE)
 			continue;
@@ -165,6 +170,11 @@
 		ret = rte_eth_dev_start(PORT_ID(sdev));
 		if (ret)
 			return ret;
+		ret = failsafe_rx_intr_install_subdevice(sdev);
+		if (ret) {
+			rte_eth_dev_stop(PORT_ID(sdev));
+			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -181,9 +191,11 @@
 
 	PRIV(dev)->state = DEV_STARTED - 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_STARTED - 1;
 	}
+	failsafe_rx_intr_uninstall(dev);
 }
 
 static int
@@ -254,6 +266,8 @@
 	if (queue == NULL)
 		return;
 	rxq = queue;
+	if (rxq->event_fd > 0)
+		close(rxq->event_fd);
 	dev = rxq->priv->dev;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
 		SUBOPS(sdev, rx_queue_release)
@@ -270,6 +284,14 @@
 		const struct rte_eth_rxconf *rx_conf,
 		struct rte_mempool *mb_pool)
 {
+	/*
+	 * Fake MSIX interrupts causing rte_intr_efd_enable to
+	 * allocate an eventfd for us.
+	 */
+	struct rte_intr_handle intr_handle = {
+		.type = RTE_INTR_HANDLE_VFIO_MSIX,
+		.efds = {-1, },
+	};
 	struct sub_device *sdev;
 	struct rxq *rxq;
 	uint8_t i;
@@ -295,6 +317,10 @@
 	rxq->info.nb_desc = nb_rx_desc;
 	rxq->priv = PRIV(dev);
 	rxq->sdev = PRIV(dev)->subs;
+	ret = rte_intr_efd_enable(&intr_handle, 1);
+	if (ret < 0)
+		return ret;
+	rxq->event_fd = intr_handle.efds[0];
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
@@ -768,4 +794,6 @@
 	.mac_addr_add = fs_mac_addr_add,
 	.mac_addr_set = fs_mac_addr_set,
 	.filter_ctrl = fs_filter_ctrl,
+	.rx_queue_intr_enable = failsafe_rx_intr_enable,
+	.rx_queue_intr_disable = failsafe_rx_intr_disable,
 };
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 54b5b91..1c5b006 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -40,6 +40,7 @@
 #include <rte_dev.h>
 #include <rte_ethdev.h>
 #include <rte_devargs.h>
+#include <rte_interrupts.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
 
@@ -57,6 +58,13 @@
 #define FAILSAFE_MAX_ETHPORTS 2
 #define FAILSAFE_MAX_ETHADDR 128
 
+enum rxp_service_state {
+	SS_NO_SREVICE = 0,
+	SS_REGISTERED,
+	SS_READY,
+	SS_RUNNING,
+};
+
 /* TYPES */
 
 struct rxq {
@@ -65,10 +73,25 @@ struct rxq {
 	/* next sub_device to poll */
 	struct sub_device *sdev;
 	unsigned int socket_id;
+	int event_fd;
+	unsigned int enable_events:1;
 	struct rte_eth_rxq_info info;
 	rte_atomic64_t refcnt[];
 };
 
+struct rx_proxy {
+	/* epoll file descriptor */
+	int efd;
+	/* event vector to be used by epoll */
+	struct rte_epoll_event *evec;
+	/* rte service id */
+	uint32_t sid;
+	/* service core id */
+	uint32_t scid;
+	enum rxp_service_state sstate;
+
+};
+
 struct txq {
 	struct fs_priv *priv;
 	uint16_t qid;
@@ -140,6 +163,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_intr_handle intr_handle; /* Port interrupt handle. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
@@ -152,8 +176,28 @@ struct fs_priv {
 	unsigned int pending_alarm:1; /* An alarm is pending */
 	/* flow isolation state */
 	int flow_isolated:1;
+	/*
+	 * Rx interrupts/events proxy.
+	 * The PMD issues Rx events to the EAL on behalf of its subdevices.
+	 * It does that by registering event queues with the EAL. Each such
+	 * queue represents a failsafe Rx queue. A PMD service thread listens
+	 * to all the Rx events of all the failsafe subdevices.
+	 * When an Rx event is issued by a subdevice Rx queue, it is
+	 * caught by the service and delivered to the appropriate
+	 * failsafe event queue.
+	 */
+	struct rx_proxy rxp;
 };
 
+/* FAILSAFE_INTR */
+
+int failsafe_rx_intr_install(struct rte_eth_dev *dev);
+void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+int failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx);
+int failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx);
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
+
 /* MISC */
 
 int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);
-- 
1.8.3.1


* [PATCH v4] net/failsafe: add Rx interrupts
  2018-01-17 12:54   ` [PATCH V3] " Moti Haimovsky
@ 2018-01-19  9:32     ` Moti Haimovsky
  2018-01-19  9:32       ` Moti Haimovsky
  0 siblings, 1 reply; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-19  9:32 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch adds support for registering and waiting for Rx
interrupts in failsafe PMD.

The patch should be applied on top of the following series of
patches by Matan Azrad:
  [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack
  [PATCH v6 0/8] Introduce virtual driver for Hyper-V/Azure platforms 
  [PATCH v3 0/7] Port ownership and syncronization 

V4:
Rebase on top of Matan Azrad patches listed above.

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.

Moti Haimovsky (1):
  net/failsafe: add Rx interrupts

 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 597 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  28 ++
 drivers/net/failsafe/failsafe_private.h |  44 +++
 7 files changed, 676 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

-- 
1.8.3.1


* [PATCH v4] net/failsafe: add Rx interrupts
  2018-01-19  9:32     ` [PATCH v4] " Moti Haimovsky
@ 2018-01-19  9:32       ` Moti Haimovsky
  2018-01-19 14:11         ` Gaëtan Rivet
  2018-01-23 18:43         ` [PATCH v5 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  0 siblings, 2 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-19  9:32 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch adds support for registering and waiting for Rx
interrupts in failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application including packet
transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of Rx
interrupt vectors representing the failsafe Rx queues, while internally
it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned a
    Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and will
    allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx events
        from the sub-devices.
      o For each Rx event received the proxy service will
         - Retrieve the pointer to failsafe Rx queue that handles this
           subdevice Rx queue from the user info returned by the EAL.
         - Trigger a failsafe Rx event on that queue by writing to the
           event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable and
    rx_queue_intr_disable routines that will enable and disable Rx
    interrupts respectively on both the failsafe and its subdevices.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>

Conflicts:
	drivers/net/failsafe/failsafe_ops.c
	drivers/net/failsafe/failsafe_private.h
---
V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
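
For readers who want to see the data path in isolation, the proxy scheme
described in the commit message above can be sketched with plain Linux
eventfd and epoll calls. This is only an illustration under stated
assumptions, not the code in this patch: struct demo_rxq, struct demo_src
and the demo_* functions are hypothetical names, and the sketch drains the
subdevice eventfd by hand where the patch relies on rte_epoll_wait() and
the EAL to do so.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the failsafe Rx queue (struct rxq). */
struct demo_rxq {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* toggled by the intr enable/disable ops */
};

/* Hypothetical registration record for one subdevice Rx queue. */
struct demo_src {
	int sub_fd;           /* eventfd signalled by the subdevice queue */
	struct demo_rxq *rxq; /* failsafe queue that fronts it */
};

/* Register a subdevice queue event on the private epoll fd. */
static int
demo_register(int epfd, struct demo_src *src)
{
	struct epoll_event ev = { .events = EPOLLIN };

	ev.data.ptr = src;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, src->sub_fd, &ev);
}

/*
 * One proxy pass: collect pending subdevice events and re-raise each one
 * on the owning failsafe queue, unless events are disabled for it.
 * Returns the number of events forwarded, or -1 on error.
 */
static int
demo_proxy_once(int epfd, int timeout_ms)
{
	struct epoll_event ev[8];
	uint64_t val;
	int i, n, forwarded = 0;

	n = epoll_wait(epfd, ev, 8, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		struct demo_src *src = ev[i].data.ptr;

		/* Consume the subdevice event so it does not fire again. */
		if (read(src->sub_fd, &val, sizeof(val)) != sizeof(val))
			return -1;
		if (!src->rxq->enable_events)
			continue;
		val = 1;
		if (write(src->rxq->event_fd, &val, sizeof(val)) !=
		    sizeof(val))
			return -1;
		forwarded++;
	}
	return forwarded;
}

/*
 * Self-check: two subdevice queues front one failsafe queue. Returns the
 * counter read back from the failsafe eventfd, or -1 on error.
 */
static int
demo_run(void)
{
	struct demo_rxq rxq = { eventfd(0, EFD_NONBLOCK), 1 };
	struct demo_src a = { eventfd(0, EFD_NONBLOCK), &rxq };
	struct demo_src b = { eventfd(0, EFD_NONBLOCK), &rxq };
	int epfd = epoll_create1(0);
	uint64_t val = 1;
	int rc = -1;

	if (epfd < 0 || rxq.event_fd < 0 || a.sub_fd < 0 || b.sub_fd < 0)
		goto out;
	if (demo_register(epfd, &a) || demo_register(epfd, &b))
		goto out;
	/* Both subdevice queues report traffic. */
	if (write(a.sub_fd, &val, sizeof(val)) != sizeof(val) ||
	    write(b.sub_fd, &val, sizeof(val)) != sizeof(val))
		goto out;
	if (demo_proxy_once(epfd, 0) != 2)
		goto out;
	/* The application sees a single fd carrying both events. */
	if (read(rxq.event_fd, &val, sizeof(val)) == sizeof(val))
		rc = (int)val;
out:
	close(a.sub_fd);
	close(b.sub_fd);
	close(rxq.event_fd);
	if (epfd >= 0)
		close(epfd);
	return rc;
}
```

Calling demo_run() from a test main() exercises the whole path. The point
of the design is that the application only ever polls the failsafe
eventfd, so subdevices can come and go behind it without the application
re-registering anything.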
---

 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 597 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  28 ++
 drivers/net/failsafe/failsafe_private.h |  44 +++
 7 files changed, 676 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status          = Y
 Link status event    = Y
+Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 Promiscuous mode     = Y
diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index a1e1c7a..621944f 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -251,6 +251,10 @@
 		mac->addr_bytes[2], mac->addr_bytes[3],
 		mac->addr_bytes[4], mac->addr_bytes[5]);
 	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+	PRIV(dev)->intr_handle = (struct rte_intr_handle){
+		.fd = -1,
+		.type = RTE_INTR_HANDLE_EXT,
+	};
 	return 0;
 free_args:
 	failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index e9b0cfe..643f3d6 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
 		return;
 	switch (sdev->state) {
 	case DEV_STARTED:
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_ACTIVE;
 		/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 0000000..4d42810
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,597 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2017 Mellanox
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of the copyright holder nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include <sys/epoll.h>
+#include <unistd.h>
+
+#include <rte_alarm.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include <rte_interrupts.h>
+#include <rte_io.h>
+#include <rte_service_component.h>
+
+#include "failsafe_private.h"
+
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+/**
+ * Failsafe Rx event proxy service routine.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param data
+ *   Pointer to failsafe private structure (passed as service userdata).
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service(void *data)
+{
+	struct fs_priv *priv = data;
+	struct rxq *rxq;
+	struct rte_epoll_event *events = priv->rxp.evec;
+	uint64_t u64 = 1;
+	int i, n, rc = 0;
+
+	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+	for (i = 0; i < n; i++) {
+		rxq = events[i].epdata.data;
+		if (rxq->enable_events && rxq->event_fd != -1) {
+			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+			    sizeof(u64)) {
+				ERROR("failed to proxy Rx event to eventfd %d",
+				       rxq->event_fd);
+				rc = -EIO;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
+{
+	/* Unregister the event service. */
+	switch (priv->rxp.sstate) {
+	case SS_RUNNING:
+		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
+		/* fall through */
+	case SS_READY:
+		rte_service_runstate_set(priv->rxp.sid, 0);
+		rte_service_set_stats_enable(priv->rxp.sid, 0);
+		rte_service_component_runstate_set(priv->rxp.sid, 0);
+		/* fall through */
+	case SS_REGISTERED:
+		rte_service_component_unregister(priv->rxp.sid);
+		/* fall through */
+	default:
+		break;
+	}
+}
+
+/**
+ * Install the failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service_install(struct fs_priv *priv)
+{
+	struct rte_service_spec service;
+	int32_t num_service_cores = rte_service_lcore_count();
+	int ret = 0;
+
+	if (!num_service_cores) {
+		ERROR("Failed to install Rx interrupts, "
+		      "no service core found");
+		return -ENOTSUP;
+	}
+	/* prepare service info */
+	memset(&service, 0, sizeof(struct rte_service_spec));
+	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
+		 priv->dev->device->name);
+	service.socket_id = priv->dev->data->numa_node;
+	service.callback = fs_rx_event_proxy_service;
+	service.callback_userdata = (void *)priv;
+
+	if (priv->rxp.sstate == SS_NO_SREVICE) {
+		uint32_t service_core_list[num_service_cores];
+
+		/* get a service core to work with */
+		ret = rte_service_lcore_list(service_core_list,
+					     num_service_cores);
+		if (ret <= 0) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core list empty or corrupted");
+			return -ENOTSUP;
+		}
+		priv->rxp.scid = service_core_list[0];
+		ret = rte_service_lcore_add(priv->rxp.scid);
+		if (ret && ret != -EALREADY) {
+			ERROR("Failed adding service core");
+			return ret;
+		}
+		/* service core may be in "stopped" state, start it */
+		ret = rte_service_lcore_start(priv->rxp.scid);
+		if (ret && (ret != -EALREADY)) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core not started");
+			return ret;
+		}
+		/* register our service */
+		ret = rte_service_component_register(&service,
+						     &priv->rxp.sid);
+		if (ret) {
+			ERROR("service register() failed");
+			return -ENOEXEC;
+		}
+		priv->rxp.sstate = SS_REGISTERED;
+		/* run the service */
+		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed setting component runstate");
+			return ret;
+		}
+		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed enabling stats");
+			return ret;
+		}
+		ret = rte_service_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed to run service");
+			return ret;
+		}
+		priv->rxp.sstate = SS_READY;
+		/* map the service with the service core */
+		ret = rte_service_map_lcore_set(priv->rxp.sid,
+						priv->rxp.scid, 1);
+		if (ret) {
+			ERROR("Failed to install Rx interrupts, "
+			      "could not map service core");
+			return ret;
+		}
+		priv->rxp.sstate = SS_RUNNING;
+	}
+	return 0;
+}
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+	int rc = 0;
+
+	/* create the epoll to wait on for Rx events from subdevices */
+	priv->rxp.efd = epoll_create1(0);
+	if (priv->rxp.efd < 0) {
+		rte_errno = errno;
+		ERROR("failed to create epoll,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	/* allocate memory for receiving the Rx events from the subdevices. */
+	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
+	if (priv->rxp.evec == NULL) {
+		ERROR("failed to allocate memory for event vectors,"
+		      " Rx interrupts will not be supported");
+		rc = -ENOMEM;
+		goto error;
+	}
+	/* The service installer returns -errno and leaves rte_errno unset. */
+	rc = fs_rx_event_proxy_service_install(priv);
+	if (rc < 0)
+		goto error;
+	return 0;
+error:
+	if (priv->rxp.efd >= 0)
+		close(priv->rxp.efd);
+	if (priv->rxp.evec)
+		free(priv->rxp.evec);
+	rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_uninstall(struct fs_priv *priv)
+{
+	fs_rx_event_proxy_service_uninstall(priv);
+	if (priv->rxp.evec) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	if (priv->rxp.efd > 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+}
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	if (intr_handle->intr_vec) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+	unsigned int i;
+	unsigned int rxqs_n = priv->dev->data->nb_rx_queues;
+	unsigned int n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+	unsigned int count = 0;
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+	if (intr_handle->intr_vec == NULL) {
+		fs_rx_intr_vec_uninstall(priv);
+		rte_errno = ENOMEM;
+		ERROR("failed to allocate memory for interrupt vector,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	for (i = 0; i < n; i++) {
+		struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+		/* Skip queues that cannot request interrupts. */
+		if (!rxq || rxq->event_fd < 0) {
+			/* Use invalid intr_vec[] index to disable entry. */
+			intr_handle->intr_vec[i] =
+				RTE_INTR_VEC_RXTX_OFFSET +
+				RTE_MAX_RXTX_INTR_VEC_ID;
+			continue;
+		}
+		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
+			rte_errno = E2BIG;
+			ERROR("too many Rx queues for interrupt vector size"
+			      " (%d), Rx interrupts cannot be enabled",
+			      RTE_MAX_RXTX_INTR_VEC_ID);
+			fs_rx_intr_vec_uninstall(priv);
+			return -rte_errno;
+		}
+		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
+		intr_handle->efds[count] = rxq->event_fd;
+		count++;
+	}
+	if (!count)
+		fs_rx_intr_vec_uninstall(priv);
+	else
+		intr_handle->nb_efd = count;
+	return 0;
+}
+
+/**
+ * RX Interrupt control per subdevice.
+ *
+ * @param sdev
+ *   Pointer to sub-device structure.
+ * @param op
+ *   The operation to be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int
+failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *fsdev;
+	int epfd;
+	uint16_t pid;
+	uint16_t qid;
+	struct rxq *fsrxq;
+	int rc;
+	int ret = 0;
+
+	if (sdev == NULL || (ETH(sdev) == NULL) ||
+	    sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
+		ERROR("Called with invalid arguments");
+		return -EINVAL;
+	}
+	dev = ETH(sdev);
+	fsdev = sdev->fs_dev;
+	epfd = PRIV(sdev->fs_dev)->rxp.efd;
+	pid = PORT_ID(sdev);
+
+	if (epfd <= 0) {
+		if (op == RTE_INTR_EVENT_ADD) {
+			ERROR("proxy events are not initialized");
+			return -EBADFD;
+		} else {
+			return 0;
+		}
+	}
+	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
+		ERROR("subdevice has too many queues,"
+		      " Interrupts will not be enabled");
+		return -E2BIG;
+	}
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		fsrxq = fsdev->data->rx_queues[qid];
+		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
+					       op, (void *)fsrxq);
+		if (rc) {
+			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
+			      "port %d queue %d, epfd %d, error %d",
+			      pid, qid, epfd, rc);
+			ret = rc;
+		}
+	}
+	return ret;
+}
+
+/**
+ * Install Rx interrupts subsystem for a subdevice.
+ * This supports dynamically adding subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
+{
+	int rc;
+	int qid;
+	struct rte_eth_dev *fsdev = sdev->fs_dev;
+	struct rxq **rxq = (struct rxq **)fsdev->data->rx_queues;
+	const struct rte_intr_conf *const intr_conf =
+				&ETH(sdev)->data->dev_conf.intr_conf;
+
+	if (!intr_conf->rxq)
+		return 0;
+	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
+	if (rc)
+		return rc;
+	/* enable interrupts on already-enabled queues */
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (rxq[qid]->enable_events) {
+			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
+							     qid);
+			if (ret && (ret != -ENOTSUP)) {
+				ERROR("Failed to enable interrupts on "
+				      "port %d queue %d", PORT_ID(sdev), qid);
+				rc = ret;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall Rx interrupts subsystem for a subdevice.
+ * This supports dynamically removing subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
+{
+	int qid;
+
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
+		rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);
+	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
+}
+
+/**
+ * Uninstall failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void
+failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	dev->intr_handle = NULL;
+	rte_intr_free_epoll_fd(intr_handle);
+	fs_rx_event_proxy_uninstall(priv);
+	if (intr_handle->intr_vec) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+
+/**
+ * Install failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_install(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	const struct rte_intr_conf *const intr_conf =
+			&priv->dev->data->dev_conf.intr_conf;
+
+	if (!intr_conf->rxq || priv->intr_handle.intr_vec != NULL)
+		return 0;
+	if (fs_rx_intr_vec_install(priv) < 0)
+		return -rte_errno;
+	if (fs_rx_event_proxy_install(priv) < 0) {
+		fs_rx_intr_vec_uninstall(priv);
+		return -rte_errno;
+	}
+	priv->intr_handle.efd_counter_size = sizeof(uint64_t);
+	dev->intr_handle = &priv->intr_handle;
+	return 0;
+}
+
+
+/**
+ * DPDK callback for Rx queue interrupt disable.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param idx
+ *   Rx queue index.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq = dev->data->rx_queues[idx];
+	struct sub_device *sdev;
+	uint64_t u64;
+	uint8_t i;
+	int rc = 0;
+	int ret;
+
+	if (!rxq || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
+		if ((ret != -ENODEV) && !fs_err(sdev, ret))
+			rc = ret;
+	}
+	/* Clear pending events */
+	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
+		;
+	if (rc)
+		rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * DPDK callback for Rx queue interrupt enable.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param idx
+ *   Rx queue index.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq = dev->data->rx_queues[idx];
+	struct sub_device *sdev;
+	uint8_t i;
+	int rc = 0;
+	int ret;
+
+	if (!rxq || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	/* Let the proxy service run. */
+	if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
+		ERROR("failsafe interrupt services are not running");
+		rte_errno = EAGAIN;
+		return -rte_errno;
+	}
+	rxq->enable_events = 1;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
+		if ((ret != -ENODEV) && !fs_err(sdev, ret))
+			rc = ret;
+	}
+	if (rc) {
+		failsafe_rx_intr_disable(dev, idx);
+		rte_errno = -rc;
+	}
+	return rc;
+}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 0976745..b5b4eab 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -32,6 +32,7 @@
  */
 
 #include <stdint.h>
+#include <unistd.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -160,6 +161,10 @@
 	uint8_t i;
 	int ret;
 
+	ret = failsafe_rx_intr_install(dev);
+	if (ret)
+		return ret;
+
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_ACTIVE)
 			continue;
@@ -170,6 +175,11 @@
 				continue;
 			return ret;
 		}
+		ret = failsafe_rx_intr_install_subdevice(sdev);
+		if (ret) {
+			rte_eth_dev_stop(PORT_ID(sdev));
+			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -186,9 +196,11 @@
 
 	PRIV(dev)->state = DEV_STARTED - 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_STARTED - 1;
 	}
+	failsafe_rx_intr_uninstall(dev);
 }
 
 static int
@@ -259,6 +271,8 @@
 	if (queue == NULL)
 		return;
 	rxq = queue;
+	if (rxq->event_fd > 0)
+		close(rxq->event_fd);
 	dev = rxq->priv->dev;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
 		SUBOPS(sdev, rx_queue_release)
@@ -275,6 +289,14 @@
 		const struct rte_eth_rxconf *rx_conf,
 		struct rte_mempool *mb_pool)
 {
+	/*
+	 * Fake MSIX interrupts causing rte_intr_efd_enable to
+	 * allocate an eventfd for us.
+	 */
+	struct rte_intr_handle intr_handle = {
+		.type = RTE_INTR_HANDLE_VFIO_MSIX,
+		.efds = {-1, },
+	};
 	struct sub_device *sdev;
 	struct rxq *rxq;
 	uint8_t i;
@@ -300,6 +322,10 @@
 	rxq->info.nb_desc = nb_rx_desc;
 	rxq->priv = PRIV(dev);
 	rxq->sdev = PRIV(dev)->subs;
+	ret = rte_intr_efd_enable(&intr_handle, 1);
+	if (ret < 0)
+		return ret;
+	rxq->event_fd = intr_handle.efds[0];
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
@@ -781,4 +807,6 @@
 	.mac_addr_add = fs_mac_addr_add,
 	.mac_addr_set = fs_mac_addr_set,
 	.filter_ctrl = fs_filter_ctrl,
+	.rx_queue_intr_enable = failsafe_rx_intr_enable,
+	.rx_queue_intr_disable = failsafe_rx_intr_disable,
 };
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index b377046..d7617c4 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -40,6 +40,7 @@
 #include <rte_dev.h>
 #include <rte_ethdev.h>
 #include <rte_devargs.h>
+#include <rte_interrupts.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
 #define FAILSAFE_OWNER_NAME "Fail-safe"
@@ -61,6 +62,13 @@
 
 #define DEVARGS_MAXLEN 4096
 
+enum rxp_service_state {
+	SS_NO_SREVICE = 0,
+	SS_REGISTERED,
+	SS_READY,
+	SS_RUNNING,
+};
+
 /* TYPES */
 
 struct rxq {
@@ -69,10 +77,25 @@ struct rxq {
 	/* next sub_device to poll */
 	struct sub_device *sdev;
 	unsigned int socket_id;
+	int event_fd;
+	unsigned int enable_events:1;
 	struct rte_eth_rxq_info info;
 	rte_atomic64_t refcnt[];
 };
 
+struct rx_proxy {
+	/* epoll file descriptor */
+	int efd;
+	/* event vector to be used by epoll */
+	struct rte_epoll_event *evec;
+	/* rte service id */
+	uint32_t sid;
+	/* service core id */
+	uint32_t scid;
+	enum rxp_service_state sstate;
+
+};
+
 struct txq {
 	struct fs_priv *priv;
 	uint16_t qid;
@@ -147,6 +170,7 @@ struct fs_priv {
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
 	struct rte_eth_dev_owner my_owner; /* Unique owner. */
+	struct rte_intr_handle intr_handle; /* Port interrupt handle. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
@@ -159,8 +183,28 @@ struct fs_priv {
 	unsigned int pending_alarm:1; /* An alarm is pending */
 	/* flow isolation state */
 	int flow_isolated:1;
+	/*
+	 * Rx interrupts/events proxy.
+	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
+	 *  it does that by registering event queues to the EAL. Each such
+	 *  queue represents a failsafe Rx queue. A PMD service thread listens
+	 *  to all the Rx events of all the failsafe subdevices.
+	 *  When an Rx event is issued by a subdevice Rx queue it will be
+	 *  caught by the service and delivered by it to the appropriate
+	 *  failsafe event queue.
+	 */
+	struct rx_proxy rxp;
 };
 
+/* FAILSAFE_INTR */
+
+int failsafe_rx_intr_install(struct rte_eth_dev *dev);
+void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+int failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx);
+int failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx);
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
+
 /* MISC */
 
 int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);
-- 
1.8.3.1


* Re: [PATCH v4] net/failsafe: add Rx interrupts
  2018-01-19  9:32       ` Moti Haimovsky
@ 2018-01-19 14:11         ` Gaëtan Rivet
  2018-01-23 18:43         ` [PATCH v5 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  1 sibling, 0 replies; 29+ messages in thread
From: Gaëtan Rivet @ 2018-01-19 14:11 UTC (permalink / raw)
  To: Moti Haimovsky; +Cc: ferruh.yigit, dev

Hi Moti,

This patch is pretty big. It would have helped the review to have it
divided into smaller patches.

Overall, I wholly support adding Rx interrupt support, and I think it is
interesting to have done it using rte_service.

I am entirely unfamiliar with rte_service however, so I will take your
word for it that it does work as intended, and hope that you will help
fix issues if some are found afterward. I will only comment on logic and
coding style.

On Fri, Jan 19, 2018 at 11:32:24AM +0200, Moti Haimovsky wrote:
> This patch adds support for registering and waiting for Rx
> interrupts in failsafe PMD. This allows applications to wait
> for Rx events from the PMD using the DPDK rte_epoll subsystem.
> The failsafe PMD presents to the application a facade of a single
> device to be handled by the application while internally it manages
> several devices on behalf of the application including packet
> transmission and reception.
> The proposed failsafe Rx interrupt scheme follows this approach.
> The failsafe PMD will present the application with a single set of Rx
> interrupt vectors representing the failsafe Rx queues, while internally
> it will serve as an interrupt proxy for its subdevices.
> This will allow applications to wait for Rx traffic from the failsafe
> PMD by registering and waiting for Rx events from its Rx queues.
> In order to support this the following is suggested:
>   * Every Rx queue in the failsafe (virtual) device will be assigned a
>     Linux event file descriptor (efd) and an enable_interrupts flag.
>   * The failsafe PMD will fill in its rte_intr_handle structure with
>     the Rx efds assigned previously and register them with the EAL.
>   * The failsafe driver will create a private epoll fd (epfd) and will
>     allocate enough space to handle all the Rx events from all its
>     subdevices.
>   * Acting as an application,
>     for each Rx queue in each active subdevice the failsafe will:
>       o Register the Rx queue with the EAL.
>       o Pass the EAL the failsafe private epoll fd as the epfd to
>         register the Rx queue event on.
>       o Pass the EAL, as a parameter, the pointer to the failsafe Rx
>         queue that handles this Rx queue.
>       o Using the DPDK service callbacks, the failsafe PMD will launch
>         an Rx proxy service that will wait on the epoll fd for Rx events
>         from the sub-devices.
>       o For each Rx event received the proxy service will
>          - Retrieve the pointer to failsafe Rx queue that handles this
>            subdevice Rx queue from the user info returned by the EAL.
>          - Trigger a failsafe Rx event on that queue by writing to the
>            event fd unless interrupts are disabled for that queue.
>   * The failsafe PMD will also implement the rx_queue_intr_enable and
>     rx_queue_intr_disable routines that will enable and disable Rx
>     interrupts respectively on both the failsafe and its subdevices.
> 
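
The proxy scheme quoted above can be sketched outside DPDK with plain
Linux eventfd and epoll. This is only an illustration under assumptions,
not the code from the patch: struct demo_rxq, struct demo_src and all
demo_* functions are hypothetical names, and the sketch acknowledges the
source eventfd itself, a step assumed here to be handled by the EAL in
the real driver.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-ins; these are not the driver's types. */
struct demo_rxq {
	int event_fd;               /* eventfd the application waits on */
	unsigned int enable_events; /* mirrors the per-queue enable flag */
};

struct demo_src {
	int fd;                     /* sub-device queue eventfd */
	struct demo_rxq *rxq;       /* failsafe queue it feeds */
};

/* Register one sub-device source on the private epoll fd. */
static int
demo_proxy_add(int epfd, struct demo_src *src)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.ptr = src };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, src->fd, &ev);
}

/*
 * One proxy pass: wait up to timeout_ms, acknowledge each source event
 * and forward it to the owning queue.
 * Returns the number of events forwarded, -1 on error.
 */
static int
demo_proxy_once(int epfd, int timeout_ms)
{
	struct epoll_event ev[8];
	uint64_t val;
	int forwarded = 0;
	int i, n;

	n = epoll_wait(epfd, ev, 8, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		struct demo_src *src = ev[i].data.ptr;

		/* Acknowledge the source so epoll does not report it again. */
		if (read(src->fd, &val, sizeof(val)) != (ssize_t)sizeof(val))
			continue;
		if (!src->rxq->enable_events)
			continue;
		val = 1;
		if (write(src->rxq->event_fd, &val, sizeof(val)) !=
		    (ssize_t)sizeof(val))
			return -1;
		forwarded++;
	}
	return forwarded;
}

/* Drain a non-blocking eventfd; returns the number of successful reads. */
static int
demo_drain(int efd)
{
	uint64_t val;
	int reads = 0;

	while (read(efd, &val, sizeof(val)) > 0)
		reads++;
	return reads;
}

/* Wire one source to one queue, fire it, check the event is proxied. */
static int
demo_selftest(void)
{
	struct demo_rxq rxq = { .event_fd = eventfd(0, EFD_NONBLOCK),
				.enable_events = 1 };
	struct demo_src src = { .fd = eventfd(0, EFD_NONBLOCK), .rxq = &rxq };
	int epfd = epoll_create1(0);
	uint64_t one = 1;
	int ok;

	if (epfd < 0 || rxq.event_fd < 0 || src.fd < 0)
		return -1;
	ok = demo_proxy_add(epfd, &src) == 0 &&
	     write(src.fd, &one, sizeof(one)) == (ssize_t)sizeof(one) &&
	     demo_proxy_once(epfd, 100) == 1 &&
	     demo_drain(rxq.event_fd) == 1 && /* one read empties the counter */
	     demo_proxy_once(epfd, 0) == 0;   /* nothing is pending any more */
	close(epfd);
	close(src.fd);
	close(rxq.event_fd);
	return ok ? 0 : -1;
}
```

demo_selftest() returns 0 when a single source event results in exactly
one event on the failsafe-side eventfd. The drain loop only terminates
because the eventfds are created non-blocking; on a blocking eventfd the
second read() would block.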

Were you able to measure the latency introduced by the proxy?
At normal rates of reception (~9Mpps single core 10Gbps port for
example), do we lose packets by using Rx interrupts (with or without the
fail-safe in-between)?

> Signed-off-by: Moti Haimovsky <motih@mellanox.com>
> 
> Conflicts:
>         drivers/net/failsafe/failsafe_ops.c
>         drivers/net/failsafe/failsafe_private.h

These lines should be removed from the commitlog.

> ---
> V4:
> Fixed merge conflicts found during integration with other failsafe patches
> (See cover letter).
> 
> V3:
> Fixed build failures in FreeBSD10.3_64
> 
> V2:
> Modifications according to inputs from Stephen Hemminger:
> * Removed unneeded (void *) casting.
> Fixed coding style warning.
> ---
> 
>  doc/guides/nics/features/failsafe.ini   |   1 +
>  drivers/net/failsafe/Makefile           |   1 +
>  drivers/net/failsafe/failsafe.c         |   4 +
>  drivers/net/failsafe/failsafe_ether.c   |   1 +
>  drivers/net/failsafe/failsafe_intr.c    | 597 ++++++++++++++++++++++++++++++++
>  drivers/net/failsafe/failsafe_ops.c     |  28 ++
>  drivers/net/failsafe/failsafe_private.h |  44 +++
>  7 files changed, 676 insertions(+)
>  create mode 100644 drivers/net/failsafe/failsafe_intr.c
> 
> diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
> index a42e344..39ee579 100644
> --- a/doc/guides/nics/features/failsafe.ini
> +++ b/doc/guides/nics/features/failsafe.ini
> @@ -6,6 +6,7 @@
>  [Features]
>  Link status          = Y
>  Link status event    = Y
> +Rx interrupt         = Y
>  MTU update           = Y
>  Jumbo frame          = Y
>  Promiscuous mode     = Y
> diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
> index ea2a8fe..91a734b 100644
> --- a/drivers/net/failsafe/Makefile
> +++ b/drivers/net/failsafe/Makefile
> @@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
>  
>  # No exported include files
>  
> diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
> index a1e1c7a..621944f 100644
> --- a/drivers/net/failsafe/failsafe.c
> +++ b/drivers/net/failsafe/failsafe.c
> @@ -251,6 +251,10 @@
>                  mac->addr_bytes[2], mac->addr_bytes[3],
>                  mac->addr_bytes[4], mac->addr_bytes[5]);
>          dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
> +        PRIV(dev)->intr_handle = (struct rte_intr_handle){
> +                .fd = -1,
> +                .type = RTE_INTR_HANDLE_EXT,
> +        };
>          return 0;
>  free_args:
>          failsafe_args_free(dev);
> diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
> index e9b0cfe..643f3d6 100644
> --- a/drivers/net/failsafe/failsafe_ether.c
> +++ b/drivers/net/failsafe/failsafe_ether.c
> @@ -283,6 +283,7 @@
>                  return;
>          switch (sdev->state) {
>          case DEV_STARTED:
> +                failsafe_rx_intr_uninstall_subdevice(sdev);
>                  rte_eth_dev_stop(PORT_ID(sdev));
>                  sdev->state = DEV_ACTIVE;
>                  /* fallthrough */
> diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
> new file mode 100644
> index 0000000..4d42810
> --- /dev/null
> +++ b/drivers/net/failsafe/failsafe_intr.c
> @@ -0,0 +1,597 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright 2017 Mellanox
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of the copyright holder nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +/**
> + * @file
> + * Interrupts handling for failsafe driver.
> + */
> +
> +#include <sys/epoll.h>
> +#include <unistd.h>
> +
> +#include <rte_alarm.h>
> +#include <rte_config.h>
> +#include <rte_errno.h>
> +#include <rte_ethdev.h>
> +#include <rte_interrupts.h>
> +#include <rte_io.h>
> +#include <rte_service_component.h>
> +
> +#include "failsafe_private.h"
> +
> +#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
> +
> +/**
> + * Install failsafe Rx event proxy service.
> + * The Rx event proxy is the service that listens to Rx events from the
> + * subdevices and triggers failsafe Rx events accordingly.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + * @return
> + *   0 on success, negative errno value otherwise.
> + */
> +static int
> +fs_rx_event_proxy_service(void *data)
> +{
> +        struct fs_priv *priv = data;
> +        struct rxq *rxq;
> +        struct rte_epoll_event *events = priv->rxp.evec;
> +        uint64_t u64 = 1;
> +        int i, n, rc = 0;
> +
> +        n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
> +        for (i = 0; i < n; i++) {
> +                rxq = events[i].epdata.data;
> +                if (rxq->enable_events && rxq->event_fd != -1) {
> +                        if (write(rxq->event_fd, &u64, sizeof(u64)) !=
> +                            sizeof(u64)) {
> +                                ERROR("failed to proxy Rx event to socket %d",

Failed should be capitalized.

> +                                       rxq->event_fd);
> +                                rc = -EIO;
> +                        }
> +                }
> +        }
> +        return rc;
> +}
> +
> +/**
> + * Uninstall failsafe Rx event proxy service.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + */
> +static void
> +fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
> +{
> +        /* Unregister the event service. */
> +        switch (priv->rxp.sstate) {
> +        case SS_RUNNING:
> +                rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
> +                /* fall through */
> +        case SS_READY:
> +                rte_service_runstate_set(priv->rxp.sid, 0);
> +                rte_service_set_stats_enable(priv->rxp.sid, 0);
> +                rte_service_component_runstate_set(priv->rxp.sid, 0);
> +                /* fall through */
> +        case SS_REGISTERED:
> +                rte_service_component_unregister(priv->rxp.sid);
> +                /* fall through */
> +        default:
> +                break;
> +        }
> +}
> +
> +/**
> + * Install the failsafe Rx event proxy service.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + * @return
> + *   0 on success, negative errno value otherwise.
> + */
> +static int
> +fs_rx_event_proxy_service_install(struct fs_priv *priv)
> +{
> +        struct rte_service_spec service;
> +        int32_t num_service_cores = rte_service_lcore_count();
> +        int ret = 0;
> +
> +        if (!num_service_cores) {

It would be better to explicitly check against 0.

> +                ERROR("Failed to install Rx interrupts, "
> +                      "no service core found");
> +                return -ENOTSUP;
> +        }
> +        /* prepare service info */
> +        memset(&service, 0, sizeof(struct rte_service_spec));
> +        snprintf(service.name, sizeof(service.name), "%s_Rx_service",
> +                 priv->dev->device->name);

You might want to use the eth_dev name here.
ConnectX-3 will have two physical ports using the same PCI id. This PCI
id is used as rte_device->name, which would result here in the same
rte_service name. I don't know if there is conflict resolution.

In any case, use the eth_dev name, it _should_ be unique.
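
To illustrate the naming concern, here is a minimal sketch with
hypothetical names (demo_service_name, DEMO_NAME_LEN and the sample
strings are not from the patch). It only shows that two ports sharing a
device name produce the same service name, while distinct per-port names
do not. In the driver, the per-port name would presumably come from
dev->data->name.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_LEN 32 /* hypothetical limit, for this sketch only */

/*
 * Build "<base>_Rx_service" into buf.
 * Returns 0 on success, -1 if the result did not fit (truncated).
 */
static int
demo_service_name(char *buf, size_t len, const char *base)
{
	int n = snprintf(buf, len, "%s_Rx_service", base);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Return 1 when both bases yield the same service name, 0 otherwise. */
static int
demo_names_collide(const char *base_a, const char *base_b)
{
	char a[DEMO_NAME_LEN];
	char b[DEMO_NAME_LEN];

	if (demo_service_name(a, sizeof(a), base_a) < 0 ||
	    demo_service_name(b, sizeof(b), base_b) < 0)
		return 0;
	return strcmp(a, b) == 0;
}
```

Checking the snprintf() return value also catches silent truncation,
which could make two long names collide after being cut.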

> +        service.socket_id = priv->dev->data->numa_node;
> +        service.callback = fs_rx_event_proxy_service;
> +        service.callback_userdata = (void *)priv;
> +
> +        if (priv->rxp.sstate == SS_NO_SREVICE) {

Typo with SREVICE.

> +                uint32_t service_core_list[num_service_cores];
> +
> +                /* get a service core to work with */
> +                ret = rte_service_lcore_list(service_core_list,
> +                                             num_service_cores);
> +                if (ret <= 0) {
> +                        ERROR("Failed to install Rx interrupts, "
> +                              "service core list empty or corrupted");
> +                        return -ENOTSUP;
> +                }
> +                priv->rxp.scid = service_core_list[0];
> +                ret = rte_service_lcore_add(priv->rxp.scid);
> +                if (ret && ret != -EALREADY) {
> +                        ERROR("Failed adding service core");
> +                        return ret;
> +                }
> +                /* service core may be in "stopped" state, start it */
> +                ret = rte_service_lcore_start(priv->rxp.scid);
> +                if (ret && (ret != -EALREADY)) {
> +                        ERROR("Failed to install Rx interrupts, "
> +                              "service core not started");
> +                        return ret;
> +                }
> +                /* register our service */
> +                int32_t ret = rte_service_component_register(&service,
> +                                                             &priv->rxp.sid);
> +                if (ret) {
> +                        ERROR("service register() failed");
> +                        return -ENOEXEC;
> +                }
> +                priv->rxp.sstate = SS_REGISTERED;
> +                /* run the service */
> +                ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
> +                if (ret < 0) {
> +                        ERROR("Failed Setting component runstate\n");
> +                        return ret;
> +                }
> +                ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
> +                if (ret < 0) {
> +                        ERROR("Failed enabling stats\n");
> +                        return ret;
> +                }
> +                ret = rte_service_runstate_set(priv->rxp.sid, 1);
> +                if (ret < 0) {
> +                        ERROR("Failed to run service\n");
> +                        return ret;
> +                }
> +                priv->rxp.sstate = SS_READY;
> +                /* map the service with the service core */
> +                ret = rte_service_map_lcore_set(priv->rxp.sid,
> +                                                priv->rxp.scid, 1);
> +                if (ret) {
> +                        ERROR("Failed to install Rx interrupts, "
> +                              "could not map service core");
> +                        return ret;
> +                }
> +                priv->rxp.sstate = SS_RUNNING;
> +        }
> +        return 0;
> +}
> +
> +/**
> + * Install failsafe Rx event proxy subsystem.
> + * This is the way the failsafe PMD generates Rx events on behalf of its
> + * subdevices.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +fs_rx_event_proxy_install(struct fs_priv *priv)
> +{
> +        int rc = 0;
> +
> +        /* create the epoll to wait on for Rx events form subdevices */

-->        /* create the epoll to wait on for Rx events from subdevices */

> +        priv->rxp.efd = epoll_create1(0);
> +        if (priv->rxp.efd < 0) {
> +                rte_errno = errno;
> +                ERROR("failed to create epoll,"
> +                      " Rx interrupts will not be supported");

Failed should be capitalized.

> +                return -rte_errno;
> +        }
> +        /* allocate memory for receiving the Rx events from the subdevices. */
> +        priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
> +        if (priv->rxp.evec == NULL) {
> +                ERROR("failed to allocate memory for event vectors,"
> +                      " Rx interrupts will not be supported");

Same here.

> +                rc = -ENOMEM;
> +                goto error;
> +        }
> +        if (fs_rx_event_proxy_service_install(priv) < 0) {
> +                rc = -rte_errno;
> +                goto error;
> +        }
> +        return 0;
> +error:
> +        if (priv->rxp.efd >= 0)
> +                close(priv->rxp.efd);
> +        if (priv->rxp.evec)

Check against NULL.

> +                free(priv->rxp.evec);
> +        rte_errno = -rc;
> +        return rc;
> +}
> +
> +/**
> + * Uninstall failsafe Rx event proxy.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + */
> +static void
> +fs_rx_event_proxy_uninstall(struct fs_priv *priv)
> +{
> +        fs_rx_event_proxy_service_uninstall(priv);
> +        if (priv->rxp.evec) {

Check against NULL.

> +                free(priv->rxp.evec);
> +                priv->rxp.evec = NULL;
> +        }
> +        if (priv->rxp.efd > 0) {
> +                close(priv->rxp.efd);
> +                priv->rxp.efd = -1;
> +        }
> +}
> +
> +/**
> + * Uninstall failsafe interrupt vector.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + */
> +static void
> +fs_rx_intr_vec_uninstall(struct fs_priv *priv)
> +{
> +        struct rte_intr_handle *intr_handle = &priv->intr_handle;
> +
> +        if (intr_handle->intr_vec) {

Check against NULL.

> +                free(intr_handle->intr_vec);
> +                intr_handle->intr_vec = NULL;
> +        }
> +        intr_handle->nb_efd = 0;
> +}
> +/**
> + * Installs failsafe interrupt vector to be registered with EAL later on.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +fs_rx_intr_vec_install(struct fs_priv *priv)
> +{
> +        unsigned int i;
> +        unsigned int rxqs_n = priv->dev->data->nb_rx_queues;
> +        unsigned int n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
> +        unsigned int count = 0;
> +        struct rte_intr_handle *intr_handle = &priv->intr_handle;
> +
> +        /* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
> +        intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
> +        if (intr_handle->intr_vec == NULL) {
> +                fs_rx_intr_vec_uninstall(priv);
> +                rte_errno = ENOMEM;
> +                ERROR("failed to allocate memory for interrupt vector,"
> +                      " Rx interrupts will not be supported");

Failed capitalized.

> +                return -rte_errno;
> +        }
> +        for (i = 0; i < n; i++) {
> +                struct rxq *rxq = priv->dev->data->rx_queues[i];
> +
> +                /* Skip queues that cannot request interrupts. */
> +                if (!rxq || rxq->event_fd < 0) {
> +                        /* Use invalid intr_vec[] index to disable entry. */
> +                        intr_handle->intr_vec[i] =
> +                                RTE_INTR_VEC_RXTX_OFFSET +
> +                                RTE_MAX_RXTX_INTR_VEC_ID;
> +                        continue;
> +                }
> +                if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
> +                        rte_errno = E2BIG;
> +                        ERROR("too many Rx queues for interrupt vector size"
> +                              " (%d), Rx interrupts cannot be enabled",

Too capitalized.

> +                              RTE_MAX_RXTX_INTR_VEC_ID);
> +                        fs_rx_intr_vec_uninstall(priv);
> +                        return -rte_errno;
> +                }
> +                intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
> +                intr_handle->efds[count] = rxq->event_fd;
> +                count++;
> +        }
> +        if (!count)

It would be better compared with 0.

> +                fs_rx_intr_vec_uninstall(priv);
> +        else
> +                intr_handle->nb_efd = count;
> +        return 0;
> +}
> +
> +/**
> + * RX Interrupt control per subdevice.
> + *
> + * @param sdev
> + *   Pointer to sub-device structure.
> + * @param op
> + *   The operation to be performed for the vector.
> + *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
> + * @return
> + *   - On success, zero.
> + *   - On failure, a negative value.
> + */
> +static int
> +failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
> +{
> +        struct rte_eth_dev *dev;
> +        struct rte_eth_dev *fsdev;
> +        int epfd;
> +        uint16_t pid;
> +        uint16_t qid;
> +        struct rxq *fsrxq;
> +        int rc;
> +        int ret = 0;
> +
> +        if (sdev == NULL || (ETH(sdev) == NULL) ||
> +            sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
> +                ERROR("Called with invalid arguments");
> +                return -EINVAL;
> +        }
> +        dev = ETH(sdev);
> +        fsdev = sdev->fs_dev;
> +        epfd = PRIV(sdev->fs_dev)->rxp.efd;
> +        pid = PORT_ID(sdev);
> +
> +        if (epfd <= 0) {
> +                if (op == RTE_INTR_EVENT_ADD) {
> +                        ERROR("proxy events are not initialized");

Proxy should be capitalized here.

> +                        return -EBADFD;
> +                } else {
> +                        return 0;
> +                }
> +        }
> +        if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
> +                ERROR("subdevice has too many queues,"
> +                      " Interrupts will not be enabled");
> +                        return -E2BIG;
> +        }
> +        for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
> +                fsrxq = fsdev->data->rx_queues[qid];
> +                rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
> +                                               op, (void *)fsrxq);
> +                if (rc) {
> +                        ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
> +                              "port %d  queue %d, epfd %d, error %d",
> +                              pid, qid, epfd, rc);
> +                        ret = rc;
> +                }
> +        }
> +        return ret;
> +}
> +
> +/**
> + * Install Rx interrupts subsystem for a subdevice.
> + * This is a support for dynamically adding subdevices.

So it works with Matan's patch for capturing ethdev?
Have you tested capturing ports with the following configurations:

port \ conf
failsafe rx intr       on   |   on   |  off  |  off
ethdev   rx intr       on   |   off  |  on   |  off
                                         |
        .--------------------------------'
       (_
         `-> and how should this configuration work?

> + *
> + * @param sdev
> + *   Pointer to subdevice structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
> +{
> +        int rc;
> +        int qid;
> +        struct rte_eth_dev *fsdev = sdev->fs_dev;
> +        struct rxq **rxq = (struct rxq **)fsdev->data->rx_queues;
> +        const struct rte_intr_conf *const intr_conf =
> +                                &ETH(sdev)->data->dev_conf.intr_conf;
> +
> +        if (!intr_conf->rxq)

Explicit comparison please.

> +                return 0;
> +        rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
> +        if (rc)
> +                return rc;
> +        /* enable interrupts on already-enabled queues */
> +        for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
> +                if (rxq[qid]->enable_events) {
> +                        int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
> +                                                             qid);
> +                        if (ret && (ret != -ENOTSUP)) {
> +                                ERROR("Failed to enable interrupts on "
> +                                      "port %d queue %d", PORT_ID(sdev), qid);
> +                                rc = ret;
> +                        }
> +                }
> +        }
> +        return rc;
> +}
> +
> +/**
> + * Uninstall Rx interrupts subsystem for a subdevice.
> + * This is a support for dynamically removing subdevices.
> + *
> + * @param sdev
> + *   Pointer to subdevice structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
> +{
> +        int qid;
> +
> +        for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
> +                rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);
> +        failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
> +}
> +
> +/**
> + * Uninstall failsafe Rx interrupts subsystem.
> + *
> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +void
> +failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
> +{
> +        struct fs_priv *priv = PRIV(dev);
> +        struct rte_intr_handle *intr_handle = &priv->intr_handle;
> +
> +        dev->intr_handle = NULL;
> +        rte_intr_free_epoll_fd(intr_handle);
> +        fs_rx_event_proxy_uninstall(priv);
> +        if (intr_handle->intr_vec) {

Needs an explicit comparison with NULL.

> +                free(intr_handle->intr_vec);
> +                intr_handle->intr_vec = NULL;
> +        }
> +        intr_handle->nb_efd = 0;
> +}
> +
> +/**
> + * Install failsafe Rx interrupts subsystem.
> + *
> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +int
> +failsafe_rx_intr_install(struct rte_eth_dev *dev)
> +{
> +        struct fs_priv *priv = PRIV(dev);
> +        const struct rte_intr_conf *const intr_conf =
> +                        &priv->dev->data->dev_conf.intr_conf;
> +
> +        if (!intr_conf->rxq || priv->intr_handle.intr_vec != NULL)

Needs an explicit comparison.

> +                return 0;
> +        if (fs_rx_intr_vec_install(priv) < 0)
> +                return -rte_errno;
> +        if (fs_rx_event_proxy_install(priv) < 0) {
> +                fs_rx_intr_vec_uninstall(priv);
> +                return -rte_errno;
> +        }
> +        priv->intr_handle.efd_counter_size = sizeof(uint64_t);
> +        dev->intr_handle = &priv->intr_handle;
> +        return 0;
> +}
> +
> +
> +/**
> + * DPDK callback for Rx queue interrupt disable.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param idx
> + *   Rx queue index.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +int
> +failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
> +{
> +        struct rxq *rxq = dev->data->rx_queues[idx];
> +        struct sub_device *sdev;
> +        uint64_t u64;
> +        uint8_t i;
> +        int rc = 0;
> +        int ret;
> +
> +        if (!rxq || rxq->event_fd <= 0) {

Needs an explicit comparison.

> +                rte_errno = EINVAL;
> +                return -rte_errno;
> +        }
> +        rxq->enable_events = 0;
> +        FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
> +                ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
> +                if ((ret != -ENODEV) && !fs_err(sdev, ret))
> +                        rc = ret;
> +        }
> +        /* Clear pending events */
> +        while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
> +                ;
> +        if (rc)
> +                rte_errno = -rc;
> +        return rc;
> +}
> +
> +/**
> + * DPDK callback for Rx queue interrupt enable.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param idx
> + *   Rx queue index.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +int
> +failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
> +{
> +        struct rxq *rxq = dev->data->rx_queues[idx];
> +        struct sub_device *sdev;
> +        uint8_t i;
> +        int rc = 0;
> +        int ret;
> +
> +        if (!rxq || rxq->event_fd <= 0) {
> +                rte_errno = EINVAL;
> +                return -rte_errno;
> +        }
> +        /* Let the proxy service run. */
> +        if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
> +                ERROR("failsafe interrupt services are not running");
> +                rte_errno = EAGAIN;
> +                return -rte_errno;
> +        }
> +        rxq->enable_events = 1;
> +        FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
> +                ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
> +                if ((ret != -ENODEV) && !fs_err(sdev, ret))
> +                        rc = ret;
> +        }
> +        if (rc) {
> +                failsafe_rx_intr_disable(dev, idx);
> +                rte_errno = -rc;
> +        }
> +        return rc;
> +}
> diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
> index 0976745..b5b4eab 100644
> --- a/drivers/net/failsafe/failsafe_ops.c
> +++ b/drivers/net/failsafe/failsafe_ops.c
> @@ -32,6 +32,7 @@
>   */
>  
>  #include <stdint.h>
> +#include <unistd.h>
>  
>  #include <rte_debug.h>
>  #include <rte_atomic.h>
> @@ -160,6 +161,10 @@
>          uint8_t i;
>          int ret;
>  
> +        ret = failsafe_rx_intr_install(dev);
> +        if (ret)
> +                return ret;
> +
>          FOREACH_SUBDEV(sdev, i, dev) {
>                  if (sdev->state != DEV_ACTIVE)
>                          continue;
> @@ -170,6 +175,11 @@
>                                  continue;
>                          return ret;
>                  }
> +                ret = failsafe_rx_intr_install_subdevice(sdev);
> +                if (ret) {
> +                        rte_eth_dev_stop(PORT_ID(sdev));
> +                        return ret;
> +                }
>                  sdev->state = DEV_STARTED;
>          }
>          if (PRIV(dev)->state < DEV_STARTED)
> @@ -186,9 +196,11 @@
>  
>          PRIV(dev)->state = DEV_STARTED - 1;
>          FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
> +                failsafe_rx_intr_uninstall_subdevice(sdev);
>                  rte_eth_dev_stop(PORT_ID(sdev));
>                  sdev->state = DEV_STARTED - 1;
>          }
> +        failsafe_rx_intr_uninstall(dev);
>  }
>  
>  static int
> @@ -259,6 +271,8 @@
>          if (queue == NULL)
>                  return;
>          rxq = queue;
> +        if (rxq->event_fd > 0)
> +                close(rxq->event_fd);
>          dev = rxq->priv->dev;
>          FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
>                  SUBOPS(sdev, rx_queue_release)
> @@ -275,6 +289,14 @@
>                  const struct rte_eth_rxconf *rx_conf,
>                  struct rte_mempool *mb_pool)
>  {
> +        /*
> +         * Fake MSIX interrupts causing rte_intr_efd_enable to
> +         * allocate an eventfd for us.
> +         */

I'm a bit sceptical about this.
This seems like subverting the API for your own ends.

The preferred way would be to extend the API, for example by introducing
a specific type that would ask for additional eventfd allocation.

The implementation would have been very simple, and it would be much
cleaner.

It seems too late now, but that should be done instead of keeping this.
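
For illustration only, a dedicated allocation helper could be as small as the sketch below. It assumes a Linux host and calls eventfd(2) directly. `rxq_event_fd_alloc` and `rxq_event_fd_free` are hypothetical names, not existing EAL functions, and the flags are a guess at what a non-blocking Rx event fd would need.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an EAL API): allocate one non-blocking
 * eventfd to be used as the event fd of a failsafe Rx queue.
 * Returns the descriptor, or -1 with errno set by eventfd(2).
 */
static int
rxq_event_fd_alloc(void)
{
	return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/*
 * Hypothetical helper: close the descriptor if it is valid and mark it
 * as released, so that calling it twice is harmless.
 */
static void
rxq_event_fd_free(int *fd)
{
	assert(fd != NULL);
	if (*fd >= 0) {
		close(*fd);
		*fd = -1;
	}
}
```

Using -1 as the "no descriptor" value keeps 0 available as a valid fd, which is why the sketch tests `>= 0` before closing.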

> +        struct rte_intr_handle intr_handle = {
> +                .type = RTE_INTR_HANDLE_VFIO_MSIX,
> +                .efds = {-1, },
> +        };
>          struct sub_device *sdev;
>          struct rxq *rxq;
>          uint8_t i;
> @@ -300,6 +322,10 @@
>          rxq->info.nb_desc = nb_rx_desc;
>          rxq->priv = PRIV(dev);
>          rxq->sdev = PRIV(dev)->subs;
> +        ret = rte_intr_efd_enable(&intr_handle, 1);
> +        if (ret < 0)
> +                return ret;
> +        rxq->event_fd = intr_handle.efds[0];
>          dev->data->rx_queues[rx_queue_id] = rxq;
>          FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>                  ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
> @@ -781,4 +807,6 @@
>          .mac_addr_add = fs_mac_addr_add,
>          .mac_addr_set = fs_mac_addr_set,
>          .filter_ctrl = fs_filter_ctrl,
> +        .rx_queue_intr_enable = failsafe_rx_intr_enable,
> +        .rx_queue_intr_disable = failsafe_rx_intr_disable,

You should add those two ops between ".tx_queue_release" and
".flow_ctrl_get", to keep the same order as the struct eth_dev_ops.

Also, it would be better to implement those two ops as static
functions within this file (like all the other eth_dev_ops), calling
your rxq intr implementations from here.

>  };
> diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> index b377046..d7617c4 100644
> --- a/drivers/net/failsafe/failsafe_private.h
> +++ b/drivers/net/failsafe/failsafe_private.h
> @@ -40,6 +40,7 @@
>  #include <rte_dev.h>
>  #include <rte_ethdev.h>
>  #include <rte_devargs.h>
> +#include <rte_interrupts.h>
>  
>  #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
>  #define FAILSAFE_OWNER_NAME "Fail-safe"
> @@ -61,6 +62,13 @@
>  
>  #define DEVARGS_MAXLEN 4096
>  
> +enum rxp_service_state {
> +        SS_NO_SREVICE = 0,

typo for SREVICE.

> +        SS_REGISTERED,
> +        SS_READY,
> +        SS_RUNNING,
> +};
> +
>  /* TYPES */
>  
>  struct rxq {
> @@ -69,10 +77,25 @@ struct rxq {
>          /* next sub_device to poll */
>          struct sub_device *sdev;
>          unsigned int socket_id;
> +        int event_fd;
> +        unsigned int enable_events:1;
>          struct rte_eth_rxq_info info;
>          rte_atomic64_t refcnt[];
>  };
>  
> +struct rx_proxy {
> +        /* epoll file descriptor */
> +        int efd;
> +        /* event vector to be used by epoll */
> +        struct rte_epoll_event *evec;
> +        /* rte service id */
> +        uint32_t sid;
> +        /* service core id */
> +        uint32_t scid;
> +        enum rxp_service_state sstate;
> +
> +};
> +
>  struct txq {
>          struct fs_priv *priv;
>          uint16_t qid;
> @@ -147,6 +170,7 @@ struct fs_priv {
>          /* current capabilities */
>          struct rte_eth_dev_info infos;
>          struct rte_eth_dev_owner my_owner; /* Unique owner. */
> +        struct rte_intr_handle intr_handle; /* Port interrupt handle. */
>          /*
>           * Fail-safe state machine.
>           * This level will be tracking state of the EAL and eth
> @@ -159,8 +183,28 @@ struct fs_priv {
>          unsigned int pending_alarm:1; /* An alarm is pending */
>          /* flow isolation state */
>          int flow_isolated:1;
> +        /*
> +         * Rx interrupts/events proxy.
> +         * The PMD issues Rx events to the EAL on behalf of its subdevices,
> +         *  it does that by registering event queues to the EAL. Each such
> +         *  queue represents a failsafe Rx queue. A PMD service thread listens
> +         *  to all the Rx events of all the failsafe subdevices.
> +         *  When an Rx event is issued by a subdevice Rx queue it will be
> +         *  caught by the service and delivered by it to the appropriate
> +         *  failsafe event queue.
> +         */
> +        struct rx_proxy rxp;

Not very important, but can you put this before the :1 bitfields?

>  };
>  
> +/* FAILSAFE_INTR */
> +
> +int failsafe_rx_intr_install(struct rte_eth_dev *dev);
> +void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
> +int failsafe_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx);
> +int failsafe_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx);
> +int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
> +void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
> +
>  /* MISC */
>  
>  int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);

Overall good quality code, thanks!

-- 
Gaëtan Rivet
6WIND

* [PATCH v5 0/3] net/failsafe: add Rx interrupts support
  2018-01-19  9:32       ` Moti Haimovsky
  2018-01-19 14:11         ` Gaëtan Rivet
@ 2018-01-23 18:43         ` Moti Haimovsky
  2018-01-23 18:43           ` [PATCH v5 1/3] net/failsafe: regiter as an Rx interrupt mode PMD Moti Haimovsky
                             ` (2 more replies)
  1 sibling, 3 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-23 18:43 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

These three patches add support for registering and waiting for 
Rx interrupts in failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents the application with a facade of a single
device, while internally it manages several devices on behalf of the
application, including packet transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
    a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
    will allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx
        events from the sub-devices.
      o For each Rx event received the proxy service will
          - Retrieve the pointer to failsafe Rx queue that handles
            this subdevice Rx queue from the user info returned by the
            EAL.
          - Trigger a failsafe Rx event on that queue by writing to
            the event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable
    and rx_queue_intr_disable routines that will enable and disable Rx
    interrupts, respectively, on both the failsafe and its subdevices.
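
The proxy scheme listed above can be sketched with plain Linux eventfd and epoll calls. This is only an illustration under stated assumptions, not the code from these patches: `struct fs_queue`, `struct proxy_entry`, `proxy_register` and `proxy_run_once` are hypothetical names, and the real PMD goes through the EAL (rte_epoll and the service API) rather than calling epoll directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for a failsafe Rx queue (not the driver's struct). */
struct fs_queue {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* non-zero while Rx interrupts are enabled */
};

/* Hypothetical link between one sub-device event fd and its failsafe queue. */
struct proxy_entry {
	int sub_fd;
	struct fs_queue *fsq;
};

/* Register a sub-device event fd on the private epoll fd. */
static int
proxy_register(int epfd, struct proxy_entry *ent)
{
	struct epoll_event ev = { .events = EPOLLIN, .data = { .ptr = ent } };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, ent->sub_fd, &ev);
}

/*
 * One pass of the proxy: wait for sub-device events and forward each
 * one to the failsafe queue eventfd, unless events are disabled there.
 * Returns the number of forwarded events, or -1 if epoll_wait() fails.
 */
static int
proxy_run_once(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	int forwarded = 0;
	int n = epoll_wait(epfd, evs, 8, timeout_ms);

	if (n < 0)
		return -1;
	for (int i = 0; i < n; i++) {
		struct proxy_entry *ent = evs[i].data.ptr;
		uint64_t u64;

		assert(ent != NULL && ent->fsq != NULL);
		/* Consume the sub-device event so epoll does not report it again. */
		if (read(ent->sub_fd, &u64, sizeof(u64)) != (ssize_t)sizeof(u64))
			continue;
		if (!ent->fsq->enable_events)
			continue;
		u64 = 1;
		if (write(ent->fsq->event_fd, &u64, sizeof(u64)) ==
		    (ssize_t)sizeof(u64))
			forwarded++;
	}
	return forwarded;
}
```

Carrying a pointer in the epoll user data mirrors the idea of passing the failsafe Rx queue to the EAL as a parameter: the proxy learns which failsafe queue to signal without any lookup table.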

Moti Haimovsky (3):
  net/failsafe: regiter as an Rx interrupt mode PMD
  net/failsafe: slaves Rx interrupts registration
  net/failsafe: add Rx interrupts

---
V5:
Modified the code and split the patch into three patches in accordance
with input from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to input from Stephen Hemminger:
* Removed unneeded (void *) casting.
* Fixed coding style warning.
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 492 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     | 102 +++++++
 drivers/net/failsafe/failsafe_private.h |  40 +++
 7 files changed, 641 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

-- 
1.8.3.1

* [PATCH v5 1/3] net/failsafe: regiter as an Rx interrupt mode PMD
  2018-01-23 18:43         ` [PATCH v5 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
@ 2018-01-23 18:43           ` Moti Haimovsky
  2018-01-23 18:43           ` [PATCH v5 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
  2018-01-23 18:43           ` [PATCH v5 " Moti Haimovsky
  2 siblings, 0 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-23 18:43 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch registers the Rx queues of the failsafe PMD with the EAL
Rx interrupts subsystem.
Each failsafe Rx queue is assigned a unique eventfd and an enable
interrupts flag.
The PMD creates an interrupt vector containing the above eventfds and
registers it with the EAL. The PMD also implements the Rx interrupts
enable and disable interface routines.
This patch does not implement the generation of Rx interrupts, so an
application can now wait for failsafe Rx interrupts but it will not
receive any.
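
The enable and disable semantics described above can be sketched on a bare eventfd as follows. This is a minimal illustration, not the driver code: `struct demo_rxq`, `demo_rx_intr_enable` and `demo_rx_intr_disable` are hypothetical names, and the sketch assumes the event fd was created with EFD_NONBLOCK, since the drain loop would block on a blocking descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-queue state added by this patch. */
struct demo_rxq {
	int event_fd;                 /* non-blocking eventfd */
	unsigned int enable_events:1; /* Rx interrupts enabled flag */
};

/* Enable Rx events on the queue; returns 0 or -EINVAL. */
static int
demo_rx_intr_enable(struct demo_rxq *rxq)
{
	if (rxq == NULL || rxq->event_fd < 0)
		return -EINVAL;
	rxq->enable_events = 1;
	return 0;
}

/* Disable Rx events and drop anything still pending on the eventfd. */
static int
demo_rx_intr_disable(struct demo_rxq *rxq)
{
	uint64_t u64;

	if (rxq == NULL || rxq->event_fd < 0)
		return -EINVAL;
	rxq->enable_events = 0;
	/* A non-blocking eventfd returns -1 (EAGAIN) once its counter is 0. */
	while (read(rxq->event_fd, &u64, sizeof(u64)) > 0)
		;
	return 0;
}
```

The sketch treats any non-negative descriptor as valid, whereas the patch itself rejects `event_fd <= 0`. Draining on disable means an application that re-enables interrupts later does not wake up on a stale event.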

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V5:
Initial version of this patch, in accordance with input from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_intr.c    | 132 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  64 ++++++++++++++++
 drivers/net/failsafe/failsafe_private.h |   9 +++
 6 files changed, 211 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status          = Y
 Link status event    = Y
+Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 Promiscuous mode     = Y
diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index cb274eb..921e656 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -244,6 +244,10 @@
 		mac->addr_bytes[2], mac->addr_bytes[3],
 		mac->addr_bytes[4], mac->addr_bytes[5]);
 	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+	PRIV(dev)->intr_handle = (struct rte_intr_handle){
+		.fd = -1,
+		.type = RTE_INTR_HANDLE_EXT,
+	};
 	return 0;
 free_args:
 	failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 0000000..2c92d95
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,132 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include <unistd.h>
+
+#include "failsafe_private.h"
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	if (intr_handle->intr_vec != NULL) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+	unsigned int i;
+	unsigned int rxqs_n = priv->dev->data->nb_rx_queues;
+	unsigned int n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+	unsigned int count = 0;
+	struct rte_intr_handle *intr_handle = &priv->intr_handle;
+
+	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+	if (intr_handle->intr_vec == NULL) {
+		fs_rx_intr_vec_uninstall(priv);
+		rte_errno = ENOMEM;
+		ERROR("Failed to allocate memory for interrupt vector,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	for (i = 0; i < n; i++) {
+		struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+		/* Skip queues that cannot request interrupts. */
+		if (rxq == NULL || rxq->event_fd < 0) {
+			/* Use invalid intr_vec[] index to disable entry. */
+			intr_handle->intr_vec[i] =
+				RTE_INTR_VEC_RXTX_OFFSET +
+				RTE_MAX_RXTX_INTR_VEC_ID;
+			continue;
+		}
+		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
+			rte_errno = E2BIG;
+			ERROR("Too many Rx queues for interrupt vector size"
+			      " (%d), Rx interrupts cannot be enabled",
+			      RTE_MAX_RXTX_INTR_VEC_ID);
+			fs_rx_intr_vec_uninstall(priv);
+			return -rte_errno;
+		}
+		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
+		intr_handle->efds[count] = rxq->event_fd;
+		count++;
+	}
+	if (count == 0) {
+		fs_rx_intr_vec_uninstall(priv);
+	} else {
+		intr_handle->nb_efd = count;
+		intr_handle->efd_counter_size = sizeof(uint64_t);
+	}
+	return 0;
+}
+
+
+/**
+ * Uninstall failsafe Rx interrupts subsystem.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void
+failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+
+	fs_rx_intr_vec_uninstall(priv);
+	dev->intr_handle = NULL;
+}
+
+/**
+ * Install failsafe Rx interrupts subsystem.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_install(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	const struct rte_intr_conf *const intr_conf =
+			&priv->dev->data->dev_conf.intr_conf;
+
+	if (intr_conf->rxq == 0 || priv->intr_handle.intr_vec != NULL)
+		return 0;
+	if (fs_rx_intr_vec_install(priv) < 0)
+		return -rte_errno;
+	dev->intr_handle = &priv->intr_handle;
+	return 0;
+}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 946ac98..d6a82b3 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -33,6 +33,7 @@
 
 #include <stdbool.h>
 #include <stdint.h>
+#include <unistd.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -199,6 +200,10 @@
 	uint8_t i;
 	int ret;
 
+	ret = failsafe_rx_intr_install(dev);
+	if (ret)
+		return ret;
+
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_ACTIVE)
 			continue;
@@ -228,6 +233,7 @@
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_STARTED - 1;
 	}
+	failsafe_rx_intr_uninstall(dev);
 }
 
 static int
@@ -317,6 +323,8 @@
 	if (queue == NULL)
 		return;
 	rxq = queue;
+	if (rxq->event_fd > 0)
+		close(rxq->event_fd);
 	dev = rxq->priv->dev;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
 		SUBOPS(sdev, rx_queue_release)
@@ -333,6 +341,16 @@
 		const struct rte_eth_rxconf *rx_conf,
 		struct rte_mempool *mb_pool)
 {
+	/*
+	 * FIXME: Add a proper interface in rte_eal_interrupts for
+	 * allocating eventfd as an interrupt vector.
+	 * For the time being, fake as if we are using MSIX interrupts,
+	 * this will cause rte_intr_efd_enable to allocate an eventfd for us.
+	 */
+	struct rte_intr_handle intr_handle = {
+		.type = RTE_INTR_HANDLE_VFIO_MSIX,
+		.efds = {-1, },
+	};
 	struct sub_device *sdev;
 	struct rxq *rxq;
 	uint8_t i;
@@ -370,6 +388,10 @@
 	rxq->info.nb_desc = nb_rx_desc;
 	rxq->priv = PRIV(dev);
 	rxq->sdev = PRIV(dev)->subs;
+	ret = rte_intr_efd_enable(&intr_handle, 1);
+	if (ret < 0)
+		return ret;
+	rxq->event_fd = intr_handle.efds[0];
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
@@ -387,6 +409,46 @@
 	return ret;
 }
 
+static int
+fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq;
+
+	if (idx >= dev->data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq = dev->data->rx_queues[idx];
+	if (rxq == NULL || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 1;
+	return 0;
+}
+
+static int
+fs_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq;
+	uint64_t u64;
+
+	if (idx >= dev->data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq = dev->data->rx_queues[idx];
+	if (rxq == NULL || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 0;
+	/* Clear pending events */
+	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
+		;
+	return 0;
+}
+
 static bool
 fs_txq_offloads_valid(struct rte_eth_dev *dev, uint64_t offloads)
 {
@@ -888,6 +950,8 @@
 	.tx_queue_setup = fs_tx_queue_setup,
 	.rx_queue_release = fs_rx_queue_release,
 	.tx_queue_release = fs_tx_queue_release,
+	.rx_queue_intr_enable = fs_rx_intr_enable,
+	.rx_queue_intr_disable = fs_rx_intr_disable,
 	.flow_ctrl_get = fs_flow_ctrl_get,
 	.flow_ctrl_set = fs_flow_ctrl_set,
 	.mac_addr_remove = fs_mac_addr_remove,
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 7754248..419e5e7 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -40,6 +40,7 @@
 #include <rte_dev.h>
 #include <rte_ethdev_driver.h>
 #include <rte_devargs.h>
+#include <rte_interrupts.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
 
@@ -68,6 +69,8 @@ struct rxq {
 	/* next sub_device to poll */
 	struct sub_device *sdev;
 	unsigned int socket_id;
+	int event_fd;
+	unsigned int enable_events:1;
 	struct rte_eth_rxq_info info;
 	rte_atomic64_t refcnt[];
 };
@@ -145,6 +148,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_intr_handle intr_handle; /* Port interrupt handle. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
@@ -159,6 +163,11 @@ struct fs_priv {
 	int flow_isolated:1;
 };
 
+/* FAILSAFE_INTR */
+
+int failsafe_rx_intr_install(struct rte_eth_dev *dev);
+void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+
 /* MISC */
 
 int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 2/3] net/failsafe: slaves Rx interrupts registration
  2018-01-23 18:43         ` [PATCH v5 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-23 18:43           ` [PATCH v5 1/3] net/failsafe: regiter as an Rx interrupt mode PMD Moti Haimovsky
@ 2018-01-23 18:43           ` Moti Haimovsky
  2018-01-24 16:12             ` [PATCH v6 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-23 18:43           ` [PATCH v5 " Moti Haimovsky
  2 siblings, 1 reply; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-23 18:43 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This commit adds the following functionality to the failsafe PMD:
* Register and unregister slaves' Rx interrupts.
* Enable and disable slaves' Rx interrupts.
The interrupt events generated by the slaves are not handled in this
commit.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V5:
Initial version of this patch in accordance with input from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com
---
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 196 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  36 +++++-
 drivers/net/failsafe/failsafe_private.h |  16 +++
 4 files changed, 247 insertions(+), 2 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..0f1630e 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
 		return;
 	switch (sdev->state) {
 	case DEV_STARTED:
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_ACTIVE;
 		/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index 2c92d95..2f18b3d 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -7,10 +7,198 @@
  * Interrupts handling for failsafe driver.
  */
 
+#include <sys/epoll.h>
 #include <unistd.h>
 
 #include "failsafe_private.h"
 
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+	int rc = 0;
+
+	/*
+	 * Create the epoll fd and event vector for the proxy service to
+	 * wait on for Rx events generated by the subdevices.
+	 */
+	priv->rxp.efd = epoll_create1(0);
+	if (priv->rxp.efd < 0) {
+		rte_errno = errno;
+		ERROR("Failed to create epoll,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
+	if (priv->rxp.evec == NULL) {
+		ERROR("Failed to allocate memory for event vectors,"
+		      " Rx interrupts will not be supported");
+		rc = -ENOMEM;
+		goto error;
+	}
+	return 0;
+error:
+	if (priv->rxp.efd >= 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+	if (priv->rxp.evec != NULL) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * RX Interrupt control per subdevice.
+ *
+ * @param sdev
+ *   Pointer to sub-device structure.
+ * @param op
+ *   The operation to be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int
+failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *fsdev;
+	int epfd;
+	uint16_t pid;
+	uint16_t qid;
+	struct rxq *fsrxq;
+	int rc;
+	int ret = 0;
+
+	if (sdev == NULL || (ETH(sdev) == NULL) ||
+	    sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
+		ERROR("Called with invalid arguments");
+		return -EINVAL;
+	}
+	dev = ETH(sdev);
+	fsdev = sdev->fs_dev;
+	epfd = PRIV(sdev->fs_dev)->rxp.efd;
+	pid = PORT_ID(sdev);
+
+	if (epfd <= 0) {
+		if (op == RTE_INTR_EVENT_ADD) {
+			ERROR("Proxy events are not initialized");
+			return -EBADFD;
+		} else {
+			return 0;
+		}
+	}
+	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
+		ERROR("subdevice has too many queues,"
+		      " Interrupts will not be enabled");
+		return -E2BIG;
+	}
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		fsrxq = fsdev->data->rx_queues[qid];
+		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
+					       op, (void *)fsrxq);
+		if (rc) {
+			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
+			      "port %d queue %d, epfd %d, error %d",
+			      pid, qid, epfd, rc);
+			ret = rc;
+		}
+	}
+	return ret;
+}
+
+/**
+ * Install Rx interrupts subsystem for a subdevice.
+ * This supports dynamically adding subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
+{
+	int rc;
+	int qid;
+	struct rte_eth_dev *fsdev = sdev->fs_dev;
+	struct rxq **rxq = (struct rxq **)fsdev->data->rx_queues;
+	const struct rte_intr_conf *const intr_conf =
+				&ETH(sdev)->data->dev_conf.intr_conf;
+
+	if (intr_conf->rxq == 0)
+		return 0;
+	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
+	if (rc)
+		return rc;
+	/* enable interrupts on already-enabled queues */
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (rxq[qid]->enable_events) {
+			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
+							     qid);
+			if (ret && (ret != -ENOTSUP)) {
+				ERROR("Failed to enable interrupts on "
+				      "port %d queue %d", PORT_ID(sdev), qid);
+				rc = ret;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall Rx interrupts subsystem for a subdevice.
+ * This supports dynamically removing subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @note
+ *   This function returns nothing; subdevice errors are ignored.
+ */
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
+{
+	int qid;
+
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
+		rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);
+	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
+}
+
+/**
+ * Uninstall failsafe Rx event proxy.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_uninstall(struct fs_priv *priv)
+{
+	if (priv->rxp.evec != NULL) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	if (priv->rxp.efd > 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+}
+
 /**
  * Uninstall failsafe interrupt vector.
  *
@@ -102,7 +290,11 @@
 failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
 {
 	struct fs_priv *priv = PRIV(dev);
+	struct rte_intr_handle *intr_handle;
 
+	intr_handle = &priv->intr_handle;
+	rte_intr_free_epoll_fd(intr_handle);
+	fs_rx_event_proxy_uninstall(priv);
 	fs_rx_intr_vec_uninstall(priv);
 	dev->intr_handle = NULL;
 }
@@ -127,6 +319,10 @@
 		return 0;
 	if (fs_rx_intr_vec_install(priv) < 0)
 		return -rte_errno;
+	if (fs_rx_event_proxy_install(priv) < 0) {
+		fs_rx_intr_vec_uninstall(priv);
+		return -rte_errno;
+	}
 	dev->intr_handle = &priv->intr_handle;
 	return 0;
 }
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index d6a82b3..2ea9cdd 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -214,6 +214,13 @@
 				continue;
 			return ret;
 		}
+		ret = failsafe_rx_intr_install_subdevice(sdev);
+		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
+			rte_eth_dev_stop(PORT_ID(sdev));
+			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -231,6 +238,7 @@
 	PRIV(dev)->state = DEV_STARTED - 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
 		rte_eth_dev_stop(PORT_ID(sdev));
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		sdev->state = DEV_STARTED - 1;
 	}
 	failsafe_rx_intr_uninstall(dev);
@@ -413,6 +421,10 @@
 fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
 {
 	struct rxq *rxq;
+	struct sub_device *sdev;
+	uint8_t i;
+	int ret;
+	int rc = 0;
 
 	if (idx >= dev->data->nb_rx_queues) {
 		rte_errno = EINVAL;
@@ -424,14 +436,26 @@
 		return -rte_errno;
 	}
 	rxq->enable_events = 1;
-	return 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
+		ret = fs_err(sdev, ret);
+		if (ret)
+			rc = ret;
+	}
+	if (rc)
+		rte_errno = -rc;
+	return rc;
 }
 
 static int
 fs_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
 {
 	struct rxq *rxq;
+	struct sub_device *sdev;
 	uint64_t u64;
+	uint8_t i;
+	int rc = 0;
+	int ret;
 
 	if (idx >= dev->data->nb_rx_queues) {
 		rte_errno = EINVAL;
@@ -443,10 +467,18 @@
 		return -rte_errno;
 	}
 	rxq->enable_events = 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
+		ret = fs_err(sdev, ret);
+		if (ret)
+			rc = ret;
+	}
 	/* Clear pending events */
 	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
 		;
-	return 0;
+	if (rc)
+		rte_errno = -rc;
+	return rc;
 }
 
 static bool
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 419e5e7..ff78b9f 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -63,6 +63,13 @@
 
 /* TYPES */
 
+struct rx_proxy {
+	/* epoll file descriptor */
+	int efd;
+	/* event vector to be used by epoll */
+	struct rte_epoll_event *evec;
+};
+
 struct rxq {
 	struct fs_priv *priv;
 	uint16_t qid;
@@ -158,6 +165,13 @@ struct fs_priv {
 	 */
 	enum dev_state state;
 	struct rte_eth_stats stats_accumulator;
+	/*
+	 * Rx interrupts/events proxy.
+	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
+	 * it does that by registering an event-fd for each of its queues with
+	 * the EAL.
+	 */
+	struct rx_proxy rxp;
 	unsigned int pending_alarm:1; /* An alarm is pending */
 	/* flow isolation state */
 	int flow_isolated:1;
@@ -167,6 +181,8 @@ struct fs_priv {
 
 int failsafe_rx_intr_install(struct rte_eth_dev *dev);
 void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
 
 /* MISC */
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 3/3] net/failsafe: add Rx interrupts
  2018-01-23 18:43         ` [PATCH v5 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-23 18:43           ` [PATCH v5 1/3] net/failsafe: regiter as an Rx interrupt mode PMD Moti Haimovsky
  2018-01-23 18:43           ` [PATCH v5 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
@ 2018-01-23 18:43           ` Moti Haimovsky
  2 siblings, 0 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-23 18:43 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch is the last in the series that adds support for
registering and waiting for Rx interrupts in the failsafe PMD.
This allows applications to wait for Rx events from the PMD using
the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application, including packet
transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
    a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
    will allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx
        events from the sub-devices.
      o For each Rx event received the proxy service will
          - Retrieve the pointer to failsafe Rx queue that handles
            this subdevice Rx queue from the user info returned by the
            EAL.
          - Trigger a failsafe Rx event on that queue by writing to
            the event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable
    and rx_queue_intr_disable routines that will enable and disable Rx
    interrupts respectively on both the failsafe and its subdevices.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V5:
Modified code and split the patch into three patches in accordance with
input from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
---

 drivers/net/failsafe/failsafe_intr.c    | 164 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |   6 ++
 drivers/net/failsafe/failsafe_private.h |  17 +++-
 3 files changed, 186 insertions(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index 2f18b3d..2a84e81 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -10,11 +10,170 @@
 #include <sys/epoll.h>
 #include <unistd.h>
 
+#include <rte_alarm.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include <rte_interrupts.h>
+#include <rte_io.h>
+#include <rte_service_component.h>
+
 #include "failsafe_private.h"
 
 #define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
 
 /**
+ * Rx event proxy service routine.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param data
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_routine(void *data)
+{
+	struct fs_priv *priv = data;
+	struct rxq *rxq;
+	struct rte_epoll_event *events = priv->rxp.evec;
+	uint64_t u64 = 1;
+	int i, n, rc = 0;
+
+	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+	for (i = 0; i < n; i++) {
+		rxq = events[i].epdata.data;
+		if (rxq->enable_events && rxq->event_fd != -1) {
+			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+			    sizeof(u64)) {
+				ERROR("Failed to proxy Rx event to event fd %d",
+				       rxq->event_fd);
+				rc = -EIO;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
+{
+	/* Unregister the event service. */
+	switch (priv->rxp.sstate) {
+	case SS_RUNNING:
+		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
+		/* fall through */
+	case SS_READY:
+		rte_service_runstate_set(priv->rxp.sid, 0);
+		rte_service_set_stats_enable(priv->rxp.sid, 0);
+		rte_service_component_runstate_set(priv->rxp.sid, 0);
+		/* fall through */
+	case SS_REGISTERED:
+		rte_service_component_unregister(priv->rxp.sid);
+		/* fall through */
+	default:
+		break;
+	}
+}
+
+/**
+ * Install the failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service_install(struct fs_priv *priv)
+{
+	struct rte_service_spec service;
+	int32_t num_service_cores = rte_service_lcore_count();
+	int ret = 0;
+
+	if (num_service_cores <= 0) {
+		ERROR("Failed to install Rx interrupts, "
+		      "no service core found");
+		return -ENOTSUP;
+	}
+	/* prepare service info */
+	memset(&service, 0, sizeof(struct rte_service_spec));
+	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
+		 priv->dev->data->name);
+	service.socket_id = priv->dev->data->numa_node;
+	service.callback = fs_rx_event_proxy_routine;
+	service.callback_userdata = (void *)priv;
+
+	if (priv->rxp.sstate == SS_NO_SERVICE) {
+		uint32_t service_core_list[num_service_cores];
+
+		/* get a service core to work with */
+		ret = rte_service_lcore_list(service_core_list,
+					     num_service_cores);
+		if (ret <= 0) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core list empty or corrupted");
+			return -ENOTSUP;
+		}
+		priv->rxp.scid = service_core_list[0];
+		ret = rte_service_lcore_add(priv->rxp.scid);
+		if (ret && ret != -EALREADY) {
+			ERROR("Failed adding service core");
+			return ret;
+		}
+		/* service core may be in "stopped" state, start it */
+		ret = rte_service_lcore_start(priv->rxp.scid);
+		if (ret && (ret != -EALREADY)) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core not started");
+			return ret;
+		}
+		/* register our service */
+		ret = rte_service_component_register(&service,
+						     &priv->rxp.sid);
+		if (ret) {
+			ERROR("service register() failed");
+			return -ENOEXEC;
+		}
+		priv->rxp.sstate = SS_REGISTERED;
+		/* run the service */
+		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed setting component runstate");
+			return ret;
+		}
+		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed enabling stats");
+			return ret;
+		}
+		ret = rte_service_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed to run service");
+			return ret;
+		}
+		priv->rxp.sstate = SS_READY;
+		/* map the service with the service core */
+		ret = rte_service_map_lcore_set(priv->rxp.sid,
+						priv->rxp.scid, 1);
+		if (ret) {
+			ERROR("Failed to install Rx interrupts, "
+			      "could not map service core");
+			return ret;
+		}
+		priv->rxp.sstate = SS_RUNNING;
+	}
+	return 0;
+}
+
+/**
  * Install failsafe Rx event proxy subsystem.
  * This is the way the failsafe PMD generates Rx events on behalf of its
  * subdevices.
@@ -47,6 +206,10 @@
 		rc = -ENOMEM;
 		goto error;
 	}
+	/* The service install routine returns a negative errno value. */
+	rc = fs_rx_event_proxy_service_install(priv);
+	if (rc < 0)
+		goto error;
 	return 0;
 error:
 	if (priv->rxp.efd >= 0) {
@@ -189,6 +352,7 @@ void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
 static void
 fs_rx_event_proxy_uninstall(struct fs_priv *priv)
 {
+	fs_rx_event_proxy_service_uninstall(priv);
 	if (priv->rxp.evec != NULL) {
 		free(priv->rxp.evec);
 		priv->rxp.evec = NULL;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 2ea9cdd..840024c 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -435,6 +435,12 @@
 		rte_errno = EINVAL;
 		return -rte_errno;
 	}
+	/* Fail if proxy service is not running. */
+	if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
+		ERROR("failsafe interrupt services are not running");
+		rte_errno = EAGAIN;
+		return -rte_errno;
+	}
 	rxq->enable_events = 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index ff78b9f..5d328ff 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -61,6 +61,13 @@
 
 #define DEVARGS_MAXLEN 4096
 
+enum rxp_service_state {
+	SS_NO_SERVICE = 0,
+	SS_REGISTERED,
+	SS_READY,
+	SS_RUNNING,
+};
+
 /* TYPES */
 
 struct rx_proxy {
@@ -68,6 +75,11 @@ struct rx_proxy {
 	int efd;
 	/* event vector to be used by epoll */
 	struct rte_epoll_event *evec;
+	/* rte service id */
+	uint32_t sid;
+	/* service core id */
+	uint32_t scid;
+	enum rxp_service_state sstate;
 };
 
 struct rxq {
@@ -169,7 +181,10 @@ struct fs_priv {
 	 * Rx interrupts/events proxy.
 	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
 	 * it does that by registering an event-fd for each of its queues with
-	 * the EAL.
+	 * the EAL. A PMD service thread listens to all the Rx events from the
+	 * subdevices. When an Rx event is issued by a subdevice it will be
+	 * caught by this service, which will trigger an Rx event in the
+	 * appropriate failsafe Rx queue.
 	 */
 	struct rx_proxy rxp;
 	unsigned int pending_alarm:1; /* An alarm is pending */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v6 0/3] net/failsafe: add Rx interrupts support
  2018-01-23 18:43           ` [PATCH v5 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
@ 2018-01-24 16:12             ` Moti Haimovsky
  2018-01-24 16:12               ` [PATCH v6 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
                                 ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-24 16:12 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

These three patches add support for registering and waiting for
Rx interrupts in the failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application, including packet
transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
    a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
    will allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx
        events from the sub-devices.
      o For each Rx event received the proxy service will
          - Retrieve the pointer to failsafe Rx queue that handles
            this subdevice Rx queue from the user info returned by the
            EAL.
          - Trigger a failsafe Rx event on that queue by writing to
            the event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable
    and rx_queue_intr_disable routines that will enable and disable Rx
    interrupts respectively on both the failsafe and its subdevices.
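
For readers who want to see the proxy idea in isolation, here is a
minimal sketch of the scheme described above. It is not the code from
these patches: it uses plain Linux eventfd and epoll calls instead of the
EAL wrappers, a sub-device Rx interrupt is simulated by an ordinary
eventfd, and the names demo_rxq, demo_register and demo_proxy_once are
hypothetical.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for a failsafe Rx queue (not the PMD's struct rxq). */
struct demo_rxq {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* set by "interrupt enable", cleared by "disable" */
};

/* Register a sub-device fd on the private epoll fd, tagged with its queue. */
static int
demo_register(int epfd, int sub_fd, struct demo_rxq *rxq)
{
	struct epoll_event ev = { .events = EPOLLIN, .data = { .ptr = rxq } };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, sub_fd, &ev);
}

/*
 * One pass of the proxy: wait for sub-device events and forward each one
 * to the owning queue's eventfd, unless events are disabled for that queue.
 * Returns the number of events forwarded, or -1 on error.
 */
static int
demo_proxy_once(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	uint64_t one = 1;
	int forwarded = 0;
	int i, n;

	n = epoll_wait(epfd, evs, 8, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		/* The user data registered above identifies the queue. */
		struct demo_rxq *rxq = evs[i].data.ptr;

		if (!rxq->enable_events || rxq->event_fd < 0)
			continue;
		if (write(rxq->event_fd, &one, sizeof(one)) != sizeof(one))
			return -1;
		forwarded++;
	}
	return forwarded;
}
```

In this sketch the application side would simply poll or read
demo_rxq.event_fd; it never sees the sub-device fds, which is the point
of the facade.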

Moti Haimovsky (3):
  net/failsafe: register as an Rx interrupt mode PMD
  net/failsafe: slaves Rx interrupts registration
  net/failsafe: add Rx interrupts
---
V6:
* Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
  See: 1516193643-130838-1-git-send-email-motih@mellanox.com
* Separated variable definitions from their initialization in routines,
  according to guidelines from Gaetan Rivet.

V5:
Modified code and split the patch into three patches in accordance with
input from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
---
 doc/guides/nics/features/failsafe.ini          |   1 +
 drivers/net/failsafe/Makefile                  |   6 +
 drivers/net/failsafe/failsafe.c                |   4 +
 drivers/net/failsafe/failsafe_epoll.h          |  10 +
 drivers/net/failsafe/failsafe_epoll_bsdapp.c   |  19 +
 drivers/net/failsafe/failsafe_epoll_linuxapp.c |  18 +
 drivers/net/failsafe/failsafe_ether.c          |   1 +
 drivers/net/failsafe/failsafe_intr.c           | 505 +++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c            | 102 +++++
 drivers/net/failsafe/failsafe_private.h        |  40 ++
 10 files changed, 706 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_epoll.h
 create mode 100644 drivers/net/failsafe/failsafe_epoll_bsdapp.c
 create mode 100644 drivers/net/failsafe/failsafe_epoll_linuxapp.c
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v6 1/3] net/failsafe: register as an Rx interrupt mode PMD
  2018-01-24 16:12             ` [PATCH v6 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
@ 2018-01-24 16:12               ` Moti Haimovsky
  2018-01-24 16:12               ` [PATCH v6 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
  2018-01-24 16:12               ` [PATCH v6 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
  2 siblings, 0 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-24 16:12 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch registers the Rx queues of the failsafe PMD with the EAL
Rx interrupts subsystem.
Each failsafe Rx queue is assigned a unique eventfd and an enable
interrupts flag.
The PMD creates an interrupt vector containing the above eventfds and
registers it with the EAL. The PMD also implements the Rx interrupts
enable and disable interface routines.
This patch does not implement the generation of Rx interrupts, so an
application can now wait for failsafe Rx interrupts but it will not
receive any.
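
As an illustration of the disable path, the disable routine in this
series clears pending events by reading the queue's eventfd until the
read fails. Below is a minimal sketch of such a drain. It assumes the
eventfd was created with EFD_NONBLOCK; on a blocking eventfd the final
read would block instead of failing. The helper name demo_drain_eventfd
is hypothetical and does not appear in the patch.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Drain a nonblocking eventfd.
 * Returns the sum of the counter values read, 0 if nothing was pending.
 */
static uint64_t
demo_drain_eventfd(int efd)
{
	uint64_t total = 0;
	uint64_t val;

	/* Each successful read returns the counter and resets it to zero. */
	while (read(efd, &val, sizeof(val)) == (ssize_t)sizeof(val))
		total += val;
	return total;
}
```

Draining after clearing the enable flag means an application that later
re-enables the queue does not wake up on a stale event.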

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V6:
Fixed typo in commit subject.

V5:
Initial version of this patch in accordance with input from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_intr.c    | 138 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  64 +++++++++++++++
 drivers/net/failsafe/failsafe_private.h |   9 +++
 6 files changed, 217 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status          = Y
 Link status event    = Y
+Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 Promiscuous mode     = Y
diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index cb274eb..921e656 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -244,6 +244,10 @@
 		mac->addr_bytes[2], mac->addr_bytes[3],
 		mac->addr_bytes[4], mac->addr_bytes[5]);
 	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+	PRIV(dev)->intr_handle = (struct rte_intr_handle){
+		.fd = -1,
+		.type = RTE_INTR_HANDLE_EXT,
+	};
 	return 0;
 free_args:
 	failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 0000000..54ef2f4
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,138 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include <unistd.h>
+
+#include "failsafe_private.h"
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+	struct rte_intr_handle *intr_handle;
+
+	intr_handle = &priv->intr_handle;
+	if (intr_handle->intr_vec != NULL) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+	unsigned int i;
+	unsigned int rxqs_n;
+	unsigned int n;
+	unsigned int count;
+	struct rte_intr_handle *intr_handle;
+
+	rxqs_n = priv->dev->data->nb_rx_queues;
+	n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+	count = 0;
+	intr_handle = &priv->intr_handle;
+	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+	if (intr_handle->intr_vec == NULL) {
+		fs_rx_intr_vec_uninstall(priv);
+		rte_errno = ENOMEM;
+		ERROR("Failed to allocate memory for interrupt vector,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	for (i = 0; i < n; i++) {
+		struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+		/* Skip queues that cannot request interrupts. */
+		if (rxq == NULL || rxq->event_fd < 0) {
+			/* Use invalid intr_vec[] index to disable entry. */
+			intr_handle->intr_vec[i] =
+				RTE_INTR_VEC_RXTX_OFFSET +
+				RTE_MAX_RXTX_INTR_VEC_ID;
+			continue;
+		}
+		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
+			rte_errno = E2BIG;
+			ERROR("Too many Rx queues for interrupt vector size"
+			      " (%d), Rx interrupts cannot be enabled",
+			      RTE_MAX_RXTX_INTR_VEC_ID);
+			fs_rx_intr_vec_uninstall(priv);
+			return -rte_errno;
+		}
+		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
+		intr_handle->efds[count] = rxq->event_fd;
+		count++;
+	}
+	if (count == 0) {
+		fs_rx_intr_vec_uninstall(priv);
+	} else {
+		intr_handle->nb_efd = count;
+		intr_handle->efd_counter_size = sizeof(uint64_t);
+	}
+	return 0;
+}
+
+
+/**
+ * Uninstall failsafe Rx interrupts subsystem.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void
+failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv;
+
+	priv = PRIV(dev);
+	fs_rx_intr_vec_uninstall(priv);
+	dev->intr_handle = NULL;
+}
+
+/**
+ * Install failsafe Rx interrupts subsystem.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_install(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	const struct rte_intr_conf *const intr_conf =
+			&priv->dev->data->dev_conf.intr_conf;
+
+	if (intr_conf->rxq == 0 || priv->intr_handle.intr_vec != NULL)
+		return 0;
+	if (fs_rx_intr_vec_install(priv) < 0)
+		return -rte_errno;
+	dev->intr_handle = &priv->intr_handle;
+	return 0;
+}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 946ac98..d6a82b3 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -33,6 +33,7 @@
 
 #include <stdbool.h>
 #include <stdint.h>
+#include <unistd.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -199,6 +200,10 @@
 	uint8_t i;
 	int ret;
 
+	ret = failsafe_rx_intr_install(dev);
+	if (ret)
+		return ret;
+
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_ACTIVE)
 			continue;
@@ -228,6 +233,7 @@
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_STARTED - 1;
 	}
+	failsafe_rx_intr_uninstall(dev);
 }
 
 static int
@@ -317,6 +323,8 @@
 	if (queue == NULL)
 		return;
 	rxq = queue;
+	if (rxq->event_fd > 0)
+		close(rxq->event_fd);
 	dev = rxq->priv->dev;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
 		SUBOPS(sdev, rx_queue_release)
@@ -333,6 +341,16 @@
 		const struct rte_eth_rxconf *rx_conf,
 		struct rte_mempool *mb_pool)
 {
+	/*
+	 * FIXME: Add a proper interface in rte_eal_interrupts for
+	 * allocating eventfd as an interrupt vector.
+	 * For the time being, fake as if we are using MSIX interrupts,
+	 * this will cause rte_intr_efd_enable to allocate an eventfd for us.
+	 */
+	struct rte_intr_handle intr_handle = {
+		.type = RTE_INTR_HANDLE_VFIO_MSIX,
+		.efds = {-1, },
+	};
 	struct sub_device *sdev;
 	struct rxq *rxq;
 	uint8_t i;
@@ -370,6 +388,10 @@
 	rxq->info.nb_desc = nb_rx_desc;
 	rxq->priv = PRIV(dev);
 	rxq->sdev = PRIV(dev)->subs;
+	ret = rte_intr_efd_enable(&intr_handle, 1);
+	if (ret < 0)
+		return ret;
+	rxq->event_fd = intr_handle.efds[0];
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
@@ -387,6 +409,46 @@
 	return ret;
 }
 
+static int
+fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq;
+
+	if (idx >= dev->data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq = dev->data->rx_queues[idx];
+	if (rxq == NULL || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 1;
+	return 0;
+}
+
+static int
+fs_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq;
+	uint64_t u64;
+
+	if (idx >= dev->data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq = dev->data->rx_queues[idx];
+	if (rxq == NULL || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 0;
+	/* Clear pending events */
+	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
+		;
+	return 0;
+}
+
 static bool
 fs_txq_offloads_valid(struct rte_eth_dev *dev, uint64_t offloads)
 {
@@ -888,6 +950,8 @@
 	.tx_queue_setup = fs_tx_queue_setup,
 	.rx_queue_release = fs_rx_queue_release,
 	.tx_queue_release = fs_tx_queue_release,
+	.rx_queue_intr_enable = fs_rx_intr_enable,
+	.rx_queue_intr_disable = fs_rx_intr_disable,
 	.flow_ctrl_get = fs_flow_ctrl_get,
 	.flow_ctrl_set = fs_flow_ctrl_set,
 	.mac_addr_remove = fs_mac_addr_remove,
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 7754248..419e5e7 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -40,6 +40,7 @@
 #include <rte_dev.h>
 #include <rte_ethdev_driver.h>
 #include <rte_devargs.h>
+#include <rte_interrupts.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
 
@@ -68,6 +69,8 @@ struct rxq {
 	/* next sub_device to poll */
 	struct sub_device *sdev;
 	unsigned int socket_id;
+	int event_fd;
+	unsigned int enable_events:1;
 	struct rte_eth_rxq_info info;
 	rte_atomic64_t refcnt[];
 };
@@ -145,6 +148,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_intr_handle intr_handle; /* Port interrupt handle. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
@@ -159,6 +163,11 @@ struct fs_priv {
 	int flow_isolated:1;
 };
 
+/* FAILSAFE_INTR */
+
+int failsafe_rx_intr_install(struct rte_eth_dev *dev);
+void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+
 /* MISC */
 
 int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v6 2/3] net/failsafe: slaves Rx interrupts registration
  2018-01-24 16:12             ` [PATCH v6 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-24 16:12               ` [PATCH v6 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
@ 2018-01-24 16:12               ` Moti Haimovsky
  2018-01-25  8:07                 ` [PATCH v7 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-24 16:12               ` [PATCH v6 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
  2 siblings, 1 reply; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-24 16:12 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This commit adds the following functionality to failsafe PMD:
* Register and unregister slaves' Rx interrupts.
* Enable and disable slaves' Rx interrupts.
The interrupt events generated by the slaves are not handled in this
commit.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V6:
Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
See: 1516193643-130838-1-git-send-email-motih@mellanox.com

V5:
Initial version of this patch, in accordance with input from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com
---
 drivers/net/failsafe/Makefile                  |   5 +
 drivers/net/failsafe/failsafe_epoll.h          |  10 ++
 drivers/net/failsafe/failsafe_epoll_bsdapp.c   |  19 +++
 drivers/net/failsafe/failsafe_epoll_linuxapp.c |  18 +++
 drivers/net/failsafe/failsafe_ether.c          |   1 +
 drivers/net/failsafe/failsafe_intr.c           | 198 +++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c            |  36 ++++-
 drivers/net/failsafe/failsafe_private.h        |  16 ++
 8 files changed, 301 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/failsafe/failsafe_epoll.h
 create mode 100644 drivers/net/failsafe/failsafe_epoll_bsdapp.c
 create mode 100644 drivers/net/failsafe/failsafe_epoll_linuxapp.c

diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index 91a734b..4e6a983 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -47,6 +47,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
+ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_epoll_linuxapp.c
+else
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_epoll_bsdapp.c
+endif
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe_epoll.h b/drivers/net/failsafe/failsafe_epoll.h
new file mode 100644
index 0000000..8e6a1ec
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+#ifndef _RTE_ETH_FAILSAFE_EPOLL_H_
+#define _RTE_ETH_FAILSAFE_EPOLL_H_
+
+int failsafe_epoll_create1(int flags);
+
+#endif /* _RTE_ETH_FAILSAFE_EPOLL_H_*/
diff --git a/drivers/net/failsafe/failsafe_epoll_bsdapp.c b/drivers/net/failsafe/failsafe_epoll_bsdapp.c
new file mode 100644
index 0000000..46c839b
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll_bsdapp.c
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * epoll wrapper for failsafe driver.
+ */
+
+#include <rte_common.h>
+
+#include "failsafe_epoll.h"
+
+int
+failsafe_epoll_create1(int flags)
+{
+	RTE_SET_USED(flags);
+	return -ENOTSUP;
+}
diff --git a/drivers/net/failsafe/failsafe_epoll_linuxapp.c b/drivers/net/failsafe/failsafe_epoll_linuxapp.c
new file mode 100644
index 0000000..d82ee0a
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll_linuxapp.c
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * epoll wrapper for failsafe driver.
+ */
+
+#include <sys/epoll.h>
+
+#include "failsafe_epoll.h"
+
+int
+failsafe_epoll_create1(int flags)
+{
+	return epoll_create1(flags);
+}
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..0f1630e 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
 		return;
 	switch (sdev->state) {
 	case DEV_STARTED:
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_ACTIVE;
 		/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index 54ef2f4..512efc7 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -9,8 +9,198 @@
 
 #include <unistd.h>
 
+#include "failsafe_epoll.h"
 #include "failsafe_private.h"
 
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+	int rc = 0;
+
+	/*
+	 * Create the epoll fd and event vector for the proxy service to
+	 * wait on for Rx events generated by the subdevices.
+	 */
+	priv->rxp.efd = failsafe_epoll_create1(0);
+	if (priv->rxp.efd < 0) {
+		rte_errno = errno;
+		ERROR("Failed to create epoll,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
+	if (priv->rxp.evec == NULL) {
+		ERROR("Failed to allocate memory for event vectors,"
+		      " Rx interrupts will not be supported");
+		rc = -ENOMEM;
+		goto error;
+	}
+	return 0;
+error:
+	if (priv->rxp.efd >= 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+	if (priv->rxp.evec != NULL) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * RX Interrupt control per subdevice.
+ *
+ * @param sdev
+ *   Pointer to sub-device structure.
+ * @param op
+ *   The operation to be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int
+failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *fsdev;
+	int epfd;
+	uint16_t pid;
+	uint16_t qid;
+	struct rxq *fsrxq;
+	int rc;
+	int ret = 0;
+
+	if (sdev == NULL || (ETH(sdev) == NULL) ||
+	    sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
+		ERROR("Called with invalid arguments");
+		return -EINVAL;
+	}
+	dev = ETH(sdev);
+	fsdev = sdev->fs_dev;
+	epfd = PRIV(sdev->fs_dev)->rxp.efd;
+	pid = PORT_ID(sdev);
+
+	if (epfd <= 0) {
+		if (op == RTE_INTR_EVENT_ADD) {
+			ERROR("Proxy events are not initialized");
+			return -EBADFD;
+		} else {
+			return 0;
+		}
+	}
+	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
+		ERROR("subdevice has too many queues,"
+		      " Interrupts will not be enabled");
+		return -E2BIG;
+	}
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		fsrxq = fsdev->data->rx_queues[qid];
+		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
+					       op, (void *)fsrxq);
+		if (rc) {
+			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
+			      "port %d  queue %d, epfd %d, error %d",
+			      pid, qid, epfd, rc);
+			ret = rc;
+		}
+	}
+	return ret;
+}
+
+/**
+ * Install Rx interrupts subsystem for a subdevice.
+ * This is a support for dynamically adding subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
+{
+	int rc;
+	int qid;
+	struct rte_eth_dev *fsdev;
+	struct rxq **rxq;
+	const struct rte_intr_conf *const intr_conf =
+				&ETH(sdev)->data->dev_conf.intr_conf;
+
+	fsdev = sdev->fs_dev;
+	rxq = (struct rxq **)fsdev->data->rx_queues;
+	if (intr_conf->rxq == 0)
+		return 0;
+	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
+	if (rc)
+		return rc;
+	/* enable interrupts on already-enabled queues */
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (rxq[qid]->enable_events) {
+			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
+							     qid);
+			if (ret && (ret != -ENOTSUP)) {
+				ERROR("Failed to enable interrupts on "
+				      "port %d queue %d", PORT_ID(sdev), qid);
+				rc = ret;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall Rx interrupts subsystem for a subdevice.
+ * This is a support for dynamically removing subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
+{
+	int qid;
+
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
+		rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);
+	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
+}
+
+/**
+ * Uninstall failsafe Rx event proxy.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_uninstall(struct fs_priv *priv)
+{
+	if (priv->rxp.evec != NULL) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	if (priv->rxp.efd > 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+}
+
 /**
  * Uninstall failsafe interrupt vector.
  *
@@ -107,8 +297,12 @@
 failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
 {
 	struct fs_priv *priv;
+	struct rte_intr_handle *intr_handle;
 
 	priv = PRIV(dev);
+	intr_handle = &priv->intr_handle;
+	rte_intr_free_epoll_fd(intr_handle);
+	fs_rx_event_proxy_uninstall(priv);
 	fs_rx_intr_vec_uninstall(priv);
 	dev->intr_handle = NULL;
 }
@@ -133,6 +327,10 @@
 		return 0;
 	if (fs_rx_intr_vec_install(priv) < 0)
 		return -rte_errno;
+	if (fs_rx_event_proxy_install(priv) < 0) {
+		fs_rx_intr_vec_uninstall(priv);
+		return -rte_errno;
+	}
 	dev->intr_handle = &priv->intr_handle;
 	return 0;
 }
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index d6a82b3..2ea9cdd 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -214,6 +214,13 @@
 				continue;
 			return ret;
 		}
+		ret = failsafe_rx_intr_install_subdevice(sdev);
+		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
+			rte_eth_dev_stop(PORT_ID(sdev));
+			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -231,6 +238,7 @@
 	PRIV(dev)->state = DEV_STARTED - 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
 		rte_eth_dev_stop(PORT_ID(sdev));
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		sdev->state = DEV_STARTED - 1;
 	}
 	failsafe_rx_intr_uninstall(dev);
@@ -413,6 +421,10 @@
 fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
 {
 	struct rxq *rxq;
+	struct sub_device *sdev;
+	uint8_t i;
+	int ret;
+	int rc = 0;
 
 	if (idx >= dev->data->nb_rx_queues) {
 		rte_errno = EINVAL;
@@ -424,14 +436,26 @@
 		return -rte_errno;
 	}
 	rxq->enable_events = 1;
-	return 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
+		ret = fs_err(sdev, ret);
+		if (ret)
+			rc = ret;
+	}
+	if (rc)
+		rte_errno = -rc;
+	return rc;
 }
 
 static int
 fs_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
 {
 	struct rxq *rxq;
+	struct sub_device *sdev;
 	uint64_t u64;
+	uint8_t i;
+	int rc = 0;
+	int ret;
 
 	if (idx >= dev->data->nb_rx_queues) {
 		rte_errno = EINVAL;
@@ -443,10 +467,18 @@
 		return -rte_errno;
 	}
 	rxq->enable_events = 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
+		ret = fs_err(sdev, ret);
+		if (ret)
+			rc = ret;
+	}
 	/* Clear pending events */
 	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
 		;
-	return 0;
+	if (rc)
+		rte_errno = -rc;
+	return rc;
 }
 
 static bool
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 419e5e7..ff78b9f 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -63,6 +63,13 @@
 
 /* TYPES */
 
+struct rx_proxy {
+	/* epoll file descriptor */
+	int efd;
+	/* event vector to be used by epoll */
+	struct rte_epoll_event *evec;
+};
+
 struct rxq {
 	struct fs_priv *priv;
 	uint16_t qid;
@@ -158,6 +165,13 @@ struct fs_priv {
 	 */
 	enum dev_state state;
 	struct rte_eth_stats stats_accumulator;
+	/*
+	 * Rx interrupts/events proxy.
+	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
+	 * it does that by registering an event-fd for each of its queues with
+	 * the EAL.
+	 */
+	struct rx_proxy rxp;
 	unsigned int pending_alarm:1; /* An alarm is pending */
 	/* flow isolation state */
 	int flow_isolated:1;
@@ -167,6 +181,8 @@ struct fs_priv {
 
 int failsafe_rx_intr_install(struct rte_eth_dev *dev);
 void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
 
 /* MISC */
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v6 3/3] net/failsafe: add Rx interrupts
  2018-01-24 16:12             ` [PATCH v6 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-24 16:12               ` [PATCH v6 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
  2018-01-24 16:12               ` [PATCH v6 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
@ 2018-01-24 16:12               ` Moti Haimovsky
  2 siblings, 0 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-24 16:12 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch is the last patch in the series of patches aimed
to add support for registering and waiting for Rx interrupts
in failsafe PMD. This allows applications to wait for Rx events
from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application, including packet
transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned a
    Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and will
    allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx
        events from the sub-devices.
      o For each Rx event received the proxy service will
          - Retrieve the pointer to failsafe Rx queue that handles
            this subdevice Rx queue from the user info returned by the
            EAL.
          - Trigger a failsafe Rx event on that queue by writing to
            the event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable
    and rx_queue_intr_disable routines, which will enable and disable
    Rx interrupts, respectively, on both the failsafe and its subdevices.
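
For illustration only, the proxy step described above can be sketched with
plain Linux eventfd/epoll calls. This is not the driver code: the patch
uses rte_epoll_wait() and a DPDK service core, whereas the sketch below
calls epoll_wait() directly and drains the subdevice fd itself. The names
demo_rxq, demo_src, demo_proxy_once and demo_run are hypothetical and
exist only in this example.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the failsafe Rx queue (struct rxq). */
struct demo_rxq {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* set/cleared by the intr enable/disable ops */
};

/* Hypothetical source registered on the private epoll fd. */
struct demo_src {
	int fd;               /* subdevice Rx event fd */
	struct demo_rxq *rxq; /* failsafe queue that handles it */
};

/* One pass of the proxy: returns the number of events forwarded, or -1. */
static int
demo_proxy_once(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	uint64_t one = 1;
	uint64_t drained;
	int forwarded = 0;
	int i;
	int n;

	n = epoll_wait(epfd, evs, 8, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		struct demo_src *src = evs[i].data.ptr;

		/* Consume the subdevice event so it is not reported again. */
		if (read(src->fd, &drained, sizeof(drained)) != sizeof(drained))
			continue;
		/* Drop the event if interrupts are disabled on the queue. */
		if (!src->rxq->enable_events)
			continue;
		if (write(src->rxq->event_fd, &one, sizeof(one)) == sizeof(one))
			forwarded++;
	}
	return forwarded;
}

/*
 * Raise one subdevice event and run the proxy once.
 * Returns the counter read from the failsafe eventfd (0 when nothing
 * was forwarded), or -1 on setup failure.
 */
static int
demo_run(int enable)
{
	struct demo_rxq rxq = { .event_fd = -1, .enable_events = enable };
	struct demo_src src = { .fd = -1, .rxq = &rxq };
	struct epoll_event ev = { .events = EPOLLIN };
	uint64_t val = 1;
	int epfd;
	int ret = -1;

	rxq.event_fd = eventfd(0, EFD_NONBLOCK);
	src.fd = eventfd(0, EFD_NONBLOCK);
	epfd = epoll_create1(0);
	ev.data.ptr = &src;
	if (rxq.event_fd < 0 || src.fd < 0 || epfd < 0)
		goto out;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, src.fd, &ev) < 0)
		goto out;
	/* The "subdevice" signals Rx traffic. */
	if (write(src.fd, &val, sizeof(val)) != sizeof(val))
		goto out;
	if (demo_proxy_once(epfd, 100) < 0)
		goto out;
	/* Application side: a failed non-blocking read means no event. */
	val = 0;
	if (read(rxq.event_fd, &val, sizeof(val)) != sizeof(val))
		val = 0;
	ret = (int)val;
out:
	if (epfd >= 0)
		close(epfd);
	if (src.fd >= 0)
		close(src.fd);
	if (rxq.event_fd >= 0)
		close(rxq.event_fd);
	return ret;
}
```

Calling demo_run(1) models a queue with interrupts enabled, where the
event reaches the application's eventfd. demo_run(0) models a disabled
queue, where the proxy consumes the subdevice event without forwarding it.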

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V6:
Separated variable definitions from their initialization in routines,
according to guidelines from Gaetan Rivet.

V5:
Modified code and split the patch into three patches in accordance with
input from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
---
 drivers/net/failsafe/failsafe_intr.c    | 169 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |   6 ++
 drivers/net/failsafe/failsafe_private.h |  17 +++-
 3 files changed, 191 insertions(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index 512efc7..215bfe4 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -9,12 +9,176 @@
 
 #include <unistd.h>
 
+#include <rte_alarm.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include <rte_interrupts.h>
+#include <rte_io.h>
+#include <rte_service_component.h>
+
 #include "failsafe_epoll.h"
 #include "failsafe_private.h"
 
 #define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
 
 /**
+ * Install failsafe Rx event proxy service.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_routine(void *data)
+{
+	struct fs_priv *priv;
+	struct rxq *rxq;
+	struct rte_epoll_event *events;
+	uint64_t u64;
+	int i, n;
+	int rc = 0;
+
+	u64 = 1;
+	priv = data;
+	events = priv->rxp.evec;
+	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+	for (i = 0; i < n; i++) {
+		rxq = events[i].epdata.data;
+		if (rxq->enable_events && rxq->event_fd != -1) {
+			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+			    sizeof(u64)) {
+				ERROR("Failed to proxy Rx event to socket %d",
+				       rxq->event_fd);
+				rc = -EIO;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
+{
+	/* Unregister the event service. */
+	switch (priv->rxp.sstate) {
+	case SS_RUNNING:
+		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
+		/* fall through */
+	case SS_READY:
+		rte_service_runstate_set(priv->rxp.sid, 0);
+		rte_service_set_stats_enable(priv->rxp.sid, 0);
+		rte_service_component_runstate_set(priv->rxp.sid, 0);
+		/* fall through */
+	case SS_REGISTERED:
+		rte_service_component_unregister(priv->rxp.sid);
+		/* fall through */
+	default:
+		break;
+	}
+}
+
+/**
+ * Install the failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service_install(struct fs_priv *priv)
+{
+	struct rte_service_spec service;
+	int32_t num_service_cores;
+	int ret = 0;
+
+	num_service_cores = rte_service_lcore_count();
+	if (num_service_cores <= 0) {
+		ERROR("Failed to install Rx interrupts, "
+		      "no service core found");
+		return -ENOTSUP;
+	}
+	/* prepare service info */
+	memset(&service, 0, sizeof(struct rte_service_spec));
+	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
+		 priv->dev->data->name);
+	service.socket_id = priv->dev->data->numa_node;
+	service.callback = fs_rx_event_proxy_routine;
+	service.callback_userdata = (void *)priv;
+
+	if (priv->rxp.sstate == SS_NO_SERVICE) {
+		uint32_t service_core_list[num_service_cores];
+
+		/* get a service core to work with */
+		ret = rte_service_lcore_list(service_core_list,
+					     num_service_cores);
+		if (ret <= 0) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core list empty or corrupted");
+			return -ENOTSUP;
+		}
+		priv->rxp.scid = service_core_list[0];
+		ret = rte_service_lcore_add(priv->rxp.scid);
+		if (ret && ret != -EALREADY) {
+			ERROR("Failed adding service core");
+			return ret;
+		}
+		/* service core may be in "stopped" state, start it */
+		ret = rte_service_lcore_start(priv->rxp.scid);
+		if (ret && (ret != -EALREADY)) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core not started");
+			return ret;
+		}
+		/* register our service */
+		ret = rte_service_component_register(&service,
+						     &priv->rxp.sid);
+		if (ret) {
+			ERROR("service register() failed");
+			return -ENOEXEC;
+		}
+		priv->rxp.sstate = SS_REGISTERED;
+		/* run the service */
+		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed Setting component runstate\n");
+			return ret;
+		}
+		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed enabling stats\n");
+			return ret;
+		}
+		ret = rte_service_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed to run service\n");
+			return ret;
+		}
+		priv->rxp.sstate = SS_READY;
+		/* map the service with the service core */
+		ret = rte_service_map_lcore_set(priv->rxp.sid,
+						priv->rxp.scid, 1);
+		if (ret) {
+			ERROR("Failed to install Rx interrupts, "
+			      "could not map service core");
+			return ret;
+		}
+		priv->rxp.sstate = SS_RUNNING;
+	}
+	return 0;
+}
+
+/**
  * Install failsafe Rx event proxy subsystem.
  * This is the way the failsafe PMD generates Rx events on behalf of its
  * subdevices.
@@ -47,6 +211,10 @@
 		rc = -ENOMEM;
 		goto error;
 	}
+	if (fs_rx_event_proxy_service_install(priv) < 0) {
+		rc = -rte_errno;
+		goto error;
+	}
 	return 0;
 error:
 	if (priv->rxp.efd >= 0) {
@@ -191,6 +359,7 @@ void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
 static void
 fs_rx_event_proxy_uninstall(struct fs_priv *priv)
 {
+	fs_rx_event_proxy_service_uninstall(priv);
 	if (priv->rxp.evec != NULL) {
 		free(priv->rxp.evec);
 		priv->rxp.evec = NULL;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 2ea9cdd..840024c 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -435,6 +435,12 @@
 		rte_errno = EINVAL;
 		return -rte_errno;
 	}
+	/* Fail if proxy service is not running. */
+	if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
+		ERROR("failsafe interrupt services are not running");
+		rte_errno = EAGAIN;
+		return -rte_errno;
+	}
 	rxq->enable_events = 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index ff78b9f..5d328ff 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -61,6 +61,13 @@
 
 #define DEVARGS_MAXLEN 4096
 
+enum rxp_service_state {
+	SS_NO_SERVICE = 0,
+	SS_REGISTERED,
+	SS_READY,
+	SS_RUNNING,
+};
+
 /* TYPES */
 
 struct rx_proxy {
@@ -68,6 +75,11 @@ struct rx_proxy {
 	int efd;
 	/* event vector to be used by epoll */
 	struct rte_epoll_event *evec;
+	/* rte service id */
+	uint32_t sid;
+	/* service core id */
+	uint32_t scid;
+	enum rxp_service_state sstate;
 };
 
 struct rxq {
@@ -169,7 +181,10 @@ struct fs_priv {
 	 * Rx interrupts/events proxy.
 	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
 	 * it does that by registering an event-fd for each of its queues with
-	 * the EAL.
+	 * the EAL. A PMD service thread listens to all the Rx events from the
+	 * subdevices, when an Rx event is issued by a subdevice it will be
+	 * caught by this service which will trigger an Rx event in the
+	 * appropriate failsafe Rx queue.
 	 */
 	struct rx_proxy rxp;
 	unsigned int pending_alarm:1; /* An alarm is pending */
-- 
1.8.3.1

* [PATCH v7 0/3] net/failsafe: add Rx interrupts support
  2018-01-24 16:12               ` [PATCH v6 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
@ 2018-01-25  8:07                 ` Moti Haimovsky
  2018-01-25  8:07                   ` [PATCH v7 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
                                     ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-25  8:07 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

These three patches add support for registering and waiting for
Rx interrupts in the failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application, including packet
transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
    a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
    will allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx
        events from the sub-devices.
      o For each Rx event received the proxy service will:
          - Retrieve the pointer to the failsafe Rx queue that handles
            this subdevice Rx queue from the user info returned by the
            EAL.
          - Trigger a failsafe Rx event on that queue by writing to
            the event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable
    and rx_queue_intr_disable routines that will enable and disable Rx
    interrupts respectively on both the failsafe and its subdevices.
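
As a rough illustration of the proxy step in the scheme above, here is a
minimal sketch that uses plain Linux eventfd/epoll calls in place of the
DPDK rte_epoll and service APIs used by the patches. struct demo_rxq and
the demo_* helpers are hypothetical names made up for this sketch; they
are not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_MAX_EVENTS 8

/* Hypothetical stand-in for a failsafe Rx queue (not the driver's struct rxq). */
struct demo_rxq {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* non-zero when Rx interrupts are enabled */
};

/*
 * Register a subdevice queue fd on the private epoll fd, passing the
 * owning failsafe queue as user data.
 */
static int
demo_proxy_add(int epfd, int sub_fd, struct demo_rxq *rxq)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.ptr = rxq };

	assert(rxq != NULL);
	return epoll_ctl(epfd, EPOLL_CTL_ADD, sub_fd, &ev);
}

/*
 * One pass of the proxy: wait up to timeout_ms for subdevice events and
 * forward each one to the failsafe queue's eventfd, unless events are
 * disabled on that queue. Returns the number of events forwarded, or -1.
 * A real proxy would also have to acknowledge the source event; this
 * sketch leaves the subdevice fd untouched.
 */
static int
demo_proxy_run_once(int epfd, int timeout_ms)
{
	struct epoll_event evec[DEMO_MAX_EVENTS];
	uint64_t one = 1;
	int forwarded = 0;
	int n;
	int i;

	n = epoll_wait(epfd, evec, DEMO_MAX_EVENTS, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		struct demo_rxq *rxq = evec[i].data.ptr;

		if (!rxq->enable_events)
			continue;
		if (write(rxq->event_fd, &one, sizeof(one)) ==
		    (ssize_t)sizeof(one))
			forwarded++;
	}
	return forwarded;
}
```

The user-data pointer is what lets a single epoll fd serve every
subdevice queue: the proxy never needs to map a subdevice fd back to a
failsafe queue by lookup.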

Moti Haimovsky (3):
  net/failsafe: register as an Rx interrupt mode PMD
  net/failsafe: slaves Rx interrupts registration
  net/failsafe: add Rx interrupts
---
V7:
Fixed compilation errors in FreeBSD.
See 1516810328-39383-3-git-send-email-motih@mellanox.com

V6:
* Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
  See: 1516193643-130838-1-git-send-email-motih@mellanox.com
* Separated variable definitions from their initialization in routines,
  according to guidelines from Gaetan Rivet.

V5:
Modified code and split the patch into three patches in accordance with
input from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
---
 doc/guides/nics/features/failsafe.ini          |   1 +
 drivers/net/failsafe/Makefile                  |   6 +
 drivers/net/failsafe/failsafe.c                |   4 +
 drivers/net/failsafe/failsafe_epoll.h          |  10 +
 drivers/net/failsafe/failsafe_epoll_bsdapp.c   |  19 +
 drivers/net/failsafe/failsafe_epoll_linuxapp.c |  18 +
 drivers/net/failsafe/failsafe_ether.c          |   1 +
 drivers/net/failsafe/failsafe_intr.c           | 505 +++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c            | 102 +++++
 drivers/net/failsafe/failsafe_private.h        |  40 ++
 10 files changed, 706 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_epoll.h
 create mode 100644 drivers/net/failsafe/failsafe_epoll_bsdapp.c
 create mode 100644 drivers/net/failsafe/failsafe_epoll_linuxapp.c
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

-- 
1.8.3.1

* [PATCH v7 1/3] net/failsafe: register as an Rx interrupt mode PMD
  2018-01-25  8:07                 ` [PATCH v7 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
@ 2018-01-25  8:07                   ` Moti Haimovsky
  2018-01-25 11:36                     ` Gaëtan Rivet
  2018-01-25  8:07                   ` [PATCH v7 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
  2018-01-25  8:07                   ` [PATCH v7 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
  2 siblings, 1 reply; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-25  8:07 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch registers the Rx queues of the failsafe PMD with the EAL
Rx interrupts subsystem.
Each failsafe Rx queue is assigned a unique eventfd and an enable
interrupts flag.
The PMD creates an interrupt vector containing the above eventfds and
registers it with the EAL. The PMD also implements the Rx interrupts
enable and disable interface routines.
This patch does not implement the generation of Rx interrupts, so an
application can now wait for failsafe Rx interrupts but it will not
receive any.
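
To illustrate the per-queue eventfd and enable flag described above,
here is a minimal sketch in plain C using the Linux eventfd call
directly. struct demo_queue and the demo_* functions are hypothetical
names for this sketch only. It assumes the eventfd is non-blocking,
which the drain loop needs in order to terminate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-queue state (not the driver's struct rxq). */
struct demo_queue {
	int event_fd;                 /* eventfd signalled on Rx events */
	unsigned int enable_events:1; /* Rx interrupts enabled flag */
};

/* Give the queue its own non-blocking eventfd; events start disabled. */
static int
demo_queue_init(struct demo_queue *q)
{
	assert(q != NULL);
	q->enable_events = 0;
	q->event_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
	return q->event_fd < 0 ? -1 : 0;
}

static void
demo_queue_intr_enable(struct demo_queue *q)
{
	q->enable_events = 1;
}

/*
 * Disable events and discard anything still pending on the eventfd.
 * Returns the number of pending events that were dropped.
 */
static uint64_t
demo_queue_intr_disable(struct demo_queue *q)
{
	uint64_t dropped = 0;
	uint64_t u64;

	q->enable_events = 0;
	/* A non-blocking read fails with EAGAIN once the counter is zero. */
	while (read(q->event_fd, &u64, sizeof(u64)) > 0)
		dropped += u64;
	return dropped;
}
```

Draining on disable means an application that re-enables the queue
later does not wake up for events raised while it was not listening.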

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V6:
Fixed typo in commit subject.

V5:
Initial version of this patch in accordance with input from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_intr.c    | 138 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  64 +++++++++++++++
 drivers/net/failsafe/failsafe_private.h |   9 +++
 6 files changed, 217 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status          = Y
 Link status event    = Y
+Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 Promiscuous mode     = Y
diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index cb274eb..921e656 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -244,6 +244,10 @@
 		mac->addr_bytes[2], mac->addr_bytes[3],
 		mac->addr_bytes[4], mac->addr_bytes[5]);
 	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+	PRIV(dev)->intr_handle = (struct rte_intr_handle){
+		.fd = -1,
+		.type = RTE_INTR_HANDLE_EXT,
+	};
 	return 0;
 free_args:
 	failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 0000000..54ef2f4
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,138 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include <unistd.h>
+
+#include "failsafe_private.h"
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+	struct rte_intr_handle *intr_handle;
+
+	intr_handle = &priv->intr_handle;
+	if (intr_handle->intr_vec != NULL) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+	unsigned int i;
+	unsigned int rxqs_n;
+	unsigned int n;
+	unsigned int count;
+	struct rte_intr_handle *intr_handle;
+
+	rxqs_n = priv->dev->data->nb_rx_queues;
+	n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+	count = 0;
+	intr_handle = &priv->intr_handle;
+	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+	if (intr_handle->intr_vec == NULL) {
+		fs_rx_intr_vec_uninstall(priv);
+		rte_errno = ENOMEM;
+		ERROR("Failed to allocate memory for interrupt vector,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	for (i = 0; i < n; i++) {
+		struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+		/* Skip queues that cannot request interrupts. */
+		if (rxq == NULL || rxq->event_fd < 0) {
+			/* Use invalid intr_vec[] index to disable entry. */
+			intr_handle->intr_vec[i] =
+				RTE_INTR_VEC_RXTX_OFFSET +
+				RTE_MAX_RXTX_INTR_VEC_ID;
+			continue;
+		}
+		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
+			rte_errno = E2BIG;
+			ERROR("Too many Rx queues for interrupt vector size"
+			      " (%d), Rx interrupts cannot be enabled",
+			      RTE_MAX_RXTX_INTR_VEC_ID);
+			fs_rx_intr_vec_uninstall(priv);
+			return -rte_errno;
+		}
+		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
+		intr_handle->efds[count] = rxq->event_fd;
+		count++;
+	}
+	if (count == 0) {
+		fs_rx_intr_vec_uninstall(priv);
+	} else {
+		intr_handle->nb_efd = count;
+		intr_handle->efd_counter_size = sizeof(uint64_t);
+	}
+	return 0;
+}
+
+
+/**
+ * Uninstall failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to the failsafe Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void
+failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv;
+
+	priv = PRIV(dev);
+	fs_rx_intr_vec_uninstall(priv);
+	dev->intr_handle = NULL;
+}
+
+/**
+ * Install failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to the failsafe Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_install(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	const struct rte_intr_conf *const intr_conf =
+			&priv->dev->data->dev_conf.intr_conf;
+
+	if (intr_conf->rxq == 0 || priv->intr_handle.intr_vec != NULL)
+		return 0;
+	if (fs_rx_intr_vec_install(priv) < 0)
+		return -rte_errno;
+	dev->intr_handle = &priv->intr_handle;
+	return 0;
+}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 946ac98..d6a82b3 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -33,6 +33,7 @@
 
 #include <stdbool.h>
 #include <stdint.h>
+#include <unistd.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -199,6 +200,10 @@
 	uint8_t i;
 	int ret;
 
+	ret = failsafe_rx_intr_install(dev);
+	if (ret)
+		return ret;
+
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_ACTIVE)
 			continue;
@@ -228,6 +233,7 @@
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_STARTED - 1;
 	}
+	failsafe_rx_intr_uninstall(dev);
 }
 
 static int
@@ -317,6 +323,8 @@
 	if (queue == NULL)
 		return;
 	rxq = queue;
+	if (rxq->event_fd > 0)
+		close(rxq->event_fd);
 	dev = rxq->priv->dev;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
 		SUBOPS(sdev, rx_queue_release)
@@ -333,6 +341,16 @@
 		const struct rte_eth_rxconf *rx_conf,
 		struct rte_mempool *mb_pool)
 {
+	/*
+	 * FIXME: Add a proper interface in rte_eal_interrupts for
+	 * allocating eventfd as an interrupt vector.
+	 * For the time being, fake as if we are using MSIX interrupts,
+	 * this will cause rte_intr_efd_enable to allocate an eventfd for us.
+	 */
+	struct rte_intr_handle intr_handle = {
+		.type = RTE_INTR_HANDLE_VFIO_MSIX,
+		.efds = {-1, },
+	};
 	struct sub_device *sdev;
 	struct rxq *rxq;
 	uint8_t i;
@@ -370,6 +388,10 @@
 	rxq->info.nb_desc = nb_rx_desc;
 	rxq->priv = PRIV(dev);
 	rxq->sdev = PRIV(dev)->subs;
+	ret = rte_intr_efd_enable(&intr_handle, 1);
+	if (ret < 0)
+		return ret;
+	rxq->event_fd = intr_handle.efds[0];
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
@@ -387,6 +409,46 @@
 	return ret;
 }
 
+static int
+fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq;
+
+	if (idx >= dev->data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq = dev->data->rx_queues[idx];
+	if (rxq == NULL || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 1;
+	return 0;
+}
+
+static int
+fs_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq;
+	uint64_t u64;
+
+	if (idx >= dev->data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq = dev->data->rx_queues[idx];
+	if (rxq == NULL || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 0;
+	/* Clear pending events */
+	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
+		;
+	return 0;
+}
+
 static bool
 fs_txq_offloads_valid(struct rte_eth_dev *dev, uint64_t offloads)
 {
@@ -888,6 +950,8 @@
 	.tx_queue_setup = fs_tx_queue_setup,
 	.rx_queue_release = fs_rx_queue_release,
 	.tx_queue_release = fs_tx_queue_release,
+	.rx_queue_intr_enable = fs_rx_intr_enable,
+	.rx_queue_intr_disable = fs_rx_intr_disable,
 	.flow_ctrl_get = fs_flow_ctrl_get,
 	.flow_ctrl_set = fs_flow_ctrl_set,
 	.mac_addr_remove = fs_mac_addr_remove,
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 7754248..419e5e7 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -40,6 +40,7 @@
 #include <rte_dev.h>
 #include <rte_ethdev_driver.h>
 #include <rte_devargs.h>
+#include <rte_interrupts.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
 
@@ -68,6 +69,8 @@ struct rxq {
 	/* next sub_device to poll */
 	struct sub_device *sdev;
 	unsigned int socket_id;
+	int event_fd;
+	unsigned int enable_events:1;
 	struct rte_eth_rxq_info info;
 	rte_atomic64_t refcnt[];
 };
@@ -145,6 +148,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_intr_handle intr_handle; /* Port interrupt handle. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
@@ -159,6 +163,11 @@ struct fs_priv {
 	int flow_isolated:1;
 };
 
+/* FAILSAFE_INTR */
+
+int failsafe_rx_intr_install(struct rte_eth_dev *dev);
+void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+
 /* MISC */
 
 int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);
-- 
1.8.3.1

* [PATCH v7 2/3] net/failsafe: slaves Rx interrupts registration
  2018-01-25  8:07                 ` [PATCH v7 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-25  8:07                   ` [PATCH v7 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
@ 2018-01-25  8:07                   ` Moti Haimovsky
  2018-01-25 11:49                     ` Gaëtan Rivet
  2018-01-25  8:07                   ` [PATCH v7 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
  2 siblings, 1 reply; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-25  8:07 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This commit adds the following functionality to the failsafe PMD:
* Register and unregister slaves' Rx interrupts.
* Enable and disable slaves' Rx interrupts.
The interrupt events generated by the slaves are not handled in this
commit.
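
This patch wraps epoll_create1 because it is not available on FreeBSD,
selecting the implementation file at build time. As an alternative
sketch only, the same idea can be expressed with a preprocessor check
in a single file. demo_epoll_create1 and demo_rx_proxy_open are
hypothetical names, and setting errno in the stub is a choice made for
this sketch, not something taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#ifdef __linux__
#include <sys/epoll.h>
#endif

/*
 * Portable wrapper: forwards to epoll_create1() on Linux, and fails
 * with errno set to ENOTSUP elsewhere.
 */
static int
demo_epoll_create1(int flags)
{
#ifdef __linux__
	return epoll_create1(flags);
#else
	(void)flags;
	errno = ENOTSUP;
	return -1;
#endif
}

/*
 * Open the private epoll fd for an Rx proxy and store it in *efd.
 * Returns 0 on success, a negative errno value otherwise.
 */
static int
demo_rx_proxy_open(int *efd)
{
	assert(efd != NULL);
	*efd = demo_epoll_create1(0);
	if (*efd < 0)
		return -errno;
	return 0;
}
```

Having the stub set errno keeps the errno convention of epoll_create1,
so a caller that inspects errno on failure sees the same kind of value
on both platforms.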

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V7:
Fixed compilation errors in FreeBSD.
See 1516810328-39383-3-git-send-email-motih@mellanox.com

V6:
Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
See: 1516193643-130838-1-git-send-email-motih@mellanox.com

V5:
Initial version of this patch in accordance with input from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com
---
 drivers/net/failsafe/Makefile                  |   5 +
 drivers/net/failsafe/failsafe_epoll.h          |  10 ++
 drivers/net/failsafe/failsafe_epoll_bsdapp.c   |  19 +++
 drivers/net/failsafe/failsafe_epoll_linuxapp.c |  18 +++
 drivers/net/failsafe/failsafe_ether.c          |   1 +
 drivers/net/failsafe/failsafe_intr.c           | 198 +++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c            |  36 ++++-
 drivers/net/failsafe/failsafe_private.h        |  16 ++
 8 files changed, 301 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/failsafe/failsafe_epoll.h
 create mode 100644 drivers/net/failsafe/failsafe_epoll_bsdapp.c
 create mode 100644 drivers/net/failsafe/failsafe_epoll_linuxapp.c

diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index 91a734b..4e6a983 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -47,6 +47,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
+ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_epoll_linuxapp.c
+else
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_epoll_bsdapp.c
+endif
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe_epoll.h b/drivers/net/failsafe/failsafe_epoll.h
new file mode 100644
index 0000000..8e6a1ec
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+#ifndef _RTE_ETH_FAILSAFE_EPOLL_H_
+#define _RTE_ETH_FAILSAFE_EPOLL_H_
+
+int failsafe_epoll_create1(int flags);
+
+#endif /* _RTE_ETH_FAILSAFE_EPOLL_H_*/
diff --git a/drivers/net/failsafe/failsafe_epoll_bsdapp.c b/drivers/net/failsafe/failsafe_epoll_bsdapp.c
new file mode 100644
index 0000000..46c839b
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll_bsdapp.c
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * epoll wrapper for failsafe driver.
+ */
+
+#include <rte_common.h>
+
+#include "failsafe_epoll.h"
+
+int
+failsafe_epoll_create1(int flags)
+{
+	RTE_SET_USED(flags);
+	return -ENOTSUP;
+}
diff --git a/drivers/net/failsafe/failsafe_epoll_linuxapp.c b/drivers/net/failsafe/failsafe_epoll_linuxapp.c
new file mode 100644
index 0000000..d82ee0a
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll_linuxapp.c
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * epoll wrapper for failsafe driver.
+ */
+
+#include <sys/epoll.h>
+
+#include "failsafe_epoll.h"
+
+int
+failsafe_epoll_create1(int flags)
+{
+	return epoll_create1(flags);
+}
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..0f1630e 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
 		return;
 	switch (sdev->state) {
 	case DEV_STARTED:
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_ACTIVE;
 		/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index 54ef2f4..8f8f129 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -9,8 +9,198 @@
 
 #include <unistd.h>
 
+#include "failsafe_epoll.h"
 #include "failsafe_private.h"
 
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+	int rc = 0;
+
+	/*
+	 * Create the epoll fd and event vector for the proxy service to
+	 * wait on for Rx events generated by the subdevices.
+	 */
+	priv->rxp.efd = failsafe_epoll_create1(0);
+	if (priv->rxp.efd < 0) {
+		rte_errno = errno;
+		ERROR("Failed to create epoll,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
+	if (priv->rxp.evec == NULL) {
+		ERROR("Failed to allocate memory for event vectors,"
+		      " Rx interrupts will not be supported");
+		rc = -ENOMEM;
+		goto error;
+	}
+	return 0;
+error:
+	if (priv->rxp.efd >= 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+	if (priv->rxp.evec != NULL) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * RX Interrupt control per subdevice.
+ *
+ * @param sdev
+ *   Pointer to sub-device structure.
+ * @param op
+ *   The operation to be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int
+failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *fsdev;
+	int epfd;
+	uint16_t pid;
+	uint16_t qid;
+	struct rxq *fsrxq;
+	int rc;
+	int ret = 0;
+
+	if (sdev == NULL || (ETH(sdev) == NULL) ||
+	    sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
+		ERROR("Called with invalid arguments");
+		return -EINVAL;
+	}
+	dev = ETH(sdev);
+	fsdev = sdev->fs_dev;
+	epfd = PRIV(sdev->fs_dev)->rxp.efd;
+	pid = PORT_ID(sdev);
+
+	if (epfd <= 0) {
+		if (op == RTE_INTR_EVENT_ADD) {
+			ERROR("Proxy events are not initialized");
+			return -EBADF;
+		} else {
+			return 0;
+		}
+	}
+	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
+		ERROR("subdevice has too many queues,"
+		      " Interrupts will not be enabled");
+		return -E2BIG;
+	}
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		fsrxq = fsdev->data->rx_queues[qid];
+		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
+					       op, (void *)fsrxq);
+		if (rc) {
+			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
+			      "port %d  queue %d, epfd %d, error %d",
+			      pid, qid, epfd, rc);
+			ret = rc;
+		}
+	}
+	return ret;
+}
+
+/**
+ * Install Rx interrupts subsystem for a subdevice.
+ * This supports dynamically adding subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
+{
+	int rc;
+	int qid;
+	struct rte_eth_dev *fsdev;
+	struct rxq **rxq;
+	const struct rte_intr_conf *const intr_conf =
+				&ETH(sdev)->data->dev_conf.intr_conf;
+
+	fsdev = sdev->fs_dev;
+	rxq = (struct rxq **)fsdev->data->rx_queues;
+	if (intr_conf->rxq == 0)
+		return 0;
+	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
+	if (rc)
+		return rc;
+	/* enable interrupts on already-enabled queues */
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (rxq[qid]->enable_events) {
+			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
+							     qid);
+			if (ret && (ret != -ENOTSUP)) {
+				ERROR("Failed to enable interrupts on "
+				      "port %d queue %d", PORT_ID(sdev), qid);
+				rc = ret;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall Rx interrupts subsystem for a subdevice.
+ * This supports dynamically removing subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
+{
+	int qid;
+
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
+		rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);
+	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
+}
+
+/**
+ * Uninstall failsafe Rx event proxy.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_uninstall(struct fs_priv *priv)
+{
+	if (priv->rxp.evec != NULL) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	if (priv->rxp.efd > 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+}
+
 /**
  * Uninstall failsafe interrupt vector.
  *
@@ -107,8 +297,12 @@
 failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
 {
 	struct fs_priv *priv;
+	struct rte_intr_handle *intr_handle;
 
 	priv = PRIV(dev);
+	intr_handle = &priv->intr_handle;
+	rte_intr_free_epoll_fd(intr_handle);
+	fs_rx_event_proxy_uninstall(priv);
 	fs_rx_intr_vec_uninstall(priv);
 	dev->intr_handle = NULL;
 }
@@ -133,6 +327,10 @@
 		return 0;
 	if (fs_rx_intr_vec_install(priv) < 0)
 		return -rte_errno;
+	if (fs_rx_event_proxy_install(priv) < 0) {
+		fs_rx_intr_vec_uninstall(priv);
+		return -rte_errno;
+	}
 	dev->intr_handle = &priv->intr_handle;
 	return 0;
 }
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index d6a82b3..2ea9cdd 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -214,6 +214,13 @@
 				continue;
 			return ret;
 		}
+		ret = failsafe_rx_intr_install_subdevice(sdev);
+		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
+			rte_eth_dev_stop(PORT_ID(sdev));
+			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -231,6 +238,7 @@
 	PRIV(dev)->state = DEV_STARTED - 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
 		rte_eth_dev_stop(PORT_ID(sdev));
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		sdev->state = DEV_STARTED - 1;
 	}
 	failsafe_rx_intr_uninstall(dev);
@@ -413,6 +421,10 @@
 fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
 {
 	struct rxq *rxq;
+	struct sub_device *sdev;
+	uint8_t i;
+	int ret;
+	int rc = 0;
 
 	if (idx >= dev->data->nb_rx_queues) {
 		rte_errno = EINVAL;
@@ -424,14 +436,26 @@
 		return -rte_errno;
 	}
 	rxq->enable_events = 1;
-	return 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
+		ret = fs_err(sdev, ret);
+		if (ret)
+			rc = ret;
+	}
+	if (rc)
+		rte_errno = -rc;
+	return rc;
 }
 
 static int
 fs_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
 {
 	struct rxq *rxq;
+	struct sub_device *sdev;
 	uint64_t u64;
+	uint8_t i;
+	int rc = 0;
+	int ret;
 
 	if (idx >= dev->data->nb_rx_queues) {
 		rte_errno = EINVAL;
@@ -443,10 +467,18 @@
 		return -rte_errno;
 	}
 	rxq->enable_events = 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
+		ret = fs_err(sdev, ret);
+		if (ret)
+			rc = ret;
+	}
 	/* Clear pending events */
 	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
 		;
-	return 0;
+	if (rc)
+		rte_errno = -rc;
+	return rc;
 }
 
 static bool
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 419e5e7..ff78b9f 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -63,6 +63,13 @@
 
 /* TYPES */
 
+struct rx_proxy {
+	/* epoll file descriptor */
+	int efd;
+	/* event vector to be used by epoll */
+	struct rte_epoll_event *evec;
+};
+
 struct rxq {
 	struct fs_priv *priv;
 	uint16_t qid;
@@ -158,6 +165,13 @@ struct fs_priv {
 	 */
 	enum dev_state state;
 	struct rte_eth_stats stats_accumulator;
+	/*
+	 * Rx interrupts/events proxy.
+	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
+	 * it does that by registering an event-fd for each of its queues with
+	 * the EAL.
+	 */
+	struct rx_proxy rxp;
 	unsigned int pending_alarm:1; /* An alarm is pending */
 	/* flow isolation state */
 	int flow_isolated:1;
@@ -167,6 +181,8 @@ struct fs_priv {
 
 int failsafe_rx_intr_install(struct rte_eth_dev *dev);
 void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
 
 /* MISC */
 
-- 
1.8.3.1

* [PATCH v7 3/3] net/failsafe: add Rx interrupts
  2018-01-25  8:07                 ` [PATCH v7 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-25  8:07                   ` [PATCH v7 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
  2018-01-25  8:07                   ` [PATCH v7 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
@ 2018-01-25  8:07                   ` Moti Haimovsky
  2018-01-25 11:58                     ` Gaëtan Rivet
  2018-01-25 16:19                     ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2 siblings, 2 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-25  8:07 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch is the last in the series of patches that adds
support for registering and waiting for Rx interrupts
in the failsafe PMD. This allows applications to wait for Rx events
from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application, including packet
transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
    a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
    will allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx
        events from the sub-devices.
      o For each Rx event received the proxy service will:
          - Retrieve the pointer to failsafe Rx queue that handles
            this subdevice Rx queue from the user info returned by the
            EAL.
          - Trigger a failsafe Rx event on that queue by writing to
            the event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable
    and rx_queue_intr_disable routines that will enable and disable Rx
    interrupts respectively on both the failsafe and its subdevices.
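
The proxy scheme above can be sketched with plain Linux eventfd and epoll
calls. This is only a minimal illustration, not the driver code: struct
demo_rxq, demo_proxy_add() and demo_proxy_poll() are hypothetical names,
and the EAL registration steps are replaced by direct epoll_ctl() calls.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a failsafe Rx queue. */
struct demo_rxq {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* non-zero when Rx interrupts are enabled */
};

/*
 * Register a subdevice event fd on the private epoll fd and tag it with
 * the failsafe queue that should be signalled on its behalf.
 */
static int
demo_proxy_add(int epfd, int sub_fd, struct demo_rxq *rxq)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.ptr = rxq };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, sub_fd, &ev);
}

/*
 * One pass of the proxy: wait for subdevice events and forward each one
 * to the owning queue's eventfd, unless events are disabled on it.
 * Returns the number of events forwarded, or -1 on error.
 */
static int
demo_proxy_poll(int epfd, int timeout_ms)
{
	struct epoll_event events[16];
	uint64_t one = 1;
	int forwarded = 0;
	int i, n;

	n = epoll_wait(epfd, events, 16, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		struct demo_rxq *rxq = events[i].data.ptr;

		if (!rxq->enable_events || rxq->event_fd < 0)
			continue;
		if (write(rxq->event_fd, &one, sizeof(one)) !=
		    (ssize_t)sizeof(one))
			return -1;
		forwarded++;
	}
	return forwarded;
}
```

A subdevice event can be simulated by writing to sub_fd; one call to
demo_proxy_poll() then makes the queue's event_fd readable, unless
enable_events is cleared. The sketch leaves the subdevice fd undrained,
so with level-triggered epoll it keeps reporting the event on every pass.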

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V6:
Separated variable definitions from their initialization in routines,
according to guidelines from Gaetan Rivet.

V5:
Modified code and split the patch into three patches in accordance with
input from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
* Fixed coding style warning.
---
 drivers/net/failsafe/failsafe_intr.c    | 169 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |   6 ++
 drivers/net/failsafe/failsafe_private.h |  17 +++-
 3 files changed, 191 insertions(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index 8f8f129..c58289b 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -9,12 +9,176 @@
 
 #include <unistd.h>
 
+#include <rte_alarm.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include <rte_interrupts.h>
+#include <rte_io.h>
+#include <rte_service_component.h>
+
 #include "failsafe_epoll.h"
 #include "failsafe_private.h"
 
 #define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
 
 /**
+ * Rx event proxy routine.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param data
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_routine(void *data)
+{
+	struct fs_priv *priv;
+	struct rxq *rxq;
+	struct rte_epoll_event *events;
+	uint64_t u64;
+	int i, n;
+	int rc = 0;
+
+	u64 = 1;
+	priv = data;
+	events = priv->rxp.evec;
+	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+	for (i = 0; i < n; i++) {
+		rxq = events[i].epdata.data;
+		if (rxq->enable_events && rxq->event_fd != -1) {
+			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+			    sizeof(u64)) {
+				ERROR("Failed to proxy Rx event to event fd %d",
+				       rxq->event_fd);
+				rc = -EIO;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
+{
+	/* Unregister the event service. */
+	switch (priv->rxp.sstate) {
+	case SS_RUNNING:
+		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
+		/* fall through */
+	case SS_READY:
+		rte_service_runstate_set(priv->rxp.sid, 0);
+		rte_service_set_stats_enable(priv->rxp.sid, 0);
+		rte_service_component_runstate_set(priv->rxp.sid, 0);
+		/* fall through */
+	case SS_REGISTERED:
+		rte_service_component_unregister(priv->rxp.sid);
+		/* fall through */
+	default:
+		break;
+	}
+}
+
+/**
+ * Install the failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service_install(struct fs_priv *priv)
+{
+	struct rte_service_spec service;
+	int32_t num_service_cores;
+	int ret = 0;
+
+	num_service_cores = rte_service_lcore_count();
+	if (num_service_cores <= 0) {
+		ERROR("Failed to install Rx interrupts, "
+		      "no service core found");
+		return -ENOTSUP;
+	}
+	/* prepare service info */
+	memset(&service, 0, sizeof(struct rte_service_spec));
+	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
+		 priv->dev->data->name);
+	service.socket_id = priv->dev->data->numa_node;
+	service.callback = fs_rx_event_proxy_routine;
+	service.callback_userdata = (void *)priv;
+
+	if (priv->rxp.sstate == SS_NO_SERVICE) {
+		uint32_t service_core_list[num_service_cores];
+
+		/* get a service core to work with */
+		ret = rte_service_lcore_list(service_core_list,
+					     num_service_cores);
+		if (ret <= 0) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core list empty or corrupted");
+			return -ENOTSUP;
+		}
+		priv->rxp.scid = service_core_list[0];
+		ret = rte_service_lcore_add(priv->rxp.scid);
+		if (ret && ret != -EALREADY) {
+			ERROR("Failed adding service core");
+			return ret;
+		}
+		/* service core may be in "stopped" state, start it */
+		ret = rte_service_lcore_start(priv->rxp.scid);
+		if (ret && (ret != -EALREADY)) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core not started");
+			return ret;
+		}
+		/* register our service */
+		ret = rte_service_component_register(&service,
+						     &priv->rxp.sid);
+		if (ret) {
+			ERROR("service register() failed");
+			return -ENOEXEC;
+		}
+		priv->rxp.sstate = SS_REGISTERED;
+		/* run the service */
+		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed setting component runstate");
+			return ret;
+		}
+		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed enabling stats");
+			return ret;
+		}
+		ret = rte_service_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed to run service");
+			return ret;
+		}
+		priv->rxp.sstate = SS_READY;
+		/* map the service with the service core */
+		ret = rte_service_map_lcore_set(priv->rxp.sid,
+						priv->rxp.scid, 1);
+		if (ret) {
+			ERROR("Failed to install Rx interrupts, "
+			      "could not map service core");
+			return ret;
+		}
+		priv->rxp.sstate = SS_RUNNING;
+	}
+	return 0;
+}
+
+/**
  * Install failsafe Rx event proxy subsystem.
  * This is the way the failsafe PMD generates Rx events on behalf of its
  * subdevices.
@@ -47,6 +211,10 @@
 		rc = -ENOMEM;
 		goto error;
 	}
+	if (fs_rx_event_proxy_service_install(priv) < 0) {
+		rc = -rte_errno;
+		goto error;
+	}
 	return 0;
 error:
 	if (priv->rxp.efd >= 0) {
@@ -191,6 +359,7 @@ void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
 static void
 fs_rx_event_proxy_uninstall(struct fs_priv *priv)
 {
+	fs_rx_event_proxy_service_uninstall(priv);
 	if (priv->rxp.evec != NULL) {
 		free(priv->rxp.evec);
 		priv->rxp.evec = NULL;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 2ea9cdd..840024c 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -435,6 +435,12 @@
 		rte_errno = EINVAL;
 		return -rte_errno;
 	}
+	/* Fail if proxy service is not running. */
+	if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
+		ERROR("failsafe interrupt services are not running");
+		rte_errno = EAGAIN;
+		return -rte_errno;
+	}
 	rxq->enable_events = 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index ff78b9f..5d328ff 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -61,6 +61,13 @@
 
 #define DEVARGS_MAXLEN 4096
 
+enum rxp_service_state {
+	SS_NO_SERVICE = 0,
+	SS_REGISTERED,
+	SS_READY,
+	SS_RUNNING,
+};
+
 /* TYPES */
 
 struct rx_proxy {
@@ -68,6 +75,11 @@ struct rx_proxy {
 	int efd;
 	/* event vector to be used by epoll */
 	struct rte_epoll_event *evec;
+	/* rte service id */
+	uint32_t sid;
+	/* service core id */
+	uint32_t scid;
+	enum rxp_service_state sstate;
 };
 
 struct rxq {
@@ -169,7 +181,10 @@ struct fs_priv {
 	 * Rx interrupts/events proxy.
 	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
 	 * it does that by registering an event-fd for each of its queues with
-	 * the EAL.
+	 * the EAL. A PMD service thread listens to all the Rx events from the
+	 * subdevices. When an Rx event is issued by a subdevice, it will be
+	 * caught by this service, which will trigger an Rx event in the
+	 * appropriate failsafe Rx queue.
 	 */
 	struct rx_proxy rxp;
 	unsigned int pending_alarm:1; /* An alarm is pending */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 1/3] net/failsafe: register as an Rx interrupt mode PMD
  2018-01-25  8:07                   ` [PATCH v7 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
@ 2018-01-25 11:36                     ` Gaëtan Rivet
  0 siblings, 0 replies; 29+ messages in thread
From: Gaëtan Rivet @ 2018-01-25 11:36 UTC (permalink / raw)
  To: Moti Haimovsky; +Cc: ferruh.yigit, dev

Hi Moti,

Thanks for splitting the patches,
A few comments.

On Thu, Jan 25, 2018 at 10:07:13AM +0200, Moti Haimovsky wrote:
> This patch adds registering the Rx queues of the failsafe PMD with EAL
> Rx interrupts subsystem.
> Each failsafe RX queue is assigned with a unique eventfd and an enable
> interrupts flag.
> The PMD creates an interrupt vector containing the above eventfds and
> Registers it with  EAL. The PMD also implements the Rx interrupts enable
> and disable interface routines.
> This patch does not implement the generation of Rx interrupts, so an
> application can now wait for failsafe Rx interrupts but it will not
> receive one.
> 
> Signed-off-by: Moti Haimovsky <motih@mellanox.com>
> ---
> V6:
> Fixed typo in commit subject.
> 
> V5:
> Initial version of this patch in accordance to inputs from Gaetan Rivet
> in reply to
> 1516354344-13495-2-git-send-email-motih@mellanox.com
> ---
>  doc/guides/nics/features/failsafe.ini   |   1 +
>  drivers/net/failsafe/Makefile           |   1 +
>  drivers/net/failsafe/failsafe.c         |   4 +
>  drivers/net/failsafe/failsafe_intr.c    | 138 ++++++++++++++++++++++++++++++++
>  drivers/net/failsafe/failsafe_ops.c     |  64 +++++++++++++++
>  drivers/net/failsafe/failsafe_private.h |   9 +++
>  6 files changed, 217 insertions(+)
>  create mode 100644 drivers/net/failsafe/failsafe_intr.c
> 
> diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
> index a42e344..39ee579 100644
> --- a/doc/guides/nics/features/failsafe.ini
> +++ b/doc/guides/nics/features/failsafe.ini
> @@ -6,6 +6,7 @@
>  [Features]
>  Link status          = Y
>  Link status event    = Y
> +Rx interrupt         = Y

It would have been more logical to update this once everything
was properly implemented, but this is inconsequential.

>  MTU update           = Y
>  Jumbo frame          = Y
>  Promiscuous mode     = Y
> diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
> index ea2a8fe..91a734b 100644
> --- a/drivers/net/failsafe/Makefile
> +++ b/drivers/net/failsafe/Makefile
> @@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
>  
>  # No exported include files
>  
> diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
> index cb274eb..921e656 100644
> --- a/drivers/net/failsafe/failsafe.c
> +++ b/drivers/net/failsafe/failsafe.c
> @@ -244,6 +244,10 @@

Could you please update your git (I see 1.8; Debian stable uses 2.11)?
The context without the function name is really hard to follow,
especially in the devops section, where the context can be very sparse.
Each time, I need to follow along with the file open on the side and
verify the line numbers.

>  		mac->addr_bytes[2], mac->addr_bytes[3],
>  		mac->addr_bytes[4], mac->addr_bytes[5]);
>  	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
> +	PRIV(dev)->intr_handle = (struct rte_intr_handle){
> +		.fd = -1,
> +		.type = RTE_INTR_HANDLE_EXT,
> +	};
>  	return 0;
>  free_args:
>  	failsafe_args_free(dev);
> diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
> new file mode 100644
> index 0000000..54ef2f4
> --- /dev/null
> +++ b/drivers/net/failsafe/failsafe_intr.c
> @@ -0,0 +1,138 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018 Mellanox Technologies, Ltd.
> + */
> +
> +/**
> + * @file
> + * Interrupts handling for failsafe driver.
> + */
> +
> +#include <unistd.h>
> +
> +#include "failsafe_private.h"
> +
> +/**
> + * Uninstall failsafe interrupt vector.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + */
> +static void
> +fs_rx_intr_vec_uninstall(struct fs_priv *priv)
> +{
> +	struct rte_intr_handle *intr_handle;
> +
> +	intr_handle = &priv->intr_handle;
> +	if (intr_handle->intr_vec != NULL) {
> +		free(intr_handle->intr_vec);
> +		intr_handle->intr_vec = NULL;
> +	}
> +	intr_handle->nb_efd = 0;
> +}
> +
> +/**
> + * Installs failsafe interrupt vector to be registered with EAL later on.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +fs_rx_intr_vec_install(struct fs_priv *priv)
> +{
> +	unsigned int i;
> +	unsigned int rxqs_n;
> +	unsigned int n;
> +	unsigned int count;
> +	struct rte_intr_handle *intr_handle;
> +
> +	rxqs_n = priv->dev->data->nb_rx_queues;
> +	n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
> +	count = 0;
> +	intr_handle = &priv->intr_handle;
> +	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
> +	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
> +	if (intr_handle->intr_vec == NULL) {
> +		fs_rx_intr_vec_uninstall(priv);
> +		rte_errno = ENOMEM;
> +		ERROR("Failed to allocate memory for interrupt vector,"
> +		      " Rx interrupts will not be supported");
> +		return -rte_errno;
> +	}
> +	for (i = 0; i < n; i++) {
> +		struct rxq *rxq = priv->dev->data->rx_queues[i];
> +
> +		/* Skip queues that cannot request interrupts. */
> +		if (rxq == NULL || rxq->event_fd < 0) {
> +			/* Use invalid intr_vec[] index to disable entry. */
> +			intr_handle->intr_vec[i] =
> +				RTE_INTR_VEC_RXTX_OFFSET +
> +				RTE_MAX_RXTX_INTR_VEC_ID;
> +			continue;
> +		}
> +		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
> +			rte_errno = E2BIG;
> +			ERROR("Too many Rx queues for interrupt vector size"
> +			      " (%d), Rx interrupts cannot be enabled",
> +			      RTE_MAX_RXTX_INTR_VEC_ID);
> +			fs_rx_intr_vec_uninstall(priv);
> +			return -rte_errno;
> +		}
> +		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
> +		intr_handle->efds[count] = rxq->event_fd;
> +		count++;
> +	}
> +	if (count == 0) {
> +		fs_rx_intr_vec_uninstall(priv);
> +	} else {
> +		intr_handle->nb_efd = count;
> +		intr_handle->efd_counter_size = sizeof(uint64_t);
> +	}
> +	return 0;
> +}
> +
> +
> +/**
> + * Uninstall failsafe Rx interrupts subsystem.
> + *
> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +void
> +failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
> +{
> +	struct fs_priv *priv;
> +
> +	priv = PRIV(dev);
> +	fs_rx_intr_vec_uninstall(priv);

Why not

> +	fs_rx_intr_vec_uninstall(PRIV(dev));

Here?
Is it ok to call this without having enabled the interrupts before? I
see the check on intr_conf done within intr_install, but then none is
made here.

It is probably fine right now, but what will happen once someone updates
fs_rx_intr_vec_uninstall, with potentially dangerous side-effects? It
would be safer to have a symmetric code flow on both init and uninit.

> +	dev->intr_handle = NULL;
> +}
> +
> +/**
> + * Install failsafe Rx interrupts subsystem.
> + *
> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +int
> +failsafe_rx_intr_install(struct rte_eth_dev *dev)
> +{
> +	struct fs_priv *priv = PRIV(dev);
> +	const struct rte_intr_conf *const intr_conf =
> +			&priv->dev->data->dev_conf.intr_conf;
> +
> +	if (intr_conf->rxq == 0 || priv->intr_handle.intr_vec != NULL)

Do you plan on calling this function with

priv->intr_handle.intr_vec != NULL?

Is it a condition that should ever happen?
I think if this function was called and the vectors were already
allocated, it would be a programming mistake.

Putting an RTE_ASSERT() on the condition might be better for catching
refactoring mistakes or reorders in configuration that might happen
afterward.
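
As a rough illustration, an install path that asserts on the precondition,
paired with an uninstall path that mirrors the same configuration check,
could look like the sketch below. Standard assert() stands in for
RTE_ASSERT(), and struct demo_priv and the demo_* functions are
hypothetical names, not failsafe code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the failsafe private structure. */
struct demo_priv {
	int rxq_intr_requested; /* mirrors intr_conf.rxq */
	int *intr_vec;          /* NULL while interrupts are not installed */
};

static int
demo_rx_intr_install(struct demo_priv *priv, size_t n)
{
	/* Installing twice is a programming error, not a runtime condition. */
	assert(priv->intr_vec == NULL);
	if (!priv->rxq_intr_requested)
		return 0;
	priv->intr_vec = calloc(n, sizeof(*priv->intr_vec));
	return priv->intr_vec == NULL ? -1 : 0;
}

static void
demo_rx_intr_uninstall(struct demo_priv *priv)
{
	/* Same check as install, so both paths stay symmetric. */
	if (!priv->rxq_intr_requested)
		return;
	free(priv->intr_vec);
	priv->intr_vec = NULL;
}
```

With this shape, a second install without an intervening uninstall trips
the assertion in debug builds instead of silently returning 0.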

> +		return 0;
> +	if (fs_rx_intr_vec_install(priv) < 0)
> +		return -rte_errno;
> +	dev->intr_handle = &priv->intr_handle;
> +	return 0;
> +}
> diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
> index 946ac98..d6a82b3 100644
> --- a/drivers/net/failsafe/failsafe_ops.c
> +++ b/drivers/net/failsafe/failsafe_ops.c
> @@ -33,6 +33,7 @@
>  
>  #include <stdbool.h>
>  #include <stdint.h>
> +#include <unistd.h>
>  
>  #include <rte_debug.h>
>  #include <rte_atomic.h>
> @@ -199,6 +200,10 @@
>  	uint8_t i;
>  	int ret;
>  
> +	ret = failsafe_rx_intr_install(dev);
> +	if (ret)
> +		return ret;
> +

Superfluous line here.

>  	FOREACH_SUBDEV(sdev, i, dev) {
>  		if (sdev->state != DEV_ACTIVE)
>  			continue;
> @@ -228,6 +233,7 @@
>  		rte_eth_dev_stop(PORT_ID(sdev));
>  		sdev->state = DEV_STARTED - 1;
>  	}
> +	failsafe_rx_intr_uninstall(dev);
>  }
>  
>  static int
> @@ -317,6 +323,8 @@
>  	if (queue == NULL)
>  		return;
>  	rxq = queue;
> +	if (rxq->event_fd > 0)
> +		close(rxq->event_fd);
>  	dev = rxq->priv->dev;
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
>  		SUBOPS(sdev, rx_queue_release)
> @@ -333,6 +341,16 @@
>  		const struct rte_eth_rxconf *rx_conf,
>  		struct rte_mempool *mb_pool)
>  {
> +	/*
> +	 * FIXME: Add a proper interface in rte_eal_interrupts for
> +	 * allocating eventfd as an interrupt vector.
> +	 * For the time being, fake as if we are using MSIX interrupts,
> +	 * this will cause rte_intr_efd_enable to allocate an eventfd for us.
> +	 */
> +	struct rte_intr_handle intr_handle = {
> +		.type = RTE_INTR_HANDLE_VFIO_MSIX,
> +		.efds = {-1, },

Missing space:

> +		.efds = { -1, },

Regards,
-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 2/3] net/failsafe: slaves Rx interrupts registration
  2018-01-25  8:07                   ` [PATCH v7 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
@ 2018-01-25 11:49                     ` Gaëtan Rivet
  0 siblings, 0 replies; 29+ messages in thread
From: Gaëtan Rivet @ 2018-01-25 11:49 UTC (permalink / raw)
  To: Moti Haimovsky; +Cc: ferruh.yigit, dev

On Thu, Jan 25, 2018 at 10:07:14AM +0200, Moti Haimovsky wrote:
> This commit adds the following functionality to failsafe PMD:
> * Register and unregister slaves Rx interrupts.
> * Enable and Disable slaves Rx interrupts.
> The interrupts events generated by the slaves are not handled in this
> commit.
> 
> Signed-off-by: Moti Haimovsky <motih@mellanox.com>
> ---
> V7:
> Fixed compilation errors in FreeBSD.
> See 1516810328-39383-3-git-send-email-motih@mellanox.com
> 
> V6:
> Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
> See: 1516193643-130838-1-git-send-email-motih@mellanox.com
> 
> V5:
> Initial version of this patch in accordance to inputs from Gaetan Rivet
> in reply to
> 1516354344-13495-2-git-send-email-motih@mellanox.com
> ---
>  drivers/net/failsafe/Makefile                  |   5 +
>  drivers/net/failsafe/failsafe_epoll.h          |  10 ++
>  drivers/net/failsafe/failsafe_epoll_bsdapp.c   |  19 +++
>  drivers/net/failsafe/failsafe_epoll_linuxapp.c |  18 +++
>  drivers/net/failsafe/failsafe_ether.c          |   1 +
>  drivers/net/failsafe/failsafe_intr.c           | 198 +++++++++++++++++++++++++
>  drivers/net/failsafe/failsafe_ops.c            |  36 ++++-
>  drivers/net/failsafe/failsafe_private.h        |  16 ++
>  8 files changed, 301 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/net/failsafe/failsafe_epoll.h
>  create mode 100644 drivers/net/failsafe/failsafe_epoll_bsdapp.c
>  create mode 100644 drivers/net/failsafe/failsafe_epoll_linuxapp.c
> 
> diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
> index 91a734b..4e6a983 100644
> --- a/drivers/net/failsafe/Makefile
> +++ b/drivers/net/failsafe/Makefile
> @@ -47,6 +47,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
> +ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_epoll_linuxapp.c
> +else
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_epoll_bsdapp.c
> +endif

I'm not a fan of whole additional source files for only one function.
Why not something akin to:

ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
CFLAGS += -DLINUX
else ifeq ($(CONFIG_RTE_EXEC_ENV_BSDAPP),y)
CFLAGS += -DBSD
endif

Then, within failsafe_intr.c:

    static int
    fs_epoll_create1(int flags)
    {
    #if defined(LINUX)
    	return epoll_create1(flags);
    #elif defined(BSD)
    	RTE_SET_USED(flags);
    	return -ENOTSUP;
    #endif
    }

>  
>  # No exported include files
>  
> diff --git a/drivers/net/failsafe/failsafe_epoll.h b/drivers/net/failsafe/failsafe_epoll.h
> new file mode 100644
> index 0000000..8e6a1ec
> --- /dev/null
> +++ b/drivers/net/failsafe/failsafe_epoll.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018 Mellanox Technologies, Ltd.
> + */
> +
> +#ifndef _RTE_ETH_FAILSAFE_EPOLL_H_
> +#define _RTE_ETH_FAILSAFE_EPOLL_H_
> +
> +int failsafe_epoll_create1(int flags);
> +
> +#endif /* _RTE_ETH_FAILSAFE_EPOLL_H_*/
> diff --git a/drivers/net/failsafe/failsafe_epoll_bsdapp.c b/drivers/net/failsafe/failsafe_epoll_bsdapp.c
> new file mode 100644
> index 0000000..46c839b
> --- /dev/null
> +++ b/drivers/net/failsafe/failsafe_epoll_bsdapp.c
> @@ -0,0 +1,19 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018 Mellanox Technologies, Ltd.
> + */
> +
> +/**
> + * @file
> + * epoll wrapper for failsafe driver.
> + */
> +
> +#include <rte_common.h>
> +
> +#include "failsafe_epoll.h"
> +
> +int
> +failsafe_epoll_create1(int flags)
> +{
> +	RTE_SET_USED(flags);
> +	return -ENOTSUP;
> +}
> diff --git a/drivers/net/failsafe/failsafe_epoll_linuxapp.c b/drivers/net/failsafe/failsafe_epoll_linuxapp.c
> new file mode 100644
> index 0000000..d82ee0a
> --- /dev/null
> +++ b/drivers/net/failsafe/failsafe_epoll_linuxapp.c
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018 Mellanox Technologies, Ltd.
> + */
> +
> +/**
> + * @file
> + * epoll wrapper for failsafe driver.
> + */
> +
> +#include <sys/epoll.h>
> +
> +#include "failsafe_epoll.h"
> +
> +int
> +failsafe_epoll_create1(int flags)
> +{
> +	return epoll_create1(flags);
> +}
> diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
> index 8a4cacf..0f1630e 100644
> --- a/drivers/net/failsafe/failsafe_ether.c
> +++ b/drivers/net/failsafe/failsafe_ether.c
> @@ -283,6 +283,7 @@
>  		return;
>  	switch (sdev->state) {
>  	case DEV_STARTED:
> +		failsafe_rx_intr_uninstall_subdevice(sdev);
>  		rte_eth_dev_stop(PORT_ID(sdev));
>  		sdev->state = DEV_ACTIVE;
>  		/* fallthrough */
> diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
> index 54ef2f4..8f8f129 100644
> --- a/drivers/net/failsafe/failsafe_intr.c
> +++ b/drivers/net/failsafe/failsafe_intr.c
> @@ -9,8 +9,198 @@
>  
>  #include <unistd.h>
>  
> +#include "failsafe_epoll.h"
>  #include "failsafe_private.h"
>  
> +#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
> +
> +/**
> + * Install failsafe Rx event proxy subsystem.
> + * This is the way the failsafe PMD generates Rx events on behalf of its
> + * subdevices.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +fs_rx_event_proxy_install(struct fs_priv *priv)
> +{
> +	int rc = 0;
> +
> +	/*
> +	 * Create the epoll fd and event vector for the proxy service to
> +	 * wait on for Rx events generated by the subdevices.
> +	 */
> +	priv->rxp.efd = failsafe_epoll_create1(0);
> +	if (priv->rxp.efd < 0) {
> +		rte_errno = errno;
> +		ERROR("Failed to create epoll,"
> +		      " Rx interrupts will not be supported");
> +		return -rte_errno;
> +	}
> +	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
> +	if (priv->rxp.evec == NULL) {
> +		ERROR("Failed to allocate memory for event vectors,"
> +		      " Rx interrupts will not be supported");
> +		rc = -ENOMEM;
> +		goto error;
> +	}
> +	return 0;
> +error:
> +	if (priv->rxp.efd >= 0) {
> +		close(priv->rxp.efd);
> +		priv->rxp.efd = -1;
> +	}
> +	if (priv->rxp.evec != NULL) {
> +		free(priv->rxp.evec);
> +		priv->rxp.evec = NULL;
> +	}
> +	rte_errno = -rc;
> +	return rc;
> +}
> +
> +/**
> + * RX Interrupt control per subdevice.
> + *
> + * @param sdev
> + *   Pointer to sub-device structure.
> + * @param op
> + *   The operation be performed for the vector.
> + *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
> + * @return
> + *   - On success, zero.
> + *   - On failure, a negative value.
> + */
> +static int
> +failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
> +{
> +	struct rte_eth_dev *dev;
> +	struct rte_eth_dev *fsdev;
> +	int epfd;
> +	uint16_t pid;
> +	uint16_t qid;
> +	struct rxq *fsrxq;
> +	int rc;
> +	int ret = 0;
> +
> +	if (sdev == NULL || (ETH(sdev) == NULL) ||
> +	    sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
> +		ERROR("Called with invalid arguments");
> +		return -EINVAL;
> +	}
> +	dev = ETH(sdev);
> +	fsdev = sdev->fs_dev;
> +	epfd = PRIV(sdev->fs_dev)->rxp.efd;
> +	pid = PORT_ID(sdev);
> +
> +	if (epfd <= 0) {
> +		if (op == RTE_INTR_EVENT_ADD) {
> +			ERROR("Proxy events are not initialized");
> +			return -EBADF;
> +		} else {
> +			return 0;
> +		}
> +	}
> +	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
> +		ERROR("subdevice has too many queues,"
> +		      " Interrupts will not be enabled");
> +			return -E2BIG;
> +	}
> +	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
> +		fsrxq = fsdev->data->rx_queues[qid];
> +		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
> +					       op, (void *)fsrxq);
> +		if (rc) {
> +			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
> +			      "port %d  queue %d, epfd %d, error %d",
> +			      pid, qid, epfd, rc);
> +			ret = rc;
> +		}
> +	}
> +	return ret;
> +}
> +
> +/**
> + * Install Rx interrupts subsystem for a subdevice.
> + * This is a support for dynamically adding subdevices.
> + *
> + * @param sdev
> + *   Pointer to subdevice structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
> +{
> +	int rc;
> +	int qid;
> +	struct rte_eth_dev *fsdev;
> +	struct rxq **rxq;
> +	const struct rte_intr_conf *const intr_conf =
> +				&ETH(sdev)->data->dev_conf.intr_conf;
> +
> +	fsdev = sdev->fs_dev;
> +	rxq = (struct rxq **)fsdev->data->rx_queues;
> +	if (intr_conf->rxq == 0)
> +		return 0;
> +	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
> +	if (rc)
> +		return rc;
> +	/* enable interrupts on already-enabled queues */
> +	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
> +		if (rxq[qid]->enable_events) {
> +			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
> +							     qid);
> +			if (ret && (ret != -ENOTSUP)) {
> +				ERROR("Failed to enable interrupts on "
> +				      "port %d queue %d", PORT_ID(sdev), qid);
> +				rc = ret;
> +			}
> +		}
> +	}
> +	return rc;
> +}
> +
> +/**
> + * Uninstall Rx interrupts subsystem for a subdevice.
> + * This is a support for dynamically removing subdevices.
> + *
> + * @param sdev
> + *   Pointer to subdevice structure.
> + *
> + * @return
> + *   0 on success, negative errno value otherwise and rte_errno is set.
> + */
> +void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
> +{
> +	int qid;
> +
> +	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++)
> +		rte_eth_dev_rx_intr_disable(PORT_ID(sdev), qid);

I think here you assume the underlying PMD has been properly
implemented, and that calling rte_eth_dev_rx_intr_disable has no effect
when interrupts were never enabled.

I think that it would be better to write defensively in general
regarding other PMDs, and assume only minimal compliance with the API.

As such, here you might want to check that intr_conf requested Rx
interrupts before attempting to disable them.
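
A minimal sketch of such a guard, using hypothetical stand-ins
(struct demo_subdev and demo_rx_intr_disable() replace the real
sub_device and rte_eth_dev_rx_intr_disable()):

```c
/* Hypothetical stand-in for a subdevice; not the failsafe sub_device. */
struct demo_subdev {
	int rxq_intr_requested; /* mirrors dev_conf.intr_conf.rxq */
	int nb_rx_queues;
	int disable_calls;      /* counts calls into the fake PMD */
};

/* Fake PMD entry point standing in for rte_eth_dev_rx_intr_disable(). */
static int
demo_rx_intr_disable(struct demo_subdev *sdev, int qid)
{
	(void)qid;
	sdev->disable_calls++;
	return 0;
}

static void
demo_rx_intr_uninstall_subdevice(struct demo_subdev *sdev)
{
	int qid;

	/* Never touch the PMD if Rx interrupts were not requested. */
	if (!sdev->rxq_intr_requested)
		return;
	for (qid = 0; qid < sdev->nb_rx_queues; qid++)
		demo_rx_intr_disable(sdev, qid);
}
```

The early return keeps the uninstall path from relying on how the
underlying PMD reacts to a disable request it never expected.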

> +	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
> +}
> +
> +/**
> + * Uninstall failsafe Rx event proxy.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + */
> +static void
> +fs_rx_event_proxy_uninstall(struct fs_priv *priv)
> +{
> +	if (priv->rxp.evec != NULL) {
> +		free(priv->rxp.evec);
> +		priv->rxp.evec = NULL;
> +	}
> +	if (priv->rxp.efd > 0) {
> +		close(priv->rxp.efd);
> +		priv->rxp.efd = -1;
> +	}
> +}
> +
>  /**
>   * Uninstall failsafe interrupt vector.
>   *
> @@ -107,8 +297,12 @@
>  failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
>  {
>  	struct fs_priv *priv;
> +	struct rte_intr_handle *intr_handle;
>  
>  	priv = PRIV(dev);
> +	intr_handle = &priv->intr_handle;
> +	rte_intr_free_epoll_fd(intr_handle);
> +	fs_rx_event_proxy_uninstall(priv);
>  	fs_rx_intr_vec_uninstall(priv);
>  	dev->intr_handle = NULL;
>  }
> @@ -133,6 +327,10 @@
>  		return 0;
>  	if (fs_rx_intr_vec_install(priv) < 0)
>  		return -rte_errno;
> +	if (fs_rx_event_proxy_install(priv) < 0) {
> +		fs_rx_intr_vec_uninstall(priv);
> +		return -rte_errno;
> +	}
>  	dev->intr_handle = &priv->intr_handle;
>  	return 0;
>  }
> diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
> index d6a82b3..2ea9cdd 100644
> --- a/drivers/net/failsafe/failsafe_ops.c
> +++ b/drivers/net/failsafe/failsafe_ops.c
> @@ -214,6 +214,13 @@
>  				continue;
>  			return ret;
>  		}
> +		ret = failsafe_rx_intr_install_subdevice(sdev);
> +		if (ret) {
> +			if (!fs_err(sdev, ret))
> +				continue;
> +			rte_eth_dev_stop(PORT_ID(sdev));
> +			return ret;
> +		}
>  		sdev->state = DEV_STARTED;
>  	}
>  	if (PRIV(dev)->state < DEV_STARTED)
> @@ -231,6 +238,7 @@
>  	PRIV(dev)->state = DEV_STARTED - 1;
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
>  		rte_eth_dev_stop(PORT_ID(sdev));
> +		failsafe_rx_intr_uninstall_subdevice(sdev);
>  		sdev->state = DEV_STARTED - 1;
>  	}
>  	failsafe_rx_intr_uninstall(dev);
> @@ -413,6 +421,10 @@
>  fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
>  {
>  	struct rxq *rxq;
> +	struct sub_device *sdev;
> +	uint8_t i;
> +	int ret;
> +	int rc = 0;
>  
>  	if (idx >= dev->data->nb_rx_queues) {
>  		rte_errno = EINVAL;
> @@ -424,14 +436,26 @@
>  		return -rte_errno;
>  	}
>  	rxq->enable_events = 1;
> -	return 0;
> +	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
> +		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
> +		ret = fs_err(sdev, ret);
> +		if (ret)
> +			rc = ret;

why not
        rc = fs_err(sdev, ret);
        if (rc)
                break;

instead? I'm not sure ret is even really useful to have here, but I
don't see the rest of the function.
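
The two loop shapes discussed here can be compared in isolation. The
sketch below is only an illustration under stated assumptions: the
names sketch_walk, sketch_enable_keep_last and sketch_enable_stop_first
are hypothetical, and the array of results stands in for what each
per-subdevice call would return after passing through fs_err().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a set of sub-devices: results[i] is the value
 * the i-th "enable Rx interrupt" call would return (0 or a negative errno). */
struct sketch_walk {
	const int *results;
	size_t n;
	size_t visited; /* number of sub-devices actually touched */
};

/* Shape used in the quoted hunk: visit everything, report the last failure. */
int
sketch_enable_keep_last(struct sketch_walk *w)
{
	int rc = 0;
	size_t i;

	assert(w != NULL);
	w->visited = 0;
	for (i = 0; i < w->n; i++) {
		int ret = w->results[i];

		w->visited++;
		if (ret)
			rc = ret;
	}
	return rc;
}

/* Shape suggested in the review: stop at the first failure. */
int
sketch_enable_stop_first(struct sketch_walk *w)
{
	int rc = 0;
	size_t i;

	assert(w != NULL);
	w->visited = 0;
	for (i = 0; i < w->n; i++) {
		w->visited++;
		rc = w->results[i];
		if (rc)
			break;
	}
	return rc;
}
```

The first shape still enables interrupts on the sub-devices that come
after a failing one; the second leaves them untouched and needs no
separate ret variable. Which behavior is wanted depends on the rest of
the function, which is not visible in this hunk.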

Regards,
-- 
Gaëtan Rivet
6WIND


* Re: [PATCH v7 3/3] net/failsafe: add Rx interrupts
  2018-01-25  8:07                   ` [PATCH v7 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
@ 2018-01-25 11:58                     ` Gaëtan Rivet
  2018-01-25 16:19                     ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  1 sibling, 0 replies; 29+ messages in thread
From: Gaëtan Rivet @ 2018-01-25 11:58 UTC (permalink / raw)
  To: Moti Haimovsky; +Cc: ferruh.yigit, dev

On Thu, Jan 25, 2018 at 10:07:15AM +0200, Moti Haimovsky wrote:
> This patch is the last patch in the series of patches aimed
> to add support for registering and waiting for Rx interrupts
> in failsafe PMD. This allows applications to wait for Rx events
> from the PMD using the DPDK rte_epoll subsystem.
> The failsafe PMD presents to the application a facade of a single
> device to be handled by the application while internally it manages
> several devices on behalf of the application including packets
> transmission and reception.
> The proposed failsafe Rx interrupt scheme follows this approach.
> The failsafe PMD will present the application with a single set of
> Rx interrupt vectors representing the failsafe Rx queues, while
> internally it will serve as an interrupt proxy for its subdevices.
> This will allow applications to wait for Rx traffic from the failsafe
> PMD by registering and waiting for Rx events from its Rx queues.
> In order to support this the following is suggested:
>   * Every Rx queue in the failsafe (virtual) device will be assigned
>     a Linux event file descriptor (efd) and an enable_interrupts flag.
>   * The failsafe PMD will fill in its rte_intr_handle structure with
>     the Rx efds assigned previously and register them with the EAL.
>   * The failsafe driver will create a private epoll fd (epfd) and
>     will allocate enough space to handle all the Rx events from all its
>     subdevices.
>   * Acting as an application,
>     for each Rx queue in each active subdevice the failsafe will:
>       o Register the Rx queue with the EAL.
>       o Pass the EAL the failsafe private epoll fd as the epfd to
>         register the Rx queue event on.
>       o Pass the EAL, as a parameter, the pointer to the failsafe Rx
>         queue that handles this Rx queue.
>       o Using the DPDK service callbacks, the failsafe PMD will launch
>         an Rx proxy service that will wait on the epoll fd for Rx
>         events from the sub-devices.
>       o For each Rx event received the proxy service will:
>           - Retrieve the pointer to the failsafe Rx queue that handles
>             this subdevice Rx queue from the user info returned by the
>             EAL.
>           - Trigger a failsafe Rx event on that queue by writing to
>             the event fd unless interrupts are disabled for that queue.
>   * The failsafe PMD will also implement the rx_queue_intr_enable
>     and rx_queue_intr_disable routines that will enable and disable Rx
>     interrupts respectively on both the failsafe and its subdevices.
> 
> Signed-off-by: Moti Haimovsky <motih@mellanox.com>
> ---
> V6:
> Separated variable definitions from their initialization in routines,
> according to guidelines from Gaetan Rivet.
> 
> V5:
> Modified code and split the patch into three patches in accordance with
> inputs from Gaetan Rivet in reply to
> 1516354344-13495-2-git-send-email-motih@mellanox.com
> 
> V4:
> Fixed merge conflicts found during integration with other failsafe patches
> (See cover letter).
> 
> V3:
> Fixed build failures in FreeBSD10.3_64
> 
> V2:
> Modifications according to inputs from Stephen Hemminger:
> * Removed unneeded (void *) casting.
> Fixed coding style warning.
> ---
>  drivers/net/failsafe/failsafe_intr.c    | 169 ++++++++++++++++++++++++++++++++
>  drivers/net/failsafe/failsafe_ops.c     |   6 ++
>  drivers/net/failsafe/failsafe_private.h |  17 +++-
>  3 files changed, 191 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
> index 8f8f129..c58289b 100644
> --- a/drivers/net/failsafe/failsafe_intr.c
> +++ b/drivers/net/failsafe/failsafe_intr.c
> @@ -9,12 +9,176 @@
>  
>  #include <unistd.h>
>  
> +#include <rte_alarm.h>
> +#include <rte_config.h>
> +#include <rte_errno.h>
> +#include <rte_ethdev.h>
> +#include <rte_interrupts.h>
> +#include <rte_io.h>
> +#include <rte_service_component.h>
> +
>  #include "failsafe_epoll.h"
>  #include "failsafe_private.h"
>  
>  #define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
>  
>  /**
> + * Rx event proxy service routine.
> + * The Rx event proxy is the service that listens to Rx events from the
> + * subdevices and triggers failsafe Rx events accordingly.
> + *
> + * @param data
> + *   Pointer to failsafe private structure.
> + * @return
> + *   0 on success, negative errno value otherwise.
> + */
> +static int
> +fs_rx_event_proxy_routine(void *data)
> +{
> +	struct fs_priv *priv;
> +	struct rxq *rxq;
> +	struct rte_epoll_event *events;
> +	uint64_t u64;
> +	int i, n;
> +	int rc = 0;
> +
> +	u64 = 1;
> +	priv = data;
> +	events = priv->rxp.evec;
> +	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
> +	for (i = 0; i < n; i++) {
> +		rxq = events[i].epdata.data;
> +		if (rxq->enable_events && rxq->event_fd != -1) {
> +			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
> +			    sizeof(u64)) {
> +				ERROR("Failed to proxy Rx event to socket %d",
> +				       rxq->event_fd);
> +				rc = -EIO;
> +			}
> +		}
> +	}
> +	return rc;
> +}
> +
> +/**
> + * Uninstall failsafe Rx event proxy service.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + */
> +static void
> +fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
> +{
> +	/* Unregister the event service. */
> +	switch (priv->rxp.sstate) {
> +	case SS_RUNNING:
> +		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
> +		/* fall through */
> +	case SS_READY:
> +		rte_service_runstate_set(priv->rxp.sid, 0);
> +		rte_service_set_stats_enable(priv->rxp.sid, 0);
> +		rte_service_component_runstate_set(priv->rxp.sid, 0);
> +		/* fall through */
> +	case SS_REGISTERED:
> +		rte_service_component_unregister(priv->rxp.sid);
> +		/* fall through */
> +	default:
> +		break;
> +	}
> +}
> +
> +/**
> + * Install the failsafe Rx event proxy service.
> + *
> + * @param priv
> + *   Pointer to failsafe private structure.
> + * @return
> + *   0 on success, negative errno value otherwise.
> + */
> +static int
> +fs_rx_event_proxy_service_install(struct fs_priv *priv)
> +{
> +	struct rte_service_spec service;
> +	int32_t num_service_cores;
> +	int ret = 0;
> +
> +	num_service_cores = rte_service_lcore_count();
> +	if (num_service_cores <= 0) {
> +		ERROR("Failed to install Rx interrupts, "
> +		      "no service core found");
> +		return -ENOTSUP;

You don't seem to update rte_errno here,
while you set rc to -rte_errno when checking for errors on this function
later.

> +	}
> +	/* prepare service info */
> +	memset(&service, 0, sizeof(struct rte_service_spec));
> +	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
> +		 priv->dev->data->name);
> +	service.socket_id = priv->dev->data->numa_node;
> +	service.callback = fs_rx_event_proxy_routine;
> +	service.callback_userdata = (void *)priv;

The cast is unnecessary here I think.

> +
> +	if (priv->rxp.sstate == SS_NO_SERVICE) {
> +		uint32_t service_core_list[num_service_cores];
> +
> +		/* get a service core to work with */
> +		ret = rte_service_lcore_list(service_core_list,
> +					     num_service_cores);
> +		if (ret <= 0) {
> +			ERROR("Failed to install Rx interrupts, "
> +			      "service core list empty or corrupted");
> +			return -ENOTSUP;

Same comment regarding setting rte_errno here, and afterward.

> +		}
> +		priv->rxp.scid = service_core_list[0];
> +		ret = rte_service_lcore_add(priv->rxp.scid);
> +		if (ret && ret != -EALREADY) {
> +			ERROR("Failed adding service core");
> +			return ret;
> +		}
> +		/* service core may be in "stopped" state, start it */
> +		ret = rte_service_lcore_start(priv->rxp.scid);
> +		if (ret && (ret != -EALREADY)) {
> +			ERROR("Failed to install Rx interrupts, "
> +			      "service core not started");
> +			return ret;
> +		}
> +		/* register our service */
> +		int32_t ret = rte_service_component_register(&service,
> +							     &priv->rxp.sid);
> +		if (ret) {
> +			ERROR("service register() failed");
> +			return -ENOEXEC;
> +		}
> +		priv->rxp.sstate = SS_REGISTERED;
> +		/* run the service */
> +		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
> +		if (ret < 0) {
> +			ERROR("Failed Setting component runstate\n");
> +			return ret;
> +		}
> +		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
> +		if (ret < 0) {
> +			ERROR("Failed enabling stats\n");
> +			return ret;
> +		}
> +		ret = rte_service_runstate_set(priv->rxp.sid, 1);
> +		if (ret < 0) {
> +			ERROR("Failed to run service\n");
> +			return ret;
> +		}
> +		priv->rxp.sstate = SS_READY;
> +		/* map the service with the service core */
> +		ret = rte_service_map_lcore_set(priv->rxp.sid,
> +						priv->rxp.scid, 1);
> +		if (ret) {
> +			ERROR("Failed to install Rx interrupts, "
> +			      "could not map service core");
> +			return ret;
> +		}
> +		priv->rxp.sstate = SS_RUNNING;
> +	}
> +	return 0;
> +}
> +
> +/**
>   * Install failsafe Rx event proxy subsystem.
>   * This is the way the failsafe PMD generates Rx events on behalf of its
>   * subdevices.
> @@ -47,6 +211,10 @@
>  		rc = -ENOMEM;
>  		goto error;
>  	}
> +	if (fs_rx_event_proxy_service_install(priv) < 0) {
> +		rc = -rte_errno;
> +		goto error;
> +	}
>  	return 0;
>  error:
>  	if (priv->rxp.efd >= 0) {
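
The rte_errno remarks above can be illustrated with a small standalone
sketch. Everything here is hypothetical: sketch_errno stands in for
rte_errno, and the install functions only mimic the "no service core"
early return. The caller resets sketch_errno to 0 so that the failure
mode is deterministic; in the real driver the value would be whatever
an earlier call happened to leave behind.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for rte_errno. */
int sketch_errno;

/* Returns a negative errno but forgets to record it in sketch_errno. */
int
sketch_install_without_errno(int have_service_core)
{
	if (!have_service_core)
		return -ENOTSUP;
	return 0;
}

/* Records the error before returning it. */
int
sketch_install_with_errno(int have_service_core)
{
	if (!have_service_core) {
		sketch_errno = ENOTSUP;
		return -sketch_errno;
	}
	return 0;
}

/* Caller shaped like the last quoted hunk: on failure, use -sketch_errno. */
int
sketch_proxy_install(int (*install)(int), int have_service_core)
{
	int rc = 0;

	assert(install != NULL);
	sketch_errno = 0; /* assumption: no stale value from earlier calls */
	if (install(have_service_core) < 0)
		rc = -sketch_errno;
	return rc;
}
```

With the first install function the caller ends up returning 0 for a
failed install, because the error code never reached sketch_errno; with
the second one the caller returns -ENOTSUP as intended.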

-- 
Gaëtan Rivet
6WIND


* [PATCH v8 0/3] net/failsafe: add Rx interrupts support
  2018-01-25  8:07                   ` [PATCH v7 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
  2018-01-25 11:58                     ` Gaëtan Rivet
@ 2018-01-25 16:19                     ` Moti Haimovsky
  2018-01-25 16:19                       ` [PATCH v8 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
                                         ` (3 more replies)
  1 sibling, 4 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-25 16:19 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

These three patches add support for registering and waiting for
Rx interrupts in failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application including packets
transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
    a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
    will allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx
        events from the sub-devices.
      o For each Rx event received the proxy service will:
          - Retrieve the pointer to the failsafe Rx queue that handles
            this subdevice Rx queue from the user info returned by the
            EAL.
          - Trigger a failsafe Rx event on that queue by writing to
            the event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable
    and rx_queue_intr_disable routines that will enable and disable Rx
    interrupts respectively on both the failsafe and its subdevices.
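
The proxy scheme described above can be sketched with plain Linux
epoll and eventfd calls. This is a minimal illustration and not the
driver code: struct sketch_rxq, sketch_proxy_add and sketch_proxy_poll
are hypothetical names, the DPDK rte_epoll and service-core layers are
left out, and the sub-device interrupt is represented by an ordinary
file descriptor that the caller is expected to drain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical, simplified view of a failsafe Rx queue. */
struct sketch_rxq {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* set/cleared by the intr enable/disable ops */
};

/* Register a sub-device interrupt fd on the private epoll fd, with the
 * owning failsafe queue attached as user data. */
int
sketch_proxy_add(int epfd, int subdev_fd, struct sketch_rxq *rxq)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.ptr = rxq };

	assert(rxq != NULL);
	return epoll_ctl(epfd, EPOLL_CTL_ADD, subdev_fd, &ev);
}

/* One pass of the proxy: wait up to timeout_ms, then forward each event
 * to the owning queue unless events are disabled for that queue.
 * Returns the number of events forwarded, or -1 on error. */
int
sketch_proxy_poll(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	const uint64_t one = 1;
	int forwarded = 0;
	int i, n;

	n = epoll_wait(epfd, evs, 8, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		struct sketch_rxq *rxq = evs[i].data.ptr;

		if (!rxq->enable_events || rxq->event_fd < 0)
			continue;
		if (write(rxq->event_fd, &one, sizeof(one)) !=
		    (ssize_t)sizeof(one))
			return -1;
		forwarded++;
	}
	return forwarded;
}
```

The user-data pointer is what lets a single epoll fd serve every
sub-device: the proxy never needs to know which sub-device raised the
event, only which failsafe queue it belongs to.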

Moti Haimovsky (3):
  net/failsafe: register as an Rx interrupt mode PMD
  net/failsafe: slaves Rx interrupts registration
  net/failsafe: add Rx interrupts
---
V8:
Modifications according to inputs from Ferruh Yigit.
See each patch for details.

V7:
Fixed compilation errors in FreeBSD.
See 1516810328-39383-3-git-send-email-motih@mellanox.com

V6:
* Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
  See: 1516193643-130838-1-git-send-email-motih@mellanox.com
* Separated variable definitions from their initialization in routines,
  according to guidelines from Gaetan Rivet.

V5:
Modified code and split the patch into three patches in accordance with
inputs from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile           |   6 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 536 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     | 101 ++++++
 drivers/net/failsafe/failsafe_private.h |  40 +++
 7 files changed, 689 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

-- 
1.8.3.1


* [PATCH v8 1/3] net/failsafe: register as an Rx interrupt mode PMD
  2018-01-25 16:19                     ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
@ 2018-01-25 16:19                       ` Moti Haimovsky
  2018-01-25 16:19                       ` [PATCH v8 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-25 16:19 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch registers the Rx queues of the failsafe PMD with the EAL
Rx interrupts subsystem.
Each failsafe Rx queue is assigned a unique eventfd and an
enable-interrupts flag.
The PMD creates an interrupt vector containing these eventfds and
registers it with the EAL. The PMD also implements the Rx interrupt
enable and disable interface routines.
This patch does not implement the generation of Rx interrupts, so an
application can now wait for failsafe Rx interrupts but will not yet
receive any.
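
As an illustration of the per-queue eventfd handling, the sketch below
shows how pending events can be discarded when interrupts are disabled
on a queue. It is a standalone example, not the driver code:
sketch_drain_events is a hypothetical name, and the loop assumes the
eventfd was created with EFD_NONBLOCK (on a blocking eventfd the final
read would block instead of failing).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Discard pending events on a non-blocking eventfd.
 * Each successful read returns the accumulated counter and resets it to
 * zero; the loop ends when read() fails because nothing is pending.
 * Returns the total count that was pending. */
uint64_t
sketch_drain_events(int event_fd)
{
	uint64_t total = 0;
	uint64_t val;

	assert(event_fd >= 0);
	while (read(event_fd, &val, sizeof(val)) == (ssize_t)sizeof(val))
		total += val;
	return total;
}
```

Because an eventfd in its default (non-semaphore) mode sums all writes
into one counter, a single read is normally enough to clear it; the
loop only guards against a write arriving between reads.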

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V8:
Modifications according to inputs from Ferruh Yigit
in reply to
1516867635-67104-2-git-send-email-motih@mellanox.com

V6:
Fixed typo in commit subject.

V5:
Initial version of this patch in accordance with inputs from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com
---
 drivers/net/failsafe/Makefile           |   1 +
 drivers/net/failsafe/failsafe.c         |   4 +
 drivers/net/failsafe/failsafe_intr.c    | 136 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |  63 +++++++++++++++
 drivers/net/failsafe/failsafe_private.h |   9 +++
 5 files changed, 213 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index cb274eb..921e656 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -244,6 +244,10 @@
 		mac->addr_bytes[2], mac->addr_bytes[3],
 		mac->addr_bytes[4], mac->addr_bytes[5]);
 	dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+	PRIV(dev)->intr_handle = (struct rte_intr_handle){
+		.fd = -1,
+		.type = RTE_INTR_HANDLE_EXT,
+	};
 	return 0;
 free_args:
 	failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 0000000..8829ca4
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,136 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include <unistd.h>
+
+#include "failsafe_private.h"
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+	struct rte_intr_handle *intr_handle;
+
+	intr_handle = &priv->intr_handle;
+	if (intr_handle->intr_vec != NULL) {
+		free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+	intr_handle->nb_efd = 0;
+}
+
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+	unsigned int i;
+	unsigned int rxqs_n;
+	unsigned int n;
+	unsigned int count;
+	struct rte_intr_handle *intr_handle;
+
+	rxqs_n = priv->dev->data->nb_rx_queues;
+	n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+	count = 0;
+	intr_handle = &priv->intr_handle;
+	RTE_ASSERT(intr_handle->intr_vec == NULL);
+	/* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+	intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+	if (intr_handle->intr_vec == NULL) {
+		fs_rx_intr_vec_uninstall(priv);
+		rte_errno = ENOMEM;
+		ERROR("Failed to allocate memory for interrupt vector,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	for (i = 0; i < n; i++) {
+		struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+		/* Skip queues that cannot request interrupts. */
+		if (rxq == NULL || rxq->event_fd < 0) {
+			/* Use invalid intr_vec[] index to disable entry. */
+			intr_handle->intr_vec[i] =
+				RTE_INTR_VEC_RXTX_OFFSET +
+				RTE_MAX_RXTX_INTR_VEC_ID;
+			continue;
+		}
+		if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
+			rte_errno = E2BIG;
+			ERROR("Too many Rx queues for interrupt vector size"
+			      " (%d), Rx interrupts cannot be enabled",
+			      RTE_MAX_RXTX_INTR_VEC_ID);
+			fs_rx_intr_vec_uninstall(priv);
+			return -rte_errno;
+		}
+		intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + count;
+		intr_handle->efds[count] = rxq->event_fd;
+		count++;
+	}
+	if (count == 0) {
+		fs_rx_intr_vec_uninstall(priv);
+	} else {
+		intr_handle->nb_efd = count;
+		intr_handle->efd_counter_size = sizeof(uint64_t);
+	}
+	return 0;
+}
+
+
+/**
+ * Uninstall failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to the failsafe Ethernet device structure.
+ *
+ * This function returns no value; it releases the interrupt vector and
+ * clears the device interrupt handle.
+ */
+void
+failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
+{
+	fs_rx_intr_vec_uninstall(PRIV(dev));
+	dev->intr_handle = NULL;
+}
+
+/**
+ * Install failsafe Rx interrupts subsystem.
+ *
+ * @param dev
+ *   Pointer to the failsafe Ethernet device structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int
+failsafe_rx_intr_install(struct rte_eth_dev *dev)
+{
+	struct fs_priv *priv = PRIV(dev);
+	const struct rte_intr_conf *const intr_conf =
+			&priv->dev->data->dev_conf.intr_conf;
+
+	if (intr_conf->rxq == 0)
+		return 0;
+	if (fs_rx_intr_vec_install(priv) < 0)
+		return -rte_errno;
+	dev->intr_handle = &priv->intr_handle;
+	return 0;
+}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 946ac98..5cbc591 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -33,6 +33,7 @@
 
 #include <stdbool.h>
 #include <stdint.h>
+#include <unistd.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -199,6 +200,9 @@
 	uint8_t i;
 	int ret;
 
+	ret = failsafe_rx_intr_install(dev);
+	if (ret)
+		return ret;
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_ACTIVE)
 			continue;
@@ -228,6 +232,7 @@
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_STARTED - 1;
 	}
+	failsafe_rx_intr_uninstall(dev);
 }
 
 static int
@@ -317,6 +322,8 @@
 	if (queue == NULL)
 		return;
 	rxq = queue;
+	if (rxq->event_fd > 0)
+		close(rxq->event_fd);
 	dev = rxq->priv->dev;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
 		SUBOPS(sdev, rx_queue_release)
@@ -333,6 +340,16 @@
 		const struct rte_eth_rxconf *rx_conf,
 		struct rte_mempool *mb_pool)
 {
+	/*
+	 * FIXME: Add a proper interface in rte_eal_interrupts for
+	 * allocating eventfd as an interrupt vector.
+	 * For the time being, fake as if we are using MSIX interrupts,
+	 * this will cause rte_intr_efd_enable to allocate an eventfd for us.
+	 */
+	struct rte_intr_handle intr_handle = {
+		.type = RTE_INTR_HANDLE_VFIO_MSIX,
+		.efds = { -1, },
+	};
 	struct sub_device *sdev;
 	struct rxq *rxq;
 	uint8_t i;
@@ -370,6 +387,10 @@
 	rxq->info.nb_desc = nb_rx_desc;
 	rxq->priv = PRIV(dev);
 	rxq->sdev = PRIV(dev)->subs;
+	ret = rte_intr_efd_enable(&intr_handle, 1);
+	if (ret < 0)
+		return ret;
+	rxq->event_fd = intr_handle.efds[0];
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_rx_queue_setup(PORT_ID(sdev),
@@ -387,6 +408,46 @@
 	return ret;
 }
 
+static int
+fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq;
+
+	if (idx >= dev->data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq = dev->data->rx_queues[idx];
+	if (rxq == NULL || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 1;
+	return 0;
+}
+
+static int
+fs_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
+{
+	struct rxq *rxq;
+	uint64_t u64;
+
+	if (idx >= dev->data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq = dev->data->rx_queues[idx];
+	if (rxq == NULL || rxq->event_fd <= 0) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+	rxq->enable_events = 0;
+	/* Clear pending events */
+	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
+		;
+	return 0;
+}
+
 static bool
 fs_txq_offloads_valid(struct rte_eth_dev *dev, uint64_t offloads)
 {
@@ -888,6 +949,8 @@
 	.tx_queue_setup = fs_tx_queue_setup,
 	.rx_queue_release = fs_rx_queue_release,
 	.tx_queue_release = fs_tx_queue_release,
+	.rx_queue_intr_enable = fs_rx_intr_enable,
+	.rx_queue_intr_disable = fs_rx_intr_disable,
 	.flow_ctrl_get = fs_flow_ctrl_get,
 	.flow_ctrl_set = fs_flow_ctrl_set,
 	.mac_addr_remove = fs_mac_addr_remove,
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 7754248..419e5e7 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -40,6 +40,7 @@
 #include <rte_dev.h>
 #include <rte_ethdev_driver.h>
 #include <rte_devargs.h>
+#include <rte_interrupts.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
 
@@ -68,6 +69,8 @@ struct rxq {
 	/* next sub_device to poll */
 	struct sub_device *sdev;
 	unsigned int socket_id;
+	int event_fd;
+	unsigned int enable_events:1;
 	struct rte_eth_rxq_info info;
 	rte_atomic64_t refcnt[];
 };
@@ -145,6 +148,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_intr_handle intr_handle; /* Port interrupt handle. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
@@ -159,6 +163,11 @@ struct fs_priv {
 	int flow_isolated:1;
 };
 
+/* FAILSAFE_INTR */
+
+int failsafe_rx_intr_install(struct rte_eth_dev *dev);
+void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+
 /* MISC */
 
 int failsafe_hotplug_alarm_install(struct rte_eth_dev *dev);
-- 
1.8.3.1


* [PATCH v8 2/3] net/failsafe: slaves Rx interrupts registration
  2018-01-25 16:19                     ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-25 16:19                       ` [PATCH v8 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
@ 2018-01-25 16:19                       ` Moti Haimovsky
  2018-01-25 16:19                       ` [PATCH v8 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
  2018-01-25 16:53                       ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Gaëtan Rivet
  3 siblings, 0 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-25 16:19 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This commit adds the following functionality to the failsafe PMD:
* Register and unregister the slaves' Rx interrupts.
* Enable and disable the slaves' Rx interrupts.
The interrupt events generated by the slaves are not handled in this
commit.
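
The register/unregister step can be pictured with plain epoll calls.
The sketch below is an assumption-laden illustration, not the driver
code: sketch_subdev_intr_ctl is a hypothetical name, the per-queue
interrupt sources are ordinary file descriptors, and epoll_ctl stands
in for the per-queue EAL control call. Like the patch, it tries every
queue and reports the last failure, and treats removal from a proxy
that was never set up as a no-op.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>

/* Add or remove every per-queue fd of one sub-device on epfd.
 * op is EPOLL_CTL_ADD or EPOLL_CTL_DEL; queue_ctx[q] is the user data
 * reported back for queue q. Returns 0, or the negative errno of the
 * last queue that failed. */
int
sketch_subdev_intr_ctl(int epfd, int op, const int *queue_fds,
		       void *const *queue_ctx, size_t nb_queues)
{
	int ret = 0;
	size_t q;

	assert(op == EPOLL_CTL_ADD || op == EPOLL_CTL_DEL);
	if (epfd < 0) {
		/* Nothing to undo if the proxy was never set up. */
		return op == EPOLL_CTL_ADD ? -EBADF : 0;
	}
	for (q = 0; q < nb_queues; q++) {
		struct epoll_event ev = {
			.events = EPOLLIN,
			.data.ptr = queue_ctx[q],
		};

		if (epoll_ctl(epfd, op, queue_fds[q], &ev) < 0)
			ret = -errno;
	}
	return ret;
}
```

Continuing past a failing queue means a partially registered
sub-device is possible; a caller that needs all-or-nothing behavior
would have to undo the successful registrations itself.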

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V8:
Modifications according to inputs from Ferruh Yigit
in reply to
1516867635-67104-3-git-send-email-motih@mellanox.com

V7:
Fixed compilation errors in FreeBSD.
See 1516810328-39383-3-git-send-email-motih@mellanox.com

V6:
Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
See: 1516193643-130838-1-git-send-email-motih@mellanox.com

V5:
Initial version of this patch in accordance with inputs from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com
---
 drivers/net/failsafe/Makefile           |   5 +
 drivers/net/failsafe/failsafe_ether.c   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 234 +++++++++++++++++++++++++++++++-
 drivers/net/failsafe/failsafe_ops.c     |  36 ++++-
 drivers/net/failsafe/failsafe_private.h |  16 +++
 5 files changed, 289 insertions(+), 3 deletions(-)

diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index 91a734b..9bce1f7 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -47,6 +47,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
+ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
+CFLAGS += -DLINUX
+else
+CFLAGS += -DBSD
+endif
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..0f1630e 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
 		return;
 	switch (sdev->state) {
 	case DEV_STARTED:
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		rte_eth_dev_stop(PORT_ID(sdev));
 		sdev->state = DEV_ACTIVE;
 		/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index 8829ca4..b0eacdc 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -7,10 +7,231 @@
  * Interrupts handling for failsafe driver.
  */
 
+#if defined(LINUX)
+#include <sys/epoll.h>
+#endif
 #include <unistd.h>
 
 #include "failsafe_private.h"
 
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+
+/**
+ * Open an epoll file descriptor.
+ *
+ * @param flags
+ *   Flags for defining epoll behavior.
+ * @return
+ *   New epoll file descriptor on success, negative value otherwise.
+ */
+static int
+fs_epoll_create1(int flags)
+{
+#if defined(LINUX)
+	return epoll_create1(flags);
+#elif defined(BSD)
+	RTE_SET_USED(flags);
+	return -ENOTSUP;
+#endif
+}
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+	int rc = 0;
+
+	/*
+	 * Create the epoll fd and event vector for the proxy service to
+	 * wait on for Rx events generated by the subdevices.
+	 */
+	priv->rxp.efd = fs_epoll_create1(0);
+	if (priv->rxp.efd < 0) {
+		rte_errno = errno;
+		ERROR("Failed to create epoll,"
+		      " Rx interrupts will not be supported");
+		return -rte_errno;
+	}
+	priv->rxp.evec = calloc(NUM_RX_PROXIES, sizeof(*priv->rxp.evec));
+	if (priv->rxp.evec == NULL) {
+		ERROR("Failed to allocate memory for event vectors,"
+		      " Rx interrupts will not be supported");
+		rc = -ENOMEM;
+		goto error;
+	}
+	return 0;
+error:
+	if (priv->rxp.efd >= 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+	if (priv->rxp.evec != NULL) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	rte_errno = -rc;
+	return rc;
+}
+
+/**
+ * RX Interrupt control per subdevice.
+ *
+ * @param sdev
+ *   Pointer to sub-device structure.
+ * @param op
+ *   The operation to be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int
+failsafe_eth_rx_intr_ctl_subdevice(struct sub_device *sdev, int op)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *fsdev;
+	int epfd;
+	uint16_t pid;
+	uint16_t qid;
+	struct rxq *fsrxq;
+	int rc;
+	int ret = 0;
+
+	if (sdev == NULL || (ETH(sdev) == NULL) ||
+	    sdev->fs_dev == NULL || (PRIV(sdev->fs_dev) == NULL)) {
+		ERROR("Called with invalid arguments");
+		return -EINVAL;
+	}
+	dev = ETH(sdev);
+	fsdev = sdev->fs_dev;
+	epfd = PRIV(sdev->fs_dev)->rxp.efd;
+	pid = PORT_ID(sdev);
+
+	if (epfd <= 0) {
+		if (op == RTE_INTR_EVENT_ADD) {
+			ERROR("Proxy events are not initialized");
+			return -EBADF;
+		} else {
+			return 0;
+		}
+	}
+	if (dev->data->nb_rx_queues > fsdev->data->nb_rx_queues) {
+		ERROR("subdevice has too many queues,"
+		      " Interrupts will not be enabled");
+		return -E2BIG;
+	}
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		fsrxq = fsdev->data->rx_queues[qid];
+		rc = rte_eth_dev_rx_intr_ctl_q(pid, qid, epfd,
+					       op, (void *)fsrxq);
+		if (rc) {
+			ERROR("rte_eth_dev_rx_intr_ctl_q failed for "
+			      "port %d  queue %d, epfd %d, error %d",
+			      pid, qid, epfd, rc);
+			ret = rc;
+		}
+	}
+	return ret;
+}
+
+/**
+ * Install Rx interrupts subsystem for a subdevice.
+ * This supports dynamically adding subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev)
+{
+	int rc;
+	int qid;
+	struct rte_eth_dev *fsdev;
+	struct rxq **rxq;
+	const struct rte_intr_conf *const intr_conf =
+				&ETH(sdev)->data->dev_conf.intr_conf;
+
+	fsdev = sdev->fs_dev;
+	rxq = (struct rxq **)fsdev->data->rx_queues;
+	if (intr_conf->rxq == 0)
+		return 0;
+	rc = failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_ADD);
+	if (rc)
+		return rc;
+	/* enable interrupts on already-enabled queues */
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (rxq[qid]->enable_events) {
+			int ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev),
+							     qid);
+			if (ret && (ret != -ENOTSUP)) {
+				ERROR("Failed to enable interrupts on "
+				      "port %d queue %d", PORT_ID(sdev), qid);
+				rc = ret;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall Rx interrupts subsystem for a subdevice.
+ * This supports dynamically removing subdevices.
+ *
+ * @param sdev
+ *   Pointer to subdevice structure.
+ *
+ * Failures while disabling or unregistering queue interrupts are not
+ * reported to the caller.
+ */
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
+{
+	int qid;
+	struct rte_eth_dev *fsdev;
+	struct rxq *fsrxq;
+
+	fsdev = sdev->fs_dev;
+	for (qid = 0; qid < ETH(sdev)->data->nb_rx_queues; qid++) {
+		if (qid < fsdev->data->nb_rx_queues) {
+			fsrxq = fsdev->data->rx_queues[qid];
+			if (fsrxq->enable_events)
+				rte_eth_dev_rx_intr_disable(PORT_ID(sdev),
+							    qid);
+		}
+	}
+	failsafe_eth_rx_intr_ctl_subdevice(sdev, RTE_INTR_EVENT_DEL);
+}
+
+/**
+ * Uninstall failsafe Rx event proxy.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_uninstall(struct fs_priv *priv)
+{
+	if (priv->rxp.evec != NULL) {
+		free(priv->rxp.evec);
+		priv->rxp.evec = NULL;
+	}
+	if (priv->rxp.efd > 0) {
+		close(priv->rxp.efd);
+		priv->rxp.efd = -1;
+	}
+}
+
 /**
  * Uninstall failsafe interrupt vector.
  *
@@ -107,7 +328,14 @@
 void
 failsafe_rx_intr_uninstall(struct rte_eth_dev *dev)
 {
-	fs_rx_intr_vec_uninstall(PRIV(dev));
+	struct fs_priv *priv;
+	struct rte_intr_handle *intr_handle;
+
+	priv = PRIV(dev);
+	intr_handle = &priv->intr_handle;
+	rte_intr_free_epoll_fd(intr_handle);
+	fs_rx_event_proxy_uninstall(priv);
+	fs_rx_intr_vec_uninstall(priv);
 	dev->intr_handle = NULL;
 }
 
@@ -131,6 +359,10 @@
 		return 0;
 	if (fs_rx_intr_vec_install(priv) < 0)
 		return -rte_errno;
+	if (fs_rx_event_proxy_install(priv) < 0) {
+		fs_rx_intr_vec_uninstall(priv);
+		return -rte_errno;
+	}
 	dev->intr_handle = &priv->intr_handle;
 	return 0;
 }
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 5cbc591..249baea 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -213,6 +213,13 @@
 				continue;
 			return ret;
 		}
+		ret = failsafe_rx_intr_install_subdevice(sdev);
+		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
+			rte_eth_dev_stop(PORT_ID(sdev));
+			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -230,6 +237,7 @@
 	PRIV(dev)->state = DEV_STARTED - 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
 		rte_eth_dev_stop(PORT_ID(sdev));
+		failsafe_rx_intr_uninstall_subdevice(sdev);
 		sdev->state = DEV_STARTED - 1;
 	}
 	failsafe_rx_intr_uninstall(dev);
@@ -412,6 +420,10 @@
 fs_rx_intr_enable(struct rte_eth_dev *dev, uint16_t idx)
 {
 	struct rxq *rxq;
+	struct sub_device *sdev;
+	uint8_t i;
+	int ret;
+	int rc = 0;
 
 	if (idx >= dev->data->nb_rx_queues) {
 		rte_errno = EINVAL;
@@ -423,14 +435,26 @@
 		return -rte_errno;
 	}
 	rxq->enable_events = 1;
-	return 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
+		ret = fs_err(sdev, ret);
+		if (ret)
+			rc = ret;
+	}
+	if (rc)
+		rte_errno = -rc;
+	return rc;
 }
 
 static int
 fs_rx_intr_disable(struct rte_eth_dev *dev, uint16_t idx)
 {
 	struct rxq *rxq;
+	struct sub_device *sdev;
 	uint64_t u64;
+	uint8_t i;
+	int rc = 0;
+	int ret;
 
 	if (idx >= dev->data->nb_rx_queues) {
 		rte_errno = EINVAL;
@@ -442,10 +466,18 @@
 		return -rte_errno;
 	}
 	rxq->enable_events = 0;
+	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+		ret = rte_eth_dev_rx_intr_disable(PORT_ID(sdev), idx);
+		ret = fs_err(sdev, ret);
+		if (ret)
+			rc = ret;
+	}
 	/* Clear pending events */
 	while (read(rxq->event_fd, &u64, sizeof(uint64_t)) >  0)
 		;
-	return 0;
+	if (rc)
+		rte_errno = -rc;
+	return rc;
 }
 
 static bool
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 419e5e7..ff78b9f 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -63,6 +63,13 @@
 
 /* TYPES */
 
+struct rx_proxy {
+	/* epoll file descriptor */
+	int efd;
+	/* event vector to be used by epoll */
+	struct rte_epoll_event *evec;
+};
+
 struct rxq {
 	struct fs_priv *priv;
 	uint16_t qid;
@@ -158,6 +165,13 @@ struct fs_priv {
 	 */
 	enum dev_state state;
 	struct rte_eth_stats stats_accumulator;
+	/*
+	 * Rx interrupts/events proxy.
+	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
+	 * it does that by registering an event-fd for each of its queues with
+	 * the EAL.
+	 */
+	struct rx_proxy rxp;
 	unsigned int pending_alarm:1; /* An alarm is pending */
 	/* flow isolation state */
 	int flow_isolated:1;
@@ -167,6 +181,8 @@ struct fs_priv {
 
 int failsafe_rx_intr_install(struct rte_eth_dev *dev);
 void failsafe_rx_intr_uninstall(struct rte_eth_dev *dev);
+int failsafe_rx_intr_install_subdevice(struct sub_device *sdev);
+void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev);
 
 /* MISC */
 
-- 
1.8.3.1
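
For readers following the thread: fs_rx_intr_disable() above clears pending
events by reading the queue event fd until read() stops returning data. A
minimal standalone sketch of that idiom follows. It assumes the descriptor is
an eventfd created with EFD_NONBLOCK (otherwise the loop would block once the
counter reaches zero); demo_drain_events is a hypothetical name and is not
part of the patch.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Drain a non-blocking eventfd (hypothetical helper, not from the patch).
 * Each successful read() returns the accumulated counter and resets it,
 * so the loop ends as soon as read() fails with EAGAIN.
 * Returns the total count that was pending.
 */
static uint64_t
demo_drain_events(int efd)
{
	uint64_t cnt;
	uint64_t total = 0;

	while (read(efd, &cnt, sizeof(cnt)) > 0)
		total += cnt;
	return total;
}
```

Draining after disabling keeps a stale event from waking the application once
it re-enables interrupts on the queue.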


* [PATCH v8 3/3] net/failsafe: add Rx interrupts
  2018-01-25 16:19                     ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
  2018-01-25 16:19                       ` [PATCH v8 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
  2018-01-25 16:19                       ` [PATCH v8 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
@ 2018-01-25 16:19                       ` Moti Haimovsky
  2018-01-25 16:53                       ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Gaëtan Rivet
  3 siblings, 0 replies; 29+ messages in thread
From: Moti Haimovsky @ 2018-01-25 16:19 UTC (permalink / raw)
  To: gaetan.rivet, ferruh.yigit; +Cc: dev, Moti Haimovsky

This patch is the last patch in the series of patches aimed
to add support for registering and waiting for Rx interrupts
in failsafe PMD. This allows applications to wait for Rx events
from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application including packets
transmission and reception.
The proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
This will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
    a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
    the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
    will allocate enough space to handle all the Rx events from all its
    subdevices.
  * Acting as an application,
    for each Rx queue in each active subdevice the failsafe will:
      o Register the Rx queue with the EAL.
      o Pass the EAL the failsafe private epoll fd as the epfd to
        register the Rx queue event on.
      o Pass the EAL, as a parameter, the pointer to the failsafe Rx
        queue that handles this Rx queue.
      o Using the DPDK service callbacks, the failsafe PMD will launch
        an Rx proxy service that will wait on the epoll fd for Rx
        events from the sub-devices.
      o For each Rx event received the proxy service will
          - Retrieve the pointer to failsafe Rx queue that handles
            this subdevice Rx queue from the user info returned by the
            EAL.
          - Trigger a failsafe Rx event on that queue by writing to
            the event fd unless interrupts are disabled for that queue.
  * The failsafe PMD will also implement the rx_queue_intr_enable
    and rx_queue_intr_disable routines that will enable and disable Rx
    interrupts respectively on both the failsafe and its subdevices.
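
The proxy steps listed above can be sketched with plain Linux epoll and
eventfd calls. This is only an illustration under stated assumptions, not the
PMD code: the patch goes through the EAL (rte_eth_dev_rx_intr_ctl_q and
rte_epoll_wait), while struct demo_rxq, demo_proxy_add and demo_proxy_poll
are hypothetical names used here to show the data flow.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for a failsafe Rx queue (not the PMD's struct rxq). */
struct demo_rxq {
	int event_fd;      /* eventfd the application waits on */
	int enable_events; /* non-zero while Rx interrupts are enabled */
};

/*
 * Register a sub-device event fd on the private epoll fd. The failsafe
 * queue pointer travels as user data and comes back with every event.
 */
static int
demo_proxy_add(int epfd, int sub_fd, struct demo_rxq *rxq)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.ptr = rxq };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, sub_fd, &ev);
}

/*
 * One proxy pass: wait for sub-device events and forward each one to the
 * owning failsafe queue unless events are disabled on that queue.
 * Returns the number of events forwarded, or -1 on error.
 */
static int
demo_proxy_poll(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	uint64_t one = 1;
	int forwarded = 0;
	int i, n;

	n = epoll_wait(epfd, evs, 8, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		struct demo_rxq *rxq = evs[i].data.ptr;

		if (!rxq->enable_events || rxq->event_fd < 0)
			continue;
		if (write(rxq->event_fd, &one, sizeof(one)) !=
		    (ssize_t)sizeof(one))
			return -1;
		forwarded++;
	}
	return forwarded;
}
```

The sketch registers the sub-device fd level-triggered and never reads it, so
the same event is reported on every pass until something drains that fd.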

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
V8:
Modifications according to inputs from Ferruh Yigit
in reply to
1516867635-67104-4-git-send-email-motih@mellanox.com

V6:
Separated between routines' variables definition and initialization
according to guidelines from Gaetan Rivet.

V5:
Modified code and split the patch into three patches in accordance with
inputs from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-motih@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
* Fixed coding style warning.
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/failsafe_intr.c    | 168 ++++++++++++++++++++++++++++++++
 drivers/net/failsafe/failsafe_ops.c     |   6 ++
 drivers/net/failsafe/failsafe_private.h |  17 +++-
 4 files changed, 191 insertions(+), 1 deletion(-)

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status          = Y
 Link status event    = Y
+Rx interrupt         = Y
 MTU update           = Y
 Jumbo frame          = Y
 Promiscuous mode     = Y
diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index b0eacdc..f6ff04d 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -12,6 +12,14 @@
 #endif
 #include <unistd.h>
 
+#include <rte_alarm.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include <rte_interrupts.h>
+#include <rte_io.h>
+#include <rte_service_component.h>
+
 #include "failsafe_private.h"
 
 #define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
@@ -37,6 +45,162 @@
 }
 
 /**
+ * Rx event proxy service routine.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param data
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_routine(void *data)
+{
+	struct fs_priv *priv;
+	struct rxq *rxq;
+	struct rte_epoll_event *events;
+	uint64_t u64;
+	int i, n;
+	int rc = 0;
+
+	u64 = 1;
+	priv = data;
+	events = priv->rxp.evec;
+	n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+	for (i = 0; i < n; i++) {
+		rxq = events[i].epdata.data;
+		if (rxq->enable_events && rxq->event_fd != -1) {
+			if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+			    sizeof(u64)) {
+				ERROR("Failed to proxy Rx event to event fd %d",
+				       rxq->event_fd);
+				rc = -EIO;
+			}
+		}
+	}
+	return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_event_proxy_service_uninstall(struct fs_priv *priv)
+{
+	/* Unregister the event service. */
+	switch (priv->rxp.sstate) {
+	case SS_RUNNING:
+		rte_service_map_lcore_set(priv->rxp.sid, priv->rxp.scid, 0);
+		/* fall through */
+	case SS_READY:
+		rte_service_runstate_set(priv->rxp.sid, 0);
+		rte_service_set_stats_enable(priv->rxp.sid, 0);
+		rte_service_component_runstate_set(priv->rxp.sid, 0);
+		/* fall through */
+	case SS_REGISTERED:
+		rte_service_component_unregister(priv->rxp.sid);
+		/* fall through */
+	default:
+		break;
+	}
+}
+
+/**
+ * Install the failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_service_install(struct fs_priv *priv)
+{
+	struct rte_service_spec service;
+	int32_t num_service_cores;
+	int ret = 0;
+
+	num_service_cores = rte_service_lcore_count();
+	if (num_service_cores <= 0) {
+		ERROR("Failed to install Rx interrupts, "
+		      "no service core found");
+		return -ENOTSUP;
+	}
+	/* prepare service info */
+	memset(&service, 0, sizeof(struct rte_service_spec));
+	snprintf(service.name, sizeof(service.name), "%s_Rx_service",
+		 priv->dev->data->name);
+	service.socket_id = priv->dev->data->numa_node;
+	service.callback = fs_rx_event_proxy_routine;
+	service.callback_userdata = priv;
+
+	if (priv->rxp.sstate == SS_NO_SERVICE) {
+		uint32_t service_core_list[num_service_cores];
+
+		/* get a service core to work with */
+		ret = rte_service_lcore_list(service_core_list,
+					     num_service_cores);
+		if (ret <= 0) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core list empty or corrupted");
+			return -ENOTSUP;
+		}
+		priv->rxp.scid = service_core_list[0];
+		ret = rte_service_lcore_add(priv->rxp.scid);
+		if (ret && ret != -EALREADY) {
+			ERROR("Failed adding service core");
+			return ret;
+		}
+		/* service core may be in "stopped" state, start it */
+		ret = rte_service_lcore_start(priv->rxp.scid);
+		if (ret && (ret != -EALREADY)) {
+			ERROR("Failed to install Rx interrupts, "
+			      "service core not started");
+			return ret;
+		}
+		/* register our service */
+		ret = rte_service_component_register(&service,
+						     &priv->rxp.sid);
+		if (ret) {
+			ERROR("service register() failed");
+			return -ENOEXEC;
+		}
+		priv->rxp.sstate = SS_REGISTERED;
+		/* run the service */
+		ret = rte_service_component_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed setting component runstate");
+			return ret;
+		}
+		ret = rte_service_set_stats_enable(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed enabling stats");
+			return ret;
+		}
+		ret = rte_service_runstate_set(priv->rxp.sid, 1);
+		if (ret < 0) {
+			ERROR("Failed to run service");
+			return ret;
+		}
+		priv->rxp.sstate = SS_READY;
+		/* map the service with the service core */
+		ret = rte_service_map_lcore_set(priv->rxp.sid,
+						priv->rxp.scid, 1);
+		if (ret) {
+			ERROR("Failed to install Rx interrupts, "
+			      "could not map service core");
+			return ret;
+		}
+		priv->rxp.sstate = SS_RUNNING;
+	}
+	return 0;
+}
+
+/**
  * Install failsafe Rx event proxy subsystem.
  * This is the way the failsafe PMD generates Rx events on behalf of its
  * subdevices.
@@ -69,6 +233,9 @@
 		rc = -ENOMEM;
 		goto error;
 	}
+	rc = fs_rx_event_proxy_service_install(priv);
+	if (rc < 0)
+		goto error;
 	return 0;
 error:
 	if (priv->rxp.efd >= 0) {
@@ -222,6 +389,7 @@ void failsafe_rx_intr_uninstall_subdevice(struct sub_device *sdev)
 static void
 fs_rx_event_proxy_uninstall(struct fs_priv *priv)
 {
+	fs_rx_event_proxy_service_uninstall(priv);
 	if (priv->rxp.evec != NULL) {
 		free(priv->rxp.evec);
 		priv->rxp.evec = NULL;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 249baea..eb781ef 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -434,6 +434,12 @@
 		rte_errno = EINVAL;
 		return -rte_errno;
 	}
+	/* Fail if proxy service is not running. */
+	if (PRIV(dev)->rxp.sstate != SS_RUNNING) {
+		ERROR("failsafe interrupt services are not running");
+		rte_errno = EAGAIN;
+		return -rte_errno;
+	}
 	rxq->enable_events = 1;
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_rx_intr_enable(PORT_ID(sdev), idx);
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index ff78b9f..5d328ff 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -61,6 +61,13 @@
 
 #define DEVARGS_MAXLEN 4096
 
+enum rxp_service_state {
+	SS_NO_SERVICE = 0,
+	SS_REGISTERED,
+	SS_READY,
+	SS_RUNNING,
+};
+
 /* TYPES */
 
 struct rx_proxy {
@@ -68,6 +75,11 @@ struct rx_proxy {
 	int efd;
 	/* event vector to be used by epoll */
 	struct rte_epoll_event *evec;
+	/* rte service id */
+	uint32_t sid;
+	/* service core id */
+	uint32_t scid;
+	enum rxp_service_state sstate;
 };
 
 struct rxq {
@@ -169,7 +181,10 @@ struct fs_priv {
 	 * Rx interrupts/events proxy.
 	 * The PMD issues Rx events to the EAL on behalf of its subdevices,
 	 * it does that by registering an event-fd for each of its queues with
-	 * the EAL.
+	 * the EAL. A PMD service thread listens to all the Rx events from the
+	 * subdevices. When an Rx event is issued by a subdevice it will be
+	 * caught by this service, which will trigger an Rx event in the
+	 * appropriate failsafe Rx queue.
 	 */
 	struct rx_proxy rxp;
 	unsigned int pending_alarm:1; /* An alarm is pending */
-- 
1.8.3.1
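
The service uninstall routine in this patch unwinds a staged setup with a
fall-through switch on the recorded state (SS_RUNNING, SS_READY,
SS_REGISTERED). The pattern can be shown without DPDK as below. All names are
hypothetical, and the flags stand in for the rte_service_* calls. Resetting
the state at the end of the teardown is a choice made in this sketch; the
patch as posted does not show sstate being reset.

```c
/* Hypothetical stages mirroring the install order (not the patch's enum). */
enum demo_state {
	DEMO_NO_SERVICE = 0,
	DEMO_REGISTERED,
	DEMO_READY,
	DEMO_RUNNING,
};

struct demo_service {
	enum demo_state state;
	int registered; /* stands in for component registration */
	int runnable;   /* stands in for the runstate flags */
	int mapped;     /* stands in for the lcore mapping */
};

/*
 * Bring the service up one stage at a time and record each completed
 * stage. stop_after simulates a failure right after that stage.
 */
static void
demo_install(struct demo_service *s, enum demo_state stop_after)
{
	if (s->state != DEMO_NO_SERVICE || stop_after == DEMO_NO_SERVICE)
		return;
	s->registered = 1;
	s->state = DEMO_REGISTERED;
	if (stop_after == DEMO_REGISTERED)
		return;
	s->runnable = 1;
	s->state = DEMO_READY;
	if (stop_after == DEMO_READY)
		return;
	s->mapped = 1;
	s->state = DEMO_RUNNING;
}

/* Undo only the stages that completed, in reverse order. */
static void
demo_uninstall(struct demo_service *s)
{
	switch (s->state) {
	case DEMO_RUNNING:
		s->mapped = 0;
		/* fall through */
	case DEMO_READY:
		s->runnable = 0;
		/* fall through */
	case DEMO_REGISTERED:
		s->registered = 0;
		/* fall through */
	default:
		break;
	}
	s->state = DEMO_NO_SERVICE;
}
```

Because every stage records its own state, a partially failed install can be
cleaned up by the same routine that handles a normal shutdown.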


* Re: [PATCH v8 0/3] net/failsafe: add Rx interrupts support
  2018-01-25 16:19                     ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
                                         ` (2 preceding siblings ...)
  2018-01-25 16:19                       ` [PATCH v8 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
@ 2018-01-25 16:53                       ` Gaëtan Rivet
  2018-01-26 18:22                         ` Ferruh Yigit
  3 siblings, 1 reply; 29+ messages in thread
From: Gaëtan Rivet @ 2018-01-25 16:53 UTC (permalink / raw)
  To: Moti Haimovsky; +Cc: ferruh.yigit, dev

Hi Moti,

There are still a few nits here and there, but nothing important.

Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>

Regards,
-- 
Gaëtan Rivet
6WIND


* Re: [PATCH v8 0/3] net/failsafe: add Rx interrupts support
  2018-01-25 16:53                       ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Gaëtan Rivet
@ 2018-01-26 18:22                         ` Ferruh Yigit
  0 siblings, 0 replies; 29+ messages in thread
From: Ferruh Yigit @ 2018-01-26 18:22 UTC (permalink / raw)
  To: Gaëtan Rivet, Moti Haimovsky; +Cc: dev

On 1/25/2018 4:53 PM, Gaëtan Rivet wrote:
> Hi Moti,
> 
> There are still a few nits here and there, but nothing important.
> 
> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>

Series applied to dpdk-next-net/master, thanks.


end of thread, other threads:[~2018-01-26 18:22 UTC | newest]

Thread overview: 29+ messages
2017-12-11 12:41 [PATCH] net/failsafe: add Rx interrupts Moti Haimovsky
2017-12-12  1:34 ` Stephen Hemminger
2017-12-13 13:12   ` Mordechay Haimovsky
2018-01-04 15:01 ` [PATCH v2] " Moti Haimovsky
2018-01-17 12:54   ` [PATCH V3] " Moti Haimovsky
2018-01-19  9:32     ` [PATCH v4] " Moti Haimovsky
2018-01-19  9:32       ` Moti Haimovsky
2018-01-19 14:11         ` Gaëtan Rivet
2018-01-23 18:43         ` [PATCH v5 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
2018-01-23 18:43           ` [PATCH v5 1/3] net/failsafe: regiter as an Rx interrupt mode PMD Moti Haimovsky
2018-01-23 18:43           ` [PATCH v5 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
2018-01-24 16:12             ` [PATCH v6 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
2018-01-24 16:12               ` [PATCH v6 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
2018-01-24 16:12               ` [PATCH v6 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
2018-01-25  8:07                 ` [PATCH v7 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
2018-01-25  8:07                   ` [PATCH v7 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
2018-01-25 11:36                     ` Gaëtan Rivet
2018-01-25  8:07                   ` [PATCH v7 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
2018-01-25 11:49                     ` Gaëtan Rivet
2018-01-25  8:07                   ` [PATCH v7 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
2018-01-25 11:58                     ` Gaëtan Rivet
2018-01-25 16:19                     ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Moti Haimovsky
2018-01-25 16:19                       ` [PATCH v8 1/3] net/failsafe: register as an Rx interrupt mode PMD Moti Haimovsky
2018-01-25 16:19                       ` [PATCH v8 2/3] net/failsafe: slaves Rx interrupts registration Moti Haimovsky
2018-01-25 16:19                       ` [PATCH v8 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
2018-01-25 16:53                       ` [PATCH v8 0/3] net/failsafe: add Rx interrupts support Gaëtan Rivet
2018-01-26 18:22                         ` Ferruh Yigit
2018-01-24 16:12               ` [PATCH v6 3/3] net/failsafe: add Rx interrupts Moti Haimovsky
2018-01-23 18:43           ` [PATCH v5 " Moti Haimovsky
