* [PATCH 0/3] Fail-safe fix removal handling lack
@ 2017-11-02 15:42 Matan Azrad
  2017-11-02 15:42 ` [PATCH 1/3] net/failsafe: " Matan Azrad
                   ` (3 more replies)
  0 siblings, 4 replies; 98+ messages in thread
From: Matan Azrad @ 2017-11-02 15:42 UTC (permalink / raw)
  To: Adrien Mazarguil, Gaetan Rivet; +Cc: dev

There is a window between the physical removal of a device and the moment
sub-device PMDs receive an RMV interrupt. During this window, DPDK PMDs and
applications are not yet aware of the removal and may call sub-device control
operations, which should return an error.

This series makes the failsafe, mlx4 and mlx5 PMDs use -ENODEV as the error
value that signals device removal.

Matan Azrad (3):
  net/failsafe: fix removal handling lack
  net/mlx4: adjust removal error
  net/mlx5: adjust removal error

 doc/guides/nics/fail_safe.rst                   |  7 +++++
 doc/guides/prog_guide/env_abstraction_layer.rst |  3 ++
 drivers/net/failsafe/failsafe_flow.c            | 16 ++++++----
 drivers/net/failsafe/failsafe_ops.c             | 27 ++++++++++-------
 drivers/net/failsafe/failsafe_private.h         |  8 +++++
 drivers/net/mlx4/mlx4.h                         |  1 +
 drivers/net/mlx4/mlx4_ethdev.c                  | 38 +++++++++++++++++++++---
 drivers/net/mlx4/mlx4_flow.c                    |  2 ++
 drivers/net/mlx4/mlx4_intr.c                    |  5 +++-
 drivers/net/mlx4/mlx4_rxq.c                     |  1 +
 drivers/net/mlx4/mlx4_txq.c                     |  1 +
 drivers/net/mlx5/mlx5.h                         |  1 +
 drivers/net/mlx5/mlx5_ethdev.c                  | 39 ++++++++++++++++++++++---
 drivers/net/mlx5/mlx5_flow.c                    |  2 ++
 drivers/net/mlx5/mlx5_rss.c                     |  4 +++
 drivers/net/mlx5/mlx5_rxq.c                     | 12 ++++++--
 drivers/net/mlx5/mlx5_stats.c                   |  6 +++-
 drivers/net/mlx5/mlx5_txq.c                     |  2 ++
 18 files changed, 147 insertions(+), 28 deletions(-)

-- 
1.8.3.1

* [PATCH 1/3] net/failsafe: fix removal handling lack
  2017-11-02 15:42 [PATCH 0/3] Fail-safe fix removal handling lack Matan Azrad
@ 2017-11-02 15:42 ` Matan Azrad
  2017-11-06  8:19   ` Gaëtan Rivet
  2017-11-02 15:42 ` [PATCH 2/3] net/mlx4: adjust removal error Matan Azrad
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-11-02 15:42 UTC (permalink / raw)
  To: Adrien Mazarguil, Gaetan Rivet; +Cc: dev, stable

There is a window between the physical removal of a device and the
moment sub-device PMDs receive an RMV interrupt. During this window,
DPDK PMDs and applications are not yet aware of the removal and may
call sub-device control operations, which should return an error.

In the previous code this error is reported to the application,
contrary to the fail-safe principle that the application should not be
aware of device removal.

Define a removal error that each sub-device PMD should return when an
error is caused by a removal event; this special error is -ENODEV.

Add an error check to each relevant control command error flow and
do not report the error to the application when its value is -ENODEV.
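
The check described above can be sketched in isolation as follows. This is
only an illustration under stated assumptions, not the fail-safe code itself:
struct fake_subdev, subdev_removed() and apply_to_all() are hypothetical
stand-ins, with rmv_capable modelling the RTE_ETH_DEV_INTR_RMV device flag and
op_result modelling the value returned by a sub-device control operation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fail-safe sub-device state. */
struct fake_subdev {
	bool remove;      /* set once the RMV event has been processed */
	bool rmv_capable; /* models RTE_ETH_DEV_INTR_RMV in dev_flags */
	int op_result;    /* value the control operation returns */
};

/* Same logic as SUBDEV_REMOVED() in this patch, on the stand-in type. */
static bool
subdev_removed(const struct fake_subdev *s, int err)
{
	return s->remove || (err == -ENODEV && s->rmv_capable);
}

/* Run one control operation on every sub-device, hiding removal errors. */
static int
apply_to_all(const struct fake_subdev *sdevs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = sdevs[i].op_result;

		if (ret && !subdev_removed(&sdevs[i], ret))
			return ret; /* genuine failure: report it */
	}
	return 0; /* success, or only removal-related errors */
}
```

With this shape, -ENODEV from a removable sub-device is swallowed, while the
same value from a sub-device without RMV support, or any other error, still
reaches the caller.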

Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
Fixes: b737a1e ("net/failsafe: support flow API")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 doc/guides/nics/fail_safe.rst                   |  7 +++++++
 doc/guides/prog_guide/env_abstraction_layer.rst |  3 +++
 drivers/net/failsafe/failsafe_flow.c            | 16 +++++++++------
 drivers/net/failsafe/failsafe_ops.c             | 27 ++++++++++++++++---------
 drivers/net/failsafe/failsafe_private.h         |  8 ++++++++
 5 files changed, 45 insertions(+), 16 deletions(-)

diff --git a/doc/guides/nics/fail_safe.rst b/doc/guides/nics/fail_safe.rst
index c4e3d2e..5023fc4 100644
--- a/doc/guides/nics/fail_safe.rst
+++ b/doc/guides/nics/fail_safe.rst
@@ -193,6 +193,13 @@ any time. The fail-safe PMD will register a callback for such event and react
 accordingly. It will try to safely stop, close and uninit the sub-device having
 emitted this event, allowing it to free its eventual resources.
 
+When the fail-safe PMD gets -ENODEV from a control command sent to a removable
+sub-device, it assumes that the cause is device removal and returns success to
+the application. The PMD controlling the sub-device is still responsible for
+emitting a removal event (RMV) in addition to returning -ENODEV from control
+operations after the device has been physically removed. Only the reception of
+this event unregisters the sub-device on the fail-safe side.
+
 Fail-safe glossary
 ------------------
 
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 4775eb3..bd2fd87 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -213,6 +213,9 @@ device having emitted a Device Removal Event. In such case, calling
 callback. Care must be taken not to close the device from the interrupt handler
 context. It is necessary to reschedule such closing operation.
 
+Unsuccessful control operations (for those that return errors) may return
+-ENODEV after the device is physically unplugged.
+
 Blacklisting
 ~~~~~~~~~~~~
 
diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
index 153ceee..ce9b769 100644
--- a/drivers/net/failsafe/failsafe_flow.c
+++ b/drivers/net/failsafe/failsafe_flow.c
@@ -87,7 +87,7 @@
 		DEBUG("Calling rte_flow_validate on sub_device %d", i);
 		ret = rte_flow_validate(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_flow_validate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -111,7 +111,8 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (flow->flows[i] == NULL) {
+		if (flow->flows[i] == NULL &&
+			!SUBDEV_REMOVED(sdev, -rte_errno)) {
 			ERROR("Failed to create flow on sub_device %d",
 				i);
 			goto err;
@@ -150,7 +151,7 @@
 			continue;
 		local_ret = rte_flow_destroy(PORT_ID(sdev),
 				flow->flows[i], error);
-		if (local_ret) {
+		if (local_ret && !SUBDEV_REMOVED(sdev, local_ret)) {
 			ERROR("Failed to destroy flow on sub_device %d: %d",
 					i, local_ret);
 			if (ret == 0)
@@ -175,7 +176,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_flow_flush on sub_device %d", i);
 		ret = rte_flow_flush(PORT_ID(sdev), error);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_flow_flush failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -199,8 +200,11 @@
 
 	sdev = TX_SUBDEV(dev);
 	if (sdev != NULL) {
-		return rte_flow_query(PORT_ID(sdev),
+		int ret = rte_flow_query(PORT_ID(sdev),
 				flow->flows[SUB_ID(sdev)], type, arg, error);
+
+		if (!SUBDEV_REMOVED(sdev, ret))
+			return ret;
 	}
 	WARN("No active sub_device to query about its flow");
 	return -1;
@@ -223,7 +227,7 @@
 			WARN("flow isolation mode of sub_device %d in incoherent state.",
 				i);
 		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_flow_isolate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index f460551..cc7ab7f 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -314,7 +314,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
 		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -333,7 +333,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
 		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -418,7 +418,7 @@
 				rx_queue_id,
 				nb_rx_desc, socket_id,
 				rx_conf, mb_pool);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("RX queue setup failed for sub_device %d", i);
 			goto free_rxq;
 		}
@@ -484,7 +484,7 @@
 				tx_queue_id,
 				nb_tx_desc, socket_id,
 				tx_conf);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("TX queue setup failed for sub_device %d", i);
 			goto free_txq;
 		}
@@ -563,7 +563,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling link_update on sub_device %d", i);
 		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1) {
+		if (ret && ret != -1  && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Link update failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -587,6 +587,7 @@
 fs_stats_get(struct rte_eth_dev *dev,
 	     struct rte_eth_stats *stats)
 {
+	struct rte_eth_stats backup;
 	struct sub_device *sdev;
 	uint8_t i;
 	int ret;
@@ -596,14 +597,20 @@
 		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
 		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
 
+		rte_memcpy(&backup, snapshot, sizeof(backup));
 		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
 		if (ret) {
+			if (SUBDEV_REMOVED(sdev, ret)) {
+				rte_memcpy(snapshot, &backup, sizeof(backup));
+				goto inc;
+			}
 			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
 				  i, ret);
 			*timestamp = 0;
 			return ret;
 		}
 		*timestamp = rte_rdtsc();
+inc:
 		failsafe_stats_increment(stats, snapshot);
 	}
 	return 0;
@@ -716,7 +723,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
 		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -735,7 +742,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
 		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -769,7 +776,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
 		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -806,7 +813,7 @@
 	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
 			      PRIu8 " with error %d", i, ret);
 			return ret;
@@ -848,7 +855,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
 		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
-		if (ret) {
+		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3c..ee81b70 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -262,6 +262,14 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
 	(ETH(s)->dev_ops->ops)
 
 /**
+ * s: (struct sub_device *)
+ * e: (int) error
+ */
+#define SUBDEV_REMOVED(s, e) \
+	(s->remove || \
+	 (((e) == -ENODEV) && (ETH(s)->data->dev_flags & RTE_ETH_DEV_INTR_RMV)))
+
+/**
  * Atomic guard
  */
 
-- 
1.8.3.1

* [PATCH 2/3] net/mlx4: adjust removal error
  2017-11-02 15:42 [PATCH 0/3] Fail-safe fix removal handling lack Matan Azrad
  2017-11-02 15:42 ` [PATCH 1/3] net/failsafe: " Matan Azrad
@ 2017-11-02 15:42 ` Matan Azrad
  2017-11-03 13:05   ` Adrien Mazarguil
  2017-11-02 15:42 ` [PATCH 3/3] net/mlx5: " Matan Azrad
  2017-12-13 14:29 ` [PATCH v2 0/4] Fail-safe fix removal handling lack Matan Azrad
  3 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-11-02 15:42 UTC (permalink / raw)
  To: Adrien Mazarguil, Gaetan Rivet; +Cc: dev

Fail-safe PMD expects to get -ENODEV error value if sub PMD control
command fails because of device removal.

Make control callbacks return with -ENODEV when the device has
disappeared.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx4/mlx4.h        |  1 +
 drivers/net/mlx4/mlx4_ethdev.c | 38 ++++++++++++++++++++++++++++++++++----
 drivers/net/mlx4/mlx4_flow.c   |  2 ++
 drivers/net/mlx4/mlx4_intr.c   |  5 ++++-
 drivers/net/mlx4/mlx4_rxq.c    |  1 +
 drivers/net/mlx4/mlx4_txq.c    |  1 +
 6 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index e0a9853..cac9654 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -149,6 +149,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
 		       struct rte_eth_fc_conf *fc_conf);
 int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
 		       struct rte_eth_fc_conf *fc_conf);
+int mlx4_removed(const struct priv *priv);
 
 /* mlx4_intr.c */
 
diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
index b0acd12..76914b0 100644
--- a/drivers/net/mlx4/mlx4_ethdev.c
+++ b/drivers/net/mlx4/mlx4_ethdev.c
@@ -312,6 +312,8 @@
 
 	ret = mlx4_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
 	if (ret < 0) {
+		if (mlx4_removed(priv))
+			ret = -ENODEV;
 		DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
 		      name, value_str, value, strerror(rte_errno));
 		return ret;
@@ -340,15 +342,19 @@
 
 	if (sock == -1) {
 		rte_errno = errno;
-		return -rte_errno;
+		goto error;
 	}
 	ret = mlx4_get_ifname(priv, &ifr->ifr_name);
 	if (!ret && ioctl(sock, req, ifr) == -1) {
 		rte_errno = errno;
-		ret = -rte_errno;
+		close(sock);
+		goto error;
 	}
 	close(sock);
 	return ret;
+error:
+	mlx4_removed(priv);
+	return -rte_errno;
 }
 
 /**
@@ -473,13 +479,17 @@
 	if (up) {
 		err = mlx4_set_flags(priv, ~IFF_UP, IFF_UP);
 		if (err)
-			return err;
+			goto error;
 	} else {
 		err = mlx4_set_flags(priv, ~IFF_UP, ~IFF_UP);
 		if (err)
-			return err;
+			goto error;
 	}
 	return 0;
+error:
+	if (mlx4_removed(priv))
+		return -ENODEV;
+	return err;
 }
 
 /**
@@ -947,6 +957,7 @@ enum rxmode_toggle {
 
 	ifr.ifr_data = (void *)&ethpause;
 	if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
+		mlx4_removed(priv);
 		ret = rte_errno;
 		WARN("ioctl(SIOCETHTOOL, ETHTOOL_GPAUSEPARAM)"
 		     " failed: %s",
@@ -1002,6 +1013,7 @@ enum rxmode_toggle {
 	else
 		ethpause.tx_pause = 0;
 	if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
+		mlx4_removed(priv);
 		ret = rte_errno;
 		WARN("ioctl(SIOCETHTOOL, ETHTOOL_SPAUSEPARAM)"
 		     " failed: %s",
@@ -1013,3 +1025,21 @@ enum rxmode_toggle {
 	assert(ret >= 0);
 	return -ret;
 }
+/**
+ * Check if mlx4 device was removed.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ *
+ * @return
+ *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
+ */
+int
+mlx4_removed(const struct priv *priv)
+{
+	struct ibv_device_attr device_attr;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return -(rte_errno = ENODEV);
+	return 0;
+}
diff --git a/drivers/net/mlx4/mlx4_flow.c b/drivers/net/mlx4/mlx4_flow.c
index 8b87b29..606c888 100644
--- a/drivers/net/mlx4/mlx4_flow.c
+++ b/drivers/net/mlx4/mlx4_flow.c
@@ -1069,6 +1069,8 @@ struct mlx4_drop {
 	err = errno;
 	msg = "flow rule rejected by device";
 error:
+	if (mlx4_removed(priv))
+		err = ENODEV;
 	return rte_flow_error_set
 		(error, err, RTE_FLOW_ERROR_TYPE_HANDLE, flow, msg);
 }
diff --git a/drivers/net/mlx4/mlx4_intr.c b/drivers/net/mlx4/mlx4_intr.c
index b17d109..0ebdb28 100644
--- a/drivers/net/mlx4/mlx4_intr.c
+++ b/drivers/net/mlx4/mlx4_intr.c
@@ -359,7 +359,10 @@
 			ret = EINVAL;
 	}
 	if (ret) {
-		rte_errno = ret;
+		if (mlx4_removed(dev->data->dev_private))
+			ret = ENODEV;
+		else
+			rte_errno = ret;
 		WARN("unable to disable interrupt on rx queue %d",
 		     idx);
 	} else {
diff --git a/drivers/net/mlx4/mlx4_rxq.c b/drivers/net/mlx4/mlx4_rxq.c
index 7fe21b6..43dad26 100644
--- a/drivers/net/mlx4/mlx4_rxq.c
+++ b/drivers/net/mlx4/mlx4_rxq.c
@@ -832,6 +832,7 @@ void mlx4_rss_detach(struct mlx4_rss *rss)
 	ret = rte_errno;
 	mlx4_rx_queue_release(rxq);
 	rte_errno = ret;
+	mlx4_removed(priv);
 	assert(rte_errno > 0);
 	return -rte_errno;
 }
diff --git a/drivers/net/mlx4/mlx4_txq.c b/drivers/net/mlx4/mlx4_txq.c
index a9c5bd2..09bdfd8 100644
--- a/drivers/net/mlx4/mlx4_txq.c
+++ b/drivers/net/mlx4/mlx4_txq.c
@@ -372,6 +372,7 @@ struct txq_mp2mr_mbuf_check_data {
 	ret = rte_errno;
 	mlx4_tx_queue_release(txq);
 	rte_errno = ret;
+	mlx4_removed(priv);
 	assert(rte_errno > 0);
 	return -rte_errno;
 }
-- 
1.8.3.1

* [PATCH 3/3] net/mlx5: adjust removal error
  2017-11-02 15:42 [PATCH 0/3] Fail-safe fix removal handling lack Matan Azrad
  2017-11-02 15:42 ` [PATCH 1/3] net/failsafe: " Matan Azrad
  2017-11-02 15:42 ` [PATCH 2/3] net/mlx4: adjust removal error Matan Azrad
@ 2017-11-02 15:42 ` Matan Azrad
  2017-11-03 13:06   ` Adrien Mazarguil
  2017-12-13 14:29 ` [PATCH v2 0/4] Fail-safe fix removal handling lack Matan Azrad
  3 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-11-02 15:42 UTC (permalink / raw)
  To: Adrien Mazarguil, Gaetan Rivet; +Cc: dev

Fail-safe PMD expects to get -ENODEV error value if sub PMD control
command fails because of device removal.

Make control callbacks return with -ENODEV when the device has
disappeared.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 39 +++++++++++++++++++++++++++++++++++----
 drivers/net/mlx5/mlx5_flow.c   |  2 ++
 drivers/net/mlx5/mlx5_rss.c    |  4 ++++
 drivers/net/mlx5/mlx5_rxq.c    | 12 ++++++++++--
 drivers/net/mlx5/mlx5_stats.c  |  6 +++++-
 drivers/net/mlx5/mlx5_txq.c    |  2 ++
 7 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e6a69b8..0dd104a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -208,6 +208,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
 int mlx5_set_link_up(struct rte_eth_dev *dev);
 void priv_dev_select_tx_function(struct priv *priv, struct rte_eth_dev *dev);
 void priv_dev_select_rx_function(struct priv *priv, struct rte_eth_dev *dev);
+int mlx5_removed(const struct priv *priv);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index c31ea4b..bf61cd6 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -394,6 +394,8 @@ struct priv *
 
 	ret = priv_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
 	if (ret == -1) {
+		if (mlx5_removed(priv))
+			errno = ENODEV;
 		DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
 		      name, value_str, value, strerror(errno));
 		return -1;
@@ -925,13 +927,17 @@ struct priv *
 {
 	struct utsname utsname;
 	int ver[3];
+	int ret;
 
 	if (uname(&utsname) == -1 ||
 	    sscanf(utsname.release, "%d.%d.%d",
 		   &ver[0], &ver[1], &ver[2]) != 3 ||
 	    KERNEL_VERSION(ver[0], ver[1], ver[2]) < KERNEL_VERSION(4, 9, 0))
-		return mlx5_link_update_unlocked_gset(dev, wait_to_complete);
-	return mlx5_link_update_unlocked_gs(dev, wait_to_complete);
+		ret = mlx5_link_update_unlocked_gset(dev, wait_to_complete);
+	ret =  mlx5_link_update_unlocked_gs(dev, wait_to_complete);
+	if (ret && mlx5_removed(mlx5_get_priv(dev)))
+		return -ENODEV;
+	return ret;
 }
 
 /**
@@ -978,6 +984,8 @@ struct priv *
 	     strerror(ret));
 	priv_unlock(priv);
 	assert(ret >= 0);
+	if (mlx5_removed(priv))
+		return -ENODEV;
 	return -ret;
 }
 
@@ -1029,6 +1037,8 @@ struct priv *
 out:
 	priv_unlock(priv);
 	assert(ret >= 0);
+	if (mlx5_removed(priv))
+		return -ENODEV;
 	return -ret;
 }
 
@@ -1083,6 +1093,8 @@ struct priv *
 out:
 	priv_unlock(priv);
 	assert(ret >= 0);
+	if (mlx5_removed(priv))
+		return -ENODEV;
 	return -ret;
 }
 
@@ -1364,13 +1376,13 @@ struct priv *
 	if (up) {
 		err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
 		if (err)
-			return err;
+			return errno == ENODEV ? -ENODEV : err;
 		priv_dev_select_tx_function(priv, dev);
 		priv_dev_select_rx_function(priv, dev);
 	} else {
 		err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
 		if (err)
-			return err;
+			return errno == ENODEV ? -ENODEV : err;
 		dev->rx_pkt_burst = removed_rx_burst;
 		dev->tx_pkt_burst = removed_tx_burst;
 	}
@@ -1474,3 +1486,22 @@ struct priv *
 		dev->rx_pkt_burst = mlx5_rx_burst;
 	}
 }
+
+/**
+ * Check if mlx5 device was removed.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ *
+ * @return
+ *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
+ */
+int
+mlx5_removed(const struct priv *priv)
+{
+	struct ibv_device_attr device_attr;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return -(rte_errno = ENODEV);
+	return 0;
+}
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 5f49bf5..448c0a3 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -3068,6 +3068,8 @@ struct rte_flow *
 		priv_lock(priv);
 		ret = priv_fdir_ctrl_func(priv, filter_op, arg);
 		priv_unlock(priv);
+		if (ret && mlx5_removed(priv))
+			ret = ENODEV;
 		break;
 	default:
 		ERROR("%p: filter type (%d) not supported",
diff --git a/drivers/net/mlx5/mlx5_rss.c b/drivers/net/mlx5/mlx5_rss.c
index f3de46d..1ad9269 100644
--- a/drivers/net/mlx5/mlx5_rss.c
+++ b/drivers/net/mlx5/mlx5_rss.c
@@ -250,6 +250,8 @@
 	priv_lock(priv);
 	ret = priv_dev_rss_reta_query(priv, reta_conf, reta_size);
 	priv_unlock(priv);
+	if (ret && mlx5_removed(priv))
+		return -ENODEV;
 	return -ret;
 }
 
@@ -282,5 +284,7 @@
 		mlx5_dev_stop(dev);
 		mlx5_dev_start(dev);
 	}
+	if (ret && mlx5_removed(priv))
+		return -ENODEV;
 	return -ret;
 }
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index a1f382b..c9a549d 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -278,6 +278,8 @@
 	(*priv->rxqs)[idx] = &rxq_ctrl->rxq;
 out:
 	priv_unlock(priv);
+	if (mlx5_removed(priv))
+		return -ENODEV;
 	return -ret;
 }
 
@@ -485,8 +487,11 @@
 	}
 exit:
 	priv_unlock(priv);
-	if (ret)
+	if (ret) {
 		WARN("unable to arm interrupt on rx queue %d", rx_queue_id);
+		if (mlx5_removed(priv))
+			return -ENODEV;
+	}
 	return -ret;
 }
 
@@ -537,9 +542,12 @@
 	if (rxq_ibv)
 		mlx5_priv_rxq_ibv_release(priv, rxq_ibv);
 	priv_unlock(priv);
-	if (ret)
+	if (ret) {
 		WARN("unable to disable interrupt on rx queue %d",
 		     rx_queue_id);
+		if (mlx5_removed(priv))
+			return -ENODEV;
+	}
 	return -ret;
 }
 
diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index 5e225d3..33b2a60 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -438,13 +438,17 @@ struct mlx5_counter_ctrl {
 		stats_n = priv_ethtool_get_stats_n(priv);
 		if (stats_n < 0) {
 			priv_unlock(priv);
-			return -1;
+			ret = -1;
+			goto error;
 		}
 		if (xstats_ctrl->stats_n != stats_n)
 			priv_xstats_init(priv);
 		ret = priv_xstats_get(priv, stats);
 		priv_unlock(priv);
 	}
+error:
+	if (ret < 0 && mlx5_removed(priv))
+		return -ENODEV;
 	return ret;
 }
 
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index fbb2630..a0101cb 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -186,6 +186,8 @@
 	(*priv->txqs)[idx] = &txq_ctrl->txq;
 out:
 	priv_unlock(priv);
+	if (mlx5_removed(priv))
+		return -ENODEV;
 	return -ret;
 }
 
-- 
1.8.3.1

* Re: [PATCH 2/3] net/mlx4: adjust removal error
  2017-11-02 15:42 ` [PATCH 2/3] net/mlx4: adjust removal error Matan Azrad
@ 2017-11-03 13:05   ` Adrien Mazarguil
  2017-11-05  6:52     ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Adrien Mazarguil @ 2017-11-03 13:05 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Gaetan Rivet, dev

On Thu, Nov 02, 2017 at 03:42:03PM +0000, Matan Azrad wrote:
> Fail-safe PMD expects to get -ENODEV error value if sub PMD control
> command fails because of device removal.
> 
> Make control callbacks return with -ENODEV when the device has
> disappeared.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

I think there are several inconsistencies regarding the places where
mlx4_removed() is used; this could lead to mistakes or redundant calls to
this function later on.

You have to choose between low-level internal functions
(e.g. mlx4_set_sysfs_ulong()) or user-facing ones from the eth_dev_ops
interface (e.g. mlx4_dev_set_link_up()), but neither intermediate functions
nor a mix of all approaches.

Standardizing on low-level functions is not practical as it means you'd have
to check for a device removal after each ibv_*() call. Therefore my
suggestion is to check it at the highest level, in all functions exposed
through mlx4_dev_ops in case of error, even innocuous ones like
mlx4_stats_get() and those returning void (rte_errno can still be set), all
in the name of consistency.

The mlx4_removed() documentation should be updated to reflect the places
it's supposed to be called as well. All this means a larger patch is
necessary.
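
To illustrate the layering suggested above, here is a rough sketch in which
only the public callback converts an error into -ENODEV, while the internal
helper stays unaware of removal. All names are hypothetical stand-ins rather
than mlx4 code: fake_errno plays the role of rte_errno, and the gone field
replaces the actual device query.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical private data; `gone` replaces the real device query. */
struct fake_priv {
	bool gone;
	int low_level_err; /* positive errno the helper fails with, or 0 */
};

static int fake_errno; /* stand-in for rte_errno */

/* Removal check, meant to be called from public callbacks only. */
static int
fake_removed(const struct fake_priv *priv)
{
	if (!priv->gone)
		return 0;
	fake_errno = ENODEV;
	return -fake_errno;
}

/* Internal helper: reports failures, never checks for removal. */
static int
fake_set_flags(const struct fake_priv *priv)
{
	if (priv->low_level_err) {
		fake_errno = priv->low_level_err;
		return -fake_errno;
	}
	return 0;
}

/* Public callback: the single place where errors become -ENODEV. */
static int
fake_dev_set_link_up(const struct fake_priv *priv)
{
	int ret = fake_set_flags(priv);

	if (ret && fake_removed(priv))
		return -ENODEV;
	return ret;
}
```

Keeping the check in one layer means each failing callback queries the device
at most once, which is the redundancy concern raised above.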

See below for coding style issues.

> ---
>  drivers/net/mlx4/mlx4.h        |  1 +
>  drivers/net/mlx4/mlx4_ethdev.c | 38 ++++++++++++++++++++++++++++++++++----
>  drivers/net/mlx4/mlx4_flow.c   |  2 ++
>  drivers/net/mlx4/mlx4_intr.c   |  5 ++++-
>  drivers/net/mlx4/mlx4_rxq.c    |  1 +
>  drivers/net/mlx4/mlx4_txq.c    |  1 +
>  6 files changed, 43 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
> index e0a9853..cac9654 100644
> --- a/drivers/net/mlx4/mlx4.h
> +++ b/drivers/net/mlx4/mlx4.h
> @@ -149,6 +149,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
>  		       struct rte_eth_fc_conf *fc_conf);
>  int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
>  		       struct rte_eth_fc_conf *fc_conf);
> +int mlx4_removed(const struct priv *priv);
>  
>  /* mlx4_intr.c */
>  
> diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
> index b0acd12..76914b0 100644
> --- a/drivers/net/mlx4/mlx4_ethdev.c
> +++ b/drivers/net/mlx4/mlx4_ethdev.c
> @@ -312,6 +312,8 @@
>  
>  	ret = mlx4_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
>  	if (ret < 0) {
> +		if (mlx4_removed(priv))
> +			ret = -ENODEV;
>  		DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
>  		      name, value_str, value, strerror(rte_errno));
>  		return ret;
> @@ -340,15 +342,19 @@
>  
>  	if (sock == -1) {
>  		rte_errno = errno;
> -		return -rte_errno;
> +		goto error;
>  	}
>  	ret = mlx4_get_ifname(priv, &ifr->ifr_name);
>  	if (!ret && ioctl(sock, req, ifr) == -1) {
>  		rte_errno = errno;
> -		ret = -rte_errno;
> +		close(sock);
> +		goto error;
>  	}
>  	close(sock);
>  	return ret;
> +error:
> +	mlx4_removed(priv);
> +	return -rte_errno;
>  }
>  
>  /**
> @@ -473,13 +479,17 @@
>  	if (up) {
>  		err = mlx4_set_flags(priv, ~IFF_UP, IFF_UP);
>  		if (err)
> -			return err;
> +			goto error;
>  	} else {
>  		err = mlx4_set_flags(priv, ~IFF_UP, ~IFF_UP);
>  		if (err)
> -			return err;
> +			goto error;
>  	}
>  	return 0;
> +error:
> +	if (mlx4_removed(priv))
> +		return -ENODEV;
> +	return err;
>  }
>  
>  /**
> @@ -947,6 +957,7 @@ enum rxmode_toggle {
>  
>  	ifr.ifr_data = (void *)&ethpause;
>  	if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
> +		mlx4_removed(priv);
>  		ret = rte_errno;
>  		WARN("ioctl(SIOCETHTOOL, ETHTOOL_GPAUSEPARAM)"
>  		     " failed: %s",
> @@ -1002,6 +1013,7 @@ enum rxmode_toggle {
>  	else
>  		ethpause.tx_pause = 0;
>  	if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
> +		mlx4_removed(priv);
>  		ret = rte_errno;
>  		WARN("ioctl(SIOCETHTOOL, ETHTOOL_SPAUSEPARAM)"
>  		     " failed: %s",
> @@ -1013,3 +1025,21 @@ enum rxmode_toggle {
>  	assert(ret >= 0);
>  	return -ret;
>  }

Missing empty line.

> +/**
> + * Check if mlx4 device was removed.

"mlx4" is somewhat redundant given the PMD name.

A separate paragraph should describe where this function is supposed to be
called.

> + *
> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
> + */
> +int
> +mlx4_removed(const struct priv *priv)
> +{
> +	struct ibv_device_attr device_attr;
> +
> +	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
> +		return -(rte_errno = ENODEV);

Although a nice shortcut, coding rules don't allow this. You have to assign
rte_errno on its own separate line. My suggestion if you want to avoid a
block would be to return 0 directly when != EIO.
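
Something along these lines, as a sketch only: sketch_errno stands in for
rte_errno, and the query_ret parameter is a hypothetical replacement for the
value returned by ibv_query_device() so that the snippet is self-contained.

```c
#include <errno.h>

static int sketch_errno; /* stand-in for rte_errno */

/*
 * query_ret is a hypothetical parameter replacing the return value of
 * ibv_query_device(priv->ctx, &device_attr).
 */
static int
sketch_removed(int query_ret)
{
	if (query_ret != EIO)
		return 0;
	sketch_errno = ENODEV; /* assignment on its own line */
	return -sketch_errno;
}
```

The early return avoids both the extra block and the assignment nested inside
the return expression.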

> +	return 0;
> +}
> diff --git a/drivers/net/mlx4/mlx4_flow.c b/drivers/net/mlx4/mlx4_flow.c
> index 8b87b29..606c888 100644
> --- a/drivers/net/mlx4/mlx4_flow.c
> +++ b/drivers/net/mlx4/mlx4_flow.c
> @@ -1069,6 +1069,8 @@ struct mlx4_drop {
>  	err = errno;
>  	msg = "flow rule rejected by device";
>  error:
> +	if (mlx4_removed(priv))
> +		err = ENODEV;
>  	return rte_flow_error_set
>  		(error, err, RTE_FLOW_ERROR_TYPE_HANDLE, flow, msg);
>  }
> diff --git a/drivers/net/mlx4/mlx4_intr.c b/drivers/net/mlx4/mlx4_intr.c
> index b17d109..0ebdb28 100644
> --- a/drivers/net/mlx4/mlx4_intr.c
> +++ b/drivers/net/mlx4/mlx4_intr.c
> @@ -359,7 +359,10 @@
>  			ret = EINVAL;
>  	}
>  	if (ret) {
> -		rte_errno = ret;
> +		if (mlx4_removed(dev->data->dev_private))
> +			ret = ENODEV;
> +		else
> +			rte_errno = ret;
>  		WARN("unable to disable interrupt on rx queue %d",
>  		     idx);
>  	} else {
> diff --git a/drivers/net/mlx4/mlx4_rxq.c b/drivers/net/mlx4/mlx4_rxq.c
> index 7fe21b6..43dad26 100644
> --- a/drivers/net/mlx4/mlx4_rxq.c
> +++ b/drivers/net/mlx4/mlx4_rxq.c
> @@ -832,6 +832,7 @@ void mlx4_rss_detach(struct mlx4_rss *rss)
>  	ret = rte_errno;
>  	mlx4_rx_queue_release(rxq);
>  	rte_errno = ret;
> +	mlx4_removed(priv);
>  	assert(rte_errno > 0);
>  	return -rte_errno;
>  }
> diff --git a/drivers/net/mlx4/mlx4_txq.c b/drivers/net/mlx4/mlx4_txq.c
> index a9c5bd2..09bdfd8 100644
> --- a/drivers/net/mlx4/mlx4_txq.c
> +++ b/drivers/net/mlx4/mlx4_txq.c
> @@ -372,6 +372,7 @@ struct txq_mp2mr_mbuf_check_data {
>  	ret = rte_errno;
>  	mlx4_tx_queue_release(txq);
>  	rte_errno = ret;
> +	mlx4_removed(priv);
>  	assert(rte_errno > 0);
>  	return -rte_errno;
>  }
> -- 
> 1.8.3.1
> 

-- 
Adrien Mazarguil
6WIND

* Re: [PATCH 3/3] net/mlx5: adjust removal error
  2017-11-02 15:42 ` [PATCH 3/3] net/mlx5: " Matan Azrad
@ 2017-11-03 13:06   ` Adrien Mazarguil
  2017-11-05  6:57     ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Adrien Mazarguil @ 2017-11-03 13:06 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Gaetan Rivet, dev

On Thu, Nov 02, 2017 at 03:42:04PM +0000, Matan Azrad wrote:
> Fail-safe PMD expects to get -ENODEV error value if sub PMD control
> command fails because of device removal.
> 
> Make control callbacks return with -ENODEV when the device has
> disappeared.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

In short, I have the same comments as on the mlx4 patch about usage
consistency; they also apply to mlx5: mlx5_removed() should only be used
by the public callbacks from struct eth_dev_ops.

There's an additional difficulty with this PMD: you need to take into
account the fact that it provides secondary process support
(mlx5_dev_sec_ops). I think secondary processes do not have any IBV context
available for mlx5_removed() to query, so it should resolve to a no-op in
this case. Make sure secondary processes do not crash, whatever happens.
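
For illustration only, the secondary-process concern can be sketched as
below. This is not the driver code: struct priv_stub, struct ctx_stub and
query_device_stub() are hypothetical stand-ins (the last one for
ibv_query_device()), and the sketch assumes that a missing context shows up
as a NULL pointer, which has not been verified against the PMD.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real definitions. */
struct ctx_stub { int gone; };
struct priv_stub { struct ctx_stub *ctx; };

/* Stand-in for ibv_query_device(): returns EIO once the device is gone. */
static int
query_device_stub(const struct ctx_stub *ctx)
{
	return ctx->gone ? EIO : 0;
}

/* Return 1 if the device was removed, 0 otherwise (including no context). */
static int
removed_sketch(const struct priv_stub *priv)
{
	if (priv->ctx == NULL)
		return 0; /* Nothing to query (assumed secondary process case). */
	if (query_device_stub(priv->ctx) != EIO)
		return 0;
	return 1;
}
```

With such a guard the check degrades to "not removed" instead of
dereferencing a context that does not exist.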

See below for coding style and other issues.

> ---
>  drivers/net/mlx5/mlx5.h        |  1 +
>  drivers/net/mlx5/mlx5_ethdev.c | 39 +++++++++++++++++++++++++++++++++++----
>  drivers/net/mlx5/mlx5_flow.c   |  2 ++
>  drivers/net/mlx5/mlx5_rss.c    |  4 ++++
>  drivers/net/mlx5/mlx5_rxq.c    | 12 ++++++++++--
>  drivers/net/mlx5/mlx5_stats.c  |  6 +++++-
>  drivers/net/mlx5/mlx5_txq.c    |  2 ++
>  7 files changed, 59 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index e6a69b8..0dd104a 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -208,6 +208,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
>  int mlx5_set_link_up(struct rte_eth_dev *dev);
>  void priv_dev_select_tx_function(struct priv *priv, struct rte_eth_dev *dev);
>  void priv_dev_select_rx_function(struct priv *priv, struct rte_eth_dev *dev);
> +int mlx5_removed(const struct priv *priv);
>  
>  /* mlx5_mac.c */
>  
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index c31ea4b..bf61cd6 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -394,6 +394,8 @@ struct priv *
>  
>  	ret = priv_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
>  	if (ret == -1) {
> +		if (mlx5_removed(priv))
> +			errno = ENODEV;
>  		DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
>  		      name, value_str, value, strerror(errno));
>  		return -1;
> @@ -925,13 +927,17 @@ struct priv *
>  {
>  	struct utsname utsname;
>  	int ver[3];
> +	int ret;
>  
>  	if (uname(&utsname) == -1 ||
>  	    sscanf(utsname.release, "%d.%d.%d",
>  		   &ver[0], &ver[1], &ver[2]) != 3 ||
>  	    KERNEL_VERSION(ver[0], ver[1], ver[2]) < KERNEL_VERSION(4, 9, 0))
> -		return mlx5_link_update_unlocked_gset(dev, wait_to_complete);
> -	return mlx5_link_update_unlocked_gs(dev, wait_to_complete);
> +		ret = mlx5_link_update_unlocked_gset(dev, wait_to_complete);
> +	ret =  mlx5_link_update_unlocked_gs(dev, wait_to_complete);

Besides the extra space after "ret =", I think this doesn't work as
intended. An "else" statement is necessary. Without it, the second call
always runs and overwrites ret.
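
The missing "else" noted above can be shown with a minimal standalone sketch
(not the driver code). link_update_legacy() and link_update_current() are
hypothetical stand-ins for mlx5_link_update_unlocked_gset() and
mlx5_link_update_unlocked_gs(); the call counters exist only to make the
behavior observable.

```c
/* Hypothetical stand-ins for the two link update helpers. */
static int legacy_calls;
static int current_calls;

static int
link_update_legacy(void)
{
	++legacy_calls;
	return 0;
}

static int
link_update_current(void)
{
	++current_calls;
	return 0;
}

/* Select exactly one helper; "else" keeps the second from always running. */
static int
link_update_sketch(int old_kernel)
{
	int ret;

	if (old_kernel)
		ret = link_update_legacy();
	else
		ret = link_update_current();
	return ret;
}
```

Each call now reaches only one of the two helpers, so ret reflects the
helper that was actually selected.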

> +	if (ret && mlx5_removed(mlx5_get_priv(dev)))
> +		return -ENODEV;
> +	return ret;
>  }
>  
>  /**
> @@ -978,6 +984,8 @@ struct priv *
>  	     strerror(ret));
>  	priv_unlock(priv);
>  	assert(ret >= 0);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -1029,6 +1037,8 @@ struct priv *
>  out:
>  	priv_unlock(priv);
>  	assert(ret >= 0);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -1083,6 +1093,8 @@ struct priv *
>  out:
>  	priv_unlock(priv);
>  	assert(ret >= 0);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -1364,13 +1376,13 @@ struct priv *
>  	if (up) {
>  		err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
>  		if (err)
> -			return err;
> +			return errno == ENODEV ? -ENODEV : err;

There is a documentation issue here, since the mlx5 PMD didn't get all the
errno consistency fixes that mlx4 got. err is documented as being -1 in case
of error, whereas priv_dev_set_link() returns a positive errno value
instead; mlx5_set_link_down/up() should return only negative errno values
but are documented as returning positive ones.

Anyway, to keep it short: currently in mlx5, priv_*() => positive errno and
the public-facing mlx5_*() => negative errno, hence you should return a
positive ENODEV here.

You could avoid this mess by patching the public callbacks only and not
internal functions like this one.
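
As a rough sketch of that sign convention (standalone, with hypothetical
names; none of these functions exist in the PMD), the internal helper
returns a positive errno, and only the public callback negates it and
substitutes ENODEV when the device is gone:

```c
#include <errno.h>

/* Hypothetical internal helper: positive errno on failure, 0 on success. */
static int
priv_op_sketch(int fail)
{
	return fail ? EIO : 0;
}

/* Hypothetical removal probe; simply echoes the flag it is given. */
static int
is_removed_sketch(int removed)
{
	return removed;
}

/* Public callback: override with ENODEV on removal, then negate once. */
static int
public_op_sketch(int fail, int removed)
{
	int ret = priv_op_sketch(fail);

	if (ret && is_removed_sketch(removed))
		ret = ENODEV;
	return -ret;
}
```

Keeping the removal check in the public callback means the internal helper
never has to know which sign its caller expects.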

>  		priv_dev_select_tx_function(priv, dev);
>  		priv_dev_select_rx_function(priv, dev);
>  	} else {
>  		err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
>  		if (err)
> -			return err;
> +			return errno == ENODEV ? -ENODEV : err;

Same here.

>  		dev->rx_pkt_burst = removed_rx_burst;
>  		dev->tx_pkt_burst = removed_tx_burst;
>  	}
> @@ -1474,3 +1486,22 @@ struct priv *
>  		dev->rx_pkt_burst = mlx5_rx_burst;
>  	}
>  }
> +
> +/**
> + * Check if mlx5 device was removed.
> + *

"mlx5" is redundant.

As with mlx4, a short paragraph should describe where this function is
supposed to be used.

> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
> + */
> +int
> +mlx5_removed(const struct priv *priv)
> +{
> +	struct ibv_device_attr device_attr;
> +
> +	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
> +		return -(rte_errno = ENODEV);

Coding rules prohibit this kind of assignment; see the mlx4 comments.
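
One possible shape that avoids assigning inside the return expression is
sketched below. It is standalone and hedged: rte_errno_sketch stands in for
rte_errno (a per-lcore variable in DPDK) and query_sketch() for
ibv_query_device(); both names are made up for this example.

```c
#include <errno.h>

/* Stand-in for rte_errno, which in DPDK is a per-lcore variable. */
static int rte_errno_sketch;

/* Stand-in for ibv_query_device(); returns EIO when "gone" is nonzero. */
static int
query_sketch(int gone)
{
	return gone ? EIO : 0;
}

/* Return -ENODEV and set the errno stand-in on removal, otherwise 0. */
static int
removed_check_sketch(int gone)
{
	if (query_sketch(gone) != EIO)
		return 0;
	rte_errno_sketch = ENODEV; /* Assignment on its own line. */
	return -rte_errno_sketch;
}
```

Returning early when the result is not EIO keeps the function free of an
extra block, as suggested in the mlx4 review.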

> +	return 0;
> +}
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 5f49bf5..448c0a3 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -3068,6 +3068,8 @@ struct rte_flow *
>  		priv_lock(priv);
>  		ret = priv_fdir_ctrl_func(priv, filter_op, arg);
>  		priv_unlock(priv);
> +		if (ret && mlx5_removed(priv))
> +			ret = ENODEV;
>  		break;
>  	default:
>  		ERROR("%p: filter type (%d) not supported",
> diff --git a/drivers/net/mlx5/mlx5_rss.c b/drivers/net/mlx5/mlx5_rss.c
> index f3de46d..1ad9269 100644
> --- a/drivers/net/mlx5/mlx5_rss.c
> +++ b/drivers/net/mlx5/mlx5_rss.c
> @@ -250,6 +250,8 @@
>  	priv_lock(priv);
>  	ret = priv_dev_rss_reta_query(priv, reta_conf, reta_size);
>  	priv_unlock(priv);
> +	if (ret && mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -282,5 +284,7 @@
>  		mlx5_dev_stop(dev);
>  		mlx5_dev_start(dev);
>  	}
> +	if (ret && mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index a1f382b..c9a549d 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -278,6 +278,8 @@
>  	(*priv->rxqs)[idx] = &rxq_ctrl->rxq;
>  out:
>  	priv_unlock(priv);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -485,8 +487,11 @@
>  	}
>  exit:
>  	priv_unlock(priv);
> -	if (ret)
> +	if (ret) {
>  		WARN("unable to arm interrupt on rx queue %d", rx_queue_id);
> +		if (mlx5_removed(priv))
> +			return -ENODEV;
> +	}
>  	return -ret;
>  }
>  
> @@ -537,9 +542,12 @@
>  	if (rxq_ibv)
>  		mlx5_priv_rxq_ibv_release(priv, rxq_ibv);
>  	priv_unlock(priv);
> -	if (ret)
> +	if (ret) {
>  		WARN("unable to disable interrupt on rx queue %d",
>  		     rx_queue_id);
> +		if (mlx5_removed(priv))
> +			return -ENODEV;
> +	}
>  	return -ret;
>  }
>  
> diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
> index 5e225d3..33b2a60 100644
> --- a/drivers/net/mlx5/mlx5_stats.c
> +++ b/drivers/net/mlx5/mlx5_stats.c
> @@ -438,13 +438,17 @@ struct mlx5_counter_ctrl {
>  		stats_n = priv_ethtool_get_stats_n(priv);
>  		if (stats_n < 0) {
>  			priv_unlock(priv);
> -			return -1;
> +			ret = -1;
> +			goto error;
>  		}
>  		if (xstats_ctrl->stats_n != stats_n)
>  			priv_xstats_init(priv);
>  		ret = priv_xstats_get(priv, stats);
>  		priv_unlock(priv);
>  	}
> +error:
> +	if (ret < 0 && mlx5_removed(priv))
> +		return -ENODEV;
>  	return ret;
>  }
>  
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index fbb2630..a0101cb 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -186,6 +186,8 @@
>  	(*priv->txqs)[idx] = &txq_ctrl->txq;
>  out:
>  	priv_unlock(priv);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> -- 
> 1.8.3.1
> 

-- 
Adrien Mazarguil
6WIND


* Re: [PATCH 2/3] net/mlx4: adjust removal error
  2017-11-03 13:05   ` Adrien Mazarguil
@ 2017-11-05  6:52     ` Matan Azrad
  2017-11-06 16:51       ` Adrien Mazarguil
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-11-05  6:52 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Gaetan Rivet, dev

Hi Adrien,
Thanks for the review :)

Please see my comments below.

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Friday, November 3, 2017 3:06 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Gaetan Rivet <gaetan.rivet@6wind.com>; dev@dpdk.org
> Subject: Re: [PATCH 2/3] net/mlx4: adjust removal error
> 
> On Thu, Nov 02, 2017 at 03:42:03PM +0000, Matan Azrad wrote:
> > Fail-safe PMD expects to get -ENODEV error value if sub PMD control
> > command fails because of device removal.
> >
> > Make control callbacks return with -ENODEV when the device has
> > disappeared.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> 
> I think there are several inconsistencies regarding the places where
> mlx4_removed() is used, this could lead to mistakes or redundant calls to this
> function later on.
> 
> You have to choose between low-level internal functions (e.g.
> mlx4_set_sysfs_ulong()) or user-facing ones from the eth_dev_ops
> interface (e.g. mlx4_dev_set_link_up()), but neither intermediate functions
> nor a mix of all approaches.

You are touching on exactly one of my design considerations:
either always use "low"-level error adjustments or always use high-level adjustments.
The high-level approach reuses less code but is simpler to maintain (as you said).
I decided to combine the two approaches while never going down to the lowest-level code (ibv, pipes).
Adding the check in mlx4_dev_set_link() can replace two checks: in mlx4_dev_set_link_up() and mlx4_dev_set_link_down().
Adding the check in mlx4_flow_toggle() can replace many checks: all flow callbacks and also mlx4_mac_addr_add() and mlx4_mac_addr_set().
You are right regarding mlx4_set_sysfs_ulong(); it can be replaced by a check in mlx4_mtu_set(). Will fix it in V2.
You are right regarding mlx4_ifreq(); it can be replaced by a check in mlx4_link_update(). Will fix it in V2.

I can understand the consistency approach, but I think keeping the checks for the above two cases in lower-level functions is harmless and reuses code.
What do you think?

> 
> Standardizing on low-level functions is not practical as it means you'd have to
> check for a device removal after each ibv_*() call. Therefore my suggestion is
> to check it at the highest level, in all functions exposed through
> mlx4_dev_ops in case of error, even innocuous ones like
> mlx4_stats_get() and those returning void (rte_errno can still be set), all in
> the name of consistency.
> 

If everything is OK with the callback (even in a removal case), why set rte_errno?
Specifically, mlx4_stats_get() has no error flow, and we don't want it to return an error in case of removal, since we can provide stats even after removal (SW counters); this is a good "feature" for the fail-safe process of saving stats on plug-out.

> The mlx4_removed() documentation should be updated to reflect the places
> it's supposed to be called as well. All this means a larger patch is necessary.
> 

Do you mean documentation in the code (comments), the mlx4 docs, or maybe both?

> See below for coding style issues.
> 
> > ---
> >  drivers/net/mlx4/mlx4.h        |  1 +
> >  drivers/net/mlx4/mlx4_ethdev.c | 38
> ++++++++++++++++++++++++++++++++++----
> >  drivers/net/mlx4/mlx4_flow.c   |  2 ++
> >  drivers/net/mlx4/mlx4_intr.c   |  5 ++++-
> >  drivers/net/mlx4/mlx4_rxq.c    |  1 +
> >  drivers/net/mlx4/mlx4_txq.c    |  1 +
> >  6 files changed, 43 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h index
> > e0a9853..cac9654 100644
> > --- a/drivers/net/mlx4/mlx4.h
> > +++ b/drivers/net/mlx4/mlx4.h
> > @@ -149,6 +149,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
> >  		       struct rte_eth_fc_conf *fc_conf);  int
> > mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
> >  		       struct rte_eth_fc_conf *fc_conf);
> > +int mlx4_removed(const struct priv *priv);
> >
> >  /* mlx4_intr.c */
> >
> > diff --git a/drivers/net/mlx4/mlx4_ethdev.c
> > b/drivers/net/mlx4/mlx4_ethdev.c index b0acd12..76914b0 100644
> > --- a/drivers/net/mlx4/mlx4_ethdev.c
> > +++ b/drivers/net/mlx4/mlx4_ethdev.c
> > @@ -312,6 +312,8 @@
> >
> >  	ret = mlx4_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
> >  	if (ret < 0) {
> > +		if (mlx4_removed(priv))
> > +			ret = -ENODEV;
> >  		DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
> >  		      name, value_str, value, strerror(rte_errno));
> >  		return ret;
> > @@ -340,15 +342,19 @@
> >
> >  	if (sock == -1) {
> >  		rte_errno = errno;
> > -		return -rte_errno;
> > +		goto error;
> >  	}
> >  	ret = mlx4_get_ifname(priv, &ifr->ifr_name);
> >  	if (!ret && ioctl(sock, req, ifr) == -1) {
> >  		rte_errno = errno;
> > -		ret = -rte_errno;
> > +		close(sock);
> > +		goto error;
> >  	}
> >  	close(sock);
> >  	return ret;
> > +error:
> > +	mlx4_removed(priv);
> > +	return -rte_errno;
> >  }
> >
> >  /**
> > @@ -473,13 +479,17 @@
> >  	if (up) {
> >  		err = mlx4_set_flags(priv, ~IFF_UP, IFF_UP);
> >  		if (err)
> > -			return err;
> > +			goto error;
> >  	} else {
> >  		err = mlx4_set_flags(priv, ~IFF_UP, ~IFF_UP);
> >  		if (err)
> > -			return err;
> > +			goto error;
> >  	}
> >  	return 0;
> > +error:
> > +	if (mlx4_removed(priv))
> > +		return -ENODEV;
> > +	return err;
> >  }
> >
> >  /**
> > @@ -947,6 +957,7 @@ enum rxmode_toggle {
> >
> >  	ifr.ifr_data = (void *)&ethpause;
> >  	if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
> > +		mlx4_removed(priv);
> >  		ret = rte_errno;
> >  		WARN("ioctl(SIOCETHTOOL, ETHTOOL_GPAUSEPARAM)"
> >  		     " failed: %s",
> > @@ -1002,6 +1013,7 @@ enum rxmode_toggle {
> >  	else
> >  		ethpause.tx_pause = 0;
> >  	if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
> > +		mlx4_removed(priv);
> >  		ret = rte_errno;
> >  		WARN("ioctl(SIOCETHTOOL, ETHTOOL_SPAUSEPARAM)"
> >  		     " failed: %s",
> > @@ -1013,3 +1025,21 @@ enum rxmode_toggle {
> >  	assert(ret >= 0);
> >  	return -ret;
> >  }
> 
> Missing empty line.
> 
OK.

> > +/**
> > + * Check if mlx4 device was removed.
> 
> "mlx4" is a somewhat redundant given PMD name.
> 
> A separate paragraph should describe where this function is supposed to be
> called.
> 
OK.

> > + *
> > + * @param priv
> > + *   Pointer to private structure.
> > + *
> > + * @return
> > + *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
> > + */
> > +int
> > +mlx4_removed(const struct priv *priv) {
> > +	struct ibv_device_attr device_attr;
> > +
> > +	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
> > +		return -(rte_errno = ENODEV);
> 
> Although a nice shortcut, coding rules don't allow this. You have to assign
> rte_errno on its own separate line. My suggestion if you want to avoid a
> block would be to return 0 directly when != EIO.
> 

Can you point me to the documentation of this coding rule?

> > +	return 0;
> > +}
> > diff --git a/drivers/net/mlx4/mlx4_flow.c
> > b/drivers/net/mlx4/mlx4_flow.c index 8b87b29..606c888 100644
> > --- a/drivers/net/mlx4/mlx4_flow.c
> > +++ b/drivers/net/mlx4/mlx4_flow.c
> > @@ -1069,6 +1069,8 @@ struct mlx4_drop {
> >  	err = errno;
> >  	msg = "flow rule rejected by device";
> >  error:
> > +	if (mlx4_removed(priv))
> > +		err = ENODEV;
> >  	return rte_flow_error_set
> >  		(error, err, RTE_FLOW_ERROR_TYPE_HANDLE, flow, msg);  }
> diff --git
> > a/drivers/net/mlx4/mlx4_intr.c b/drivers/net/mlx4/mlx4_intr.c index
> > b17d109..0ebdb28 100644
> > --- a/drivers/net/mlx4/mlx4_intr.c
> > +++ b/drivers/net/mlx4/mlx4_intr.c
> > @@ -359,7 +359,10 @@
> >  			ret = EINVAL;
> >  	}
> >  	if (ret) {
> > -		rte_errno = ret;
> > +		if (mlx4_removed(dev->data->dev_private))
> > +			ret = ENODEV;
> > +		else
> > +			rte_errno = ret;
> >  		WARN("unable to disable interrupt on rx queue %d",
> >  		     idx);
> >  	} else {
> > diff --git a/drivers/net/mlx4/mlx4_rxq.c b/drivers/net/mlx4/mlx4_rxq.c
> > index 7fe21b6..43dad26 100644
> > --- a/drivers/net/mlx4/mlx4_rxq.c
> > +++ b/drivers/net/mlx4/mlx4_rxq.c
> > @@ -832,6 +832,7 @@ void mlx4_rss_detach(struct mlx4_rss *rss)
> >  	ret = rte_errno;
> >  	mlx4_rx_queue_release(rxq);
> >  	rte_errno = ret;
> > +	mlx4_removed(priv);
> >  	assert(rte_errno > 0);
> >  	return -rte_errno;
> >  }
> > diff --git a/drivers/net/mlx4/mlx4_txq.c b/drivers/net/mlx4/mlx4_txq.c
> > index a9c5bd2..09bdfd8 100644
> > --- a/drivers/net/mlx4/mlx4_txq.c
> > +++ b/drivers/net/mlx4/mlx4_txq.c
> > @@ -372,6 +372,7 @@ struct txq_mp2mr_mbuf_check_data {
> >  	ret = rte_errno;
> >  	mlx4_tx_queue_release(txq);
> >  	rte_errno = ret;
> > +	mlx4_removed(priv);
> >  	assert(rte_errno > 0);
> >  	return -rte_errno;
> >  }
> > --
> > 1.8.3.1
> >
> 
> --
> Adrien Mazarguil
> 6WIND


* Re: [PATCH 3/3] net/mlx5: adjust removal error
  2017-11-03 13:06   ` Adrien Mazarguil
@ 2017-11-05  6:57     ` Matan Azrad
  0 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-11-05  6:57 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Gaetan Rivet, dev

Hi Adrien,
Thanks for this too.

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Friday, November 3, 2017 3:06 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Gaetan Rivet <gaetan.rivet@6wind.com>; dev@dpdk.org
> Subject: Re: [PATCH 3/3] net/mlx5: adjust removal error
> 
> On Thu, Nov 02, 2017 at 03:42:04PM +0000, Matan Azrad wrote:
> > Fail-safe PMD expects to get -ENODEV error value if sub PMD control
> > command fails because of device removal.
> >
> > Make control callbacks return with -ENODEV when the device has
> > disappeared.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> 
> In short I have the same comments as on the mlx4 patch about usage
> consistency, this also applies to mlx5; mlx5_removed() should be only used
> by the public callbacks from struct eth_dev_ops.
> 
> There's an additional difficulty with this PMD, you need to take into account
> the fact it provides secondary process support (mlx5_dev_sec_ops).
> I think secondary processes do not have any IBV context available for
> mlx5_removed() to query, which should resolve to a no-op in this case.
> Make sure secondary processes do not crash whatever happens.
> 
Will check it, thanks!

> See below for coding style and other issues.
> 
> > ---
> >  drivers/net/mlx5/mlx5.h        |  1 +
> >  drivers/net/mlx5/mlx5_ethdev.c | 39
> +++++++++++++++++++++++++++++++++++----
> >  drivers/net/mlx5/mlx5_flow.c   |  2 ++
> >  drivers/net/mlx5/mlx5_rss.c    |  4 ++++
> >  drivers/net/mlx5/mlx5_rxq.c    | 12 ++++++++++--
> >  drivers/net/mlx5/mlx5_stats.c  |  6 +++++-
> >  drivers/net/mlx5/mlx5_txq.c    |  2 ++
> >  7 files changed, 59 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > e6a69b8..0dd104a 100644
> > --- a/drivers/net/mlx5/mlx5.h
> > +++ b/drivers/net/mlx5/mlx5.h
> > @@ -208,6 +208,7 @@ int mlx5_ibv_device_to_pci_addr(const struct
> > ibv_device *,  int mlx5_set_link_up(struct rte_eth_dev *dev);  void
> > priv_dev_select_tx_function(struct priv *priv, struct rte_eth_dev
> > *dev);  void priv_dev_select_rx_function(struct priv *priv, struct
> > rte_eth_dev *dev);
> > +int mlx5_removed(const struct priv *priv);
> >
> >  /* mlx5_mac.c */
> >
> > diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> > b/drivers/net/mlx5/mlx5_ethdev.c index c31ea4b..bf61cd6 100644
> > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > @@ -394,6 +394,8 @@ struct priv *
> >
> >  	ret = priv_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
> >  	if (ret == -1) {
> > +		if (mlx5_removed(priv))
> > +			errno = ENODEV;
> >  		DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
> >  		      name, value_str, value, strerror(errno));
> >  		return -1;
> > @@ -925,13 +927,17 @@ struct priv *
> >  {
> >  	struct utsname utsname;
> >  	int ver[3];
> > +	int ret;
> >
> >  	if (uname(&utsname) == -1 ||
> >  	    sscanf(utsname.release, "%d.%d.%d",
> >  		   &ver[0], &ver[1], &ver[2]) != 3 ||
> >  	    KERNEL_VERSION(ver[0], ver[1], ver[2]) < KERNEL_VERSION(4, 9,
> 0))
> > -		return mlx5_link_update_unlocked_gset(dev,
> wait_to_complete);
> > -	return mlx5_link_update_unlocked_gs(dev, wait_to_complete);
> > +		ret = mlx5_link_update_unlocked_gset(dev,
> wait_to_complete);
> > +	ret =  mlx5_link_update_unlocked_gs(dev, wait_to_complete);
> 
> Besides the extra space after "ret =", I think this doesn't work as intended. An
> "else" statement is necessary.
> 
Will fix it, thanks!

> > +	if (ret && mlx5_removed(mlx5_get_priv(dev)))
> > +		return -ENODEV;
> > +	return ret;
> >  }
> >
> >  /**
> > @@ -978,6 +984,8 @@ struct priv *
> >  	     strerror(ret));
> >  	priv_unlock(priv);
> >  	assert(ret >= 0);
> > +	if (mlx5_removed(priv))
> > +		return -ENODEV;
> >  	return -ret;
> >  }
> >
> > @@ -1029,6 +1037,8 @@ struct priv *
> >  out:
> >  	priv_unlock(priv);
> >  	assert(ret >= 0);
> > +	if (mlx5_removed(priv))
> > +		return -ENODEV;
> >  	return -ret;
> >  }
> >
> > @@ -1083,6 +1093,8 @@ struct priv *
> >  out:
> >  	priv_unlock(priv);
> >  	assert(ret >= 0);
> > +	if (mlx5_removed(priv))
> > +		return -ENODEV;
> >  	return -ret;
> >  }
> >
> > @@ -1364,13 +1376,13 @@ struct priv *
> >  	if (up) {
> >  		err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
> >  		if (err)
> > -			return err;
> > +			return errno == ENODEV ? -ENODEV : err;
> 
> There is a documentation issue here since the mlx5 PMD didn't get all the
> errno consistency fixes that mlx4 got, however err is documented as being -1
> in case of error, whereas priv_dev_set_link() returns a positive errno value
> instead and mlx5_set_link_down/up() should return only negative errno
> values but are documented as returning positive ones.
> 
> Anyway to keep it short: currently in mlx5, priv_*() => positive errno and the
> public-facing mlx5_*() => negative errno, hence you should return a positive
> ENODEV here.
> 

Ok.

> You could avoid this mess by patching the public callbacks only and not
> internal functions like this one.
> 
> >  		priv_dev_select_tx_function(priv, dev);
> >  		priv_dev_select_rx_function(priv, dev);
> >  	} else {
> >  		err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
> >  		if (err)
> > -			return err;
> > +			return errno == ENODEV ? -ENODEV : err;
> 
> Same here.
> 
> >  		dev->rx_pkt_burst = removed_rx_burst;
> >  		dev->tx_pkt_burst = removed_tx_burst;
> >  	}
> > @@ -1474,3 +1486,22 @@ struct priv *
> >  		dev->rx_pkt_burst = mlx5_rx_burst;
> >  	}
> >  }
> > +
> > +/**
> > + * Check if mlx5 device was removed.
> > + *
> 
> "mlx5" is redundant.
> 
> As with mlx4, a short paragraph should describe where this function is
> supposed to be used.
> 
> > + * @param priv
> > + *   Pointer to private structure.
> > + *
> > + * @return
> > + *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
> > + */
> > +int
> > +mlx5_removed(const struct priv *priv) {
> > +	struct ibv_device_attr device_attr;
> > +
> > +	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
> > +		return -(rte_errno = ENODEV);
> 
> Coding rules prohibit this kind of assignment; see the mlx4 comments.
> 
> > +	return 0;
> > +}
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 5f49bf5..448c0a3 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -3068,6 +3068,8 @@ struct rte_flow *
> >  		priv_lock(priv);
> >  		ret = priv_fdir_ctrl_func(priv, filter_op, arg);
> >  		priv_unlock(priv);
> > +		if (ret && mlx5_removed(priv))
> > +			ret = ENODEV;
> >  		break;
> >  	default:
> >  		ERROR("%p: filter type (%d) not supported", diff --git
> > a/drivers/net/mlx5/mlx5_rss.c b/drivers/net/mlx5/mlx5_rss.c index
> > f3de46d..1ad9269 100644
> > --- a/drivers/net/mlx5/mlx5_rss.c
> > +++ b/drivers/net/mlx5/mlx5_rss.c
> > @@ -250,6 +250,8 @@
> >  	priv_lock(priv);
> >  	ret = priv_dev_rss_reta_query(priv, reta_conf, reta_size);
> >  	priv_unlock(priv);
> > +	if (ret && mlx5_removed(priv))
> > +		return -ENODEV;
> >  	return -ret;
> >  }
> >
> > @@ -282,5 +284,7 @@
> >  		mlx5_dev_stop(dev);
> >  		mlx5_dev_start(dev);
> >  	}
> > +	if (ret && mlx5_removed(priv))
> > +		return -ENODEV;
> >  	return -ret;
> >  }
> > diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> > index a1f382b..c9a549d 100644
> > --- a/drivers/net/mlx5/mlx5_rxq.c
> > +++ b/drivers/net/mlx5/mlx5_rxq.c
> > @@ -278,6 +278,8 @@
> >  	(*priv->rxqs)[idx] = &rxq_ctrl->rxq;
> >  out:
> >  	priv_unlock(priv);
> > +	if (mlx5_removed(priv))
> > +		return -ENODEV;
> >  	return -ret;
> >  }
> >
> > @@ -485,8 +487,11 @@
> >  	}
> >  exit:
> >  	priv_unlock(priv);
> > -	if (ret)
> > +	if (ret) {
> >  		WARN("unable to arm interrupt on rx queue %d",
> rx_queue_id);
> > +		if (mlx5_removed(priv))
> > +			return -ENODEV;
> > +	}
> >  	return -ret;
> >  }
> >
> > @@ -537,9 +542,12 @@
> >  	if (rxq_ibv)
> >  		mlx5_priv_rxq_ibv_release(priv, rxq_ibv);
> >  	priv_unlock(priv);
> > -	if (ret)
> > +	if (ret) {
> >  		WARN("unable to disable interrupt on rx queue %d",
> >  		     rx_queue_id);
> > +		if (mlx5_removed(priv))
> > +			return -ENODEV;
> > +	}
> >  	return -ret;
> >  }
> >
> > diff --git a/drivers/net/mlx5/mlx5_stats.c
> > b/drivers/net/mlx5/mlx5_stats.c index 5e225d3..33b2a60 100644
> > --- a/drivers/net/mlx5/mlx5_stats.c
> > +++ b/drivers/net/mlx5/mlx5_stats.c
> > @@ -438,13 +438,17 @@ struct mlx5_counter_ctrl {
> >  		stats_n = priv_ethtool_get_stats_n(priv);
> >  		if (stats_n < 0) {
> >  			priv_unlock(priv);
> > -			return -1;
> > +			ret = -1;
> > +			goto error;
> >  		}
> >  		if (xstats_ctrl->stats_n != stats_n)
> >  			priv_xstats_init(priv);
> >  		ret = priv_xstats_get(priv, stats);
> >  		priv_unlock(priv);
> >  	}
> > +error:
> > +	if (ret < 0 && mlx5_removed(priv))
> > +		return -ENODEV;
> >  	return ret;
> >  }
> >
> > diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> > index fbb2630..a0101cb 100644
> > --- a/drivers/net/mlx5/mlx5_txq.c
> > +++ b/drivers/net/mlx5/mlx5_txq.c
> > @@ -186,6 +186,8 @@
> >  	(*priv->txqs)[idx] = &txq_ctrl->txq;
> >  out:
> >  	priv_unlock(priv);
> > +	if (mlx5_removed(priv))
> > +		return -ENODEV;
> >  	return -ret;
> >  }
> >
> > --
> > 1.8.3.1
> >
> 
> --
> Adrien Mazarguil
> 6WIND


* Re: [PATCH 1/3] net/failsafe: fix removal handling lack
  2017-11-02 15:42 ` [PATCH 1/3] net/failsafe: " Matan Azrad
@ 2017-11-06  8:19   ` Gaëtan Rivet
  0 siblings, 0 replies; 98+ messages in thread
From: Gaëtan Rivet @ 2017-11-06  8:19 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, dev, stable

Hello Matan,

On Thu, Nov 02, 2017 at 03:42:02PM +0000, Matan Azrad wrote:
> There is time between the physical removal of the device until
> sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> applications still don't know about the removal and may call sub-device
> control operation which should return an error.
> 
> In previous code this error is reported to the application contrary to
> fail-safe principle that the app should not be aware of device removal.
> 
> Define a removal error that each sub-device PMD should return in case
> of an error caused by removal event; The special error is -ENODEV.
> 
> Add an error check in each relevant control command error flow and
> prevent an error report to application when its value is -ENODEV.
> 
> Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> Fixes: b737a1e ("net/failsafe: support flow API")
> Cc: stable@dpdk.org
> 

This is not a fix.

Backporting this to stable would be useless without the related
mlx4 and mlx5 changes. The related mlx4 and mlx5 patches are themselves
not marked as fixes and won't be backported.

> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  doc/guides/nics/fail_safe.rst                   |  7 +++++++
>  doc/guides/prog_guide/env_abstraction_layer.rst |  3 +++
>  drivers/net/failsafe/failsafe_flow.c            | 16 +++++++++------
>  drivers/net/failsafe/failsafe_ops.c             | 27 ++++++++++++++++---------
>  drivers/net/failsafe/failsafe_private.h         |  8 ++++++++
>  5 files changed, 45 insertions(+), 16 deletions(-)
> 
> diff --git a/doc/guides/nics/fail_safe.rst b/doc/guides/nics/fail_safe.rst
> index c4e3d2e..5023fc4 100644
> --- a/doc/guides/nics/fail_safe.rst
> +++ b/doc/guides/nics/fail_safe.rst
> @@ -193,6 +193,13 @@ any time. The fail-safe PMD will register a callback for such event and react
>  accordingly. It will try to safely stop, close and uninit the sub-device having
>  emitted this event, allowing it to free its eventual resources.
>  
> +When fail-safe PMD gets -ENODEV error from control command sent to removable
> +sub-devices, it assumes that the error reason is device removal. In this case
> +fail-safe returns success value to application. The PMD controlling the
> +sub-device is still responsible to emit a removal event (RMV) in addition to
> +returning -ENODEV from control operations after the device has been physically
> +removed. Only the reception of this event unregisters it on the fail-safe side.
> +
>  Fail-safe glossary
>  ------------------
>  
> diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
> index 4775eb3..bd2fd87 100644
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> @@ -213,6 +213,9 @@ device having emitted a Device Removal Event. In such case, calling
>  callback. Care must be taken not to close the device from the interrupt handler
>  context. It is necessary to reschedule such closing operation.
>  
> +Unsuccessful control operations (for those that return errors) may return
> +-ENODEV after the device is physically unplugged.
> +

I think I should be neither ack-ing nor nack-ing this change.
Could you propose it on its own, so that people ignoring fail-safe
related matters could look into it as well?

>  Blacklisting
>  ~~~~~~~~~~~~
>  
> diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
> index 153ceee..ce9b769 100644
> --- a/drivers/net/failsafe/failsafe_flow.c
> +++ b/drivers/net/failsafe/failsafe_flow.c
> @@ -87,7 +87,7 @@
>  		DEBUG("Calling rte_flow_validate on sub_device %d", i);
>  		ret = rte_flow_validate(PORT_ID(sdev),
>  				attr, patterns, actions, error);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {

Here and for subsequent checks, there should be an explicit check
against zero instead of using unary !.
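
As an illustration of the explicit comparison (a standalone sketch only):
SUBDEV_REMOVED_SKETCH() is a hypothetical simplification, since the real
SUBDEV_REMOVED() definition from failsafe_private.h is not reproduced here,
and the "removable" flag is an assumed stand-in for the sub-device state.

```c
#include <errno.h>

/* Hypothetical simplification of the macro added by the patch. */
#define SUBDEV_REMOVED_SKETCH(removable, err) \
	((removable) != 0 && (err) == -ENODEV)

/* Return 1 when the error must be reported to the application. */
static int
must_report_sketch(int removable, int ret)
{
	return ret != 0 && SUBDEV_REMOVED_SKETCH(removable, ret) == 0;
}
```

The comparison against zero states the intent ("not a removal") without
relying on the reader to parse a negated macro call.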

>  			ERROR("Operation rte_flow_validate failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -111,7 +111,8 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
>  				attr, patterns, actions, error);
> -		if (flow->flows[i] == NULL) {
> +		if (flow->flows[i] == NULL &&
> +			!SUBDEV_REMOVED(sdev, -rte_errno)) {
>  			ERROR("Failed to create flow on sub_device %d",
>  				i);
>  			goto err;
> @@ -150,7 +151,7 @@
>  			continue;
>  		local_ret = rte_flow_destroy(PORT_ID(sdev),
>  				flow->flows[i], error);
> -		if (local_ret) {
> +		if (local_ret && !SUBDEV_REMOVED(sdev, local_ret)) {
>  			ERROR("Failed to destroy flow on sub_device %d: %d",
>  					i, local_ret);
>  			if (ret == 0)
> @@ -175,7 +176,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_flow_flush on sub_device %d", i);
>  		ret = rte_flow_flush(PORT_ID(sdev), error);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_flow_flush failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -199,8 +200,11 @@
>  
>  	sdev = TX_SUBDEV(dev);
>  	if (sdev != NULL) {
> -		return rte_flow_query(PORT_ID(sdev),
> +		int ret = rte_flow_query(PORT_ID(sdev),
>  				flow->flows[SUB_ID(sdev)], type, arg, error);
> +
> +		if (!SUBDEV_REMOVED(sdev, ret))
> +			return ret;
>  	}
>  	WARN("No active sub_device to query about its flow");
>  	return -1;
> @@ -223,7 +227,7 @@
>  			WARN("flow isolation mode of sub_device %d in incoherent state.",
>  				i);
>  		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_flow_isolate failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
> index f460551..cc7ab7f 100644
> --- a/drivers/net/failsafe/failsafe_ops.c
> +++ b/drivers/net/failsafe/failsafe_ops.c
> @@ -314,7 +314,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
>  		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -333,7 +333,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
>  		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -418,7 +418,7 @@
>  				rx_queue_id,
>  				nb_rx_desc, socket_id,
>  				rx_conf, mb_pool);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("RX queue setup failed for sub_device %d", i);
>  			goto free_rxq;
>  		}
> @@ -484,7 +484,7 @@
>  				tx_queue_id,
>  				nb_tx_desc, socket_id,
>  				tx_conf);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("TX queue setup failed for sub_device %d", i);
>  			goto free_txq;
>  		}
> @@ -563,7 +563,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling link_update on sub_device %d", i);
>  		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
> -		if (ret && ret != -1) {
> +		if (ret && ret != -1  && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Link update failed for sub_device %d with error %d",
>  			      i, ret);
>  			return ret;
> @@ -587,6 +587,7 @@
>  fs_stats_get(struct rte_eth_dev *dev,
>  	     struct rte_eth_stats *stats)
>  {
> +	struct rte_eth_stats backup;
>  	struct sub_device *sdev;
>  	uint8_t i;
>  	int ret;
> @@ -596,14 +597,20 @@
>  		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
>  		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
>  
> +		rte_memcpy(&backup, snapshot, sizeof(backup));
>  		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
>  		if (ret) {
> +			if (SUBDEV_REMOVED(sdev, ret)) {
> +				rte_memcpy(snapshot, &backup, sizeof(backup));
> +				goto inc;
> +			}
>  			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
>  				  i, ret);
>  			*timestamp = 0;
>  			return ret;
>  		}
>  		*timestamp = rte_rdtsc();
> +inc:
>  		failsafe_stats_increment(stats, snapshot);
>  	}
>  	return 0;
> @@ -716,7 +723,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
>  		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
>  			      i, ret);
>  			return ret;
> @@ -735,7 +742,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
>  		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -769,7 +776,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
>  		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -806,7 +813,7 @@
>  	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
>  			      PRIu8 " with error %d", i, ret);
>  			return ret;
> @@ -848,7 +855,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
>  		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
> -		if (ret) {
> +		if (ret && !SUBDEV_REMOVED(sdev, ret)) {
>  			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> index d81cc3c..ee81b70 100644
> --- a/drivers/net/failsafe/failsafe_private.h
> +++ b/drivers/net/failsafe/failsafe_private.h
> @@ -262,6 +262,14 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
>  	(ETH(s)->dev_ops->ops)
>  
>  /**
> + * s: (struct sub_device *)
> + * e: (int) error
> + */
> +#define SUBDEV_REMOVED(s, e) \
> +	(s->remove || \
> +	 (((e) == -ENODEV) && (ETH(s)->data->dev_flags & RTE_ETH_DEV_INTR_RMV)))
> +
> +/**
>   * Atomic guard
>   */
>  
> -- 
> 1.8.3.1
> 

-- 
Gaëtan Rivet
6WIND


* Re: [PATCH 2/3] net/mlx4: adjust removal error
  2017-11-05  6:52     ` Matan Azrad
@ 2017-11-06 16:51       ` Adrien Mazarguil
  0 siblings, 0 replies; 98+ messages in thread
From: Adrien Mazarguil @ 2017-11-06 16:51 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Gaetan Rivet, dev

Hi Matan,

On Sun, Nov 05, 2017 at 06:52:59AM +0000, Matan Azrad wrote:
> Hi Adrien,
> Thanks for the review :)
> 
> Please see below comments.
> 
> > -----Original Message-----
> > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > Sent: Friday, November 3, 2017 3:06 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Gaetan Rivet <gaetan.rivet@6wind.com>; dev@dpdk.org
> > Subject: Re: [PATCH 2/3] net/mlx4: adjust removal error
> > 
> > On Thu, Nov 02, 2017 at 03:42:03PM +0000, Matan Azrad wrote:
> > > Fail-safe PMD expects to get -ENODEV error value if sub PMD control
> > > command fails because of device removal.
> > >
> > > Make control callbacks return with -ENODEV when the device has
> > > disappeared.
> > >
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > 
> > I think there are several inconsistencies regarding the places where
> > mlx4_removed() is used; this could lead to mistakes or redundant calls to this
> > function later on.
> > 
> > You have to choose between low-level internal functions (e.g.
> > mlx4_set_sysfs_ulong()) or user-facing ones from the eth_dev_ops
> > interface (e.g. mlx4_dev_set_link_up()), but neither intermediate functions
> > nor a mix of all approaches.
> 
> You are touching on exactly one of my design considerations:
> either always using "low" level error adjustments or always using high level adjustments.
> The high level approach reuses less code but is simpler to maintain (as you said).
> I decided to combine the two approaches while never going down to the lowest level code (ibv, pipes).
> Adding the check in mlx4_dev_set_link() can replace two checks: in mlx4_dev_set_link_up() and mlx4_dev_set_link_down().
> Adding the check in mlx4_flow_toggle() can replace many checks: all flow callbacks and also mlx4_mac_addr_add() and mlx4_mac_addr_set().
> You are right regarding mlx4_set_sysfs_ulong(); it can be replaced by a check in mlx4_mtu_set(). I will fix it in V2.
> You are right regarding mlx4_ifreq(); it can be replaced by a check in mlx4_link_update(). I will fix it in V2.
> 
> I can understand the consistency approach, but I think keeping the above two cases in lower level functions is harmless and reuses code.
> What do you think?

Well, given this is pure control path code where performance doesn't really
matter, in my opinion we should focus on making its maintenance easier by
having it directly in all eth_dev_ops. It's much easier to document as well.

Actually I think the ethdev API should evolve to provide a separate
"is_removed()" dev_ops implemented by PMDs and automatically used by upper
ethdev layers in order to not have to patch all callbacks in all PMDs like
you did.
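
To illustrate the idea, here is a minimal sketch. Every name in it is
hypothetical (this is not the existing ethdev API), and the choice of
-ENODEV simply follows this thread:

```c
#include <errno.h>
#include <stddef.h>

struct eth_dev_sketch;

/* Hypothetical reduced ops table: one control op plus the removal query. */
struct eth_dev_ops_sketch {
	int (*mtu_set)(struct eth_dev_sketch *dev, unsigned int mtu);
	int (*is_removed)(struct eth_dev_sketch *dev);
};

struct eth_dev_sketch {
	const struct eth_dev_ops_sketch *ops;
	int unplugged; /* test hook standing for the hardware state */
};

/* Generic layer: turn a failure into -ENODEV if the device is gone. */
static int
adjust_error(struct eth_dev_sketch *dev, int ret)
{
	if (ret == 0 || dev->ops->is_removed == NULL)
		return ret;
	if (dev->ops->is_removed(dev) != 0)
		return -ENODEV;
	return ret;
}

/* Generic wrapper: the PMD callback itself needs no removal check. */
static int
set_mtu_sketch(struct eth_dev_sketch *dev, unsigned int mtu)
{
	return adjust_error(dev, dev->ops->mtu_set(dev, mtu));
}

/* Example PMD callbacks: the control op fails with -EIO once unplugged. */
static int
pmd_mtu_set(struct eth_dev_sketch *dev, unsigned int mtu)
{
	(void)mtu;
	return dev->unplugged ? -EIO : 0;
}

static int
pmd_is_removed(struct eth_dev_sketch *dev)
{
	return dev->unplugged;
}

static const struct eth_dev_ops_sketch pmd_ops = {
	.mtu_set = pmd_mtu_set,
	.is_removed = pmd_is_removed,
};
```

With something along these lines, a PMD would only implement the removal
query once, and the error adjustment would live in a single place.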

> > Standardizing on low-level functions is not practical as it means you'd have to
> > check for a device removal after each ibv_*() call. Therefore my suggestion is
> > to check it at the highest level, in all functions exposed through
> > mlx4_dev_ops in case of error, even innocuous ones like
> > mlx4_stats_get() and those returning void (rte_errno can still be set), all in
> > the name of consistency.
> > 
> 
> If everything is OK with the callback (even in a removal case), why set rte_errno?
> Specifically, mlx4_stats_get() has no error flow and we don't want an error return in case of removal, since we can provide stats even after removal (SW counters), and this is a good "feature" for the fail-safe process of saving stats on plug-out.

I said "in the name of consistency" to keep things logical without special
cases since updating rte_errno shouldn't really hurt anyone.

However, void-returning functions like mlx4_stats_get() shouldn't have any
side effects (rte_errno should remain unchanged after calling it even if
undocumented), so it's OK if you leave them out. Keep in mind, though, that
mlx4 with its software counters is a bit special. Polling counters on a
nonexistent (unplugged) physical device may yield errors or random garbage
with other PMDs.
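
A caller that wants to stay safe against such garbage can keep the last
good snapshot and restore it on failure, much like the fail-safe patch
in this series does for its stats. A minimal sketch with stub types and
a stub device read (all names are hypothetical):

```c
#include <errno.h>

struct stats_sketch {
	unsigned long ipackets;
	unsigned long opackets;
};

/* Stub device read: clobbers the output before failing when unplugged. */
static int
read_counters_stub(struct stats_sketch *out, int unplugged)
{
	if (unplugged) {
		out->ipackets = ~0UL; /* garbage */
		out->opackets = ~0UL;
		return -ENODEV;
	}
	out->ipackets += 10;
	out->opackets += 5;
	return 0;
}

/* Refresh the snapshot, restoring the previous values on failure. */
static int
refresh_snapshot(struct stats_sketch *snapshot, int unplugged)
{
	struct stats_sketch backup = *snapshot;
	int ret = read_counters_stub(snapshot, unplugged);

	if (ret != 0)
		*snapshot = backup;
	return ret;
}
```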

> > The mlx4_removed() documentation should be updated to reflect the places
> > it's supposed to be called as well. All this means a larger patch is necessary.
> > 
> 
> Do you mean documentation in code(comment) or mlx4 docs, maybe both?

I mean only the Doxygen comment describing what this function does.
Internally documenting where it's supposed to be called is useful.

> > See below for coding style issues.
> > 
> > > ---
> > >  drivers/net/mlx4/mlx4.h        |  1 +
> > >  drivers/net/mlx4/mlx4_ethdev.c | 38 ++++++++++++++++++++++++++++++++++----
> > >  drivers/net/mlx4/mlx4_flow.c   |  2 ++
> > >  drivers/net/mlx4/mlx4_intr.c   |  5 ++++-
> > >  drivers/net/mlx4/mlx4_rxq.c    |  1 +
> > >  drivers/net/mlx4/mlx4_txq.c    |  1 +
> > >  6 files changed, 43 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
> > > index e0a9853..cac9654 100644
> > > --- a/drivers/net/mlx4/mlx4.h
> > > +++ b/drivers/net/mlx4/mlx4.h
> > > @@ -149,6 +149,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
> > >  		       struct rte_eth_fc_conf *fc_conf);
> > >  int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
> > >  		       struct rte_eth_fc_conf *fc_conf);
> > > +int mlx4_removed(const struct priv *priv);
> > >
> > >  /* mlx4_intr.c */
> > >
> > > diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
> > > index b0acd12..76914b0 100644
> > > --- a/drivers/net/mlx4/mlx4_ethdev.c
> > > +++ b/drivers/net/mlx4/mlx4_ethdev.c
> > > @@ -312,6 +312,8 @@
> > >
> > >  	ret = mlx4_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
> > >  	if (ret < 0) {
> > > +		if (mlx4_removed(priv))
> > > +			ret = -ENODEV;
> > >  		DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
> > >  		      name, value_str, value, strerror(rte_errno));
> > >  		return ret;
> > > @@ -340,15 +342,19 @@
> > >
> > >  	if (sock == -1) {
> > >  		rte_errno = errno;
> > > -		return -rte_errno;
> > > +		goto error;
> > >  	}
> > >  	ret = mlx4_get_ifname(priv, &ifr->ifr_name);
> > >  	if (!ret && ioctl(sock, req, ifr) == -1) {
> > >  		rte_errno = errno;
> > > -		ret = -rte_errno;
> > > +		close(sock);
> > > +		goto error;
> > >  	}
> > >  	close(sock);
> > >  	return ret;
> > > +error:
> > > +	mlx4_removed(priv);
> > > +	return -rte_errno;
> > >  }
> > >
> > >  /**
> > > @@ -473,13 +479,17 @@
> > >  	if (up) {
> > >  		err = mlx4_set_flags(priv, ~IFF_UP, IFF_UP);
> > >  		if (err)
> > > -			return err;
> > > +			goto error;
> > >  	} else {
> > >  		err = mlx4_set_flags(priv, ~IFF_UP, ~IFF_UP);
> > >  		if (err)
> > > -			return err;
> > > +			goto error;
> > >  	}
> > >  	return 0;
> > > +error:
> > > +	if (mlx4_removed(priv))
> > > +		return -ENODEV;
> > > +	return err;
> > >  }
> > >
> > >  /**
> > > @@ -947,6 +957,7 @@ enum rxmode_toggle {
> > >
> > >  	ifr.ifr_data = (void *)&ethpause;
> > >  	if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
> > > +		mlx4_removed(priv);
> > >  		ret = rte_errno;
> > >  		WARN("ioctl(SIOCETHTOOL, ETHTOOL_GPAUSEPARAM)"
> > >  		     " failed: %s",
> > > @@ -1002,6 +1013,7 @@ enum rxmode_toggle {
> > >  	else
> > >  		ethpause.tx_pause = 0;
> > >  	if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
> > > +		mlx4_removed(priv);
> > >  		ret = rte_errno;
> > >  		WARN("ioctl(SIOCETHTOOL, ETHTOOL_SPAUSEPARAM)"
> > >  		     " failed: %s",
> > > @@ -1013,3 +1025,21 @@ enum rxmode_toggle {
> > >  	assert(ret >= 0);
> > >  	return -ret;
> > >  }
> > 
> > Missing empty line.
> > 
> OK.
> 
> > > +/**
> > > + * Check if mlx4 device was removed.
> > 
> > "mlx4" is somewhat redundant given the PMD name.
> > 
> > A separate paragraph should describe where this function is supposed to be
> > called.
> > 
> OK.
> 
> > > + *
> > > + * @param priv
> > > + *   Pointer to private structure.
> > > + *
> > > + * @return
> > > + *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
> > > + */
> > > +int
> > > +mlx4_removed(const struct priv *priv)
> > > +{
> > > +	struct ibv_device_attr device_attr;
> > > +
> > > +	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
> > > +		return -(rte_errno = ENODEV);
> > 
> > Although a nice shortcut, coding rules don't allow this. You have to assign
> > rte_errno on its own separate line. My suggestion if you want to avoid a
> > block would be to return 0 directly when != EIO.
> > 
> 
> Can you point me to the documentation of this coding rule?

Yes and no, I take it back; there is no such coding rule in DPDK.

What bit me in the past was actually a checkpatch error which forbids
assignments in conditionals, e.g.:

 if ((x = foo()) == 42) ...

That's not the case here since it's not a conditional. However for
consistency with the rest of the code in that PMD, my comment still
stands :)
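
Something like the following, as a sketch only. The query function and
the errno variable are stubs standing in for ibv_query_device() and
rte_errno so that the snippet is self-contained:

```c
#include <errno.h>

static int rte_errno_sketch;  /* stands in for rte_errno */
static int fake_query_result; /* value the stubbed query returns */

/* Stub standing in for ibv_query_device(): 0 or an errno value. */
static int
query_device_stub(void)
{
	return fake_query_result;
}

/* Return -ENODEV and set the errno variable when removed, otherwise 0. */
static int
removed_sketch(void)
{
	/* Early return avoids a block around the error path. */
	if (query_device_stub() != EIO)
		return 0;
	/* The assignment sits on its own line. */
	rte_errno_sketch = ENODEV;
	return -rte_errno_sketch;
}
```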

> > > +	return 0;
> > > +}
> > > diff --git a/drivers/net/mlx4/mlx4_flow.c b/drivers/net/mlx4/mlx4_flow.c
> > > index 8b87b29..606c888 100644
> > > --- a/drivers/net/mlx4/mlx4_flow.c
> > > +++ b/drivers/net/mlx4/mlx4_flow.c
> > > @@ -1069,6 +1069,8 @@ struct mlx4_drop {
> > >  	err = errno;
> > >  	msg = "flow rule rejected by device";
> > >  error:
> > > +	if (mlx4_removed(priv))
> > > +		err = ENODEV;
> > >  	return rte_flow_error_set
> > >  		(error, err, RTE_FLOW_ERROR_TYPE_HANDLE, flow, msg);
> > >  }
> > > diff --git a/drivers/net/mlx4/mlx4_intr.c b/drivers/net/mlx4/mlx4_intr.c
> > > index b17d109..0ebdb28 100644
> > > --- a/drivers/net/mlx4/mlx4_intr.c
> > > +++ b/drivers/net/mlx4/mlx4_intr.c
> > > @@ -359,7 +359,10 @@
> > >  			ret = EINVAL;
> > >  	}
> > >  	if (ret) {
> > > -		rte_errno = ret;
> > > +		if (mlx4_removed(dev->data->dev_private))
> > > +			ret = ENODEV;
> > > +		else
> > > +			rte_errno = ret;
> > >  		WARN("unable to disable interrupt on rx queue %d",
> > >  		     idx);
> > >  	} else {
> > > diff --git a/drivers/net/mlx4/mlx4_rxq.c b/drivers/net/mlx4/mlx4_rxq.c
> > > index 7fe21b6..43dad26 100644
> > > --- a/drivers/net/mlx4/mlx4_rxq.c
> > > +++ b/drivers/net/mlx4/mlx4_rxq.c
> > > @@ -832,6 +832,7 @@ void mlx4_rss_detach(struct mlx4_rss *rss)
> > >  	ret = rte_errno;
> > >  	mlx4_rx_queue_release(rxq);
> > >  	rte_errno = ret;
> > > +	mlx4_removed(priv);
> > >  	assert(rte_errno > 0);
> > >  	return -rte_errno;
> > >  }
> > > diff --git a/drivers/net/mlx4/mlx4_txq.c b/drivers/net/mlx4/mlx4_txq.c
> > > index a9c5bd2..09bdfd8 100644
> > > --- a/drivers/net/mlx4/mlx4_txq.c
> > > +++ b/drivers/net/mlx4/mlx4_txq.c
> > > @@ -372,6 +372,7 @@ struct txq_mp2mr_mbuf_check_data {
> > >  	ret = rte_errno;
> > >  	mlx4_tx_queue_release(txq);
> > >  	rte_errno = ret;
> > > +	mlx4_removed(priv);
> > >  	assert(rte_errno > 0);
> > >  	return -rte_errno;
> > >  }
> > > --
> > > 1.8.3.1
> > >
> > 
> > --
> > Adrien Mazarguil
> > 6WIND

-- 
Adrien Mazarguil
6WIND


* [PATCH v2 0/4] Fail-safe fix removal handling lack
  2017-11-02 15:42 [PATCH 0/3] Fail-safe fix removal handling lack Matan Azrad
                   ` (2 preceding siblings ...)
  2017-11-02 15:42 ` [PATCH 3/3] net/mlx5: " Matan Azrad
@ 2017-12-13 14:29 ` Matan Azrad
  2017-12-13 14:29   ` [PATCH v2 1/4] ethdev: add devop to check removal status Matan Azrad
                     ` (4 more replies)
  3 siblings, 5 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-13 14:29 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

There is a window of time between the physical removal of a device and the
moment sub-device PMDs get the RMV interrupt. During this window, DPDK PMDs
and applications are not yet aware of the removal and may call sub-device
control operations, which should return an error.

This series adds a new ethdev operation to check device removal, adds support
for it in the mlx PMDs, and fixes the fail-safe bug in removal error reporting.

V2:
Remove ENODEV definition.
Remove checks from all mlx control commands.
Add new devop - "is_removed".
Implement it in mlx4 and mlx5.
Fix failsafe bug by the new devop.
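
For orientation, the intended behavior of the new check can be sketched
as below. This is a simplified, hypothetical model (stand-in types and
names), not the ethdev code itself: the driver is queried synchronously,
and a positive answer is latched in the port state so that later calls
do not query the driver again.

```c
#include <stddef.h>

enum port_state_sketch {
	PORT_ATTACHED_SKETCH,
	PORT_REMOVED_SKETCH,
};

struct port_sketch {
	enum port_state_sketch state;
	int (*is_removed)(struct port_sketch *port); /* optional driver op */
	int hw_gone; /* test hook: simulated hardware state */
	int queries; /* number of times the driver op was called */
};

/* Example driver op: counts how often the "hardware" is queried. */
static int
driver_is_removed(struct port_sketch *port)
{
	port->queries++;
	return port->hw_gone;
}

/* Synchronous check: 1 when removed, 0 otherwise or when unsupported. */
static int
port_is_removed(struct port_sketch *port)
{
	int ret;

	if (port->is_removed == NULL)
		return 0;
	if (port->state == PORT_REMOVED_SKETCH)
		return 1;
	ret = port->is_removed(port);
	if (ret != 0)
		port->state = PORT_REMOVED_SKETCH;
	return ret;
}
```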

Matan Azrad (4):
  ethdev: add devop to check removal status
  net/mlx4: support a device removal check operation
  net/mlx5: support a device removal check operation
  net/failsafe: fix removed device handling

 drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
 drivers/net/failsafe/failsafe_ops.c     | 34 ++++++++++++++++++++++-----------
 drivers/net/failsafe/failsafe_private.h | 10 ++++++++++
 drivers/net/mlx4/mlx4.c                 |  1 +
 drivers/net/mlx4/mlx4.h                 |  1 +
 drivers/net/mlx4/mlx4_ethdev.c          | 20 +++++++++++++++++++
 drivers/net/mlx5/mlx5.c                 |  2 ++
 drivers/net/mlx5/mlx5.h                 |  1 +
 drivers/net/mlx5/mlx5_ethdev.c          | 20 +++++++++++++++++++
 lib/librte_ether/rte_ethdev.c           | 28 ++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.h           | 17 +++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  7 +++++++
 12 files changed, 138 insertions(+), 21 deletions(-)

-- 
1.8.3.1


* [PATCH v2 1/4] ethdev: add devop to check removal status
  2017-12-13 14:29 ` [PATCH v2 0/4] Fail-safe fix removal handling lack Matan Azrad
@ 2017-12-13 14:29   ` Matan Azrad
  2017-12-13 14:29   ` [PATCH v2 2/4] net/mlx4: support a device removal check operation Matan Azrad
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-13 14:29 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

There is a window of time between the physical removal of a device and the
moment PMDs get the RMV interrupt. During this window, DPDK PMDs and
applications are not yet aware of the removal.

Currently, removal detection is achieved only by registering for the device
RMV event, and the notification arrives asynchronously. So, there is no way
to detect a device removal synchronously.
Applications and other DPDK entities may want to check for a device removal
synchronously and make an immediate decision accordingly.

Add a new dev op called is_removed to allow DPDK entities to check an
Ethernet device removal status immediately.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.h           | 17 +++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  7 +++++++
 3 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 318af28..c759d0e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -142,7 +142,8 @@ enum {
 rte_eth_find_next(uint16_t port_id)
 {
 	while (port_id < RTE_MAX_ETHPORTS &&
-	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
 		port_id++;
 
 	if (port_id >= RTE_MAX_ETHPORTS)
@@ -286,8 +287,7 @@ struct rte_eth_dev *
 rte_eth_dev_is_valid_port(uint16_t port_id)
 {
 	if (port_id >= RTE_MAX_ETHPORTS ||
-	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
-	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
+	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
 		return 0;
 	else
 		return 1;
@@ -1118,6 +1118,28 @@ struct rte_eth_dev *
 }
 
 int
+rte_eth_dev_is_removed(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
+
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;
+
+	ret = dev->dev_ops->is_removed(dev);
+	if (ret != 0)
+		dev->state = RTE_ETH_DEV_REMOVED;
+
+	return ret;
+}
+
+int
 rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		       uint16_t nb_rx_desc, unsigned int socket_id,
 		       const struct rte_eth_rxconf *rx_conf,
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 341c2d6..3aa9d3f 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1196,6 +1196,9 @@ struct rte_eth_dcb_info {
 typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
 /** <@internal Function used to reset a configured Ethernet device. */
 
+typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
+/**< @internal Function used to detect an Ethernet device removal. */
+
 typedef void (*eth_promiscuous_enable_t)(struct rte_eth_dev *dev);
 /**< @internal Function used to enable the RX promiscuous mode of an Ethernet device. */
 
@@ -1525,6 +1528,8 @@ struct eth_dev_ops {
 	eth_dev_close_t            dev_close;     /**< Close device. */
 	eth_dev_reset_t		   dev_reset;	  /**< Reset device. */
 	eth_link_update_t          link_update;   /**< Get device link state. */
+	eth_is_removed_t           is_removed;
+	/**< Check if the device was physically removed. */
 
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON. */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF. */
@@ -1711,6 +1716,7 @@ enum rte_eth_dev_state {
 	RTE_ETH_DEV_UNUSED = 0,
 	RTE_ETH_DEV_ATTACHED,
 	RTE_ETH_DEV_DEFERRED,
+	RTE_ETH_DEV_REMOVED,
 };
 
 /**
@@ -1997,6 +2003,17 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
 void _rte_eth_dev_reset(struct rte_eth_dev *dev);
 
 /**
+ * Check if an Ethernet device was physically removed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - 1 when the Ethernet device is removed, otherwise 0.
+ */
+int
+rte_eth_dev_is_removed(uint16_t port_id);
+
+/**
  * Allocate and set up a receive queue for an Ethernet device.
  *
  * The function allocates a contiguous block of memory for *nb_rx_desc*
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..78547ff 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -198,6 +198,13 @@ DPDK_17.11 {
 
 } DPDK_17.08;
 
+DPDK_18.02 {
+	global:
+
+	rte_eth_dev_is_removed;
+
+} DPDK_17.11;
+
 EXPERIMENTAL {
 	global:
 
-- 
1.8.3.1


* [PATCH v2 2/4] net/mlx4: support a device removal check operation
  2017-12-13 14:29 ` [PATCH v2 0/4] Fail-safe fix removal handling lack Matan Azrad
  2017-12-13 14:29   ` [PATCH v2 1/4] ethdev: add devop to check removal status Matan Azrad
@ 2017-12-13 14:29   ` Matan Azrad
  2017-12-13 14:29   ` [PATCH v2 3/4] net/mlx5: " Matan Azrad
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-13 14:29 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

Add support for getting the removal status of an mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx4/mlx4.c        |  1 +
 drivers/net/mlx4/mlx4.h        |  1 +
 drivers/net/mlx4/mlx4_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index f9e4f9d..3cde640 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -256,6 +256,7 @@ struct mlx4_conf {
 	.filter_ctrl = mlx4_filter_ctrl,
 	.rx_queue_intr_enable = mlx4_rx_intr_enable,
 	.rx_queue_intr_disable = mlx4_rx_intr_disable,
+	.is_removed = mlx4_is_removed,
 };
 
 /**
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index 3aeef87..0eaba89 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -165,6 +165,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
 int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
 		       struct rte_eth_fc_conf *fc_conf);
 const uint32_t *mlx4_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+int mlx4_is_removed(struct rte_eth_dev *dev);
 
 /* mlx4_intr.c */
 
diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
index 2f69e7d..0d46f5a 100644
--- a/drivers/net/mlx4/mlx4_ethdev.c
+++ b/drivers/net/mlx4/mlx4_ethdev.c
@@ -1060,3 +1060,23 @@ enum rxmode_toggle {
 	}
 	return NULL;
 }
+
+/**
+ * Check if mlx4 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx4_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1


* [PATCH v2 3/4] net/mlx5: support a device removal check operation
  2017-12-13 14:29 ` [PATCH v2 0/4] Fail-safe fix removal handling lack Matan Azrad
  2017-12-13 14:29   ` [PATCH v2 1/4] ethdev: add devop to check removal status Matan Azrad
  2017-12-13 14:29   ` [PATCH v2 2/4] net/mlx4: support a device removal check operation Matan Azrad
@ 2017-12-13 14:29   ` Matan Azrad
  2017-12-13 14:29   ` [PATCH v2 4/4] net/failsafe: fix removed device handling Matan Azrad
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  4 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-13 14:29 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

Add support for getting the removal status of an mlx5 device.
It is not supported in secondary process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        |  2 ++
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 0548d17..e0b781b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -303,6 +303,7 @@ struct mlx5_args {
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static const struct eth_dev_ops mlx5_dev_sec_ops = {
@@ -350,6 +351,7 @@ struct mlx5_args {
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e6a69b8..2ec7ae7 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -208,6 +208,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
 int mlx5_set_link_up(struct rte_eth_dev *dev);
 void priv_dev_select_tx_function(struct priv *priv, struct rte_eth_dev *dev);
 void priv_dev_select_rx_function(struct priv *priv, struct rte_eth_dev *dev);
+int mlx5_is_removed(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index a3cef68..5cf0849 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1474,3 +1474,23 @@ struct priv *
 		dev->rx_pkt_burst = mlx5_rx_burst;
 	}
 }
+
+/**
+ * Check if mlx5 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx5_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1


* [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-13 14:29 ` [PATCH v2 0/4] Fail-safe fix removal handling lack Matan Azrad
                     ` (2 preceding siblings ...)
  2017-12-13 14:29   ` [PATCH v2 3/4] net/mlx5: " Matan Azrad
@ 2017-12-13 14:29   ` Matan Azrad
  2017-12-13 15:16     ` Gaëtan Rivet
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  4 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-13 14:29 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev, stable

There is a window of time between the physical removal of a device and the
moment sub-device PMDs get the RMV interrupt. During this window, DPDK PMDs
and applications are not yet aware of the removal and may call sub-device
control operations, which should return an error.

In the previous code this error is reported to the application, contrary to
the fail-safe principle that the app should not be aware of device removal.

Add a removal check in each relevant control command error flow and
prevent an error report to the application when the sub-device is removed.

Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
Fixes: b737a1e ("net/failsafe: support flow API")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
 drivers/net/failsafe/failsafe_ops.c     | 34 ++++++++++++++++++++++-----------
 drivers/net/failsafe/failsafe_private.h | 10 ++++++++++
 3 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
index 153ceee..1e39d66 100644
--- a/drivers/net/failsafe/failsafe_flow.c
+++ b/drivers/net/failsafe/failsafe_flow.c
@@ -87,7 +87,7 @@
 		DEBUG("Calling rte_flow_validate on sub_device %d", i);
 		ret = rte_flow_validate(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_flow_validate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -111,7 +111,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (flow->flows[i] == NULL) {
+		if (flow->flows[i] == NULL && fs_is_removed(sdev) == 0) {
 			ERROR("Failed to create flow on sub_device %d",
 				i);
 			goto err;
@@ -150,7 +150,7 @@
 			continue;
 		local_ret = rte_flow_destroy(PORT_ID(sdev),
 				flow->flows[i], error);
-		if (local_ret) {
+		if (local_ret && fs_is_removed(sdev) == 0) {
 			ERROR("Failed to destroy flow on sub_device %d: %d",
 					i, local_ret);
 			if (ret == 0)
@@ -175,7 +175,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_flow_flush on sub_device %d", i);
 		ret = rte_flow_flush(PORT_ID(sdev), error);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_flow_flush failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -199,8 +199,12 @@
 
 	sdev = TX_SUBDEV(dev);
 	if (sdev != NULL) {
-		return rte_flow_query(PORT_ID(sdev),
-				flow->flows[SUB_ID(sdev)], type, arg, error);
+		int ret = rte_flow_query(PORT_ID(sdev),
+					 flow->flows[SUB_ID(sdev)],
+					 type, arg, error);
+
+		if (ret && fs_is_removed(sdev) == 0)
+			return ret;
 	}
 	WARN("No active sub_device to query about its flow");
 	return -1;
@@ -223,7 +227,7 @@
 			WARN("flow isolation mode of sub_device %d in incoherent state.",
 				i);
 		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_flow_isolate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index e16a590..6799b55 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -121,6 +121,8 @@
 					dev->data->nb_tx_queues,
 					&dev->data->dev_conf);
 		if (ret) {
+			if (fs_is_removed(sdev) != 0)
+				continue;
 			ERROR("Could not configure sub_device %d", i);
 			return ret;
 		}
@@ -163,8 +165,11 @@
 			continue;
 		DEBUG("Starting sub_device %d", i);
 		ret = rte_eth_dev_start(PORT_ID(sdev));
-		if (ret)
+		if (ret) {
+			if (fs_is_removed(sdev) != 0)
+				continue;
 			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -196,7 +201,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
 		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -215,7 +220,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
 		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -300,7 +305,7 @@
 				rx_queue_id,
 				nb_rx_desc, socket_id,
 				rx_conf, mb_pool);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("RX queue setup failed for sub_device %d", i);
 			goto free_rxq;
 		}
@@ -366,7 +371,7 @@
 				tx_queue_id,
 				nb_tx_desc, socket_id,
 				tx_conf);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("TX queue setup failed for sub_device %d", i);
 			goto free_txq;
 		}
@@ -445,7 +450,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling link_update on sub_device %d", i);
 		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1) {
+		if (ret && ret != -1 && fs_is_removed(sdev) == 0) {
 			ERROR("Link update failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -469,6 +474,7 @@
 fs_stats_get(struct rte_eth_dev *dev,
 	     struct rte_eth_stats *stats)
 {
+	struct rte_eth_stats backup;
 	struct sub_device *sdev;
 	uint8_t i;
 	int ret;
@@ -478,14 +484,20 @@
 		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
 		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
 
+		rte_memcpy(&backup, snapshot, sizeof(backup));
 		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
 		if (ret) {
+			if (fs_is_removed(sdev) != 0) {
+				rte_memcpy(snapshot, &backup, sizeof(backup));
+				goto inc;
+			}
 			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
 				  i, ret);
 			*timestamp = 0;
 			return ret;
 		}
 		*timestamp = rte_rdtsc();
+inc:
 		failsafe_stats_increment(stats, snapshot);
 	}
 	return 0;
@@ -598,7 +610,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
 		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -617,7 +629,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
 		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -651,7 +663,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
 		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -688,7 +700,7 @@
 	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
 			      PRIu8 " with error %d", i, ret);
 			return ret;
@@ -730,7 +742,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
 		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
-		if (ret) {
+		if (ret && fs_is_removed(sdev) == 0) {
 			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3c..0539782 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -375,4 +375,14 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
 	rte_wmb();
 }
 
+/*
+ * Check if sub device was removed.
+ */
+static inline int
+fs_is_removed(struct sub_device *sdev)
+{
+	if (sdev->remove == 1 || rte_eth_dev_is_removed(PORT_ID(sdev)) != 0)
+		return 1;
+	return 0;
+}
 #endif /* _RTE_ETH_FAILSAFE_PRIVATE_H_ */
-- 
1.8.3.1


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-13 14:29   ` [PATCH v2 4/4] net/failsafe: fix removed device handling Matan Azrad
@ 2017-12-13 15:16     ` Gaëtan Rivet
  2017-12-13 15:48       ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Gaëtan Rivet @ 2017-12-13 15:16 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

Hi Matan,

On Wed, Dec 13, 2017 at 02:29:30PM +0000, Matan Azrad wrote:
> There is time between the physical removal of the device until
> sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> applications still don't know about the removal and may call sub-device
> control operation which should return an error.
> 
> In previous code this error is reported to the application contrary to
> fail-safe principle that the app should not be aware of device removal.
> 
> Add an removal check in each relevant control command error flow and
> prevent an error report to application when the sub-device is removed.
> 
> Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> Fixes: b737a1e ("net/failsafe: support flow API")
> Cc: stable@dpdk.org
> 

This patch is not a fix.
It relies on an eth_dev API evolution. Without this evolution,
this patch is meaningless and would break compilation if backported to
a stable branch.

Please remove those tags.

> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
>  drivers/net/failsafe/failsafe_ops.c     | 34 ++++++++++++++++++++++-----------
>  drivers/net/failsafe/failsafe_private.h | 10 ++++++++++
>  3 files changed, 44 insertions(+), 18 deletions(-)

< ... >

> +/*
> + * Check if sub device was removed.
> + */
> +static inline int
> +fs_is_removed(struct sub_device *sdev)
> +{
> +	if (sdev->remove == 1 || rte_eth_dev_is_removed(PORT_ID(sdev)) != 0)
> +		return 1;
> +	return 0;
> +}

Have you considered adding this check within the subdev iterator itself?
I think it would save you from having to add it to each return value
check.

It is still MT-unsafe anyway.

-- 
Gaëtan Rivet
6WIND


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-13 15:16     ` Gaëtan Rivet
@ 2017-12-13 15:48       ` Matan Azrad
  2017-12-13 16:09         ` Gaëtan Rivet
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-13 15:48 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

Hi Gaetan
Thanks for the review.
Some comments below.

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Wednesday, December 13, 2017 5:17 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
> 
> Hi Matan,
> 
> On Wed, Dec 13, 2017 at 02:29:30PM +0000, Matan Azrad wrote:
> > There is time between the physical removal of the device until
> > sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> > applications still don't know about the removal and may call
> > sub-device control operation which should return an error.
> >
> > In previous code this error is reported to the application contrary to
> > fail-safe principle that the app should not be aware of device removal.
> >
> > Add an removal check in each relevant control command error flow and
> > prevent an error report to application when the sub-device is removed.
> >
> > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > Fixes: b737a1e ("net/failsafe: support flow API")
> > Cc: stable@dpdk.org
> >
> 
> This patch is not a fix.
> It relies on an eth_dev API evolution. Without this evolution, this patch is
> meaningless and would break compilation if backported in stable branch.
> 

It is a fix, because the bug is finally solved by this patch.
I agree that it cannot be backported on its own, but maybe the whole series should be backported.
Another idea:
Add a new patch which documents the bug and backport that one.
Remove that documentation in this patch and drop the Cc: stable tag from it.
What do you think?

> Please remove those tags.
> 
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
> >  drivers/net/failsafe/failsafe_ops.c     | 34 ++++++++++++++++++++++-----------
> >  drivers/net/failsafe/failsafe_private.h | 10 ++++++++++
> >  3 files changed, 44 insertions(+), 18 deletions(-)
> 
> < ... >
> 
> > +/*
> > + * Check if sub device was removed.
> > + */
> > +static inline int
> > +fs_is_removed(struct sub_device *sdev)
> > +{
> > +	if (sdev->remove == 1 || rte_eth_dev_is_removed(PORT_ID(sdev)) != 0)
> > +		return 1;
> > +	return 0;
> > +}
> 
> Have you considered adding this check within the subdev iterator itself?
> I think it would prevent you from having to add it to each return value
> checks.
> 
> It is still MT-unsafe anyway.
>

This fix is not meant to solve the MT issue; it is meant to stop removal errors from being reported to the application.
Adding the check in the subdev iterator doesn't make sense for this issue.

Matan. 
> --
> Gaëtan Rivet
> 6WIND


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-13 15:48       ` Matan Azrad
@ 2017-12-13 16:09         ` Gaëtan Rivet
  2017-12-13 17:09           ` Thomas Monjalon
  2017-12-13 21:55           ` Gaëtan Rivet
  0 siblings, 2 replies; 98+ messages in thread
From: Gaëtan Rivet @ 2017-12-13 16:09 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

On Wed, Dec 13, 2017 at 03:48:46PM +0000, Matan Azrad wrote:
> Hi Gaetan
> Thanks for the review.
> Some comments..
> 
> > -----Original Message-----
> > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > Sent: Wednesday, December 13, 2017 5:17 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> > Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
> > 
> > Hi Matan,
> > 
> > On Wed, Dec 13, 2017 at 02:29:30PM +0000, Matan Azrad wrote:
> > > There is time between the physical removal of the device until
> > > sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> > > applications still don't know about the removal and may call
> > > sub-device control operation which should return an error.
> > >
> > > In previous code this error is reported to the application contrary to
> > > fail-safe principle that the app should not be aware of device removal.
> > >
> > > Add an removal check in each relevant control command error flow and
> > > prevent an error report to application when the sub-device is removed.
> > >
> > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > > Fixes: b737a1e ("net/failsafe: support flow API")
> > > Cc: stable@dpdk.org
> > >
> > 
> > This patch is not a fix.
> > It relies on an eth_dev API evolution. Without this evolution, this patch is
> > meaningless and would break compilation if backported in stable branch.
> > 
> 
> It is a fix because the bug is finally solved by this patch.
> I agree that it cannot be backported itself, but maybe all the series should be backported.
> Other idea:
> Add new patch which documents the bug and backport it.
> Remove it in this patch and remove cc stable from it.
> What do you think?
> 

I think you could write a crude version that would not rely on the
ethdev evolution (checking sdev->remove only), which would be incomplete
but still better than nothing.
And why not document the issue in this patch?
Without any dependency outside failsafe, this could be backported.

Then complete the fix with the API evolution if the new devops is
accepted.
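
As a rough illustration of that crude variant, here is a minimal sketch.
It assumes a simplified stand-in for struct sub_device that keeps only
the remove flag; the struct name, its layout and the function name are
hypothetical and are not the fail-safe definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: only the flag the crude check needs. */
struct sub_device_sketch {
	unsigned int remove:1; /* set once the RMV event has been handled */
};

/*
 * Crude removal check with no ethdev dependency. It only sees removals
 * that were already signalled, so it cannot cover the window before
 * the RMV interrupt arrives.
 */
static inline int
fs_is_removed_crude(const struct sub_device_sketch *sdev)
{
	assert(sdev != NULL);
	return sdev->remove == 1;
}
```

Such a helper depends on nothing outside failsafe, which is what would
make it a candidate for backporting, at the price of being incomplete.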

> > Please remove those tags.
> > 
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > ---
> > >  drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
> > >  drivers/net/failsafe/failsafe_ops.c     | 34 ++++++++++++++++++++++-----------
> > >  drivers/net/failsafe/failsafe_private.h | 10 ++++++++++
> > >  3 files changed, 44 insertions(+), 18 deletions(-)
> > 
> > < ... >
> > 
> > > +/*
> > > + * Check if sub device was removed.
> > > + */
> > > +static inline int
> > > +fs_is_removed(struct sub_device *sdev)
> > > +{
> > > +	if (sdev->remove == 1 || rte_eth_dev_is_removed(PORT_ID(sdev)) != 0)
> > > +		return 1;
> > > +	return 0;
> > > +}
> > 
> > Have you considered adding this check within the subdev iterator itself?
> > I think it would prevent you from having to add it to each return value
> > checks.
> > 
> > It is still MT-unsafe anyway.
> >
> 
> This fix doesn't come to solve the MT issue, It comes to solve the error report to application because of removal.
> Adding the check in subdev iterator doesn't make sense for this issue.
> 
> Matan. 

If you add this check in the iterator itself, you would skip removed
devices before attempting to operate on them, right?

Then it should probably help with your issue, unless you tested it and
verified that it didn't?

Something like this:

---8<---

diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3ca6..62ddc0689 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -316,8 +316,12 @@ fs_find_next(struct rte_eth_dev *dev,
        subs = PRIV(dev)->subs;
        tail = PRIV(dev)->subs_tail;
        while (sid < tail) {
+               if (min_state > DEV_PROBED &&
+                   fs_is_removed(&subs[sid]))
+                       goto next;
                if (subs[sid].state >= min_state)
                        break;
+next:
                sid++;
        }
        *sid_out = sid;

--->8---

The only issue is that it is completely racy, but as this MT-unsafe property is
inescapable we might as well ignore it and go for KISS.

If that's enough, I would prefer it to having this additional check
added to all rte_eth operations.

-- 
Gaëtan Rivet
6WIND


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-13 16:09         ` Gaëtan Rivet
@ 2017-12-13 17:09           ` Thomas Monjalon
  2017-12-14 10:40             ` Matan Azrad
  2017-12-13 21:55           ` Gaëtan Rivet
  1 sibling, 1 reply; 98+ messages in thread
From: Thomas Monjalon @ 2017-12-13 17:09 UTC (permalink / raw)
  To: Gaëtan Rivet, Matan Azrad; +Cc: Adrien Mazarguil, dev, stable

13/12/2017 17:09, Gaëtan Rivet:
> On Wed, Dec 13, 2017 at 03:48:46PM +0000, Matan Azrad wrote:
> > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > > > Fixes: b737a1e ("net/failsafe: support flow API")
> > > > Cc: stable@dpdk.org
> > > >
> > > 
> > > This patch is not a fix.
> > > It relies on an eth_dev API evolution. Without this evolution, this patch is
> > > meaningless and would break compilation if backported in stable branch.
> > > 
> > 
> > It is a fix because the bug is finally solved by this patch.
> > I agree that it cannot be backported itself, but maybe all the series should be backported.
> > Other idea:
> > Add new patch which documents the bug and backport it.
> > Remove it in this patch and remove cc stable from it.
> > What do you think?
> > 
> 
> I think you could write a crude version that would not rely on the
> ethdev evolution (checking sdev->remove only), which would be incomplete
> but still better than nothing.
> And why not in this patch document the issue.
> Without any dependency outside failsafe, this could be backported.
> 
> Then complete the fix with the API evolution if the new devops is
> accepted.

I think it is not worth the effort.
It is a limitation in earlier releases and will be properly fixed
with the new op.
Please just remove the Cc:stable@dpdk.org.


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-13 16:09         ` Gaëtan Rivet
  2017-12-13 17:09           ` Thomas Monjalon
@ 2017-12-13 21:55           ` Gaëtan Rivet
  2017-12-14 10:40             ` Matan Azrad
  1 sibling, 1 reply; 98+ messages in thread
From: Gaëtan Rivet @ 2017-12-13 21:55 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

Hi again Matan,

On Wed, Dec 13, 2017 at 05:09:16PM +0100, Gaëtan Rivet wrote:
> On Wed, Dec 13, 2017 at 03:48:46PM +0000, Matan Azrad wrote:
> > Hi Gaetan
> > Thanks for the review.
> > Some comments..
> > 
> > > -----Original Message-----
> > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > Sent: Wednesday, December 13, 2017 5:17 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> > > Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
> > > 
> > > Hi Matan,
> > > 
> > > On Wed, Dec 13, 2017 at 02:29:30PM +0000, Matan Azrad wrote:
> > > > There is time between the physical removal of the device until
> > > > sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> > > > applications still don't know about the removal and may call
> > > > sub-device control operation which should return an error.
> > > >
> > > > In previous code this error is reported to the application contrary to
> > > > fail-safe principle that the app should not be aware of device removal.
> > > >
> > > > Add an removal check in each relevant control command error flow and
> > > > prevent an error report to application when the sub-device is removed.
> > > >
> > > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > > > Fixes: b737a1e ("net/failsafe: support flow API")
> > > > Cc: stable@dpdk.org
> > > >
> > > 
> > > This patch is not a fix.
> > > It relies on an eth_dev API evolution. Without this evolution, this patch is
> > > meaningless and would break compilation if backported in stable branch.
> > > 
> > 
> > It is a fix because the bug is finally solved by this patch.
> > I agree that it cannot be backported itself, but maybe all the series should be backported.
> > Other idea:
> > Add new patch which documents the bug and backport it.
> > Remove it in this patch and remove cc stable from it.
> > What do you think?
> > 
> 
> I think you could write a crude version that would not rely on the
> ethdev evolution (checking sdev->remove only), which would be incomplete
> but still better than nothing.
> And why not in this patch document the issue.
> Without any dependency outside failsafe, this could be backported.
> 
> Then complete the fix with the API evolution if the new devops is
> accepted.
> 
> > > Please remove those tags.
> > > 
> > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > ---
> > > >  drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
> > > >  drivers/net/failsafe/failsafe_ops.c     | 34 ++++++++++++++++++++++-----------
> > > >  drivers/net/failsafe/failsafe_private.h | 10 ++++++++++
> > > >  3 files changed, 44 insertions(+), 18 deletions(-)
> > > 
> > > < ... >
> > > 
> > > > +/*
> > > > + * Check if sub device was removed.
> > > > + */
> > > > +static inline int
> > > > +fs_is_removed(struct sub_device *sdev)
> > > > +{
> > > > +	if (sdev->remove == 1 || rte_eth_dev_is_removed(PORT_ID(sdev)) != 0)
> > > > +		return 1;
> > > > +	return 0;
> > > > +}
> > > 
> > > Have you considered adding this check within the subdev iterator itself?
> > > I think it would prevent you from having to add it to each return value
> > > checks.
> > > 
> > > It is still MT-unsafe anyway.
> > >
> > 
> > This fix doesn't come to solve the MT issue, It comes to solve the error report to application because of removal.
> > Adding the check in subdev iterator doesn't make sense for this issue.
> > 
> > Matan. 
> 
> If you add this check in the iterator itself, you would skip removed
> devices before attempting operating upon them, right?
> 
> Then it should probably help with your issue, unless you tested it and
> verified that it didnt?
> 
> Something like this:
> 
> ---8<---
> 
> diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> index d81cc3ca6..62ddc0689 100644
> --- a/drivers/net/failsafe/failsafe_private.h
> +++ b/drivers/net/failsafe/failsafe_private.h
> @@ -316,8 +316,12 @@ fs_find_next(struct rte_eth_dev *dev,
>         subs = PRIV(dev)->subs;
>         tail = PRIV(dev)->subs_tail;
>         while (sid < tail) {
> +               if (min_state > DEV_PROBED &&
> +                   fs_is_removed(&subs[sid]))
> +                       goto next;
>                 if (subs[sid].state >= min_state)
>                         break;
> +next:
>                 sid++;
>         }
>         *sid_out = sid;
> 
> --->8---
> 
> Only issue being that it is completely racy, but as this MT-unsafe property is
> inescapable we might as well ignore it and go for KISS.
> 
> If that's enough, I would prefer instead of having this additional check
> added to all rte_eth operations.
> 

Ok, actually you were right here to do it this way. The "is_removed"
check needs to happen after the operation attempt to effectively
mitigate the possible race. Checking before attempting the call will be
much less effective.
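
To illustrate the ordering, a minimal sketch under stated assumptions:
struct dev_sketch, dev_op_sketch() and the unplug parameters are
hypothetical stand-ins that model a removal racing with a control
operation. None of them are DPDK APIs:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device model: 'present' may drop to 0 at any moment. */
struct dev_sketch {
	int present;
};

/* Hypothetical control operation: fails once the device is gone. */
static int
dev_op_sketch(const struct dev_sketch *d)
{
	return d->present ? 0 : -EIO;
}

/* Check after the attempt: an error on a removed device is hidden. */
static int
op_check_after(struct dev_sketch *d, int unplug_before_op)
{
	int ret;

	assert(d != NULL);
	if (unplug_before_op)
		d->present = 0; /* models a removal racing with the call */
	ret = dev_op_sketch(d);
	if (ret != 0 && !d->present)
		return 0; /* the removal explains the error */
	return ret;
}

/* Check before the attempt: a removal can still land after the check. */
static int
op_check_before(struct dev_sketch *d, int unplug_after_check)
{
	assert(d != NULL);
	if (!d->present)
		return 0; /* skip a device already known to be removed */
	if (unplug_after_check)
		d->present = 0; /* removal falls inside the window */
	return dev_op_sketch(d); /* the error now reaches the caller */
}
```

In this model the pre-check variant still returns the error when the
removal lands between the check and the call, while the post-check
variant absorbs it.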

That being said, would it be cleaner to have eth_dev ops return -ENODEV
directly, and check against it within fail-safe?
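
A possible shape for that check, as a sketch only: fs_err_sketch() is a
hypothetical name, and the code assumes that sub-device operations
return 0 or a negative errno value:

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: treat -ENODEV from a sub-device operation as
 * "device removed" and hide it; pass every other result through.
 */
static inline int
fs_err_sketch(int err)
{
	assert(err <= 0); /* assumption: 0 or a negative errno value */
	return err == -ENODEV ? 0 : err;
}
```

Each error path in fail-safe would then filter the return value through
one call instead of querying the removal state separately.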

-- 
Gaëtan Rivet
6WIND


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-13 21:55           ` Gaëtan Rivet
@ 2017-12-14 10:40             ` Matan Azrad
  2017-12-14 10:48               ` Gaëtan Rivet
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-14 10:40 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

Hi Gaetan

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Wednesday, December 13, 2017 11:56 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
> 
> Hi again Matan,
> 
> On Wed, Dec 13, 2017 at 05:09:16PM +0100, Gaëtan Rivet wrote:
> > On Wed, Dec 13, 2017 at 03:48:46PM +0000, Matan Azrad wrote:
> > > Hi Gaetan
> > > Thanks for the review.
> > > Some comments..
> > >
> > > > -----Original Message-----
> > > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > > Sent: Wednesday, December 13, 2017 5:17 PM
> > > > To: Matan Azrad <matan@mellanox.com>
> > > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > > > <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> > > > Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device
> > > > handling
> > > >
> > > > Hi Matan,
> > > >
> > > > On Wed, Dec 13, 2017 at 02:29:30PM +0000, Matan Azrad wrote:
> > > > > There is time between the physical removal of the device until
> > > > > sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> > > > > applications still don't know about the removal and may call
> > > > > sub-device control operation which should return an error.
> > > > >
> > > > > In previous code this error is reported to the application
> > > > > contrary to fail-safe principle that the app should not be aware of
> > > > > device removal.
> > > > >
> > > > > Add an removal check in each relevant control command error flow
> > > > > and prevent an error report to application when the sub-device is
> > > > > removed.
> > > > >
> > > > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > > > > Fixes: b737a1e ("net/failsafe: support flow API")
> > > > > Cc: stable@dpdk.org
> > > > >
> > > >
> > > > This patch is not a fix.
> > > > It relies on an eth_dev API evolution. Without this evolution,
> > > > this patch is meaningless and would break compilation if backported in
> > > > stable branch.
> > > >
> > >
> > > It is a fix because the bug is finally solved by this patch.
> > > I agree that it cannot be backported itself, but maybe all the series should
> > > be backported.
> > > Other idea:
> > > Add new patch which documents the bug and backport it.
> > > Remove it in this patch and remove cc stable from it.
> > > What do you think?
> > >
> >
> > I think you could write a crude version that would not rely on the
> > ethdev evolution (checking sdev->remove only), which would be
> > incomplete but still better than nothing.
> > And why not in this patch document the issue.
> > Without any dependency outside failsafe, this could be backported.
> >
> > Then complete the fix with the API evolution if the new devops is
> > accepted.
> >
> > > > Please remove those tags.
> > > >
> > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > ---
> > > > >  drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
> > > > >  drivers/net/failsafe/failsafe_ops.c     | 34 ++++++++++++++++++++++-----------
> > > > >  drivers/net/failsafe/failsafe_private.h | 10 ++++++++++
> > > > >  3 files changed, 44 insertions(+), 18 deletions(-)
> > > >
> > > > < ... >
> > > >
> > > > > +/*
> > > > > + * Check if sub device was removed.
> > > > > + */
> > > > > +static inline int
> > > > > +fs_is_removed(struct sub_device *sdev)
> > > > > +{
> > > > > +	if (sdev->remove == 1 || rte_eth_dev_is_removed(PORT_ID(sdev)) != 0)
> > > > > +		return 1;
> > > > > +	return 0;
> > > > > +}
> > > >
> > > > Have you considered adding this check within the subdev iterator itself?
> > > > I think it would prevent you from having to add it to each return
> > > > value checks.
> > > >
> > > > It is still MT-unsafe anyway.
> > > >
> > >
> > > This fix doesn't come to solve the MT issue, It comes to solve the error
> > > report to application because of removal.
> > > Adding the check in subdev iterator doesn't make sense for this issue.
> > >
> > > Matan.
> >
> > If you add this check in the iterator itself, you would skip removed
> > devices before attempting operating upon them, right?
> >
> > Then it should probably help with your issue, unless you tested it and
> > verified that it didnt?
> >
> > Something like this:
> >
> > ---8<---
> >
> > diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> > index d81cc3ca6..62ddc0689 100644
> > --- a/drivers/net/failsafe/failsafe_private.h
> > +++ b/drivers/net/failsafe/failsafe_private.h
> > @@ -316,8 +316,12 @@ fs_find_next(struct rte_eth_dev *dev,
> >         subs = PRIV(dev)->subs;
> >         tail = PRIV(dev)->subs_tail;
> >         while (sid < tail) {
> > +               if (min_state > DEV_PROBED &&
> > +                   fs_is_removed(&subs[sid]))
> > +                       goto next;
> >                 if (subs[sid].state >= min_state)
> >                         break;
> > +next:
> >                 sid++;
> >         }
> >         *sid_out = sid;
> >
> > --->8---
> >
> > Only issue being that it is completely racy, but as this MT-unsafe
> > property is inescapable we might as well ignore it and go for KISS.
> >
> > If that's enough, I would prefer instead of having this additional
> > check added to all rte_eth operations.
> >
> 
> Ok, actually you were right here to do it this way. The "is_removed"
> check needs to happen after the operation attempt to effectively mitigate
> the possible race. Checking before attempting the call will be much less
> effective.
> 
> That being said, would it be cleaner to have eth_dev ops return -ENODEV
> directly, and check against it within fail-safe?
> 

I think that, given the "is_removed" semantics, we must return a Boolean value (any value other than 0 means that the device is removed), like other functions in the C library (for example isspace()).
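
For illustration, a small sketch of that convention using isspace() from
<ctype.h>; is_blank_char() is a hypothetical wrapper and is not part of
any patch in this series:

```c
#include <assert.h>
#include <ctype.h>

/*
 * isspace() only promises a nonzero value for a match, not exactly 1,
 * so callers of such predicates compare the result against 0.
 */
static inline int
is_blank_char(int c)
{
	assert(c >= 0 && c <= 255); /* value must fit in unsigned char */
	return isspace(c) != 0;
}
```

The fs_is_removed() helper in the patch follows the same pattern when it
tests rte_eth_dev_is_removed(...) != 0.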

Thanks.
 

> --
> Gaëtan Rivet
> 6WIND


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-13 17:09           ` Thomas Monjalon
@ 2017-12-14 10:40             ` Matan Azrad
  0 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-14 10:40 UTC (permalink / raw)
  To: Thomas Monjalon, Gaëtan Rivet; +Cc: Adrien Mazarguil, dev, stable



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, December 13, 2017 7:09 PM
> To: Gaëtan Rivet <gaetan.rivet@6wind.com>; Matan Azrad
> <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; dev@dpdk.org;
> stable@dpdk.org
> Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
> 
> 13/12/2017 17:09, Gaëtan Rivet:
> > On Wed, Dec 13, 2017 at 03:48:46PM +0000, Matan Azrad wrote:
> > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > > > > Fixes: b737a1e ("net/failsafe: support flow API")
> > > > > Cc: stable@dpdk.org
> > > > >
> > > >
> > > > This patch is not a fix.
> > > > It relies on an eth_dev API evolution. Without this evolution,
> > > > this patch is meaningless and would break compilation if backported in
> stable branch.
> > > >
> > >
> > > It is a fix because the bug is finally solved by this patch.
> > > I agree that it cannot be backported itself, but maybe all the series should
> be backported.
> > > Other idea:
> > > Add new patch which documents the bug and backport it.
> > > Remove it in this patch and remove cc stable from it.
> > > What do you think?
> > >
> >
> > I think you could write a crude version that would not rely on the
> > ethdev evolution (checking sdev->remove only), which would be
> > incomplete but still better than nothing.
> > And why not in this patch document the issue.
> > Without any dependency outside failsafe, this could be backported.
> >
> > Then complete the fix with the API evolution if the new devops is
> > accepted.
> 
> I think it is not worth the effort.
> It is a limitation in earlier releases and will be properly fixed with the new op.
> Please just remove the Cc:stable@dpdk.org.

OK.


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-14 10:40             ` Matan Azrad
@ 2017-12-14 10:48               ` Gaëtan Rivet
  2017-12-14 13:07                 ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Gaëtan Rivet @ 2017-12-14 10:48 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

On Thu, Dec 14, 2017 at 10:40:22AM +0000, Matan Azrad wrote:
> Hi Gaetan
> 

<snip>

> > >
> > > If you add this check in the iterator itself, you would skip removed
> > > devices before attempting operating upon them, right?
> > >
> > > Then it should probably help with your issue, unless you tested it and
> > > verified that it didnt?
> > >
> > > Something like this:
> > >
> > > ---8<---
> > >
> > > diff --git a/drivers/net/failsafe/failsafe_private.h
> > > b/drivers/net/failsafe/failsafe_private.h
> > > index d81cc3ca6..62ddc0689 100644
> > > --- a/drivers/net/failsafe/failsafe_private.h
> > > +++ b/drivers/net/failsafe/failsafe_private.h
> > > @@ -316,8 +316,12 @@ fs_find_next(struct rte_eth_dev *dev,
> > >         subs = PRIV(dev)->subs;
> > >         tail = PRIV(dev)->subs_tail;
> > >         while (sid < tail) {
> > > +               if (min_state > DEV_PROBED &&
> > > +                   fs_is_removed(&sub[sid]))
> > > +                       goto next;
> > >                 if (subs[sid].state >= min_state)
> > >                         break;
> > > +next:
> > >                 sid++;
> > >         }
> > >         *sid_out = sid;
> > >
> > > --->8---
> > >
> > > Only issue being that it is completely racy, but as this MT-unsafe
> > > property is inescapable we might as well ignore it and go for KISS.
> > >
> > > If that's enough, I would prefer instead of having this additional
> > > check added to all rte_eth operations.
> > >
> > 
> > Ok, actually you were right here to do it this way. The "is_removed"
> > check needs to happen after the operation attempt to effectively mitigate
> > the possible race. Checking before attempting the call will be much less
> > effective.
> > 
> > That being said, would it be cleaner to have eth_dev ops return -ENODEV
> > directly, and check against it within fail-safe?
> > 
> 
> I think that according to "is_removed" semantic we must return a Boolean value (Each value different from '0' means that the device is removed) like other functions in c library (for example isspace()).
> 

Sure, I wasn't discussing the interface proposed by rte_eth_dev_is_removed().

What I meant was to ask whether checking rte_eth_dev_is_removed() would
be more interesting in the ethdev layer, making the eth_dev_ops return
-ENODEV regardless of the previous error if this check is supported by
the driver and signal that the port is removed.

I think this information could be interesting to other systems, not just
fail-safe.

-- 
Gaëtan Rivet
6WIND


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-14 10:48               ` Gaëtan Rivet
@ 2017-12-14 13:07                 ` Matan Azrad
  2017-12-14 13:27                   ` Gaëtan Rivet
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-14 13:07 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

Hi Gaetan

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Thursday, December 14, 2017 12:49 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
> 
> On Thu, Dec 14, 2017 at 10:40:22AM +0000, Matan Azrad wrote:
> > Hi Gaetan
> >
> 
> <snip>
> 
> > > >
> > > > If you add this check in the iterator itself, you would skip
> > > > removed devices before attempting operating upon them, right?
> > > >
> > > > Then it should probably help with your issue, unless you tested it
> > > > and verified that it didnt?
> > > >
> > > > Something like this:
> > > >
> > > > ---8<---
> > > >
> > > > diff --git a/drivers/net/failsafe/failsafe_private.h
> > > > b/drivers/net/failsafe/failsafe_private.h
> > > > index d81cc3ca6..62ddc0689 100644
> > > > --- a/drivers/net/failsafe/failsafe_private.h
> > > > +++ b/drivers/net/failsafe/failsafe_private.h
> > > > @@ -316,8 +316,12 @@ fs_find_next(struct rte_eth_dev *dev,
> > > >         subs = PRIV(dev)->subs;
> > > >         tail = PRIV(dev)->subs_tail;
> > > >         while (sid < tail) {
> > > > +               if (min_state > DEV_PROBED &&
> > > > +                   fs_is_removed(&sub[sid]))
> > > > +                       goto next;
> > > >                 if (subs[sid].state >= min_state)
> > > >                         break;
> > > > +next:
> > > >                 sid++;
> > > >         }
> > > >         *sid_out = sid;
> > > >
> > > > --->8---
> > > >
> > > > Only issue being that it is completely racy, but as this MT-unsafe
> > > > property is inescapable we might as well ignore it and go for KISS.
> > > >
> > > > If that's enough, I would prefer instead of having this additional
> > > > check added to all rte_eth operations.
> > > >
> > >
> > > Ok, actually you were right here to do it this way. The "is_removed"
> > > check needs to happen after the operation attempt to effectively
> > > mitigate the possible race. Checking before attempting the call will
> > > be much less effective.
> > >
> > > That being said, would it be cleaner to have eth_dev ops return
> > > -ENODEV directly, and check against it within fail-safe?
> > >
> >
> > I think that according to "is_removed" semantic we must return a Boolean
> value (Each value different from '0' means that the device is removed) like
> other functions in c library (for example isspace()).
> >
> 
> Sure, I wasn't discussing the interface proposed by
> rte_eth_dev_is_removed().
> 
> What I meant was to ask whether checking rte_eth_dev_is_removed()
> would be more interesting in the ethdev layer, making the eth_dev_ops
> return -ENODEV regardless of the previous error if this check is supported by
> the driver and signal that the port is removed.
> 
> I think this information could be interesting to other systems, not just fail-
> safe.
> 

Ok, I understand now.
Interesting approach. Proposed plan:
	1. Update fs_link_update to use the rte_eth* functions.
	2. Maybe -EIO is preferable, because -ENODEV is already used for the "no port" error?
	3. Update all relevant rte_eth* functions to use "is_removed" in their error flows (one patch for the flow APIs and one for the others).
	4. Change the fail-safe checks in error flows to check the rte_eth* return values.
	5. Remove Cc: stable from the commit message.

What do you think?

> --
> Gaëtan Rivet
> 6WIND


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-14 13:07                 ` Matan Azrad
@ 2017-12-14 13:27                   ` Gaëtan Rivet
  2017-12-14 14:43                     ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Gaëtan Rivet @ 2017-12-14 13:27 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

On Thu, Dec 14, 2017 at 01:07:31PM +0000, Matan Azrad wrote:
> Hi Gaetan
> 
> > -----Original Message-----
> > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > Sent: Thursday, December 14, 2017 12:49 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> > Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
> > 
> > On Thu, Dec 14, 2017 at 10:40:22AM +0000, Matan Azrad wrote:
> > > Hi Gaetan
> > >
> > 

<snip>

> > > > Ok, actually you were right here to do it this way. The "is_removed"
> > > > check needs to happen after the operation attempt to effectively
> > > > mitigate the possible race. Checking before attempting the call will
> > > > be much less effective.
> > > >
> > > > That being said, would it be cleaner to have eth_dev ops return
> > > > -ENODEV directly, and check against it within fail-safe?
> > > >
> > >
> > > I think that according to "is_removed" semantic we must return a Boolean
> > value (Each value different from '0' means that the device is removed) like
> > other functions in c library (for example isspace()).
> > >
> > 
> > Sure, I wasn't discussing the interface proposed by
> > rte_eth_dev_is_removed().
> > 
> > What I meant was to ask whether checking rte_eth_dev_is_removed()
> > would be more interesting in the ethdev layer, making the eth_dev_ops
> > return -ENODEV regardless of the previous error if this check is supported by
> > the driver and signal that the port is removed.
> > 
> > I think this information could be interesting to other systems, not just fail-
> > safe.
> > 
> 
> Ok. Got you now.
> Interesting approach - plan:
> 	1. update fs_link_update to use rte_eth* functions.

I'm surprised it doesn't already.
Either the rte_eth* function was introduced after fail-safe, or there is
a potential issue to be wary of. I don't see a problem right now though.

> 	2. maybe -EIO is preferred because -ENODEV is used for no port error?

Good point, I didn't think about it.
Be prepared for some debate about the most relevant error code.
-EIO seems fine to me, but maybe use a wrapper for all this.

Something like:

---8<---

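static int
eth_error(uint16_t pid, int original_ret)
{
    int ret;

    /* Success needs no further check. */
    if (original_ret == 0)
        return original_ret;
    /* On failure, ask whether the port was removed. */
    ret = rte_eth_dev_is_removed(pid);
    if (ret == 0 || ret == -ENOTSUP)
        return original_ret;
    return -EIO;
}

int
rte_eth_ops_xyz(uint16_t pid)
{
    int ret;

    /* Placeholder: ops_xyz stands for any eth_dev_ops callback. */
    ret = eth_dev(pid).ops_xyz();
    return eth_error(pid, ret);
}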

--->8---

This way you would be able to change it easily and the logic would be
insulated.

> 	3. update all relevant rte_eth* to use "is_removed" in error flows(1 patch for flow APIs and 1 for the others).
> 	4. Change fs checks in error flows to check rte_eth* return values.
> 	5. Remove CC stable from commit massage.
> 
> What do you think?
> 

Agreed otherwise.

Thanks,

> > --
> > Gaëtan Rivet
> > 6WIND

-- 
Gaëtan Rivet
6WIND


* Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
  2017-12-14 13:27                   ` Gaëtan Rivet
@ 2017-12-14 14:43                     ` Matan Azrad
  0 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-14 14:43 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Adrien Mazarguil, Thomas Monjalon, dev, stable

Hi

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Thursday, December 14, 2017 3:27 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
> 
> On Thu, Dec 14, 2017 at 01:07:31PM +0000, Matan Azrad wrote:
> > Hi Gaetan
> >
> > > -----Original Message-----
> > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > Sent: Thursday, December 14, 2017 12:49 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org
> > > Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device
> > > handling
> > >
> > > On Thu, Dec 14, 2017 at 10:40:22AM +0000, Matan Azrad wrote:
> > > > Hi Gaetan
> > > >
> > >
> 
> <snip>
> 
> > > > > Ok, actually you were right here to do it this way. The "is_removed"
> > > > > check needs to happen after the operation attempt to effectively
> > > > > mitigate the possible race. Checking before attempting the call
> > > > > will be much less effective.
> > > > >
> > > > > That being said, would it be cleaner to have eth_dev ops return
> > > > > -ENODEV directly, and check against it within fail-safe?
> > > > >
> > > >
> > > > I think that according to "is_removed" semantic we must return a
> > > > Boolean
> > > value (Each value different from '0' means that the device is
> > > removed) like other functions in c library (for example isspace()).
> > > >
> > >
> > > Sure, I wasn't discussing the interface proposed by
> > > rte_eth_dev_is_removed().
> > >
> > > What I meant was to ask whether checking rte_eth_dev_is_removed()
> > > would be more interesting in the ethdev layer, making the
> > > eth_dev_ops return -ENODEV regardless of the previous error if this
> > > check is supported by the driver and signal that the port is removed.
> > >
> > > I think this information could be interesting to other systems, not
> > > just fail- safe.
> > >
> >
> > Ok. Got you now.
> > Interesting approach - plan:
> > 	1. update fs_link_update to use rte_eth* functions.
> 
> I'm surprised it doesn't already.
> Either the rte_eth* function was introduced after the failsafe, or be wary of
> potential issues. I don't see a problem right now though.
> 
> > 	2. maybe -EIO is preferred because -ENODEV is used for no port
> error?
> 
> Good point, didn't think about it.
> Prepare yourself maybe to some arguments about the most relevant error
> code. -EIO seems fine to me, but maybe use a wrapper for all this.
> 
> Something like:
> 
> ---8<---
> 
> static int
> eth_error(pid, int original_ret)
> {
>     int ret;
> 
>     if (original_ret == 0)
>         return original_ret;
>     ret = rte_eth_is_removed(pid);
>     if (ret == 0 || ret == -ENOTSUP)
>         return original_ret;
>     return -EIO;
> }
> 
> int
> rte_eth_ops_xyz(pid)
> {
>         int ret;
>         ret = eth_dev(pid).ops_xyz();
>         return eth_error(pid, ret);
> }
> 
> --->8---
> 
> This way you would be able to change it easily and the logic would be
> insulated.
> 

Nice.

> > 	3. update all relevant rte_eth* to use "is_removed" in error flows(1
> patch for flow APIs and 1 for the others).
> > 	4. Change fs checks in error flows to check rte_eth* return values.
> > 	5. Remove CC stable from commit massage.
> >
> > What do you think?
> >
> 
> Agreed otherwise.
> 

Will create V3, thanks!

> Thanks,
> 
> > > --
> > > Gaëtan Rivet
> > > 6WIND
> 
> --
> Gaëtan Rivet
> 6WIND


* [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack
  2017-12-13 14:29 ` [PATCH v2 0/4] Fail-safe fix removal handling lack Matan Azrad
                     ` (3 preceding siblings ...)
  2017-12-13 14:29   ` [PATCH v2 4/4] net/failsafe: fix removed device handling Matan Azrad
@ 2017-12-19 17:10   ` Matan Azrad
  2017-12-19 17:10     ` [PATCH v3 1/6] ethdev: add devop to check removal status Matan Azrad
                       ` (6 more replies)
  4 siblings, 7 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-19 17:10 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

There is a delay between the physical removal of a device and the moment
the sub-device PMDs get the RMV interrupt. During this window, DPDK PMDs and
applications do not yet know about the removal and may call sub-device
control operations, which should return an error.

This series adds a new ethdev operation to check for device removal, adds
support for it in the mlx PMDs, adjusts the ethdev APIs to return -EIO in
case of removal, and fixes the fail-safe bug in removal error reporting.
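
The error adjustment described above can be sketched outside DPDK roughly as
follows. This is only an illustration under stated assumptions: struct
mock_port and every mock_* name are hypothetical stand-ins, not ethdev types
or functions, and the real series performs the check through
rte_eth_dev_is_removed().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a port; not the real struct rte_eth_dev. */
struct mock_port {
	int removed;                   /* non-zero once the device is unplugged */
	int (*op)(struct mock_port *); /* some control operation */
};

/* Mirrors the "is_removed" idea: non-zero means the device is gone. */
static int
mock_is_removed(const struct mock_port *port)
{
	return port->removed;
}

/* Keep success and ordinary errors; report -EIO once the port is removed. */
static int
mock_eth_err(const struct mock_port *port, int ret)
{
	if (ret == 0)
		return 0;
	if (mock_is_removed(port))
		return -EIO;
	return ret;
}

/* Control operation that fails with an arbitrary driver error. */
static int
mock_op_fail_inval(struct mock_port *port)
{
	(void)port;
	return -EINVAL;
}

/* Wrapper in the style of the adjusted APIs: call the op, then filter. */
static int
mock_port_call(struct mock_port *port)
{
	return mock_eth_err(port, port->op(port));
}

/* Usage: the same failing op reports -EIO only after removal. */
static void
mock_demo(void)
{
	struct mock_port port = { 0, mock_op_fail_inval };

	assert(mock_port_call(&port) == -EINVAL);
	port.removed = 1;
	assert(mock_port_call(&port) == -EIO);
}
```

Because the removal check runs only on the error path, a successful call costs
nothing extra, and the caller can tell "device gone" apart from other failures
by a single return value.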

V2:
Remove ENODEV definition.
Remove checks from all mlx control commands.
Add new devop - "is_removed".
Implement it in mlx4 and mlx5.
Fix failsafe bug by the new devop.

V3:
Adjust ethdev APIs removal error report.
Change failsafe check to check eth_dev* return values.
Remove backporting of fail-safe patch.

Matan Azrad (6):
  ethdev: add devop to check removal status
  net/mlx4: support a device removal check operation
  net/mlx5: support a device removal check operation
  ethdev: adjust APIs removal error report
  ethdev: adjust flow APIs removal error report
  net/failsafe: fix removed device handling

 drivers/net/failsafe/failsafe_flow.c    |  18 +--
 drivers/net/failsafe/failsafe_ops.c     |  35 ++++--
 drivers/net/failsafe/failsafe_private.h |  12 ++
 drivers/net/mlx4/mlx4.c                 |   1 +
 drivers/net/mlx4/mlx4.h                 |   1 +
 drivers/net/mlx4/mlx4_ethdev.c          |  20 +++
 drivers/net/mlx5/mlx5.c                 |   2 +
 drivers/net/mlx5/mlx5.h                 |   1 +
 drivers/net/mlx5/mlx5_ethdev.c          |  20 +++
 lib/librte_ether/rte_ethdev.c           | 213 +++++++++++++++++++++-----------
 lib/librte_ether/rte_ethdev.h           |  68 +++++++++-
 lib/librte_ether/rte_ethdev_version.map |   7 ++
 lib/librte_ether/rte_flow.c             |  34 +++--
 lib/librte_ether/rte_flow.h             |   2 +
 14 files changed, 334 insertions(+), 100 deletions(-)

-- 
1.8.3.1


* [PATCH v3 1/6] ethdev: add devop to check removal status
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
@ 2017-12-19 17:10     ` Matan Azrad
  2017-12-19 17:20       ` Stephen Hemminger
  2018-01-07  9:53       ` Thomas Monjalon
  2017-12-19 17:10     ` [PATCH v3 2/6] net/mlx4: support a device removal check operation Matan Azrad
                       ` (5 subsequent siblings)
  6 siblings, 2 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-19 17:10 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

There is a delay between the physical removal of a device and the RMV
interrupt reaching the PMDs. During this window, DPDK PMDs and
applications do not yet know about the removal.

Currently, removal can be detected only by registering for the device
RMV event, and the notification arrives asynchronously, so there is no
way to detect a device removal synchronously.
Applications and other DPDK entities may want to check for a device
removal synchronously and take an immediate decision accordingly.

Add a new dev op called is_removed to allow DPDK entities to check the
removal status of an Ethernet device immediately.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.h           | 17 +++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  7 +++++++
 3 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 318af28..c759d0e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -142,7 +142,8 @@ enum {
 rte_eth_find_next(uint16_t port_id)
 {
 	while (port_id < RTE_MAX_ETHPORTS &&
-	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
 		port_id++;
 
 	if (port_id >= RTE_MAX_ETHPORTS)
@@ -286,8 +287,7 @@ struct rte_eth_dev *
 rte_eth_dev_is_valid_port(uint16_t port_id)
 {
 	if (port_id >= RTE_MAX_ETHPORTS ||
-	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
-	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
+	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
 		return 0;
 	else
 		return 1;
@@ -1118,6 +1118,28 @@ struct rte_eth_dev *
 }
 
 int
+rte_eth_dev_is_removed(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
+
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;
+
+	ret = dev->dev_ops->is_removed(dev);
+	if (ret != 0)
+		dev->state = RTE_ETH_DEV_REMOVED;
+
+	return ret;
+}
+
+int
 rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		       uint16_t nb_rx_desc, unsigned int socket_id,
 		       const struct rte_eth_rxconf *rx_conf,
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 341c2d6..3aa9d3f 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1196,6 +1196,9 @@ struct rte_eth_dcb_info {
 typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
 /** <@internal Function used to reset a configured Ethernet device. */
 
+typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
+/**< @internal Function used to detect an Ethernet device removal. */
+
 typedef void (*eth_promiscuous_enable_t)(struct rte_eth_dev *dev);
 /**< @internal Function used to enable the RX promiscuous mode of an Ethernet device. */
 
@@ -1525,6 +1528,8 @@ struct eth_dev_ops {
 	eth_dev_close_t            dev_close;     /**< Close device. */
 	eth_dev_reset_t		   dev_reset;	  /**< Reset device. */
 	eth_link_update_t          link_update;   /**< Get device link state. */
+	eth_is_removed_t           is_removed;
+	/**< Check if the device was physically removed. */
 
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON. */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF. */
@@ -1711,6 +1716,7 @@ enum rte_eth_dev_state {
 	RTE_ETH_DEV_UNUSED = 0,
 	RTE_ETH_DEV_ATTACHED,
 	RTE_ETH_DEV_DEFERRED,
+	RTE_ETH_DEV_REMOVED,
 };
 
 /**
@@ -1997,6 +2003,17 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
 void _rte_eth_dev_reset(struct rte_eth_dev *dev);
 
 /**
+ * Check if an Ethernet device was physically removed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - 1 when the Ethernet device is removed, otherwise 0.
+ */
+int
+rte_eth_dev_is_removed(uint16_t port_id);
+
+/**
  * Allocate and set up a receive queue for an Ethernet device.
  *
  * The function allocates a contiguous block of memory for *nb_rx_desc*
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..78547ff 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -198,6 +198,13 @@ DPDK_17.11 {
 
 } DPDK_17.08;
 
+DPDK_18.02 {
+	global:
+
+	rte_eth_dev_is_removed;
+
+} DPDK_17.11;
+
 EXPERIMENTAL {
 	global:
 
-- 
1.8.3.1
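
For readers following the thread, the state caching done by
rte_eth_dev_is_removed() in the patch above can be sketched stand-alone as
follows. This is a minimal sketch, assuming simplified mock types: struct
mock_dev, enum mock_state and all mock_* functions are hypothetical and only
imitate the order of checks in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device states, loosely following enum rte_eth_dev_state. */
enum mock_state { MOCK_UNUSED = 0, MOCK_ATTACHED, MOCK_REMOVED };

/* Hypothetical device; not the real struct rte_eth_dev. */
struct mock_dev {
	enum mock_state state;
	int (*is_removed)(struct mock_dev *dev); /* optional driver op */
	int hw_gone;          /* what the "hardware" would report */
	unsigned int queries; /* how many times the driver op ran */
};

/* Mock driver op: counts calls and reports the hardware status. */
static int
mock_drv_is_removed(struct mock_dev *dev)
{
	dev->queries++;
	return dev->hw_gone;
}

/* Same order of checks as the patch: port, op, cached state, driver. */
static int
mock_dev_is_removed(struct mock_dev *dev)
{
	int ret;

	if (dev->state == MOCK_UNUSED)
		return 0; /* invalid port: reported as not removed */
	if (dev->is_removed == NULL)
		return 0; /* driver cannot tell: reported as not removed */
	if (dev->state == MOCK_REMOVED)
		return 1; /* cached result, the driver is not queried again */
	ret = dev->is_removed(dev);
	if (ret != 0)
		dev->state = MOCK_REMOVED;
	return ret;
}

/* Usage: the driver is queried until the removal is seen, then never again. */
static void
mock_demo(void)
{
	struct mock_dev dev = { MOCK_ATTACHED, mock_drv_is_removed, 0, 0 };

	assert(mock_dev_is_removed(&dev) == 0);
	dev.hw_gone = 1;
	assert(mock_dev_is_removed(&dev) == 1);
	assert(mock_dev_is_removed(&dev) == 1);
	assert(dev.queries == 2);
}
```

Caching the removed state means that repeated checks after a removal do not
touch the (possibly absent) hardware again.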


* [PATCH v3 2/6] net/mlx4: support a device removal check operation
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2017-12-19 17:10     ` [PATCH v3 1/6] ethdev: add devop to check removal status Matan Azrad
@ 2017-12-19 17:10     ` Matan Azrad
  2017-12-19 17:10     ` [PATCH v3 3/6] net/mlx5: " Matan Azrad
                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-19 17:10 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

Add support for getting the removal status of an mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx4/mlx4.c        |  1 +
 drivers/net/mlx4/mlx4.h        |  1 +
 drivers/net/mlx4/mlx4_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index f9e4f9d..3cde640 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -256,6 +256,7 @@ struct mlx4_conf {
 	.filter_ctrl = mlx4_filter_ctrl,
 	.rx_queue_intr_enable = mlx4_rx_intr_enable,
 	.rx_queue_intr_disable = mlx4_rx_intr_disable,
+	.is_removed = mlx4_is_removed,
 };
 
 /**
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index 3aeef87..0eaba89 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -165,6 +165,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
 int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
 		       struct rte_eth_fc_conf *fc_conf);
 const uint32_t *mlx4_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+int mlx4_is_removed(struct rte_eth_dev *dev);
 
 /* mlx4_intr.c */
 
diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
index 2f69e7d..0d46f5a 100644
--- a/drivers/net/mlx4/mlx4_ethdev.c
+++ b/drivers/net/mlx4/mlx4_ethdev.c
@@ -1060,3 +1060,23 @@ enum rxmode_toggle {
 	}
 	return NULL;
 }
+
+/**
+ * Check if mlx4 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx4_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1


* [PATCH v3 3/6] net/mlx5: support a device removal check operation
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2017-12-19 17:10     ` [PATCH v3 1/6] ethdev: add devop to check removal status Matan Azrad
  2017-12-19 17:10     ` [PATCH v3 2/6] net/mlx4: support a device removal check operation Matan Azrad
@ 2017-12-19 17:10     ` Matan Azrad
  2017-12-19 17:10     ` [PATCH v3 4/6] ethdev: adjust APIs removal error report Matan Azrad
                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-19 17:10 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

Add support for getting the removal status of an mlx5 device.
It is not supported in a secondary process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        |  2 ++
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 0548d17..e0b781b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -303,6 +303,7 @@ struct mlx5_args {
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static const struct eth_dev_ops mlx5_dev_sec_ops = {
@@ -350,6 +351,7 @@ struct mlx5_args {
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e6a69b8..2ec7ae7 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -208,6 +208,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
 int mlx5_set_link_up(struct rte_eth_dev *dev);
 void priv_dev_select_tx_function(struct priv *priv, struct rte_eth_dev *dev);
 void priv_dev_select_rx_function(struct priv *priv, struct rte_eth_dev *dev);
+int mlx5_is_removed(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index a3cef68..5cf0849 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1474,3 +1474,23 @@ struct priv *
 		dev->rx_pkt_burst = mlx5_rx_burst;
 	}
 }
+
+/**
+ * Check if mlx5 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx5_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1


* [PATCH v3 4/6] ethdev: adjust APIs removal error report
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                       ` (2 preceding siblings ...)
  2017-12-19 17:10     ` [PATCH v3 3/6] net/mlx5: " Matan Azrad
@ 2017-12-19 17:10     ` Matan Azrad
  2018-01-07  9:56       ` Thomas Monjalon
  2017-12-19 17:10     ` [PATCH v3 5/6] ethdev: adjust flow " Matan Azrad
                       ` (2 subsequent siblings)
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-19 17:10 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

The rte_eth_dev_is_removed() API was added to detect a device removal
synchronously.

When a device removal occurs during the execution of a control command,
many different errors can be reported to the user.

Adjust the error reports of all ethdev APIs to return -EIO when
rte_eth_dev_is_removed() indicates that the device was removed.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_ethdev.c | 187 ++++++++++++++++++++++++++----------------
 lib/librte_ether/rte_ethdev.h |  51 +++++++++++-
 2 files changed, 165 insertions(+), 73 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index c759d0e..301d108 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -362,6 +362,16 @@ struct rte_eth_dev *
 	return -ENODEV;
 }
 
+static int
+eth_err(uint16_t port_id, int ret)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return -EIO;
+	return ret;
+}
+
 /* attach the new device, then store port_id of the device */
 int
 rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
@@ -516,7 +526,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
+							     rx_queue_id));
 
 }
 
@@ -542,7 +553,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_stop(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_stop(dev, rx_queue_id));
 
 }
 
@@ -568,7 +579,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_start(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_start(dev,
+							     tx_queue_id));
 
 }
 
@@ -594,7 +606,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_stop(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_stop(dev, tx_queue_id));
 
 }
 
@@ -912,7 +924,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	/* Initialize Rx profiling if enabled at compilation time. */
@@ -922,7 +934,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	return 0;
@@ -1022,7 +1034,7 @@ struct rte_eth_dev *
 	if (diag == 0)
 		dev->data->dev_started = 1;
 	else
-		return diag;
+		return eth_err(port_id, diag);
 
 	rte_eth_dev_config_restore(port_id);
 
@@ -1064,7 +1076,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_up, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_up)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_up)(dev));
 }
 
 int
@@ -1077,7 +1089,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_down, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_down)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_down)(dev));
 }
 
 void
@@ -1114,7 +1126,7 @@ struct rte_eth_dev *
 	rte_eth_dev_stop(port_id);
 	ret = dev->dev_ops->dev_reset(dev);
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -1238,7 +1250,7 @@ struct rte_eth_dev *
 			dev->data->min_rx_buf_size = mbp_buf_size;
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 /**
@@ -1357,8 +1369,8 @@ struct rte_eth_dev *
 					  &local_conf.offloads);
 	}
 
-	return (*dev->dev_ops->tx_queue_setup)(dev, tx_queue_id, nb_tx_desc,
-					       socket_id, &local_conf);
+	return eth_err(port_id, (*dev->dev_ops->tx_queue_setup)(dev,
+		       tx_queue_id, nb_tx_desc, socket_id, &local_conf));
 }
 
 void
@@ -1414,14 +1426,16 @@ struct rte_eth_dev *
 rte_eth_tx_done_cleanup(uint16_t port_id, uint16_t queue_id, uint32_t free_cnt)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	int ret;
 
 	/* Validate Input Data. Bail if not valid or not supported. */
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
 
 	/* Call driver to free pending mbufs. */
-	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
-			free_cnt);
+	ret = (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
+					       free_cnt);
+	return eth_err(port_id, ret);
 }
 
 void
@@ -1558,7 +1572,7 @@ struct rte_eth_dev *
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
 	stats->rx_nombuf = dev->data->rx_mbuf_alloc_failed;
-	return (*dev->dev_ops->stats_get)(dev, stats);
+	return eth_err(port_id, (*dev->dev_ops->stats_get)(dev, stats));
 }
 
 int
@@ -1604,12 +1618,12 @@ struct rte_eth_dev *
 		count = (*dev->dev_ops->xstats_get_names_by_id)(dev, NULL,
 				NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	}
 	if (dev->dev_ops->xstats_get_names != NULL) {
 		count = (*dev->dev_ops->xstats_get_names)(dev, NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	} else
 		count = 0;
 
@@ -1736,8 +1750,12 @@ struct rte_eth_dev *
 	}
 
 	/* Fill xstats_names_copy structure */
-	rte_eth_xstats_get_names(port_id, xstats_names_copy, expected_entries);
-
+	ret = rte_eth_xstats_get_names(port_id, xstats_names_copy,
+				       expected_entries);
+	if (ret < 0) {
+		free(xstats_names_copy);
+		return ret;
+	}
 	/* Filter stats */
 	for (i = 0; i < size; i++) {
 		if (ids[i] >= expected_entries) {
@@ -1810,7 +1828,7 @@ struct rte_eth_dev *
 			xstats_names + cnt_used_entries,
 			size - cnt_used_entries);
 		if (cnt_driver_entries < 0)
-			return cnt_driver_entries;
+			return eth_err(port_id, cnt_driver_entries);
 		cnt_used_entries += cnt_driver_entries;
 	}
 
@@ -1830,7 +1848,10 @@ struct rte_eth_dev *
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
-	expected_entries = get_xstats_count(port_id);
+	ret = get_xstats_count(port_id);
+	if (ret < 0)
+		return ret;
+	expected_entries = (uint16_t)ret;
 	struct rte_eth_xstat xstats[expected_entries];
 	dev = &rte_eth_devices[port_id];
 
@@ -1901,6 +1922,7 @@ struct rte_eth_dev *
 	signed int xcount = 0;
 	uint64_t val, *stats_ptr;
 	uint16_t nb_rxqs, nb_txqs;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -1923,7 +1945,7 @@ struct rte_eth_dev *
 				     (n > count) ? n - count : 0);
 
 		if (xcount < 0)
-			return xcount;
+			return eth_err(port_id, xcount);
 	}
 
 	if (n < count + xcount || xstats == NULL)
@@ -1931,7 +1953,9 @@ struct rte_eth_dev *
 
 	/* now fill the xstats structure */
 	count = 0;
-	rte_eth_stats_get(port_id, &eth_stats);
+	ret = rte_eth_stats_get(port_id, &eth_stats);
+	if (ret < 0)
+		return ret;
 
 	/* global stats */
 	for (i = 0; i < RTE_NB_STATS; i++) {
@@ -2011,8 +2035,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_tx_queue_stats_mapping(uint16_t port_id, uint16_t tx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, tx_queue_id, stat_idx,
-			STAT_QMAP_TX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, tx_queue_id,
+						stat_idx, STAT_QMAP_TX));
 }
 
 
@@ -2020,8 +2044,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id, uint16_t rx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, rx_queue_id, stat_idx,
-			STAT_QMAP_RX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, rx_queue_id,
+						stat_idx, STAT_QMAP_RX));
 }
 
 int
@@ -2033,7 +2057,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->fw_version_get, -ENOTSUP);
-	return (*dev->dev_ops->fw_version_get)(dev, fw_version, fw_size);
+	return eth_err(port_id, (*dev->dev_ops->fw_version_get)(dev,
+							fw_version, fw_size));
 }
 
 void
@@ -2123,7 +2148,7 @@ struct rte_eth_dev *
 	if (!ret)
 		dev->data->mtu = mtu;
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2163,7 +2188,7 @@ struct rte_eth_dev *
 			vfc->ids[vidx] &= ~(UINT64_C(1) << vbit);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2196,7 +2221,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_tpid_set, -ENOTSUP);
 
-	return (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type, tpid);
+	return eth_err(port_id, (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type,
+							       tpid));
 }
 
 int
@@ -2274,7 +2300,7 @@ struct rte_eth_dev *
 					    &dev->data->dev_conf.rxmode);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2309,9 +2335,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_pvid_set, -ENOTSUP);
-	(*dev->dev_ops->vlan_pvid_set)(dev, pvid, on);
 
-	return 0;
+	return eth_err(port_id, (*dev->dev_ops->vlan_pvid_set)(dev, pvid, on));
 }
 
 int
@@ -2323,7 +2348,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_get, -ENOTSUP);
 	memset(fc_conf, 0, sizeof(*fc_conf));
-	return (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf));
 }
 
 int
@@ -2339,7 +2364,7 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_set, -ENOTSUP);
-	return (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf));
 }
 
 int
@@ -2357,7 +2382,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	/* High water, low water validation are device specific */
 	if  (*dev->dev_ops->priority_flow_ctrl_set)
-		return (*dev->dev_ops->priority_flow_ctrl_set)(dev, pfc_conf);
+		return eth_err(port_id, (*dev->dev_ops->priority_flow_ctrl_set)
+					(dev, pfc_conf));
 	return -ENOTSUP;
 }
 
@@ -2432,7 +2458,8 @@ struct rte_eth_dev *
 		return ret;
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_update, -ENOTSUP);
-	return (*dev->dev_ops->reta_update)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_update)(dev, reta_conf,
+							     reta_size));
 }
 
 int
@@ -2452,7 +2479,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_query, -ENOTSUP);
-	return (*dev->dev_ops->reta_query)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_query)(dev, reta_conf,
+							    reta_size));
 }
 
 int
@@ -2464,7 +2492,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_update, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_update)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_update)(dev,
+								 rss_conf));
 }
 
 int
@@ -2476,7 +2505,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_conf_get, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_conf_get)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_conf_get)(dev,
+								   rss_conf));
 }
 
 int
@@ -2498,7 +2528,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_add, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_add)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_add)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2521,7 +2552,8 @@ struct rte_eth_dev *
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_del, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_del)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_del)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2532,7 +2564,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_on, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_on)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_on)(dev));
 }
 
 int
@@ -2543,7 +2575,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_off, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_off)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_off)(dev));
 }
 
 /*
@@ -2619,7 +2651,7 @@ struct rte_eth_dev *
 		dev->data->mac_pool_sel[index] |= (1ULL << pool);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2745,7 +2777,7 @@ struct rte_eth_dev *
 					&dev->data->hash_mac_addrs[index]);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2758,7 +2790,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->uc_all_hash_table_set, -ENOTSUP);
-	return (*dev->dev_ops->uc_all_hash_table_set)(dev, on);
+	return eth_err(port_id, (*dev->dev_ops->uc_all_hash_table_set)(dev,
+								       on));
 }
 
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -2788,7 +2821,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_queue_rate_limit, -ENOTSUP);
-	return (*dev->dev_ops->set_queue_rate_limit)(dev, queue_idx, tx_rate);
+	return eth_err(port_id, (*dev->dev_ops->set_queue_rate_limit)(dev,
+							queue_idx, tx_rate));
 }
 
 int
@@ -2826,7 +2860,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_set, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_set)(dev, mirror_conf, rule_id, on);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_set)(dev,
+						mirror_conf, rule_id, on));
 }
 
 int
@@ -2839,7 +2874,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_reset, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_reset)(dev, rule_id);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_reset)(dev,
+								   rule_id));
 }
 
 int
@@ -3061,7 +3097,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_enable)(dev,
+								queue_id));
 }
 
 int
@@ -3075,7 +3112,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_disable)(dev,
+								queue_id));
 }
 
 
@@ -3103,7 +3141,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);
-	return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op, arg);
+	return eth_err(port_id, (*dev->dev_ops->filter_ctrl)(dev, filter_type,
+							     filter_op, arg));
 }
 
 void *
@@ -3353,7 +3392,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_mc_addr_list, -ENOTSUP);
-	return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+	return eth_err(port_id, dev->dev_ops->set_mc_addr_list(dev,
+						mc_addr_set, nb_mc_addr));
 }
 
 int
@@ -3365,7 +3405,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_enable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_enable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_enable)(dev));
 }
 
 int
@@ -3377,7 +3417,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_disable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_disable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_disable)(dev));
 }
 
 int
@@ -3390,7 +3430,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_rx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_rx_timestamp)(dev, timestamp, flags);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_rx_timestamp)
+				(dev, timestamp, flags));
 }
 
 int
@@ -3403,7 +3444,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_tx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_tx_timestamp)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_tx_timestamp)
+				(dev, timestamp));
 }
 
 int
@@ -3415,7 +3457,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_adjust_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_adjust_time)(dev, delta);
+	return eth_err(port_id, (*dev->dev_ops->timesync_adjust_time)(dev,
+								      delta));
 }
 
 int
@@ -3427,7 +3470,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_time)(dev,
+								timestamp));
 }
 
 int
@@ -3439,7 +3483,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_write_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_write_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_write_time)(dev,
+								timestamp));
 }
 
 int
@@ -3451,7 +3496,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_reg, -ENOTSUP);
-	return (*dev->dev_ops->get_reg)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_reg)(dev, info));
 }
 
 int
@@ -3463,7 +3508,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom_length, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom_length)(dev);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom_length)(dev));
 }
 
 int
@@ -3475,7 +3520,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom)(dev, info));
 }
 
 int
@@ -3487,7 +3532,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->set_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->set_eeprom)(dev, info));
 }
 
 int
@@ -3502,7 +3547,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	memset(dcb_info, 0, sizeof(struct rte_eth_dcb_info));
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_dcb_info, -ENOTSUP);
-	return (*dev->dev_ops->get_dcb_info)(dev, dcb_info);
+	return eth_err(port_id, (*dev->dev_ops->get_dcb_info)(dev, dcb_info));
 }
 
 int
@@ -3525,7 +3570,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_eth_type_conf,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev, l2_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev,
+								l2_tunnel));
 }
 
 int
@@ -3556,7 +3602,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_offload_set,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_offload_set)(dev,
+							l2_tunnel, mask, en));
 }
 
 static void
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 3aa9d3f..936bf79 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2047,6 +2047,7 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
  *   memory buffers to populate each descriptor of the receive ring.
  * @return
  *   - 0: Success, receive queue correctly set up.
+ *   - -EIO: if device is removed.
  *   - -EINVAL: The size of network buffers which can be allocated from the
  *      memory pool does not fit the various buffer sizes allowed by the
  *      device controller.
@@ -2147,6 +2148,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_start(uint16_t port_id, uint16_t rx_queue_id);
@@ -2163,6 +2165,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_stop(uint16_t port_id, uint16_t rx_queue_id);
@@ -2180,6 +2183,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_start(uint16_t port_id, uint16_t tx_queue_id);
@@ -2196,6 +2200,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_stop(uint16_t port_id, uint16_t tx_queue_id);
@@ -2297,7 +2302,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  *   - (-EINVAL) if port identifier is invalid.
  *   - (-ENOTSUP) if hardware doesn't support this function.
  *   - (-EPERM) if not ran from the primary process.
- *   - (-EIO) if re-initialisation failed.
+ *   - (-EIO) if re-initialisation failed or device is removed.
  *   - (-ENOMEM) if the reset failed due to OOM.
  *   - (-EAGAIN) if the reset temporarily failed and should be retried later.
  */
@@ -2533,6 +2538,7 @@ int rte_eth_xstats_get_by_id(uint16_t port_id, const uint64_t *ids,
  * @return
  *    0 on success
  *    -ENODEV for invalid port_id,
+ *    -EIO if device is removed,
  *    -EINVAL if the xstat_name doesn't exist in port_id
  */
 int rte_eth_xstats_get_id_by_name(uint16_t port_id, const char *xstat_name,
@@ -2624,6 +2630,7 @@ int rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (>0) if *fw_size* is not enough to store firmware version, return
  *          the size of the non truncated string.
  */
@@ -2695,6 +2702,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if *mtu* invalid.
  *   - (-EBUSY) if operation is not allowed when the port is running
  */
@@ -2715,6 +2723,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSYS) if VLAN filtering on *port_id* disabled.
  *   - (-EINVAL) if *vlan_id* > 4095.
  */
@@ -2757,6 +2766,7 @@ int rte_eth_dev_set_vlan_strip_on_queue(uint16_t port_id, uint16_t rx_queue_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN TPID setup is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
 				    enum rte_vlan_type vlan_type,
@@ -2781,6 +2791,7 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_offload(uint16_t port_id, int offload_mask);
 
@@ -3522,6 +3533,7 @@ struct rte_eth_dev_tx_buffer {
  * @return
  *   Failure: < 0
  *     -ENODEV: Invalid interface
+ *     -EIO: device is removed
  *     -ENOTSUP: Driver does not support function
  *   Success: >= 0
  *     0-n: Number of packets freed. More packets may still remain in ring that
@@ -3634,6 +3646,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_enable(uint16_t port_id, uint16_t queue_id);
 
@@ -3655,6 +3668,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_disable(uint16_t port_id, uint16_t queue_id);
 
@@ -3712,6 +3726,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_on(uint16_t port_id);
 
@@ -3726,6 +3741,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_off(uint16_t port_id);
 
@@ -3740,6 +3756,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support flow control.
  *   - (-ENODEV)  if *port_id* invalid.
+ *   - (-EIO)  if device is removed.
  */
 int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3756,7 +3773,7 @@ int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3774,7 +3791,7 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support priority flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
 				struct rte_eth_pfc_conf *pfc_conf);
@@ -3794,6 +3811,7 @@ int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
  *   - (0) if successfully added or *mac_addr" was already added.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port* is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSPC) if no more MAC addresses can be added.
  *   - (-EINVAL) if MAC address is invalid.
  */
@@ -3845,6 +3863,7 @@ int rte_eth_dev_default_mac_addr_set(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_update(uint16_t port,
 				struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3864,6 +3883,7 @@ int rte_eth_dev_rss_reta_update(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_query(uint16_t port,
 			       struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3885,6 +3905,7 @@ int rte_eth_dev_rss_reta_query(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
@@ -3905,6 +3926,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_all_hash_table_set(uint16_t port, uint8_t on);
@@ -3928,6 +3950,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if the mr_conf information is not correct.
  */
 int rte_eth_mirror_rule_set(uint16_t port_id,
@@ -3946,6 +3969,7 @@ int rte_eth_mirror_rule_set(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_mirror_rule_reset(uint16_t port_id,
@@ -3964,6 +3988,7 @@ int rte_eth_mirror_rule_reset(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -3979,6 +4004,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
  */
@@ -3996,6 +4022,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support RSS.
  */
 int
@@ -4017,6 +4044,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4039,6 +4067,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4057,6 +4086,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this filter type.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_filter_supported(uint16_t port_id,
 		enum rte_filter_type filter_type);
@@ -4077,6 +4107,7 @@ int rte_eth_dev_filter_supported(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
@@ -4092,6 +4123,7 @@ int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  */
 int rte_eth_dev_get_dcb_info(uint16_t port_id,
@@ -4299,6 +4331,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info);
@@ -4312,6 +4345,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (>=0) EEPROM size if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom_length(uint16_t port_id);
@@ -4328,6 +4362,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4344,6 +4379,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_set_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4362,6 +4398,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if PMD of *port_id* doesn't support multicast filtering.
  *   - (-ENOSPC) if *port_id* has not enough multicast filtering resources.
  */
@@ -4378,6 +4415,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_enable(uint16_t port_id);
@@ -4391,6 +4429,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_disable(uint16_t port_id);
@@ -4410,6 +4449,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
@@ -4427,6 +4467,7 @@ int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
@@ -4446,6 +4487,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_adjust_time(uint16_t port_id, int64_t delta);
@@ -4481,6 +4523,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
@@ -4521,6 +4564,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4548,6 +4592,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v3 5/6] ethdev: adjust flow APIs removal error report
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                       ` (3 preceding siblings ...)
  2017-12-19 17:10     ` [PATCH v3 4/6] ethdev: adjust APIs removal error report Matan Azrad
@ 2017-12-19 17:10     ` Matan Azrad
  2018-01-07  9:58       ` Thomas Monjalon
  2017-12-19 17:10     ` [PATCH v3 6/6] net/failsafe: fix removed device handling Matan Azrad
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-19 17:10 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during flow command execution, many
different errors can be reported to the user.

Adjust all flow APIs error reports to return -EIO in case of device
removal using rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_flow.c | 34 +++++++++++++++++++++++++++-------
 lib/librte_ether/rte_flow.h |  2 ++
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
index 6659063..be481ce 100644
--- a/lib/librte_ether/rte_flow.c
+++ b/lib/librte_ether/rte_flow.c
@@ -106,6 +106,18 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
 };
 
+static int
+flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return rte_flow_error_set(error, EIO,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(EIO));
+	return ret;
+}
+
 /* Get generic flow operations structure from a port. */
 const struct rte_flow_ops *
 rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
@@ -144,7 +156,8 @@ struct rte_flow_desc_data {
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->validate))
-		return ops->validate(dev, attr, pattern, actions, error);
+		return flow_err(port_id, ops->validate(dev, attr, pattern,
+						       actions, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -159,12 +172,17 @@ struct rte_flow *
 		struct rte_flow_error *error)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	struct rte_flow *flow;
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
 
 	if (unlikely(!ops))
 		return NULL;
-	if (likely(!!ops->create))
-		return ops->create(dev, attr, pattern, actions, error);
+	if (likely(!!ops->create)) {
+		flow = ops->create(dev, attr, pattern, actions, error);
+		if (flow == NULL)
+			flow_err(port_id, -rte_errno, error);
+		return flow;
+	}
 	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 			   NULL, rte_strerror(ENOSYS));
 	return NULL;
@@ -182,7 +200,8 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->destroy))
-		return ops->destroy(dev, flow, error);
+		return flow_err(port_id, ops->destroy(dev, flow, error),
+				error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -199,7 +218,7 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->flush))
-		return ops->flush(dev, error);
+		return flow_err(port_id, ops->flush(dev, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -219,7 +238,8 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->query))
-		return ops->query(dev, flow, action, data, error);
+		return flow_err(port_id, ops->query(dev, flow, action, data,
+						    error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -237,7 +257,7 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->isolate))
-		return ops->isolate(dev, set, error);
+		return flow_err(port_id, ops->isolate(dev, set, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index 47c88ea..180438f 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -1237,6 +1237,8 @@ struct rte_flow_error {
  *
  *   -ENOSYS: underlying device does not support this functionality.
  *
+ *   -EIO: underlying device is removed.
+ *
  *   -EINVAL: unknown or invalid rule specification.
  *
  *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v3 6/6] net/failsafe: fix removed device handling
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                       ` (4 preceding siblings ...)
  2017-12-19 17:10     ` [PATCH v3 5/6] ethdev: adjust flow " Matan Azrad
@ 2017-12-19 17:10     ` Matan Azrad
  2017-12-19 22:21       ` Gaëtan Rivet
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-19 17:10 UTC (permalink / raw)
  To: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet; +Cc: dev

There is a delay between the physical removal of a device and the
moment the sub-device PMDs get an RMV interrupt. During this window,
DPDK PMDs and applications do not yet know about the removal and may
call sub-device control operations, which should return an error.

In the previous code this error was reported to the application,
contrary to the fail-safe principle that the application should not be
aware of device removal.

Add a removal check in each relevant control command error flow and
do not report an error to the application when the sub-device is
removed.

Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
Fixes: b737a1e ("net/failsafe: support flow API")

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
 drivers/net/failsafe/failsafe_ops.c     | 35 ++++++++++++++++++++++-----------
 drivers/net/failsafe/failsafe_private.h | 12 +++++++++++
 3 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
index 153ceee..123acb4 100644
--- a/drivers/net/failsafe/failsafe_flow.c
+++ b/drivers/net/failsafe/failsafe_flow.c
@@ -87,7 +87,7 @@
 		DEBUG("Calling rte_flow_validate on sub_device %d", i);
 		ret = rte_flow_validate(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_flow_validate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -111,7 +111,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (flow->flows[i] == NULL) {
+		if (flow->flows[i] == NULL && fs_is_error(sdev, -rte_errno)) {
 			ERROR("Failed to create flow on sub_device %d",
 				i);
 			goto err;
@@ -150,7 +150,7 @@
 			continue;
 		local_ret = rte_flow_destroy(PORT_ID(sdev),
 				flow->flows[i], error);
-		if (local_ret) {
+		if (fs_is_error(sdev, local_ret)) {
 			ERROR("Failed to destroy flow on sub_device %d: %d",
 					i, local_ret);
 			if (ret == 0)
@@ -175,7 +175,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_flow_flush on sub_device %d", i);
 		ret = rte_flow_flush(PORT_ID(sdev), error);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_flow_flush failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -199,8 +199,12 @@
 
 	sdev = TX_SUBDEV(dev);
 	if (sdev != NULL) {
-		return rte_flow_query(PORT_ID(sdev),
-				flow->flows[SUB_ID(sdev)], type, arg, error);
+		int ret = rte_flow_query(PORT_ID(sdev),
+					 flow->flows[SUB_ID(sdev)],
+					 type, arg, error);
+
+		if (fs_is_error(sdev, ret))
+			return ret;
 	}
 	WARN("No active sub_device to query about its flow");
 	return -1;
@@ -223,7 +227,7 @@
 			WARN("flow isolation mode of sub_device %d in incoherent state.",
 				i);
 		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_flow_isolate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index e16a590..313ea2f 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -121,6 +121,8 @@
 					dev->data->nb_tx_queues,
 					&dev->data->dev_conf);
 		if (ret) {
+			if (!fs_is_error(sdev, ret))
+				continue;
 			ERROR("Could not configure sub_device %d", i);
 			return ret;
 		}
@@ -163,8 +165,11 @@
 			continue;
 		DEBUG("Starting sub_device %d", i);
 		ret = rte_eth_dev_start(PORT_ID(sdev));
-		if (ret)
+		if (ret) {
+			if (!fs_is_error(sdev, ret))
+				continue;
 			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -196,7 +201,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
 		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -215,7 +220,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
 		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -300,7 +305,7 @@
 				rx_queue_id,
 				nb_rx_desc, socket_id,
 				rx_conf, mb_pool);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("RX queue setup failed for sub_device %d", i);
 			goto free_rxq;
 		}
@@ -366,7 +371,7 @@
 				tx_queue_id,
 				nb_tx_desc, socket_id,
 				tx_conf);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("TX queue setup failed for sub_device %d", i);
 			goto free_txq;
 		}
@@ -445,7 +450,8 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling link_update on sub_device %d", i);
 		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1) {
+		if (ret && ret != -1 && sdev->remove == 0 &&
+		    rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
 			ERROR("Link update failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -469,6 +475,7 @@
 fs_stats_get(struct rte_eth_dev *dev,
 	     struct rte_eth_stats *stats)
 {
+	struct rte_eth_stats backup;
 	struct sub_device *sdev;
 	uint8_t i;
 	int ret;
@@ -478,14 +485,20 @@
 		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
 		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
 
+		rte_memcpy(&backup, snapshot, sizeof(backup));
 		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
 		if (ret) {
+			if (!fs_is_error(sdev, ret)) {
+				rte_memcpy(snapshot, &backup, sizeof(backup));
+				goto inc;
+			}
 			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
 				  i, ret);
 			*timestamp = 0;
 			return ret;
 		}
 		*timestamp = rte_rdtsc();
+inc:
 		failsafe_stats_increment(stats, snapshot);
 	}
 	return 0;
@@ -598,7 +611,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
 		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -617,7 +630,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
 		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -651,7 +664,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
 		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -688,7 +701,7 @@
 	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
 			      PRIu8 " with error %d", i, ret);
 			return ret;
@@ -730,7 +743,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
 		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
-		if (ret) {
+		if (fs_is_error(sdev, ret)) {
 			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3c..585b554 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -34,6 +34,7 @@
 #ifndef _RTE_ETH_FAILSAFE_PRIVATE_H_
 #define _RTE_ETH_FAILSAFE_PRIVATE_H_
 
+#include <stdbool.h>
 #include <sys/queue.h>
 
 #include <rte_atomic.h>
@@ -375,4 +376,15 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
 	rte_wmb();
 }
 
+/*
+ * Check if error should be reported to the user.
+ */
+static inline bool
+fs_is_error(struct sub_device *sdev, int err)
+{
+	/* A device removal shouldn't be reported as an error. */
+	if (err == 0 || sdev->remove == 1 || err == -EIO)
+		return false;
+	return true;
+}
 #endif /* _RTE_ETH_FAILSAFE_PRIVATE_H_ */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 1/6] ethdev: add devop to check removal status
  2017-12-19 17:10     ` [PATCH v3 1/6] ethdev: add devop to check removal status Matan Azrad
@ 2017-12-19 17:20       ` Stephen Hemminger
  2017-12-19 17:24         ` Matan Azrad
  2018-01-07  9:53       ` Thomas Monjalon
  1 sibling, 1 reply; 98+ messages in thread
From: Stephen Hemminger @ 2017-12-19 17:20 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet, dev

On Tue, 19 Dec 2017 17:10:10 +0000
Matan Azrad <matan@mellanox.com> wrote:

>  int
> +rte_eth_dev_is_removed(uint16_t port_id)
> +{
> +	struct rte_eth_dev *dev;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> +
> +	dev = &rte_eth_devices[port_id];
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> +
> +	if (dev->state == RTE_ETH_DEV_REMOVED)
> +		return 1;
> +
> +	ret = dev->dev_ops->is_removed(dev);
> +	if (ret != 0)
> +		dev->state = RTE_ETH_DEV_REMOVED;
> +
> +	return ret;
> +}
> +

This looks good.
May be a candidate to use bool instead of int for return value?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 1/6] ethdev: add devop to check removal status
  2017-12-19 17:20       ` Stephen Hemminger
@ 2017-12-19 17:24         ` Matan Azrad
  2017-12-19 20:51           ` Thomas Monjalon
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-19 17:24 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Adrien Mazarguil, Thomas Monjalon, Gaetan Rivet, dev

Hi

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Tuesday, December 19, 2017 7:20 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 1/6] ethdev: add devop to check removal
> status
> 
> On Tue, 19 Dec 2017 17:10:10 +0000
> Matan Azrad <matan@mellanox.com> wrote:
> 
> >  int
> > +rte_eth_dev_is_removed(uint16_t port_id) {
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > +
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> > +
> > +	if (dev->state == RTE_ETH_DEV_REMOVED)
> > +		return 1;
> > +
> > +	ret = dev->dev_ops->is_removed(dev);
> > +	if (ret != 0)
> > +		dev->state = RTE_ETH_DEV_REMOVED;
> > +
> > +	return ret;
> > +}
> > +
> 
> This looks good.
> May be a candidate to use bool instead of int for return value?

Yes, I thought about it but didn't see any precedent for bool usage in ethdev APIs.
Guys, what do you think?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 1/6] ethdev: add devop to check removal status
  2017-12-19 17:24         ` Matan Azrad
@ 2017-12-19 20:51           ` Thomas Monjalon
  2017-12-19 22:13             ` Gaëtan Rivet
  0 siblings, 1 reply; 98+ messages in thread
From: Thomas Monjalon @ 2017-12-19 20:51 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Stephen Hemminger, Adrien Mazarguil, Gaetan Rivet, dev

19/12/2017 18:24, Matan Azrad:
> HI
> 
> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Tuesday, December 19, 2017 7:20 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>;
> > dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v3 1/6] ethdev: add devop to check removal
> > status
> > 
> > On Tue, 19 Dec 2017 17:10:10 +0000
> > Matan Azrad <matan@mellanox.com> wrote:
> > 
> > >  int
> > > +rte_eth_dev_is_removed(uint16_t port_id) {
> > > +	struct rte_eth_dev *dev;
> > > +	int ret;
> > > +
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > > +
> > > +	dev = &rte_eth_devices[port_id];
> > > +
> > > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> > > +
> > > +	if (dev->state == RTE_ETH_DEV_REMOVED)
> > > +		return 1;
> > > +
> > > +	ret = dev->dev_ops->is_removed(dev);
> > > +	if (ret != 0)
> > > +		dev->state = RTE_ETH_DEV_REMOVED;
> > > +
> > > +	return ret;
> > > +}
> > > +
> > 
> > This looks good.
> > May be a candidate to use bool instead of int for return value?
> 
> Yes, I thought about it but didn't see any precedence for bool usage in ethdev APIs.
> Guys, what do you think?

I think this function can return an error, can't it?
(look at the *_OR_ERR_RET macros used in the function)

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 1/6] ethdev: add devop to check removal status
  2017-12-19 20:51           ` Thomas Monjalon
@ 2017-12-19 22:13             ` Gaëtan Rivet
  2017-12-20  8:39               ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Gaëtan Rivet @ 2017-12-19 22:13 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Matan Azrad, Stephen Hemminger, Adrien Mazarguil, dev

On Tue, Dec 19, 2017 at 09:51:10PM +0100, Thomas Monjalon wrote:
> 19/12/2017 18:24, Matan Azrad:
> > HI
> > 
> > > -----Original Message-----
> > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > Sent: Tuesday, December 19, 2017 7:20 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>;
> > > dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v3 1/6] ethdev: add devop to check removal
> > > status
> > > 
> > > On Tue, 19 Dec 2017 17:10:10 +0000
> > > Matan Azrad <matan@mellanox.com> wrote:
> > > 
> > > >  int
> > > > +rte_eth_dev_is_removed(uint16_t port_id) {
> > > > +	struct rte_eth_dev *dev;
> > > > +	int ret;
> > > > +
> > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > > > +
> > > > +	dev = &rte_eth_devices[port_id];
> > > > +
> > > > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> > > > +
> > > > +	if (dev->state == RTE_ETH_DEV_REMOVED)
> > > > +		return 1;
> > > > +
> > > > +	ret = dev->dev_ops->is_removed(dev);
> > > > +	if (ret != 0)
> > > > +		dev->state = RTE_ETH_DEV_REMOVED;
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +
> > > 
> > > This looks good.
> > > May be a candidate to use bool instead of int for return value?
> > 
> > Yes, I thought about it but didn't see any precedence for bool usage in ethdev APIs.
> > Guys, what do you think?
> 
> I think this function can return error, isn't it?
> (look at macros *_OR_ERR_RET used in the function)
> 

But those macros are used to return 0.

While I think I see the logic behind it, I think it is surprising to
the API user, which is not ideal.

-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
  2017-12-19 17:10     ` [PATCH v3 6/6] net/failsafe: fix removed device handling Matan Azrad
@ 2017-12-19 22:21       ` Gaëtan Rivet
  2017-12-20 10:58         ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Gaëtan Rivet @ 2017-12-19 22:21 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev

Hi Matan,

On Tue, Dec 19, 2017 at 05:10:15PM +0000, Matan Azrad wrote:
> There is time between the physical removal of the device until
> sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> applications still don't know about the removal and may call sub-device
> control operation which should return an error.
> 
> In previous code this error is reported to the application contrary to
> fail-safe principle that the app should not be aware of device removal.
> 
> Add an removal check in each relevant control command error flow and
> prevent an error report to application when the sub-device is removed.
> 
> Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> Fixes: b737a1e ("net/failsafe: support flow API")
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---

<snip>

> +/*
> + * Check if error should be reported to the user.
> + */
> +static inline bool
> +fs_is_error(struct sub_device *sdev, int err)
> +{
> +	/* A device removal shouldn't be reported as an error. */
> +	if (err == 0 || sdev->remove == 1 || err == -EIO)
> +		return false;
> +	return true;
> +}

This is better, thanks.

However, is there a reason you did not follow the same pattern as ethdev with
eth_err? I see the two functions as similar in intent; making them close to
each other would be clearer to a reader who is familiar with the ethdev API
and interested in fail-safe.

What do you think?

-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 1/6] ethdev: add devop to check removal status
  2017-12-19 22:13             ` Gaëtan Rivet
@ 2017-12-20  8:39               ` Matan Azrad
  0 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2017-12-20  8:39 UTC (permalink / raw)
  To: Gaëtan Rivet, Thomas Monjalon
  Cc: Stephen Hemminger, Adrien Mazarguil, dev

Hi

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Wednesday, December 20, 2017 12:13 AM
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: Matan Azrad <matan@mellanox.com>; Stephen Hemminger
> <stephen@networkplumber.org>; Adrien Mazarguil
> <adrien.mazarguil@6wind.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 1/6] ethdev: add devop to check removal
> status
> 
> On Tue, Dec 19, 2017 at 09:51:10PM +0100, Thomas Monjalon wrote:
> > 19/12/2017 18:24, Matan Azrad:
> > > HI
> > >
> > > > -----Original Message-----
> > > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > > Sent: Tuesday, December 19, 2017 7:20 PM
> > > > To: Matan Azrad <matan@mellanox.com>
> > > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas
> Monjalon
> > > > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>;
> > > > dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v3 1/6] ethdev: add devop to check
> > > > removal status
> > > >
> > > > On Tue, 19 Dec 2017 17:10:10 +0000 Matan Azrad
> > > > <matan@mellanox.com> wrote:
> > > >
> > > > >  int
> > > > > +rte_eth_dev_is_removed(uint16_t port_id) {
> > > > > +	struct rte_eth_dev *dev;
> > > > > +	int ret;
> > > > > +
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > > > > +
> > > > > +	dev = &rte_eth_devices[port_id];
> > > > > +
> > > > > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
> >is_removed, 0);
> > > > > +
> > > > > +	if (dev->state == RTE_ETH_DEV_REMOVED)
> > > > > +		return 1;
> > > > > +
> > > > > +	ret = dev->dev_ops->is_removed(dev);
> > > > > +	if (ret != 0)
> > > > > +		dev->state = RTE_ETH_DEV_REMOVED;
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > >
> > > > This looks good.
> > > > May be a candidate to use bool instead of int for return value?
> > >
> > > Yes, I thought about it but didn't see any precedence for bool usage in
> ethdev APIs.
> > > Guys, what do you think?
> >
> > I think this function can return error, isn't it?
> > (look at macros *_OR_ERR_RET used in the function)
> >
> 
> But those macros are used to return 0.
> 
> While I think I see a logic behind it, I think it is surprising the API user, which is
> not ideal.
> 

The logic behind it is that "is" semantics pose a question that expects a yes/no answer.
Therefore, a user of this API expects just a true or false return value and does not need to check for further error codes such as -ENODEV or -ENOTSUP.
I decided that the return value will be only 0 or 1 to make it easier for the user:
Removed - 1,
Present - 0,
No support - a PMD that doesn't implement the "is_removed" devop is taken to mean that its underlying devices are not removable, so they are always present and this function should return '0' for it.
No port - I think that '0' here is better than '1'.

Stephen's suggestion to replace '0' and '1' by 'false' and 'true' makes sense, but I decided not to do it for the following reasons:
1. There is no precedent for a bool return value in ethdev APIs.
2. It may be problematic to use the *OR_ERR_RET macros to return a bool value.
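
To make the 0/1 contract above concrete, here is a minimal standalone
sketch of the same decision order as the quoted rte_eth_dev_is_removed().
It does not use DPDK: mock_dev, mock_devices, mock_hw_is_removed and
mock_dev_is_removed are hypothetical stand-ins invented for illustration,
and the driver callback is assumed to return only 0 or 1.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical table size, not a DPDK constant */

enum dev_state { DEV_UNUSED = 0, DEV_ATTACHED, DEV_REMOVED };

struct mock_dev;
typedef int (*is_removed_t)(struct mock_dev *dev);

/* Hypothetical stand-in for an ethdev port entry. */
struct mock_dev {
	enum dev_state state;
	is_removed_t is_removed; /* NULL when the driver has no such devop */
	int hw_gone;             /* simulated hardware presence flag */
};

static struct mock_dev mock_devices[MAX_PORTS];

/* Simulated driver callback, assumed to return 0 (present) or 1 (removed). */
static int
mock_hw_is_removed(struct mock_dev *dev)
{
	return dev->hw_gone;
}

/* Only ever answers 0 (present, unknown or unsupported) or 1 (removed). */
static int
mock_dev_is_removed(uint16_t port_id)
{
	struct mock_dev *dev;
	int ret;

	/* No port: answer "not removed" instead of an error code. */
	if (port_id >= MAX_PORTS || mock_devices[port_id].state == DEV_UNUSED)
		return 0;
	dev = &mock_devices[port_id];
	/* No devop: the device is treated as not removable. */
	if (dev->is_removed == NULL)
		return 0;
	/* A removal seen earlier is cached in the port state. */
	if (dev->state == DEV_REMOVED)
		return 1;
	ret = dev->is_removed(dev);
	if (ret != 0)
		dev->state = DEV_REMOVED;
	return ret;
}
```

With this shape a caller can write "if (mock_dev_is_removed(port))"
without a separate branch for negative error codes, which is the
usability argument made above.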

I hope that this explanation helps you.

Thanks, 
Matan.

> --
> Gaëtan Rivet
> 6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
  2017-12-19 22:21       ` Gaëtan Rivet
@ 2017-12-20 10:58         ` Matan Azrad
  2018-01-08 10:57           ` Gaëtan Rivet
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2017-12-20 10:58 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Adrien Mazarguil, Thomas Monjalon, dev

Hi Gaetan

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Wednesday, December 20, 2017 12:22 AM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org
> Subject: Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
> 
> Hi Matan,
> 
> On Tue, Dec 19, 2017 at 05:10:15PM +0000, Matan Azrad wrote:
> > There is time between the physical removal of the device until
> > sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> > applications still don't know about the removal and may call
> > sub-device control operation which should return an error.
> >
> > In previous code this error is reported to the application contrary to
> > fail-safe principle that the app should not be aware of device removal.
> >
> > Add an removal check in each relevant control command error flow and
> > prevent an error report to application when the sub-device is removed.
> >
> > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > Fixes: b737a1e ("net/failsafe: support flow API")
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> 
> <snip>
> 
> > +/*
> > + * Check if error should be reported to the user.
> > + */
> > +static inline bool
> > +fs_is_error(struct sub_device *sdev, int err) {
> > +	/* A device removal shouldn't be reported as an error. */
> > +	if (err == 0 || sdev->remove == 1 || err == -EIO)
> > +		return false;
> > +	return true;
> > +}
> 
> This is better, thanks.
> 
> However is there a reason you did not follow the same pattern as ethdev
> with eth_err? I see the two functions as similar in their intent, making them
> close to each other would be clearer to a reader being familiar with the
> ethdev API and that would be interested in fail-safe.
> 
> What do you think?
> 

I think that there is a real difference between the eth_err function and fs_is_error:
ethdev uses the eth_err function to adjust the removal return value to -EIO.
fail-safe uses the fs_is_error function to check whether an error should be reported to the user, in order to preserve the fail-safe principle that the app should not be aware of device removal. This is the main idea, and it is also why I changed the name from fs_is_removed to fs_is_error.
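
To make the distinction concrete, here is a minimal sketch of the two helpers side by side. It is not the DPDK source: eth_err_sketch() only models the behaviour described in this thread (adjust any failure to -EIO when the device is gone), mock_dev_is_removed() is a hypothetical stand-in for rte_eth_dev_is_removed(), and struct sub_device is reduced to the single flag used here. fs_is_error() is copied from the v3 patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Reduced stand-in for the fail-safe sub-device (assumption: only 'remove'). */
struct sub_device {
	int remove; /* set to 1 once the RMV event has been handled */
};

/* Hypothetical stand-in for rte_eth_dev_is_removed(). */
static int mock_removed;

static int
mock_dev_is_removed(void)
{
	return mock_removed;
}

/* ethdev side: adjust the return value so that a removal reads as -EIO. */
static int
eth_err_sketch(int ret)
{
	if (ret == 0)
		return 0;
	if (mock_dev_is_removed())
		return -EIO;
	return ret;
}

/* fail-safe side (v3 patch): decide whether the error should be reported. */
static bool
fs_is_error(const struct sub_device *sdev, int err)
{
	/* A device removal shouldn't be reported as an error. */
	if (err == 0 || sdev->remove == 1 || err == -EIO)
		return false;
	return true;
}
```

The first helper rewrites the value, the second only answers a yes/no question and leaves the value untouched.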

> --
> Gaëtan Rivet
> 6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 1/6] ethdev: add devop to check removal status
  2017-12-19 17:10     ` [PATCH v3 1/6] ethdev: add devop to check removal status Matan Azrad
  2017-12-19 17:20       ` Stephen Hemminger
@ 2018-01-07  9:53       ` Thomas Monjalon
  1 sibling, 0 replies; 98+ messages in thread
From: Thomas Monjalon @ 2018-01-07  9:53 UTC (permalink / raw)
  To: Matan Azrad; +Cc: dev, Adrien Mazarguil, Gaetan Rivet

19/12/2017 18:10, Matan Azrad:
> There is time between the physical removal of the device until PMDs get
> a RMV interrupt. At this time DPDK PMDs and applications still don't
> know about the removal.
> 
> Current removal detection is achieved only by registration to device RMV
> event and the notification comes asynchronously. So, there is no option
> to detect a device removal synchronously.
> Applications and other DPDK entities may want to check a device removal
> synchronously and to take an immediate decision accordingly.
> 
> Add new dev op called is_removed to allow DPDK entities to check an
> Ethernet device removal status immediately.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 4/6] ethdev: adjust APIs removal error report
  2017-12-19 17:10     ` [PATCH v3 4/6] ethdev: adjust APIs removal error report Matan Azrad
@ 2018-01-07  9:56       ` Thomas Monjalon
  0 siblings, 0 replies; 98+ messages in thread
From: Thomas Monjalon @ 2018-01-07  9:56 UTC (permalink / raw)
  To: Matan Azrad; +Cc: dev, Adrien Mazarguil, Gaetan Rivet

19/12/2017 18:10, Matan Azrad:
> rte_eth_dev_is_removed API was added to detect a device removal
> synchronously.
> 
> When a device removal occurs during control command execution, many
> different errors can be reported to the user.
> 
> Adjust all ethdev APIs error reports to return -EIO in case of device
> removal using rte_eth_dev_is_removed API.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>

Your git does not display the function name in @@ context.
Please update your environment. Thanks

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 5/6] ethdev: adjust flow APIs removal error report
  2017-12-19 17:10     ` [PATCH v3 5/6] ethdev: adjust flow " Matan Azrad
@ 2018-01-07  9:58       ` Thomas Monjalon
  0 siblings, 0 replies; 98+ messages in thread
From: Thomas Monjalon @ 2018-01-07  9:58 UTC (permalink / raw)
  To: Matan Azrad; +Cc: dev, Adrien Mazarguil, Gaetan Rivet

19/12/2017 18:10, Matan Azrad:
> rte_eth_dev_is_removed API was added to detect a device removal
> synchronously.
> 
> When a device removal occurs during flow command execution, many
> different errors can be reported to the user.
> 
> Adjust all flow APIs error reports to return -EIO in case of device
> removal using rte_eth_dev_is_removed API.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
  2017-12-20 10:58         ` Matan Azrad
@ 2018-01-08 10:57           ` Gaëtan Rivet
  2018-01-08 12:55             ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Gaëtan Rivet @ 2018-01-08 10:57 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev

Hi Matan,

Sorry for the delay on this.

On Wed, Dec 20, 2017 at 10:58:29AM +0000, Matan Azrad wrote:
> Hi Gaetan
> 
> > -----Original Message-----
> > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > Sent: Wednesday, December 20, 2017 12:22 AM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > <thomas@monjalon.net>; dev@dpdk.org
> > Subject: Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
> > 
> > Hi Matan,
> > 
> > On Tue, Dec 19, 2017 at 05:10:15PM +0000, Matan Azrad wrote:
> > > There is time between the physical removal of the device until
> > > sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> > > applications still don't know about the removal and may call
> > > sub-device control operation which should return an error.
> > >
> > > In previous code this error is reported to the application contrary to
> > > fail-safe principle that the app should not be aware of device removal.
> > >
> > > Add an removal check in each relevant control command error flow and
> > > prevent an error report to application when the sub-device is removed.
> > >
> > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > > Fixes: b737a1e ("net/failsafe: support flow API")

As stated previously, please do not include those fixes lines.

> > >
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > ---
> > 
> > <snip>
> > 
> > > +/*
> > > + * Check if error should be reported to the user.
> > > + */
> > > +static inline bool
> > > +fs_is_error(struct sub_device *sdev, int err) {
> > > +	/* A device removal shouldn't be reported as an error. */
> > > +	if (err == 0 || sdev->remove == 1 || err == -EIO)
> > > +		return false;
> > > +	return true;
> > > +}
> > 
> > This is better, thanks.
> > 
> > However is there a reason you did not follow the same pattern as ethdev
> > with eth_err? I see the two functions as similar in their intent, making them
> > close to each other would be clearer to a reader being familiar with the
> > ethdev API and that would be interested in fail-safe.
> > 
> > What do you think?
> > 
> 
> I think that there is a real different between eth_err function to fs_is_error:
> ethdev uses eth_err function to adjust removal return value to be -EIO.
> fail-safe uses fs_is_error function to check if an error should be reported to the user to save the fail-safe principle that the app should not be aware of device removal  -  this is the main idea that also causes me to change the name from fs_is_removed to fs_is_error.

I would have preferred if it followed the same pattern as ethdev (that
function be used to adjust the return value, not performing a flag check).

While better on its own, the pattern:

    if (fs_is_error(sdev, err)) {
            ERROR("xxxx");
            return err;
    }

is dangerous, as then the author is forbidden from returning err, assuming
err could be -EIO. He or she would be forced to return an explicit "0".
To be clear, here is an easy mistake to make:

    if (fs_is_error(sdev, err)) {
            ERROR("xxxx");
    }
    return err;

And this kind of code-flow is not unusual, or even unwanted.
I dislike having this kind of implicit rule derived from using a helper
such as fs_is_error().

The alternative

    if ((err = fs_err(sdev, err))) {
            ERROR("xxxx");
            return err;
    }

Forces the value err to be set to the correct one.
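
A minimal sketch of what such an fs_err() could look like, under the assumption that it keeps the same conditions as fs_is_error() and simply returns the adjusted value. The struct is reduced to one field, and control_op() is a hypothetical caller, not code from the patch.

```c
#include <errno.h>

/* Reduced stand-in for the fail-safe sub-device (assumption). */
struct sub_device {
	int remove;
};

/*
 * Hypothetical fs_err(): return 0 when the failure is explained by a
 * device removal, otherwise return the error unchanged.
 */
static int
fs_err(const struct sub_device *sdev, int err)
{
	if (err == 0 || sdev->remove == 1 || err == -EIO)
		return 0;
	return err;
}

/*
 * Hypothetical control operation using the "log, then return err" flow.
 * Because err itself was adjusted, this flow cannot leak -EIO.
 */
static int
control_op(const struct sub_device *sdev, int sub_ret)
{
	int err = fs_err(sdev, sub_ret);

	if (err) {
		/* ERROR("xxxx"); */
	}
	return err;
}
```

With this shape, both the early-return flow and the fall-through flow hand back the same adjusted value.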

This mistake can already be found in your patch:

> @@ -150,7 +150,7 @@
>                         continue;
>                 local_ret = rte_flow_destroy(PORT_ID(sdev),
>                                 flow->flows[i], error);
> -               if (local_ret) {
> +               if (fs_is_error(sdev, local_ret)) {
>                         ERROR("Failed to destroy flow on sub_device %d: %d",
>                                         i, local_ret);
>                         if (ret == 0)

Your environment does not include the function name, but this is within
fs_flow_destroy (please update it to include the context, by the way,
it helps the review a lot :) ). Afterward, at line 162, ret is directly
used as the return value.

Also, fs_err() would need to transform rte_errno when relevant (mostly
in failsafe_flow.c I think).

This is the kind of subtlety that needs to be avoided when designing
APIs, even internal ones. This will induce errors afterward and
complicate the maintenance of the codebase.

Best regards,

-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
  2018-01-08 10:57           ` Gaëtan Rivet
@ 2018-01-08 12:55             ` Matan Azrad
  2018-01-08 13:46               ` Gaëtan Rivet
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-08 12:55 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Adrien Mazarguil, Thomas Monjalon, dev

Hi Gaetan

From: Gaëtan Rivet, Monday, January 8, 2018 12:58 PM
> Hi Matan,
> 
> Sorry for the delay on this.
> 

It's OK, even though I need to get back into it :)

> On Wed, Dec 20, 2017 at 10:58:29AM +0000, Matan Azrad wrote:
> > Hi Gaetan
> >
> > > -----Original Message-----
> > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > Sent: Wednesday, December 20, 2017 12:22 AM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; dev@dpdk.org
> > > Subject: Re: [PATCH v3 6/6] net/failsafe: fix removed device
> > > handling
> > >
> > > Hi Matan,
> > >
> > > On Tue, Dec 19, 2017 at 05:10:15PM +0000, Matan Azrad wrote:
> > > > There is time between the physical removal of the device until
> > > > sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> > > > applications still don't know about the removal and may call
> > > > sub-device control operation which should return an error.
> > > >
> > > > In previous code this error is reported to the application
> > > > contrary to fail-safe principle that the app should not be aware of
> device removal.
> > > >
> > > > Add an removal check in each relevant control command error flow
> > > > and prevent an error report to application when the sub-device is
> removed.
> > > >
> > > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > > > Fixes: b737a1e ("net/failsafe: support flow API")
> 
> As stated previously, please do not include those fixes lines.
> 
> > > >
> > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > ---
> > >
> > > <snip>
> > >
> > > > +/*
> > > > + * Check if error should be reported to the user.
> > > > + */
> > > > +static inline bool
> > > > +fs_is_error(struct sub_device *sdev, int err) {
> > > > +	/* A device removal shouldn't be reported as an error. */
> > > > +	if (err == 0 || sdev->remove == 1 || err == -EIO)
> > > > +		return false;
> > > > +	return true;
> > > > +}
> > >
> > > This is better, thanks.
> > >
> > > However is there a reason you did not follow the same pattern as
> > > ethdev with eth_err? I see the two functions as similar in their
> > > intent, making them close to each other would be clearer to a reader
> > > being familiar with the ethdev API and that would be interested in fail-
> safe.
> > >
> > > What do you think?
> > >
> >
> > I think that there is a real different between eth_err function to
> fs_is_error:
> > ethdev uses eth_err function to adjust removal return value to be -EIO.
> > fail-safe uses fs_is_error function to check if an error should be reported to
> the user to save the fail-safe principle that the app should not be aware of
> device removal  -  this is the main idea that also causes me to change the
> name from fs_is_removed to fs_is_error.
> 
> I would have preferred if it followed the same pattern as ethdev (that
> function be used to adjust the return value, not performing a flag check).
> 
> While better on its own, the pattern:
> 
>     if (fs_is_error(sdev, err)) {
>             ERROR("xxxx");
>             return err;
>     }
> 
> is dangerous, as then the author is forbidden from returning err, assuming
> err could be -EIO. He or she would be forced to return an explicit "0".
> To be clear, here would be an easy mistake to do:
> 
>     if (fs_is_error(sdev, err)) {
>             ERROR("xxxx");
>     }
>     return err;
> 
> And this kind of code-flow is not unusual, or even unwanted.
> I dislike having this kind of implicit rule derived from using a helper such as
> fs_is_error().
> 
> The alternative
> 
>     if ((err = fs_err(sdev, err))) {
>             ERROR("xxxx");
>             return err;
>     }
> 
> Forces the value err to be set to the correct one.
> 
Good point, will change it.

> This mistake can already be found in your patch:
> 
> > @@ -150,7 +150,7 @@
> >                         continue;
> >                 local_ret = rte_flow_destroy(PORT_ID(sdev),
> >                                 flow->flows[i], error);
> > -               if (local_ret) {
> > +               if (fs_is_error(sdev, local_ret)) {
> >                         ERROR("Failed to destroy flow on sub_device %d: %d",
> >                                         i, local_ret);
> >                         if (ret == 0)
> 

Sorry, I can't see any issue here.

> Your environment does not include the function, but this is within
> fs_flow_destroy (please update to include the context by the way it helps a
> lot the review :). Afterward, line 162 ret is directly used as return value.
> 
I don't understand what you mean.

> Also, fs_err() would need to transform rte_errno when relevant (mostly in
> failsafe_flow.c I think).
> 
Is your suggestion to always reset rte_errno to 0 when the error is caused by removal?

> This is the kind of subtlety that needs to be avoided when designing APIs,
> even internal ones. This will induce errors afterward and complicate the
> maintenance of the codebase.
>

Thanks for the lesson! 
 
> Best regards,
> 
> --
> Gaëtan Rivet
> 6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
  2018-01-08 12:55             ` Matan Azrad
@ 2018-01-08 13:46               ` Gaëtan Rivet
  2018-01-08 14:00                 ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Gaëtan Rivet @ 2018-01-08 13:46 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev

On Mon, Jan 08, 2018 at 12:55:49PM +0000, Matan Azrad wrote:
> Hi Gaetan
> 
> From: Gaëtan Rivet, Monday, January 8, 2018 12:58 PM
> > Hi Matan,
> > 
> > Sorry for the delay on this.
> > 
> 
> It's OK in spite of I need to fetch it back :)
> 
> > On Wed, Dec 20, 2017 at 10:58:29AM +0000, Matan Azrad wrote:
> > > Hi Gaetan
> > >
> > > > -----Original Message-----
> > > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > > Sent: Wednesday, December 20, 2017 12:22 AM
> > > > To: Matan Azrad <matan@mellanox.com>
> > > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Thomas Monjalon
> > > > <thomas@monjalon.net>; dev@dpdk.org
> > > > Subject: Re: [PATCH v3 6/6] net/failsafe: fix removed device
> > > > handling
> > > >
> > > > Hi Matan,
> > > >
> > > > On Tue, Dec 19, 2017 at 05:10:15PM +0000, Matan Azrad wrote:
> > > > > There is time between the physical removal of the device until
> > > > > sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> > > > > applications still don't know about the removal and may call
> > > > > sub-device control operation which should return an error.
> > > > >
> > > > > In previous code this error is reported to the application
> > > > > contrary to fail-safe principle that the app should not be aware of
> > device removal.
> > > > >
> > > > > Add an removal check in each relevant control command error flow
> > > > > and prevent an error report to application when the sub-device is
> > removed.
> > > > >
> > > > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > > > > Fixes: b737a1e ("net/failsafe: support flow API")
> > 
> > As stated previously, please do not include those fixes lines.
> > 
> > > > >
> > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > ---
> > > >
> > > > <snip>
> > > >
> > > > > +/*
> > > > > + * Check if error should be reported to the user.
> > > > > + */
> > > > > +static inline bool
> > > > > +fs_is_error(struct sub_device *sdev, int err) {
> > > > > +	/* A device removal shouldn't be reported as an error. */
> > > > > +	if (err == 0 || sdev->remove == 1 || err == -EIO)
> > > > > +		return false;
> > > > > +	return true;
> > > > > +}
> > > >
> > > > This is better, thanks.
> > > >
> > > > However is there a reason you did not follow the same pattern as
> > > > ethdev with eth_err? I see the two functions as similar in their
> > > > intent, making them close to each other would be clearer to a reader
> > > > being familiar with the ethdev API and that would be interested in fail-
> > safe.
> > > >
> > > > What do you think?
> > > >
> > >
> > > I think that there is a real different between eth_err function to
> > fs_is_error:
> > > ethdev uses eth_err function to adjust removal return value to be -EIO.
> > > fail-safe uses fs_is_error function to check if an error should be reported to
> > the user to save the fail-safe principle that the app should not be aware of
> > device removal  -  this is the main idea that also causes me to change the
> > name from fs_is_removed to fs_is_error.
> > 
> > I would have preferred if it followed the same pattern as ethdev (that
> > function be used to adjust the return value, not performing a flag check).
> > 
> > While better on its own, the pattern:
> > 
> >     if (fs_is_error(sdev, err)) {
> >             ERROR("xxxx");
> >             return err;
> >     }
> > 
> > is dangerous, as then the author is forbidden from returning err, assuming
> > err could be -EIO. He or she would be forced to return an explicit "0".
> > To be clear, here would be an easy mistake to do:
> > 
> >     if (fs_is_error(sdev, err)) {
> >             ERROR("xxxx");
> >     }
> >     return err;
> > 
> > And this kind of code-flow is not unusual, or even unwanted.
> > I dislike having this kind of implicit rule derived from using a helper such as
> > fs_is_error().
> > 
> > The alternative
> > 
> >     if ((err = fs_err(sdev, err))) {
> >             ERROR("xxxx");
> >             return err;
> >     }
> > 
> > Forces the value err to be set to the correct one.
> > 
> Good point, will change it.
> 
> > This mistake can already be found in your patch:
> > 
> > > @@ -150,7 +150,7 @@
> > >                         continue;
> > >                 local_ret = rte_flow_destroy(PORT_ID(sdev),
> > >                                 flow->flows[i], error);
> > > -               if (local_ret) {
> > > +               if (fs_is_error(sdev, local_ret)) {
> > >                         ERROR("Failed to destroy flow on sub_device %d: %d",
> > >                                         i, local_ret);
> > >                         if (ret == 0)
> > 
> 
> Sorry, I can't see any issue here.
> 

You're right, actually the code would still be correct.
I checked again the rest of the edit, there shouldn't be any issue,
usually "0" is explicitly returned.

Still, the point stands.

> > Your environment does not include the function, but this is within
> > fs_flow_destroy (please update to include the context by the way it helps a
> > lot the review :). Afterward, line 162 ret is directly used as return value.
> > 
> I don't understand what do you mean.
> 
> > Also, fs_err() would need to transform rte_errno when relevant (mostly in
> > failsafe_flow.c I think).
> > 
> Your suggestion is always to update rte_errno to 0 in case the error is because of removal?
> 

If the error is indeed due to the device being absent, then rte_errno
should be set back to its previous value I think.
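
As a rough sketch of the save-and-restore idea, assuming a plain global in place of DPDK's per-lcore rte_errno; mock_sub_op() and fs_call_restore() are hypothetical names used only for illustration:

```c
#include <errno.h>

/* Stand-in for rte_errno (assumption: a plain global is enough here). */
static int mock_rte_errno;

/* Reduced stand-in for the fail-safe sub-device (assumption). */
struct sub_device {
	int remove;
};

/* Hypothetical sub-device call: on failure it sets the errno value. */
static int
mock_sub_op(int fail_with)
{
	if (fail_with != 0) {
		mock_rte_errno = -fail_with; /* errno values are positive */
		return fail_with;
	}
	return 0;
}

/* Save the errno value before the call, restore it if removal explains it. */
static int
fs_call_restore(const struct sub_device *sdev, int fail_with)
{
	int saved = mock_rte_errno;
	int err = mock_sub_op(fail_with);

	if (err != 0 && (sdev->remove == 1 || err == -EIO)) {
		mock_rte_errno = saved;
		return 0;
	}
	return err;
}
```

The cost is one extra save around every sub-device call.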

-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
  2018-01-08 13:46               ` Gaëtan Rivet
@ 2018-01-08 14:00                 ` Matan Azrad
  2018-01-08 14:31                   ` Gaëtan Rivet
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-08 14:00 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Adrien Mazarguil, Thomas Monjalon, dev

Hi Gaetan

From: Gaëtan Rivet, Monday, January 8, 2018 3:47 PM
> On Mon, Jan 08, 2018 at 12:55:49PM +0000, Matan Azrad wrote:
> > Hi Gaetan
> >
> > From: Gaëtan Rivet, Monday, January 8, 2018 12:58 PM
> > > Hi Matan,
> > >
> > > Sorry for the delay on this.
> > >
> >
> > It's OK in spite of I need to fetch it back :)
> >
> > > On Wed, Dec 20, 2017 at 10:58:29AM +0000, Matan Azrad wrote:
<snip>
> > > And this kind of code-flow is not unusual, or even unwanted.
> > > I dislike having this kind of implicit rule derived from using a
> > > helper such as fs_is_error().
> > >
> > > The alternative
> > >
> > >     if ((err = fs_err(sdev, err))) {
> > >             ERROR("xxxx");
> > >             return err;
> > >     }
> > >
> > > Forces the value err to be set to the correct one.
> > >
> > Good point, will change it.
> >
> > > This mistake can already be found in your patch:
> > >
> > > > @@ -150,7 +150,7 @@
> > > >                         continue;
> > > >                 local_ret = rte_flow_destroy(PORT_ID(sdev),
> > > >                                 flow->flows[i], error);
> > > > -               if (local_ret) {
> > > > +               if (fs_is_error(sdev, local_ret)) {
> > > >                         ERROR("Failed to destroy flow on sub_device %d: %d",
> > > >                                         i, local_ret);
> > > >                         if (ret == 0)
> > >
> >
> > Sorry, I can't see any issue here.
> >
> 
> You're right, actually the code would still be correct.
> I checked again the rest of the edit, there shouldn't be any issue, usually "0"
> is explicitly returned.
> 
> Still, the point stands.
> 
Yes.

> > > Your environment does not include the function, but this is within
> > > fs_flow_destroy (please update to include the context by the way it
> > > helps a lot the review :). Afterward, line 162 ret is directly used as return
> value.
> > >
> > I don't understand what do you mean.
> >
> > > Also, fs_err() would need to transform rte_errno when relevant
> > > (mostly in failsafe_flow.c I think).
> > >
> > Your suggestion is always to update rte_errno to 0 in case the error is
> because of removal?
> >
> 
> If the error is indeed due to the device being absent, then rte_errno should
> be set back to its previous value I think.
So, I think it would require saving the old rte_errno before each device command...
Why not set it to 0 in the special case (removal) in the new internal API?

> 
> --
> Gaëtan Rivet
> 6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v3 6/6] net/failsafe: fix removed device handling
  2018-01-08 14:00                 ` Matan Azrad
@ 2018-01-08 14:31                   ` Gaëtan Rivet
  0 siblings, 0 replies; 98+ messages in thread
From: Gaëtan Rivet @ 2018-01-08 14:31 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Adrien Mazarguil, Thomas Monjalon, dev

On Mon, Jan 08, 2018 at 02:00:54PM +0000, Matan Azrad wrote:
> Hi Gaetan
> 

<snip>

> > > > Your environment does not include the function, but this is within
> > > > fs_flow_destroy (please update to include the context by the way it
> > > > helps a lot the review :). Afterward, line 162 ret is directly used as return
> > value.
> > > >
> > > I don't understand what do you mean.
> > >
> > > > Also, fs_err() would need to transform rte_errno when relevant
> > > > (mostly in failsafe_flow.c I think).
> > > >
> > > Your suggestion is always to update rte_errno to 0 in case the error is
> > because of removal?
> > >
> > 
> > If the error is indeed due to the device being absent, then rte_errno should
> > be set back to its previous value I think.
> So, I think it will require old rte_errno save before each device command...
> Why not to set it to 0 in the special case(removal) by the new internal API?
> 

Resetting it to 0 might be sufficient, yes.

There might be some old-school devs out there that would do such things as:

    do_thing_x();
    do_thing_y();
    do_thing_z();
    if (check_for_any_error(errno)) {
        abort();
    }

But I'm not too fond of this kind of pattern, so I'm not specifically
opposed to code that does not go with this flow.
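
One way to read the concern, sketched with a plain global standing in for rte_errno and hypothetical step functions: with a deferred check, a step that resets the value to 0 hides an earlier genuine failure, whereas restoring the previous value keeps it visible.

```c
#include <errno.h>

/* Stand-in for rte_errno (assumption: a plain global is enough here). */
static int mock_rte_errno;

/* Hypothetical step that genuinely fails. */
static void
step_fails(void)
{
	mock_rte_errno = EINVAL;
}

/* Hypothetical step hitting a removed device; the wrapper resets to 0. */
static void
step_removed_reset(void)
{
	mock_rte_errno = EIO; /* set by the failing sub-device call */
	mock_rte_errno = 0;   /* removal is hidden by resetting */
}

/* Same step, but the wrapper restores the previous value instead. */
static void
step_removed_restore(void)
{
	int saved = mock_rte_errno;

	mock_rte_errno = EIO;   /* set by the failing sub-device call */
	mock_rte_errno = saved; /* removal is hidden by restoring */
}

/* Deferred check: the errno value is only inspected after all steps. */
static int
run_reset(void)
{
	mock_rte_errno = 0;
	step_fails();
	step_removed_reset();
	return mock_rte_errno;
}

static int
run_restore(void)
{
	mock_rte_errno = 0;
	step_fails();
	step_removed_restore();
	return mock_rte_errno;
}
```

Code that checks the return value of each call individually is not affected by the difference.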

-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack
  2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                       ` (5 preceding siblings ...)
  2017-12-19 17:10     ` [PATCH v3 6/6] net/failsafe: fix removed device handling Matan Azrad
@ 2018-01-10 12:30     ` Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 1/6] ethdev: add devop to check removal status Matan Azrad
                         ` (6 more replies)
  6 siblings, 7 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-10 12:30 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev

There is a window of time between the physical removal of the device and the moment the sub-device PMDs get an RMV interrupt.
During this window, DPDK PMDs and applications still don't know about the removal and may call a sub-device control operation, which should then return an error.

This series adds a new ethdev operation to check device removal, adds support for it in the mlx PMDs, adjusts the ethdev APIs to return -EIO in case of removal, and fixes the fail-safe bug in removal error reporting.
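
From the caller's side, the -EIO convention could be consumed roughly as follows. This is only a sketch: enum ctrl_outcome and classify_ctrl_ret() are hypothetical names, not part of the series, and the return value is assumed to come from any ethdev control call adjusted as described above.

```c
#include <errno.h>

/* Hypothetical classification of a control-call return value. */
enum ctrl_outcome {
	CTRL_OK,
	CTRL_DEVICE_REMOVED,
	CTRL_FAILED,
};

/* Map an ethdev-style return code to what the caller should do next. */
static enum ctrl_outcome
classify_ctrl_ret(int ret)
{
	if (ret == 0)
		return CTRL_OK;
	if (ret == -EIO)
		return CTRL_DEVICE_REMOVED; /* device is gone, stop using it */
	return CTRL_FAILED;                 /* ordinary failure */
}
```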

V2:
Remove ENODEV definition.
Remove checks from all mlx control commands.
Add new devop - "is_removed".
Implement it in mlx4 and mlx5.
Fix failsafe bug by the new devop.

V3:
Adjust ethdev APIs removal error report.
Change failsafe check to check eth_dev* return values.
Remove backporting of fail-safe patch.

V4:
Improve fail-safe internal API to adjust the actual error value as discussed.
Remove "Fixes" lines from fail-safe patch.
No changes in ethdev\mlx patches.

Matan Azrad (6):
  ethdev: add devop to check removal status
  net/mlx4: support a device removal check operation
  net/mlx5: support a device removal check operation
  ethdev: adjust APIs removal error report
  ethdev: adjust flow APIs removal error report
  net/failsafe: fix removed device handling

 drivers/net/failsafe/failsafe_flow.c    |  18 +--
 drivers/net/failsafe/failsafe_ops.c     |  35 ++++--
 drivers/net/failsafe/failsafe_private.h |  11 ++
 drivers/net/mlx4/mlx4.c                 |   1 +
 drivers/net/mlx4/mlx4.h                 |   1 +
 drivers/net/mlx4/mlx4_ethdev.c          |  20 +++
 drivers/net/mlx5/mlx5.c                 |   2 +
 drivers/net/mlx5/mlx5.h                 |   1 +
 drivers/net/mlx5/mlx5_ethdev.c          |  20 +++
 lib/librte_ether/rte_ethdev.c           | 213 +++++++++++++++++++++-----------
 lib/librte_ether/rte_ethdev.h           |  68 +++++++++-
 lib/librte_ether/rte_ethdev_version.map |   7 ++
 lib/librte_ether/rte_flow.c             |  34 +++--
 lib/librte_ether/rte_flow.h             |   2 +
 14 files changed, 333 insertions(+), 100 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [PATCH v4 1/6] ethdev: add devop to check removal status
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
@ 2018-01-10 12:31       ` Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 2/6] net/mlx4: support a device removal check operation Matan Azrad
                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-10 12:31 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev

There is time between the physical removal of the device until PMDs get
a RMV interrupt. At this time DPDK PMDs and applications still don't
know about the removal.

Current removal detection is achieved only by registration to device RMV
event and the notification comes asynchronously. So, there is no option
to detect a device removal synchronously.
Applications and other DPDK entities may want to check a device removal
synchronously and to take an immediate decision accordingly.

Add new dev op called is_removed to allow DPDK entities to check an
Ethernet device removal status immediately.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.h           | 17 +++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  7 +++++++
 3 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 318af28..c759d0e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -142,7 +142,8 @@ enum {
 rte_eth_find_next(uint16_t port_id)
 {
 	while (port_id < RTE_MAX_ETHPORTS &&
-	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
 		port_id++;
 
 	if (port_id >= RTE_MAX_ETHPORTS)
@@ -286,8 +287,7 @@ struct rte_eth_dev *
 rte_eth_dev_is_valid_port(uint16_t port_id)
 {
 	if (port_id >= RTE_MAX_ETHPORTS ||
-	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
-	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
+	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
 		return 0;
 	else
 		return 1;
@@ -1118,6 +1118,28 @@ struct rte_eth_dev *
 }
 
 int
+rte_eth_dev_is_removed(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
+
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;
+
+	ret = dev->dev_ops->is_removed(dev);
+	if (ret != 0)
+		dev->state = RTE_ETH_DEV_REMOVED;
+
+	return ret;
+}
+
+int
 rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		       uint16_t nb_rx_desc, unsigned int socket_id,
 		       const struct rte_eth_rxconf *rx_conf,
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 341c2d6..3aa9d3f 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1196,6 +1196,9 @@ struct rte_eth_dcb_info {
 typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
 /** <@internal Function used to reset a configured Ethernet device. */
 
+typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
+/**< @internal Function used to detect an Ethernet device removal. */
+
 typedef void (*eth_promiscuous_enable_t)(struct rte_eth_dev *dev);
 /**< @internal Function used to enable the RX promiscuous mode of an Ethernet device. */
 
@@ -1525,6 +1528,8 @@ struct eth_dev_ops {
 	eth_dev_close_t            dev_close;     /**< Close device. */
 	eth_dev_reset_t		   dev_reset;	  /**< Reset device. */
 	eth_link_update_t          link_update;   /**< Get device link state. */
+	eth_is_removed_t           is_removed;
+	/**< Check if the device was physically removed. */
 
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON. */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF. */
@@ -1711,6 +1716,7 @@ enum rte_eth_dev_state {
 	RTE_ETH_DEV_UNUSED = 0,
 	RTE_ETH_DEV_ATTACHED,
 	RTE_ETH_DEV_DEFERRED,
+	RTE_ETH_DEV_REMOVED,
 };
 
 /**
@@ -1997,6 +2003,17 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
 void _rte_eth_dev_reset(struct rte_eth_dev *dev);
 
 /**
+ * Check if an Ethernet device was physically removed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - 1 when the Ethernet device is removed, otherwise 0.
+ */
+int
+rte_eth_dev_is_removed(uint16_t port_id);
+
+/**
  * Allocate and set up a receive queue for an Ethernet device.
  *
  * The function allocates a contiguous block of memory for *nb_rx_desc*
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..78547ff 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -198,6 +198,13 @@ DPDK_17.11 {
 
 } DPDK_17.08;
 
+DPDK_18.02 {
+	global:
+
+	rte_eth_dev_is_removed;
+
+} DPDK_17.11;
+
 EXPERIMENTAL {
 	global:
 
-- 
1.8.3.1

* [PATCH v4 2/6] net/mlx4: support a device removal check operation
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 1/6] ethdev: add devop to check removal status Matan Azrad
@ 2018-01-10 12:31       ` Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 3/6] net/mlx5: " Matan Azrad
                         ` (4 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-10 12:31 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev

Add support for querying the removal status of an mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
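Reader's note: the removal check can be illustrated outside DPDK with a
minimal sketch. Every name in it (sketch_dev, sketch_drv_is_removed,
sketch_is_removed, the query_result field) is a hypothetical stand-in, not
a DPDK or libibverbs identifier, and the mocked query_result replaces the
real ibv_query_device() call. It assumes, as the driver code in this patch
does, that a device query failing with EIO means the device is gone, and
that the ethdev layer caches a positive answer in the port state.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ethdev port state; not the DPDK types. */
enum sketch_state { SKETCH_UNUSED = 0, SKETCH_ATTACHED, SKETCH_REMOVED };

struct sketch_dev {
	enum sketch_state state;
	int query_result;     /* value a mocked device query would return */
	unsigned int queries; /* how many times the device was queried */
};

#define SKETCH_MAX_PORTS 4
static struct sketch_dev sketch_devices[SKETCH_MAX_PORTS];

/* Driver-level check: a query that fails with EIO is treated as removal. */
static int
sketch_drv_is_removed(struct sketch_dev *dev)
{
	dev->queries++;
	return dev->query_result == EIO;
}

/* Generic check: unused ports report 0, a positive answer is cached. */
static int
sketch_is_removed(uint16_t port_id)
{
	struct sketch_dev *dev;

	assert(port_id < SKETCH_MAX_PORTS);
	dev = &sketch_devices[port_id];
	if (dev->state == SKETCH_UNUSED)
		return 0;
	if (dev->state == SKETCH_REMOVED)
		return 1; /* cached, the driver is not queried again */
	if (sketch_drv_is_removed(dev)) {
		dev->state = SKETCH_REMOVED;
		return 1;
	}
	return 0;
}
```

Caching the removed state means that repeated checks after a removal do
not touch the (now absent) hardware again.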
 drivers/net/mlx4/mlx4.c        |  1 +
 drivers/net/mlx4/mlx4.h        |  1 +
 drivers/net/mlx4/mlx4_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index f9e4f9d..3cde640 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -256,6 +256,7 @@ struct mlx4_conf {
 	.filter_ctrl = mlx4_filter_ctrl,
 	.rx_queue_intr_enable = mlx4_rx_intr_enable,
 	.rx_queue_intr_disable = mlx4_rx_intr_disable,
+	.is_removed = mlx4_is_removed,
 };
 
 /**
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index 3aeef87..0eaba89 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -165,6 +165,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
 int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
 		       struct rte_eth_fc_conf *fc_conf);
 const uint32_t *mlx4_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+int mlx4_is_removed(struct rte_eth_dev *dev);
 
 /* mlx4_intr.c */
 
diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
index 2f69e7d..0d46f5a 100644
--- a/drivers/net/mlx4/mlx4_ethdev.c
+++ b/drivers/net/mlx4/mlx4_ethdev.c
@@ -1060,3 +1060,23 @@ enum rxmode_toggle {
 	}
 	return NULL;
 }
+
+/**
+ * Check if mlx4 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx4_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1

* [PATCH v4 3/6] net/mlx5: support a device removal check operation
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 1/6] ethdev: add devop to check removal status Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 2/6] net/mlx4: support a device removal check operation Matan Azrad
@ 2018-01-10 12:31       ` Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 4/6] ethdev: adjust APIs removal error report Matan Azrad
                         ` (3 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-10 12:31 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev

Add support for querying the removal status of an mlx5 device.
It is not supported in a secondary process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        |  2 ++
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 0548d17..e0b781b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -303,6 +303,7 @@ struct mlx5_args {
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static const struct eth_dev_ops mlx5_dev_sec_ops = {
@@ -350,6 +351,7 @@ struct mlx5_args {
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e6a69b8..2ec7ae7 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -208,6 +208,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
 int mlx5_set_link_up(struct rte_eth_dev *dev);
 void priv_dev_select_tx_function(struct priv *priv, struct rte_eth_dev *dev);
 void priv_dev_select_rx_function(struct priv *priv, struct rte_eth_dev *dev);
+int mlx5_is_removed(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index a3cef68..5cf0849 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1474,3 +1474,23 @@ struct priv *
 		dev->rx_pkt_burst = mlx5_rx_burst;
 	}
 }
+
+/**
+ * Check if mlx5 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx5_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1

* [PATCH v4 4/6] ethdev: adjust APIs removal error report
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                         ` (2 preceding siblings ...)
  2018-01-10 12:31       ` [PATCH v4 3/6] net/mlx5: " Matan Azrad
@ 2018-01-10 12:31       ` Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 5/6] ethdev: adjust flow " Matan Azrad
                         ` (2 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-10 12:31 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev

The rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device is removed while a control command is executing, the
command can fail with many different errors.

Adjust the error reports of all ethdev APIs to return -EIO in case of
device removal, using the rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
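Reader's note: the error adjustment described in the commit message can be
sketched in isolation. The names below (demo_port_removed,
demo_port_is_removed, demo_eth_err) are hypothetical and exist only for
this illustration; the removal flag array stands in for the real
rte_eth_dev_is_removed() query. The sketch assumes the same policy as the
helper added in this patch: success is passed through untouched, and any
other return value is replaced by -EIO only when the port is found to be
removed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4

/* Hypothetical removal flags, one per port (stand-in for a real query). */
static int demo_port_removed[DEMO_MAX_PORTS];

static int
demo_port_is_removed(uint16_t port_id)
{
	assert(port_id < DEMO_MAX_PORTS);
	return demo_port_removed[port_id];
}

/* Normalize a driver return value: report -EIO if the port is removed. */
static int
demo_eth_err(uint16_t port_id, int ret)
{
	if (ret == 0)
		return 0; /* success is never rewritten */
	if (demo_port_is_removed(port_id))
		return -EIO; /* removal overrides the driver's own error */
	return ret; /* device still present: keep the original error */
}
```

Because the removal check runs only on the failure path, a successful
control operation pays no extra cost.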
 lib/librte_ether/rte_ethdev.c | 187 ++++++++++++++++++++++++++----------------
 lib/librte_ether/rte_ethdev.h |  51 +++++++++++-
 2 files changed, 165 insertions(+), 73 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index c759d0e..301d108 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -362,6 +362,16 @@ struct rte_eth_dev *
 	return -ENODEV;
 }
 
+static int
+eth_err(uint16_t port_id, int ret)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return -EIO;
+	return ret;
+}
+
 /* attach the new device, then store port_id of the device */
 int
 rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
@@ -516,7 +526,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
+							     rx_queue_id));
 
 }
 
@@ -542,7 +553,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_stop(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_stop(dev, rx_queue_id));
 
 }
 
@@ -568,7 +579,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_start(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_start(dev,
+							     tx_queue_id));
 
 }
 
@@ -594,7 +606,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_stop(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_stop(dev, tx_queue_id));
 
 }
 
@@ -912,7 +924,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	/* Initialize Rx profiling if enabled at compilation time. */
@@ -922,7 +934,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	return 0;
@@ -1022,7 +1034,7 @@ struct rte_eth_dev *
 	if (diag == 0)
 		dev->data->dev_started = 1;
 	else
-		return diag;
+		return eth_err(port_id, diag);
 
 	rte_eth_dev_config_restore(port_id);
 
@@ -1064,7 +1076,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_up, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_up)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_up)(dev));
 }
 
 int
@@ -1077,7 +1089,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_down, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_down)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_down)(dev));
 }
 
 void
@@ -1114,7 +1126,7 @@ struct rte_eth_dev *
 	rte_eth_dev_stop(port_id);
 	ret = dev->dev_ops->dev_reset(dev);
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -1238,7 +1250,7 @@ struct rte_eth_dev *
 			dev->data->min_rx_buf_size = mbp_buf_size;
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 /**
@@ -1357,8 +1369,8 @@ struct rte_eth_dev *
 					  &local_conf.offloads);
 	}
 
-	return (*dev->dev_ops->tx_queue_setup)(dev, tx_queue_id, nb_tx_desc,
-					       socket_id, &local_conf);
+	return eth_err(port_id, (*dev->dev_ops->tx_queue_setup)(dev,
+		       tx_queue_id, nb_tx_desc, socket_id, &local_conf));
 }
 
 void
@@ -1414,14 +1426,16 @@ struct rte_eth_dev *
 rte_eth_tx_done_cleanup(uint16_t port_id, uint16_t queue_id, uint32_t free_cnt)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	int ret;
 
 	/* Validate Input Data. Bail if not valid or not supported. */
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
 
 	/* Call driver to free pending mbufs. */
-	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
-			free_cnt);
+	ret = (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
+					       free_cnt);
+	return eth_err(port_id, ret);
 }
 
 void
@@ -1558,7 +1572,7 @@ struct rte_eth_dev *
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
 	stats->rx_nombuf = dev->data->rx_mbuf_alloc_failed;
-	return (*dev->dev_ops->stats_get)(dev, stats);
+	return eth_err(port_id, (*dev->dev_ops->stats_get)(dev, stats));
 }
 
 int
@@ -1604,12 +1618,12 @@ struct rte_eth_dev *
 		count = (*dev->dev_ops->xstats_get_names_by_id)(dev, NULL,
 				NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	}
 	if (dev->dev_ops->xstats_get_names != NULL) {
 		count = (*dev->dev_ops->xstats_get_names)(dev, NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	} else
 		count = 0;
 
@@ -1736,8 +1750,12 @@ struct rte_eth_dev *
 	}
 
 	/* Fill xstats_names_copy structure */
-	rte_eth_xstats_get_names(port_id, xstats_names_copy, expected_entries);
-
+	ret = rte_eth_xstats_get_names(port_id, xstats_names_copy,
+				       expected_entries);
+	if (ret < 0) {
+		free(xstats_names_copy);
+		return ret;
+	}
 	/* Filter stats */
 	for (i = 0; i < size; i++) {
 		if (ids[i] >= expected_entries) {
@@ -1810,7 +1828,7 @@ struct rte_eth_dev *
 			xstats_names + cnt_used_entries,
 			size - cnt_used_entries);
 		if (cnt_driver_entries < 0)
-			return cnt_driver_entries;
+			return eth_err(port_id, cnt_driver_entries);
 		cnt_used_entries += cnt_driver_entries;
 	}
 
@@ -1830,7 +1848,10 @@ struct rte_eth_dev *
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
-	expected_entries = get_xstats_count(port_id);
+	ret = get_xstats_count(port_id);
+	if (ret < 0)
+		return ret;
+	expected_entries = (uint16_t)ret;
 	struct rte_eth_xstat xstats[expected_entries];
 	dev = &rte_eth_devices[port_id];
 
@@ -1901,6 +1922,7 @@ struct rte_eth_dev *
 	signed int xcount = 0;
 	uint64_t val, *stats_ptr;
 	uint16_t nb_rxqs, nb_txqs;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -1923,7 +1945,7 @@ struct rte_eth_dev *
 				     (n > count) ? n - count : 0);
 
 		if (xcount < 0)
-			return xcount;
+			return eth_err(port_id, xcount);
 	}
 
 	if (n < count + xcount || xstats == NULL)
@@ -1931,7 +1953,9 @@ struct rte_eth_dev *
 
 	/* now fill the xstats structure */
 	count = 0;
-	rte_eth_stats_get(port_id, &eth_stats);
+	ret = rte_eth_stats_get(port_id, &eth_stats);
+	if (ret < 0)
+		return ret;
 
 	/* global stats */
 	for (i = 0; i < RTE_NB_STATS; i++) {
@@ -2011,8 +2035,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_tx_queue_stats_mapping(uint16_t port_id, uint16_t tx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, tx_queue_id, stat_idx,
-			STAT_QMAP_TX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, tx_queue_id,
+						stat_idx, STAT_QMAP_TX));
 }
 
 
@@ -2020,8 +2044,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id, uint16_t rx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, rx_queue_id, stat_idx,
-			STAT_QMAP_RX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, rx_queue_id,
+						stat_idx, STAT_QMAP_RX));
 }
 
 int
@@ -2033,7 +2057,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->fw_version_get, -ENOTSUP);
-	return (*dev->dev_ops->fw_version_get)(dev, fw_version, fw_size);
+	return eth_err(port_id, (*dev->dev_ops->fw_version_get)(dev,
+							fw_version, fw_size));
 }
 
 void
@@ -2123,7 +2148,7 @@ struct rte_eth_dev *
 	if (!ret)
 		dev->data->mtu = mtu;
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2163,7 +2188,7 @@ struct rte_eth_dev *
 			vfc->ids[vidx] &= ~(UINT64_C(1) << vbit);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2196,7 +2221,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_tpid_set, -ENOTSUP);
 
-	return (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type, tpid);
+	return eth_err(port_id, (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type,
+							       tpid));
 }
 
 int
@@ -2274,7 +2300,7 @@ struct rte_eth_dev *
 					    &dev->data->dev_conf.rxmode);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2309,9 +2335,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_pvid_set, -ENOTSUP);
-	(*dev->dev_ops->vlan_pvid_set)(dev, pvid, on);
 
-	return 0;
+	return eth_err(port_id, (*dev->dev_ops->vlan_pvid_set)(dev, pvid, on));
 }
 
 int
@@ -2323,7 +2348,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_get, -ENOTSUP);
 	memset(fc_conf, 0, sizeof(*fc_conf));
-	return (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf));
 }
 
 int
@@ -2339,7 +2364,7 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_set, -ENOTSUP);
-	return (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf));
 }
 
 int
@@ -2357,7 +2382,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	/* High water, low water validation are device specific */
 	if  (*dev->dev_ops->priority_flow_ctrl_set)
-		return (*dev->dev_ops->priority_flow_ctrl_set)(dev, pfc_conf);
+		return eth_err(port_id, (*dev->dev_ops->priority_flow_ctrl_set)
+					(dev, pfc_conf));
 	return -ENOTSUP;
 }
 
@@ -2432,7 +2458,8 @@ struct rte_eth_dev *
 		return ret;
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_update, -ENOTSUP);
-	return (*dev->dev_ops->reta_update)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_update)(dev, reta_conf,
+							     reta_size));
 }
 
 int
@@ -2452,7 +2479,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_query, -ENOTSUP);
-	return (*dev->dev_ops->reta_query)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_query)(dev, reta_conf,
+							    reta_size));
 }
 
 int
@@ -2464,7 +2492,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_update, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_update)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_update)(dev,
+								 rss_conf));
 }
 
 int
@@ -2476,7 +2505,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_conf_get, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_conf_get)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_conf_get)(dev,
+								   rss_conf));
 }
 
 int
@@ -2498,7 +2528,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_add, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_add)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_add)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2521,7 +2552,8 @@ struct rte_eth_dev *
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_del, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_del)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_del)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2532,7 +2564,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_on, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_on)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_on)(dev));
 }
 
 int
@@ -2543,7 +2575,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_off, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_off)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_off)(dev));
 }
 
 /*
@@ -2619,7 +2651,7 @@ struct rte_eth_dev *
 		dev->data->mac_pool_sel[index] |= (1ULL << pool);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2745,7 +2777,7 @@ struct rte_eth_dev *
 					&dev->data->hash_mac_addrs[index]);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2758,7 +2790,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->uc_all_hash_table_set, -ENOTSUP);
-	return (*dev->dev_ops->uc_all_hash_table_set)(dev, on);
+	return eth_err(port_id, (*dev->dev_ops->uc_all_hash_table_set)(dev,
+								       on));
 }
 
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -2788,7 +2821,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_queue_rate_limit, -ENOTSUP);
-	return (*dev->dev_ops->set_queue_rate_limit)(dev, queue_idx, tx_rate);
+	return eth_err(port_id, (*dev->dev_ops->set_queue_rate_limit)(dev,
+							queue_idx, tx_rate));
 }
 
 int
@@ -2826,7 +2860,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_set, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_set)(dev, mirror_conf, rule_id, on);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_set)(dev,
+						mirror_conf, rule_id, on));
 }
 
 int
@@ -2839,7 +2874,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_reset, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_reset)(dev, rule_id);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_reset)(dev,
+								   rule_id));
 }
 
 int
@@ -3061,7 +3097,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_enable)(dev,
+								queue_id));
 }
 
 int
@@ -3075,7 +3112,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_disable)(dev,
+								queue_id));
 }
 
 
@@ -3103,7 +3141,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);
-	return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op, arg);
+	return eth_err(port_id, (*dev->dev_ops->filter_ctrl)(dev, filter_type,
+							     filter_op, arg));
 }
 
 void *
@@ -3353,7 +3392,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_mc_addr_list, -ENOTSUP);
-	return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+	return eth_err(port_id, dev->dev_ops->set_mc_addr_list(dev,
+						mc_addr_set, nb_mc_addr));
 }
 
 int
@@ -3365,7 +3405,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_enable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_enable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_enable)(dev));
 }
 
 int
@@ -3377,7 +3417,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_disable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_disable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_disable)(dev));
 }
 
 int
@@ -3390,7 +3430,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_rx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_rx_timestamp)(dev, timestamp, flags);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_rx_timestamp)
+				(dev, timestamp, flags));
 }
 
 int
@@ -3403,7 +3444,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_tx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_tx_timestamp)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_tx_timestamp)
+				(dev, timestamp));
 }
 
 int
@@ -3415,7 +3457,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_adjust_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_adjust_time)(dev, delta);
+	return eth_err(port_id, (*dev->dev_ops->timesync_adjust_time)(dev,
+								      delta));
 }
 
 int
@@ -3427,7 +3470,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_time)(dev,
+								timestamp));
 }
 
 int
@@ -3439,7 +3483,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_write_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_write_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_write_time)(dev,
+								timestamp));
 }
 
 int
@@ -3451,7 +3496,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_reg, -ENOTSUP);
-	return (*dev->dev_ops->get_reg)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_reg)(dev, info));
 }
 
 int
@@ -3463,7 +3508,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom_length, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom_length)(dev);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom_length)(dev));
 }
 
 int
@@ -3475,7 +3520,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom)(dev, info));
 }
 
 int
@@ -3487,7 +3532,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->set_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->set_eeprom)(dev, info));
 }
 
 int
@@ -3502,7 +3547,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	memset(dcb_info, 0, sizeof(struct rte_eth_dcb_info));
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_dcb_info, -ENOTSUP);
-	return (*dev->dev_ops->get_dcb_info)(dev, dcb_info);
+	return eth_err(port_id, (*dev->dev_ops->get_dcb_info)(dev, dcb_info));
 }
 
 int
@@ -3525,7 +3570,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_eth_type_conf,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev, l2_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev,
+								l2_tunnel));
 }
 
 int
@@ -3556,7 +3602,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_offload_set,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_offload_set)(dev,
+							l2_tunnel, mask, en));
 }
 
 static void
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 3aa9d3f..936bf79 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2047,6 +2047,7 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
  *   memory buffers to populate each descriptor of the receive ring.
  * @return
  *   - 0: Success, receive queue correctly set up.
+ *   - -EIO: if device is removed.
  *   - -EINVAL: The size of network buffers which can be allocated from the
  *      memory pool does not fit the various buffer sizes allowed by the
  *      device controller.
@@ -2147,6 +2148,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_start(uint16_t port_id, uint16_t rx_queue_id);
@@ -2163,6 +2165,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_stop(uint16_t port_id, uint16_t rx_queue_id);
@@ -2180,6 +2183,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_start(uint16_t port_id, uint16_t tx_queue_id);
@@ -2196,6 +2200,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_stop(uint16_t port_id, uint16_t tx_queue_id);
@@ -2297,7 +2302,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  *   - (-EINVAL) if port identifier is invalid.
  *   - (-ENOTSUP) if hardware doesn't support this function.
  *   - (-EPERM) if not ran from the primary process.
- *   - (-EIO) if re-initialisation failed.
+ *   - (-EIO) if re-initialisation failed or device is removed.
  *   - (-ENOMEM) if the reset failed due to OOM.
  *   - (-EAGAIN) if the reset temporarily failed and should be retried later.
  */
@@ -2533,6 +2538,7 @@ int rte_eth_xstats_get_by_id(uint16_t port_id, const uint64_t *ids,
  * @return
  *    0 on success
  *    -ENODEV for invalid port_id,
+ *    -EIO if device is removed,
  *    -EINVAL if the xstat_name doesn't exist in port_id
  */
 int rte_eth_xstats_get_id_by_name(uint16_t port_id, const char *xstat_name,
@@ -2624,6 +2630,7 @@ int rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (>0) if *fw_size* is not enough to store firmware version, return
  *          the size of the non truncated string.
  */
@@ -2695,6 +2702,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if *mtu* invalid.
  *   - (-EBUSY) if operation is not allowed when the port is running
  */
@@ -2715,6 +2723,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSYS) if VLAN filtering on *port_id* disabled.
  *   - (-EINVAL) if *vlan_id* > 4095.
  */
@@ -2757,6 +2766,7 @@ int rte_eth_dev_set_vlan_strip_on_queue(uint16_t port_id, uint16_t rx_queue_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN TPID setup is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
 				    enum rte_vlan_type vlan_type,
@@ -2781,6 +2791,7 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_offload(uint16_t port_id, int offload_mask);
 
@@ -3522,6 +3533,7 @@ struct rte_eth_dev_tx_buffer {
  * @return
  *   Failure: < 0
  *     -ENODEV: Invalid interface
+ *     -EIO: device is removed
  *     -ENOTSUP: Driver does not support function
  *   Success: >= 0
  *     0-n: Number of packets freed. More packets may still remain in ring that
@@ -3634,6 +3646,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_enable(uint16_t port_id, uint16_t queue_id);
 
@@ -3655,6 +3668,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_disable(uint16_t port_id, uint16_t queue_id);
 
@@ -3712,6 +3726,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_on(uint16_t port_id);
 
@@ -3726,6 +3741,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_off(uint16_t port_id);
 
@@ -3740,6 +3756,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support flow control.
  *   - (-ENODEV)  if *port_id* invalid.
+ *   - (-EIO)  if device is removed.
  */
 int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3756,7 +3773,7 @@ int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3774,7 +3791,7 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support priority flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
 				struct rte_eth_pfc_conf *pfc_conf);
@@ -3794,6 +3811,7 @@ int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
  *   - (0) if successfully added or *mac_addr" was already added.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port* is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSPC) if no more MAC addresses can be added.
  *   - (-EINVAL) if MAC address is invalid.
  */
@@ -3845,6 +3863,7 @@ int rte_eth_dev_default_mac_addr_set(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_update(uint16_t port,
 				struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3864,6 +3883,7 @@ int rte_eth_dev_rss_reta_update(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_query(uint16_t port,
 			       struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3885,6 +3905,7 @@ int rte_eth_dev_rss_reta_query(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
@@ -3905,6 +3926,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_all_hash_table_set(uint16_t port, uint8_t on);
@@ -3928,6 +3950,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if the mr_conf information is not correct.
  */
 int rte_eth_mirror_rule_set(uint16_t port_id,
@@ -3946,6 +3969,7 @@ int rte_eth_mirror_rule_set(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_mirror_rule_reset(uint16_t port_id,
@@ -3964,6 +3988,7 @@ int rte_eth_mirror_rule_reset(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -3979,6 +4004,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
  */
@@ -3996,6 +4022,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support RSS.
  */
 int
@@ -4017,6 +4044,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4039,6 +4067,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4057,6 +4086,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this filter type.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_filter_supported(uint16_t port_id,
 		enum rte_filter_type filter_type);
@@ -4077,6 +4107,7 @@ int rte_eth_dev_filter_supported(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
@@ -4092,6 +4123,7 @@ int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  */
 int rte_eth_dev_get_dcb_info(uint16_t port_id,
@@ -4299,6 +4331,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info);
@@ -4312,6 +4345,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (>=0) EEPROM size if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom_length(uint16_t port_id);
@@ -4328,6 +4362,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4344,6 +4379,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_set_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4362,6 +4398,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if PMD of *port_id* doesn't support multicast filtering.
  *   - (-ENOSPC) if *port_id* has not enough multicast filtering resources.
  */
@@ -4378,6 +4415,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_enable(uint16_t port_id);
@@ -4391,6 +4429,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_disable(uint16_t port_id);
@@ -4410,6 +4449,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
@@ -4427,6 +4467,7 @@ int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
@@ -4446,6 +4487,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_adjust_time(uint16_t port_id, int64_t delta);
@@ -4481,6 +4523,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
@@ -4521,6 +4564,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4548,6 +4592,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v4 5/6] ethdev: adjust flow APIs removal error report
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                         ` (3 preceding siblings ...)
  2018-01-10 12:31       ` [PATCH v4 4/6] ethdev: adjust APIs removal error report Matan Azrad
@ 2018-01-10 12:31       ` Matan Azrad
  2018-01-10 12:31       ` [PATCH v4 6/6] net/failsafe: fix removed device handling Matan Azrad
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-10 12:31 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev

The rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during flow command execution, many
different errors can be reported to the user.

Adjust all flow API error reports to return -EIO in case of device
removal, using the rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
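The error adjustment described above can be sketched in isolation. This
is only an illustration, not the ethdev code: port_is_removed() and the
removed_ports table are hypothetical stand-ins for
rte_eth_dev_is_removed(), and the sketch returns -EIO directly where the
patch goes through rte_flow_error_set().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for rte_eth_dev_is_removed(): a real caller
 * would query the ethdev layer instead of a local flag table. */
static bool removed_ports[4];

static bool port_is_removed(unsigned int port_id)
{
	return port_id < 4 && removed_ports[port_id];
}

/* Same shape as flow_err() in the patch: success passes through, any
 * failure on a removed port is reported as -EIO, and any other failure
 * keeps the code returned by the driver. */
static int adjust_flow_err(unsigned int port_id, int ret)
{
	if (ret == 0)
		return 0;
	if (port_is_removed(port_id))
		return -EIO;
	return ret;
}
```

The removal check runs only on the failure path, so a successful flow
call costs nothing extra.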
 lib/librte_ether/rte_flow.c | 34 +++++++++++++++++++++++++++-------
 lib/librte_ether/rte_flow.h |  2 ++
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
index 6659063..be481ce 100644
--- a/lib/librte_ether/rte_flow.c
+++ b/lib/librte_ether/rte_flow.c
@@ -106,6 +106,18 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
 };
 
+static int
+flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return rte_flow_error_set(error, EIO,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(EIO));
+	return ret;
+}
+
 /* Get generic flow operations structure from a port. */
 const struct rte_flow_ops *
 rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
@@ -144,7 +156,8 @@ struct rte_flow_desc_data {
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->validate))
-		return ops->validate(dev, attr, pattern, actions, error);
+		return flow_err(port_id, ops->validate(dev, attr, pattern,
+						       actions, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -159,12 +172,17 @@ struct rte_flow *
 		struct rte_flow_error *error)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	struct rte_flow *flow;
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
 
 	if (unlikely(!ops))
 		return NULL;
-	if (likely(!!ops->create))
-		return ops->create(dev, attr, pattern, actions, error);
+	if (likely(!!ops->create)) {
+		flow = ops->create(dev, attr, pattern, actions, error);
+		if (flow == NULL)
+			flow_err(port_id, -rte_errno, error);
+		return flow;
+	}
 	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 			   NULL, rte_strerror(ENOSYS));
 	return NULL;
@@ -182,7 +200,8 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->destroy))
-		return ops->destroy(dev, flow, error);
+		return flow_err(port_id, ops->destroy(dev, flow, error),
+				error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -199,7 +218,7 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->flush))
-		return ops->flush(dev, error);
+		return flow_err(port_id, ops->flush(dev, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -219,7 +238,8 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->query))
-		return ops->query(dev, flow, action, data, error);
+		return flow_err(port_id, ops->query(dev, flow, action, data,
+						    error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -237,7 +257,7 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->isolate))
-		return ops->isolate(dev, set, error);
+		return flow_err(port_id, ops->isolate(dev, set, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index 47c88ea..180438f 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -1237,6 +1237,8 @@ struct rte_flow_error {
  *
  *   -ENOSYS: underlying device does not support this functionality.
  *
+ *   -EIO: underlying device is removed.
+ *
  *   -EINVAL: unknown or invalid rule specification.
  *
  *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v4 6/6] net/failsafe: fix removed device handling
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                         ` (4 preceding siblings ...)
  2018-01-10 12:31       ` [PATCH v4 5/6] ethdev: adjust flow " Matan Azrad
@ 2018-01-10 12:31       ` Matan Azrad
  2018-01-10 12:43         ` Matan Azrad
  2018-01-10 13:47         ` Gaëtan Rivet
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  6 siblings, 2 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-10 12:31 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev

There is a window of time between the physical removal of a device and
the moment its sub-device PMD gets the RMV interrupt. During this
window, DPDK PMDs and applications are not yet aware of the removal and
may call a sub-device control operation, which should return an error.

In the previous code this error was reported to the application,
contrary to the fail-safe principle that the application should not be
aware of device removal.

Add a removal check to each relevant control command error flow and
do not report an error to the application when the sub-device is
removed.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
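A standalone sketch of the error filtering that the helper in this patch
performs. The names sub_device_stub, last_errno and fs_err_sketch are
hypothetical and exist only for this illustration: the struct is a
reduced stand-in for the fail-safe sub_device, and last_errno stands in
for rte_errno.

```c
#include <errno.h>

/* Reduced, hypothetical stand-in for the fail-safe struct sub_device. */
struct sub_device_stub {
	unsigned int remove; /* non-zero once a removal has been flagged */
};

/* Stands in for rte_errno in this sketch. */
static int last_errno;

/* A failure is swallowed when the sub-device is already flagged as
 * removed or when the lower layer reported -EIO; every other value,
 * including 0, is passed through unchanged. */
static int fs_err_sketch(const struct sub_device_stub *sdev, int err)
{
	if (sdev->remove == 1 || err == -EIO) {
		last_errno = 0;
		return 0;
	}
	return err;
}
```

With this shape a call site can keep a single "if (ret)" test: a zero
result means either success or a failure caused by the removal.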
 drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
 drivers/net/failsafe/failsafe_ops.c     | 35 ++++++++++++++++++++++-----------
 drivers/net/failsafe/failsafe_private.h | 11 +++++++++++
 3 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
index 153ceee..c072d1e 100644
--- a/drivers/net/failsafe/failsafe_flow.c
+++ b/drivers/net/failsafe/failsafe_flow.c
@@ -87,7 +87,7 @@
 		DEBUG("Calling rte_flow_validate on sub_device %d", i);
 		ret = rte_flow_validate(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_validate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -111,7 +111,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (flow->flows[i] == NULL) {
+		if (flow->flows[i] == NULL && fs_err(sdev, -rte_errno)) {
 			ERROR("Failed to create flow on sub_device %d",
 				i);
 			goto err;
@@ -150,7 +150,7 @@
 			continue;
 		local_ret = rte_flow_destroy(PORT_ID(sdev),
 				flow->flows[i], error);
-		if (local_ret) {
+		if ((local_ret = fs_err(sdev, local_ret))) {
 			ERROR("Failed to destroy flow on sub_device %d: %d",
 					i, local_ret);
 			if (ret == 0)
@@ -175,7 +175,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_flow_flush on sub_device %d", i);
 		ret = rte_flow_flush(PORT_ID(sdev), error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_flush failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -199,8 +199,12 @@
 
 	sdev = TX_SUBDEV(dev);
 	if (sdev != NULL) {
-		return rte_flow_query(PORT_ID(sdev),
-				flow->flows[SUB_ID(sdev)], type, arg, error);
+		int ret = rte_flow_query(PORT_ID(sdev),
+					 flow->flows[SUB_ID(sdev)],
+					 type, arg, error);
+
+		if ((ret = fs_err(sdev, ret)))
+			return ret;
 	}
 	WARN("No active sub_device to query about its flow");
 	return -1;
@@ -223,7 +227,7 @@
 			WARN("flow isolation mode of sub_device %d in incoherent state.",
 				i);
 		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_isolate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index e16a590..f5390db 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -121,6 +121,8 @@
 					dev->data->nb_tx_queues,
 					&dev->data->dev_conf);
 		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
 			ERROR("Could not configure sub_device %d", i);
 			return ret;
 		}
@@ -163,8 +165,11 @@
 			continue;
 		DEBUG("Starting sub_device %d", i);
 		ret = rte_eth_dev_start(PORT_ID(sdev));
-		if (ret)
+		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
 			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -196,7 +201,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
 		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -215,7 +220,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
 		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -300,7 +305,7 @@
 				rx_queue_id,
 				nb_rx_desc, socket_id,
 				rx_conf, mb_pool);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("RX queue setup failed for sub_device %d", i);
 			goto free_rxq;
 		}
@@ -366,7 +371,7 @@
 				tx_queue_id,
 				nb_tx_desc, socket_id,
 				tx_conf);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("TX queue setup failed for sub_device %d", i);
 			goto free_txq;
 		}
@@ -445,7 +450,8 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling link_update on sub_device %d", i);
 		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1) {
+		if (ret && ret != -1 && sdev->remove == 0 &&
+		    rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
 			ERROR("Link update failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -469,6 +475,7 @@
 fs_stats_get(struct rte_eth_dev *dev,
 	     struct rte_eth_stats *stats)
 {
+	struct rte_eth_stats backup;
 	struct sub_device *sdev;
 	uint8_t i;
 	int ret;
@@ -478,14 +485,20 @@
 		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
 		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
 
+		rte_memcpy(&backup, snapshot, sizeof(backup));
 		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
 		if (ret) {
+			if (!fs_err(sdev, ret)) {
+				rte_memcpy(snapshot, &backup, sizeof(backup));
+				goto inc;
+			}
 			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
 				  i, ret);
 			*timestamp = 0;
 			return ret;
 		}
 		*timestamp = rte_rdtsc();
+inc:
 		failsafe_stats_increment(stats, snapshot);
 	}
 	return 0;
@@ -598,7 +611,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
 		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -617,7 +630,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
 		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -651,7 +664,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
 		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -688,7 +701,7 @@
 	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
 			      PRIu8 " with error %d", i, ret);
 			return ret;
@@ -730,7 +743,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
 		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3c..a306970 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -375,4 +375,15 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
 	rte_wmb();
 }
 
+/*
+ * Adjust error value and rte_errno to the fail-safe actual error value.
+ */
+static inline int
+fs_err(struct sub_device *sdev, int err)
+{
+	/* A device removal shouldn't be reported as an error. */
+	if (sdev->remove == 1 || err == -EIO)
+		return rte_errno = 0;
+	return err;
+}
 #endif /* _RTE_ETH_FAILSAFE_PRIVATE_H_ */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* Re: [PATCH v4 6/6] net/failsafe: fix removed device handling
  2018-01-10 12:31       ` [PATCH v4 6/6] net/failsafe: fix removed device handling Matan Azrad
@ 2018-01-10 12:43         ` Matan Azrad
  2018-01-10 13:51           ` Gaëtan Rivet
  2018-01-10 13:47         ` Gaëtan Rivet
  1 sibling, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-10 12:43 UTC (permalink / raw)
  To: Matan Azrad, Gaetan Rivet; +Cc: dev, Thomas Monjalon

Hi Gaetan

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> Sent: Wednesday, January 10, 2018 2:31 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v4 6/6] net/failsafe: fix removed device handling
> 
> There is time between the physical removal of the device until sub-device
> PMDs get a RMV interrupt. At this time DPDK PMDs and applications still
> don't know about the removal and may call sub-device control operation
> which should return an error.
> 
> In previous code this error is reported to the application contrary to fail-safe
> principle that the app should not be aware of device removal.
> 
> Add a removal check in each relevant control command error flow and
> prevent an error report to application when the sub-device is removed.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
>  drivers/net/failsafe/failsafe_ops.c     | 35 ++++++++++++++++++++++--------
> ---
>  drivers/net/failsafe/failsafe_private.h | 11 +++++++++++
>  3 files changed, 46 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/failsafe/failsafe_flow.c
> b/drivers/net/failsafe/failsafe_flow.c
> index 153ceee..c072d1e 100644
> --- a/drivers/net/failsafe/failsafe_flow.c
> +++ b/drivers/net/failsafe/failsafe_flow.c
> @@ -87,7 +87,7 @@
>  		DEBUG("Calling rte_flow_validate on sub_device %d", i);
>  		ret = rte_flow_validate(PORT_ID(sdev),
>  				attr, patterns, actions, error);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
This assignment inside the "if" statement causes a checkpatch error. I sent it as is because you asked for it this way.
If you think I need to change it, I see two options:

1.
ret = fs_err(sdev, ret);
if (ret) {...}

2.
if (fs_err(sdev, &ret)) {...}

What do you think?
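
As a rough illustration of the two options, here is a standalone sketch.
All names in it (sdev_stub, fs_err_val, fs_err_ptr and the two
call_site_* helpers) are hypothetical, and the pointer variant assumes
fs_err would be changed to adjust the value in place.

```c
#include <errno.h>

/* Reduced, hypothetical stand-in for struct sub_device. */
struct sdev_stub {
	unsigned int remove;
};

/* Option 1 helper: returns the adjusted error value. */
static int fs_err_val(const struct sdev_stub *sdev, int err)
{
	return (sdev->remove == 1 || err == -EIO) ? 0 : err;
}

/* Option 2 helper: adjusts the caller's variable in place and returns
 * the adjusted value so it can be tested directly. */
static int fs_err_ptr(const struct sdev_stub *sdev, int *err)
{
	if (sdev->remove == 1 || *err == -EIO)
		*err = 0;
	return *err;
}

/* Option 1 call site: assignment on its own line, then the test. */
static int call_site_option1(const struct sdev_stub *sdev, int ret)
{
	ret = fs_err_val(sdev, ret);
	if (ret)
		return ret;
	return 0;
}

/* Option 2 call site: no assignment inside the condition. */
static int call_site_option2(const struct sdev_stub *sdev, int ret)
{
	if (fs_err_ptr(sdev, &ret))
		return ret;
	return 0;
}
```

Both call sites avoid the assignment inside the condition and return the
same value for the same input; they differ only in whether the helper
returns the new value or writes it through a pointer.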

>  			ERROR("Operation rte_flow_validate failed for
> sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -111,7 +111,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
>  				attr, patterns, actions, error);
> -		if (flow->flows[i] == NULL) {
> +		if (flow->flows[i] == NULL && fs_err(sdev, -rte_errno)) {
>  			ERROR("Failed to create flow on sub_device %d",
>  				i);
>  			goto err;
> @@ -150,7 +150,7 @@
>  			continue;
>  		local_ret = rte_flow_destroy(PORT_ID(sdev),
>  				flow->flows[i], error);
> -		if (local_ret) {
> +		if ((local_ret = fs_err(sdev, local_ret))) {
>  			ERROR("Failed to destroy flow on sub_device %d:
> %d",
>  					i, local_ret);
>  			if (ret == 0)
> @@ -175,7 +175,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_flow_flush on sub_device %d", i);
>  		ret = rte_flow_flush(PORT_ID(sdev), error);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_flow_flush failed for
> sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -199,8 +199,12 @@
> 
>  	sdev = TX_SUBDEV(dev);
>  	if (sdev != NULL) {
> -		return rte_flow_query(PORT_ID(sdev),
> -				flow->flows[SUB_ID(sdev)], type, arg, error);
> +		int ret = rte_flow_query(PORT_ID(sdev),
> +					 flow->flows[SUB_ID(sdev)],
> +					 type, arg, error);
> +
> +		if ((ret = fs_err(sdev, ret)))
> +			return ret;
>  	}
>  	WARN("No active sub_device to query about its flow");
>  	return -1;
> @@ -223,7 +227,7 @@
>  			WARN("flow isolation mode of sub_device %d in
> incoherent state.",
>  				i);
>  		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_flow_isolate failed for
> sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> diff --git a/drivers/net/failsafe/failsafe_ops.c
> b/drivers/net/failsafe/failsafe_ops.c
> index e16a590..f5390db 100644
> --- a/drivers/net/failsafe/failsafe_ops.c
> +++ b/drivers/net/failsafe/failsafe_ops.c
> @@ -121,6 +121,8 @@
>  					dev->data->nb_tx_queues,
>  					&dev->data->dev_conf);
>  		if (ret) {
> +			if (!fs_err(sdev, ret))
> +				continue;
>  			ERROR("Could not configure sub_device %d", i);
>  			return ret;
>  		}
> @@ -163,8 +165,11 @@
>  			continue;
>  		DEBUG("Starting sub_device %d", i);
>  		ret = rte_eth_dev_start(PORT_ID(sdev));
> -		if (ret)
> +		if (ret) {
> +			if (!fs_err(sdev, ret))
> +				continue;
>  			return ret;
> +		}
>  		sdev->state = DEV_STARTED;
>  	}
>  	if (PRIV(dev)->state < DEV_STARTED)
> @@ -196,7 +201,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
>  		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -215,7 +220,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
>  		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -300,7 +305,7 @@
>  				rx_queue_id,
>  				nb_rx_desc, socket_id,
>  				rx_conf, mb_pool);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("RX queue setup failed for sub_device %d", i);
>  			goto free_rxq;
>  		}
> @@ -366,7 +371,7 @@
>  				tx_queue_id,
>  				nb_tx_desc, socket_id,
>  				tx_conf);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("TX queue setup failed for sub_device %d", i);
>  			goto free_txq;
>  		}
> @@ -445,7 +450,8 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling link_update on sub_device %d", i);
>  		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
> -		if (ret && ret != -1) {
> +		if (ret && ret != -1 && sdev->remove == 0 &&
> +		    rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
>  			ERROR("Link update failed for sub_device %d with error %d",
>  			      i, ret);
>  			return ret;
> @@ -469,6 +475,7 @@
>  fs_stats_get(struct rte_eth_dev *dev,
>  	     struct rte_eth_stats *stats)
>  {
> +	struct rte_eth_stats backup;
>  	struct sub_device *sdev;
>  	uint8_t i;
>  	int ret;
> @@ -478,14 +485,20 @@
>  		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
>  		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
> 
> +		rte_memcpy(&backup, snapshot, sizeof(backup));
>  		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
>  		if (ret) {
> +			if (!fs_err(sdev, ret)) {
> +				rte_memcpy(snapshot, &backup, sizeof(backup));
> +				goto inc;
> +			}
>  			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
>  				  i, ret);
>  			*timestamp = 0;
>  			return ret;
>  		}
>  		*timestamp = rte_rdtsc();
> +inc:
>  		failsafe_stats_increment(stats, snapshot);
>  	}
>  	return 0;
> @@ -598,7 +611,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
>  		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
>  			      i, ret);
>  			return ret;
> @@ -617,7 +630,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
>  		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -651,7 +664,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
>  		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> @@ -688,7 +701,7 @@
>  	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
>  			      PRIu8 " with error %d", i, ret);
>  			return ret;
> @@ -730,7 +743,7 @@
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
>  		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
> -		if (ret) {
> +		if ((ret = fs_err(sdev, ret))) {
>  			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
>  			      " with error %d", i, ret);
>  			return ret;
> diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> index d81cc3c..a306970 100644
> --- a/drivers/net/failsafe/failsafe_private.h
> +++ b/drivers/net/failsafe/failsafe_private.h
> @@ -375,4 +375,15 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
>  	rte_wmb();
>  }
> 
> +/*
> + * Adjust error value and rte_errno to the fail-safe actual error value.
> + */
> +static inline int
> +fs_err(struct sub_device *sdev, int err) {
> +	/* A device removal shouldn't be reported as an error. */
> +	if (sdev->remove == 1 || err == -EIO)
> +		return rte_errno = 0;
> +	return err;
> +}
>  #endif /* _RTE_ETH_FAILSAFE_PRIVATE_H_ */
> --
> 1.8.3.1
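
The fs_stats_get hunk quoted above saves the last good snapshot before asking the sub-device for fresh counters, and restores it if the read fails because of a removal. A minimal sketch of that pattern follows. It uses mock types only: mock_stats, mock_subdev, mock_stats_get and aggregate_stats are hypothetical names for illustration, not fail-safe or ethdev symbols.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the DPDK types used by the fail-safe PMD. */
struct mock_stats {
	uint64_t ipackets;
	uint64_t opackets;
};

struct mock_subdev {
	int removed;                /* 1 once the device is gone */
	struct mock_stats hw;       /* counters the device would report */
	struct mock_stats snapshot; /* last successfully read counters */
};

/* Mock stats read: on a removed device it fails and clobbers the output. */
static int
mock_stats_get(const struct mock_subdev *sd, struct mock_stats *out)
{
	if (sd->removed) {
		memset(out, 0xff, sizeof(*out));
		return -EIO;
	}
	*out = sd->hw;
	return 0;
}

/* Add one sub-device's counters to the total, hiding a removal. */
int
aggregate_stats(struct mock_subdev *sd, struct mock_stats *total)
{
	struct mock_stats backup = sd->snapshot;
	int ret = mock_stats_get(sd, &sd->snapshot);

	if (ret != 0) {
		if (!sd->removed)
			return ret;        /* genuine failure: report it */
		sd->snapshot = backup; /* removal: reuse the last good values */
	}
	total->ipackets += sd->snapshot.ipackets;
	total->opackets += sd->snapshot.opackets;
	return 0;
}
```

The backup matters because a failed read may leave the snapshot partially written; without restoring it, the aggregated counters could jump after a removal.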

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v4 6/6] net/failsafe: fix removed device handling
  2018-01-10 12:31       ` [PATCH v4 6/6] net/failsafe: fix removed device handling Matan Azrad
  2018-01-10 12:43         ` Matan Azrad
@ 2018-01-10 13:47         ` Gaëtan Rivet
  1 sibling, 0 replies; 98+ messages in thread
From: Gaëtan Rivet @ 2018-01-10 13:47 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Thomas Monjalon, dev

Hi Matan,

I am a bit concerned with the forceful rte_errno reset within fs_err. I
think it would have been safer to only do this when rte_errno is
modified by the eth_dev ops (either in rte_flow or another library).

This means that the eth_dev API will be slightly different whether the
ethdev is a fail-safe or a bare device, which might throw off some
users. One could read the eth_dev source and not see any rte_errno
change, and wonder why it is modified on some specific ports.

But this is pretty minor, and if a problem arises it will be easy to
pinpoint and fix, so I won't bother you further on this patch.

Thanks for the good work.
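
One possible reading of the rte_errno concern above, sketched with mock names only (mock_rte_errno, mock_subdev, mock_op and call_op_masking_removal are hypothetical and stand in for rte_errno, struct sub_device and a sub-device control operation): remember the errno value before the call and, when a removal is masked, put that value back instead of forcing it to 0. This is a variation on the suggestion, not the code that was merged.

```c
#include <errno.h>

/* Hypothetical stand-in for DPDK's per-lcore rte_errno. */
static int mock_rte_errno;

struct mock_subdev {
	int remove; /* 1 once a removal has been noticed */
};

/* Mock control op: fails and sets the mock errno on a removed device. */
static int
mock_op(const struct mock_subdev *sd)
{
	if (sd->remove) {
		mock_rte_errno = EIO;
		return -EIO;
	}
	return 0;
}

/* Call the op and hide a removal without disturbing the caller's errno. */
int
call_op_masking_removal(const struct mock_subdev *sd)
{
	int saved = mock_rte_errno;
	int ret = mock_op(sd);

	if (ret != 0 && (sd->remove == 1 || ret == -EIO)) {
		mock_rte_errno = saved; /* undo only what the op changed */
		return 0;
	}
	return ret;
}
```

With this shape, a caller that never looks at the return value still sees the same errno it had before the call, whether the port is a fail-safe or a bare device.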

On Wed, Jan 10, 2018 at 12:31:05PM +0000, Matan Azrad wrote:
> There is a delay between the physical removal of a device and the
> moment sub-device PMDs receive an RMV interrupt. During this window,
> DPDK PMDs and applications are not yet aware of the removal and may
> call sub-device control operations, which should return an error.
> 
> In the previous code, this error was reported to the application,
> contrary to the fail-safe principle that the app should not be aware of
> device removal.
> 
> Add a removal check in each relevant control command error flow and
> prevent an error report to the application when the sub-device is removed.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
> ---
>  drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
>  drivers/net/failsafe/failsafe_ops.c     | 35 ++++++++++++++++++++++-----------
>  drivers/net/failsafe/failsafe_private.h | 11 +++++++++++
>  3 files changed, 46 insertions(+), 18 deletions(-)
> 

-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v4 6/6] net/failsafe: fix removed device handling
  2018-01-10 12:43         ` Matan Azrad
@ 2018-01-10 13:51           ` Gaëtan Rivet
  0 siblings, 0 replies; 98+ messages in thread
From: Gaëtan Rivet @ 2018-01-10 13:51 UTC (permalink / raw)
  To: Matan Azrad; +Cc: dev, Thomas Monjalon

Hi Matan,

On Wed, Jan 10, 2018 at 12:43:33PM +0000, Matan Azrad wrote:
> Hi Gaetan
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > Sent: Wednesday, January 10, 2018 2:31 PM
> > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>
> > Cc: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v4 6/6] net/failsafe: fix removed device handling
> > 
> > There is a delay between the physical removal of a device and the
> > moment sub-device PMDs receive an RMV interrupt. During this window,
> > DPDK PMDs and applications are not yet aware of the removal and may
> > call sub-device control operations, which should return an error.
> > 
> > In the previous code, this error was reported to the application,
> > contrary to the fail-safe principle that the app should not be aware of
> > device removal.
> > 
> > Add a removal check in each relevant control command error flow and
> > prevent an error report to the application when the sub-device is removed.
> > 
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
> >  drivers/net/failsafe/failsafe_ops.c     | 35 ++++++++++++++++++++++-----------
> >  drivers/net/failsafe/failsafe_private.h | 11 +++++++++++
> >  3 files changed, 46 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
> > index 153ceee..c072d1e 100644
> > --- a/drivers/net/failsafe/failsafe_flow.c
> > +++ b/drivers/net/failsafe/failsafe_flow.c
> > @@ -87,7 +87,7 @@
> >  		DEBUG("Calling rte_flow_validate on sub_device %d", i);
> >  		ret = rte_flow_validate(PORT_ID(sdev),
> >  				attr, patterns, actions, error);
> > -		if (ret) {
> > +		if ((ret = fs_err(sdev, ret))) {
> This assignment in the "if" statement causes a checkpatch error. I sent it as is because you asked for it this way.
> If you think I need to change it, I see 2 options:
> 
> 1.
> ret = fs_err(sdev, ret);
> if (ret) {...}
> 
> 2.
> if (fs_err(sdev, &ret)) {...}
> 
> What do you think?
> 

Yes, I forgot that checkpatch was like this.
Our mails crossed, but I acked this patch. I think this is acceptable at
the driver level, or should be at the discretion of the driver
maintainer.

So personally, I'd say leave it this way. If someone or something shouts
about this we will consider alternatives. Otherwise this is readable and
easily understandable as-is.
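
For reference, the two checkpatch-friendly alternatives quoted above could look roughly like this. The sketch is self-contained and uses mock names: mock_subdev, fs_err_val, fs_err_ptr, option1 and option2 are hypothetical, the pointer-taking signature is only an assumption about what option 2 would mean, and the rte_errno handling of the real fs_err is left out.

```c
#include <errno.h>

struct mock_subdev {
	int remove; /* 1 once a removal has been noticed */
};

/* Value form: returns 0 when the failure is explained by a removal. */
static int
fs_err_val(const struct mock_subdev *sd, int err)
{
	if (sd->remove == 1 || err == -EIO)
		return 0;
	return err;
}

/* Pointer form (assumed shape of option 2): adjusts *err in place. */
static int
fs_err_ptr(const struct mock_subdev *sd, int *err)
{
	*err = fs_err_val(sd, *err);
	return *err;
}

/* Option 1: assign first, then test. */
int
option1(const struct mock_subdev *sd, int op_ret)
{
	int ret = op_ret;

	ret = fs_err_val(sd, ret);
	if (ret)
		return ret;
	return 0;
}

/* Option 2: no assignment inside the condition, ret updated via pointer. */
int
option2(const struct mock_subdev *sd, int op_ret)
{
	int ret = op_ret;

	if (fs_err_ptr(sd, &ret))
		return ret;
	return 0;
}
```

Both forms behave the same; option 1 costs one extra line per call site, while option 2 hides the update of ret inside the helper.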

Regards,
-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack
  2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                         ` (5 preceding siblings ...)
  2018-01-10 12:31       ` [PATCH v4 6/6] net/failsafe: fix removed device handling Matan Azrad
@ 2018-01-17 20:19       ` Matan Azrad
  2018-01-17 20:19         ` [PATCH v5 1/6] ethdev: add devop to check removal status Matan Azrad
                           ` (6 more replies)
  6 siblings, 7 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-17 20:19 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a delay between the physical removal of a device and the moment sub-device PMDs receive an RMV interrupt.
During this window, DPDK PMDs and applications are not yet aware of the removal and may call sub-device control operations, which should return an error.

This series adds a new ethdev operation to check device removal, adds support for it in the mlx PMDs, adjusts the ethdev APIs to return -EIO in case of removal, and fixes the fail-safe bug of reporting removal errors to the application.

V2:
Remove ENODEV definition.
Remove checks from all mlx control commands.
Add new devop - "is_removed".
Implement it in mlx4 and mlx5.
Fix failsafe bug by the new devop.

V3:
Adjust ethdev APIs removal error report.
Change failsafe check to check eth_dev* return values.
Remove backporting of fail-safe patch.

V4:
Improve fail-safe internal API to adjust the actual error value as discussed.
Remove "Fixes" lines from fail-safe patch.
No changes in ethdev\mlx patches.

V5:
Rebase on top of master-net-mlx. 

Matan Azrad (6):
  ethdev: add devop to check removal status
  net/mlx4: support a device removal check operation
  net/mlx5: support a device removal check operation
  ethdev: adjust APIs removal error report
  ethdev: adjust flow APIs removal error report
  net/failsafe: fix removed device handling

 drivers/net/failsafe/failsafe_flow.c    |  18 ++-
 drivers/net/failsafe/failsafe_ops.c     |  35 +++--
 drivers/net/failsafe/failsafe_private.h |  11 ++
 drivers/net/mlx4/mlx4.c                 |   1 +
 drivers/net/mlx4/mlx4.h                 |   1 +
 drivers/net/mlx4/mlx4_ethdev.c          |  20 +++
 drivers/net/mlx5/mlx5.c                 |   2 +
 drivers/net/mlx5/mlx5.h                 |   1 +
 drivers/net/mlx5/mlx5_ethdev.c          |  20 +++
 lib/librte_ether/rte_ethdev.c           | 218 +++++++++++++++++++++-----------
 lib/librte_ether/rte_ethdev.h           |  68 +++++++++-
 lib/librte_ether/rte_ethdev_version.map |   7 +
 lib/librte_ether/rte_flow.c             |  34 ++++-
 lib/librte_ether/rte_flow.h             |   2 +
 14 files changed, 338 insertions(+), 100 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [PATCH v5 1/6] ethdev: add devop to check removal status
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
@ 2018-01-17 20:19         ` Matan Azrad
  2018-01-17 20:40           ` Ferruh Yigit
  2018-01-17 20:19         ` [PATCH v5 2/6] net/mlx4: support a device removal check operation Matan Azrad
                           ` (5 subsequent siblings)
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-17 20:19 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a delay between the physical removal of a device and the moment
PMDs receive an RMV interrupt. During this window, DPDK PMDs and
applications are not yet aware of the removal.

Currently, removal can be detected only by registering for the device RMV
event, and the notification arrives asynchronously. So, there is no way
to detect a device removal synchronously.
Applications and other DPDK entities may want to check for a device removal
synchronously and make an immediate decision accordingly.

Add a new dev op called is_removed to allow DPDK entities to check an
Ethernet device removal status immediately.
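
The intended behaviour of the synchronous check can be sketched as follows: ask the driver only while the port is not yet known to be removed, and latch the removed state once the driver reports it. The sketch uses mock types (mock_state, mock_dev, mock_probe and mock_is_removed are hypothetical names, not ethdev symbols).

```c
#include <stddef.h>

enum mock_state { MOCK_UNUSED = 0, MOCK_ATTACHED, MOCK_REMOVED };

struct mock_dev {
	enum mock_state state;
	int (*is_removed)(struct mock_dev *dev); /* driver callback, may be NULL */
	int probes;  /* how many times the driver was asked */
	int hw_gone; /* what the mock hardware would answer */
};

/* Mock driver callback: counts calls and reports the hardware state. */
int
mock_probe(struct mock_dev *dev)
{
	dev->probes++;
	return dev->hw_gone;
}

/* Returns 1 when the device is removed, otherwise 0. */
int
mock_is_removed(struct mock_dev *dev)
{
	int ret;

	if (dev->state == MOCK_UNUSED || dev->is_removed == NULL)
		return 0;
	if (dev->state == MOCK_REMOVED)
		return 1; /* already latched, no need to ask the driver */

	ret = dev->is_removed(dev);
	if (ret != 0)
		dev->state = MOCK_REMOVED;
	return ret;
}
```

Latching the state means later calls on a removed port stay cheap, which matters when the check runs on every control-path error.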

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.h           | 17 +++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  7 +++++++
 3 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index b349599..c93cec1 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -114,7 +114,8 @@ enum {
 rte_eth_find_next(uint16_t port_id)
 {
 	while (port_id < RTE_MAX_ETHPORTS &&
-	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
 		port_id++;
 
 	if (port_id >= RTE_MAX_ETHPORTS)
@@ -262,8 +263,7 @@ struct rte_eth_dev *
 rte_eth_dev_is_valid_port(uint16_t port_id)
 {
 	if (port_id >= RTE_MAX_ETHPORTS ||
-	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
-	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
+	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
 		return 0;
 	else
 		return 1;
@@ -1094,6 +1094,28 @@ struct rte_eth_dev *
 }
 
 int
+rte_eth_dev_is_removed(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
+
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;
+
+	ret = dev->dev_ops->is_removed(dev);
+	if (ret != 0)
+		dev->state = RTE_ETH_DEV_REMOVED;
+
+	return ret;
+}
+
+int
 rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		       uint16_t nb_rx_desc, unsigned int socket_id,
 		       const struct rte_eth_rxconf *rx_conf,
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index f0eeefe..da0c5cf 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1169,6 +1169,9 @@ struct rte_eth_dcb_info {
 typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
 /** <@internal Function used to reset a configured Ethernet device. */
 
+typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
+/**< @internal Function used to detect an Ethernet device removal. */
+
 typedef void (*eth_promiscuous_enable_t)(struct rte_eth_dev *dev);
 /**< @internal Function used to enable the RX promiscuous mode of an Ethernet device. */
 
@@ -1498,6 +1501,8 @@ struct eth_dev_ops {
 	eth_dev_close_t            dev_close;     /**< Close device. */
 	eth_dev_reset_t		   dev_reset;	  /**< Reset device. */
 	eth_link_update_t          link_update;   /**< Get device link state. */
+	eth_is_removed_t           is_removed;
+	/**< Check if the device was physically removed. */
 
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON. */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF. */
@@ -1684,6 +1689,7 @@ enum rte_eth_dev_state {
 	RTE_ETH_DEV_UNUSED = 0,
 	RTE_ETH_DEV_ATTACHED,
 	RTE_ETH_DEV_DEFERRED,
+	RTE_ETH_DEV_REMOVED,
 };
 
 /**
@@ -1970,6 +1976,17 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
 void _rte_eth_dev_reset(struct rte_eth_dev *dev);
 
 /**
+ * Check if an Ethernet device was physically removed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - 1 when the Ethernet device is removed, otherwise 0.
+ */
+int
+rte_eth_dev_is_removed(uint16_t port_id);
+
+/**
  * Allocate and set up a receive queue for an Ethernet device.
  *
  * The function allocates a contiguous block of memory for *nb_rx_desc*
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..78547ff 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -198,6 +198,13 @@ DPDK_17.11 {
 
 } DPDK_17.08;
 
+DPDK_18.02 {
+	global:
+
+	rte_eth_dev_is_removed;
+
+} DPDK_17.11;
+
 EXPERIMENTAL {
 	global:
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v5 2/6] net/mlx4: support a device removal check operation
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2018-01-17 20:19         ` [PATCH v5 1/6] ethdev: add devop to check removal status Matan Azrad
@ 2018-01-17 20:19         ` Matan Azrad
  2018-01-17 20:19         ` [PATCH v5 3/6] net/mlx5: " Matan Azrad
                           ` (4 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-17 20:19 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

Add support to get removal status of mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx4/mlx4.c        |  1 +
 drivers/net/mlx4/mlx4.h        |  1 +
 drivers/net/mlx4/mlx4_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 61c5bf4..703513e 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -256,6 +256,7 @@ struct mlx4_conf {
 	.filter_ctrl = mlx4_filter_ctrl,
 	.rx_queue_intr_enable = mlx4_rx_intr_enable,
 	.rx_queue_intr_disable = mlx4_rx_intr_disable,
+	.is_removed = mlx4_is_removed,
 };
 
 /**
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index 99dc335..2ab2988 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -171,6 +171,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
 int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
 		       struct rte_eth_fc_conf *fc_conf);
 const uint32_t *mlx4_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+int mlx4_is_removed(struct rte_eth_dev *dev);
 
 /* mlx4_intr.c */
 
diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
index c80eab5..5318b56 100644
--- a/drivers/net/mlx4/mlx4_ethdev.c
+++ b/drivers/net/mlx4/mlx4_ethdev.c
@@ -1052,3 +1052,23 @@ enum rxmode_toggle {
 	}
 	return NULL;
 }
+
+/**
+ * Check if mlx4 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx4_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1
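
A note on the comparison in mlx4_is_removed above: ibv_query_device() reports failure by returning a positive errno value, so the check is against EIO rather than -EIO, and any other failure is not treated as a removal. A self-contained sketch of that decision with a mock in place of libibverbs (mock_ctx, mock_query_device and mock_drv_is_removed are hypothetical names):

```c
#include <errno.h>

/* Hypothetical stand-in for a verbs context; holds the canned query result. */
struct mock_ctx {
	int query_result; /* 0 on success, positive errno value on failure */
};

/* Mock of a device query that follows the "positive errno" convention. */
static int
mock_query_device(const struct mock_ctx *ctx)
{
	return ctx->query_result;
}

/* Returns 1 only when the query fails with EIO, otherwise 0. */
int
mock_drv_is_removed(const struct mock_ctx *ctx)
{
	return mock_query_device(ctx) == EIO ? 1 : 0;
}
```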

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v5 3/6] net/mlx5: support a device removal check operation
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2018-01-17 20:19         ` [PATCH v5 1/6] ethdev: add devop to check removal status Matan Azrad
  2018-01-17 20:19         ` [PATCH v5 2/6] net/mlx4: support a device removal check operation Matan Azrad
@ 2018-01-17 20:19         ` Matan Azrad
  2018-01-17 20:19         ` [PATCH v5 4/6] ethdev: adjust APIs removal error report Matan Azrad
                           ` (3 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-17 20:19 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

Add support to get removal status of mlx5 device.
It is not supported in secondary process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        |  2 ++
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1c95f35..c13a2d3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -284,6 +284,7 @@
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static const struct eth_dev_ops mlx5_dev_sec_ops = {
@@ -331,6 +332,7 @@
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e740a4e..aaff180 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -237,6 +237,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
+int mlx5_is_removed(struct rte_eth_dev *dev);
 eth_tx_burst_t priv_select_tx_function(struct priv *, struct rte_eth_dev *);
 eth_rx_burst_t priv_select_rx_function(struct priv *, struct rte_eth_dev *);
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6f78adc..1c067ca 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1453,3 +1453,23 @@ struct ethtool_link_settings {
 	}
 	return rx_pkt_burst;
 }
+
+/**
+ * Check if mlx5 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx5_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v5 4/6] ethdev: adjust APIs removal error report
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                           ` (2 preceding siblings ...)
  2018-01-17 20:19         ` [PATCH v5 3/6] net/mlx5: " Matan Azrad
@ 2018-01-17 20:19         ` Matan Azrad
  2018-01-17 20:19         ` [PATCH v5 5/6] ethdev: adjust flow " Matan Azrad
                           ` (2 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-17 20:19 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during control command execution, many
different errors can be reported to the user.

Adjust all ethdev APIs error reports to return -EIO in case of device
removal using rte_eth_dev_is_removed API.
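
The error adjustment described above can be sketched on its own, with a mock port in place of the ethdev layer (mock_port, mock_port_is_removed and mock_eth_err are hypothetical names): a successful return passes through untouched, and only a failure triggers the removal check.

```c
#include <errno.h>

/* Hypothetical stand-in for an ethdev port. */
struct mock_port {
	int removed; /* what a synchronous removal check would report */
};

static int
mock_port_is_removed(const struct mock_port *port)
{
	return port->removed;
}

/* Map any failure on a removed port to -EIO; keep other results as they are. */
int
mock_eth_err(const struct mock_port *port, int ret)
{
	if (ret == 0)
		return 0;
	if (mock_port_is_removed(port))
		return -EIO;
	return ret;
}
```

Because the removal check runs only on the failure path, successful control calls pay no extra cost, and callers get a single error value to test for removal.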

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_ethdev.c | 192 +++++++++++++++++++++++++++---------------
 lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
 2 files changed, 170 insertions(+), 73 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index c93cec1..7044159 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -338,6 +338,16 @@ struct rte_eth_dev *
 	return -ENODEV;
 }
 
+static int
+eth_err(uint16_t port_id, int ret)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return -EIO;
+	return ret;
+}
+
 /* attach the new device, then store port_id of the device */
 int
 rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
@@ -492,7 +502,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
+							     rx_queue_id));
 
 }
 
@@ -518,7 +529,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_stop(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_stop(dev, rx_queue_id));
 
 }
 
@@ -544,7 +555,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_start(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_start(dev,
+							     tx_queue_id));
 
 }
 
@@ -570,7 +582,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_stop(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_stop(dev, tx_queue_id));
 
 }
 
@@ -888,7 +900,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	/* Initialize Rx profiling if enabled at compilation time. */
@@ -898,7 +910,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	return 0;
@@ -998,7 +1010,7 @@ struct rte_eth_dev *
 	if (diag == 0)
 		dev->data->dev_started = 1;
 	else
-		return diag;
+		return eth_err(port_id, diag);
 
 	rte_eth_dev_config_restore(port_id);
 
@@ -1040,7 +1052,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_up, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_up)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_up)(dev));
 }
 
 int
@@ -1053,7 +1065,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_down, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_down)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_down)(dev));
 }
 
 void
@@ -1090,7 +1102,7 @@ struct rte_eth_dev *
 	rte_eth_dev_stop(port_id);
 	ret = dev->dev_ops->dev_reset(dev);
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -1214,7 +1226,7 @@ struct rte_eth_dev *
 			dev->data->min_rx_buf_size = mbp_buf_size;
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 /**
@@ -1333,8 +1345,8 @@ struct rte_eth_dev *
 					  &local_conf.offloads);
 	}
 
-	return (*dev->dev_ops->tx_queue_setup)(dev, tx_queue_id, nb_tx_desc,
-					       socket_id, &local_conf);
+	return eth_err(port_id, (*dev->dev_ops->tx_queue_setup)(dev,
+		       tx_queue_id, nb_tx_desc, socket_id, &local_conf));
 }
 
 void
@@ -1390,14 +1402,16 @@ struct rte_eth_dev *
 rte_eth_tx_done_cleanup(uint16_t port_id, uint16_t queue_id, uint32_t free_cnt)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	int ret;
 
 	/* Validate Input Data. Bail if not valid or not supported. */
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
 
 	/* Call driver to free pending mbufs. */
-	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
-			free_cnt);
+	ret = (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
+					       free_cnt);
+	return eth_err(port_id, ret);
 }
 
 void
@@ -1534,7 +1548,7 @@ struct rte_eth_dev *
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
 	stats->rx_nombuf = dev->data->rx_mbuf_alloc_failed;
-	return (*dev->dev_ops->stats_get)(dev, stats);
+	return eth_err(port_id, (*dev->dev_ops->stats_get)(dev, stats));
 }
 
 int
@@ -1580,12 +1594,12 @@ struct rte_eth_dev *
 		count = (*dev->dev_ops->xstats_get_names_by_id)(dev, NULL,
 				NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	}
 	if (dev->dev_ops->xstats_get_names != NULL) {
 		count = (*dev->dev_ops->xstats_get_names)(dev, NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	} else
 		count = 0;
 
@@ -1765,8 +1779,12 @@ struct rte_eth_dev *
 	if (ids && no_ext_stat_requested) {
 		rte_eth_basic_stats_get_names(dev, xstats_names_copy);
 	} else {
-		rte_eth_xstats_get_names(port_id, xstats_names_copy,
+		ret = rte_eth_xstats_get_names(port_id, xstats_names_copy,
 			expected_entries);
+		if (ret < 0) {
+			free(xstats_names_copy);
+			return ret;
+		}
 	}
 
 	/* Filter stats */
@@ -1813,7 +1831,7 @@ struct rte_eth_dev *
 			xstats_names + cnt_used_entries,
 			size - cnt_used_entries);
 		if (cnt_driver_entries < 0)
-			return cnt_driver_entries;
+			return eth_err(port_id, cnt_driver_entries);
 		cnt_used_entries += cnt_driver_entries;
 	}
 
@@ -1829,8 +1847,12 @@ struct rte_eth_dev *
 	unsigned int count = 0, i, q;
 	uint64_t val, *stats_ptr;
 	uint16_t nb_rxqs, nb_txqs;
+	int ret;
+
+	ret = rte_eth_stats_get(port_id, &eth_stats);
+	if (ret < 0)
+		return ret;
 
-	rte_eth_stats_get(port_id, &eth_stats);
 	dev = &rte_eth_devices[port_id];
 
 	nb_rxqs = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
@@ -1883,7 +1905,10 @@ struct rte_eth_dev *
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
-	expected_entries = get_xstats_count(port_id);
+	ret = get_xstats_count(port_id);
+	if (ret < 0)
+		return ret;
+	expected_entries = (uint16_t)ret;
 	struct rte_eth_xstat xstats[expected_entries];
 	dev = &rte_eth_devices[port_id];
 	basic_count = get_xstats_basic_count(dev);
@@ -1966,6 +1991,7 @@ struct rte_eth_dev *
 	unsigned int count = 0, i;
 	signed int xcount = 0;
 	uint16_t nb_rxqs, nb_txqs;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -1988,14 +2014,17 @@ struct rte_eth_dev *
 				     (n > count) ? n - count : 0);
 
 		if (xcount < 0)
-			return xcount;
+			return eth_err(port_id, xcount);
 	}
 
 	if (n < count + xcount || xstats == NULL)
 		return count + xcount;
 
 	/* now fill the xstats structure */
-	count = rte_eth_basic_stats_get(port_id, xstats);
+	ret = rte_eth_basic_stats_get(port_id, xstats);
+	if (ret < 0)
+		return ret;
+	count = ret;
 
 	for (i = 0; i < count; i++)
 		xstats[i].id = i;
@@ -2045,8 +2074,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_tx_queue_stats_mapping(uint16_t port_id, uint16_t tx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, tx_queue_id, stat_idx,
-			STAT_QMAP_TX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, tx_queue_id,
+						stat_idx, STAT_QMAP_TX));
 }
 
 
@@ -2054,8 +2083,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id, uint16_t rx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, rx_queue_id, stat_idx,
-			STAT_QMAP_RX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, rx_queue_id,
+						stat_idx, STAT_QMAP_RX));
 }
 
 int
@@ -2067,7 +2096,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->fw_version_get, -ENOTSUP);
-	return (*dev->dev_ops->fw_version_get)(dev, fw_version, fw_size);
+	return eth_err(port_id, (*dev->dev_ops->fw_version_get)(dev,
+							fw_version, fw_size));
 }
 
 void
@@ -2157,7 +2187,7 @@ struct rte_eth_dev *
 	if (!ret)
 		dev->data->mtu = mtu;
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2197,7 +2227,7 @@ struct rte_eth_dev *
 			vfc->ids[vidx] &= ~(UINT64_C(1) << vbit);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2230,7 +2260,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_tpid_set, -ENOTSUP);
 
-	return (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type, tpid);
+	return eth_err(port_id, (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type,
+							       tpid));
 }
 
 int
@@ -2308,7 +2339,7 @@ struct rte_eth_dev *
 					    &dev->data->dev_conf.rxmode);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2343,9 +2374,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_pvid_set, -ENOTSUP);
-	(*dev->dev_ops->vlan_pvid_set)(dev, pvid, on);
 
-	return 0;
+	return eth_err(port_id, (*dev->dev_ops->vlan_pvid_set)(dev, pvid, on));
 }
 
 int
@@ -2357,7 +2387,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_get, -ENOTSUP);
 	memset(fc_conf, 0, sizeof(*fc_conf));
-	return (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf));
 }
 
 int
@@ -2373,7 +2403,7 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_set, -ENOTSUP);
-	return (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf));
 }
 
 int
@@ -2391,7 +2421,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	/* High water, low water validation are device specific */
 	if  (*dev->dev_ops->priority_flow_ctrl_set)
-		return (*dev->dev_ops->priority_flow_ctrl_set)(dev, pfc_conf);
+		return eth_err(port_id, (*dev->dev_ops->priority_flow_ctrl_set)
+					(dev, pfc_conf));
 	return -ENOTSUP;
 }
 
@@ -2466,7 +2497,8 @@ struct rte_eth_dev *
 		return ret;
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_update, -ENOTSUP);
-	return (*dev->dev_ops->reta_update)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_update)(dev, reta_conf,
+							     reta_size));
 }
 
 int
@@ -2486,7 +2518,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_query, -ENOTSUP);
-	return (*dev->dev_ops->reta_query)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_query)(dev, reta_conf,
+							    reta_size));
 }
 
 int
@@ -2498,7 +2531,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_update, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_update)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_update)(dev,
+								 rss_conf));
 }
 
 int
@@ -2510,7 +2544,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_conf_get, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_conf_get)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_conf_get)(dev,
+								   rss_conf));
 }
 
 int
@@ -2532,7 +2567,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_add, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_add)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_add)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2555,7 +2591,8 @@ struct rte_eth_dev *
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_del, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_del)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_del)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2566,7 +2603,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_on, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_on)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_on)(dev));
 }
 
 int
@@ -2577,7 +2614,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_off, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_off)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_off)(dev));
 }
 
 /*
@@ -2653,7 +2690,7 @@ struct rte_eth_dev *
 		dev->data->mac_pool_sel[index] |= (1ULL << pool);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2779,7 +2816,7 @@ struct rte_eth_dev *
 					&dev->data->hash_mac_addrs[index]);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2792,7 +2829,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->uc_all_hash_table_set, -ENOTSUP);
-	return (*dev->dev_ops->uc_all_hash_table_set)(dev, on);
+	return eth_err(port_id, (*dev->dev_ops->uc_all_hash_table_set)(dev,
+								       on));
 }
 
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -2822,7 +2860,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_queue_rate_limit, -ENOTSUP);
-	return (*dev->dev_ops->set_queue_rate_limit)(dev, queue_idx, tx_rate);
+	return eth_err(port_id, (*dev->dev_ops->set_queue_rate_limit)(dev,
+							queue_idx, tx_rate));
 }
 
 int
@@ -2860,7 +2899,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_set, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_set)(dev, mirror_conf, rule_id, on);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_set)(dev,
+						mirror_conf, rule_id, on));
 }
 
 int
@@ -2873,7 +2913,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_reset, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_reset)(dev, rule_id);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_reset)(dev,
+								   rule_id));
 }
 
 RTE_INIT(eth_dev_init_cb_lists)
@@ -3137,7 +3178,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_enable)(dev,
+								queue_id));
 }
 
 int
@@ -3151,7 +3193,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_disable)(dev,
+								queue_id));
 }
 
 
@@ -3179,7 +3222,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);
-	return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op, arg);
+	return eth_err(port_id, (*dev->dev_ops->filter_ctrl)(dev, filter_type,
+							     filter_op, arg));
 }
 
 void *
@@ -3429,7 +3473,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_mc_addr_list, -ENOTSUP);
-	return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+	return eth_err(port_id, dev->dev_ops->set_mc_addr_list(dev,
+						mc_addr_set, nb_mc_addr));
 }
 
 int
@@ -3441,7 +3486,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_enable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_enable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_enable)(dev));
 }
 
 int
@@ -3453,7 +3498,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_disable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_disable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_disable)(dev));
 }
 
 int
@@ -3466,7 +3511,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_rx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_rx_timestamp)(dev, timestamp, flags);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_rx_timestamp)
+				(dev, timestamp, flags));
 }
 
 int
@@ -3479,7 +3525,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_tx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_tx_timestamp)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_tx_timestamp)
+				(dev, timestamp));
 }
 
 int
@@ -3491,7 +3538,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_adjust_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_adjust_time)(dev, delta);
+	return eth_err(port_id, (*dev->dev_ops->timesync_adjust_time)(dev,
+								      delta));
 }
 
 int
@@ -3503,7 +3551,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_time)(dev,
+								timestamp));
 }
 
 int
@@ -3515,7 +3564,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_write_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_write_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_write_time)(dev,
+								timestamp));
 }
 
 int
@@ -3527,7 +3577,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_reg, -ENOTSUP);
-	return (*dev->dev_ops->get_reg)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_reg)(dev, info));
 }
 
 int
@@ -3539,7 +3589,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom_length, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom_length)(dev);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom_length)(dev));
 }
 
 int
@@ -3551,7 +3601,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom)(dev, info));
 }
 
 int
@@ -3563,7 +3613,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->set_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->set_eeprom)(dev, info));
 }
 
 int
@@ -3578,7 +3628,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	memset(dcb_info, 0, sizeof(struct rte_eth_dcb_info));
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_dcb_info, -ENOTSUP);
-	return (*dev->dev_ops->get_dcb_info)(dev, dcb_info);
+	return eth_err(port_id, (*dev->dev_ops->get_dcb_info)(dev, dcb_info));
 }
 
 int
@@ -3601,7 +3651,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_eth_type_conf,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev, l2_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev,
+								l2_tunnel));
 }
 
 int
@@ -3632,7 +3683,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_offload_set,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_offload_set)(dev,
+							l2_tunnel, mask, en));
 }
 
 static void
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index da0c5cf..9a2e1f6 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2020,6 +2020,7 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
  *   memory buffers to populate each descriptor of the receive ring.
  * @return
  *   - 0: Success, receive queue correctly set up.
+ *   - -EIO: if device is removed.
  *   - -EINVAL: The size of network buffers which can be allocated from the
  *      memory pool does not fit the various buffer sizes allowed by the
  *      device controller.
@@ -2120,6 +2121,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_start(uint16_t port_id, uint16_t rx_queue_id);
@@ -2136,6 +2138,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_stop(uint16_t port_id, uint16_t rx_queue_id);
@@ -2153,6 +2156,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_start(uint16_t port_id, uint16_t tx_queue_id);
@@ -2169,6 +2173,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_stop(uint16_t port_id, uint16_t tx_queue_id);
@@ -2270,7 +2275,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  *   - (-EINVAL) if port identifier is invalid.
  *   - (-ENOTSUP) if hardware doesn't support this function.
  *   - (-EPERM) if not ran from the primary process.
- *   - (-EIO) if re-initialisation failed.
+ *   - (-EIO) if re-initialisation failed or device is removed.
  *   - (-ENOMEM) if the reset failed due to OOM.
  *   - (-EAGAIN) if the reset temporarily failed and should be retried later.
  */
@@ -2506,6 +2511,7 @@ int rte_eth_xstats_get_by_id(uint16_t port_id, const uint64_t *ids,
  * @return
  *    0 on success
  *    -ENODEV for invalid port_id,
+ *    -EIO if device is removed,
  *    -EINVAL if the xstat_name doesn't exist in port_id
  */
 int rte_eth_xstats_get_id_by_name(uint16_t port_id, const char *xstat_name,
@@ -2597,6 +2603,7 @@ int rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (>0) if *fw_size* is not enough to store firmware version, return
  *          the size of the non truncated string.
  */
@@ -2668,6 +2675,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if *mtu* invalid.
  *   - (-EBUSY) if operation is not allowed when the port is running
  */
@@ -2688,6 +2696,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSYS) if VLAN filtering on *port_id* disabled.
  *   - (-EINVAL) if *vlan_id* > 4095.
  */
@@ -2730,6 +2739,7 @@ int rte_eth_dev_set_vlan_strip_on_queue(uint16_t port_id, uint16_t rx_queue_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN TPID setup is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
 				    enum rte_vlan_type vlan_type,
@@ -2754,6 +2764,7 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_offload(uint16_t port_id, int offload_mask);
 
@@ -3495,6 +3506,7 @@ struct rte_eth_dev_tx_buffer {
  * @return
  *   Failure: < 0
  *     -ENODEV: Invalid interface
+ *     -EIO: device is removed
  *     -ENOTSUP: Driver does not support function
  *   Success: >= 0
  *     0-n: Number of packets freed. More packets may still remain in ring that
@@ -3609,6 +3621,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_enable(uint16_t port_id, uint16_t queue_id);
 
@@ -3630,6 +3643,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_disable(uint16_t port_id, uint16_t queue_id);
 
@@ -3687,6 +3701,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_on(uint16_t port_id);
 
@@ -3701,6 +3716,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_off(uint16_t port_id);
 
@@ -3715,6 +3731,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support flow control.
  *   - (-ENODEV)  if *port_id* invalid.
+ *   - (-EIO)  if device is removed.
  */
 int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3731,7 +3748,7 @@ int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3749,7 +3766,7 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support priority flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
 				struct rte_eth_pfc_conf *pfc_conf);
@@ -3769,6 +3786,7 @@ int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
  *   - (0) if successfully added or *mac_addr" was already added.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port* is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSPC) if no more MAC addresses can be added.
  *   - (-EINVAL) if MAC address is invalid.
  */
@@ -3820,6 +3838,7 @@ int rte_eth_dev_default_mac_addr_set(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_update(uint16_t port,
 				struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3839,6 +3858,7 @@ int rte_eth_dev_rss_reta_update(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_query(uint16_t port,
 			       struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3860,6 +3880,7 @@ int rte_eth_dev_rss_reta_query(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
@@ -3880,6 +3901,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_all_hash_table_set(uint16_t port, uint8_t on);
@@ -3903,6 +3925,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if the mr_conf information is not correct.
  */
 int rte_eth_mirror_rule_set(uint16_t port_id,
@@ -3921,6 +3944,7 @@ int rte_eth_mirror_rule_set(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_mirror_rule_reset(uint16_t port_id,
@@ -3939,6 +3963,7 @@ int rte_eth_mirror_rule_reset(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -3954,6 +3979,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
  */
@@ -3971,6 +3997,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support RSS.
  */
 int
@@ -3992,6 +4019,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4014,6 +4042,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4032,6 +4061,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this filter type.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_filter_supported(uint16_t port_id,
 		enum rte_filter_type filter_type);
@@ -4052,6 +4082,7 @@ int rte_eth_dev_filter_supported(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
@@ -4067,6 +4098,7 @@ int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  */
 int rte_eth_dev_get_dcb_info(uint16_t port_id,
@@ -4274,6 +4306,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info);
@@ -4287,6 +4320,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (>=0) EEPROM size if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom_length(uint16_t port_id);
@@ -4303,6 +4337,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4319,6 +4354,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_set_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4337,6 +4373,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if PMD of *port_id* doesn't support multicast filtering.
  *   - (-ENOSPC) if *port_id* has not enough multicast filtering resources.
  */
@@ -4353,6 +4390,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_enable(uint16_t port_id);
@@ -4366,6 +4404,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_disable(uint16_t port_id);
@@ -4385,6 +4424,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
@@ -4402,6 +4442,7 @@ int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
@@ -4421,6 +4462,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_adjust_time(uint16_t port_id, int64_t delta);
@@ -4456,6 +4498,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
@@ -4496,6 +4539,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4523,6 +4567,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
-- 
1.8.3.1


* [PATCH v5 5/6] ethdev: adjust flow APIs removal error report
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                           ` (3 preceding siblings ...)
  2018-01-17 20:19         ` [PATCH v5 4/6] ethdev: adjust APIs removal error report Matan Azrad
@ 2018-01-17 20:19         ` Matan Azrad
  2018-01-17 20:19         ` [PATCH v5 6/6] net/failsafe: fix removed device handling Matan Azrad
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-17 20:19 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

The rte_eth_dev_is_removed() API was added to detect a device removal
synchronously.

When a device is removed while a flow command is executing, the command
may fail with any of several unrelated error codes.

Adjust the error reports of all flow APIs to return -EIO when
rte_eth_dev_is_removed() detects that the device has been removed.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_flow.c | 34 +++++++++++++++++++++++++++-------
 lib/librte_ether/rte_flow.h |  2 ++
 2 files changed, 29 insertions(+), 7 deletions(-)
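
For illustration only, the error-translation pattern described in the commit
message can be sketched outside of DPDK as below. This is a minimal sketch
modelled on the flow_err() helper in this patch, not the patch code itself:
sketch_removed[], sketch_dev_is_removed() and sketch_err() are hypothetical
names, and the per-port flag array stands in for whatever
rte_eth_dev_is_removed() actually queries.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port removal flags (assumption, not DPDK state). */
#define SKETCH_MAX_PORTS 4
static bool sketch_removed[SKETCH_MAX_PORTS];

/* Hypothetical stand-in for rte_eth_dev_is_removed(). */
static int
sketch_dev_is_removed(uint16_t port_id)
{
	return port_id < SKETCH_MAX_PORTS && sketch_removed[port_id];
}

/*
 * Same shape as flow_err(): success passes through untouched, any
 * failure on a removed port is reported as -EIO, and every other
 * failure keeps the driver's original error code.
 */
static int
sketch_err(uint16_t port_id, int ret)
{
	if (ret == 0)
		return 0;
	if (sketch_dev_is_removed(port_id))
		return -EIO;
	return ret;
}
```

Checking for removal only on the failure path keeps the successful path free
of the extra query, which appears to be the intent of the early return on
ret == 0 in flow_err().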

diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
index 913d1a5..a86bfbd 100644
--- a/lib/librte_ether/rte_flow.c
+++ b/lib/librte_ether/rte_flow.c
@@ -107,6 +107,18 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
 };
 
+static int
+flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return rte_flow_error_set(error, EIO,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(EIO));
+	return ret;
+}
+
 /* Get generic flow operations structure from a port. */
 const struct rte_flow_ops *
 rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
@@ -145,7 +157,8 @@ struct rte_flow_desc_data {
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->validate))
-		return ops->validate(dev, attr, pattern, actions, error);
+		return flow_err(port_id, ops->validate(dev, attr, pattern,
+						       actions, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -160,12 +173,17 @@ struct rte_flow *
 		struct rte_flow_error *error)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	struct rte_flow *flow;
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
 
 	if (unlikely(!ops))
 		return NULL;
-	if (likely(!!ops->create))
-		return ops->create(dev, attr, pattern, actions, error);
+	if (likely(!!ops->create)) {
+		flow = ops->create(dev, attr, pattern, actions, error);
+		if (flow == NULL)
+			flow_err(port_id, -rte_errno, error);
+		return flow;
+	}
 	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 			   NULL, rte_strerror(ENOSYS));
 	return NULL;
@@ -183,7 +201,8 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->destroy))
-		return ops->destroy(dev, flow, error);
+		return flow_err(port_id, ops->destroy(dev, flow, error),
+				error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -200,7 +219,7 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->flush))
-		return ops->flush(dev, error);
+		return flow_err(port_id, ops->flush(dev, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -220,7 +239,8 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->query))
-		return ops->query(dev, flow, action, data, error);
+		return flow_err(port_id, ops->query(dev, flow, action, data,
+						    error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -238,7 +258,7 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->isolate))
-		return ops->isolate(dev, set, error);
+		return flow_err(port_id, ops->isolate(dev, set, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index e0402cf..07ec217 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -1267,6 +1267,8 @@ struct rte_flow_error {
  *
  *   -ENOSYS: underlying device does not support this functionality.
  *
+ *   -EIO: underlying device is removed.
+ *
  *   -EINVAL: unknown or invalid rule specification.
  *
  *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
-- 
1.8.3.1


* [PATCH v5 6/6] net/failsafe: fix removed device handling
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                           ` (4 preceding siblings ...)
  2018-01-17 20:19         ` [PATCH v5 5/6] ethdev: adjust flow " Matan Azrad
@ 2018-01-17 20:19         ` Matan Azrad
  2018-01-18  8:44           ` Gaëtan Rivet
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-17 20:19 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a window between the physical removal of a device and the
moment the sub-device PMDs get an RMV interrupt. During this window,
DPDK PMDs and applications are not yet aware of the removal and may
call sub-device control operations, which should return an error.

In the previous code, this error was reported to the application,
contrary to the fail-safe principle that the application should not be
aware of a device removal.

Add a removal check in each relevant control command error flow and
prevent an error report to the application when the sub-device is
removed.
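
The error filtering described above can be sketched in plain C without
DPDK headers. This is a minimal sketch under stated assumptions:
mock_sub_device, mock_errno, mock_fs_err and mock_apply_all are
hypothetical names standing in for struct sub_device, rte_errno, the
fs_err() helper and a fail-safe loop over sub-devices.

```c
#include <errno.h>

/* Hypothetical stand-in for the fail-safe struct sub_device. */
struct mock_sub_device {
	int remove; /* set to 1 once the removal has been flagged */
};

/* Hypothetical stand-in for the per-thread rte_errno. */
static int mock_errno;

/*
 * A failure caused by a removal is turned into success so that the
 * application never sees it; any other error is returned unchanged.
 */
static int
mock_fs_err(const struct mock_sub_device *sdev, int err)
{
	if (sdev->remove == 1 || err == -EIO) {
		mock_errno = 0;
		return 0;
	}
	return err;
}

/*
 * Apply one control operation result per sub-device and stop at the
 * first error that is not explained by a removal.
 */
static int
mock_apply_all(const struct mock_sub_device *sdevs, const int *results,
	       int count)
{
	int i;
	int ret;

	for (i = 0; i < count; i++) {
		ret = mock_fs_err(&sdevs[i], results[i]);
		if (ret != 0)
			return ret;
	}
	return 0;
}
```

One consequence of this design is that a -EIO returned for a reason other
than removal is swallowed as well, since the sketch cannot tell the two
cases apart.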

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
 drivers/net/failsafe/failsafe_ops.c     | 35 ++++++++++++++++++++++-----------
 drivers/net/failsafe/failsafe_private.h | 11 +++++++++++
 3 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
index 153ceee..c072d1e 100644
--- a/drivers/net/failsafe/failsafe_flow.c
+++ b/drivers/net/failsafe/failsafe_flow.c
@@ -87,7 +87,7 @@
 		DEBUG("Calling rte_flow_validate on sub_device %d", i);
 		ret = rte_flow_validate(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_validate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -111,7 +111,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (flow->flows[i] == NULL) {
+		if (flow->flows[i] == NULL && fs_err(sdev, -rte_errno)) {
 			ERROR("Failed to create flow on sub_device %d",
 				i);
 			goto err;
@@ -150,7 +150,7 @@
 			continue;
 		local_ret = rte_flow_destroy(PORT_ID(sdev),
 				flow->flows[i], error);
-		if (local_ret) {
+		if ((local_ret = fs_err(sdev, local_ret))) {
 			ERROR("Failed to destroy flow on sub_device %d: %d",
 					i, local_ret);
 			if (ret == 0)
@@ -175,7 +175,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_flow_flush on sub_device %d", i);
 		ret = rte_flow_flush(PORT_ID(sdev), error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_flush failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -199,8 +199,12 @@
 
 	sdev = TX_SUBDEV(dev);
 	if (sdev != NULL) {
-		return rte_flow_query(PORT_ID(sdev),
-				flow->flows[SUB_ID(sdev)], type, arg, error);
+		int ret = rte_flow_query(PORT_ID(sdev),
+					 flow->flows[SUB_ID(sdev)],
+					 type, arg, error);
+
+		if ((ret = fs_err(sdev, ret)))
+			return ret;
 	}
 	WARN("No active sub_device to query about its flow");
 	return -1;
@@ -223,7 +227,7 @@
 			WARN("flow isolation mode of sub_device %d in incoherent state.",
 				i);
 		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_isolate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index fe957ad..0976745 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -121,6 +121,8 @@
 					dev->data->nb_tx_queues,
 					&dev->data->dev_conf);
 		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
 			ERROR("Could not configure sub_device %d", i);
 			return ret;
 		}
@@ -163,8 +165,11 @@
 			continue;
 		DEBUG("Starting sub_device %d", i);
 		ret = rte_eth_dev_start(PORT_ID(sdev));
-		if (ret)
+		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
 			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -196,7 +201,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
 		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -215,7 +220,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
 		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -301,7 +306,7 @@
 				rx_queue_id,
 				nb_rx_desc, socket_id,
 				rx_conf, mb_pool);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("RX queue setup failed for sub_device %d", i);
 			goto free_rxq;
 		}
@@ -367,7 +372,7 @@
 				tx_queue_id,
 				nb_tx_desc, socket_id,
 				tx_conf);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("TX queue setup failed for sub_device %d", i);
 			goto free_txq;
 		}
@@ -446,7 +451,8 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling link_update on sub_device %d", i);
 		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1) {
+		if (ret && ret != -1 && sdev->remove == 0 &&
+		    rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
 			ERROR("Link update failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -470,6 +476,7 @@
 fs_stats_get(struct rte_eth_dev *dev,
 	     struct rte_eth_stats *stats)
 {
+	struct rte_eth_stats backup;
 	struct sub_device *sdev;
 	uint8_t i;
 	int ret;
@@ -479,14 +486,20 @@
 		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
 		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
 
+		rte_memcpy(&backup, snapshot, sizeof(backup));
 		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
 		if (ret) {
+			if (!fs_err(sdev, ret)) {
+				rte_memcpy(snapshot, &backup, sizeof(backup));
+				goto inc;
+			}
 			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
 				  i, ret);
 			*timestamp = 0;
 			return ret;
 		}
 		*timestamp = rte_rdtsc();
+inc:
 		failsafe_stats_increment(stats, snapshot);
 	}
 	return 0;
@@ -599,7 +612,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
 		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -618,7 +631,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
 		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -652,7 +665,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
 		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -689,7 +702,7 @@
 	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
 			      PRIu8 " with error %d", i, ret);
 			return ret;
@@ -731,7 +744,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
 		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 54b5b91..5cfb327 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -376,4 +376,15 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
 	rte_wmb();
 }
 
+/*
+ * Adjust error value and rte_errno to the fail-safe actual error value.
+ */
+static inline int
+fs_err(struct sub_device *sdev, int err)
+{
+	/* A device removal shouldn't be reported as an error. */
+	if (sdev->remove == 1 || err == -EIO)
+		return rte_errno = 0;
+	return err;
+}
 #endif /* _RTE_ETH_FAILSAFE_PRIVATE_H_ */
-- 
1.8.3.1


* Re: [PATCH v5 1/6] ethdev: add devop to check removal status
  2018-01-17 20:19         ` [PATCH v5 1/6] ethdev: add devop to check removal status Matan Azrad
@ 2018-01-17 20:40           ` Ferruh Yigit
  0 siblings, 0 replies; 98+ messages in thread
From: Ferruh Yigit @ 2018-01-17 20:40 UTC (permalink / raw)
  To: Matan Azrad, Adrien Mazarguil, Gaetan Rivet
  Cc: Thomas Monjalon, dev, Neil Horman, Andrew Rybchenko, Ivan Malov

On 1/17/2018 8:19 PM, Matan Azrad wrote:
> There is time between the physical removal of the device until PMDs get
> a RMV interrupt. At this time DPDK PMDs and applications still don't
> know about the removal.
> 
> Current removal detection is achieved only by registration to device RMV
> event and the notification comes asynchronously. So, there is no option
> to detect a device removal synchronously.
> Applications and other DPDK entities may want to check a device removal
> synchronously and to take an immediate decision accordingly.
> 
> Add new dev op called is_removed to allow DPDK entities to check an
> Ethernet device removal status immediately.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>

As Thomas mentioned [1], new APIs need to be EXPERIMENTAL.

[1]
https://dpdk.org/ml/archives/dev/2018-January/087719.html

> @@ -1970,6 +1976,17 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
>  void _rte_eth_dev_reset(struct rte_eth_dev *dev);
>  
>  /**
> + * Check if an Ethernet device was physically removed.
> + *

EXPERIMENTAL API documentation is required, something like:

/**
 * @warning
 * @b EXPERIMENTAL: this API may change without prior notice
....

> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @return
> + *   - 0 when the Ethernet device is removed, otherwise 1.
> + */
> +int
> +rte_eth_dev_is_removed(uint16_t port_id);
> +
> +/**
>   * Allocate and set up a receive queue for an Ethernet device.
>   *
>   * The function allocates a contiguous block of memory for *nb_rx_desc*
> diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> index e9681ac..78547ff 100644
> --- a/lib/librte_ether/rte_ethdev_version.map
> +++ b/lib/librte_ether/rte_ethdev_version.map
> @@ -198,6 +198,13 @@ DPDK_17.11 {
>  
>  } DPDK_17.08;
>  
> +DPDK_18.02 {
> +	global:
> +
> +	rte_eth_dev_is_removed;
> +
> +} DPDK_17.11;
> +

How to use the EXPERIMENTAL tag in the linker script is not documented.
The following makes sense to me; any comment is welcome:

Version script has tags and they are linked to previous version:

DPDK_17.05 {
   ....
} DPDK_17.08;

DPDK_17.11 {
   ....
} DPDK_17.08;

DPDK_18.02 {
   ....
} DPDK_17.11;



But as far as I understand, that is information only for the linker. So
we can drop that part from EXPERIMENTAL and make sure it is the *last*
item in the file, like:

DPDK_17.05 {
   ....
} DPDK_17.08;

DPDK_17.11 {
   ....
} DPDK_17.08;

DPDK_18.02 {
   ....
} DPDK_17.11;

EXPERIMENTAL {
   ...
};


* Re: [PATCH v5 6/6] net/failsafe: fix removed device handling
  2018-01-17 20:19         ` [PATCH v5 6/6] net/failsafe: fix removed device handling Matan Azrad
@ 2018-01-18  8:44           ` Gaëtan Rivet
  0 siblings, 0 replies; 98+ messages in thread
From: Gaëtan Rivet @ 2018-01-18  8:44 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Ferruh Yigit, Adrien Mazarguil, Thomas Monjalon, dev

Hi Matan,

On Wed, Jan 17, 2018 at 08:19:17PM +0000, Matan Azrad wrote:
> There is time between the physical removal of the device until
> sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
> applications still don't know about the removal and may call sub-device
> control operation which should return an error.
> 
> In previous code this error is reported to the application contrary to
> fail-safe principle that the app should not be aware of device removal.
> 
> Add an removal check in each relevant control command error flow and
> prevent an error report to application when the sub-device is removed.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>

-- 
Gaëtan Rivet
6WIND


* [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack
  2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                           ` (5 preceding siblings ...)
  2018-01-17 20:19         ` [PATCH v5 6/6] net/failsafe: fix removed device handling Matan Azrad
@ 2018-01-18 11:27         ` Matan Azrad
  2018-01-18 11:27           ` [PATCH v6 1/6] ethdev: add devop to check removal status Matan Azrad
                             ` (6 more replies)
  6 siblings, 7 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 11:27 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a window between the physical removal of a device and the moment the sub-device PMDs get an RMV interrupt.
During this window, DPDK PMDs and applications are not yet aware of the removal and may call sub-device control operations, which should return an error.

This series adds a new ethdev operation to check for device removal, adds support for it in the mlx PMDs, adjusts the ethdev APIs to return -EIO in case of removal, and fixes the fail-safe bug of reporting removal errors to the application.

V2:
Remove ENODEV definition.
Remove checks from all mlx control commands.
Add new devop - "is_removed".
Implement it in mlx4 and mlx5.
Fix failsafe bug by the new devop.

V3:
Adjust ethdev APIs removal error report.
Change failsafe check to check eth_dev* return values.
Remove backporting of fail-safe patch.

V4:
Improve fail-safe internal API to adjust the actual error value as discussed.
Remove "Fixes" lines from fail-safe patch.
No changes in ethdev\mlx patches.

V5:
Rebase on top of master-net-mlx. 

V6:
Move ethdev new API to be EXPERIMENTAL.

Matan Azrad (6):
  ethdev: add devop to check removal status
  net/mlx4: support a device removal check operation
  net/mlx5: support a device removal check operation
  ethdev: adjust APIs removal error report
  ethdev: adjust flow APIs removal error report
  net/failsafe: fix removed device handling

 drivers/net/failsafe/failsafe_flow.c    |  18 ++-
 drivers/net/failsafe/failsafe_ops.c     |  35 +++--
 drivers/net/failsafe/failsafe_private.h |  11 ++
 drivers/net/mlx4/mlx4.c                 |   1 +
 drivers/net/mlx4/mlx4.h                 |   1 +
 drivers/net/mlx4/mlx4_ethdev.c          |  20 +++
 drivers/net/mlx5/mlx5.c                 |   2 +
 drivers/net/mlx5/mlx5.h                 |   1 +
 drivers/net/mlx5/mlx5_ethdev.c          |  20 +++
 lib/librte_ether/rte_ethdev.c           | 218 +++++++++++++++++++++-----------
 lib/librte_ether/rte_ethdev.h           |  71 ++++++++++-
 lib/librte_ether/rte_ethdev_version.map |   1 +
 lib/librte_ether/rte_flow.c             |  34 ++++-
 lib/librte_ether/rte_flow.h             |   2 +
 14 files changed, 335 insertions(+), 100 deletions(-)

-- 
1.8.3.1


* [PATCH v6 1/6] ethdev: add devop to check removal status
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
@ 2018-01-18 11:27           ` Matan Azrad
  2018-01-18 17:18             ` Ferruh Yigit
  2018-01-18 11:27           ` [PATCH v6 2/6] net/mlx4: support a device removal check operation Matan Azrad
                             ` (5 subsequent siblings)
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 11:27 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a window between the physical removal of a device and the
moment PMDs get an RMV interrupt. During this window, DPDK PMDs and
applications are not yet aware of the removal.

Currently, removal detection is achieved only by registering for the
device RMV event, and the notification arrives asynchronously. So there
is no way to detect a device removal synchronously.
Applications and other DPDK entities may want to check for a device
removal synchronously and take an immediate decision accordingly.

Add a new dev op called is_removed to allow DPDK entities to check the
removal status of an Ethernet device immediately.
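
The synchronous check described above can be sketched in plain C without
DPDK headers. This is a minimal sketch under stated assumptions:
mock_dev, mock_dev_state, mock_driver_probe and mock_dev_is_removed are
hypothetical names modeled on struct rte_eth_dev, enum rte_eth_dev_state,
a driver is_removed callback and rte_eth_dev_is_removed().

```c
#include <stddef.h>

/* Hypothetical device states, modeled on enum rte_eth_dev_state. */
enum mock_dev_state {
	MOCK_DEV_UNUSED = 0,
	MOCK_DEV_ATTACHED,
	MOCK_DEV_REMOVED,
};

struct mock_dev;

/* Hypothetical driver callback: nonzero when the hardware is gone. */
typedef int (*mock_is_removed_t)(struct mock_dev *dev);

struct mock_dev {
	enum mock_dev_state state;
	mock_is_removed_t is_removed; /* NULL when the driver lacks the op */
	int hw_gone;                  /* simulated hardware condition */
	int probes;                   /* number of driver callback calls */
};

/* Simulated driver probe, standing in for a query to the hardware. */
static int
mock_driver_probe(struct mock_dev *dev)
{
	dev->probes++;
	return dev->hw_gone;
}

/*
 * Synchronous removal check: report "not removed" when the driver has
 * no callback, answer from the cached state when the removal is already
 * known, and otherwise ask the driver and cache a positive answer.
 */
static int
mock_dev_is_removed(struct mock_dev *dev)
{
	int ret;

	if (dev == NULL || dev->is_removed == NULL)
		return 0;
	if (dev->state == MOCK_DEV_REMOVED)
		return 1;
	ret = dev->is_removed(dev);
	if (ret != 0)
		dev->state = MOCK_DEV_REMOVED;
	return ret;
}
```

Caching the positive answer means that, once a removal has been seen, later
checks in this sketch do not query the driver again.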

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.h           | 20 ++++++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  1 +
 3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index b349599..c93cec1 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -114,7 +114,8 @@ enum {
 rte_eth_find_next(uint16_t port_id)
 {
 	while (port_id < RTE_MAX_ETHPORTS &&
-	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
 		port_id++;
 
 	if (port_id >= RTE_MAX_ETHPORTS)
@@ -262,8 +263,7 @@ struct rte_eth_dev *
 rte_eth_dev_is_valid_port(uint16_t port_id)
 {
 	if (port_id >= RTE_MAX_ETHPORTS ||
-	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
-	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
+	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
 		return 0;
 	else
 		return 1;
@@ -1094,6 +1094,28 @@ struct rte_eth_dev *
 }
 
 int
+rte_eth_dev_is_removed(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
+
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;
+
+	ret = dev->dev_ops->is_removed(dev);
+	if (ret != 0)
+		dev->state = RTE_ETH_DEV_REMOVED;
+
+	return ret;
+}
+
+int
 rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		       uint16_t nb_rx_desc, unsigned int socket_id,
 		       const struct rte_eth_rxconf *rx_conf,
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index f0eeefe..18c14e9 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1169,6 +1169,9 @@ struct rte_eth_dcb_info {
 typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
 /** <@internal Function used to reset a configured Ethernet device. */
 
+typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
+/**< @internal Function used to detect an Ethernet device removal. */
+
 typedef void (*eth_promiscuous_enable_t)(struct rte_eth_dev *dev);
 /**< @internal Function used to enable the RX promiscuous mode of an Ethernet device. */
 
@@ -1498,6 +1501,8 @@ struct eth_dev_ops {
 	eth_dev_close_t            dev_close;     /**< Close device. */
 	eth_dev_reset_t		   dev_reset;	  /**< Reset device. */
 	eth_link_update_t          link_update;   /**< Get device link state. */
+	eth_is_removed_t           is_removed;
+	/**< Check if the device was physically removed. */
 
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON. */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF. */
@@ -1684,6 +1689,7 @@ enum rte_eth_dev_state {
 	RTE_ETH_DEV_UNUSED = 0,
 	RTE_ETH_DEV_ATTACHED,
 	RTE_ETH_DEV_DEFERRED,
+	RTE_ETH_DEV_REMOVED,
 };
 
 /**
@@ -1970,6 +1976,20 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
 void _rte_eth_dev_reset(struct rte_eth_dev *dev);
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Check if an Ethernet device was physically removed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - 1 when the Ethernet device is removed, otherwise 0.
+ */
+int
+rte_eth_dev_is_removed(uint16_t port_id);
+
+/**
  * Allocate and set up a receive queue for an Ethernet device.
  *
  * The function allocates a contiguous block of memory for *nb_rx_desc*
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..88b7908 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -201,6 +201,7 @@ DPDK_17.11 {
 EXPERIMENTAL {
 	global:
 
+	rte_eth_dev_is_removed;
 	rte_mtr_capabilities_get;
 	rte_mtr_create;
 	rte_mtr_destroy;
-- 
1.8.3.1


* [PATCH v6 2/6] net/mlx4: support a device removal check operation
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2018-01-18 11:27           ` [PATCH v6 1/6] ethdev: add devop to check removal status Matan Azrad
@ 2018-01-18 11:27           ` Matan Azrad
  2018-01-18 16:59             ` Adrien Mazarguil
  2018-01-18 11:27           ` [PATCH v6 3/6] net/mlx5: " Matan Azrad
                             ` (4 subsequent siblings)
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 11:27 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

Add support for getting the removal status of an mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx4/mlx4.c        |  1 +
 drivers/net/mlx4/mlx4.h        |  1 +
 drivers/net/mlx4/mlx4_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 61c5bf4..703513e 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -256,6 +256,7 @@ struct mlx4_conf {
 	.filter_ctrl = mlx4_filter_ctrl,
 	.rx_queue_intr_enable = mlx4_rx_intr_enable,
 	.rx_queue_intr_disable = mlx4_rx_intr_disable,
+	.is_removed = mlx4_is_removed,
 };
 
 /**
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index 99dc335..2ab2988 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -171,6 +171,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
 int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
 		       struct rte_eth_fc_conf *fc_conf);
 const uint32_t *mlx4_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+int mlx4_is_removed(struct rte_eth_dev *dev);
 
 /* mlx4_intr.c */
 
diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
index c80eab5..5318b56 100644
--- a/drivers/net/mlx4/mlx4_ethdev.c
+++ b/drivers/net/mlx4/mlx4_ethdev.c
@@ -1052,3 +1052,23 @@ enum rxmode_toggle {
 	}
 	return NULL;
 }
+
+/**
+ * Check if mlx4 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx4_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1


* [PATCH v6 3/6] net/mlx5: support a device removal check operation
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2018-01-18 11:27           ` [PATCH v6 1/6] ethdev: add devop to check removal status Matan Azrad
  2018-01-18 11:27           ` [PATCH v6 2/6] net/mlx4: support a device removal check operation Matan Azrad
@ 2018-01-18 11:27           ` Matan Azrad
  2018-01-18 16:59             ` Adrien Mazarguil
  2018-01-18 11:27           ` [PATCH v6 4/6] ethdev: adjust APIs removal error report Matan Azrad
                             ` (3 subsequent siblings)
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 11:27 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

Add support for getting the removal status of an mlx5 device.
It is not supported in a secondary process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        |  2 ++
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1c95f35..c13a2d3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -284,6 +284,7 @@
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static const struct eth_dev_ops mlx5_dev_sec_ops = {
@@ -331,6 +332,7 @@
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e740a4e..aaff180 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -237,6 +237,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
+int mlx5_is_removed(struct rte_eth_dev *dev);
 eth_tx_burst_t priv_select_tx_function(struct priv *, struct rte_eth_dev *);
 eth_rx_burst_t priv_select_rx_function(struct priv *, struct rte_eth_dev *);
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6f78adc..1c067ca 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1453,3 +1453,23 @@ struct ethtool_link_settings {
 	}
 	return rx_pkt_burst;
 }
+
+/**
+ * Check if mlx5 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx5_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1


* [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                             ` (2 preceding siblings ...)
  2018-01-18 11:27           ` [PATCH v6 3/6] net/mlx5: " Matan Azrad
@ 2018-01-18 11:27           ` Matan Azrad
  2018-01-18 17:31             ` Ferruh Yigit
  2018-01-18 11:27           ` [PATCH v6 5/6] ethdev: adjust flow " Matan Azrad
                             ` (2 subsequent siblings)
  6 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 11:27 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

The rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during control command execution, many
different errors can be reported to the user.

Adjust all ethdev API error reports to return -EIO in case of device
removal, using the rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_ethdev.c | 192 +++++++++++++++++++++++++++---------------
 lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
 2 files changed, 170 insertions(+), 73 deletions(-)
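
For readers who want to try the error mapping outside of DPDK, here is a
minimal standalone sketch of the decision order used by eth_err() in this
patch. The names sketch_port, sketch_ports, sketch_is_removed() and
sketch_eth_err() are hypothetical; the real helper calls
rte_eth_dev_is_removed() on a live port.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port state; not part of ethdev. */
struct sketch_port {
	bool removed;
};

static struct sketch_port sketch_ports[2];

/* Hypothetical stand-in for rte_eth_dev_is_removed(). */
static int
sketch_is_removed(uint16_t port_id)
{
	return sketch_ports[port_id].removed ? 1 : 0;
}

/*
 * Same order of checks as eth_err():
 * success passes through untouched, and the removal probe
 * runs only when the driver returned a non-zero value.
 */
static int
sketch_eth_err(uint16_t port_id, int ret)
{
	if (ret == 0)
		return 0;
	if (sketch_is_removed(port_id))
		return -EIO;
	return ret;
}
```

Because the probe runs only on a non-zero return, a successful call pays
nothing extra. Note that a positive return value (a count or a size, for
example) is also non-zero, so in this sketch, as in eth_err() as written,
it becomes -EIO when the device is removed.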

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index c93cec1..7044159 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -338,6 +338,16 @@ struct rte_eth_dev *
 	return -ENODEV;
 }
 
+static int
+eth_err(uint16_t port_id, int ret)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return -EIO;
+	return ret;
+}
+
 /* attach the new device, then store port_id of the device */
 int
 rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
@@ -492,7 +502,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
+							     rx_queue_id));
 
 }
 
@@ -518,7 +529,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_stop(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_stop(dev, rx_queue_id));
 
 }
 
@@ -544,7 +555,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_start(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_start(dev,
+							     tx_queue_id));
 
 }
 
@@ -570,7 +582,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_stop(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_stop(dev, tx_queue_id));
 
 }
 
@@ -888,7 +900,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	/* Initialize Rx profiling if enabled at compilation time. */
@@ -898,7 +910,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	return 0;
@@ -998,7 +1010,7 @@ struct rte_eth_dev *
 	if (diag == 0)
 		dev->data->dev_started = 1;
 	else
-		return diag;
+		return eth_err(port_id, diag);
 
 	rte_eth_dev_config_restore(port_id);
 
@@ -1040,7 +1052,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_up, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_up)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_up)(dev));
 }
 
 int
@@ -1053,7 +1065,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_down, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_down)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_down)(dev));
 }
 
 void
@@ -1090,7 +1102,7 @@ struct rte_eth_dev *
 	rte_eth_dev_stop(port_id);
 	ret = dev->dev_ops->dev_reset(dev);
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -1214,7 +1226,7 @@ struct rte_eth_dev *
 			dev->data->min_rx_buf_size = mbp_buf_size;
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 /**
@@ -1333,8 +1345,8 @@ struct rte_eth_dev *
 					  &local_conf.offloads);
 	}
 
-	return (*dev->dev_ops->tx_queue_setup)(dev, tx_queue_id, nb_tx_desc,
-					       socket_id, &local_conf);
+	return eth_err(port_id, (*dev->dev_ops->tx_queue_setup)(dev,
+		       tx_queue_id, nb_tx_desc, socket_id, &local_conf));
 }
 
 void
@@ -1390,14 +1402,16 @@ struct rte_eth_dev *
 rte_eth_tx_done_cleanup(uint16_t port_id, uint16_t queue_id, uint32_t free_cnt)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	int ret;
 
 	/* Validate Input Data. Bail if not valid or not supported. */
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
 
 	/* Call driver to free pending mbufs. */
-	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
-			free_cnt);
+	ret = (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
+					       free_cnt);
+	return eth_err(port_id, ret);
 }
 
 void
@@ -1534,7 +1548,7 @@ struct rte_eth_dev *
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
 	stats->rx_nombuf = dev->data->rx_mbuf_alloc_failed;
-	return (*dev->dev_ops->stats_get)(dev, stats);
+	return eth_err(port_id, (*dev->dev_ops->stats_get)(dev, stats));
 }
 
 int
@@ -1580,12 +1594,12 @@ struct rte_eth_dev *
 		count = (*dev->dev_ops->xstats_get_names_by_id)(dev, NULL,
 				NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	}
 	if (dev->dev_ops->xstats_get_names != NULL) {
 		count = (*dev->dev_ops->xstats_get_names)(dev, NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	} else
 		count = 0;
 
@@ -1765,8 +1779,12 @@ struct rte_eth_dev *
 	if (ids && no_ext_stat_requested) {
 		rte_eth_basic_stats_get_names(dev, xstats_names_copy);
 	} else {
-		rte_eth_xstats_get_names(port_id, xstats_names_copy,
+		ret = rte_eth_xstats_get_names(port_id, xstats_names_copy,
 			expected_entries);
+		if (ret < 0) {
+			free(xstats_names_copy);
+			return ret;
+		}
 	}
 
 	/* Filter stats */
@@ -1813,7 +1831,7 @@ struct rte_eth_dev *
 			xstats_names + cnt_used_entries,
 			size - cnt_used_entries);
 		if (cnt_driver_entries < 0)
-			return cnt_driver_entries;
+			return eth_err(port_id, cnt_driver_entries);
 		cnt_used_entries += cnt_driver_entries;
 	}
 
@@ -1829,8 +1847,12 @@ struct rte_eth_dev *
 	unsigned int count = 0, i, q;
 	uint64_t val, *stats_ptr;
 	uint16_t nb_rxqs, nb_txqs;
+	int ret;
+
+	ret = rte_eth_stats_get(port_id, &eth_stats);
+	if (ret < 0)
+		return ret;
 
-	rte_eth_stats_get(port_id, &eth_stats);
 	dev = &rte_eth_devices[port_id];
 
 	nb_rxqs = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
@@ -1883,7 +1905,10 @@ struct rte_eth_dev *
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
-	expected_entries = get_xstats_count(port_id);
+	ret = get_xstats_count(port_id);
+	if (ret < 0)
+		return ret;
+	expected_entries = (uint16_t)ret;
 	struct rte_eth_xstat xstats[expected_entries];
 	dev = &rte_eth_devices[port_id];
 	basic_count = get_xstats_basic_count(dev);
@@ -1966,6 +1991,7 @@ struct rte_eth_dev *
 	unsigned int count = 0, i;
 	signed int xcount = 0;
 	uint16_t nb_rxqs, nb_txqs;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -1988,14 +2014,17 @@ struct rte_eth_dev *
 				     (n > count) ? n - count : 0);
 
 		if (xcount < 0)
-			return xcount;
+			return eth_err(port_id, xcount);
 	}
 
 	if (n < count + xcount || xstats == NULL)
 		return count + xcount;
 
 	/* now fill the xstats structure */
-	count = rte_eth_basic_stats_get(port_id, xstats);
+	ret = rte_eth_basic_stats_get(port_id, xstats);
+	if (ret < 0)
+		return ret;
+	count = ret;
 
 	for (i = 0; i < count; i++)
 		xstats[i].id = i;
@@ -2045,8 +2074,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_tx_queue_stats_mapping(uint16_t port_id, uint16_t tx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, tx_queue_id, stat_idx,
-			STAT_QMAP_TX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, tx_queue_id,
+						stat_idx, STAT_QMAP_TX));
 }
 
 
@@ -2054,8 +2083,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id, uint16_t rx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, rx_queue_id, stat_idx,
-			STAT_QMAP_RX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, rx_queue_id,
+						stat_idx, STAT_QMAP_RX));
 }
 
 int
@@ -2067,7 +2096,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->fw_version_get, -ENOTSUP);
-	return (*dev->dev_ops->fw_version_get)(dev, fw_version, fw_size);
+	return eth_err(port_id, (*dev->dev_ops->fw_version_get)(dev,
+							fw_version, fw_size));
 }
 
 void
@@ -2157,7 +2187,7 @@ struct rte_eth_dev *
 	if (!ret)
 		dev->data->mtu = mtu;
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2197,7 +2227,7 @@ struct rte_eth_dev *
 			vfc->ids[vidx] &= ~(UINT64_C(1) << vbit);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2230,7 +2260,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_tpid_set, -ENOTSUP);
 
-	return (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type, tpid);
+	return eth_err(port_id, (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type,
+							       tpid));
 }
 
 int
@@ -2308,7 +2339,7 @@ struct rte_eth_dev *
 					    &dev->data->dev_conf.rxmode);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2343,9 +2374,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_pvid_set, -ENOTSUP);
-	(*dev->dev_ops->vlan_pvid_set)(dev, pvid, on);
 
-	return 0;
+	return eth_err(port_id, (*dev->dev_ops->vlan_pvid_set)(dev, pvid, on));
 }
 
 int
@@ -2357,7 +2387,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_get, -ENOTSUP);
 	memset(fc_conf, 0, sizeof(*fc_conf));
-	return (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf));
 }
 
 int
@@ -2373,7 +2403,7 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_set, -ENOTSUP);
-	return (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf));
 }
 
 int
@@ -2391,7 +2421,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	/* High water, low water validation are device specific */
 	if  (*dev->dev_ops->priority_flow_ctrl_set)
-		return (*dev->dev_ops->priority_flow_ctrl_set)(dev, pfc_conf);
+		return eth_err(port_id, (*dev->dev_ops->priority_flow_ctrl_set)
+					(dev, pfc_conf));
 	return -ENOTSUP;
 }
 
@@ -2466,7 +2497,8 @@ struct rte_eth_dev *
 		return ret;
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_update, -ENOTSUP);
-	return (*dev->dev_ops->reta_update)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_update)(dev, reta_conf,
+							     reta_size));
 }
 
 int
@@ -2486,7 +2518,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_query, -ENOTSUP);
-	return (*dev->dev_ops->reta_query)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_query)(dev, reta_conf,
+							    reta_size));
 }
 
 int
@@ -2498,7 +2531,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_update, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_update)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_update)(dev,
+								 rss_conf));
 }
 
 int
@@ -2510,7 +2544,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_conf_get, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_conf_get)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_conf_get)(dev,
+								   rss_conf));
 }
 
 int
@@ -2532,7 +2567,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_add, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_add)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_add)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2555,7 +2591,8 @@ struct rte_eth_dev *
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_del, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_del)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_del)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2566,7 +2603,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_on, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_on)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_on)(dev));
 }
 
 int
@@ -2577,7 +2614,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_off, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_off)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_off)(dev));
 }
 
 /*
@@ -2653,7 +2690,7 @@ struct rte_eth_dev *
 		dev->data->mac_pool_sel[index] |= (1ULL << pool);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2779,7 +2816,7 @@ struct rte_eth_dev *
 					&dev->data->hash_mac_addrs[index]);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2792,7 +2829,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->uc_all_hash_table_set, -ENOTSUP);
-	return (*dev->dev_ops->uc_all_hash_table_set)(dev, on);
+	return eth_err(port_id, (*dev->dev_ops->uc_all_hash_table_set)(dev,
+								       on));
 }
 
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -2822,7 +2860,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_queue_rate_limit, -ENOTSUP);
-	return (*dev->dev_ops->set_queue_rate_limit)(dev, queue_idx, tx_rate);
+	return eth_err(port_id, (*dev->dev_ops->set_queue_rate_limit)(dev,
+							queue_idx, tx_rate));
 }
 
 int
@@ -2860,7 +2899,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_set, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_set)(dev, mirror_conf, rule_id, on);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_set)(dev,
+						mirror_conf, rule_id, on));
 }
 
 int
@@ -2873,7 +2913,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_reset, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_reset)(dev, rule_id);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_reset)(dev,
+								   rule_id));
 }
 
 RTE_INIT(eth_dev_init_cb_lists)
@@ -3137,7 +3178,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_enable)(dev,
+								queue_id));
 }
 
 int
@@ -3151,7 +3193,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_disable)(dev,
+								queue_id));
 }
 
 
@@ -3179,7 +3222,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);
-	return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op, arg);
+	return eth_err(port_id, (*dev->dev_ops->filter_ctrl)(dev, filter_type,
+							     filter_op, arg));
 }
 
 void *
@@ -3429,7 +3473,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_mc_addr_list, -ENOTSUP);
-	return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+	return eth_err(port_id, dev->dev_ops->set_mc_addr_list(dev,
+						mc_addr_set, nb_mc_addr));
 }
 
 int
@@ -3441,7 +3486,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_enable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_enable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_enable)(dev));
 }
 
 int
@@ -3453,7 +3498,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_disable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_disable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_disable)(dev));
 }
 
 int
@@ -3466,7 +3511,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_rx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_rx_timestamp)(dev, timestamp, flags);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_rx_timestamp)
+				(dev, timestamp, flags));
 }
 
 int
@@ -3479,7 +3525,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_tx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_tx_timestamp)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_tx_timestamp)
+				(dev, timestamp));
 }
 
 int
@@ -3491,7 +3538,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_adjust_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_adjust_time)(dev, delta);
+	return eth_err(port_id, (*dev->dev_ops->timesync_adjust_time)(dev,
+								      delta));
 }
 
 int
@@ -3503,7 +3551,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_time)(dev,
+								timestamp));
 }
 
 int
@@ -3515,7 +3564,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_write_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_write_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_write_time)(dev,
+								timestamp));
 }
 
 int
@@ -3527,7 +3577,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_reg, -ENOTSUP);
-	return (*dev->dev_ops->get_reg)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_reg)(dev, info));
 }
 
 int
@@ -3539,7 +3589,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom_length, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom_length)(dev);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom_length)(dev));
 }
 
 int
@@ -3551,7 +3601,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom)(dev, info));
 }
 
 int
@@ -3563,7 +3613,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->set_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->set_eeprom)(dev, info));
 }
 
 int
@@ -3578,7 +3628,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	memset(dcb_info, 0, sizeof(struct rte_eth_dcb_info));
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_dcb_info, -ENOTSUP);
-	return (*dev->dev_ops->get_dcb_info)(dev, dcb_info);
+	return eth_err(port_id, (*dev->dev_ops->get_dcb_info)(dev, dcb_info));
 }
 
 int
@@ -3601,7 +3651,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_eth_type_conf,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev, l2_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev,
+								l2_tunnel));
 }
 
 int
@@ -3632,7 +3683,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_offload_set,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_offload_set)(dev,
+							l2_tunnel, mask, en));
 }
 
 static void
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 18c14e9..cf4defb 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2023,6 +2023,7 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
  *   memory buffers to populate each descriptor of the receive ring.
  * @return
  *   - 0: Success, receive queue correctly set up.
+ *   - -EIO: if device is removed.
  *   - -EINVAL: The size of network buffers which can be allocated from the
  *      memory pool does not fit the various buffer sizes allowed by the
  *      device controller.
@@ -2123,6 +2124,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_start(uint16_t port_id, uint16_t rx_queue_id);
@@ -2139,6 +2141,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_stop(uint16_t port_id, uint16_t rx_queue_id);
@@ -2156,6 +2159,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_start(uint16_t port_id, uint16_t tx_queue_id);
@@ -2172,6 +2176,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_stop(uint16_t port_id, uint16_t tx_queue_id);
@@ -2273,7 +2278,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  *   - (-EINVAL) if port identifier is invalid.
  *   - (-ENOTSUP) if hardware doesn't support this function.
  *   - (-EPERM) if not ran from the primary process.
- *   - (-EIO) if re-initialisation failed.
+ *   - (-EIO) if re-initialisation failed or device is removed.
  *   - (-ENOMEM) if the reset failed due to OOM.
  *   - (-EAGAIN) if the reset temporarily failed and should be retried later.
  */
@@ -2509,6 +2514,7 @@ int rte_eth_xstats_get_by_id(uint16_t port_id, const uint64_t *ids,
  * @return
  *    0 on success
  *    -ENODEV for invalid port_id,
+ *    -EIO if device is removed,
  *    -EINVAL if the xstat_name doesn't exist in port_id
  */
 int rte_eth_xstats_get_id_by_name(uint16_t port_id, const char *xstat_name,
@@ -2600,6 +2606,7 @@ int rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (>0) if *fw_size* is not enough to store firmware version, return
  *          the size of the non truncated string.
  */
@@ -2671,6 +2678,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if *mtu* invalid.
  *   - (-EBUSY) if operation is not allowed when the port is running
  */
@@ -2691,6 +2699,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSYS) if VLAN filtering on *port_id* disabled.
  *   - (-EINVAL) if *vlan_id* > 4095.
  */
@@ -2733,6 +2742,7 @@ int rte_eth_dev_set_vlan_strip_on_queue(uint16_t port_id, uint16_t rx_queue_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN TPID setup is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
 				    enum rte_vlan_type vlan_type,
@@ -2757,6 +2767,7 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_offload(uint16_t port_id, int offload_mask);
 
@@ -3498,6 +3509,7 @@ struct rte_eth_dev_tx_buffer {
  * @return
  *   Failure: < 0
  *     -ENODEV: Invalid interface
+ *     -EIO: device is removed
  *     -ENOTSUP: Driver does not support function
  *   Success: >= 0
  *     0-n: Number of packets freed. More packets may still remain in ring that
@@ -3612,6 +3624,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_enable(uint16_t port_id, uint16_t queue_id);
 
@@ -3633,6 +3646,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_disable(uint16_t port_id, uint16_t queue_id);
 
@@ -3690,6 +3704,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_on(uint16_t port_id);
 
@@ -3704,6 +3719,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_off(uint16_t port_id);
 
@@ -3718,6 +3734,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support flow control.
  *   - (-ENODEV)  if *port_id* invalid.
+ *   - (-EIO)  if device is removed.
  */
 int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3734,7 +3751,7 @@ int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3752,7 +3769,7 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support priority flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
 				struct rte_eth_pfc_conf *pfc_conf);
@@ -3772,6 +3789,7 @@ int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
  *   - (0) if successfully added or *mac_addr" was already added.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port* is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSPC) if no more MAC addresses can be added.
  *   - (-EINVAL) if MAC address is invalid.
  */
@@ -3823,6 +3841,7 @@ int rte_eth_dev_default_mac_addr_set(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_update(uint16_t port,
 				struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3842,6 +3861,7 @@ int rte_eth_dev_rss_reta_update(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_query(uint16_t port,
 			       struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3863,6 +3883,7 @@ int rte_eth_dev_rss_reta_query(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
@@ -3883,6 +3904,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_all_hash_table_set(uint16_t port, uint8_t on);
@@ -3906,6 +3928,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if the mr_conf information is not correct.
  */
 int rte_eth_mirror_rule_set(uint16_t port_id,
@@ -3924,6 +3947,7 @@ int rte_eth_mirror_rule_set(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_mirror_rule_reset(uint16_t port_id,
@@ -3942,6 +3966,7 @@ int rte_eth_mirror_rule_reset(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -3957,6 +3982,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
  */
@@ -3974,6 +4000,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support RSS.
  */
 int
@@ -3995,6 +4022,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4017,6 +4045,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4035,6 +4064,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this filter type.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_filter_supported(uint16_t port_id,
 		enum rte_filter_type filter_type);
@@ -4055,6 +4085,7 @@ int rte_eth_dev_filter_supported(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
@@ -4070,6 +4101,7 @@ int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  */
 int rte_eth_dev_get_dcb_info(uint16_t port_id,
@@ -4277,6 +4309,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info);
@@ -4290,6 +4323,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (>=0) EEPROM size if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom_length(uint16_t port_id);
@@ -4306,6 +4340,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4322,6 +4357,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_set_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4340,6 +4376,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if PMD of *port_id* doesn't support multicast filtering.
  *   - (-ENOSPC) if *port_id* has not enough multicast filtering resources.
  */
@@ -4356,6 +4393,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_enable(uint16_t port_id);
@@ -4369,6 +4407,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_disable(uint16_t port_id);
@@ -4388,6 +4427,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
@@ -4405,6 +4445,7 @@ int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
@@ -4424,6 +4465,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_adjust_time(uint16_t port_id, int64_t delta);
@@ -4459,6 +4501,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
@@ -4499,6 +4542,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4526,6 +4570,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
-- 
1.8.3.1


* [PATCH v6 5/6] ethdev: adjust flow APIs removal error report
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                             ` (3 preceding siblings ...)
  2018-01-18 11:27           ` [PATCH v6 4/6] ethdev: adjust APIs removal error report Matan Azrad
@ 2018-01-18 11:27           ` Matan Azrad
  2018-01-18 11:27           ` [PATCH v6 6/6] net/failsafe: fix removed device handling Matan Azrad
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 11:27 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

The rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during flow command execution, many
different errors can be reported to the user.

Adjust the error reports of all flow APIs to return -EIO in case of
device removal, using the rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_flow.c | 34 +++++++++++++++++++++++++++-------
 lib/librte_ether/rte_flow.h |  2 ++
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
index 913d1a5..a86bfbd 100644
--- a/lib/librte_ether/rte_flow.c
+++ b/lib/librte_ether/rte_flow.c
@@ -107,6 +107,18 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
 };
 
+static int
+flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return rte_flow_error_set(error, EIO,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(EIO));
+	return ret;
+}
+
 /* Get generic flow operations structure from a port. */
 const struct rte_flow_ops *
 rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
@@ -145,7 +157,8 @@ struct rte_flow_desc_data {
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->validate))
-		return ops->validate(dev, attr, pattern, actions, error);
+		return flow_err(port_id, ops->validate(dev, attr, pattern,
+						       actions, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -160,12 +173,17 @@ struct rte_flow *
 		struct rte_flow_error *error)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	struct rte_flow *flow;
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
 
 	if (unlikely(!ops))
 		return NULL;
-	if (likely(!!ops->create))
-		return ops->create(dev, attr, pattern, actions, error);
+	if (likely(!!ops->create)) {
+		flow = ops->create(dev, attr, pattern, actions, error);
+		if (flow == NULL)
+			flow_err(port_id, -rte_errno, error);
+		return flow;
+	}
 	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 			   NULL, rte_strerror(ENOSYS));
 	return NULL;
@@ -183,7 +201,8 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->destroy))
-		return ops->destroy(dev, flow, error);
+		return flow_err(port_id, ops->destroy(dev, flow, error),
+				error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -200,7 +219,7 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->flush))
-		return ops->flush(dev, error);
+		return flow_err(port_id, ops->flush(dev, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -220,7 +239,8 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->query))
-		return ops->query(dev, flow, action, data, error);
+		return flow_err(port_id, ops->query(dev, flow, action, data,
+						    error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -238,7 +258,7 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->isolate))
-		return ops->isolate(dev, set, error);
+		return flow_err(port_id, ops->isolate(dev, set, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index e0402cf..07ec217 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -1267,6 +1267,8 @@ struct rte_flow_error {
  *
  *   -ENOSYS: underlying device does not support this functionality.
  *
+ *   -EIO: underlying device is removed.
+ *
  *   -EINVAL: unknown or invalid rule specification.
  *
  *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
-- 
1.8.3.1
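
Note on the create path in the patch above: rte_flow_create() returns a
pointer, so the -EIO conversion cannot travel through the return value and
has to reach the caller through the error code instead. The following is a
minimal, self-contained sketch of that idea. It does not use DPDK; every
mock_* name is hypothetical, mock_errno stands in for rte_errno, and the
"unplugged" flag stands in for what rte_eth_dev_is_removed() would report.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins, not DPDK types. */
struct mock_flow { int id; };
struct mock_port { int unplugged; };   /* 1 once the hardware is gone */

static int mock_errno;                 /* plays the role of rte_errno */

/* Driver callback: fails with a misleading code once the device is gone. */
static struct mock_flow *
mock_drv_create(struct mock_port *port, struct mock_flow *slot)
{
	if (port->unplugged) {
		mock_errno = EINVAL;
		return NULL;
	}
	return slot;
}

/*
 * Generic layer: the return value stays NULL on failure, but the error
 * code is rewritten to EIO when the failure is caused by a removal.
 */
static struct mock_flow *
mock_flow_create(struct mock_port *port, struct mock_flow *slot)
{
	struct mock_flow *flow = mock_drv_create(port, slot);

	if (flow == NULL && port->unplugged)
		mock_errno = EIO;
	return flow;
}
```

With this shape, a caller that gets NULL back can compare the error code
against EIO to tell a removal apart from a genuinely rejected rule.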


* [PATCH v6 6/6] net/failsafe: fix removed device handling
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                             ` (4 preceding siblings ...)
  2018-01-18 11:27           ` [PATCH v6 5/6] ethdev: adjust flow " Matan Azrad
@ 2018-01-18 11:27           ` Matan Azrad
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 11:27 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a delay between the physical removal of a device and the
moment the sub-device PMDs get an RMV interrupt. During this window,
DPDK PMDs and applications still don't know about the removal and may
call a sub-device control operation, which should return an error.

In the previous code this error is reported to the application, contrary
to the fail-safe principle that the app should not be aware of device
removal.

Add a removal check in each relevant control command error flow and
prevent an error report to the application when the sub-device is
removed.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
 drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
 drivers/net/failsafe/failsafe_ops.c     | 35 ++++++++++++++++++++++-----------
 drivers/net/failsafe/failsafe_private.h | 11 +++++++++++
 3 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
index 153ceee..c072d1e 100644
--- a/drivers/net/failsafe/failsafe_flow.c
+++ b/drivers/net/failsafe/failsafe_flow.c
@@ -87,7 +87,7 @@
 		DEBUG("Calling rte_flow_validate on sub_device %d", i);
 		ret = rte_flow_validate(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_validate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -111,7 +111,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (flow->flows[i] == NULL) {
+		if (flow->flows[i] == NULL && fs_err(sdev, -rte_errno)) {
 			ERROR("Failed to create flow on sub_device %d",
 				i);
 			goto err;
@@ -150,7 +150,7 @@
 			continue;
 		local_ret = rte_flow_destroy(PORT_ID(sdev),
 				flow->flows[i], error);
-		if (local_ret) {
+		if ((local_ret = fs_err(sdev, local_ret))) {
 			ERROR("Failed to destroy flow on sub_device %d: %d",
 					i, local_ret);
 			if (ret == 0)
@@ -175,7 +175,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_flow_flush on sub_device %d", i);
 		ret = rte_flow_flush(PORT_ID(sdev), error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_flush failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -199,8 +199,12 @@
 
 	sdev = TX_SUBDEV(dev);
 	if (sdev != NULL) {
-		return rte_flow_query(PORT_ID(sdev),
-				flow->flows[SUB_ID(sdev)], type, arg, error);
+		int ret = rte_flow_query(PORT_ID(sdev),
+					 flow->flows[SUB_ID(sdev)],
+					 type, arg, error);
+
+		if ((ret = fs_err(sdev, ret)))
+			return ret;
 	}
 	WARN("No active sub_device to query about its flow");
 	return -1;
@@ -223,7 +227,7 @@
 			WARN("flow isolation mode of sub_device %d in incoherent state.",
 				i);
 		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_isolate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index fe957ad..0976745 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -121,6 +121,8 @@
 					dev->data->nb_tx_queues,
 					&dev->data->dev_conf);
 		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
 			ERROR("Could not configure sub_device %d", i);
 			return ret;
 		}
@@ -163,8 +165,11 @@
 			continue;
 		DEBUG("Starting sub_device %d", i);
 		ret = rte_eth_dev_start(PORT_ID(sdev));
-		if (ret)
+		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
 			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -196,7 +201,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
 		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -215,7 +220,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
 		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -301,7 +306,7 @@
 				rx_queue_id,
 				nb_rx_desc, socket_id,
 				rx_conf, mb_pool);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("RX queue setup failed for sub_device %d", i);
 			goto free_rxq;
 		}
@@ -367,7 +372,7 @@
 				tx_queue_id,
 				nb_tx_desc, socket_id,
 				tx_conf);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("TX queue setup failed for sub_device %d", i);
 			goto free_txq;
 		}
@@ -446,7 +451,8 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling link_update on sub_device %d", i);
 		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1) {
+		if (ret && ret != -1 && sdev->remove == 0 &&
+		    rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
 			ERROR("Link update failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -470,6 +476,7 @@
 fs_stats_get(struct rte_eth_dev *dev,
 	     struct rte_eth_stats *stats)
 {
+	struct rte_eth_stats backup;
 	struct sub_device *sdev;
 	uint8_t i;
 	int ret;
@@ -479,14 +486,20 @@
 		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
 		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
 
+		rte_memcpy(&backup, snapshot, sizeof(backup));
 		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
 		if (ret) {
+			if (!fs_err(sdev, ret)) {
+				rte_memcpy(snapshot, &backup, sizeof(backup));
+				goto inc;
+			}
 			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
 				  i, ret);
 			*timestamp = 0;
 			return ret;
 		}
 		*timestamp = rte_rdtsc();
+inc:
 		failsafe_stats_increment(stats, snapshot);
 	}
 	return 0;
@@ -599,7 +612,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
 		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -618,7 +631,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
 		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -652,7 +665,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
 		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -689,7 +702,7 @@
 	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
 			      PRIu8 " with error %d", i, ret);
 			return ret;
@@ -731,7 +744,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
 		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 54b5b91..5cfb327 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -376,4 +376,15 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
 	rte_wmb();
 }
 
+/*
+ * Adjust error value and rte_errno to the fail-safe actual error value.
+ */
+static inline int
+fs_err(struct sub_device *sdev, int err)
+{
+	/* A device removal shouldn't be reported as an error. */
+	if (sdev->remove == 1 || err == -EIO)
+		return rte_errno = 0;
+	return err;
+}
 #endif /* _RTE_ETH_FAILSAFE_PRIVATE_H_ */
-- 
1.8.3.1
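
Note on fs_err() in the patch above: the helper turns a failure into
success when the sub-device is flagged for removal or when the lower layer
already reported -EIO. The sketch below reproduces that filtering in
isolation so it can be compiled without DPDK. All mock_* names are
hypothetical; mock_rte_errno stands in for rte_errno, and mock_apply_all()
is an assumed simplification of the FOREACH_SUBDEV_STATE loops, taking
precomputed per-sub-device results instead of calling real operations.

```c
#include <errno.h>

/* Hypothetical stand-in for the fail-safe sub_device structure. */
struct mock_sub_device {
	int remove;             /* 1 once removal has been flagged */
};

static int mock_rte_errno;      /* plays the role of rte_errno */

/* Same decision as fs_err(): a removal is not reported as an error. */
static int
mock_fs_err(struct mock_sub_device *sdev, int err)
{
	if (sdev->remove == 1 || err == -EIO) {
		mock_rte_errno = 0;
		return 0;
	}
	return err;
}

/*
 * Walk the sub-devices and stop at the first error that is not
 * explained by a removal; results[i] is the raw return value of the
 * operation on sub-device i.
 */
static int
mock_apply_all(struct mock_sub_device *sdevs, const int *results, int n)
{
	for (int i = 0; i < n; i++) {
		int ret = mock_fs_err(&sdevs[i], results[i]);

		if (ret != 0)
			return ret;
	}
	return 0;
}
```

In this model an -EPERM from a removed sub-device and an -EIO from any
sub-device are both absorbed, while an -EINVAL from a healthy sub-device
still reaches the application.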


* Re: [PATCH v6 2/6] net/mlx4: support a device removal check operation
  2018-01-18 11:27           ` [PATCH v6 2/6] net/mlx4: support a device removal check operation Matan Azrad
@ 2018-01-18 16:59             ` Adrien Mazarguil
  0 siblings, 0 replies; 98+ messages in thread
From: Adrien Mazarguil @ 2018-01-18 16:59 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Ferruh Yigit, Gaetan Rivet, Thomas Monjalon, dev

On Thu, Jan 18, 2018 at 11:27:10AM +0000, Matan Azrad wrote:
> Add support to get removal status of mlx4 device.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>

-- 
Adrien Mazarguil
6WIND


* Re: [PATCH v6 3/6] net/mlx5: support a device removal check operation
  2018-01-18 11:27           ` [PATCH v6 3/6] net/mlx5: " Matan Azrad
@ 2018-01-18 16:59             ` Adrien Mazarguil
  0 siblings, 0 replies; 98+ messages in thread
From: Adrien Mazarguil @ 2018-01-18 16:59 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Ferruh Yigit, Gaetan Rivet, Thomas Monjalon, dev

On Thu, Jan 18, 2018 at 11:27:11AM +0000, Matan Azrad wrote:
> Add support to get removal status of mlx5 device.
> It is not supported in secondary process.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>

-- 
Adrien Mazarguil
6WIND


* Re: [PATCH v6 1/6] ethdev: add devop to check removal status
  2018-01-18 11:27           ` [PATCH v6 1/6] ethdev: add devop to check removal status Matan Azrad
@ 2018-01-18 17:18             ` Ferruh Yigit
  2018-01-18 17:57               ` Adrien Mazarguil
  2018-01-18 18:02               ` Matan Azrad
  0 siblings, 2 replies; 98+ messages in thread
From: Ferruh Yigit @ 2018-01-18 17:18 UTC (permalink / raw)
  To: Matan Azrad, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

On 1/18/2018 11:27 AM, Matan Azrad wrote:
> There is time between the physical removal of the device until PMDs get
> a RMV interrupt. At this time DPDK PMDs and applications still don't
> know about the removal.
> 
> Current removal detection is achieved only by registration to device RMV
> event and the notification comes asynchronously. So, there is no option
> to detect a device removal synchronously.
> Applications and other DPDK entities may want to check a device removal
> synchronously and to take an immediate decision accordingly.

So we will have two methods to detect device removal. One is asynchronous, as
you mentioned: device removal causes an interrupt, which triggers the user
callback.

The new method is synchronous, but it is still triggered from the application.
I mean the application has to call rte_eth_dev_is_removed() to learn the
status. What is the use case here, polling continuously? Won't this also cause
some latency unless you dedicate a core just to polling the device status?

> 
> Add new dev op called is_removed to allow DPDK entities to check an
> Ethernet device removal status immediately.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
>  lib/librte_ether/rte_ethdev.h           | 20 ++++++++++++++++++++
>  lib/librte_ether/rte_ethdev_version.map |  1 +
>  3 files changed, 46 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index b349599..c93cec1 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -114,7 +114,8 @@ enum {
>  rte_eth_find_next(uint16_t port_id)
>  {
>  	while (port_id < RTE_MAX_ETHPORTS &&
> -	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
> +	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
> +	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)

If the device is removed, why are we not allowed to reuse the port_id assigned
to it? Overall, the RTE_ETH_DEV_REMOVED state is not clear to me: why don't we
set RTE_ETH_DEV_UNUSED directly?

Also, the RTE_ETH_DEV_REMOVED state is set in the ethdev layer, and the ethdev
layer won't allow the port to be reused, so what changes the state of the dev
afterwards? Will it stay as it is for the lifetime of the application?

>  		port_id++;
>  
>  	if (port_id >= RTE_MAX_ETHPORTS)
> @@ -262,8 +263,7 @@ struct rte_eth_dev *
>  rte_eth_dev_is_valid_port(uint16_t port_id)
>  {
>  	if (port_id >= RTE_MAX_ETHPORTS ||
> -	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
> -	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
> +	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
>  		return 0;
>  	else
>  		return 1;
> @@ -1094,6 +1094,28 @@ struct rte_eth_dev *
>  }
>  
>  int
> +rte_eth_dev_is_removed(uint16_t port_id)
> +{
> +	struct rte_eth_dev *dev;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> +
> +	dev = &rte_eth_devices[port_id];
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> +
> +	if (dev->state == RTE_ETH_DEV_REMOVED)
> +		return 1;

Doesn't this conflict with the API documentation below:

"
 * @return
 *   - 0 when the Ethernet device is removed, otherwise 1.
"

> +
> +	ret = dev->dev_ops->is_removed(dev);
> +	if (ret != 0)
> +		dev->state = RTE_ETH_DEV_REMOVED;

It isn't clear what "dev_ops->is_removed(dev)" should return, and this will
lead to incompatible usages in PMDs over time.
Please add some documentation about the expected return values for the dev_ops.


And this is not a real removal: the PMD signals us and we stop using that
device, but the device can still be there, right?
If there is a real removal, would it be possible to use eal hotplug?

<...>


* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-18 11:27           ` [PATCH v6 4/6] ethdev: adjust APIs removal error report Matan Azrad
@ 2018-01-18 17:31             ` Ferruh Yigit
  2018-01-18 18:10               ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Ferruh Yigit @ 2018-01-18 17:31 UTC (permalink / raw)
  To: Matan Azrad, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

On 1/18/2018 11:27 AM, Matan Azrad wrote:
> rte_eth_dev_is_removed API was added to detect a device removal
> synchronously.
> 
> When a device removal occurs during control command execution, many
> different errors can be reported to the user.
> 
> Adjust all ethdev APIs error reports to return -EIO in case of device
> removal using rte_eth_dev_is_removed API.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  lib/librte_ether/rte_ethdev.c | 192 +++++++++++++++++++++++++++---------------
>  lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
>  2 files changed, 170 insertions(+), 73 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index c93cec1..7044159 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -338,6 +338,16 @@ struct rte_eth_dev *
>  	return -ENODEV;
>  }
>  
> +static int
> +eth_err(uint16_t port_id, int ret)
> +{
> +	if (ret == 0)
> +		return 0;
> +	if (rte_eth_dev_is_removed(port_id))
> +		return -EIO;
> +	return ret;
> +}
> +
>  /* attach the new device, then store port_id of the device */
>  int
>  rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
> @@ -492,7 +502,8 @@ struct rte_eth_dev *
>  		return 0;
>  	}
>  
> -	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
> +	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
> +							     rx_queue_id));
>  
>  }

Does this patch update *all* ethdev public APIs to add a check for device
removal? And does each check go to the ethdev is_removed() dev_ops to ask
whether the dev is removed? There must be a better way of doing this; am I
missing something?

I would definitely like to see more comments on this patch.

Another question: what happens if the device is removed while, or just before,
a dev_ops is called? There is no synchronization in drivers for removal, right?

<...>


* Re: [PATCH v6 1/6] ethdev: add devop to check removal status
  2018-01-18 17:18             ` Ferruh Yigit
@ 2018-01-18 17:57               ` Adrien Mazarguil
  2018-01-18 18:02               ` Matan Azrad
  1 sibling, 0 replies; 98+ messages in thread
From: Adrien Mazarguil @ 2018-01-18 17:57 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: Matan Azrad, Gaetan Rivet, Thomas Monjalon, dev

On Thu, Jan 18, 2018 at 05:18:22PM +0000, Ferruh Yigit wrote:
> On 1/18/2018 11:27 AM, Matan Azrad wrote:
> > There is time between the physical removal of the device until PMDs get
> > a RMV interrupt. At this time DPDK PMDs and applications still don't
> > know about the removal.
> > 
> > Current removal detection is achieved only by registration to device RMV
> > event and the notification comes asynchronously. So, there is no option
> > to detect a device removal synchronously.
> > Applications and other DPDK entities may want to check a device removal
> > synchronously and to take an immediate decision accordingly.
> 
> So we will have two methods to detect device removal. One is asynchronous, as
> you mentioned: device removal causes an interrupt, which triggers the user
> callback.
> 
> The new method is synchronous, but it is still triggered from the application.
> I mean the application has to call rte_eth_dev_is_removed() to learn the
> status. What is the use case here, polling continuously? Won't this also cause
> some latency unless you dedicate a core just to polling the device status?

They are complementary. The use case is when devices get suddenly physically
pulled out of their chassis (you need to picture a raging sysadmin for
that), or logically in the case of a hypervisor removing a SR-IOV device
from a VM. Either way, this happens without prior notice.

It takes time for the PCI unplug notification to travel from the kernel to
DPDK, up to several seconds, during which the DPDK application may execute
control path operations on it. These may fail due to the now non-existent
device (e.g. no ACK will be returned by the device after adding a new MAC),
and these failures may be misinterpreted (e.g. permission denied, invalid
argument and so on).

To address this problem, PMDs that support physical hotplug must have all
their devops internally check for device removal before returning any other
error, in order to possibly convert the original error code to EIO.

Since patching each and every devop in each PMD with basically the same code
would be counterproductive, this series puts this check at a higher level,
inside rte_ethdev. Since this results in a new devop, it can be exposed to
applications for free, as these may find a use for it as well.
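
As a rough illustration of the layering described above, here is a minimal,
self-contained sketch that does not depend on DPDK. All mock_* names are
hypothetical stand-ins: mock_eth_err() follows the shape of the eth_err()
helper from patch 4/6, and mock_mac_add() plays a devop that fails with a
misleading code once the hardware is gone.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an Ethernet device; not the DPDK struct. */
struct mock_dev {
	int unplugged;                            /* 1 once the hardware is gone */
	int (*is_removed)(struct mock_dev *dev);  /* removal-check devop */
	int (*mac_add)(struct mock_dev *dev);     /* some control-path devop */
};

static int
mock_is_removed(struct mock_dev *dev)
{
	return dev->unplugged;
}

/* A devop that reports "permission denied" when the device is missing. */
static int
mock_mac_add(struct mock_dev *dev)
{
	return dev->unplugged ? -EPERM : 0;
}

/* Generic layer: on failure, prefer -EIO if the device was removed. */
static int
mock_eth_err(struct mock_dev *dev, int ret)
{
	if (ret == 0)
		return 0;
	if (dev->is_removed != NULL && dev->is_removed(dev))
		return -EIO;
	return ret;
}

/* Public-API-style wrapper: a single check here covers every driver. */
static int
mock_api_mac_add(struct mock_dev *dev)
{
	return mock_eth_err(dev, dev->mac_add(dev));
}
```

The removal check only runs on the failure path, so a successful call pays
nothing extra, and a driver without the devop keeps its original error code.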

> > Add new dev op called is_removed to allow DPDK entities to check an
> > Ethernet device removal status immediately.
> > 
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > Acked-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> >  lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
> >  lib/librte_ether/rte_ethdev.h           | 20 ++++++++++++++++++++
> >  lib/librte_ether/rte_ethdev_version.map |  1 +
> >  3 files changed, 46 insertions(+), 3 deletions(-)
> > 
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > index b349599..c93cec1 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -114,7 +114,8 @@ enum {
> >  rte_eth_find_next(uint16_t port_id)
> >  {
> >  	while (port_id < RTE_MAX_ETHPORTS &&
> > -	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
> > +	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
> > +	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
> 
> If device is removed, why we are not allowed to re-use port_id assigned to it?
> Overall I am not clear with RTE_ETH_DEV_REMOVED state, why we are not directly
> setting RTE_ETH_DEV_UNUSED?
> 
> And state RTE_ETH_DEV_REMOVED set in ethdev layer, and ethdev layer won't let
> reusing it, so what changes the state of dev? Will it stay as it is during
> lifetime of the application?

Although it has switched to the REMOVED state, the underlying PMD still holds
the entry at this point; data is still allocated and so on. It will switch to
UNUSED after the PMD instance is fully de-initialized. In the meantime the
entry still needs to be skipped.
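
To make the lifecycle concrete, here is a minimal sketch assuming a
simplified port table. port_table, port_find_free(), port_allocate() and
port_release() are hypothetical names, not the ethdev API; the point is only
that a REMOVED slot is not handed out again until it has been released to
UNUSED.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical table size */

enum port_state { PORT_UNUSED = 0, PORT_ATTACHED, PORT_REMOVED };

/* Zero-initialized, so every slot starts as PORT_UNUSED. */
static enum port_state port_table[MAX_PORTS];

/* First free slot, or MAX_PORTS if none; REMOVED slots are not free. */
static uint16_t
port_find_free(void)
{
	uint16_t i;

	for (i = 0; i < MAX_PORTS; i++)
		if (port_table[i] == PORT_UNUSED)
			return i;
	return MAX_PORTS;
}

/* Claim a free slot and mark it attached; returns MAX_PORTS when full. */
static uint16_t
port_allocate(void)
{
	uint16_t id = port_find_free();

	if (id < MAX_PORTS)
		port_table[id] = PORT_ATTACHED;
	return id;
}

/* Called once the driver has released everything it held for the port. */
static void
port_release(uint16_t id)
{
	assert(id < MAX_PORTS);
	port_table[id] = PORT_UNUSED;
}
```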

> >  		port_id++;
> >  
> >  	if (port_id >= RTE_MAX_ETHPORTS)
> > @@ -262,8 +263,7 @@ struct rte_eth_dev *
> >  rte_eth_dev_is_valid_port(uint16_t port_id)
> >  {
> >  	if (port_id >= RTE_MAX_ETHPORTS ||
> > -	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
> > -	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
> > +	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
> >  		return 0;
> >  	else
> >  		return 1;
> > @@ -1094,6 +1094,28 @@ struct rte_eth_dev *
> >  }
> >  
> >  int
> > +rte_eth_dev_is_removed(uint16_t port_id)
> > +{
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > +
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> > +
> > +	if (dev->state == RTE_ETH_DEV_REMOVED)
> > +		return 1;
> 
> Isn't this conflict with below API documentation:
> 
> "
>  * @return
>  *   - 0 when the Ethernet device is removed, otherwise 1.
> "

Documentation is indeed wrong here. Matan?

> 
> > +
> > +	ret = dev->dev_ops->is_removed(dev);
> > +	if (ret != 0)
> > +		dev->state = RTE_ETH_DEV_REMOVED;
> 
> It isn't clear what "dev_ops->is_removed(dev)" should return, and this causing
> incompatible usages in PMDs by time.
> Please add some documentation about expected return values for dev_ops.

It should be clarified as a boolean value (yes = nonzero, no = zero), like
most is*() functions (isalpha(), isblank() and so on).
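
A sketch of that convention, under stated assumptions: mock_dev,
mock_is_removed and dev_is_removed are hypothetical, and normalizing the
result to exactly 1 is my choice here rather than something the patch
guarantees. The driver callback may return any nonzero value for "removed";
the wrapper caches a positive answer in the device state so the driver is not
asked again.

```c
#include <assert.h>
#include <stddef.h>

enum dev_state { DEV_UNUSED = 0, DEV_ATTACHED, DEV_REMOVED };

/* Hypothetical device; not the real struct rte_eth_dev. */
struct mock_dev {
	enum dev_state state;
	int (*is_removed)(struct mock_dev *dev); /* nonzero means removed */
	int probe_result;                        /* what the mock driver reports */
	int probe_calls;                         /* how often it was asked */
};

/* Hypothetical driver callback: counts calls and reports probe_result. */
static int
mock_is_removed(struct mock_dev *dev)
{
	dev->probe_calls++;
	return dev->probe_result;
}

/* Boolean answer: 1 if removed, 0 otherwise (or if the op is missing). */
static int
dev_is_removed(struct mock_dev *dev)
{
	assert(dev != NULL);
	if (dev->is_removed == NULL)
		return 0;
	if (dev->state == DEV_REMOVED)
		return 1; /* cached answer, driver is not asked again */
	if (dev->is_removed(dev) != 0) {
		dev->state = DEV_REMOVED;
		return 1;
	}
	return 0;
}
```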

> And this not real remove, PMD signals us and we stop using that device, but
> device can be there, right?

"Removal" in the sense of "device removal" not "PMD removal" which is
usually described as "unbinding". This was chosen based on the
similarly-named "removal" (RMV) event for consistency.

> If there is a real removal, can be possible to use eal hotplug?

Possibly, although I think it doesn't remove the case for this devop, right?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v6 1/6] ethdev: add devop to check removal status
  2018-01-18 17:18             ` Ferruh Yigit
  2018-01-18 17:57               ` Adrien Mazarguil
@ 2018-01-18 18:02               ` Matan Azrad
  1 sibling, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 18:02 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

Hi Ferruh

From: Ferruh Yigit, Thursday, January 18, 2018 7:18 PM
> On 1/18/2018 11:27 AM, Matan Azrad wrote:
> > There is time between the physical removal of the device until PMDs
> > get a RMV interrupt. At this time DPDK PMDs and applications still
> > don't know about the removal.
> >
> > Current removal detection is achieved only by registration to device
> > RMV event and the notification comes asynchronously. So, there is no
> > option to detect a device removal synchronously.
> > Applications and other DPDK entities may want to check a device
> > removal synchronously and to take an immediate decision accordingly.
> 
> So we will have two methods to detect device removal, one is asynchronous
> as you mentioned.
> Device removal will cause an interrupt which trigger to run user callback.

Yes.

> New method is synchronous, but still triggered from application. I mean
> application should do a rte_eth_dev_is_removed() to learn about status,
> what is the use case here, polling continuously? Won't this also cause some
> latency unless you dedicate a core just polling device status?
> 

It is for applications and for other DPDK entities such as PMDs; see the fail-safe example in this series.
Once hotplug is in the game, I think applications can use it too.

> > Add new dev op called is_removed to allow DPDK entities to check an
> > Ethernet device removal status immediately.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > Acked-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> >  lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
> >  lib/librte_ether/rte_ethdev.h           | 20 ++++++++++++++++++++
> >  lib/librte_ether/rte_ethdev_version.map |  1 +
> >  3 files changed, 46 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index b349599..c93cec1 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -114,7 +114,8 @@ enum {
> >  rte_eth_find_next(uint16_t port_id)
> >  {
> >  	while (port_id < RTE_MAX_ETHPORTS &&
> > -	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
> > +	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
> > +	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
> 
> If device is removed, why we are not allowed to re-use port_id assigned to
> it?
Sorry, I don't understand.
We still allow iterating over it here.

> Overall I am not clear with RTE_ETH_DEV_REMOVED state, why we are not
> directly setting RTE_ETH_DEV_UNUSED?
 
Someone should release the SW port resources before setting it to UNUSED.

> And state RTE_ETH_DEV_REMOVED set in ethdev layer, and ethdev layer
> won't let reusing it, so what changes the state of dev? Will it stay as it is
> during lifetime of the application?
> 
> >  		port_id++;
> >
> >  	if (port_id >= RTE_MAX_ETHPORTS)
> > @@ -262,8 +263,7 @@ struct rte_eth_dev *
> > rte_eth_dev_is_valid_port(uint16_t port_id)  {
> >  	if (port_id >= RTE_MAX_ETHPORTS ||
> > -	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
> > -	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
> > +	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
> >  		return 0;
> >  	else
> >  		return 1;
> > @@ -1094,6 +1094,28 @@ struct rte_eth_dev *  }
> >
> >  int
> > +rte_eth_dev_is_removed(uint16_t port_id) {
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > +
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> > +
> > +	if (dev->state == RTE_ETH_DEV_REMOVED)
> > +		return 1;
> 
> Isn't this conflict with below API documentation:
> 

Yes, you are absolutely right, we need to change this documentation.

> "
>  * @return
>  *   - 0 when the Ethernet device is removed, otherwise 1.
> "
> 
> > +
> > +	ret = dev->dev_ops->is_removed(dev);
> > +	if (ret != 0)
> > +		dev->state = RTE_ETH_DEV_REMOVED;
> 
> It isn't clear what "dev_ops->is_removed(dev)" should return, and this
> causing incompatible usages in PMDs by time.
> Please add some documentation about expected return values for dev_ops.
>

OK
 
> 
> And this not real remove, PMD signals us and we stop using that device, but
> device can be there, right?

It says that the device was physically removed but there are some software resources which have not been released yet.

> If there is a real removal, can be possible to use eal hotplug?

I think EAL hotplug is asynchronous, like the current RMV event, so the EAL hotplug event can be used instead of the RMV event.
 


> <...>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-18 17:31             ` Ferruh Yigit
@ 2018-01-18 18:10               ` Matan Azrad
  2018-01-19 16:19                 ` Ferruh Yigit
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-18 18:10 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

Hi Ferruh

From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
> On 1/18/2018 11:27 AM, Matan Azrad wrote:
> > rte_eth_dev_is_removed API was added to detect a device removal
> > synchronously.
> >
> > When a device removal occurs during control command execution, many
> > different errors can be reported to the user.
> >
> > Adjust all ethdev APIs error reports to return -EIO in case of device
> > removal using rte_eth_dev_is_removed API.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > Acked-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> >  lib/librte_ether/rte_ethdev.c | 192
> > +++++++++++++++++++++++++++---------------
> >  lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
> >  2 files changed, 170 insertions(+), 73 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index c93cec1..7044159 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -338,6 +338,16 @@ struct rte_eth_dev *
> >  	return -ENODEV;
> >  }
> >
> > +static int
> > +eth_err(uint16_t port_id, int ret)
> > +{
> > +	if (ret == 0)
> > +		return 0;
> > +	if (rte_eth_dev_is_removed(port_id))
> > +		return -EIO;
> > +	return ret;
> > +}
> > +
> >  /* attach the new device, then store port_id of the device */  int
> > rte_eth_dev_attach(const char *devargs, uint16_t *port_id) @@ -492,7
> > +502,8 @@ struct rte_eth_dev *
> >  		return 0;
> >  	}
> >
> > -	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
> > +	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
> > +							     rx_queue_id));
> >
> >  }
> 
> This patch updates *all* ethdev public APIs to add if device is removed
> check?

Yes.

> And each check goes to ethdev is_removed() dev_ops to ask if dev is
> removed.
Probably; if the REMOVED state is already set, the device is_removed op will not be called.

> These must be better way of doing this, am I missing something.

Suggest.

This code will replace similar code in each PMD.

> I definitely would like to see more comments for this patch.
> 
> Another question is what happens if device removed while or before
> dev_ops called? There is no synchronizations in drivers for removal, right?
> 

Yes. You are right, the device removal state can change a moment after the call.
Actually, the caller suspects a removal before calling it (and wants to validate it), so it makes sense.
For this reason the check in ethdev APIs is generally called in error flows.


> <...>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-18 18:10               ` Matan Azrad
@ 2018-01-19 16:19                 ` Ferruh Yigit
  2018-01-19 17:35                   ` Ananyev, Konstantin
                                     ` (2 more replies)
  0 siblings, 3 replies; 98+ messages in thread
From: Ferruh Yigit @ 2018-01-19 16:19 UTC (permalink / raw)
  To: Matan Azrad, Adrien Mazarguil, Gaetan Rivet
  Cc: Thomas Monjalon, dev, Andrew Rybchenko, Ananyev, Konstantin,
	Alejandro Lucero, Jerin Jacob, Hemant Agrawal, Shahaf Shuler,
	Adrien Mazarguil, Olivier MATZ, Zhang, Helin

On 1/18/2018 6:10 PM, Matan Azrad wrote:
> Hi Ferruh
> 
> From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
>> On 1/18/2018 11:27 AM, Matan Azrad wrote:
>>> rte_eth_dev_is_removed API was added to detect a device removal
>>> synchronously.
>>>
>>> When a device removal occurs during control command execution, many
>>> different errors can be reported to the user.
>>>
>>> Adjust all ethdev APIs error reports to return -EIO in case of device
>>> removal using rte_eth_dev_is_removed API.
>>>
>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>> Acked-by: Thomas Monjalon <thomas@monjalon.net>
>>> ---
>>>  lib/librte_ether/rte_ethdev.c | 192
>>> +++++++++++++++++++++++++++---------------
>>>  lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
>>>  2 files changed, 170 insertions(+), 73 deletions(-)
>>>
>>> diff --git a/lib/librte_ether/rte_ethdev.c
>>> b/lib/librte_ether/rte_ethdev.c index c93cec1..7044159 100644
>>> --- a/lib/librte_ether/rte_ethdev.c
>>> +++ b/lib/librte_ether/rte_ethdev.c
>>> @@ -338,6 +338,16 @@ struct rte_eth_dev *
>>>  	return -ENODEV;
>>>  }
>>>
>>> +static int
>>> +eth_err(uint16_t port_id, int ret)
>>> +{
>>> +	if (ret == 0)
>>> +		return 0;
>>> +	if (rte_eth_dev_is_removed(port_id))
>>> +		return -EIO;
>>> +	return ret;
>>> +}
>>> +
>>>  /* attach the new device, then store port_id of the device */  int
>>> rte_eth_dev_attach(const char *devargs, uint16_t *port_id) @@ -492,7
>>> +502,8 @@ struct rte_eth_dev *
>>>  		return 0;
>>>  	}
>>>
>>> -	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
>>> +	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
>>> +							     rx_queue_id));
>>>
>>>  }
>>
>> This patch updates *all* ethdev public APIs to add if device is removed
>> check?
> 
> Yes.
> 
>> And each check goes to ethdev is_removed() dev_ops to ask if dev is
>> removed.
> Probably, if the REMOVED state setted in will not call device is_remove.
> 
>> These must be better way of doing this, am I missing something.
> 
> Suggest.

With a silly analogy, this is like a blind person asking each time if he is dead
before talking to another person.

At first glance, I think a kind of watchdog timer could be implemented in the
ethdev layer. It would provide periodic checks and, if the device is dead, call
the registered user callback function.

This method is presented as synchronous, but it is not triggered from the side
where the event happens; I mean it is triggered from the application, not from
the PMD. So does the application poll continuously to see if the device is dead?
Or if the application relies on this patch adding a check in each API, what
happens if the device is removed during data processing? Will the app rely on
the asynchronous method?

I am including a few consumers of ethdev in the mail thread. Clearly I am not
very supportive of this patch, but especially taking into account that the
release is close, if there is no objection other than mine I will take it as
consensus to get the patch in.

> 
> This code will replace similar code in each PMD.
> 
>> I definitely would like to see more comments for this patch.
>>
>> Another question is what happens if device removed while or before
>> dev_ops called? There is no synchronizations in drivers for removal, right?
>>
> 
> Yes. You right, the device removal can be changed a moment after the call.
> Actually the caller suspected in removal before call it(and want to validate it) - so it makes sense. 
> From this reason the check in ethdev APIs is called generally in error flows.
> 
> 
>> <...>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-19 16:19                 ` Ferruh Yigit
@ 2018-01-19 17:35                   ` Ananyev, Konstantin
  2018-01-19 17:54                   ` Thomas Monjalon
  2018-01-21 20:07                   ` Ferruh Yigit
  2 siblings, 0 replies; 98+ messages in thread
From: Ananyev, Konstantin @ 2018-01-19 17:35 UTC (permalink / raw)
  To: Yigit, Ferruh, Matan Azrad, Adrien Mazarguil, Gaetan Rivet
  Cc: Thomas Monjalon, dev, Andrew Rybchenko, Alejandro Lucero,
	Jerin Jacob, Hemant Agrawal, Shahaf Shuler, Adrien Mazarguil,
	Olivier MATZ, Zhang, Helin



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, January 19, 2018 4:19 PM
> To: Matan Azrad <matan@mellanox.com>; Adrien Mazarguil <adrien.mazarguil@6wind.com>; Gaetan Rivet <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Andrew Rybchenko <arybchenko@solarflare.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Alejandro Lucero <alejandro.lucero@netronome.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>; Hemant Agrawal <hemant.agrawal@nxp.com>; Shahaf Shuler <shahafs@mellanox.com>; Adrien
> Mazarguil <adrien.mazarguil@6wind.com>; Olivier MATZ <olivier.matz@6wind.com>; Zhang, Helin <helin.zhang@intel.com>
> Subject: Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
> 
> On 1/18/2018 6:10 PM, Matan Azrad wrote:
> > Hi Ferruh
> >
> > From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
> >> On 1/18/2018 11:27 AM, Matan Azrad wrote:
> >>> rte_eth_dev_is_removed API was added to detect a device removal
> >>> synchronously.
> >>>
> >>> When a device removal occurs during control command execution, many
> >>> different errors can be reported to the user.
> >>>
> >>> Adjust all ethdev APIs error reports to return -EIO in case of device
> >>> removal using rte_eth_dev_is_removed API.
> >>>
> >>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> >>> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> >>> ---
> >>>  lib/librte_ether/rte_ethdev.c | 192
> >>> +++++++++++++++++++++++++++---------------
> >>>  lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
> >>>  2 files changed, 170 insertions(+), 73 deletions(-)
> >>>
> >>> diff --git a/lib/librte_ether/rte_ethdev.c
> >>> b/lib/librte_ether/rte_ethdev.c index c93cec1..7044159 100644
> >>> --- a/lib/librte_ether/rte_ethdev.c
> >>> +++ b/lib/librte_ether/rte_ethdev.c
> >>> @@ -338,6 +338,16 @@ struct rte_eth_dev *
> >>>  	return -ENODEV;
> >>>  }
> >>>
> >>> +static int
> >>> +eth_err(uint16_t port_id, int ret)
> >>> +{
> >>> +	if (ret == 0)
> >>> +		return 0;
> >>> +	if (rte_eth_dev_is_removed(port_id))
> >>> +		return -EIO;
> >>> +	return ret;
> >>> +}
> >>> +
> >>>  /* attach the new device, then store port_id of the device */  int
> >>> rte_eth_dev_attach(const char *devargs, uint16_t *port_id) @@ -492,7
> >>> +502,8 @@ struct rte_eth_dev *
> >>>  		return 0;
> >>>  	}
> >>>
> >>> -	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
> >>> +	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
> >>> +							     rx_queue_id));
> >>>
> >>>  }
> >>
> >> This patch updates *all* ethdev public APIs to add if device is removed
> >> check?
> >
> > Yes.
> >
> >> And each check goes to ethdev is_removed() dev_ops to ask if dev is
> >> removed.
> > Probably, if the REMOVED state setted in will not call device is_remove.
> >
> >> These must be better way of doing this, am I missing something.
> >
> > Suggest.
> 
> With a silly analogy, this is like a blind person asking each time if he is dead
> before talking to other person.

I agree with Ferruh that it looks a bit clumsy...
Though I don't have any bright ideas here either.
As far as I can see, right now is_removed() is implemented only for the mlx PMDs.
Would it make sense to hide that check inside the mlx implementation then?
BTW:


 int
+rte_eth_dev_is_removed(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);

I'd say these 2 checks have to be swapped.
Konstantin

+
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;
+


> 
> At first glance I can think of a kind of watchdog timer can be implemented in
> ethdev layer. It provides periodic checks and if device is dead it calls the
> registered user callback function.
> 
> This method presented as synchronous method but not triggered from side where
> event happens, I mean not triggered from PMD but from application.
> So does application doing polling continuously if device is dead?
> Or if application is relying this patch to add a check in each API, what happens
> if device removed during data processing, will app rely on asynchronous method?
> 
> I am including a few consumers of the ethdev to the mail thread, clearly I am
> not very supportive of this patch, but specially taking release is being close
> to the account, if there is no objection than me I will take as consensus to get
> the patch in.
> 
> >
> > This code will replace similar code in each PMD.
> >
> >> I definitely would like to see more comments for this patch.
> >>
> >> Another question is what happens if device removed while or before
> >> dev_ops called? There is no synchronizations in drivers for removal, right?
> >>
> >
> > Yes. You right, the device removal can be changed a moment after the call.
> > Actually the caller suspected in removal before call it(and want to validate it) - so it makes sense.
> > From this reason the check in ethdev APIs is called generally in error flows.
> >
> >
> >> <...>


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-19 16:19                 ` Ferruh Yigit
  2018-01-19 17:35                   ` Ananyev, Konstantin
@ 2018-01-19 17:54                   ` Thomas Monjalon
  2018-01-19 18:13                     ` Ferruh Yigit
  2018-01-21 20:07                   ` Ferruh Yigit
  2 siblings, 1 reply; 98+ messages in thread
From: Thomas Monjalon @ 2018-01-19 17:54 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: dev, Matan Azrad, Adrien Mazarguil, Gaetan Rivet,
	Andrew Rybchenko, Ananyev, Konstantin, Alejandro Lucero,
	Jerin Jacob, Hemant Agrawal, Shahaf Shuler, Olivier MATZ, Zhang,
	Helin

19/01/2018 17:19, Ferruh Yigit:
> On 1/18/2018 6:10 PM, Matan Azrad wrote:
> > From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
> >> This patch updates *all* ethdev public APIs to add if device is removed
> >> check?
> > 
> > Yes.
> > 
> >> And each check goes to ethdev is_removed() dev_ops to ask if dev is
> >> removed.
> > Probably, if the REMOVED state setted in will not call device is_remove.
> > 
> >> These must be better way of doing this, am I missing something.
> > 
> > Suggest.
> 
> With a silly analogy, this is like a blind person asking each time if he is dead
> before talking to other person.
> 
> At first glance I can think of a kind of watchdog timer can be implemented in
> ethdev layer. It provides periodic checks and if device is dead it calls the
> registered user callback function.
> 
> This method presented as synchronous method but not triggered from side where
> event happens, I mean not triggered from PMD but from application.
> So does application doing polling continuously if device is dead?
> Or if application is relying this patch to add a check in each API, what happens
> if device removed during data processing, will app rely on asynchronous method?

We cannot put a mutex on hardware removal :)
So we have to live with errors due to removal.
If we are trying to configure a removed device,
the error will not be related to the root cause.
This patch is just trying to improve the situation by returning
an appropriate error code if removal can be detected.
Note: the check is run only if there is an error
and if the removal is not already detected.

If I understand well, you prefer relying only on asynchronous
hotplug events? Even if they come really late?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-19 17:54                   ` Thomas Monjalon
@ 2018-01-19 18:13                     ` Ferruh Yigit
  2018-01-19 18:16                       ` Thomas Monjalon
  0 siblings, 1 reply; 98+ messages in thread
From: Ferruh Yigit @ 2018-01-19 18:13 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Matan Azrad, Adrien Mazarguil, Gaetan Rivet,
	Andrew Rybchenko, Ananyev, Konstantin, Alejandro Lucero,
	Jerin Jacob, Hemant Agrawal, Shahaf Shuler, Olivier MATZ, Zhang,
	Helin

On 1/19/2018 5:54 PM, Thomas Monjalon wrote:
> 19/01/2018 17:19, Ferruh Yigit:
>> On 1/18/2018 6:10 PM, Matan Azrad wrote:
>>> From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
>>>> This patch updates *all* ethdev public APIs to add if device is removed
>>>> check?
>>>
>>> Yes.
>>>
>>>> And each check goes to ethdev is_removed() dev_ops to ask if dev is
>>>> removed.
>>> Probably, if the REMOVED state setted in will not call device is_remove.
>>>
>>>> These must be better way of doing this, am I missing something.
>>>
>>> Suggest.
>>
>> With a silly analogy, this is like a blind person asking each time if he is dead
>> before talking to other person.
>>
>> At first glance I can think of a kind of watchdog timer can be implemented in
>> ethdev layer. It provides periodic checks and if device is dead it calls the
>> registered user callback function.
>>
>> This method presented as synchronous method but not triggered from side where
>> event happens, I mean not triggered from PMD but from application.
>> So does application doing polling continuously if device is dead?
>> Or if application is relying this patch to add a check in each API, what happens
>> if device removed during data processing, will app rely on asynchronous method?
> 
> We cannot put a mutex on hardware removal :)
> So we have to live with errors due to removal.
> If we are trying to configure a removed device,
> the error will be not related to the root cause.
> This patch is just trying to improve the situation by returning
> an appropriate error code if removal can be detected.
> Note: the check is run only if there is an error
> and if the removal is not already detected.
> 
> If I understand well, you prefer relying only on asynchronous
> hotplug events? Even if they come really late?

I think asynchronous hotplug events are a better approach, but if they are late
I understand you need a solution.

I assume the issue is that when a device is removed, the application can keep
making API calls until it detects the removal.

For that case, instead of the ethdev abstraction layer doing a translation,
what do you think about defining a specific error type and returning it from
PMDs to indicate that the error is related to the missing device, so the
application will know the device is no longer there? That error type would be
used consistently in all PMDs.
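
A sketch of what that alternative could look like inside one driver, purely
for illustration: DRV_ERR_DEVICE_GONE, mock_hw, hw_command(), drv_fixup() and
the two ops are hypothetical names, and the error value is an arbitrary
placeholder since no specific code has been agreed on here. Every op in every
PMD would need to route its result through a helper like drv_fixup().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dedicated code; the actual value is an open question. */
#define DRV_ERR_DEVICE_GONE (-ENODEV)

/* Hypothetical hardware handle owned by the driver. */
struct mock_hw {
	bool present;
};

/* Hypothetical low-level command: generic failure once unplugged. */
static int
hw_command(const struct mock_hw *hw)
{
	return hw->present ? 0 : -EINVAL;
}

/* Per-driver helper: report the dedicated code when the device is gone. */
static int
drv_fixup(const struct mock_hw *hw, int ret)
{
	if (ret != 0 && !hw->present)
		return DRV_ERR_DEVICE_GONE;
	return ret;
}

/* Each driver op has to repeat the same fixup call. */
static int
drv_mtu_set(const struct mock_hw *hw)
{
	assert(hw != NULL);
	return drv_fixup(hw, hw_command(hw));
}

static int
drv_promisc_enable(const struct mock_hw *hw)
{
	assert(hw != NULL);
	return drv_fixup(hw, hw_command(hw));
}
```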

Or what do you think about the flexible watchdog timer implementation in ethdev
suggested above, so applications may be informed about a missing device faster?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-19 18:13                     ` Ferruh Yigit
@ 2018-01-19 18:16                       ` Thomas Monjalon
  2018-01-20 19:04                         ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Thomas Monjalon @ 2018-01-19 18:16 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: dev, Matan Azrad, Adrien Mazarguil, Gaetan Rivet,
	Andrew Rybchenko, Ananyev, Konstantin, Alejandro Lucero,
	Jerin Jacob, Hemant Agrawal, Shahaf Shuler, Olivier MATZ, Zhang,
	Helin

19/01/2018 19:13, Ferruh Yigit:
> On 1/19/2018 5:54 PM, Thomas Monjalon wrote:
> > 19/01/2018 17:19, Ferruh Yigit:
> >> On 1/18/2018 6:10 PM, Matan Azrad wrote:
> >>> From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
> >>>> This patch updates *all* ethdev public APIs to add if device is removed
> >>>> check?
> >>>
> >>> Yes.
> >>>
> >>>> And each check goes to ethdev is_removed() dev_ops to ask if dev is
> >>>> removed.
> >>> Probably, if the REMOVED state setted in will not call device is_remove.
> >>>
> >>>> These must be better way of doing this, am I missing something.
> >>>
> >>> Suggest.
> >>
> >> With a silly analogy, this is like a blind person asking each time if he is dead
> >> before talking to other person.
> >>
> >> At first glance I can think of a kind of watchdog timer can be implemented in
> >> ethdev layer. It provides periodic checks and if device is dead it calls the
> >> registered user callback function.
> >>
> >> This method presented as synchronous method but not triggered from side where
> >> event happens, I mean not triggered from PMD but from application.
> >> So does application doing polling continuously if device is dead?
> >> Or if application is relying this patch to add a check in each API, what happens
> >> if device removed during data processing, will app rely on asynchronous method?
> > 
> > We cannot put a mutex on hardware removal :)
> > So we have to live with errors due to removal.
> > If we are trying to configure a removed device,
> > the error will be not related to the root cause.
> > This patch is just trying to improve the situation by returning
> > an appropriate error code if removal can be detected.
> > Note: the check is run only if there is an error
> > and if the removal is not already detected.
> > 
> > If I understand well, you prefer relying only on asynchronous
> > hotplug events? Even if they come really late?
> 
> I think asynchronous hotplug events are better approach, but if they are late I
> understand you need a solution.
> 
> I assume issue is when device is removed, application can make API calls until
> it detects the removal.
> 
> For that case what do you think instead of ethdev abstraction layer doing a
> translation, define a specific error type and return it from PMDs to indicate
> that error is related to the missing device and application will know device is
> no more there? And consistently use that error type in all PMDs.

Yes it is also a good solution.
I think Matan proposed to integrate it in ethdev to avoid duplicating
the same mechanism in every PMD.

> Or what do you think above suggested flexible watchdog timer implementation in
> ethdev, so applications may be informed about missing device faster.

I think the watchdog would compete with hotplug events.
So I prefer not integrating one more asynchronous mechanism
for the same purpose.
If we want polling, it can be an option to enable in EAL hotplug.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-19 18:16                       ` Thomas Monjalon
@ 2018-01-20 19:04                         ` Matan Azrad
  2018-01-20 20:28                           ` Thomas Monjalon
  0 siblings, 1 reply; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 19:04 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Ananyev, Konstantin
  Cc: dev, Adrien Mazarguil, Gaetan Rivet, Andrew Rybchenko,
	Alejandro Lucero, Jerin Jacob, Hemant Agrawal, Shahaf Shuler,
	Olivier MATZ, Zhang, Helin

Hi all

From: Thomas Monjalon, Friday, January 19, 2018 8:17 PM
> 19/01/2018 19:13, Ferruh Yigit:
> > On 1/19/2018 5:54 PM, Thomas Monjalon wrote:
> > > 19/01/2018 17:19, Ferruh Yigit:
> > >> On 1/18/2018 6:10 PM, Matan Azrad wrote:
> > >>> From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
> > >>>> This patch updates *all* ethdev public APIs to add if device is
> > >>>> removed check?
> > >>>
> > >>> Yes.
> > >>>
> > >>>> And each check goes to ethdev is_removed() dev_ops to ask if dev
> > >>>> is removed.
> > >>> Probably, if the REMOVED state setted in will not call device
> is_remove.
> > >>>
> > >>>> These must be better way of doing this, am I missing something.
> > >>>
> > >>> Suggest.
> > >>
> > >> With a silly analogy, this is like a blind person asking each time
> > >> if he is dead before talking to other person.

Just to refine the analogy :)
This is like a blind person (application, ethdev) using their guide dog (ethdev device): every time the dog refuses to take action (an error occurred), the blind person asks whether the dog can still be a guide dog (removal error).

> > >> At first glance I can think of a kind of watchdog timer can be
> > >> implemented in ethdev layer. It provides periodic checks and if
> > >> device is dead it calls the registered user callback function.
> > >>
> > >> This method presented as synchronous method but not triggered from
> > >> side where event happens, I mean not triggered from PMD but from
> application.
> > >> So does application doing polling continuously if device is dead?
> > >> Or if application is relying this patch to add a check in each API,
> > >> what happens if device removed during data processing, will app rely on
> asynchronous method?
> > >
> > > We cannot put a mutex on hardware removal :) So we have to live with
> > > errors due to removal.
> > > If we are trying to configure a removed device, the error will be
> > > not related to the root cause.
> > > This patch is just trying to improve the situation by returning an
> > > appropriate error code if removal can be detected.
> > > Note: the check is run only if there is an error and if the removal
> > > is not already detected.
> > >
> > > If I understand well, you prefer relying only on asynchronous
> > > hotplug events? Even if they come really late?
> >
> > I think asynchronous hotplug events are better approach, but if they
> > are late I understand you need a solution.
> >
> > I assume issue is when device is removed, application can make API
> > calls until it detects the removal.
> >
> > For that case what do you think instead of ethdev abstraction layer
> > doing a translation, define a specific error type and return it from
> > PMDs to indicate that error is related to the missing device and
> > application will know device is no more there? And consistently use that
> > error type in all PMDs.
> 
> Yes it is also a good solution.
> I think Matan proposed to integrate it in ethdev to avoid duplicating the
> same mechanism in every PMD.
> 

Yes, as a lot of ethdev API code pieces do.
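
As a rough illustration of why a single check in ethdev is cheap, here is a minimal self-contained sketch. All names (struct port, port_is_removed, port_err, the probes counter) are hypothetical stand-ins, not the real ethdev types; the only point is that the driver is queried when a call has already failed and the removal has not been recorded yet:

```c
#include <errno.h>

/* Hypothetical stand-in for a port; the real code uses rte_eth_devices[]. */
struct port {
	int removed_flag; /* cached "removal already detected" state */
	int hw_gone;      /* what a driver probe would report */
	int probes;       /* number of times the driver was queried */
};

/* Query the driver only if the removal is not already recorded. */
static int
port_is_removed(struct port *p)
{
	if (p->removed_flag)
		return 1;
	p->probes++;
	if (p->hw_gone)
		p->removed_flag = 1;
	return p->removed_flag;
}

/* Translate any failure on a removed port to -EIO; success is untouched. */
static int
port_err(struct port *p, int ret)
{
	if (ret == 0)
		return 0;
	if (port_is_removed(p))
		return -EIO;
	return ret;
}
```

In this sketch a successful call never touches the driver, a failed call costs one probe, and once the removal is recorded no further probes are made.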

> > Or what do you think of the flexible watchdog timer suggested above,
> > implemented in ethdev, so applications may be informed about a missing
> > device faster?
> 
> I think the watchdog would compete with hotplug events.
> So I prefer not integrating one more asynchronous mechanism for the same
> purpose.
> If we want polling, it can be an option to enable in EAL hotplug.

Konstantin wrote in another thread:
>+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
>+
>+	dev = &rte_eth_devices[port_id];
>+
>+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);

> I'd say these 2 checks have to be swapped.

Konstantin, Please explain why.


* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-20 19:04                         ` Matan Azrad
@ 2018-01-20 20:28                           ` Thomas Monjalon
  2018-01-20 20:45                             ` Matan Azrad
  0 siblings, 1 reply; 98+ messages in thread
From: Thomas Monjalon @ 2018-01-20 20:28 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ferruh Yigit, Ananyev, Konstantin, dev, Adrien Mazarguil,
	Gaetan Rivet, Andrew Rybchenko, Alejandro Lucero, Jerin Jacob,
	Hemant Agrawal, Shahaf Shuler, Olivier MATZ, Zhang, Helin

20/01/2018 20:04, Matan Azrad:
> Konstantin wrote in another thread:
> >+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> >+
> >+	dev = &rte_eth_devices[port_id];
> >+
> >+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> 
> > I'd say these 2 checks have to be swapped.
> 
> Konstantin, Please explain why.

I think he was talking about these 2 tests:

+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;


* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-20 20:28                           ` Thomas Monjalon
@ 2018-01-20 20:45                             ` Matan Azrad
  0 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 20:45 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ferruh Yigit, Ananyev, Konstantin, dev, Adrien Mazarguil,
	Gaetan Rivet, Andrew Rybchenko, Alejandro Lucero, Jerin Jacob,
	Hemant Agrawal, Shahaf Shuler, Olivier MATZ, Zhang, Helin

Hi Thomas
From: Thomas Monjalon, Saturday, January 20, 2018 10:29 PM
> 20/01/2018 20:04, Matan Azrad:
> > Konstantin wrote in another thread:
> > >+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > >+
> > >+	dev = &rte_eth_devices[port_id];
> > >+
> > >+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> >
> > > I'd say these 2 checks have to be swapped.
> >
> > Konstantin, Please explain why.
> 
> I think he was talking about these 2 tests:
> 
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> +	if (dev->state == RTE_ETH_DEV_REMOVED)
> +		return 1;

Ahh yes, it makes sense, I will swap them.
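
For reference, a minimal sketch of the swapped order, using hypothetical stand-in types (struct dev, struct dev_ops, dev_is_removed) rather than the real ethdev definitions. With the state test first, a port already marked as removed is reported as removed even if its driver has no is_removed op:

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real DPDK definitions. */
enum dev_state { DEV_UNUSED = 0, DEV_ATTACHED, DEV_REMOVED };

struct dev;

struct dev_ops {
	int (*is_removed)(struct dev *dev); /* NULL if the PMD lacks the op */
};

struct dev {
	enum dev_state state;
	const struct dev_ops *ops;
};

static int
dev_is_removed(struct dev *dev)
{
	int ret;

	/* Cached state first: it does not depend on driver support. */
	if (dev->state == DEV_REMOVED)
		return 1;

	/* Only now bail out if the driver cannot be asked. */
	if (dev->ops->is_removed == NULL)
		return 0;

	ret = dev->ops->is_removed(dev);
	if (ret != 0)
		dev->state = DEV_REMOVED; /* remember the physical removal */

	return ret;
}
```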

Thanks.


* [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack
  2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                             ` (5 preceding siblings ...)
  2018-01-18 11:27           ` [PATCH v6 6/6] net/failsafe: fix removed device handling Matan Azrad
@ 2018-01-20 21:12           ` Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 1/6] ethdev: add devop to check removal status Matan Azrad
                               ` (6 more replies)
  6 siblings, 7 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 21:12 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a delay between the physical removal of a device and the moment sub-device PMDs get an RMV interrupt.
During this window, DPDK PMDs and applications do not yet know about the removal and may call sub-device control operations, which should return an error.

This series adds a new ethdev operation to check for device removal, adds support for it in the mlx PMDs, adjusts the ethdev APIs to return -EIO in case of removal, and fixes the fail-safe bug in removal error reporting.
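
A minimal sketch of how an application might react to the new error value, assuming it keeps its own per-port record (struct app_port and app_handle_ctrl_ret are hypothetical names, not part of DPDK):

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical application-side port record; not a DPDK structure. */
struct app_port {
	bool usable; /* false once a removal has been observed */
};

/*
 * Classify the return value of an ethdev control call.
 * -EIO is taken to mean "device removed", as proposed in this series;
 * every other value is passed through unchanged.
 */
static int
app_handle_ctrl_ret(struct app_port *port, int ret)
{
	if (ret == -EIO)
		port->usable = false; /* stop issuing control operations */
	return ret;
}
```

An application could pass the return value of each control call through such a helper and skip further control operations on the port once usable is cleared, without waiting for the RMV event.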

V2:
Remove ENODEV definition.
Remove checks from all mlx control commands.
Add new devop - "is_removed".
Implement it in mlx4 and mlx5.
Fix failsafe bug by the new devop.

V3:
Adjust ethdev APIs removal error report.
Change failsafe check to check eth_dev* return values.
Remove backporting of fail-safe patch.

V4:
Improve fail-safe internal API to adjust the actual error value as discussed.
Remove "Fixes" lines from fail-safe patch.
No changes in ethdev\mlx patches.

V5:
Rebase on top of master-net-mlx. 

V6:
Move ethdev new API to be EXPERIMENTAL.

V7:
Fix API return value description.
Swap checks in the new API as Konstantin suggested.
Add comment in the API as Ferruh suggested.

Matan Azrad (6):
  ethdev: add devop to check removal status
  net/mlx4: support a device removal check operation
  net/mlx5: support a device removal check operation
  ethdev: adjust APIs removal error report
  ethdev: adjust flow APIs removal error report
  net/failsafe: fix removed device handling

 drivers/net/failsafe/failsafe_flow.c    |  18 ++-
 drivers/net/failsafe/failsafe_ops.c     |  35 +++--
 drivers/net/failsafe/failsafe_private.h |  11 ++
 drivers/net/mlx4/mlx4.c                 |   1 +
 drivers/net/mlx4/mlx4.h                 |   1 +
 drivers/net/mlx4/mlx4_ethdev.c          |  20 +++
 drivers/net/mlx5/mlx5.c                 |   2 +
 drivers/net/mlx5/mlx5.h                 |   1 +
 drivers/net/mlx5/mlx5_ethdev.c          |  20 +++
 lib/librte_ether/rte_ethdev.c           | 219 +++++++++++++++++++++-----------
 lib/librte_ether/rte_ethdev.h           |  71 ++++++++++-
 lib/librte_ether/rte_ethdev_version.map |   1 +
 lib/librte_ether/rte_flow.c             |  34 ++++-
 lib/librte_ether/rte_flow.h             |   2 +
 14 files changed, 336 insertions(+), 100 deletions(-)

-- 
1.8.3.1


* [PATCH v7 1/6] ethdev: add devop to check removal status
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
@ 2018-01-20 21:12             ` Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 2/6] net/mlx4: support a device removal check operation Matan Azrad
                               ` (5 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 21:12 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a delay between the physical removal of a device and the moment
PMDs get an RMV interrupt. During this window, DPDK PMDs and applications
do not yet know about the removal.

Currently, removal is detected only by registering for the device RMV
event, and the notification arrives asynchronously. So, there is no way
to detect a device removal synchronously.
Applications and other DPDK entities may want to check for a device
removal synchronously and make an immediate decision accordingly.

Add a new dev op called is_removed to allow DPDK entities to check the
removal status of an Ethernet device immediately.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_ethdev.c           | 29 ++++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.h           | 20 ++++++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  1 +
 3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index b349599..fd70d10 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -114,7 +114,8 @@ enum {
 rte_eth_find_next(uint16_t port_id)
 {
 	while (port_id < RTE_MAX_ETHPORTS &&
-	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
 		port_id++;
 
 	if (port_id >= RTE_MAX_ETHPORTS)
@@ -262,8 +263,7 @@ struct rte_eth_dev *
 rte_eth_dev_is_valid_port(uint16_t port_id)
 {
 	if (port_id >= RTE_MAX_ETHPORTS ||
-	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
-	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
+	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
 		return 0;
 	else
 		return 1;
@@ -1094,6 +1094,29 @@ struct rte_eth_dev *
 }
 
 int
+rte_eth_dev_is_removed(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	dev = &rte_eth_devices[port_id];
+
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
+
+	ret = dev->dev_ops->is_removed(dev);
+	if (ret != 0)
+		/* Device is physically removed. */
+		dev->state = RTE_ETH_DEV_REMOVED;
+
+	return ret;
+}
+
+int
 rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		       uint16_t nb_rx_desc, unsigned int socket_id,
 		       const struct rte_eth_rxconf *rx_conf,
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index f0eeefe..ed31a10 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1169,6 +1169,9 @@ struct rte_eth_dcb_info {
 typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
 /** <@internal Function used to reset a configured Ethernet device. */
 
+typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
+/**< @internal Function used to detect an Ethernet device removal. */
+
 typedef void (*eth_promiscuous_enable_t)(struct rte_eth_dev *dev);
 /**< @internal Function used to enable the RX promiscuous mode of an Ethernet device. */
 
@@ -1498,6 +1501,8 @@ struct eth_dev_ops {
 	eth_dev_close_t            dev_close;     /**< Close device. */
 	eth_dev_reset_t		   dev_reset;	  /**< Reset device. */
 	eth_link_update_t          link_update;   /**< Get device link state. */
+	eth_is_removed_t           is_removed;
+	/**< Check if the device was physically removed. */
 
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON. */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF. */
@@ -1684,6 +1689,7 @@ enum rte_eth_dev_state {
 	RTE_ETH_DEV_UNUSED = 0,
 	RTE_ETH_DEV_ATTACHED,
 	RTE_ETH_DEV_DEFERRED,
+	RTE_ETH_DEV_REMOVED,
 };
 
 /**
@@ -1970,6 +1976,20 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
 void _rte_eth_dev_reset(struct rte_eth_dev *dev);
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Check if an Ethernet device was physically removed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   1 when the Ethernet device is removed, otherwise 0.
+ */
+int
+rte_eth_dev_is_removed(uint16_t port_id);
+
+/**
  * Allocate and set up a receive queue for an Ethernet device.
  *
  * The function allocates a contiguous block of memory for *nb_rx_desc*
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..88b7908 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -201,6 +201,7 @@ DPDK_17.11 {
 EXPERIMENTAL {
 	global:
 
+	rte_eth_dev_is_removed;
 	rte_mtr_capabilities_get;
 	rte_mtr_create;
 	rte_mtr_destroy;
-- 
1.8.3.1


* [PATCH v7 2/6] net/mlx4: support a device removal check operation
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 1/6] ethdev: add devop to check removal status Matan Azrad
@ 2018-01-20 21:12             ` Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 3/6] net/mlx5: " Matan Azrad
                               ` (4 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 21:12 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

Add support for getting the removal status of an mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx4/mlx4.c        |  1 +
 drivers/net/mlx4/mlx4.h        |  1 +
 drivers/net/mlx4/mlx4_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 61c5bf4..703513e 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -256,6 +256,7 @@ struct mlx4_conf {
 	.filter_ctrl = mlx4_filter_ctrl,
 	.rx_queue_intr_enable = mlx4_rx_intr_enable,
 	.rx_queue_intr_disable = mlx4_rx_intr_disable,
+	.is_removed = mlx4_is_removed,
 };
 
 /**
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index 99dc335..2ab2988 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -171,6 +171,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
 int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
 		       struct rte_eth_fc_conf *fc_conf);
 const uint32_t *mlx4_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+int mlx4_is_removed(struct rte_eth_dev *dev);
 
 /* mlx4_intr.c */
 
diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
index c80eab5..5318b56 100644
--- a/drivers/net/mlx4/mlx4_ethdev.c
+++ b/drivers/net/mlx4/mlx4_ethdev.c
@@ -1052,3 +1052,23 @@ enum rxmode_toggle {
 	}
 	return NULL;
 }
+
+/**
+ * Check if mlx4 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx4_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1


* [PATCH v7 3/6] net/mlx5: support a device removal check operation
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 1/6] ethdev: add devop to check removal status Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 2/6] net/mlx4: support a device removal check operation Matan Azrad
@ 2018-01-20 21:12             ` Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 4/6] ethdev: adjust APIs removal error report Matan Azrad
                               ` (3 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 21:12 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

Add support for getting the removal status of an mlx5 device.
It is not supported in a secondary process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        |  2 ++
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1c95f35..c13a2d3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -284,6 +284,7 @@
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static const struct eth_dev_ops mlx5_dev_sec_ops = {
@@ -331,6 +332,7 @@
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.is_removed = mlx5_is_removed,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e740a4e..aaff180 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -237,6 +237,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
+int mlx5_is_removed(struct rte_eth_dev *dev);
 eth_tx_burst_t priv_select_tx_function(struct priv *, struct rte_eth_dev *);
 eth_rx_burst_t priv_select_rx_function(struct priv *, struct rte_eth_dev *);
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6f78adc..1c067ca 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1453,3 +1453,23 @@ struct ethtool_link_settings {
 	}
 	return rx_pkt_burst;
 }
+
+/**
+ * Check if mlx5 device was removed.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   1 when device is removed, otherwise 0.
+ */
+int
+mlx5_is_removed(struct rte_eth_dev *dev)
+{
+	struct ibv_device_attr device_attr;
+	struct priv *priv = dev->data->dev_private;
+
+	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
+		return 1;
+	return 0;
+}
-- 
1.8.3.1


* [PATCH v7 4/6] ethdev: adjust APIs removal error report
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                               ` (2 preceding siblings ...)
  2018-01-20 21:12             ` [PATCH v7 3/6] net/mlx5: " Matan Azrad
@ 2018-01-20 21:12             ` Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 5/6] ethdev: adjust flow " Matan Azrad
                               ` (2 subsequent siblings)
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 21:12 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

The rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during control command execution, many
different errors can be reported to the user.

Adjust the error reports of all ethdev APIs to return -EIO in case of
device removal, using the rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_ethdev.c | 192 +++++++++++++++++++++++++++---------------
 lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
 2 files changed, 170 insertions(+), 73 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fd70d10..c4ff1b0 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -338,6 +338,16 @@ struct rte_eth_dev *
 	return -ENODEV;
 }
 
+static int
+eth_err(uint16_t port_id, int ret)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return -EIO;
+	return ret;
+}
+
 /* attach the new device, then store port_id of the device */
 int
 rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
@@ -492,7 +502,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
+							     rx_queue_id));
 
 }
 
@@ -518,7 +529,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->rx_queue_stop(dev, rx_queue_id);
+	return eth_err(port_id, dev->dev_ops->rx_queue_stop(dev, rx_queue_id));
 
 }
 
@@ -544,7 +555,8 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_start(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_start(dev,
+							     tx_queue_id));
 
 }
 
@@ -570,7 +582,7 @@ struct rte_eth_dev *
 		return 0;
 	}
 
-	return dev->dev_ops->tx_queue_stop(dev, tx_queue_id);
+	return eth_err(port_id, dev->dev_ops->tx_queue_stop(dev, tx_queue_id));
 
 }
 
@@ -888,7 +900,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	/* Initialize Rx profiling if enabled at compilation time. */
@@ -898,7 +910,7 @@ struct rte_eth_dev *
 				port_id, diag);
 		rte_eth_dev_rx_queue_config(dev, 0);
 		rte_eth_dev_tx_queue_config(dev, 0);
-		return diag;
+		return eth_err(port_id, diag);
 	}
 
 	return 0;
@@ -998,7 +1010,7 @@ struct rte_eth_dev *
 	if (diag == 0)
 		dev->data->dev_started = 1;
 	else
-		return diag;
+		return eth_err(port_id, diag);
 
 	rte_eth_dev_config_restore(port_id);
 
@@ -1040,7 +1052,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_up, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_up)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_up)(dev));
 }
 
 int
@@ -1053,7 +1065,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_down, -ENOTSUP);
-	return (*dev->dev_ops->dev_set_link_down)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_set_link_down)(dev));
 }
 
 void
@@ -1090,7 +1102,7 @@ struct rte_eth_dev *
 	rte_eth_dev_stop(port_id);
 	ret = dev->dev_ops->dev_reset(dev);
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -1215,7 +1227,7 @@ struct rte_eth_dev *
 			dev->data->min_rx_buf_size = mbp_buf_size;
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 /**
@@ -1334,8 +1346,8 @@ struct rte_eth_dev *
 					  &local_conf.offloads);
 	}
 
-	return (*dev->dev_ops->tx_queue_setup)(dev, tx_queue_id, nb_tx_desc,
-					       socket_id, &local_conf);
+	return eth_err(port_id, (*dev->dev_ops->tx_queue_setup)(dev,
+		       tx_queue_id, nb_tx_desc, socket_id, &local_conf));
 }
 
 void
@@ -1391,14 +1403,16 @@ struct rte_eth_dev *
 rte_eth_tx_done_cleanup(uint16_t port_id, uint16_t queue_id, uint32_t free_cnt)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	int ret;
 
 	/* Validate Input Data. Bail if not valid or not supported. */
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
 
 	/* Call driver to free pending mbufs. */
-	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
-			free_cnt);
+	ret = (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
+					       free_cnt);
+	return eth_err(port_id, ret);
 }
 
 void
@@ -1535,7 +1549,7 @@ struct rte_eth_dev *
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
 	stats->rx_nombuf = dev->data->rx_mbuf_alloc_failed;
-	return (*dev->dev_ops->stats_get)(dev, stats);
+	return eth_err(port_id, (*dev->dev_ops->stats_get)(dev, stats));
 }
 
 int
@@ -1581,12 +1595,12 @@ struct rte_eth_dev *
 		count = (*dev->dev_ops->xstats_get_names_by_id)(dev, NULL,
 				NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	}
 	if (dev->dev_ops->xstats_get_names != NULL) {
 		count = (*dev->dev_ops->xstats_get_names)(dev, NULL, 0);
 		if (count < 0)
-			return count;
+			return eth_err(port_id, count);
 	} else
 		count = 0;
 
@@ -1766,8 +1780,12 @@ struct rte_eth_dev *
 	if (ids && no_ext_stat_requested) {
 		rte_eth_basic_stats_get_names(dev, xstats_names_copy);
 	} else {
-		rte_eth_xstats_get_names(port_id, xstats_names_copy,
+		ret = rte_eth_xstats_get_names(port_id, xstats_names_copy,
 			expected_entries);
+		if (ret < 0) {
+			free(xstats_names_copy);
+			return ret;
+		}
 	}
 
 	/* Filter stats */
@@ -1814,7 +1832,7 @@ struct rte_eth_dev *
 			xstats_names + cnt_used_entries,
 			size - cnt_used_entries);
 		if (cnt_driver_entries < 0)
-			return cnt_driver_entries;
+			return eth_err(port_id, cnt_driver_entries);
 		cnt_used_entries += cnt_driver_entries;
 	}
 
@@ -1830,8 +1848,12 @@ struct rte_eth_dev *
 	unsigned int count = 0, i, q;
 	uint64_t val, *stats_ptr;
 	uint16_t nb_rxqs, nb_txqs;
+	int ret;
+
+	ret = rte_eth_stats_get(port_id, &eth_stats);
+	if (ret < 0)
+		return ret;
 
-	rte_eth_stats_get(port_id, &eth_stats);
 	dev = &rte_eth_devices[port_id];
 
 	nb_rxqs = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
@@ -1884,7 +1906,10 @@ struct rte_eth_dev *
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
-	expected_entries = get_xstats_count(port_id);
+	ret = get_xstats_count(port_id);
+	if (ret < 0)
+		return ret;
+	expected_entries = (uint16_t)ret;
 	struct rte_eth_xstat xstats[expected_entries];
 	dev = &rte_eth_devices[port_id];
 	basic_count = get_xstats_basic_count(dev);
@@ -1967,6 +1992,7 @@ struct rte_eth_dev *
 	unsigned int count = 0, i;
 	signed int xcount = 0;
 	uint16_t nb_rxqs, nb_txqs;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -1989,14 +2015,17 @@ struct rte_eth_dev *
 				     (n > count) ? n - count : 0);
 
 		if (xcount < 0)
-			return xcount;
+			return eth_err(port_id, xcount);
 	}
 
 	if (n < count + xcount || xstats == NULL)
 		return count + xcount;
 
 	/* now fill the xstats structure */
-	count = rte_eth_basic_stats_get(port_id, xstats);
+	ret = rte_eth_basic_stats_get(port_id, xstats);
+	if (ret < 0)
+		return ret;
+	count = ret;
 
 	for (i = 0; i < count; i++)
 		xstats[i].id = i;
@@ -2046,8 +2075,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_tx_queue_stats_mapping(uint16_t port_id, uint16_t tx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, tx_queue_id, stat_idx,
-			STAT_QMAP_TX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, tx_queue_id,
+						stat_idx, STAT_QMAP_TX));
 }
 
 
@@ -2055,8 +2084,8 @@ struct rte_eth_dev *
 rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id, uint16_t rx_queue_id,
 		uint8_t stat_idx)
 {
-	return set_queue_stats_mapping(port_id, rx_queue_id, stat_idx,
-			STAT_QMAP_RX);
+	return eth_err(port_id, set_queue_stats_mapping(port_id, rx_queue_id,
+						stat_idx, STAT_QMAP_RX));
 }
 
 int
@@ -2068,7 +2097,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->fw_version_get, -ENOTSUP);
-	return (*dev->dev_ops->fw_version_get)(dev, fw_version, fw_size);
+	return eth_err(port_id, (*dev->dev_ops->fw_version_get)(dev,
+							fw_version, fw_size));
 }
 
 void
@@ -2158,7 +2188,7 @@ struct rte_eth_dev *
 	if (!ret)
 		dev->data->mtu = mtu;
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2198,7 +2228,7 @@ struct rte_eth_dev *
 			vfc->ids[vidx] &= ~(UINT64_C(1) << vbit);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2231,7 +2261,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_tpid_set, -ENOTSUP);
 
-	return (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type, tpid);
+	return eth_err(port_id, (*dev->dev_ops->vlan_tpid_set)(dev, vlan_type,
+							       tpid));
 }
 
 int
@@ -2309,7 +2340,7 @@ struct rte_eth_dev *
 					    &dev->data->dev_conf.rxmode);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2344,9 +2375,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vlan_pvid_set, -ENOTSUP);
-	(*dev->dev_ops->vlan_pvid_set)(dev, pvid, on);
 
-	return 0;
+	return eth_err(port_id, (*dev->dev_ops->vlan_pvid_set)(dev, pvid, on));
 }
 
 int
@@ -2358,7 +2388,7 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_get, -ENOTSUP);
 	memset(fc_conf, 0, sizeof(*fc_conf));
-	return (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_get)(dev, fc_conf));
 }
 
 int
@@ -2374,7 +2404,7 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->flow_ctrl_set, -ENOTSUP);
-	return (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf);
+	return eth_err(port_id, (*dev->dev_ops->flow_ctrl_set)(dev, fc_conf));
 }
 
 int
@@ -2392,7 +2422,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 	/* High water, low water validation are device specific */
 	if  (*dev->dev_ops->priority_flow_ctrl_set)
-		return (*dev->dev_ops->priority_flow_ctrl_set)(dev, pfc_conf);
+		return eth_err(port_id, (*dev->dev_ops->priority_flow_ctrl_set)
+					(dev, pfc_conf));
 	return -ENOTSUP;
 }
 
@@ -2467,7 +2498,8 @@ struct rte_eth_dev *
 		return ret;
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_update, -ENOTSUP);
-	return (*dev->dev_ops->reta_update)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_update)(dev, reta_conf,
+							     reta_size));
 }
 
 int
@@ -2487,7 +2519,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_query, -ENOTSUP);
-	return (*dev->dev_ops->reta_query)(dev, reta_conf, reta_size);
+	return eth_err(port_id, (*dev->dev_ops->reta_query)(dev, reta_conf,
+							    reta_size));
 }
 
 int
@@ -2499,7 +2532,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_update, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_update)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_update)(dev,
+								 rss_conf));
 }
 
 int
@@ -2511,7 +2545,8 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_conf_get, -ENOTSUP);
-	return (*dev->dev_ops->rss_hash_conf_get)(dev, rss_conf);
+	return eth_err(port_id, (*dev->dev_ops->rss_hash_conf_get)(dev,
+								   rss_conf));
 }
 
 int
@@ -2533,7 +2568,8 @@ struct rte_eth_dev *
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_add, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_add)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_add)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2556,7 +2592,8 @@ struct rte_eth_dev *
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_port_del, -ENOTSUP);
-	return (*dev->dev_ops->udp_tunnel_port_del)(dev, udp_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->udp_tunnel_port_del)(dev,
+								udp_tunnel));
 }
 
 int
@@ -2567,7 +2604,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_on, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_on)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_on)(dev));
 }
 
 int
@@ -2578,7 +2615,7 @@ struct rte_eth_dev *
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_off, -ENOTSUP);
-	return (*dev->dev_ops->dev_led_off)(dev);
+	return eth_err(port_id, (*dev->dev_ops->dev_led_off)(dev));
 }
 
 /*
@@ -2654,7 +2691,7 @@ struct rte_eth_dev *
 		dev->data->mac_pool_sel[index] |= (1ULL << pool);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2780,7 +2817,7 @@ struct rte_eth_dev *
 					&dev->data->hash_mac_addrs[index]);
 	}
 
-	return ret;
+	return eth_err(port_id, ret);
 }
 
 int
@@ -2793,7 +2830,8 @@ struct rte_eth_dev *
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->uc_all_hash_table_set, -ENOTSUP);
-	return (*dev->dev_ops->uc_all_hash_table_set)(dev, on);
+	return eth_err(port_id, (*dev->dev_ops->uc_all_hash_table_set)(dev,
+								       on));
 }
 
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -2823,7 +2861,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	}
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_queue_rate_limit, -ENOTSUP);
-	return (*dev->dev_ops->set_queue_rate_limit)(dev, queue_idx, tx_rate);
+	return eth_err(port_id, (*dev->dev_ops->set_queue_rate_limit)(dev,
+							queue_idx, tx_rate));
 }
 
 int
@@ -2861,7 +2900,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_set, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_set)(dev, mirror_conf, rule_id, on);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_set)(dev,
+						mirror_conf, rule_id, on));
 }
 
 int
@@ -2874,7 +2914,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mirror_rule_reset, -ENOTSUP);
 
-	return (*dev->dev_ops->mirror_rule_reset)(dev, rule_id);
+	return eth_err(port_id, (*dev->dev_ops->mirror_rule_reset)(dev,
+								   rule_id));
 }
 
 RTE_INIT(eth_dev_init_cb_lists)
@@ -3138,7 +3179,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_enable)(dev,
+								queue_id));
 }
 
 int
@@ -3152,7 +3194,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
-	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+	return eth_err(port_id, (*dev->dev_ops->rx_queue_intr_disable)(dev,
+								queue_id));
 }
 
 
@@ -3180,7 +3223,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);
-	return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op, arg);
+	return eth_err(port_id, (*dev->dev_ops->filter_ctrl)(dev, filter_type,
+							     filter_op, arg));
 }
 
 void *
@@ -3430,7 +3474,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_mc_addr_list, -ENOTSUP);
-	return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+	return eth_err(port_id, dev->dev_ops->set_mc_addr_list(dev,
+						mc_addr_set, nb_mc_addr));
 }
 
 int
@@ -3442,7 +3487,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_enable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_enable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_enable)(dev));
 }
 
 int
@@ -3454,7 +3499,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_disable, -ENOTSUP);
-	return (*dev->dev_ops->timesync_disable)(dev);
+	return eth_err(port_id, (*dev->dev_ops->timesync_disable)(dev));
 }
 
 int
@@ -3467,7 +3512,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_rx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_rx_timestamp)(dev, timestamp, flags);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_rx_timestamp)
+				(dev, timestamp, flags));
 }
 
 int
@@ -3480,7 +3526,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_tx_timestamp, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_tx_timestamp)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_tx_timestamp)
+				(dev, timestamp));
 }
 
 int
@@ -3492,7 +3539,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_adjust_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_adjust_time)(dev, delta);
+	return eth_err(port_id, (*dev->dev_ops->timesync_adjust_time)(dev,
+								      delta));
 }
 
 int
@@ -3504,7 +3552,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_read_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_read_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_read_time)(dev,
+								timestamp));
 }
 
 int
@@ -3516,7 +3565,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->timesync_write_time, -ENOTSUP);
-	return (*dev->dev_ops->timesync_write_time)(dev, timestamp);
+	return eth_err(port_id, (*dev->dev_ops->timesync_write_time)(dev,
+								timestamp));
 }
 
 int
@@ -3528,7 +3578,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_reg, -ENOTSUP);
-	return (*dev->dev_ops->get_reg)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_reg)(dev, info));
 }
 
 int
@@ -3540,7 +3590,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom_length, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom_length)(dev);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom_length)(dev));
 }
 
 int
@@ -3552,7 +3602,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->get_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->get_eeprom)(dev, info));
 }
 
 int
@@ -3564,7 +3614,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_eeprom, -ENOTSUP);
-	return (*dev->dev_ops->set_eeprom)(dev, info);
+	return eth_err(port_id, (*dev->dev_ops->set_eeprom)(dev, info));
 }
 
 int
@@ -3579,7 +3629,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	memset(dcb_info, 0, sizeof(struct rte_eth_dcb_info));
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_dcb_info, -ENOTSUP);
-	return (*dev->dev_ops->get_dcb_info)(dev, dcb_info);
+	return eth_err(port_id, (*dev->dev_ops->get_dcb_info)(dev, dcb_info));
 }
 
 int
@@ -3602,7 +3652,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_eth_type_conf,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev, l2_tunnel);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_eth_type_conf)(dev,
+								l2_tunnel));
 }
 
 int
@@ -3633,7 +3684,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
 	dev = &rte_eth_devices[port_id];
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->l2_tunnel_offload_set,
 				-ENOTSUP);
-	return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
+	return eth_err(port_id, (*dev->dev_ops->l2_tunnel_offload_set)(dev,
+							l2_tunnel, mask, en));
 }
 
 static void
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index ed31a10..084eeeb 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2023,6 +2023,7 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
  *   memory buffers to populate each descriptor of the receive ring.
  * @return
  *   - 0: Success, receive queue correctly set up.
+ *   - -EIO: if device is removed.
  *   - -EINVAL: The size of network buffers which can be allocated from the
  *      memory pool does not fit the various buffer sizes allowed by the
  *      device controller.
@@ -2123,6 +2124,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_start(uint16_t port_id, uint16_t rx_queue_id);
@@ -2139,6 +2141,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the receive queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_rx_queue_stop(uint16_t port_id, uint16_t rx_queue_id);
@@ -2156,6 +2159,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is started.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_start(uint16_t port_id, uint16_t tx_queue_id);
@@ -2172,6 +2176,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  * @return
  *   - 0: Success, the transmit queue is stopped.
  *   - -EINVAL: The port_id or the queue_id out of range.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */
 int rte_eth_dev_tx_queue_stop(uint16_t port_id, uint16_t tx_queue_id);
@@ -2273,7 +2278,7 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
  *   - (-EINVAL) if port identifier is invalid.
  *   - (-ENOTSUP) if hardware doesn't support this function.
  *   - (-EPERM) if not ran from the primary process.
- *   - (-EIO) if re-initialisation failed.
+ *   - (-EIO) if re-initialisation failed or device is removed.
  *   - (-ENOMEM) if the reset failed due to OOM.
  *   - (-EAGAIN) if the reset temporarily failed and should be retried later.
  */
@@ -2509,6 +2514,7 @@ int rte_eth_xstats_get_by_id(uint16_t port_id, const uint64_t *ids,
  * @return
  *    0 on success
  *    -ENODEV for invalid port_id,
+ *    -EIO if device is removed,
  *    -EINVAL if the xstat_name doesn't exist in port_id
  */
 int rte_eth_xstats_get_id_by_name(uint16_t port_id, const char *xstat_name,
@@ -2600,6 +2606,7 @@ int rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (>0) if *fw_size* is not enough to store firmware version, return
  *          the size of the non truncated string.
  */
@@ -2671,6 +2678,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOTSUP) if operation is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if *mtu* invalid.
  *   - (-EBUSY) if operation is not allowed when the port is running
  */
@@ -2691,6 +2699,7 @@ int rte_eth_dev_get_supported_ptypes(uint16_t port_id, uint32_t ptype_mask,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSYS) if VLAN filtering on *port_id* disabled.
  *   - (-EINVAL) if *vlan_id* > 4095.
  */
@@ -2733,6 +2742,7 @@ int rte_eth_dev_set_vlan_strip_on_queue(uint16_t port_id, uint16_t rx_queue_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN TPID setup is not supported.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
 				    enum rte_vlan_type vlan_type,
@@ -2757,6 +2767,7 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOSUP) if hardware-assisted VLAN filtering not configured.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_set_vlan_offload(uint16_t port_id, int offload_mask);
 
@@ -3498,6 +3509,7 @@ struct rte_eth_dev_tx_buffer {
  * @return
  *   Failure: < 0
  *     -ENODEV: Invalid interface
+ *     -EIO: device is removed
  *     -ENOTSUP: Driver does not support function
  *   Success: >= 0
  *     0-n: Number of packets freed. More packets may still remain in ring that
@@ -3612,6 +3624,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_enable(uint16_t port_id, uint16_t queue_id);
 
@@ -3633,6 +3646,7 @@ int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rx_intr_disable(uint16_t port_id, uint16_t queue_id);
 
@@ -3690,6 +3704,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_on(uint16_t port_id);
 
@@ -3704,6 +3719,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
  *     that operation.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int  rte_eth_led_off(uint16_t port_id);
 
@@ -3718,6 +3734,7 @@ int rte_eth_dev_rx_intr_ctl_q(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support flow control.
  *   - (-ENODEV)  if *port_id* invalid.
+ *   - (-EIO)  if device is removed.
  */
 int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3734,7 +3751,7 @@ int rte_eth_dev_flow_ctrl_get(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
 			      struct rte_eth_fc_conf *fc_conf);
@@ -3752,7 +3769,7 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id,
  *   - (-ENOTSUP) if hardware doesn't support priority flow control mode.
  *   - (-ENODEV)  if *port_id* invalid.
  *   - (-EINVAL)  if bad parameter
- *   - (-EIO)     if flow control setup failure
+ *   - (-EIO)     if flow control setup failure or device is removed.
  */
 int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
 				struct rte_eth_pfc_conf *pfc_conf);
@@ -3772,6 +3789,7 @@ int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
  *   - (0) if successfully added or *mac_addr" was already added.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port* is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOSPC) if no more MAC addresses can be added.
  *   - (-EINVAL) if MAC address is invalid.
  */
@@ -3823,6 +3841,7 @@ int rte_eth_dev_default_mac_addr_set(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_update(uint16_t port,
 				struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3842,6 +3861,7 @@ int rte_eth_dev_rss_reta_update(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_rss_reta_query(uint16_t port,
 			       struct rte_eth_rss_reta_entry64 *reta_conf,
@@ -3863,6 +3883,7 @@ int rte_eth_dev_rss_reta_query(uint16_t port,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
@@ -3883,6 +3904,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
   *  - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_dev_uc_all_hash_table_set(uint16_t port, uint8_t on);
@@ -3906,6 +3928,7 @@ int rte_eth_dev_uc_hash_table_set(uint16_t port, struct ether_addr *addr,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if the mr_conf information is not correct.
  */
 int rte_eth_mirror_rule_set(uint16_t port_id,
@@ -3924,6 +3947,7 @@ int rte_eth_mirror_rule_set(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_mirror_rule_reset(uint16_t port_id,
@@ -3942,6 +3966,7 @@ int rte_eth_mirror_rule_reset(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
@@ -3957,6 +3982,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-EINVAL) if bad parameter.
  */
@@ -3974,6 +4000,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support RSS.
  */
 int
@@ -3995,6 +4022,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4017,6 +4045,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4035,6 +4064,7 @@ int rte_eth_dev_rss_hash_update(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this filter type.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  */
 int rte_eth_dev_filter_supported(uint16_t port_id,
 		enum rte_filter_type filter_type);
@@ -4055,6 +4085,7 @@ int rte_eth_dev_filter_supported(uint16_t port_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
@@ -4070,6 +4101,7 @@ int rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support.
  */
 int rte_eth_dev_get_dcb_info(uint16_t port_id,
@@ -4277,6 +4309,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info);
@@ -4290,6 +4323,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (>=0) EEPROM size if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom_length(uint16_t port_id);
@@ -4306,6 +4340,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_get_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4322,6 +4357,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - others depends on the specific operations implementation.
  */
 int rte_eth_dev_set_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
@@ -4340,6 +4376,7 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t queue_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if PMD of *port_id* doesn't support multicast filtering.
  *   - (-ENOSPC) if *port_id* has not enough multicast filtering resources.
  */
@@ -4356,6 +4393,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_enable(uint16_t port_id);
@@ -4369,6 +4407,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_disable(uint16_t port_id);
@@ -4388,6 +4427,7 @@ int rte_eth_dev_set_mc_addr_list(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
@@ -4405,6 +4445,7 @@ int rte_eth_timesync_read_rx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
@@ -4424,6 +4465,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - 0: Success.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_adjust_time(uint16_t port_id, int64_t delta);
@@ -4459,6 +4501,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  *   - 0: Success.
  *   - -EINVAL: No timestamp is available.
  *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
  *   - -ENOTSUP: The function is not supported by the Ethernet driver.
  */
 int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
@@ -4499,6 +4542,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
@@ -4526,6 +4570,7 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if port identifier is invalid.
+ *   - (-EIO) if device is removed.
  *   - (-ENOTSUP) if hardware doesn't support tunnel type.
  */
 int
-- 
1.8.3.1


* [PATCH v7 5/6] ethdev: adjust flow APIs removal error report
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                               ` (3 preceding siblings ...)
  2018-01-20 21:12             ` [PATCH v7 4/6] ethdev: adjust APIs removal error report Matan Azrad
@ 2018-01-20 21:12             ` Matan Azrad
  2018-01-20 21:12             ` [PATCH v7 6/6] net/failsafe: fix removed device handling Matan Azrad
  2018-01-21 20:28             ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Ferruh Yigit
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 21:12 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

The rte_eth_dev_is_removed() API was added to detect a device removal
synchronously.

When a device removal occurs during flow command execution, many
different errors can be reported to the user.

Adjust the error reports of all flow APIs to return -EIO in case of
device removal, using the rte_eth_dev_is_removed() API.
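
As an illustration only, the error mapping described above can be
sketched in standalone C. The names mock_removed, mock_dev_is_removed
and flow_err_sketch are hypothetical stand-ins and not DPDK symbols;
the helper in this patch calls rte_eth_dev_is_removed() and
rte_flow_error_set() instead.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for device state; real code asks the PMD. */
static bool mock_removed[4];

/* Hypothetical stand-in for rte_eth_dev_is_removed(). */
static int
mock_dev_is_removed(uint16_t port_id)
{
	return port_id < 4 && mock_removed[port_id];
}

/*
 * Success passes through untouched; a failure on a removed port is
 * rewritten to -EIO; any other failure keeps its original code.
 */
static int
flow_err_sketch(uint16_t port_id, int ret)
{
	if (ret == 0)
		return 0;
	if (mock_dev_is_removed(port_id))
		return -EIO;
	return ret;
}
```

The removal check runs only on the failure path, so successful flow
calls pay no extra cost for it.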

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ether/rte_flow.c | 34 +++++++++++++++++++++++++++-------
 lib/librte_ether/rte_flow.h |  2 ++
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
index 913d1a5..a86bfbd 100644
--- a/lib/librte_ether/rte_flow.c
+++ b/lib/librte_ether/rte_flow.c
@@ -107,6 +107,18 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
 };
 
+static int
+flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
+{
+	if (ret == 0)
+		return 0;
+	if (rte_eth_dev_is_removed(port_id))
+		return rte_flow_error_set(error, EIO,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(EIO));
+	return ret;
+}
+
 /* Get generic flow operations structure from a port. */
 const struct rte_flow_ops *
 rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
@@ -145,7 +157,8 @@ struct rte_flow_desc_data {
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->validate))
-		return ops->validate(dev, attr, pattern, actions, error);
+		return flow_err(port_id, ops->validate(dev, attr, pattern,
+						       actions, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -160,12 +173,17 @@ struct rte_flow *
 		struct rte_flow_error *error)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	struct rte_flow *flow;
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
 
 	if (unlikely(!ops))
 		return NULL;
-	if (likely(!!ops->create))
-		return ops->create(dev, attr, pattern, actions, error);
+	if (likely(!!ops->create)) {
+		flow = ops->create(dev, attr, pattern, actions, error);
+		if (flow == NULL)
+			flow_err(port_id, -rte_errno, error);
+		return flow;
+	}
 	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 			   NULL, rte_strerror(ENOSYS));
 	return NULL;
@@ -183,7 +201,8 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->destroy))
-		return ops->destroy(dev, flow, error);
+		return flow_err(port_id, ops->destroy(dev, flow, error),
+				error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -200,7 +219,7 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return -rte_errno;
 	if (likely(!!ops->flush))
-		return ops->flush(dev, error);
+		return flow_err(port_id, ops->flush(dev, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -220,7 +239,8 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->query))
-		return ops->query(dev, flow, action, data, error);
+		return flow_err(port_id, ops->query(dev, flow, action, data,
+						    error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -238,7 +258,7 @@ struct rte_flow *
 	if (!ops)
 		return -rte_errno;
 	if (likely(!!ops->isolate))
-		return ops->isolate(dev, set, error);
+		return flow_err(port_id, ops->isolate(dev, set, error), error);
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index e0402cf..07ec217 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -1267,6 +1267,8 @@ struct rte_flow_error {
  *
  *   -ENOSYS: underlying device does not support this functionality.
  *
+ *   -EIO: underlying device is removed.
+ *
  *   -EINVAL: unknown or invalid rule specification.
  *
  *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
-- 
1.8.3.1


* [PATCH v7 6/6] net/failsafe: fix removed device handling
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                               ` (4 preceding siblings ...)
  2018-01-20 21:12             ` [PATCH v7 5/6] ethdev: adjust flow " Matan Azrad
@ 2018-01-20 21:12             ` Matan Azrad
  2018-01-21 20:28             ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Ferruh Yigit
  6 siblings, 0 replies; 98+ messages in thread
From: Matan Azrad @ 2018-01-20 21:12 UTC (permalink / raw)
  To: Ferruh Yigit, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

There is a delay between the physical removal of a device and the
moment sub-device PMDs get an RMV interrupt. During this window DPDK
PMDs and applications do not yet know about the removal and may call
sub-device control operations, which should return an error.

In the previous code this error was reported to the application,
contrary to the fail-safe principle that the application should not be
aware of device removal.

Add a removal check to each relevant control command error flow and
prevent an error report to the application when the sub-device is
removed.
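
The body of the fs_err() helper used in the hunks below is not shown in
this part of the patch. A minimal sketch of the check it is assumed to
perform follows; struct sub_device_sketch and fs_err_sketch are
hypothetical names, and the exact condition is an assumption based on
the description above.

```c
#include <errno.h>

/* Hypothetical, reduced stand-in for the fail-safe sub-device state. */
struct sub_device_sketch {
	int remove; /* set to 1 once a removal has been noticed */
};

/*
 * Assumed shape of the check: an error from a sub-device that is
 * flagged as removed, or that reports -EIO, is swallowed and becomes
 * success so the application never sees it. Other errors pass through.
 */
static int
fs_err_sketch(const struct sub_device_sketch *sdev, int err)
{
	if (sdev->remove == 1 || err == -EIO)
		return 0;
	return err;
}
```

Under that assumption, a call site written as
"if ((ret = fs_err(sdev, ret)))" enters its error branch only for
failures that are unrelated to a removal.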

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
 drivers/net/failsafe/failsafe_flow.c    | 18 ++++++++++-------
 drivers/net/failsafe/failsafe_ops.c     | 35 ++++++++++++++++++++++-----------
 drivers/net/failsafe/failsafe_private.h | 11 +++++++++++
 3 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
index 153ceee..c072d1e 100644
--- a/drivers/net/failsafe/failsafe_flow.c
+++ b/drivers/net/failsafe/failsafe_flow.c
@@ -87,7 +87,7 @@
 		DEBUG("Calling rte_flow_validate on sub_device %d", i);
 		ret = rte_flow_validate(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_validate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -111,7 +111,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		flow->flows[i] = rte_flow_create(PORT_ID(sdev),
 				attr, patterns, actions, error);
-		if (flow->flows[i] == NULL) {
+		if (flow->flows[i] == NULL && fs_err(sdev, -rte_errno)) {
 			ERROR("Failed to create flow on sub_device %d",
 				i);
 			goto err;
@@ -150,7 +150,7 @@
 			continue;
 		local_ret = rte_flow_destroy(PORT_ID(sdev),
 				flow->flows[i], error);
-		if (local_ret) {
+		if ((local_ret = fs_err(sdev, local_ret))) {
 			ERROR("Failed to destroy flow on sub_device %d: %d",
 					i, local_ret);
 			if (ret == 0)
@@ -175,7 +175,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_flow_flush on sub_device %d", i);
 		ret = rte_flow_flush(PORT_ID(sdev), error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_flush failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -199,8 +199,12 @@
 
 	sdev = TX_SUBDEV(dev);
 	if (sdev != NULL) {
-		return rte_flow_query(PORT_ID(sdev),
-				flow->flows[SUB_ID(sdev)], type, arg, error);
+		int ret = rte_flow_query(PORT_ID(sdev),
+					 flow->flows[SUB_ID(sdev)],
+					 type, arg, error);
+
+		if ((ret = fs_err(sdev, ret)))
+			return ret;
 	}
 	WARN("No active sub_device to query about its flow");
 	return -1;
@@ -223,7 +227,7 @@
 			WARN("flow isolation mode of sub_device %d in incoherent state.",
 				i);
 		ret = rte_flow_isolate(PORT_ID(sdev), set, error);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_flow_isolate failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index fe957ad..0976745 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -121,6 +121,8 @@
 					dev->data->nb_tx_queues,
 					&dev->data->dev_conf);
 		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
 			ERROR("Could not configure sub_device %d", i);
 			return ret;
 		}
@@ -163,8 +165,11 @@
 			continue;
 		DEBUG("Starting sub_device %d", i);
 		ret = rte_eth_dev_start(PORT_ID(sdev));
-		if (ret)
+		if (ret) {
+			if (!fs_err(sdev, ret))
+				continue;
 			return ret;
+		}
 		sdev->state = DEV_STARTED;
 	}
 	if (PRIV(dev)->state < DEV_STARTED)
@@ -196,7 +201,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
 		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -215,7 +220,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
 		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -301,7 +306,7 @@
 				rx_queue_id,
 				nb_rx_desc, socket_id,
 				rx_conf, mb_pool);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("RX queue setup failed for sub_device %d", i);
 			goto free_rxq;
 		}
@@ -367,7 +372,7 @@
 				tx_queue_id,
 				nb_tx_desc, socket_id,
 				tx_conf);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("TX queue setup failed for sub_device %d", i);
 			goto free_txq;
 		}
@@ -446,7 +451,8 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling link_update on sub_device %d", i);
 		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1) {
+		if (ret && ret != -1 && sdev->remove == 0 &&
+		    rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
 			ERROR("Link update failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -470,6 +476,7 @@
 fs_stats_get(struct rte_eth_dev *dev,
 	     struct rte_eth_stats *stats)
 {
+	struct rte_eth_stats backup;
 	struct sub_device *sdev;
 	uint8_t i;
 	int ret;
@@ -479,14 +486,20 @@
 		struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats;
 		uint64_t *timestamp = &sdev->stats_snapshot.timestamp;
 
+		rte_memcpy(&backup, snapshot, sizeof(backup));
 		ret = rte_eth_stats_get(PORT_ID(sdev), snapshot);
 		if (ret) {
+			if (!fs_err(sdev, ret)) {
+				rte_memcpy(snapshot, &backup, sizeof(backup));
+				goto inc;
+			}
 			ERROR("Operation rte_eth_stats_get failed for sub_device %d with error %d",
 				  i, ret);
 			*timestamp = 0;
 			return ret;
 		}
 		*timestamp = rte_rdtsc();
+inc:
 		failsafe_stats_increment(stats, snapshot);
 	}
 	return 0;
@@ -599,7 +612,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
 		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d with error %d",
 			      i, ret);
 			return ret;
@@ -618,7 +631,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
 		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -652,7 +665,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
 		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
@@ -689,7 +702,7 @@
 	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
 			      PRIu8 " with error %d", i, ret);
 			return ret;
@@ -731,7 +744,7 @@
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
 		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
-		if (ret) {
+		if ((ret = fs_err(sdev, ret))) {
 			ERROR("Operation rte_eth_dev_filter_ctrl failed for sub_device %d"
 			      " with error %d", i, ret);
 			return ret;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 9fcf72e..4916365 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -381,4 +381,15 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
 	rte_wmb();
 }
 
+/*
+ * Adjust error value and rte_errno to the fail-safe actual error value.
+ */
+static inline int
+fs_err(struct sub_device *sdev, int err)
+{
+	/* A device removal shouldn't be reported as an error. */
+	if (sdev->remove == 1 || err == -EIO)
+		return rte_errno = 0;
+	return err;
+}
 #endif /* _RTE_ETH_FAILSAFE_PRIVATE_H_ */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread
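
The error-masking rule that fs_err() introduces in the patch above can be
tried outside DPDK with a small stand-alone sketch. Everything prefixed with
mock_ is a hypothetical stand-in and not part of the patch: mock_rte_errno
replaces rte_errno, and struct mock_sub_device keeps only the removal flag.
The decision itself mirrors the patch: an error is dropped either when the
sub-device is already flagged for removal or when the lower layer returned
-EIO, the value the ethdev patches in this series use to signal removal.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for DPDK's rte_errno. */
static int mock_rte_errno;

/* Hypothetical, reduced sub-device: only the removal flag is modelled. */
struct mock_sub_device {
	unsigned int remove;
};

/*
 * Same decision as fs_err() in the patch: a removed sub-device, or an
 * -EIO result, is not an error from the fail-safe point of view.
 */
static int
mock_fs_err(const struct mock_sub_device *sdev, int err)
{
	if (sdev->remove == 1 || err == -EIO) {
		mock_rte_errno = 0;
		return 0;
	}
	return err;
}

/* Usage example covering the three outcomes. */
static void
mock_fs_err_example(void)
{
	struct mock_sub_device gone = { .remove = 1 };
	struct mock_sub_device live = { .remove = 0 };

	/* Flagged for removal: any error is masked. */
	assert(mock_fs_err(&gone, -EINVAL) == 0);
	/* Not flagged yet, but the lower layer reported removal. */
	assert(mock_fs_err(&live, -EIO) == 0);
	/* Genuine failure on a live sub-device is passed through. */
	assert(mock_fs_err(&live, -EINVAL) == -EINVAL);
}
```

Because a masked error comes back as 0, callers can keep the
"if ((ret = fs_err(sdev, ret)))" form used throughout the diff and only
enter the error branch for failures that are not explained by a removal.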

* Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
  2018-01-19 16:19                 ` Ferruh Yigit
  2018-01-19 17:35                   ` Ananyev, Konstantin
  2018-01-19 17:54                   ` Thomas Monjalon
@ 2018-01-21 20:07                   ` Ferruh Yigit
  2 siblings, 0 replies; 98+ messages in thread
From: Ferruh Yigit @ 2018-01-21 20:07 UTC (permalink / raw)
  To: Matan Azrad, Adrien Mazarguil, Gaetan Rivet
  Cc: Thomas Monjalon, dev, Andrew Rybchenko, Ananyev, Konstantin,
	Alejandro Lucero, Jerin Jacob, Hemant Agrawal, Shahaf Shuler,
	Olivier MATZ, Zhang, Helin

On 1/19/2018 4:19 PM, Ferruh Yigit wrote:
> On 1/18/2018 6:10 PM, Matan Azrad wrote:
>> Hi Ferruh
>>
>> From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
>>> On 1/18/2018 11:27 AM, Matan Azrad wrote:
>>>> rte_eth_dev_is_removed API was added to detect a device removal
>>>> synchronously.
>>>>
>>>> When a device removal occurs during control command execution, many
>>>> different errors can be reported to the user.
>>>>
>>>> Adjust all ethdev APIs error reports to return -EIO in case of device
>>>> removal using rte_eth_dev_is_removed API.
>>>>
>>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>>> Acked-by: Thomas Monjalon <thomas@monjalon.net>
>>>> ---
>>>>  lib/librte_ether/rte_ethdev.c | 192 +++++++++++++++++++++++++++---------------
>>>>  lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
>>>>  2 files changed, 170 insertions(+), 73 deletions(-)
>>>>
>>>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>>>> index c93cec1..7044159 100644
>>>> --- a/lib/librte_ether/rte_ethdev.c
>>>> +++ b/lib/librte_ether/rte_ethdev.c
>>>> @@ -338,6 +338,16 @@ struct rte_eth_dev *
>>>>  	return -ENODEV;
>>>>  }
>>>>
>>>> +static int
>>>> +eth_err(uint16_t port_id, int ret)
>>>> +{
>>>> +	if (ret == 0)
>>>> +		return 0;
>>>> +	if (rte_eth_dev_is_removed(port_id))
>>>> +		return -EIO;
>>>> +	return ret;
>>>> +}
>>>> +
>>>>  /* attach the new device, then store port_id of the device */
>>>>  int
>>>>  rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
>>>> @@ -492,7 +502,8 @@ struct rte_eth_dev *
>>>>  		return 0;
>>>>  	}
>>>>
>>>> -	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
>>>> +	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
>>>> +							     rx_queue_id));
>>>>
>>>>  }
>>>
>>> This patch updates *all* ethdev public APIs to add an "is device removed"
>>> check?
>>
>> Yes.
>>
>>> And each check goes to the ethdev is_removed() dev_ops to ask if the dev is
>>> removed.
>> Probably. If the REMOVED state is already set, it will not call the device is_removed op.
>>
>>> There must be a better way of doing this; am I missing something?
>>
>> Suggest.
> 
> With a silly analogy, this is like a blind person asking each time if he is dead
> before talking to another person.
> 
> At first glance, I think a kind of watchdog timer could be implemented in the
> ethdev layer. It would provide periodic checks and, if the device is dead, call
> the registered user callback function.
> 
> This method is presented as synchronous, but it is not triggered from the side
> where the event happens; I mean it is triggered from the application, not from
> the PMD. So is the application continuously polling to see if the device is dead?
> Or, if the application relies on this patch to add a check in each API, what
> happens if the device is removed during data processing? Will the app rely on
> the asynchronous method?
> 
> I am adding a few consumers of ethdev to the mail thread. Clearly I am not
> very supportive of this patch, but, especially taking into account that the
> release is close, if there is no objection other than mine I will take that as
> consensus to get the patch in.

It looks like there is no objection to the patch and it is already acked, so I
will get the latest version into next-net.

> 
>>
>> This code will replace similar code in each PMD.
>>
>>> I definitely would like to see more comments for this patch.
>>>
>>> Another question is what happens if device removed while or before
>>> dev_ops called? There is no synchronizations in drivers for removal, right?
>>>
>>
>> Yes, you are right: the device removal state can change a moment after the call.
>> Actually, the caller suspects a removal before calling it (and wants to validate it), so it makes sense.
>> For this reason, the check in the ethdev APIs is generally called only in error flows.
>>
>>
>>> <...>
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread
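
The point made in the reply above, that the removal check runs only in error
flows, can be illustrated with a stand-alone sketch of the quoted eth_err()
helper. The names prefixed with mock_ are hypothetical: mock_dev_is_removed()
stands in for rte_eth_dev_is_removed(), a fixed array plays the role of the
device state, and a counter records how often the removal query is actually
made.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_MAX_PORTS 4

/* Hypothetical device state: true once a port has been unplugged. */
static bool mock_port_removed[MOCK_MAX_PORTS];
/* Counts removal queries, to show they are skipped on success. */
static unsigned int mock_removal_checks;

/* Hypothetical stand-in for rte_eth_dev_is_removed(). */
static int
mock_dev_is_removed(uint16_t port_id)
{
	mock_removal_checks++;
	return port_id < MOCK_MAX_PORTS && mock_port_removed[port_id];
}

/* Same shape as the quoted eth_err(): only failures trigger the query. */
static int
mock_eth_err(uint16_t port_id, int ret)
{
	if (ret == 0)
		return 0;
	if (mock_dev_is_removed(port_id))
		return -EIO;
	return ret;
}

/* Usage example: port 1 is removed, port 0 stays present. */
static void
mock_eth_err_example(void)
{
	mock_removal_checks = 0;
	mock_port_removed[0] = false;
	mock_port_removed[1] = true;

	/* Success path: no removal query at all. */
	assert(mock_eth_err(0, 0) == 0);
	assert(mock_removal_checks == 0);
	/* Failure on a present port keeps its original error code. */
	assert(mock_eth_err(0, -EINVAL) == -EINVAL);
	/* Failure on a removed port is normalised to -EIO. */
	assert(mock_eth_err(1, -EINVAL) == -EIO);
}
```

In this sketch the cost of the extra query is paid only when an operation has
already failed, which is the trade-off the thread settles on.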

* Re: [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack
  2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
                               ` (5 preceding siblings ...)
  2018-01-20 21:12             ` [PATCH v7 6/6] net/failsafe: fix removed device handling Matan Azrad
@ 2018-01-21 20:28             ` Ferruh Yigit
  6 siblings, 0 replies; 98+ messages in thread
From: Ferruh Yigit @ 2018-01-21 20:28 UTC (permalink / raw)
  To: Matan Azrad, Adrien Mazarguil, Gaetan Rivet; +Cc: Thomas Monjalon, dev

On 1/20/2018 9:12 PM, Matan Azrad wrote:
> There is a delay between the physical removal of a device and the moment sub-device PMDs get an RMV interrupt.
> During this window, DPDK PMDs and applications do not yet know about the removal and may call sub-device control operations, which should return an error.
> 
> This series adds a new ethdev operation to check device removal, adds support for it in the mlx PMDs, adjusts the ethdev APIs to return -EIO in case of removal, and fixes the fail-safe bug of removal error reporting.
> 
> V2:
> Remove ENODEV definition.
> Remove checks from all mlx control commands.
> Add new devop - "is_removed".
> Implement it in mlx4 and mlx5.
> Fix failsafe bug by the new devop.
> 
> V3:
> Adjust ethdev APIs removal error report.
> Change failsafe check to check eth_dev* return values.
> Remove backporting of fail-safe patch.
> 
> V4:
> Improve fail-safe internal API to adjust the actual error value as discussed.
> Remove "Fixes" lines from fail-safe patch.
> No changes in ethdev\mlx patches.
> 
> V5:
> Rebase on top of master-net-mlx. 
> 
> V6:
> Move ethdev new API to be EXPERIMENTAL.
> 
> V7:
> Fix API return value description.
> Swap checks in the new API as Konstantin suggested.
> Add comment in the API as Ferruh suggested.
> 
> Matan Azrad (6):
>   ethdev: add devop to check removal status
>   net/mlx4: support a device removal check operation
>   net/mlx5: support a device removal check operation
>   ethdev: adjust APIs removal error report
>   ethdev: adjust flow APIs removal error report
>   net/failsafe: fix removed device handling

Series applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

Thread overview: 98+ messages
2017-11-02 15:42 [PATCH 0/3] Fail-safe fix removal handling lack Matan Azrad
2017-11-02 15:42 ` [PATCH 1/3] net/failsafe: " Matan Azrad
2017-11-06  8:19   ` Gaëtan Rivet
2017-11-02 15:42 ` [PATCH 2/3] net/mlx4: adjust removal error Matan Azrad
2017-11-03 13:05   ` Adrien Mazarguil
2017-11-05  6:52     ` Matan Azrad
2017-11-06 16:51       ` Adrien Mazarguil
2017-11-02 15:42 ` [PATCH 3/3] net/mlx5: " Matan Azrad
2017-11-03 13:06   ` Adrien Mazarguil
2017-11-05  6:57     ` Matan Azrad
2017-12-13 14:29 ` [PATCH v2 0/4] Fail-safe fix removal handling lack Matan Azrad
2017-12-13 14:29   ` [PATCH v2 1/4] ethdev: add devop to check removal status Matan Azrad
2017-12-13 14:29   ` [PATCH v2 2/4] net/mlx4: support a device removal check operation Matan Azrad
2017-12-13 14:29   ` [PATCH v2 3/4] net/mlx5: " Matan Azrad
2017-12-13 14:29   ` [PATCH v2 4/4] net/failsafe: fix removed device handling Matan Azrad
2017-12-13 15:16     ` Gaëtan Rivet
2017-12-13 15:48       ` Matan Azrad
2017-12-13 16:09         ` Gaëtan Rivet
2017-12-13 17:09           ` Thomas Monjalon
2017-12-14 10:40             ` Matan Azrad
2017-12-13 21:55           ` Gaëtan Rivet
2017-12-14 10:40             ` Matan Azrad
2017-12-14 10:48               ` Gaëtan Rivet
2017-12-14 13:07                 ` Matan Azrad
2017-12-14 13:27                   ` Gaëtan Rivet
2017-12-14 14:43                     ` Matan Azrad
2017-12-19 17:10   ` [PATCH v3 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
2017-12-19 17:10     ` [PATCH v3 1/6] ethdev: add devop to check removal status Matan Azrad
2017-12-19 17:20       ` Stephen Hemminger
2017-12-19 17:24         ` Matan Azrad
2017-12-19 20:51           ` Thomas Monjalon
2017-12-19 22:13             ` Gaëtan Rivet
2017-12-20  8:39               ` Matan Azrad
2018-01-07  9:53       ` Thomas Monjalon
2017-12-19 17:10     ` [PATCH v3 2/6] net/mlx4: support a device removal check operation Matan Azrad
2017-12-19 17:10     ` [PATCH v3 3/6] net/mlx5: " Matan Azrad
2017-12-19 17:10     ` [PATCH v3 4/6] ethdev: adjust APIs removal error report Matan Azrad
2018-01-07  9:56       ` Thomas Monjalon
2017-12-19 17:10     ` [PATCH v3 5/6] ethdev: adjust flow " Matan Azrad
2018-01-07  9:58       ` Thomas Monjalon
2017-12-19 17:10     ` [PATCH v3 6/6] net/failsafe: fix removed device handling Matan Azrad
2017-12-19 22:21       ` Gaëtan Rivet
2017-12-20 10:58         ` Matan Azrad
2018-01-08 10:57           ` Gaëtan Rivet
2018-01-08 12:55             ` Matan Azrad
2018-01-08 13:46               ` Gaëtan Rivet
2018-01-08 14:00                 ` Matan Azrad
2018-01-08 14:31                   ` Gaëtan Rivet
2018-01-10 12:30     ` [PATCH v4 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
2018-01-10 12:31       ` [PATCH v4 1/6] ethdev: add devop to check removal status Matan Azrad
2018-01-10 12:31       ` [PATCH v4 2/6] net/mlx4: support a device removal check operation Matan Azrad
2018-01-10 12:31       ` [PATCH v4 3/6] net/mlx5: " Matan Azrad
2018-01-10 12:31       ` [PATCH v4 4/6] ethdev: adjust APIs removal error report Matan Azrad
2018-01-10 12:31       ` [PATCH v4 5/6] ethdev: adjust flow " Matan Azrad
2018-01-10 12:31       ` [PATCH v4 6/6] net/failsafe: fix removed device handling Matan Azrad
2018-01-10 12:43         ` Matan Azrad
2018-01-10 13:51           ` Gaëtan Rivet
2018-01-10 13:47         ` Gaëtan Rivet
2018-01-17 20:19       ` [PATCH v5 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
2018-01-17 20:19         ` [PATCH v5 1/6] ethdev: add devop to check removal status Matan Azrad
2018-01-17 20:40           ` Ferruh Yigit
2018-01-17 20:19         ` [PATCH v5 2/6] net/mlx4: support a device removal check operation Matan Azrad
2018-01-17 20:19         ` [PATCH v5 3/6] net/mlx5: " Matan Azrad
2018-01-17 20:19         ` [PATCH v5 4/6] ethdev: adjust APIs removal error report Matan Azrad
2018-01-17 20:19         ` [PATCH v5 5/6] ethdev: adjust flow " Matan Azrad
2018-01-17 20:19         ` [PATCH v5 6/6] net/failsafe: fix removed device handling Matan Azrad
2018-01-18  8:44           ` Gaëtan Rivet
2018-01-18 11:27         ` [PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
2018-01-18 11:27           ` [PATCH v6 1/6] ethdev: add devop to check removal status Matan Azrad
2018-01-18 17:18             ` Ferruh Yigit
2018-01-18 17:57               ` Adrien Mazarguil
2018-01-18 18:02               ` Matan Azrad
2018-01-18 11:27           ` [PATCH v6 2/6] net/mlx4: support a device removal check operation Matan Azrad
2018-01-18 16:59             ` Adrien Mazarguil
2018-01-18 11:27           ` [PATCH v6 3/6] net/mlx5: " Matan Azrad
2018-01-18 16:59             ` Adrien Mazarguil
2018-01-18 11:27           ` [PATCH v6 4/6] ethdev: adjust APIs removal error report Matan Azrad
2018-01-18 17:31             ` Ferruh Yigit
2018-01-18 18:10               ` Matan Azrad
2018-01-19 16:19                 ` Ferruh Yigit
2018-01-19 17:35                   ` Ananyev, Konstantin
2018-01-19 17:54                   ` Thomas Monjalon
2018-01-19 18:13                     ` Ferruh Yigit
2018-01-19 18:16                       ` Thomas Monjalon
2018-01-20 19:04                         ` Matan Azrad
2018-01-20 20:28                           ` Thomas Monjalon
2018-01-20 20:45                             ` Matan Azrad
2018-01-21 20:07                   ` Ferruh Yigit
2018-01-18 11:27           ` [PATCH v6 5/6] ethdev: adjust flow " Matan Azrad
2018-01-18 11:27           ` [PATCH v6 6/6] net/failsafe: fix removed device handling Matan Azrad
2018-01-20 21:12           ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Matan Azrad
2018-01-20 21:12             ` [PATCH v7 1/6] ethdev: add devop to check removal status Matan Azrad
2018-01-20 21:12             ` [PATCH v7 2/6] net/mlx4: support a device removal check operation Matan Azrad
2018-01-20 21:12             ` [PATCH v7 3/6] net/mlx5: " Matan Azrad
2018-01-20 21:12             ` [PATCH v7 4/6] ethdev: adjust APIs removal error report Matan Azrad
2018-01-20 21:12             ` [PATCH v7 5/6] ethdev: adjust flow " Matan Azrad
2018-01-20 21:12             ` [PATCH v7 6/6] net/failsafe: fix removed device handling Matan Azrad
2018-01-21 20:28             ` [PATCH v7 0/6] Fail-safe\ethdev: fix removal handling lack Ferruh Yigit
