All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances
@ 2018-04-19  6:00 Arnon Warshavsky
  2018-04-19  6:00 ` [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver Arnon Warshavsky
                   ` (10 more replies)
  0 siblings, 11 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:00 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

The purpose of this patch series is to cleanup the library code
from paths that end up aborting the process,
and move to checking error values, in order to allow the running process
perform an orderly teardown or other mitigation of the event.

This patch modifies the majority of rte_panic calls
under lib and drivers, and replaces them with a log message
and an error return code according to context,
that can be propagated up the call stack.

- Focus was given to the dpdk initialization path
- Some of the panic calls within drivers were left in place where
  the call is from within an interrupt or calls that are
  on the data path,where there is no simple applicative
  route to propagate the error to temination.
  These should be handled by the driver maintainers.
- In order to avoid breaking ABI where panic was called from public
  void functions, a panic state variable was introduced so that
  it can be queried after calling these void functions.
  This tool place for a single function call.
- local void functions with no api were changed to retrun a value
  where needed
- No change took place in example and test files
- No change took place for debug assertions calling panic
- A new function was added to devtools/checkpatches.sh
  in order to prevent new additions of calls to rte_panic
  under lib and drivers.

Keep calm and don't panic

---

v2:
- reformat error messages so that literal string are in the same line
- fix typo in commit message
- add new return code to doxigen of rte_memzone_free()

v3:
- submit  all 13 patches changed and unchanged in the same patchset

v4:
- remove 2 patches that are no more relevant
- fix split literal string in error message
- change return value -1 to enum
- split value and success code in a static function

Arnon Warshavsky (11):
  crypto: replace rte_panic instances in crypto driver
  bond: replace rte_panic instances in bonding driver
  e1000: replace rte_panic instances in e1000 driver
  ixgbe: replace rte_panic instances in ixgbe driver
  eal: replace rte_panic instances in eventdev
  kni: replace rte_panic instances in kni
  eal: replace rte_panic instances in hugepage_info
  eal: replace rte_panic instances in interrupts thread
  eal: replace rte_panic instances in ethdev
  eal: replace rte_panic instances in init sequence
  devtools: prevent new instances of rte_panic and rte_exit

 devtools/checkpatches.sh                          |  94 ++++++++++++++++-
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c       |   8 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   8 +-
 drivers/net/bonding/rte_eth_bond_8023ad.c         |  28 +++--
 drivers/net/bonding/rte_eth_bond_8023ad_private.h |   2 +-
 drivers/net/bonding/rte_eth_bond_api.c            |  20 ++--
 drivers/net/bonding/rte_eth_bond_pmd.c            |   9 +-
 drivers/net/bonding/rte_eth_bond_private.h        |   2 +-
 drivers/net/e1000/e1000_ethdev.h                  |   2 +-
 drivers/net/e1000/igb_ethdev.c                    |   3 +-
 drivers/net/e1000/igb_pf.c                        |  15 +--
 drivers/net/ixgbe/ixgbe_ethdev.c                  |   3 +-
 drivers/net/ixgbe/ixgbe_ethdev.h                  |   2 +-
 drivers/net/ixgbe/ixgbe_pf.c                      |  13 ++-
 lib/librte_eal/bsdapp/eal/eal.c                   |  86 +++++++++++-----
 lib/librte_eal/bsdapp/eal/eal_thread.c            |  65 +++++++++---
 lib/librte_eal/common/eal_common_launch.c         |  21 ++++
 lib/librte_eal/common/include/rte_debug.h         |  12 +++
 lib/librte_eal/linuxapp/eal/eal.c                 | 120 +++++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  32 ++++--
 lib/librte_eal/linuxapp/eal/eal_interrupts.c      |  27 +++--
 lib/librte_eal/linuxapp/eal/eal_thread.c          |  65 +++++++++---
 lib/librte_ether/rte_ethdev.c                     |  36 +++++--
 lib/librte_eventdev/rte_eventdev_pmd_pci.h        |   8 +-
 lib/librte_eventdev/rte_eventdev_pmd_vdev.h       |   8 +-
 lib/librte_kni/rte_kni.c                          |  18 ++--
 lib/librte_kni/rte_kni_fifo.h                     |  11 +-
 27 files changed, 533 insertions(+), 185 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
@ 2018-04-19  6:00 ` Arnon Warshavsky
  2018-04-19 10:53   ` Trahe, Fiona
  2018-04-19  6:01 ` [PATCH v4 02/11] bond: replace rte_panic instances in bonding driver Arnon Warshavsky
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:00 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

replace panic calls with log and return value.

--
v2:
- reformat error message to include literal string in a single line
v4: replace -1 return value with -ENOMEM

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 8 +++++---
 drivers/crypto/dpaa_sec/dpaa_sec.c          | 8 +++++---
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
index 23012e3..d465a2d 100644
--- a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
+++ b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
@@ -2861,9 +2861,11 @@ struct rte_security_ops dpaa2_sec_security_ops = {
 					RTE_CACHE_LINE_SIZE,
 					rte_socket_id());
 
-		if (cryptodev->data->dev_private == NULL)
-			rte_panic("Cannot allocate memzone for private "
-				  "device data");
+		if (cryptodev->data->dev_private == NULL) {
+			RTE_LOG(ERR, PMD, "%s() Cannot allocate memzone for private device data",
+			__func__);
+			return -ENOMEM;
+		}
 	}
 
 	dpaa2_dev->cryptodev = cryptodev;
diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index b685220..7b63650 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -2356,9 +2356,11 @@ struct rte_security_ops dpaa_sec_security_ops = {
 					RTE_CACHE_LINE_SIZE,
 					rte_socket_id());
 
-		if (cryptodev->data->dev_private == NULL)
-			rte_panic("Cannot allocate memzone for private "
-					"device data");
+		if (cryptodev->data->dev_private == NULL) {
+			RTE_LOG(ERR, PMD, "%s() Cannot allocate memzone for private device data",
+			__func__);
+			return -ENOMEM;
+		}
 	}
 
 	dpaa_dev->crypto_dev = cryptodev;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 02/11] bond: replace rte_panic instances in bonding driver
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
  2018-04-19  6:00 ` [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 17:25   ` Kevin Traynor
  2018-04-19  6:01 ` [PATCH v4 03/11] e1000: replace rte_panic instances in e1000 driver Arnon Warshavsky
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

replace panic calls with log and retrun value.
Local functions to this file,
changing from void to int are non-abi-breaking
--
v4 - fix split literal strings in log messages

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 drivers/net/bonding/rte_eth_bond_8023ad.c         | 28 +++++++++++++++--------
 drivers/net/bonding/rte_eth_bond_8023ad_private.h |  2 +-
 drivers/net/bonding/rte_eth_bond_api.c            | 20 +++++++++++-----
 drivers/net/bonding/rte_eth_bond_pmd.c            |  9 +++++---
 drivers/net/bonding/rte_eth_bond_private.h        |  2 +-
 5 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index c452318..7512901 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -893,7 +893,7 @@
 			bond_mode_8023ad_periodic_cb, arg);
 }
 
-void
+int
 bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev,
 				uint16_t slave_id)
 {
@@ -939,7 +939,7 @@
 	timer_cancel(&port->warning_timer);
 
 	if (port->mbuf_pool != NULL)
-		return;
+		return 0;
 
 	RTE_ASSERT(port->rx_ring == NULL);
 	RTE_ASSERT(port->tx_ring == NULL);
@@ -968,8 +968,9 @@
 	/* Any memory allocation failure in initialization is critical because
 	 * resources can't be free, so reinitialization is impossible. */
 	if (port->mbuf_pool == NULL) {
-		rte_panic("Slave %u: Failed to create memory pool '%s': %s\n",
-			slave_id, mem_name, rte_strerror(rte_errno));
+		RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create memory pool '%s': %s\n",
+			__func__, slave_id, mem_name, rte_strerror(rte_errno));
+		return -1;
 	}
 
 	snprintf(mem_name, RTE_DIM(mem_name), "slave_%u_rx", slave_id);
@@ -977,8 +978,9 @@
 			rte_align32pow2(BOND_MODE_8023AX_SLAVE_RX_PKTS), socket_id, 0);
 
 	if (port->rx_ring == NULL) {
-		rte_panic("Slave %u: Failed to create rx ring '%s': %s\n", slave_id,
-			mem_name, rte_strerror(rte_errno));
+		RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create rx ring '%s': %s\n",
+			__func__, slave_id, mem_name, rte_strerror(rte_errno));
+		return -1;
 	}
 
 	/* TX ring is at least one pkt longer to make room for marker packet. */
@@ -987,9 +989,12 @@
 			rte_align32pow2(BOND_MODE_8023AX_SLAVE_TX_PKTS + 1), socket_id, 0);
 
 	if (port->tx_ring == NULL) {
-		rte_panic("Slave %u: Failed to create tx ring '%s': %s\n", slave_id,
-			mem_name, rte_strerror(rte_errno));
+		RTE_LOG(ERR, PMD, "%s() Slave %u: Fail to create tx ring '%s': %s\n",
+			__func__, slave_id, mem_name, rte_strerror(rte_errno));
+		return -1;
 	}
+
+	return 0;
 }
 
 int
@@ -1143,9 +1148,12 @@
 	struct bond_dev_private *internals = bond_dev->data->dev_private;
 	uint8_t i;
 
-	for (i = 0; i < internals->active_slave_count; i++)
-		bond_mode_8023ad_activate_slave(bond_dev,
+	for (i = 0; i < internals->active_slave_count; i++) {
+		int rc = bond_mode_8023ad_activate_slave(bond_dev,
 				internals->active_slaves[i]);
+		if (rc != 0)
+			return rc;
+	}
 
 	return 0;
 }
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad_private.h b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
index 0f490a5..96a42f2 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
@@ -263,7 +263,7 @@ struct mode8023ad_private {
  * @return
  *  0 on success, negative value otherwise.
  */
-void
+int
 bond_mode_8023ad_activate_slave(struct rte_eth_dev *dev, uint16_t port_id);
 
 /**
diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c
index aa89425..96aa1ff 100644
--- a/drivers/net/bonding/rte_eth_bond_api.c
+++ b/drivers/net/bonding/rte_eth_bond_api.c
@@ -69,14 +69,15 @@
 	return 0;
 }
 
-void
+int
 activate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id)
 {
 	struct bond_dev_private *internals = eth_dev->data->dev_private;
 	uint8_t active_count = internals->active_slave_count;
 
 	if (internals->mode == BONDING_MODE_8023AD)
-		bond_mode_8023ad_activate_slave(eth_dev, port_id);
+		if (bond_mode_8023ad_activate_slave(eth_dev, port_id) != 0)
+			return -1;
 
 	if (internals->mode == BONDING_MODE_TLB
 			|| internals->mode == BONDING_MODE_ALB) {
@@ -357,10 +358,17 @@
 				bond_ethdev_primary_set(internals,
 							slave_port_id);
 
-			if (find_slave_by_id(internals->active_slaves,
-					     internals->active_slave_count,
-					     slave_port_id) == internals->active_slave_count)
-				activate_slave(bonded_eth_dev, slave_port_id);
+			int rc =
+				find_slave_by_id(internals->active_slaves,
+					internals->active_slave_count,
+					slave_port_id);
+
+			if (rc == internals->active_slave_count) {
+				int rc = activate_slave(bonded_eth_dev,
+							slave_port_id);
+				if (rc != 0)
+					return -1;
+			}
 		}
 	}
 
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 2805c71..2d9052d 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1741,8 +1741,10 @@ struct bwg_slave {
 		/* Any memory allocation failure in initialization is critical because
 		 * resources can't be free, so reinitialization is impossible. */
 		if (port->slow_pool == NULL) {
-			rte_panic("Slave %u: Failed to create memory pool '%s': %s\n",
-				slave_id, mem_name, rte_strerror(rte_errno));
+			RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create memory pool '%s': %s\n",
+				__func__, slave_id,
+				mem_name, rte_strerror(rte_errno));
+			return -1;
 		}
 	}
 
@@ -2673,7 +2675,8 @@ struct bwg_slave {
 			mac_address_slaves_update(bonded_eth_dev);
 		}
 
-		activate_slave(bonded_eth_dev, port_id);
+		if (activate_slave(bonded_eth_dev, port_id) != 0)
+			return -1;
 
 		/* If user has defined the primary port then default to using it */
 		if (internals->user_defined_primary_port &&
diff --git a/drivers/net/bonding/rte_eth_bond_private.h b/drivers/net/bonding/rte_eth_bond_private.h
index 94eca88..d99d42c 100644
--- a/drivers/net/bonding/rte_eth_bond_private.h
+++ b/drivers/net/bonding/rte_eth_bond_private.h
@@ -187,7 +187,7 @@ struct bond_dev_private {
 void
 deactivate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id);
 
-void
+int
 activate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id);
 
 void
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 03/11] e1000: replace rte_panic instances in e1000 driver
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
  2018-04-19  6:00 ` [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver Arnon Warshavsky
  2018-04-19  6:01 ` [PATCH v4 02/11] bond: replace rte_panic instances in bonding driver Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 17:25   ` Kevin Traynor
  2018-04-19  6:01 ` [PATCH v4 04/11] ixgbe: replace rte_panic instances in ixgbe driver Arnon Warshavsky
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

replace panic calls with log and retrun value.
Local function to this file,
changing from void to int is non-abi-breaking
--
v4 - keep error message literal string in a singhle line

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 drivers/net/e1000/e1000_ethdev.h |  2 +-
 drivers/net/e1000/igb_ethdev.c   |  3 ++-
 drivers/net/e1000/igb_pf.c       | 15 +++++++++------
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 6354b89..2e527de 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -411,7 +411,7 @@ int eth_igb_rss_hash_conf_get(struct rte_eth_dev *dev,
 /*
  * misc function prototypes
  */
-void igb_pf_host_init(struct rte_eth_dev *eth_dev);
+int igb_pf_host_init(struct rte_eth_dev *eth_dev);
 
 void igb_pf_mbx_process(struct rte_eth_dev *eth_dev);
 
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 9b808a9..4479616 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -833,7 +833,8 @@ static int igb_flex_filter_uninit(struct rte_eth_dev *eth_dev)
 	}
 
 	/* initialize PF if max_vfs not zero */
-	igb_pf_host_init(eth_dev);
+	if (igb_pf_host_init(eth_dev) != 0)
+		goto err_late;
 
 	ctrl_ext = E1000_READ_REG(hw, E1000_CTRL_EXT);
 	/* Set PF Reset Done bit so PF/VF Mail Ops can work */
diff --git a/drivers/net/e1000/igb_pf.c b/drivers/net/e1000/igb_pf.c
index b9f2e53..ae4b0a4 100644
--- a/drivers/net/e1000/igb_pf.c
+++ b/drivers/net/e1000/igb_pf.c
@@ -63,7 +63,7 @@ int igb_vf_perm_addr_gen(struct rte_eth_dev *dev, uint16_t vf_num)
 	return 0;
 }
 
-void igb_pf_host_init(struct rte_eth_dev *eth_dev)
+int igb_pf_host_init(struct rte_eth_dev *eth_dev)
 {
 	struct e1000_vf_info **vfinfo =
 		E1000_DEV_PRIVATE_TO_P_VFDATA(eth_dev->data->dev_private);
@@ -74,7 +74,7 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
 
 	RTE_ETH_DEV_SRIOV(eth_dev).active = 0;
 	if (0 == (vf_num = dev_num_vf(eth_dev)))
-		return;
+		return 0;
 
 	if (hw->mac.type == e1000_i350)
 		nb_queue = 1;
@@ -82,11 +82,14 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
 		/* per datasheet, it should be 2, but 1 seems correct */
 		nb_queue = 1;
 	else
-		return;
+		return 0;
 
 	*vfinfo = rte_zmalloc("vf_info", sizeof(struct e1000_vf_info) * vf_num, 0);
-	if (*vfinfo == NULL)
-		rte_panic("Cannot allocate memory for private VF data\n");
+	if (*vfinfo == NULL) {
+		RTE_LOG(CRIT, PMD, "%s(): Cannot allocate memory for private VF data\n",
+			__func__);
+		return -1;
+	}
 
 	RTE_ETH_DEV_SRIOV(eth_dev).active = ETH_8_POOLS;
 	RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = nb_queue;
@@ -98,7 +101,7 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
 	/* set mb interrupt mask */
 	igb_mb_intr_setup(eth_dev);
 
-	return;
+	return 0;
 }
 
 void igb_pf_host_uninit(struct rte_eth_dev *dev)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 04/11] ixgbe: replace rte_panic instances in ixgbe driver
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
                   ` (2 preceding siblings ...)
  2018-04-19  6:01 ` [PATCH v4 03/11] e1000: replace rte_panic instances in e1000 driver Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 17:26   ` Kevin Traynor
  2018-04-19  6:01 ` [PATCH v4 05/11] eal: replace rte_panic instances in eventdev Arnon Warshavsky
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

replace panic calls with log and retrun value.
Local function to this file,
changing from void to int is non-abi-breaking

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c |  3 ++-
 drivers/net/ixgbe/ixgbe_ethdev.h |  2 +-
 drivers/net/ixgbe/ixgbe_pf.c     | 13 +++++++++----
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index a5e2fc0..c7797f1 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1224,7 +1224,8 @@ struct rte_ixgbe_xstats_name_off {
 	memset(hwstrip, 0, sizeof(*hwstrip));
 
 	/* initialize PF if max_vfs not zero */
-	ixgbe_pf_host_init(eth_dev);
+	if (ixgbe_pf_host_init(eth_dev) != 0)
+		return -1;
 
 	ctrl_ext = IXGBE_READ_REG(hw, IXGBE_CTRL_EXT);
 	/* let hardware know driver is loaded */
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 6550777..8bb41ec 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -661,7 +661,7 @@ int ixgbe_fdir_filter_program(struct rte_eth_dev *dev,
 
 void ixgbe_vlan_hw_strip_config(struct rte_eth_dev *dev);
 
-void ixgbe_pf_host_init(struct rte_eth_dev *eth_dev);
+int ixgbe_pf_host_init(struct rte_eth_dev *eth_dev);
 
 void ixgbe_pf_host_uninit(struct rte_eth_dev *eth_dev);
 
diff --git a/drivers/net/ixgbe/ixgbe_pf.c b/drivers/net/ixgbe/ixgbe_pf.c
index 4e61310..4cd3651 100644
--- a/drivers/net/ixgbe/ixgbe_pf.c
+++ b/drivers/net/ixgbe/ixgbe_pf.c
@@ -66,7 +66,7 @@ int ixgbe_vf_perm_addr_gen(struct rte_eth_dev *dev, uint16_t vf_num)
 	return 0;
 }
 
-void ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
+int ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
 {
 	struct ixgbe_vf_info **vfinfo =
 		IXGBE_DEV_PRIVATE_TO_P_VFDATA(eth_dev->data->dev_private);
@@ -84,11 +84,14 @@ void ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
 	RTE_ETH_DEV_SRIOV(eth_dev).active = 0;
 	vf_num = dev_num_vf(eth_dev);
 	if (vf_num == 0)
-		return;
+		return 0;
 
 	*vfinfo = rte_zmalloc("vf_info", sizeof(struct ixgbe_vf_info) * vf_num, 0);
-	if (*vfinfo == NULL)
-		rte_panic("Cannot allocate memory for private VF data\n");
+	if (*vfinfo == NULL) {
+		RTE_LOG(ERR, PMD, "%s() Cannot allocate memory for private VF data\n",
+				__func__);
+		return -1;
+	}
 
 	memset(mirror_info, 0, sizeof(struct ixgbe_mirror_info));
 	memset(uta_info, 0, sizeof(struct ixgbe_uta_info));
@@ -116,6 +119,8 @@ void ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
 
 	/* set mb interrupt mask */
 	ixgbe_mb_intr_setup(eth_dev);
+
+	return 0;
 }
 
 void ixgbe_pf_host_uninit(struct rte_eth_dev *eth_dev)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 05/11] eal: replace rte_panic instances in eventdev
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
                   ` (3 preceding siblings ...)
  2018-04-19  6:01 ` [PATCH v4 04/11] ixgbe: replace rte_panic instances in ixgbe driver Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 17:26   ` Kevin Traynor
  2018-04-19  6:01 ` [PATCH v4 06/11] kni: replace rte_panic instances in kni Arnon Warshavsky
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

replace panic calls with log and retrun value.

--
v4 - fix split literal strings in log messages

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 lib/librte_eventdev/rte_eventdev_pmd_pci.h  | 8 +++++---
 lib/librte_eventdev/rte_eventdev_pmd_vdev.h | 8 +++++---
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eventdev/rte_eventdev_pmd_pci.h b/lib/librte_eventdev/rte_eventdev_pmd_pci.h
index 8fb6138..dad2182 100644
--- a/lib/librte_eventdev/rte_eventdev_pmd_pci.h
+++ b/lib/librte_eventdev/rte_eventdev_pmd_pci.h
@@ -66,9 +66,11 @@
 						RTE_CACHE_LINE_SIZE,
 						rte_socket_id());
 
-		if (eventdev->data->dev_private == NULL)
-			rte_panic("Cannot allocate memzone for private "
-					"device data");
+		if (eventdev->data->dev_private == NULL) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot allocate memzone for private device data",
+				__func__);
+			return -1;
+		}
 	}
 
 	eventdev->dev = &pci_dev->device;
diff --git a/lib/librte_eventdev/rte_eventdev_pmd_vdev.h b/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
index 8c64a06..b7c08fa 100644
--- a/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
+++ b/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
@@ -61,9 +61,11 @@
 						RTE_CACHE_LINE_SIZE,
 						socket_id);
 
-		if (eventdev->data->dev_private == NULL)
-			rte_panic("Cannot allocate memzone for private device"
-					" data");
+		if (eventdev->data->dev_private == NULL) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot allocate memzone for private device data",
+				__func__);
+			return NULL;
+		}
 	}
 
 	return eventdev;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 06/11] kni: replace rte_panic instances in kni
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
                   ` (4 preceding siblings ...)
  2018-04-19  6:01 ` [PATCH v4 05/11] eal: replace rte_panic instances in eventdev Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19  6:01 ` [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info Arnon Warshavsky
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

replace panic calls with log and retrun value.
Local function to this file,
changing from void to int is non-abi-breaking

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 lib/librte_kni/rte_kni.c      | 18 ++++++++++++------
 lib/librte_kni/rte_kni_fifo.h | 11 ++++++++---
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 8a8f6c1..4dac407 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -353,37 +353,43 @@ struct rte_kni *
 	/* TX RING */
 	mz = slot->m_tx_q;
 	ctx->tx_q = mz->addr;
-	kni_fifo_init(ctx->tx_q, KNI_FIFO_COUNT_MAX);
+	if (kni_fifo_init(ctx->tx_q, KNI_FIFO_COUNT_MAX))
+		return NULL;
 	dev_info.tx_phys = mz->phys_addr;
 
 	/* RX RING */
 	mz = slot->m_rx_q;
 	ctx->rx_q = mz->addr;
-	kni_fifo_init(ctx->rx_q, KNI_FIFO_COUNT_MAX);
+	if (kni_fifo_init(ctx->rx_q, KNI_FIFO_COUNT_MAX))
+		return NULL;
 	dev_info.rx_phys = mz->phys_addr;
 
 	/* ALLOC RING */
 	mz = slot->m_alloc_q;
 	ctx->alloc_q = mz->addr;
-	kni_fifo_init(ctx->alloc_q, KNI_FIFO_COUNT_MAX);
+	if (kni_fifo_init(ctx->alloc_q, KNI_FIFO_COUNT_MAX))
+		return NULL;
 	dev_info.alloc_phys = mz->phys_addr;
 
 	/* FREE RING */
 	mz = slot->m_free_q;
 	ctx->free_q = mz->addr;
-	kni_fifo_init(ctx->free_q, KNI_FIFO_COUNT_MAX);
+	if (kni_fifo_init(ctx->free_q, KNI_FIFO_COUNT_MAX))
+		return NULL;
 	dev_info.free_phys = mz->phys_addr;
 
 	/* Request RING */
 	mz = slot->m_req_q;
 	ctx->req_q = mz->addr;
-	kni_fifo_init(ctx->req_q, KNI_FIFO_COUNT_MAX);
+	if (kni_fifo_init(ctx->req_q, KNI_FIFO_COUNT_MAX))
+		return NULL;
 	dev_info.req_phys = mz->phys_addr;
 
 	/* Response RING */
 	mz = slot->m_resp_q;
 	ctx->resp_q = mz->addr;
-	kni_fifo_init(ctx->resp_q, KNI_FIFO_COUNT_MAX);
+	if (kni_fifo_init(ctx->resp_q, KNI_FIFO_COUNT_MAX))
+		return NULL;
 	dev_info.resp_phys = mz->phys_addr;
 
 	/* Req/Resp sync mem area */
diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
index ac26a8c..5052015 100644
--- a/lib/librte_kni/rte_kni_fifo.h
+++ b/lib/librte_kni/rte_kni_fifo.h
@@ -7,17 +7,22 @@
 /**
  * Initializes the kni fifo structure
  */
-static void
+static int
 kni_fifo_init(struct rte_kni_fifo *fifo, unsigned size)
 {
 	/* Ensure size is power of 2 */
-	if (size & (size - 1))
-		rte_panic("KNI fifo size must be power of 2\n");
+	if (size & (size - 1)) {
+		RTE_LOG(CRIT, EAL, "%s(): KNI fifo size must be power of 2\n",
+				__func__);
+		return -1;
+	}
 
 	fifo->write = 0;
 	fifo->read = 0;
 	fifo->len = size;
 	fifo->elem_size = sizeof(void *);
+
+	return 0;
 }
 
 /**
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
                   ` (5 preceding siblings ...)
  2018-04-19  6:01 ` [PATCH v4 06/11] kni: replace rte_panic instances in kni Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 14:03   ` Burakov, Anatoly
  2018-04-19 14:36   ` Kevin Traynor
  2018-04-19  6:01 ` [PATCH v4 08/11] eal: replace rte_panic instances in interrupts thread Arnon Warshavsky
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

replace panic calls with log and retrun value.

v4
static size calculation function changed to return success/fail code
in addition to filling  the size result.

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 32 ++++++++++++++++---------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index fb4b667..debae32 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -145,8 +145,8 @@
 	return num_pages;
 }
 
-static uint64_t
-get_default_hp_size(void)
+static int
+get_default_hp_size(uint64_t *result)
 {
 	const char proc_meminfo[] = "/proc/meminfo";
 	const char str_hugepagesz[] = "Hugepagesize:";
@@ -155,8 +155,11 @@
 	unsigned long long size = 0;
 
 	FILE *fd = fopen(proc_meminfo, "r");
-	if (fd == NULL)
-		rte_panic("Cannot open %s\n", proc_meminfo);
+	if (fd == NULL) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot open %s\n",
+				__func__, proc_meminfo);
+		return -1;
+	}
 	while(fgets(buffer, sizeof(buffer), fd)){
 		if (strncmp(buffer, str_hugepagesz, hugepagesz_len) == 0){
 			size = rte_str_to_size(&buffer[hugepagesz_len]);
@@ -164,9 +167,13 @@
 		}
 	}
 	fclose(fd);
-	if (size == 0)
-		rte_panic("Cannot get default hugepage size from %s\n", proc_meminfo);
-	return size;
+	if (size == 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot get default hugepage size from %s\n",
+						 __func__, proc_meminfo);
+		return -1;
+	}
+	*result = size;
+	return 0;
 }
 
 static const char *
@@ -191,11 +198,14 @@
 	char *retval = NULL;
 
 	FILE *fd = fopen(proc_mounts, "r");
-	if (fd == NULL)
-		rte_panic("Cannot open %s\n", proc_mounts);
+	if (fd == NULL) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot open %s\n",
+				__func__, proc_mounts);
+		return NULL;
+	}
 
-	if (default_size == 0)
-		default_size = get_default_hp_size();
+	if ((default_size == 0) && (get_default_hp_size(&default_size) != 0))
+		return NULL;
 
 	while (fgets(buf, sizeof(buf), fd)){
 		if (rte_strsplit(buf, sizeof(buf), splitstr, _FIELDNAME_MAX,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 08/11] eal: replace rte_panic instances in interrupts thread
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
                   ` (6 preceding siblings ...)
  2018-04-19  6:01 ` [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 17:27   ` Kevin Traynor
  2018-04-19  6:01 ` [PATCH v4 09/11] eal: replace rte_panic instances in ethdev Arnon Warshavsky
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

replace panic calls with log and retrun value.
Thread function removes the noretrun attribute.

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 58e9328..8b8650a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -785,7 +785,7 @@ struct rte_intr_source {
  * @return
  *  never return;
  */
-static __attribute__((noreturn)) void *
+static void *
 eal_intr_thread_main(__rte_unused void *arg)
 {
 	struct epoll_event ev;
@@ -803,8 +803,11 @@ static __attribute__((noreturn)) void *
 
 		/* create epoll fd */
 		int pfd = epoll_create(1);
-		if (pfd < 0)
-			rte_panic("Cannot create epoll instance\n");
+		if (pfd < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot create epoll instance\n",
+					__func__);
+			return NULL;
+		}
 
 		pipe_event.data.fd = intr_pipe.readfd;
 		/**
@@ -813,8 +816,11 @@ static __attribute__((noreturn)) void *
 		 */
 		if (epoll_ctl(pfd, EPOLL_CTL_ADD, intr_pipe.readfd,
 						&pipe_event) < 0) {
-			rte_panic("Error adding fd to %d epoll_ctl, %s\n",
+			RTE_LOG(CRIT, EAL, "%s(): Error adding fd to %d "
+					"epoll_ctl, %s\n",
+					__func__,
 					intr_pipe.readfd, strerror(errno));
+			return NULL;
 		}
 		numfds++;
 
@@ -831,9 +837,14 @@ static __attribute__((noreturn)) void *
 			 * into wait list.
 			 */
 			if (epoll_ctl(pfd, EPOLL_CTL_ADD,
-					src->intr_handle.fd, &ev) < 0){
-				rte_panic("Error adding fd %d epoll_ctl, %s\n",
-					src->intr_handle.fd, strerror(errno));
+				src->intr_handle.fd, &ev) < 0) {
+				RTE_LOG(CRIT, EAL,
+						"%s(): Error adding fd %d "
+						"epoll_ctl, %s\n",
+						__func__,
+						src->intr_handle.fd,
+						strerror(errno));
+				return NULL;
 			}
 			else
 				numfds++;
@@ -848,6 +859,8 @@ static __attribute__((noreturn)) void *
 		 */
 		close(pfd);
 	}
+
+	return NULL;
 }
 
 int
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 09/11] eal: replace rte_panic instances in ethdev
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
                   ` (7 preceding siblings ...)
  2018-04-19  6:01 ` [PATCH v4 08/11] eal: replace rte_panic instances in interrupts thread Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 17:27   ` Kevin Traynor
  2018-04-19  6:01 ` [PATCH v4 10/11] eal: replace rte_panic instances in init sequence Arnon Warshavsky
  2018-04-19  6:01 ` [PATCH v4 11/11] devtools: prevent new instances of rte_panic and rte_exit Arnon Warshavsky
  10 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

Local function to this file,
changing from void to int is non-abi-breaking

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 lib/librte_ether/rte_ethdev.c | 36 +++++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7821a88..9c13827 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -194,7 +194,7 @@ enum {
 	return port_id;
 }
 
-static void
+static int
 rte_eth_dev_shared_data_prepare(void)
 {
 	const unsigned flags = 0;
@@ -210,8 +210,12 @@ enum {
 					rte_socket_id(), flags);
 		} else
 			mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
-		if (mz == NULL)
-			rte_panic("Cannot allocate ethdev shared data\n");
+		if (mz == NULL) {
+			rte_spinlock_unlock(&rte_eth_shared_data_lock);
+			RTE_LOG(CRIT, EAL, "%s(): Cannot allocate ethdev shared data\n",
+					__func__);
+			return -1;
+		}
 
 		rte_eth_dev_shared_data = mz->addr;
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
@@ -224,6 +228,8 @@ enum {
 	}
 
 	rte_spinlock_unlock(&rte_eth_shared_data_lock);
+
+	return 0;
 }
 
 struct rte_eth_dev *
@@ -274,7 +280,8 @@ struct rte_eth_dev *
 	uint16_t port_id;
 	struct rte_eth_dev *eth_dev = NULL;
 
-	rte_eth_dev_shared_data_prepare();
+	if (rte_eth_dev_shared_data_prepare() != 0)
+		return NULL;
 
 	/* Synchronize port creation between primary and secondary threads. */
 	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
@@ -317,7 +324,8 @@ struct rte_eth_dev *
 	uint16_t i;
 	struct rte_eth_dev *eth_dev = NULL;
 
-	rte_eth_dev_shared_data_prepare();
+	if (rte_eth_dev_shared_data_prepare() != 0)
+		return NULL;
 
 	/* Synchronize port attachment to primary port creation and release. */
 	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
@@ -345,7 +353,8 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
-	rte_eth_dev_shared_data_prepare();
+	if (rte_eth_dev_shared_data_prepare() != 0)
+		return -1;
 
 	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
@@ -399,7 +408,8 @@ struct rte_eth_dev *
 int __rte_experimental
 rte_eth_dev_owner_new(uint64_t *owner_id)
 {
-	rte_eth_dev_shared_data_prepare();
+	if (rte_eth_dev_shared_data_prepare() != 0)
+		return -1;
 
 	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
@@ -450,7 +460,8 @@ struct rte_eth_dev *
 {
 	int ret;
 
-	rte_eth_dev_shared_data_prepare();
+	if (rte_eth_dev_shared_data_prepare() != 0)
+		return -1;
 
 	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
@@ -467,7 +478,8 @@ struct rte_eth_dev *
 			{.id = RTE_ETH_DEV_NO_OWNER, .name = ""};
 	int ret;
 
-	rte_eth_dev_shared_data_prepare();
+	if (rte_eth_dev_shared_data_prepare() != 0)
+		return -1;
 
 	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
@@ -482,7 +494,8 @@ struct rte_eth_dev *
 {
 	uint16_t port_id;
 
-	rte_eth_dev_shared_data_prepare();
+	if (rte_eth_dev_shared_data_prepare() != 0)
+		return;
 
 	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
@@ -502,7 +515,8 @@ struct rte_eth_dev *
 {
 	int ret = 0;
 
-	rte_eth_dev_shared_data_prepare();
+	if (rte_eth_dev_shared_data_prepare() != 0)
+		return -1;
 
 	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
                   ` (8 preceding siblings ...)
  2018-04-19  6:01 ` [PATCH v4 09/11] eal: replace rte_panic instances in ethdev Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 14:39   ` Burakov, Anatoly
  2018-04-19 17:48   ` Aaron Conole
  2018-04-19  6:01 ` [PATCH v4 11/11] devtools: prevent new instances of rte_panic and rte_exit Arnon Warshavsky
  10 siblings, 2 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

Local functions to this file,
changing from void to int are non-abi-breaking.
For handling the single function that cannot
change from void to int due to abi,
where this is the only place it is called in,
I added a state variable that is being checked
right after the call to this function.

--

v4 - fix split literal strings in log messages

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 lib/librte_eal/bsdapp/eal/eal.c           |  86 ++++++++++++++-------
 lib/librte_eal/bsdapp/eal/eal_thread.c    |  65 +++++++++++-----
 lib/librte_eal/common/eal_common_launch.c |  21 ++++++
 lib/librte_eal/common/include/rte_debug.h |  12 +++
 lib/librte_eal/linuxapp/eal/eal.c         | 120 ++++++++++++++++++++----------
 lib/librte_eal/linuxapp/eal/eal_thread.c  |  65 +++++++++++-----
 6 files changed, 270 insertions(+), 99 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index d996190..9c2f6f1 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -151,7 +151,7 @@ enum rte_iova_mode
  * We also don't lock the whole file, so that in future we can use read-locks
  * on other parts, e.g. memzones, to detect if there are running secondary
  * processes. */
-static void
+static int
 rte_eal_config_create(void)
 {
 	void *rte_mem_cfg_addr;
@@ -160,60 +160,78 @@ enum rte_iova_mode
 	const char *pathname = eal_runtime_config_path();
 
 	if (internal_config.no_shconf)
-		return;
+		return 0;
 
 	if (mem_cfg_fd < 0){
 		mem_cfg_fd = open(pathname, O_RDWR | O_CREAT, 0660);
-		if (mem_cfg_fd < 0)
-			rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
+		if (mem_cfg_fd < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot open '%s' for rte_mem_config\n",
+					__func__, pathname);
+			return -1;
+		}
 	}
 
 	retval = ftruncate(mem_cfg_fd, sizeof(*rte_config.mem_config));
 	if (retval < 0){
 		close(mem_cfg_fd);
-		rte_panic("Cannot resize '%s' for rte_mem_config\n", pathname);
+		RTE_LOG(CRIT, EAL, "%s(): Cannot resize '%s' for rte_mem_config\n",
+				__func__, pathname);
+		return -1;
 	}
 
 	retval = fcntl(mem_cfg_fd, F_SETLK, &wr_lock);
 	if (retval < 0){
 		close(mem_cfg_fd);
-		rte_exit(EXIT_FAILURE, "Cannot create lock on '%s'. Is another primary "
-				"process running?\n", pathname);
+		RTE_LOG(CRIT, EAL, "%s(): Cannot create lock on '%s'. Is another primary process running?\n",
+				__func__, pathname);
+		return -1;
 	}
 
 	rte_mem_cfg_addr = mmap(NULL, sizeof(*rte_config.mem_config),
 				PROT_READ | PROT_WRITE, MAP_SHARED, mem_cfg_fd, 0);
 
 	if (rte_mem_cfg_addr == MAP_FAILED){
-		rte_panic("Cannot mmap memory for rte_config\n");
+		RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config\n",
+				__func__);
+		return -1;
 	}
 	memcpy(rte_mem_cfg_addr, &early_mem_config, sizeof(early_mem_config));
 	rte_config.mem_config = rte_mem_cfg_addr;
+
+	return 0;
 }
 
 /* attach to an existing shared memory config */
-static void
+static int
 rte_eal_config_attach(void)
 {
 	void *rte_mem_cfg_addr;
 	const char *pathname = eal_runtime_config_path();
 
 	if (internal_config.no_shconf)
-		return;
+		return 0;
 
 	if (mem_cfg_fd < 0){
 		mem_cfg_fd = open(pathname, O_RDWR);
-		if (mem_cfg_fd < 0)
-			rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
+		if (mem_cfg_fd < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot open '%s' for rte_mem_config\n",
+					__func__, pathname);
+			return -1;
+		}
 	}
 
 	rte_mem_cfg_addr = mmap(NULL, sizeof(*rte_config.mem_config),
 				PROT_READ | PROT_WRITE, MAP_SHARED, mem_cfg_fd, 0);
 	close(mem_cfg_fd);
-	if (rte_mem_cfg_addr == MAP_FAILED)
-		rte_panic("Cannot mmap memory for rte_config\n");
+	if (rte_mem_cfg_addr == MAP_FAILED) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config\n",
+				__func__);
+		return -1;
+	}
 
 	rte_config.mem_config = rte_mem_cfg_addr;
+
+	return 0;
 }
 
 /* Detect if we are a primary or a secondary process */
@@ -237,23 +255,28 @@ enum rte_proc_type_t
 }
 
 /* Sets up rte_config structure with the pointer to shared memory config.*/
-static void
+static int
 rte_config_init(void)
 {
 	rte_config.process_type = internal_config.process_type;
 
 	switch (rte_config.process_type){
 	case RTE_PROC_PRIMARY:
-		rte_eal_config_create();
+		if (rte_eal_config_create())
+			return -1;
 		break;
 	case RTE_PROC_SECONDARY:
-		rte_eal_config_attach();
+		if (rte_eal_config_attach())
+			return -1;
 		rte_eal_mcfg_wait_complete(rte_config.mem_config);
 		break;
 	case RTE_PROC_AUTO:
 	case RTE_PROC_INVALID:
-		rte_panic("Invalid process type\n");
+		RTE_LOG(CRIT, EAL, "%s(): Invalid process type %d\n",
+				__func__, rte_config.process_type);
+		return -1;
 	}
+	return 0;
 }
 
 /* display usage */
@@ -595,7 +618,8 @@ static void rte_eal_init_alert(const char *msg)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
+	if (rte_config_init() != 0)
+		return -1;
 
 	if (rte_mp_channel_init() < 0) {
 		rte_eal_init_alert("failed to init mp channel\n");
@@ -652,7 +676,8 @@ static void rte_eal_init_alert(const char *msg)
 
 	eal_check_mem_on_local_socket();
 
-	eal_thread_init_master(rte_config.master_lcore);
+	if (eal_thread_init_master(rte_config.master_lcore) != 0)
+		return -1;
 
 	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
 
@@ -666,18 +691,27 @@ static void rte_eal_init_alert(const char *msg)
 		 * create communication pipes between master thread
 		 * and children
 		 */
-		if (pipe(lcore_config[i].pipe_master2slave) < 0)
-			rte_panic("Cannot create pipe\n");
-		if (pipe(lcore_config[i].pipe_slave2master) < 0)
-			rte_panic("Cannot create pipe\n");
+		if (pipe(lcore_config[i].pipe_master2slave) < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot create pipe\n",
+					__func__);
+			return -1;
+		}
+		if (pipe(lcore_config[i].pipe_slave2master) < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot create pipe\n",
+					__func__);
+			return -1;
+		}
 
 		lcore_config[i].state = WAIT;
 
 		/* create a thread for each lcore */
 		ret = pthread_create(&lcore_config[i].thread_id, NULL,
 				     eal_thread_loop, NULL);
-		if (ret != 0)
-			rte_panic("Cannot create thread\n");
+		if (ret != 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot create thread\n",
+					__func__);
+			return -1;
+		}
 
 		/* Set thread_name for aid in debugging. */
 		snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN,
diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index d602daf..5c3947c 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -51,16 +51,22 @@
 	n = 0;
 	while (n == 0 || (n < 0 && errno == EINTR))
 		n = write(m2s, &c, 1);
-	if (n < 0)
-		rte_panic("cannot write on configuration pipe\n");
+	if (n < 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot write on configuration pipe\n",
+				__func__);
+		return -1;
+	}
 
 	/* wait ack */
 	do {
 		n = read(s2m, &c, 1);
 	} while (n < 0 && errno == EINTR);
 
-	if (n <= 0)
-		rte_panic("cannot read on configuration pipe\n");
+	if (n <= 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot read on configuration pipe\n",
+				__func__);
+		return -1;
+	}
 
 	return 0;
 }
@@ -84,8 +90,19 @@ void eal_thread_init_master(unsigned lcore_id)
 	RTE_PER_LCORE(_lcore_id) = lcore_id;
 
 	/* set CPU affinity */
-	if (eal_thread_set_affinity() < 0)
-		rte_panic("cannot set affinity\n");
+	if (eal_thread_set_affinity() < 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot set affinity\n", __func__);
+		rte_move_to_panic_state();
+	}
+}
+
+/* move to panic state and do not return */
+static __attribute__((noreturn)) void
+defunct_and_remain_in_endless_loop(void)
+{
+	rte_move_to_panic_state();
+	while (1)
+		sleep(1);
 }
 
 /* main loop of threads */
@@ -106,8 +123,11 @@ void eal_thread_init_master(unsigned lcore_id)
 		if (thread_id == lcore_config[lcore_id].thread_id)
 			break;
 	}
-	if (lcore_id == RTE_MAX_LCORE)
-		rte_panic("cannot retrieve lcore id\n");
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot retrieve lcore id\n",
+				__func__);
+		defunct_and_remain_in_endless_loop();
+	}
 
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
@@ -116,8 +136,10 @@ void eal_thread_init_master(unsigned lcore_id)
 	RTE_PER_LCORE(_lcore_id) = lcore_id;
 
 	/* set CPU affinity */
-	if (eal_thread_set_affinity() < 0)
-		rte_panic("cannot set affinity\n");
+	if (eal_thread_set_affinity() < 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot set affinity\n", __func__);
+		defunct_and_remain_in_endless_loop();
+	}
 
 	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
 
@@ -133,8 +155,11 @@ void eal_thread_init_master(unsigned lcore_id)
 			n = read(m2s, &c, 1);
 		} while (n < 0 && errno == EINTR);
 
-		if (n <= 0)
-			rte_panic("cannot read on configuration pipe\n");
+		if (n <= 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot read on configuration pipe\n",
+					__func__);
+			defunct_and_remain_in_endless_loop();
+		}
 
 		lcore_config[lcore_id].state = RUNNING;
 
@@ -142,11 +167,17 @@ void eal_thread_init_master(unsigned lcore_id)
 		n = 0;
 		while (n == 0 || (n < 0 && errno == EINTR))
 			n = write(s2m, &c, 1);
-		if (n < 0)
-			rte_panic("cannot write on configuration pipe\n");
-
-		if (lcore_config[lcore_id].f == NULL)
-			rte_panic("NULL function pointer\n");
+		if (n < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot write on configuration pipe\n",
+					__func__);
+			defunct_and_remain_in_endless_loop();
+		}
+
+		if (lcore_config[lcore_id].f == NULL) {
+			RTE_LOG(CRIT, EAL, "%s(): NULL function pointer\n",
+					__func__);
+			defunct_and_remain_in_endless_loop();
+		}
 
 		/* call the function and store the return value */
 		fct_arg = lcore_config[lcore_id].arg;
diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
index fe0ba3f..6f8bd46 100644
--- a/lib/librte_eal/common/eal_common_launch.c
+++ b/lib/librte_eal/common/eal_common_launch.c
@@ -14,6 +14,7 @@
 #include <rte_pause.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
+#include <rte_debug.h>
 
 /*
  * Wait until a lcore finished its job.
@@ -88,3 +89,23 @@ enum rte_lcore_state_t
 		rte_eal_wait_lcore(lcore_id);
 	}
 }
+
+/* panic state */
+static int _panic_state;
+
+/**
+ * Check if the system is in panic state
+ * @return int
+ */
+int rte_get_panic_state(void)
+{
+	return _panic_state;
+}
+
+/**
+ * Move the system to be in panic state
+ */
+void rte_move_to_panic_state(void)
+{
+	_panic_state = 1;
+}
diff --git a/lib/librte_eal/common/include/rte_debug.h b/lib/librte_eal/common/include/rte_debug.h
index 272df49..b421d33 100644
--- a/lib/librte_eal/common/include/rte_debug.h
+++ b/lib/librte_eal/common/include/rte_debug.h
@@ -79,4 +79,16 @@ void __rte_panic(const char *funcname , const char *format, ...)
 }
 #endif
 
+/**
+ * Check if the system is in panic state
+ * @return int
+ */
+int rte_get_panic_state(void);
+
+/**
+ * Move the system to be in panic state
+ */
+void rte_move_to_panic_state(void);
+
+
 #endif /* _RTE_DEBUG_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 21afa73..393441a 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -160,7 +160,7 @@ enum rte_iova_mode
  * We also don't lock the whole file, so that in future we can use read-locks
  * on other parts, e.g. memzones, to detect if there are running secondary
  * processes. */
-static void
+static int
 rte_eal_config_create(void)
 {
 	void *rte_mem_cfg_addr;
@@ -169,7 +169,7 @@ enum rte_iova_mode
 	const char *pathname = eal_runtime_config_path();
 
 	if (internal_config.no_shconf)
-		return;
+		return 0;
 
 	/* map the config before hugepage address so that we don't waste a page */
 	if (internal_config.base_virtaddr != 0)
@@ -179,30 +179,39 @@ enum rte_iova_mode
 	else
 		rte_mem_cfg_addr = NULL;
 
-	if (mem_cfg_fd < 0){
+	if (mem_cfg_fd < 0) {
 		mem_cfg_fd = open(pathname, O_RDWR | O_CREAT, 0660);
-		if (mem_cfg_fd < 0)
-			rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
+		if (mem_cfg_fd < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot open '%s' for rte_mem_config\n",
+				__func__, pathname);
+			return -1;
+		}
 	}
 
 	retval = ftruncate(mem_cfg_fd, sizeof(*rte_config.mem_config));
-	if (retval < 0){
+	if (retval < 0) {
 		close(mem_cfg_fd);
-		rte_panic("Cannot resize '%s' for rte_mem_config\n", pathname);
+		RTE_LOG(CRIT, EAL, "%s(): Cannot resize '%s' for rte_mem_config\n",
+				__func__, pathname);
+		return -1;
 	}
 
 	retval = fcntl(mem_cfg_fd, F_SETLK, &wr_lock);
-	if (retval < 0){
+	if (retval < 0) {
 		close(mem_cfg_fd);
-		rte_exit(EXIT_FAILURE, "Cannot create lock on '%s'. Is another primary "
-				"process running?\n", pathname);
+		RTE_LOG(CRIT, EAL, "%s(): Cannot create lock on '%s'."
+				" Is another primary process running?\n",
+				__func__, pathname);
+		return -1;
 	}
 
 	rte_mem_cfg_addr = mmap(rte_mem_cfg_addr, sizeof(*rte_config.mem_config),
 				PROT_READ | PROT_WRITE, MAP_SHARED, mem_cfg_fd, 0);
 
-	if (rte_mem_cfg_addr == MAP_FAILED){
-		rte_panic("Cannot mmap memory for rte_config\n");
+	if (rte_mem_cfg_addr == MAP_FAILED) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config\n",
+			__func__);
+		return -1;
 	}
 	memcpy(rte_mem_cfg_addr, &early_mem_config, sizeof(early_mem_config));
 	rte_config.mem_config = rte_mem_cfg_addr;
@@ -211,10 +220,11 @@ enum rte_iova_mode
 	 * processes could later map the config into this exact location */
 	rte_config.mem_config->mem_cfg_addr = (uintptr_t) rte_mem_cfg_addr;
 
+	return 0;
 }
 
 /* attach to an existing shared memory config */
-static void
+static int
 rte_eal_config_attach(void)
 {
 	struct rte_mem_config *mem_config;
@@ -222,33 +232,40 @@ enum rte_iova_mode
 	const char *pathname = eal_runtime_config_path();
 
 	if (internal_config.no_shconf)
-		return;
+		return 0;
 
-	if (mem_cfg_fd < 0){
+	if (mem_cfg_fd < 0) {
 		mem_cfg_fd = open(pathname, O_RDWR);
-		if (mem_cfg_fd < 0)
-			rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
+		if (mem_cfg_fd < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot open '%s' for rte_mem_config\n",
+						__func__, pathname);
+			return -1;
+		}
 	}
 
 	/* map it as read-only first */
 	mem_config = (struct rte_mem_config *) mmap(NULL, sizeof(*mem_config),
 			PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
-	if (mem_config == MAP_FAILED)
-		rte_panic("Cannot mmap memory for rte_config! error %i (%s)\n",
-			  errno, strerror(errno));
+	if (mem_config == MAP_FAILED) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config! error %i (%s)\n",
+				__func__, errno, strerror(errno));
+		return -1;
+	}
 
 	rte_config.mem_config = mem_config;
+
+	return 0;
 }
 
 /* reattach the shared config at exact memory location primary process has it */
-static void
+static int
 rte_eal_config_reattach(void)
 {
 	struct rte_mem_config *mem_config;
 	void *rte_mem_cfg_addr;
 
 	if (internal_config.no_shconf)
-		return;
+		return 0;
 
 	/* save the address primary process has mapped shared config to */
 	rte_mem_cfg_addr = (void *) (uintptr_t) rte_config.mem_config->mem_cfg_addr;
@@ -263,16 +280,21 @@ enum rte_iova_mode
 	if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr) {
 		if (mem_config != MAP_FAILED)
 			/* errno is stale, don't use */
-			rte_panic("Cannot mmap memory for rte_config at [%p], got [%p]"
-				  " - please use '--base-virtaddr' option\n",
-				  rte_mem_cfg_addr, mem_config);
+			RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for "
+					"rte_config at [%p], got [%p] - please use "
+					"'--base-virtaddr' option\n",
+					__func__, rte_mem_cfg_addr, mem_config);
 		else
-			rte_panic("Cannot mmap memory for rte_config! error %i (%s)\n",
-				  errno, strerror(errno));
+			RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for "
+					"rte_config! error %i (%s)\n",
+					__func__, errno, strerror(errno));
+		return -1;
 	}
 	close(mem_cfg_fd);
 
 	rte_config.mem_config = mem_config;
+
+	return 0;
 }
 
 /* Detect if we are a primary or a secondary process */
@@ -296,24 +318,31 @@ enum rte_proc_type_t
 }
 
 /* Sets up rte_config structure with the pointer to shared memory config.*/
-static void
+static int
 rte_config_init(void)
 {
 	rte_config.process_type = internal_config.process_type;
 
 	switch (rte_config.process_type){
 	case RTE_PROC_PRIMARY:
-		rte_eal_config_create();
+		if (rte_eal_config_create() != 0)
+			return -1;
 		break;
 	case RTE_PROC_SECONDARY:
-		rte_eal_config_attach();
+		if (rte_eal_config_attach() != 0)
+			return -1;
 		rte_eal_mcfg_wait_complete(rte_config.mem_config);
-		rte_eal_config_reattach();
+		if (rte_eal_config_reattach() != 0)
+			return -1;
 		break;
 	case RTE_PROC_AUTO:
 	case RTE_PROC_INVALID:
-		rte_panic("Invalid process type\n");
+		RTE_LOG(CRIT, EAL, "%s(): Invalid process type %d\n",
+				__func__, rte_config.process_type);
+		return -1;
 	}
+
+	return 0;
 }
 
 /* Unlocks hugepage directories that were locked by eal_hugepage_info_init */
@@ -820,7 +849,8 @@ static void rte_eal_init_alert(const char *msg)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
+	if (rte_config_init() != 0)
+		return -1;
 
 	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
 		rte_eal_init_alert("Cannot init logging.");
@@ -892,6 +922,9 @@ static void rte_eal_init_alert(const char *msg)
 
 	eal_thread_init_master(rte_config.master_lcore);
 
+	if (rte_get_panic_state())
+		return -1;
+
 	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
 
 	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s%s])\n",
@@ -909,18 +942,27 @@ static void rte_eal_init_alert(const char *msg)
 		 * create communication pipes between master thread
 		 * and children
 		 */
-		if (pipe(lcore_config[i].pipe_master2slave) < 0)
-			rte_panic("Cannot create pipe\n");
-		if (pipe(lcore_config[i].pipe_slave2master) < 0)
-			rte_panic("Cannot create pipe\n");
+		if (pipe(lcore_config[i].pipe_master2slave) < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot create pipe\n",
+					__func__);
+			return -1;
+		}
+		if (pipe(lcore_config[i].pipe_slave2master) < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot create pipe\n",
+					__func__);
+			return -1;
+		}
 
 		lcore_config[i].state = WAIT;
 
 		/* create a thread for each lcore */
 		ret = pthread_create(&lcore_config[i].thread_id, NULL,
 				     eal_thread_loop, NULL);
-		if (ret != 0)
-			rte_panic("Cannot create thread\n");
+		if (ret != 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot create thread\n",
+					__func__);
+			return -1;
+		}
 
 		/* Set thread_name for aid in debugging. */
 		snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN,
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 08e150b..3afcee5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -51,16 +51,22 @@
 	n = 0;
 	while (n == 0 || (n < 0 && errno == EINTR))
 		n = write(m2s, &c, 1);
-	if (n < 0)
-		rte_panic("cannot write on configuration pipe\n");
+	if (n < 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot write on configuration pipe\n",
+				__func__);
+		return -1;
+	}
 
 	/* wait ack */
 	do {
 		n = read(s2m, &c, 1);
 	} while (n < 0 && errno == EINTR);
 
-	if (n <= 0)
-		rte_panic("cannot read on configuration pipe\n");
+	if (n <= 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot read on configuration pipe\n",
+				__func__);
+		return -1;
+	}
 
 	return 0;
 }
@@ -84,8 +90,19 @@ void eal_thread_init_master(unsigned lcore_id)
 	RTE_PER_LCORE(_lcore_id) = lcore_id;
 
 	/* set CPU affinity */
-	if (eal_thread_set_affinity() < 0)
-		rte_panic("cannot set affinity\n");
+	if (eal_thread_set_affinity() < 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot set affinity\n", __func__);
+		rte_move_to_panic_state();
+	}
+}
+
+/* move to panic state and do not return */
+static __attribute__((noreturn)) void
+defunct_and_remain_in_endless_loop(void)
+{
+	rte_move_to_panic_state();
+	while (1)
+		sleep(1);
 }
 
 /* main loop of threads */
@@ -106,8 +123,11 @@ void eal_thread_init_master(unsigned lcore_id)
 		if (thread_id == lcore_config[lcore_id].thread_id)
 			break;
 	}
-	if (lcore_id == RTE_MAX_LCORE)
-		rte_panic("cannot retrieve lcore id\n");
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot retrieve lcore id\n",
+				__func__);
+		defunct_and_remain_in_endless_loop();
+	}
 
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
@@ -116,8 +136,10 @@ void eal_thread_init_master(unsigned lcore_id)
 	RTE_PER_LCORE(_lcore_id) = lcore_id;
 
 	/* set CPU affinity */
-	if (eal_thread_set_affinity() < 0)
-		rte_panic("cannot set affinity\n");
+	if (eal_thread_set_affinity() < 0) {
+		RTE_LOG(CRIT, EAL, "%s(): Cannot set affinity\n", __func__);
+		defunct_and_remain_in_endless_loop();
+	}
 
 	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
 
@@ -133,8 +155,11 @@ void eal_thread_init_master(unsigned lcore_id)
 			n = read(m2s, &c, 1);
 		} while (n < 0 && errno == EINTR);
 
-		if (n <= 0)
-			rte_panic("cannot read on configuration pipe\n");
+		if (n <= 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot read on configuration pipe\n",
+					__func__);
+			defunct_and_remain_in_endless_loop();
+		}
 
 		lcore_config[lcore_id].state = RUNNING;
 
@@ -142,11 +167,17 @@ void eal_thread_init_master(unsigned lcore_id)
 		n = 0;
 		while (n == 0 || (n < 0 && errno == EINTR))
 			n = write(s2m, &c, 1);
-		if (n < 0)
-			rte_panic("cannot write on configuration pipe\n");
-
-		if (lcore_config[lcore_id].f == NULL)
-			rte_panic("NULL function pointer\n");
+		if (n < 0) {
+			RTE_LOG(CRIT, EAL, "%s(): Cannot write on configuration pipe\n",
+					__func__);
+			defunct_and_remain_in_endless_loop();
+		}
+
+		if (lcore_config[lcore_id].f == NULL) {
+			RTE_LOG(CRIT, EAL, "%s(): NULL function pointer\n",
+					__func__);
+			defunct_and_remain_in_endless_loop();
+		}
 
 		/* call the function and store the return value */
 		fct_arg = lcore_config[lcore_id].arg;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v4 11/11] devtools: prevent new instances of rte_panic and rte_exit
  2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
                   ` (9 preceding siblings ...)
  2018-04-19  6:01 ` [PATCH v4 10/11] eal: replace rte_panic instances in init sequence Arnon Warshavsky
@ 2018-04-19  6:01 ` Arnon Warshavsky
  2018-04-19 17:52   ` Aaron Conole
  10 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19  6:01 UTC (permalink / raw)
  To: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit
  Cc: dev, arnon

This patch adds a new function that is called
per every checked patch,
and alerts for new instances of rte_panic/rte_exit.
The check excludes comments, and alerts in the case
of a positive balance between additions and removals.

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 devtools/checkpatches.sh | 94 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 93 insertions(+), 1 deletion(-)

diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 7676a6b..fb37838 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -61,6 +61,90 @@ print_usage () {
 	END_OF_HELP
 }
 
+check_forbidden_additions() { # <file>
+    # ---------------------------------
+    #This awk script receives a list of expressions to monitor
+    #and a list of folders to search these expressions in
+    # - No search is done inside comments
+    # - Both additions and removals of the expressions are checked
+    #   A positive balance of additions fails the check
+	read -d '' awk_script << 'EOF'
+	BEGIN{
+		split(FOLDERS,deny_folders," ");
+		split(EXPRESSIONS,deny_expr," ");
+		in_file=0;
+		in_comment=0;
+		count=0;
+		comment_start="/*"
+		comment_end="*/"
+    }
+    # search for add/remove instances in current file
+    # state machine assumes the comments structure is enforced by
+    # checkpatches.pl
+    (in_file) {
+		# comment start
+		if (index($0,comment_start) > 0){
+			in_comment = 1
+		}
+		# non comment code
+		if (in_comment == 0) {
+			for (i in deny_expr) {
+				forbidden_added = "^\+.*" deny_expr[i];
+				forbidden_removed="^-.*" deny_expr[i];
+				current = expressions[deny_expr[i]]
+				if ($0 ~ forbidden_added) {
+					count = count + 1;
+					expressions[deny_expr[i]] = current + 1
+				}
+				if ($0 ~ forbidden_removed) {
+					count = count - 1;
+					expressions[deny_expr[i]] = current - 1
+				}
+			}
+		}
+
+		# comment end
+		if (index($0,comment_end) > 0)  {
+			in_comment = 0
+		}
+    }
+	# switch to next file , check if the balance of add/remove
+	# of previous filehad new additions
+	($0 ~ "^\+\+\+ b/") {
+		in_file = 0;
+		if (count > 0){
+			exit;
+		}
+		for (i in deny_folders){
+			re = "^\+\+\+ b/" deny_folders[i];
+			if ($0 ~ deny_folders[i]) {
+				in_file = 1
+				last_file = $0
+			}
+		}
+	}
+	END{
+		if (count > 0){
+			print "Forbidden expressions in " substr(last_file,6) ":"
+			for (key in expressions) {
+				if (expressions[key] > 0) {
+					print key
+				}
+			}
+			exit 1
+		}
+	}
+EOF
+	# ---------------------------------
+
+	# refrain from new additions of rte_panic() and rte_exit()
+	# under lib and net
+	# multiple folders and expressions are separated by spaces
+	awk -v FOLDERS="lib net" \
+		-v EXPRESSIONS="rte_panic\\\( rte_exit\\\(" \
+		"$awk_script" -
+}
+
 number=0
 quiet=false
 verbose=false
@@ -89,11 +173,19 @@ check () { # <patch> <commit> <title>
 	total=$(($total + 1))
 	! $verbose || printf '\n### %s\n\n' "$3"
 	if [ -n "$1" ] ; then
+		cat "$1" | check_forbidden_additions
+		[ $? -eq 0 ] || return 0
 		report=$($DPDK_CHECKPATCH_PATH $options "$1" 2>/dev/null)
 	elif [ -n "$2" ] ; then
-		report=$(git format-patch --find-renames --no-stat --stdout -1 $commit |
+		params=$(echo "--find-renames --no-stat --stdout -1")
+		body=$(git format-patch $params $commit)
+		echo "$body" | check_forbidden_additions
+		[ $? -eq 0 ] || return 0
+		report=$(echo "$body" |
 			$DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
 	else
+		check_forbidden_additions -
+		[ $? -eq 0 ] || return 0
 		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
 	fi
 	[ $? -ne 0 ] || return 0
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver
  2018-04-19  6:00 ` [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver Arnon Warshavsky
@ 2018-04-19 10:53   ` Trahe, Fiona
  2018-04-19 13:49     ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Trahe, Fiona @ 2018-04-19 10:53 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, Burakov, Anatoly, Lu, Wenzhuo, Doherty,
	Declan, jerin.jacob, Richardson, Bruce, Yigit, Ferruh
  Cc: dev

Hi Arnon,
Can you change subject to crypto/dpaa:... please as it's only affecting that driver.
Fiona

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Arnon Warshavsky
> Sent: Thursday, April 19, 2018 7:01 AM
> To: thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
> jerin.jacob@caviumnetworks.com; Richardson, Bruce <bruce.richardson@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>
> Cc: dev@dpdk.org; arnon@qwilt.com
> Subject: [dpdk-dev] [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver
> 
> replace panic calls with log and return value.
> 
> --
> v2:
> - reformat error message to include literal string in a single line
> v4: replace -1 return value with -ENOMEM
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 8 +++++---
>  drivers/crypto/dpaa_sec/dpaa_sec.c          | 8 +++++---
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> index 23012e3..d465a2d 100644
> --- a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> +++ b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> @@ -2861,9 +2861,11 @@ struct rte_security_ops dpaa2_sec_security_ops = {
>  					RTE_CACHE_LINE_SIZE,
>  					rte_socket_id());
> 
> -		if (cryptodev->data->dev_private == NULL)
> -			rte_panic("Cannot allocate memzone for private "
> -				  "device data");
> +		if (cryptodev->data->dev_private == NULL) {
> +			RTE_LOG(ERR, PMD, "%s() Cannot allocate memzone for private device data",
> +			__func__);
> +			return -ENOMEM;
> +		}
>  	}
> 
>  	dpaa2_dev->cryptodev = cryptodev;
> diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
> index b685220..7b63650 100644
> --- a/drivers/crypto/dpaa_sec/dpaa_sec.c
> +++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
> @@ -2356,9 +2356,11 @@ struct rte_security_ops dpaa_sec_security_ops = {
>  					RTE_CACHE_LINE_SIZE,
>  					rte_socket_id());
> 
> -		if (cryptodev->data->dev_private == NULL)
> -			rte_panic("Cannot allocate memzone for private "
> -					"device data");
> +		if (cryptodev->data->dev_private == NULL) {
> +			RTE_LOG(ERR, PMD, "%s() Cannot allocate memzone for private device data",
> +			__func__);
> +			return -ENOMEM;
> +		}
>  	}
> 
>  	dpaa_dev->crypto_dev = cryptodev;
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver
  2018-04-19 10:53   ` Trahe, Fiona
@ 2018-04-19 13:49     ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19 13:49 UTC (permalink / raw)
  To: Trahe, Fiona
  Cc: thomas, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Richardson, Bruce, Yigit, Ferruh, dev

Sure.Will change that in v5

On Thu, Apr 19, 2018 at 1:53 PM, Trahe, Fiona <fiona.trahe@intel.com> wrote:

> Hi Arnon,
> Can you change subject to crypto/dpaa:... please as it's only affecting
> that driver.
> Fiona
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Arnon Warshavsky
> > Sent: Thursday, April 19, 2018 7:01 AM
> > To: thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>;
> Lu, Wenzhuo
> > <wenzhuo.lu@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
> > jerin.jacob@caviumnetworks.com; Richardson, Bruce <
> bruce.richardson@intel.com>; Yigit, Ferruh
> > <ferruh.yigit@intel.com>
> > Cc: dev@dpdk.org; arnon@qwilt.com
> > Subject: [dpdk-dev] [PATCH v4 01/11] crypto: replace rte_panic instances
> in crypto driver
> >
> > replace panic calls with log and return value.
> >
> > --
> > v2:
> > - reformat error message to include literal string in a single line
> > v4: replace -1 return value with -ENOMEM
> >
> > Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> > ---
> >  drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 8 +++++---
> >  drivers/crypto/dpaa_sec/dpaa_sec.c          | 8 +++++---
> >  2 files changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> > b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> > index 23012e3..d465a2d 100644
> > --- a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> > +++ b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> > @@ -2861,9 +2861,11 @@ struct rte_security_ops dpaa2_sec_security_ops = {
> >                                       RTE_CACHE_LINE_SIZE,
> >                                       rte_socket_id());
> >
> > -             if (cryptodev->data->dev_private == NULL)
> > -                     rte_panic("Cannot allocate memzone for private "
> > -                               "device data");
> > +             if (cryptodev->data->dev_private == NULL) {
> > +                     RTE_LOG(ERR, PMD, "%s() Cannot allocate memzone
> for private device data",
> > +                     __func__);
> > +                     return -ENOMEM;
> > +             }
> >       }
> >
> >       dpaa2_dev->cryptodev = cryptodev;
> > diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c
> b/drivers/crypto/dpaa_sec/dpaa_sec.c
> > index b685220..7b63650 100644
> > --- a/drivers/crypto/dpaa_sec/dpaa_sec.c
> > +++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
> > @@ -2356,9 +2356,11 @@ struct rte_security_ops dpaa_sec_security_ops = {
> >                                       RTE_CACHE_LINE_SIZE,
> >                                       rte_socket_id());
> >
> > -             if (cryptodev->data->dev_private == NULL)
> > -                     rte_panic("Cannot allocate memzone for private "
> > -                                     "device data");
> > +             if (cryptodev->data->dev_private == NULL) {
> > +                     RTE_LOG(ERR, PMD, "%s() Cannot allocate memzone
> for private device data",
> > +                     __func__);
> > +                     return -ENOMEM;
> > +             }
> >       }
> >
> >       dpaa_dev->crypto_dev = cryptodev;
> > --
> > 1.8.3.1
>
>


-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info
  2018-04-19  6:01 ` [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info Arnon Warshavsky
@ 2018-04-19 14:03   ` Burakov, Anatoly
  2018-04-19 14:09     ` Arnon Warshavsky
  2018-04-19 14:36   ` Kevin Traynor
  1 sibling, 1 reply; 47+ messages in thread
From: Burakov, Anatoly @ 2018-04-19 14:03 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, wenzhuo.lu, declan.doherty,
	jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 19-Apr-18 7:01 AM, Arnon Warshavsky wrote:
> replace panic calls with log and retrun value.

typo: return

> 
> v4
> static size calculation function changed to return success/fail code
> in addition to filling  the size result.
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---

Please do not add version notes to commit message. You might want to use 
"git notes add" for this, or manually edit the patch and add notes after 
"---" before sending.

On patch contents,

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info
  2018-04-19 14:03   ` Burakov, Anatoly
@ 2018-04-19 14:09     ` Arnon Warshavsky
  2018-04-19 14:45       ` Burakov, Anatoly
  0 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19 14:09 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

Thanks Anatoly. Will fix that in v5.
Is it preferred to keep all version notes in the cover letter alone?

On Thu, Apr 19, 2018 at 5:03 PM, Burakov, Anatoly <anatoly.burakov@intel.com
> wrote:

> On 19-Apr-18 7:01 AM, Arnon Warshavsky wrote:
>
>> replace panic calls with log and retrun value.
>>
>
> typo: return
>
>
>> v4
>> static size calculation function changed to return success/fail code
>> in addition to filling  the size result.
>>
>> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
>> ---
>>
>
> Please do not add version notes to commit message. You might want to use
> "git notes add" for this, or manually edit the patch and add notes after
> "---" before sending.
>
> On patch contents,
>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> --
> Thanks,
> Anatoly
>



-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info
  2018-04-19  6:01 ` [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info Arnon Warshavsky
  2018-04-19 14:03   ` Burakov, Anatoly
@ 2018-04-19 14:36   ` Kevin Traynor
  2018-04-20 13:12     ` Arnon Warshavsky
  1 sibling, 1 reply; 47+ messages in thread
From: Kevin Traynor @ 2018-04-19 14:36 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, anatoly.burakov, wenzhuo.lu,
	declan.doherty, jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> replace panic calls with log and retrun value.
> 
> v4
> static size calculation function changed to return success/fail code
> in addition to filling  the size result.
> 

fyi - this patch doesn't apply on master branch without fuzz

> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 32 ++++++++++++++++---------
>  1 file changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> index fb4b667..debae32 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> @@ -145,8 +145,8 @@
>  	return num_pages;
>  }
>  
> -static uint64_t
> -get_default_hp_size(void)
> +static int
> +get_default_hp_size(uint64_t *result)
>  {
>  	const char proc_meminfo[] = "/proc/meminfo";
>  	const char str_hugepagesz[] = "Hugepagesize:";
> @@ -155,8 +155,11 @@
>  	unsigned long long size = 0;
>  
>  	FILE *fd = fopen(proc_meminfo, "r");
> -	if (fd == NULL)
> -		rte_panic("Cannot open %s\n", proc_meminfo);
> +	if (fd == NULL) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot open %s\n",
> +				__func__, proc_meminfo);
> +		return -1;
> +	}
>  	while(fgets(buffer, sizeof(buffer), fd)){
>  		if (strncmp(buffer, str_hugepagesz, hugepagesz_len) == 0){
>  			size = rte_str_to_size(&buffer[hugepagesz_len]);
> @@ -164,9 +167,13 @@
>  		}
>  	}
>  	fclose(fd);
> -	if (size == 0)
> -		rte_panic("Cannot get default hugepage size from %s\n", proc_meminfo);
> -	return size;
> +	if (size == 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot get default hugepage size from %s\n",
> +						 __func__, proc_meminfo);
> +		return -1;
> +	}
> +	*result = size;
> +	return 0;
>  }
>  
>  static const char *
> @@ -191,11 +198,14 @@
>  	char *retval = NULL;
>  
>  	FILE *fd = fopen(proc_mounts, "r");
> -	if (fd == NULL)
> -		rte_panic("Cannot open %s\n", proc_mounts);
> +	if (fd == NULL) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot open %s\n",
> +				__func__, proc_mounts);
> +		return NULL;
> +	}
>  
> -	if (default_size == 0)
> -		default_size = get_default_hp_size();
> +	if ((default_size == 0) && (get_default_hp_size(&default_size) != 0))
> +		return NULL;
>  
>  	while (fgets(buf, sizeof(buf), fd)){
>  		if (rte_strsplit(buf, sizeof(buf), splitstr, _FIELDNAME_MAX,
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19  6:01 ` [PATCH v4 10/11] eal: replace rte_panic instances in init sequence Arnon Warshavsky
@ 2018-04-19 14:39   ` Burakov, Anatoly
  2018-04-19 14:48     ` Arnon Warshavsky
  2018-04-19 17:48   ` Aaron Conole
  1 sibling, 1 reply; 47+ messages in thread
From: Burakov, Anatoly @ 2018-04-19 14:39 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, wenzhuo.lu, declan.doherty,
	jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 19-Apr-18 7:01 AM, Arnon Warshavsky wrote:
> Local functions to this file,
> changing from void to int are non-abi-breaking.
> For handling the single function that cannot
> change from void to int due to abi,
> where this is the only place it is called in,
> I added a state variable that is being checked
> right after the call to this function.

A rewrite of commit message is in order, i think :) Something like this:

Change some functions' return type from void to int. This will not break 
ABI because they are internal only.

(see below for comments on lcore changes)

> 
> --
> 
> v4 - fix split literal strings in log messages
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>

Again, please do not add patch/version notes to the commit message, put 
them after "---". Version history is not for commit messages, it's for 
people reviewing it before merge.

> ---
>   lib/librte_eal/bsdapp/eal/eal.c           |  86 ++++++++++++++-------
>   lib/librte_eal/bsdapp/eal/eal_thread.c    |  65 +++++++++++-----
>   lib/librte_eal/common/eal_common_launch.c |  21 ++++++
>   lib/librte_eal/common/include/rte_debug.h |  12 +++
>   lib/librte_eal/linuxapp/eal/eal.c         | 120 ++++++++++++++++++++----------
>   lib/librte_eal/linuxapp/eal/eal_thread.c  |  65 +++++++++++-----
>   6 files changed, 270 insertions(+), 99 deletions(-)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> index d996190..9c2f6f1 100644
> --- a/lib/librte_eal/bsdapp/eal/eal.c
> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> @@ -151,7 +151,7 @@ enum rte_iova_mode
>    * We also don't lock the whole file, so that in future we can use read-locks
>    * on other parts, e.g. memzones, to detect if there are running secondary
>    * processes. */
> -static void
> +static int

<...>

> +
> +/* move to panic state and do not return */
> +static __attribute__((noreturn)) void
> +defunct_and_remain_in_endless_loop(void)
> +{
> +	rte_move_to_panic_state();
> +	while (1)
> +		sleep(1);
>   }

It seems like you're mixing two different patchsets here. Maybe it would 
be beneficial to put lcore changes in a separate patch? Technically, 
rte_panic's in lcore are not part of init sequence.

(also, should panic state be volatile?)

>   
>   /* main loop of threads */
> @@ -106,8 +123,11 @@ void eal_thread_init_master(unsigned lcore_id)
>   		if (thread_id == lcore_config[lcore_id].thread_id)
>   			break;
>   	}
> -	if (lcore_id == RTE_MAX_LCORE)
> -		rte_panic("cannot retrieve lcore id\n");
> +	if (lcore_id == RTE_MAX_LCORE) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot retrieve lcore id\n",


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info
  2018-04-19 14:09     ` Arnon Warshavsky
@ 2018-04-19 14:45       ` Burakov, Anatoly
  2018-04-19 14:50         ` Burakov, Anatoly
  0 siblings, 1 reply; 47+ messages in thread
From: Burakov, Anatoly @ 2018-04-19 14:45 UTC (permalink / raw)
  To: Arnon Warshavsky
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

On 19-Apr-18 3:09 PM, Arnon Warshavsky wrote:
> Thanks Anatoly. Will fix that in v5.
> Is it preferred to keep all version notes in the cover letter alone?
> 

Generally, cover letter should give general outline (i.e. "fixed 32 bit 
compile"), while notes for individual patches should be more specific 
about the changes between versions (but not too specific, i.e. don't do 
"change variable X on line 100 to be Y").

So, whatever you think gets your point across best. Not all changes 
deserve to be called out in the cover letter.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19 14:39   ` Burakov, Anatoly
@ 2018-04-19 14:48     ` Arnon Warshavsky
  2018-04-19 14:57       ` Burakov, Anatoly
  0 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-19 14:48 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

Copy on the commit message  and volatile.

Regarding the new function defunct_and_remain_in_endless_loop ()
I don't think I can put that in a separate patch without breaking the
current patch independence.


On Thu, Apr 19, 2018 at 5:39 PM, Burakov, Anatoly <anatoly.burakov@intel.com
> wrote:

> On 19-Apr-18 7:01 AM, Arnon Warshavsky wrote:
>
>> Local functions to this file,
>> changing from void to int are non-abi-breaking.
>> For handling the single function that cannot
>> change from void to int due to abi,
>> where this is the only place it is called in,
>> I added a state variable that is being checked
>> right after the call to this function.
>>
>
> A rewrite of commit message is in order, i think :) Something like this:
>
> Change some functions' return type from void to int. This will not break
> ABI because they are internal only.
>
> (see below for comments on lcore changes)
>
>
>> --
>>
>> v4 - fix split literal strings in log messages
>>
>> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
>>
>
> Again, please do not add patch/version notes to the commit message, put
> them after "---". Version history is not for commit messages, it's for
> people reviewing it before merge.
>
> ---
>>   lib/librte_eal/bsdapp/eal/eal.c           |  86 ++++++++++++++-------
>>   lib/librte_eal/bsdapp/eal/eal_thread.c    |  65 +++++++++++-----
>>   lib/librte_eal/common/eal_common_launch.c |  21 ++++++
>>   lib/librte_eal/common/include/rte_debug.h |  12 +++
>>   lib/librte_eal/linuxapp/eal/eal.c         | 120
>> ++++++++++++++++++++----------
>>   lib/librte_eal/linuxapp/eal/eal_thread.c  |  65 +++++++++++-----
>>   6 files changed, 270 insertions(+), 99 deletions(-)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/eal.c
>> b/lib/librte_eal/bsdapp/eal/eal.c
>> index d996190..9c2f6f1 100644
>> --- a/lib/librte_eal/bsdapp/eal/eal.c
>> +++ b/lib/librte_eal/bsdapp/eal/eal.c
>> @@ -151,7 +151,7 @@ enum rte_iova_mode
>>    * We also don't lock the whole file, so that in future we can use
>> read-locks
>>    * on other parts, e.g. memzones, to detect if there are running
>> secondary
>>    * processes. */
>> -static void
>> +static int
>>
>
> <...>
>
> +
>> +/* move to panic state and do not return */
>> +static __attribute__((noreturn)) void
>> +defunct_and_remain_in_endless_loop(void)
>> +{
>> +       rte_move_to_panic_state();
>> +       while (1)
>> +               sleep(1);
>>   }
>>
>
> It seems like you're mixing two different patchsets here. Maybe it would
> be beneficial to put lcore changes in a separate patch? Technically,
> rte_panic's in lcore are not part of init sequence.
>
> (also, should panic state be volatile?)
>
>
>     /* main loop of threads */
>> @@ -106,8 +123,11 @@ void eal_thread_init_master(unsigned lcore_id)
>>                 if (thread_id == lcore_config[lcore_id].thread_id)
>>                         break;
>>         }
>> -       if (lcore_id == RTE_MAX_LCORE)
>> -               rte_panic("cannot retrieve lcore id\n");
>> +       if (lcore_id == RTE_MAX_LCORE) {
>> +               RTE_LOG(CRIT, EAL, "%s(): Cannot retrieve lcore id\n",
>>
>
>
> --
> Thanks,
> Anatoly
>



-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info
  2018-04-19 14:45       ` Burakov, Anatoly
@ 2018-04-19 14:50         ` Burakov, Anatoly
  2018-04-20 13:11           ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Burakov, Anatoly @ 2018-04-19 14:50 UTC (permalink / raw)
  To: Arnon Warshavsky
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

On 19-Apr-18 3:45 PM, Burakov, Anatoly wrote:
> On 19-Apr-18 3:09 PM, Arnon Warshavsky wrote:
>> Thanks Anatoly. Will fix that in v5.
>> Is it preferred to keep all version notes in the cover letter alone?
>>
> 
> Generally, cover letter should give general outline (i.e. "fixed 32 bit 
> compile"), while notes for individual patches should be more specific 
> about the changes between versions (but not too specific, i.e. don't do 
> "change variable X on line 100 to be Y").
> 
> So, whatever you think gets your point across best. Not all changes 
> deserve to be called out in the cover letter.
> 
Just to be clear:

With my initial reply, i did not mean "patch notes should be in the 
cover letter". What i meant was that you have put your version changes 
"e.g. v4 - changed this to that" into the commit message.

What you should have done is put your patch notes after the commit 
message, like this:

replace panic calls with log and retrun value.

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---

v4
static size calculation function changed to return success/fail code
in addition to filling  the size result.


  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 32 
++++++++++++++++---------
  1 file changed, 21 insertions(+), 11 deletions(-)

Note that the v4 comments are after the "---" - this is where the commit 
message ends as far as git concerned, so you can put your notes there.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19 14:48     ` Arnon Warshavsky
@ 2018-04-19 14:57       ` Burakov, Anatoly
  2018-04-19 17:31         ` Kevin Traynor
  2018-04-20 13:31         ` Arnon Warshavsky
  0 siblings, 2 replies; 47+ messages in thread
From: Burakov, Anatoly @ 2018-04-19 14:57 UTC (permalink / raw)
  To: Arnon Warshavsky
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

On 19-Apr-18 3:48 PM, Arnon Warshavsky wrote:
> Copy on the commit message  and volatile.
> 
> Regarding the new function defunct_and_remain_in_endless_loop ()
> I don't think I can put that in a separate patch without breaking the 
> current patch independence.

How so?

Just leave some panic instances in there for thread-related stuff and 
fix them up in the next patch.

Also, i'm not sure sending threads into an infinite loop on panic is 
such a good idea. You might want to look at Olivier's approach [1] to 
creating threads, using pthread_barriers and pthread_kill/cancel.

This does warrant a separate patch now :)

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 02/11] bond: replace rte_panic instances in bonding driver
  2018-04-19  6:01 ` [PATCH v4 02/11] bond: replace rte_panic instances in bonding driver Arnon Warshavsky
@ 2018-04-19 17:25   ` Kevin Traynor
  2018-04-20 13:13     ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Kevin Traynor @ 2018-04-19 17:25 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, anatoly.burakov, wenzhuo.lu,
	declan.doherty, jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> replace panic calls with log and retrun value.
> Local functions to this file,
> changing from void to int are non-abi-breaking
> --
> v4 - fix split literal strings in log messages
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  drivers/net/bonding/rte_eth_bond_8023ad.c         | 28 +++++++++++++++--------
>  drivers/net/bonding/rte_eth_bond_8023ad_private.h |  2 +-
>  drivers/net/bonding/rte_eth_bond_api.c            | 20 +++++++++++-----
>  drivers/net/bonding/rte_eth_bond_pmd.c            |  9 +++++---
>  drivers/net/bonding/rte_eth_bond_private.h        |  2 +-
>  5 files changed, 40 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
> index c452318..7512901 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> @@ -893,7 +893,7 @@
>  			bond_mode_8023ad_periodic_cb, arg);
>  }
>  
> -void
> +int
>  bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev,
>  				uint16_t slave_id)
>  {
> @@ -939,7 +939,7 @@
>  	timer_cancel(&port->warning_timer);
>  
>  	if (port->mbuf_pool != NULL)
> -		return;
> +		return 0;
>  
>  	RTE_ASSERT(port->rx_ring == NULL);
>  	RTE_ASSERT(port->tx_ring == NULL);
> @@ -968,8 +968,9 @@
>  	/* Any memory allocation failure in initialization is critical because
>  	 * resources can't be free, so reinitialization is impossible. */
>  	if (port->mbuf_pool == NULL) {
> -		rte_panic("Slave %u: Failed to create memory pool '%s': %s\n",
> -			slave_id, mem_name, rte_strerror(rte_errno));
> +		RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create memory pool '%s': %s\n",
> +			__func__, slave_id, mem_name, rte_strerror(rte_errno));
> +		return -1;
>  	}
>  
>  	snprintf(mem_name, RTE_DIM(mem_name), "slave_%u_rx", slave_id);
> @@ -977,8 +978,9 @@
>  			rte_align32pow2(BOND_MODE_8023AX_SLAVE_RX_PKTS), socket_id, 0);
>  
>  	if (port->rx_ring == NULL) {
> -		rte_panic("Slave %u: Failed to create rx ring '%s': %s\n", slave_id,
> -			mem_name, rte_strerror(rte_errno));
> +		RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create rx ring '%s': %s\n",
> +			__func__, slave_id, mem_name, rte_strerror(rte_errno));
> +		return -1;
>  	}
>  
>  	/* TX ring is at least one pkt longer to make room for marker packet. */
> @@ -987,9 +989,12 @@
>  			rte_align32pow2(BOND_MODE_8023AX_SLAVE_TX_PKTS + 1), socket_id, 0);
>  
>  	if (port->tx_ring == NULL) {
> -		rte_panic("Slave %u: Failed to create tx ring '%s': %s\n", slave_id,
> -			mem_name, rte_strerror(rte_errno));
> +		RTE_LOG(ERR, PMD, "%s() Slave %u: Fail to create tx ring '%s': %s\n",
> +			__func__, slave_id, mem_name, rte_strerror(rte_errno));
> +		return -1;
>  	}
> +
> +	return 0;
>  }
>  
>  int
> @@ -1143,9 +1148,12 @@
>  	struct bond_dev_private *internals = bond_dev->data->dev_private;
>  	uint8_t i;
>  
> -	for (i = 0; i < internals->active_slave_count; i++)
> -		bond_mode_8023ad_activate_slave(bond_dev,
> +	for (i = 0; i < internals->active_slave_count; i++) {
> +		int rc = bond_mode_8023ad_activate_slave(bond_dev,
>  				internals->active_slaves[i]);
> +		if (rc != 0)
> +			return rc;
> +	}
>  
>  	return 0;
>  }
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad_private.h b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
> index 0f490a5..96a42f2 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad_private.h
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
> @@ -263,7 +263,7 @@ struct mode8023ad_private {
>   * @return
>   *  0 on success, negative value otherwise.
>   */
> -void
> +int
>  bond_mode_8023ad_activate_slave(struct rte_eth_dev *dev, uint16_t port_id);
>  
>  /**
> diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c
> index aa89425..96aa1ff 100644
> --- a/drivers/net/bonding/rte_eth_bond_api.c
> +++ b/drivers/net/bonding/rte_eth_bond_api.c
> @@ -69,14 +69,15 @@
>  	return 0;
>  }
>  
> -void
> +int
>  activate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id)
>  {
>  	struct bond_dev_private *internals = eth_dev->data->dev_private;
>  	uint8_t active_count = internals->active_slave_count;
>  
>  	if (internals->mode == BONDING_MODE_8023AD)
> -		bond_mode_8023ad_activate_slave(eth_dev, port_id);
> +		if (bond_mode_8023ad_activate_slave(eth_dev, port_id) != 0)
> +			return -1;
>  
>  	if (internals->mode == BONDING_MODE_TLB
>  			|| internals->mode == BONDING_MODE_ALB) {
> @@ -357,10 +358,17 @@
>  				bond_ethdev_primary_set(internals,
>  							slave_port_id);
>  
> -			if (find_slave_by_id(internals->active_slaves,
> -					     internals->active_slave_count,
> -					     slave_port_id) == internals->active_slave_count)
> -				activate_slave(bonded_eth_dev, slave_port_id);
> +			int rc =

There's no need for the rc variables, the existing check would suffice here

> +				find_slave_by_id(internals->active_slaves,
> +					internals->active_slave_count,
> +					slave_port_id);
> +
> +			if (rc == internals->active_slave_count) {
> +				int rc = activate_slave(bonded_eth_dev,
> +							slave_port_id);
> +				if (rc != 0)
> +					return -1;
and this could be

if (activate_slave(bonded_eth_dev, slave_port_id))
	return -1;

> +			}
>  		}
>  	}
>  
> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
> index 2805c71..2d9052d 100644
> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> @@ -1741,8 +1741,10 @@ struct bwg_slave {
>  		/* Any memory allocation failure in initialization is critical because
>  		 * resources can't be free, so reinitialization is impossible. */
>  		if (port->slow_pool == NULL) {
> -			rte_panic("Slave %u: Failed to create memory pool '%s': %s\n",
> -				slave_id, mem_name, rte_strerror(rte_errno));
> +			RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create memory pool '%s': %s\n",
> +				__func__, slave_id,
> +				mem_name, rte_strerror(rte_errno));
> +			return -1;
>  		}
>  	}
>  
> @@ -2673,7 +2675,8 @@ struct bwg_slave {
>  			mac_address_slaves_update(bonded_eth_dev);
>  		}
>  
> -		activate_slave(bonded_eth_dev, port_id);
> +		if (activate_slave(bonded_eth_dev, port_id) != 0)
> +			return -1;

it's more consistent with the rest of the function to do,

if(activate_slave(bonded_eth_dev, port_id))
	return rc;

There's other places through the patches where "!= 0" is used but not
really needed

>  
>  		/* If user has defined the primary port then default to using it */
>  		if (internals->user_defined_primary_port &&
> diff --git a/drivers/net/bonding/rte_eth_bond_private.h b/drivers/net/bonding/rte_eth_bond_private.h
> index 94eca88..d99d42c 100644
> --- a/drivers/net/bonding/rte_eth_bond_private.h
> +++ b/drivers/net/bonding/rte_eth_bond_private.h
> @@ -187,7 +187,7 @@ struct bond_dev_private {
>  void
>  deactivate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id);
>  
> -void
> +int
>  activate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id);
>  
>  void
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 03/11] e1000: replace rte_panic instances in e1000 driver
  2018-04-19  6:01 ` [PATCH v4 03/11] e1000: replace rte_panic instances in e1000 driver Arnon Warshavsky
@ 2018-04-19 17:25   ` Kevin Traynor
  2018-04-20 13:14     ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Kevin Traynor @ 2018-04-19 17:25 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, anatoly.burakov, wenzhuo.lu,
	declan.doherty, jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> replace panic calls with log and retrun value.
> Local function to this file,
> changing from void to int is non-abi-breaking
> --
> v4 - keep error message literal string in a singhle line
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  drivers/net/e1000/e1000_ethdev.h |  2 +-
>  drivers/net/e1000/igb_ethdev.c   |  3 ++-
>  drivers/net/e1000/igb_pf.c       | 15 +++++++++------
>  3 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
> index 6354b89..2e527de 100644
> --- a/drivers/net/e1000/e1000_ethdev.h
> +++ b/drivers/net/e1000/e1000_ethdev.h
> @@ -411,7 +411,7 @@ int eth_igb_rss_hash_conf_get(struct rte_eth_dev *dev,
>  /*
>   * misc function prototypes
>   */
> -void igb_pf_host_init(struct rte_eth_dev *eth_dev);
> +int igb_pf_host_init(struct rte_eth_dev *eth_dev);
>  
>  void igb_pf_mbx_process(struct rte_eth_dev *eth_dev);
>  
> diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
> index 9b808a9..4479616 100644
> --- a/drivers/net/e1000/igb_ethdev.c
> +++ b/drivers/net/e1000/igb_ethdev.c
> @@ -833,7 +833,8 @@ static int igb_flex_filter_uninit(struct rte_eth_dev *eth_dev)
>  	}
>  
>  	/* initialize PF if max_vfs not zero */
> -	igb_pf_host_init(eth_dev);
> +	if (igb_pf_host_init(eth_dev) != 0)

You don't need "!= 0"

You need to set "error" here, or else return it from igb_pf_host_init().
We know -ENOMEM is the only error that can be returned from
igb_pf_host_init() but not sure if should assume that.

> +		goto err_late;
>  
>  	ctrl_ext = E1000_READ_REG(hw, E1000_CTRL_EXT);
>  	/* Set PF Reset Done bit so PF/VF Mail Ops can work */
> diff --git a/drivers/net/e1000/igb_pf.c b/drivers/net/e1000/igb_pf.c
> index b9f2e53..ae4b0a4 100644
> --- a/drivers/net/e1000/igb_pf.c
> +++ b/drivers/net/e1000/igb_pf.c
> @@ -63,7 +63,7 @@ int igb_vf_perm_addr_gen(struct rte_eth_dev *dev, uint16_t vf_num)
>  	return 0;
>  }
>  
> -void igb_pf_host_init(struct rte_eth_dev *eth_dev)
> +int igb_pf_host_init(struct rte_eth_dev *eth_dev)
>  {
>  	struct e1000_vf_info **vfinfo =
>  		E1000_DEV_PRIVATE_TO_P_VFDATA(eth_dev->data->dev_private);
> @@ -74,7 +74,7 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
>  
>  	RTE_ETH_DEV_SRIOV(eth_dev).active = 0;
>  	if (0 == (vf_num = dev_num_vf(eth_dev)))
> -		return;
> +		return 0;
>  
>  	if (hw->mac.type == e1000_i350)
>  		nb_queue = 1;
> @@ -82,11 +82,14 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
>  		/* per datasheet, it should be 2, but 1 seems correct */
>  		nb_queue = 1;
>  	else
> -		return;
> +		return 0;
>  
>  	*vfinfo = rte_zmalloc("vf_info", sizeof(struct e1000_vf_info) * vf_num, 0);
> -	if (*vfinfo == NULL)
> -		rte_panic("Cannot allocate memory for private VF data\n");
> +	if (*vfinfo == NULL) {
> +		RTE_LOG(CRIT, PMD, "%s(): Cannot allocate memory for private VF data\n",
> +			__func__);
> +		return -1;
> +	}
>  
>  	RTE_ETH_DEV_SRIOV(eth_dev).active = ETH_8_POOLS;
>  	RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = nb_queue;
> @@ -98,7 +101,7 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
>  	/* set mb interrupt mask */
>  	igb_mb_intr_setup(eth_dev);
>  
> -	return;
> +	return 0;
>  }
>  
>  void igb_pf_host_uninit(struct rte_eth_dev *dev)
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 04/11] ixgbe: replace rte_panic instances in ixgbe driver
  2018-04-19  6:01 ` [PATCH v4 04/11] ixgbe: replace rte_panic instances in ixgbe driver Arnon Warshavsky
@ 2018-04-19 17:26   ` Kevin Traynor
  2018-04-20 13:16     ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Kevin Traynor @ 2018-04-19 17:26 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, anatoly.burakov, wenzhuo.lu,
	declan.doherty, jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> replace panic calls with log and retrun value.

typo return, seems to be in a few commit msgs

> Local function to this file,
> changing from void to int is non-abi-breaking
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c |  3 ++-
>  drivers/net/ixgbe/ixgbe_ethdev.h |  2 +-
>  drivers/net/ixgbe/ixgbe_pf.c     | 13 +++++++++----
>  3 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> index a5e2fc0..c7797f1 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -1224,7 +1224,8 @@ struct rte_ixgbe_xstats_name_off {
>  	memset(hwstrip, 0, sizeof(*hwstrip));
>  
>  	/* initialize PF if max_vfs not zero */
> -	ixgbe_pf_host_init(eth_dev);
> +	if (ixgbe_pf_host_init(eth_dev) != 0)
> +		return -1;

similar comment as previous patch about using an appropriate return value

>  
>  	ctrl_ext = IXGBE_READ_REG(hw, IXGBE_CTRL_EXT);
>  	/* let hardware know driver is loaded */
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
> index 6550777..8bb41ec 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.h
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.h
> @@ -661,7 +661,7 @@ int ixgbe_fdir_filter_program(struct rte_eth_dev *dev,
>  
>  void ixgbe_vlan_hw_strip_config(struct rte_eth_dev *dev);
>  
> -void ixgbe_pf_host_init(struct rte_eth_dev *eth_dev);
> +int ixgbe_pf_host_init(struct rte_eth_dev *eth_dev);
>  
>  void ixgbe_pf_host_uninit(struct rte_eth_dev *eth_dev);
>  
> diff --git a/drivers/net/ixgbe/ixgbe_pf.c b/drivers/net/ixgbe/ixgbe_pf.c
> index 4e61310..4cd3651 100644
> --- a/drivers/net/ixgbe/ixgbe_pf.c
> +++ b/drivers/net/ixgbe/ixgbe_pf.c
> @@ -66,7 +66,7 @@ int ixgbe_vf_perm_addr_gen(struct rte_eth_dev *dev, uint16_t vf_num)
>  	return 0;
>  }
>  
> -void ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
> +int ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
>  {
>  	struct ixgbe_vf_info **vfinfo =
>  		IXGBE_DEV_PRIVATE_TO_P_VFDATA(eth_dev->data->dev_private);
> @@ -84,11 +84,14 @@ void ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
>  	RTE_ETH_DEV_SRIOV(eth_dev).active = 0;
>  	vf_num = dev_num_vf(eth_dev);
>  	if (vf_num == 0)
> -		return;
> +		return 0;
>  
>  	*vfinfo = rte_zmalloc("vf_info", sizeof(struct ixgbe_vf_info) * vf_num, 0);
> -	if (*vfinfo == NULL)
> -		rte_panic("Cannot allocate memory for private VF data\n");
> +	if (*vfinfo == NULL) {
> +		RTE_LOG(ERR, PMD, "%s() Cannot allocate memory for private VF data\n",
> +				__func__);
> +		return -1;
> +	}
>  
>  	memset(mirror_info, 0, sizeof(struct ixgbe_mirror_info));
>  	memset(uta_info, 0, sizeof(struct ixgbe_uta_info));
> @@ -116,6 +119,8 @@ void ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
>  
>  	/* set mb interrupt mask */
>  	ixgbe_mb_intr_setup(eth_dev);
> +
> +	return 0;
>  }
>  
>  void ixgbe_pf_host_uninit(struct rte_eth_dev *eth_dev)
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 05/11] eal: replace rte_panic instances in eventdev
  2018-04-19  6:01 ` [PATCH v4 05/11] eal: replace rte_panic instances in eventdev Arnon Warshavsky
@ 2018-04-19 17:26   ` Kevin Traynor
  2018-04-20 13:17     ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Kevin Traynor @ 2018-04-19 17:26 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, anatoly.burakov, wenzhuo.lu,
	declan.doherty, jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> replace panic calls with log and retrun value.
> 
> --
> v4 - fix split literal strings in log messages
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  lib/librte_eventdev/rte_eventdev_pmd_pci.h  | 8 +++++---
>  lib/librte_eventdev/rte_eventdev_pmd_vdev.h | 8 +++++---
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_eventdev_pmd_pci.h b/lib/librte_eventdev/rte_eventdev_pmd_pci.h
> index 8fb6138..dad2182 100644
> --- a/lib/librte_eventdev/rte_eventdev_pmd_pci.h
> +++ b/lib/librte_eventdev/rte_eventdev_pmd_pci.h
> @@ -66,9 +66,11 @@
>  						RTE_CACHE_LINE_SIZE,
>  						rte_socket_id());
>  
> -		if (eventdev->data->dev_private == NULL)
> -			rte_panic("Cannot allocate memzone for private "
> -					"device data");
> +		if (eventdev->data->dev_private == NULL) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot allocate memzone for private device data",
> +				__func__);
> +			return -1;

return -ENOMEM

> +		}
>  	}
>  
>  	eventdev->dev = &pci_dev->device;
> diff --git a/lib/librte_eventdev/rte_eventdev_pmd_vdev.h b/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
> index 8c64a06..b7c08fa 100644
> --- a/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
> +++ b/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
> @@ -61,9 +61,11 @@
>  						RTE_CACHE_LINE_SIZE,
>  						socket_id);
>  
> -		if (eventdev->data->dev_private == NULL)
> -			rte_panic("Cannot allocate memzone for private device"
> -					" data");
> +		if (eventdev->data->dev_private == NULL) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot allocate memzone for private device data",
> +				__func__);
> +			return NULL;
> +		}
>  	}
>  
>  	return eventdev;
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 08/11] eal: replace rte_panic instances in interrupts thread
  2018-04-19  6:01 ` [PATCH v4 08/11] eal: replace rte_panic instances in interrupts thread Arnon Warshavsky
@ 2018-04-19 17:27   ` Kevin Traynor
  2018-04-20 13:18     ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Kevin Traynor @ 2018-04-19 17:27 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, anatoly.burakov, wenzhuo.lu,
	declan.doherty, jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> replace panic calls with log and retrun value.
> Thread function removes the noretrun attribute.
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_interrupts.c | 27 ++++++++++++++++++++-------
>  1 file changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> index 58e9328..8b8650a 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> @@ -785,7 +785,7 @@ struct rte_intr_source {
>   * @return
>   *  never return;
>   */
> -static __attribute__((noreturn)) void *
> +static void *
>  eal_intr_thread_main(__rte_unused void *arg)
>  {
>  	struct epoll_event ev;
> @@ -803,8 +803,11 @@ static __attribute__((noreturn)) void *
>  
>  		/* create epoll fd */
>  		int pfd = epoll_create(1);
> -		if (pfd < 0)
> -			rte_panic("Cannot create epoll instance\n");
> +		if (pfd < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot create epoll instance\n",
> +					__func__);
> +			return NULL;
> +		}
>  
>  		pipe_event.data.fd = intr_pipe.readfd;
>  		/**
> @@ -813,8 +816,11 @@ static __attribute__((noreturn)) void *
>  		 */
>  		if (epoll_ctl(pfd, EPOLL_CTL_ADD, intr_pipe.readfd,
>  						&pipe_event) < 0) {
> -			rte_panic("Error adding fd to %d epoll_ctl, %s\n",
> +			RTE_LOG(CRIT, EAL, "%s(): Error adding fd to %d "
> +					"epoll_ctl, %s\n",
> +					__func__,
>  					intr_pipe.readfd, strerror(errno));
> +			return NULL;
>  		}
>  		numfds++;
>  
> @@ -831,9 +837,14 @@ static __attribute__((noreturn)) void *
>  			 * into wait list.
>  			 */
>  			if (epoll_ctl(pfd, EPOLL_CTL_ADD,
> -					src->intr_handle.fd, &ev) < 0){
> -				rte_panic("Error adding fd %d epoll_ctl, %s\n",
> -					src->intr_handle.fd, strerror(errno));
> +				src->intr_handle.fd, &ev) < 0) {

The alignment changed here, not sure if it was deliberate

> +				RTE_LOG(CRIT, EAL,
> +						"%s(): Error adding fd %d "
> +						"epoll_ctl, %s\n",
> +						__func__,
> +						src->intr_handle.fd,
> +						strerror(errno));
> +				return NULL;
>  			}
>  			else
>  				numfds++;
> @@ -848,6 +859,8 @@ static __attribute__((noreturn)) void *
>  		 */
>  		close(pfd);
>  	}
> +
> +	return NULL;
>  }
>  
>  int
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 09/11] eal: replace rte_panic instances in ethdev
  2018-04-19  6:01 ` [PATCH v4 09/11] eal: replace rte_panic instances in ethdev Arnon Warshavsky
@ 2018-04-19 17:27   ` Kevin Traynor
  2018-04-20 13:23     ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Kevin Traynor @ 2018-04-19 17:27 UTC (permalink / raw)
  To: Arnon Warshavsky, thomas, anatoly.burakov, wenzhuo.lu,
	declan.doherty, jerin.jacob, bruce.richardson, ferruh.yigit
  Cc: dev

On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> Local function to this file,
> changing from void to int is non-abi-breaking
> 
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  lib/librte_ether/rte_ethdev.c | 36 +++++++++++++++++++++++++-----------
>  1 file changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 7821a88..9c13827 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -194,7 +194,7 @@ enum {
>  	return port_id;
>  }
>  
> -static void
> +static int
>  rte_eth_dev_shared_data_prepare(void)
>  {
>  	const unsigned flags = 0;
> @@ -210,8 +210,12 @@ enum {
>  					rte_socket_id(), flags);
>  		} else
>  			mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
> -		if (mz == NULL)
> -			rte_panic("Cannot allocate ethdev shared data\n");
> +		if (mz == NULL) {
> +			rte_spinlock_unlock(&rte_eth_shared_data_lock);
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot allocate ethdev shared data\n",
> +					__func__);
> +			return -1;
> +		}
>  
>  		rte_eth_dev_shared_data = mz->addr;
>  		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> @@ -224,6 +228,8 @@ enum {
>  	}
>  
>  	rte_spinlock_unlock(&rte_eth_shared_data_lock);
> +
> +	return 0;
>  }
>  
>  struct rte_eth_dev *
> @@ -274,7 +280,8 @@ struct rte_eth_dev *
>  	uint16_t port_id;
>  	struct rte_eth_dev *eth_dev = NULL;
>  
> -	rte_eth_dev_shared_data_prepare();
> +	if (rte_eth_dev_shared_data_prepare() != 0)

Lots of "!= 0"'s - you might gather by now that I don't like them :-)

> +		return NULL;
>  
>  	/* Synchronize port creation between primary and secondary threads. */
>  	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> @@ -317,7 +324,8 @@ struct rte_eth_dev *
>  	uint16_t i;
>  	struct rte_eth_dev *eth_dev = NULL;
>  
> -	rte_eth_dev_shared_data_prepare();
> +	if (rte_eth_dev_shared_data_prepare() != 0)
> +		return NULL;
>  
>  	/* Synchronize port attachment to primary port creation and release. */
>  	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> @@ -345,7 +353,8 @@ struct rte_eth_dev *
>  	if (eth_dev == NULL)
>  		return -EINVAL;
>  
> -	rte_eth_dev_shared_data_prepare();
> +	if (rte_eth_dev_shared_data_prepare() != 0)
> +		return -1;
>  
>  	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
>  
> @@ -399,7 +408,8 @@ struct rte_eth_dev *
>  int __rte_experimental
>  rte_eth_dev_owner_new(uint64_t *owner_id)
>  {
> -	rte_eth_dev_shared_data_prepare();
> +	if (rte_eth_dev_shared_data_prepare() != 0)
> +		return -1;
>  
>  	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
>  
> @@ -450,7 +460,8 @@ struct rte_eth_dev *
>  {
>  	int ret;
>  
> -	rte_eth_dev_shared_data_prepare();
> +	if (rte_eth_dev_shared_data_prepare() != 0)
> +		return -1;
>  
>  	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
>  
> @@ -467,7 +478,8 @@ struct rte_eth_dev *
>  			{.id = RTE_ETH_DEV_NO_OWNER, .name = ""};
>  	int ret;
>  
> -	rte_eth_dev_shared_data_prepare();
> +	if (rte_eth_dev_shared_data_prepare() != 0)
> +		return -1;
>  
>  	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
>  
> @@ -482,7 +494,8 @@ struct rte_eth_dev *
>  {

hmm, I'm wondering should void __rte_experimental
rte_eth_dev_owner_delete change to return an int, now that there is a
fail case and it is still experimental...?

>  	uint16_t port_id;
>  
> -	rte_eth_dev_shared_data_prepare();
> +	if (rte_eth_dev_shared_data_prepare() != 0)
> +		return;
>  
>  	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
>  
> @@ -502,7 +515,8 @@ struct rte_eth_dev *
>  {
>  	int ret = 0;
>  
> -	rte_eth_dev_shared_data_prepare();
> +	if (rte_eth_dev_shared_data_prepare() != 0)
> +		return -1;
>  
>  	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
>  
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19 14:57       ` Burakov, Anatoly
@ 2018-04-19 17:31         ` Kevin Traynor
  2018-04-20 13:32           ` Arnon Warshavsky
  2018-04-20 13:31         ` Arnon Warshavsky
  1 sibling, 1 reply; 47+ messages in thread
From: Kevin Traynor @ 2018-04-19 17:31 UTC (permalink / raw)
  To: Burakov, Anatoly, Arnon Warshavsky
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

On 04/19/2018 03:57 PM, Burakov, Anatoly wrote:
> On 19-Apr-18 3:48 PM, Arnon Warshavsky wrote:
>> Copy on the commit message  and volatile.
>>
>> Regarding the new function defunct_and_remain_in_endless_loop ()
>> I don't think I can put that in a separate patch without breaking the
>> current patch independence.
> 
> How so?
> 
> Just leave some panic instances in there for thread-related stuff and
> fix them up in the next patch.
> 
> Also, i'm not sure sending threads into an infinite loop on panic is
> such a good idea. You might want to look at Olivier's approach [1] to
> creating threads, using pthread_barriers and pthread_kill/cancel.
> 

I haven't reviewed this one yet, but going into an infinite loop doesn't
seem like the right thing to do.

> This does warrant a separate patch now :)
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19  6:01 ` [PATCH v4 10/11] eal: replace rte_panic instances in init sequence Arnon Warshavsky
  2018-04-19 14:39   ` Burakov, Anatoly
@ 2018-04-19 17:48   ` Aaron Conole
  2018-04-20 13:55     ` Arnon Warshavsky
  1 sibling, 1 reply; 47+ messages in thread
From: Aaron Conole @ 2018-04-19 17:48 UTC (permalink / raw)
  To: Arnon Warshavsky
  Cc: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit, dev, Kevin Traynor

Arnon Warshavsky <arnon@qwilt.com> writes:

> Local functions to this file,
> changing from void to int are non-abi-breaking.
> For handling the single function that cannot
> change from void to int due to abi,
> where this is the only place it is called in,
> I added a state variable that is being checked
> right after the call to this function.
>
> --
>
> v4 - fix split literal strings in log messages
>
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---

Hi Arnon,

Always happy to see panic calls get removed.  I have some comments inline.

>  lib/librte_eal/bsdapp/eal/eal.c           |  86 ++++++++++++++-------
>  lib/librte_eal/bsdapp/eal/eal_thread.c    |  65 +++++++++++-----
>  lib/librte_eal/common/eal_common_launch.c |  21 ++++++
>  lib/librte_eal/common/include/rte_debug.h |  12 +++
>  lib/librte_eal/linuxapp/eal/eal.c         | 120 ++++++++++++++++++++----------
>  lib/librte_eal/linuxapp/eal/eal_thread.c  |  65 +++++++++++-----
>  6 files changed, 270 insertions(+), 99 deletions(-)
>
> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> index d996190..9c2f6f1 100644
> --- a/lib/librte_eal/bsdapp/eal/eal.c
> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> @@ -151,7 +151,7 @@ enum rte_iova_mode
>   * We also don't lock the whole file, so that in future we can use read-locks
>   * on other parts, e.g. memzones, to detect if there are running secondary
>   * processes. */
> -static void
> +static int
>  rte_eal_config_create(void)
>  {
>  	void *rte_mem_cfg_addr;
> @@ -160,60 +160,78 @@ enum rte_iova_mode
>  	const char *pathname = eal_runtime_config_path();
>  
>  	if (internal_config.no_shconf)
> -		return;
> +		return 0;
>  
>  	if (mem_cfg_fd < 0){
>  		mem_cfg_fd = open(pathname, O_RDWR | O_CREAT, 0660);
> -		if (mem_cfg_fd < 0)
> -			rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
> +		if (mem_cfg_fd < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot open '%s' for rte_mem_config\n",
> +					__func__, pathname);
> +			return -1;
> +		}
>  	}
>  
>  	retval = ftruncate(mem_cfg_fd, sizeof(*rte_config.mem_config));
>  	if (retval < 0){
>  		close(mem_cfg_fd);
> -		rte_panic("Cannot resize '%s' for rte_mem_config\n", pathname);
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot resize '%s' for rte_mem_config\n",
> +				__func__, pathname);
> +		return -1;

Previously, it wasn't possible for mem_cfg_fd to be reused after a
failure.  Now it is - please reset it to -1. in these close conditions.

>  	}
>  
>  	retval = fcntl(mem_cfg_fd, F_SETLK, &wr_lock);
>  	if (retval < 0){
>  		close(mem_cfg_fd);
> -		rte_exit(EXIT_FAILURE, "Cannot create lock on '%s'. Is another primary "
> -				"process running?\n", pathname);
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot create lock on '%s'. Is another primary process running?\n",
> +				__func__, pathname);
> +		return -1;
>  	}
>  
>  	rte_mem_cfg_addr = mmap(NULL, sizeof(*rte_config.mem_config),
>  				PROT_READ | PROT_WRITE, MAP_SHARED, mem_cfg_fd, 0);
>  
>  	if (rte_mem_cfg_addr == MAP_FAILED){
> -		rte_panic("Cannot mmap memory for rte_config\n");
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config\n",
> +				__func__);
> +		return -1;
>  	}
>  	memcpy(rte_mem_cfg_addr, &early_mem_config, sizeof(early_mem_config));
>  	rte_config.mem_config = rte_mem_cfg_addr;
> +
> +	return 0;
>  }
>  
>  /* attach to an existing shared memory config */
> -static void
> +static int
>  rte_eal_config_attach(void)
>  {
>  	void *rte_mem_cfg_addr;
>  	const char *pathname = eal_runtime_config_path();
>  
>  	if (internal_config.no_shconf)
> -		return;
> +		return 0;
>  
>  	if (mem_cfg_fd < 0){
>  		mem_cfg_fd = open(pathname, O_RDWR);
> -		if (mem_cfg_fd < 0)
> -			rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
> +		if (mem_cfg_fd < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot open '%s' for rte_mem_config\n",
> +					__func__, pathname);
> +			return -1;
> +		}
>  	}
>  
>  	rte_mem_cfg_addr = mmap(NULL, sizeof(*rte_config.mem_config),
>  				PROT_READ | PROT_WRITE, MAP_SHARED, mem_cfg_fd, 0);
>  	close(mem_cfg_fd);

Again, previously this would have aborted on a failure.  So it needs to
be reset to a value that allows retry.

> -	if (rte_mem_cfg_addr == MAP_FAILED)
> -		rte_panic("Cannot mmap memory for rte_config\n");
> +	if (rte_mem_cfg_addr == MAP_FAILED) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config\n",
> +				__func__);
> +		return -1;
> +	}
>  
>  	rte_config.mem_config = rte_mem_cfg_addr;
> +
> +	return 0;
>  }
>  
>  /* Detect if we are a primary or a secondary process */
> @@ -237,23 +255,28 @@ enum rte_proc_type_t
>  }
>  
>  /* Sets up rte_config structure with the pointer to shared memory config.*/
> -static void
> +static int
>  rte_config_init(void)
>  {
>  	rte_config.process_type = internal_config.process_type;
>  
>  	switch (rte_config.process_type){
>  	case RTE_PROC_PRIMARY:
> -		rte_eal_config_create();
> +		if (rte_eal_config_create())
> +			return -1;
>  		break;
>  	case RTE_PROC_SECONDARY:
> -		rte_eal_config_attach();
> +		if (rte_eal_config_attach())
> +			return -1;
>  		rte_eal_mcfg_wait_complete(rte_config.mem_config);
>  		break;
>  	case RTE_PROC_AUTO:
>  	case RTE_PROC_INVALID:

Not for this patch, but I just noticed that this should probably use a
'default' case.

> -		rte_panic("Invalid process type\n");
> +		RTE_LOG(CRIT, EAL, "%s(): Invalid process type %d\n",
> +				__func__, rte_config.process_type);
> +		return -1;
>  	}
> +	return 0;
>  }
>  
>  /* display usage */
> @@ -595,7 +618,8 @@ static void rte_eal_init_alert(const char *msg)
>  
>  	rte_srand(rte_rdtsc());
>  
> -	rte_config_init();
> +	if (rte_config_init() != 0)
> +		return -1;

Use rte_eal_init_alert to indicate why you are failing the init.

>  	if (rte_mp_channel_init() < 0) {
>  		rte_eal_init_alert("failed to init mp channel\n");
> @@ -652,7 +676,8 @@ static void rte_eal_init_alert(const char *msg)
>  
>  	eal_check_mem_on_local_socket();
>  
> -	eal_thread_init_master(rte_config.master_lcore);
> +	if (eal_thread_init_master(rte_config.master_lcore) != 0)
> +		return -1;

Is it ever possible to recover from this?  Still needs
rte_eal_init_alert() call.

>  
>  	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
>  
> @@ -666,18 +691,27 @@ static void rte_eal_init_alert(const char *msg)
>  		 * create communication pipes between master thread
>  		 * and children
>  		 */
> -		if (pipe(lcore_config[i].pipe_master2slave) < 0)
> -			rte_panic("Cannot create pipe\n");
> -		if (pipe(lcore_config[i].pipe_slave2master) < 0)
> -			rte_panic("Cannot create pipe\n");
> +		if (pipe(lcore_config[i].pipe_master2slave) < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot create pipe\n",
> +					__func__);
> +			return -1;
> +		}
> +		if (pipe(lcore_config[i].pipe_slave2master) < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot create pipe\n",
> +					__func__);
> +			return -1;
> +		}

How are you cleaning up the threads that were spawned?  Lets say this
loop will execute 5 times, and on the 3rd entry, these errors happen.
You now leave DPDK 'half-initialized' - you've spun up threads and
allocated memory.

Also, again use rte_eal_init_alert().  It was added for a reason :)

>  
>  		lcore_config[i].state = WAIT;
>  
>  		/* create a thread for each lcore */
>  		ret = pthread_create(&lcore_config[i].thread_id, NULL,
>  				     eal_thread_loop, NULL);
> -		if (ret != 0)
> -			rte_panic("Cannot create thread\n");
> +		if (ret != 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot create thread\n",
> +					__func__);
> +			return -1;
> +		}

Same question as before.  If pthread_create is failing, there are worse
problems than aborting.

>  		/* Set thread_name for aid in debugging. */
>  		snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN,
> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
> index d602daf..5c3947c 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> @@ -51,16 +51,22 @@
>  	n = 0;
>  	while (n == 0 || (n < 0 && errno == EINTR))
>  		n = write(m2s, &c, 1);
> -	if (n < 0)
> -		rte_panic("cannot write on configuration pipe\n");
> +	if (n < 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot write on configuration pipe\n",
> +				__func__);
> +		return -1;
> +	}
>  
>  	/* wait ack */
>  	do {
>  		n = read(s2m, &c, 1);
>  	} while (n < 0 && errno == EINTR);
>  
> -	if (n <= 0)
> -		rte_panic("cannot read on configuration pipe\n");
> +	if (n <= 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot read on configuration pipe\n",
> +				__func__);
> +		return -1;
> +	}
>  
>  	return 0;
>  }
> @@ -84,8 +90,19 @@ void eal_thread_init_master(unsigned lcore_id)
>  	RTE_PER_LCORE(_lcore_id) = lcore_id;
>  
>  	/* set CPU affinity */
> -	if (eal_thread_set_affinity() < 0)
> -		rte_panic("cannot set affinity\n");
> +	if (eal_thread_set_affinity() < 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot set affinity\n", __func__);
> +		rte_move_to_panic_state();
> +	}
> +}
> +
> +/* move to panic state and do not return */
> +static __attribute__((noreturn)) void
> +defunct_and_remain_in_endless_loop(void)
> +{
> +	rte_move_to_panic_state();
> +	while (1)
> +		sleep(1);
>  }

This is worse than a panic.  Users will blame applications for appearing
to freeze.  Please leave the panics in place rather than do this.

>  /* main loop of threads */
> @@ -106,8 +123,11 @@ void eal_thread_init_master(unsigned lcore_id)
>  		if (thread_id == lcore_config[lcore_id].thread_id)
>  			break;
>  	}
> -	if (lcore_id == RTE_MAX_LCORE)
> -		rte_panic("cannot retrieve lcore id\n");
> +	if (lcore_id == RTE_MAX_LCORE) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot retrieve lcore id\n",
> +				__func__);
> +		defunct_and_remain_in_endless_loop();
> +	}

I'm not even sure this check has merit, tbh.  Is there ever a chance for
an lcore thread to be spawned like this?  Probably a better patch would
just remove all the code you've inserted, but keep the check you
removed.

>  	m2s = lcore_config[lcore_id].pipe_master2slave[0];
>  	s2m = lcore_config[lcore_id].pipe_slave2master[1];
> @@ -116,8 +136,10 @@ void eal_thread_init_master(unsigned lcore_id)
>  	RTE_PER_LCORE(_lcore_id) = lcore_id;
>  
>  	/* set CPU affinity */
> -	if (eal_thread_set_affinity() < 0)
> -		rte_panic("cannot set affinity\n");
> +	if (eal_thread_set_affinity() < 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot set affinity\n", __func__);
> +		defunct_and_remain_in_endless_loop();

How does this improve the user experience?

> +	}
>  
>  	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
>  
> @@ -133,8 +155,11 @@ void eal_thread_init_master(unsigned lcore_id)
>  			n = read(m2s, &c, 1);
>  		} while (n < 0 && errno == EINTR);
>  
> -		if (n <= 0)
> -			rte_panic("cannot read on configuration pipe\n");
> +		if (n <= 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot read on configuration pipe\n",
> +					__func__);
> +			defunct_and_remain_in_endless_loop();

Same question.  Actually this could happen on shutdown, I think?  If
there's a race where the pipe is torn down before the thread?  Not sure
if there are any ordering guarantees around that.

> +		}
>  
>  		lcore_config[lcore_id].state = RUNNING;
>  
> @@ -142,11 +167,17 @@ void eal_thread_init_master(unsigned lcore_id)
>  		n = 0;
>  		while (n == 0 || (n < 0 && errno == EINTR))
>  			n = write(s2m, &c, 1);
> -		if (n < 0)
> -			rte_panic("cannot write on configuration pipe\n");
> -
> -		if (lcore_config[lcore_id].f == NULL)
> -			rte_panic("NULL function pointer\n");
> +		if (n < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot write on configuration pipe\n",
> +					__func__);
> +			defunct_and_remain_in_endless_loop();
> +		}
> +
> +		if (lcore_config[lcore_id].f == NULL) {
> +			RTE_LOG(CRIT, EAL, "%s(): NULL function pointer\n",
> +					__func__);
> +			defunct_and_remain_in_endless_loop();
> +		}

I don't see how any of this is better for the user.  In fact, I think
this is worse because it will make portions of the application stop
working without any way to move forward.  rte_panic() will at least give
the process a chance to recover from a potentially ephemeral condition.

>  		/* call the function and store the return value */
>  		fct_arg = lcore_config[lcore_id].arg;
> diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
> index fe0ba3f..6f8bd46 100644
> --- a/lib/librte_eal/common/eal_common_launch.c
> +++ b/lib/librte_eal/common/eal_common_launch.c
> @@ -14,6 +14,7 @@
>  #include <rte_pause.h>
>  #include <rte_per_lcore.h>
>  #include <rte_lcore.h>
> +#include <rte_debug.h>
>  
>  /*
>   * Wait until a lcore finished its job.
> @@ -88,3 +89,23 @@ enum rte_lcore_state_t
>  		rte_eal_wait_lcore(lcore_id);
>  	}
>  }
> +
> +/* panic state */
> +static int _panic_state;
> +
> +/**
> + * Check if the system is in panic state
> + * @return int
> + */
> +int rte_get_panic_state(void)
> +{
> +	return _panic_state;
> +}
> +
> +/**
> + * Move the system to be in panic state
> + */
> +void rte_move_to_panic_state(void)
> +{
> +	_panic_state = 1;
> +}
> diff --git a/lib/librte_eal/common/include/rte_debug.h b/lib/librte_eal/common/include/rte_debug.h
> index 272df49..b421d33 100644
> --- a/lib/librte_eal/common/include/rte_debug.h
> +++ b/lib/librte_eal/common/include/rte_debug.h
> @@ -79,4 +79,16 @@ void __rte_panic(const char *funcname , const char *format, ...)
>  }
>  #endif
>  
> +/**
> + * Check if the system is in panic state
> + * @return int
> + */
> +int rte_get_panic_state(void);
> +
> +/**
> + * Move the system to be in panic state
> + */
> +void rte_move_to_panic_state(void);

This seems to only exist as a way of triggering the run_once check in
the eal_init.  It doesn't add anything except one more state variable to
check against.  What is the purpose?

Further, it seems unrelated to removing panics.

> +
>  #endif /* _RTE_DEBUG_H_ */
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index 21afa73..393441a 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -160,7 +160,7 @@ enum rte_iova_mode
>   * We also don't lock the whole file, so that in future we can use read-locks
>   * on other parts, e.g. memzones, to detect if there are running secondary
>   * processes. */
> -static void
> +static int
>  rte_eal_config_create(void)
>  {
>  	void *rte_mem_cfg_addr;
> @@ -169,7 +169,7 @@ enum rte_iova_mode
>  	const char *pathname = eal_runtime_config_path();
>  
>  	if (internal_config.no_shconf)
> -		return;
> +		return 0;
>  
>  	/* map the config before hugepage address so that we don't waste a page */
>  	if (internal_config.base_virtaddr != 0)
> @@ -179,30 +179,39 @@ enum rte_iova_mode
>  	else
>  		rte_mem_cfg_addr = NULL;
>  
> -	if (mem_cfg_fd < 0){
> +	if (mem_cfg_fd < 0) {
>  		mem_cfg_fd = open(pathname, O_RDWR | O_CREAT, 0660);
> -		if (mem_cfg_fd < 0)
> -			rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
> +		if (mem_cfg_fd < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot open '%s' for rte_mem_config\n",
> +				__func__, pathname);
> +			return -1;
> +		}
>  	}
>  
>  	retval = ftruncate(mem_cfg_fd, sizeof(*rte_config.mem_config));
> -	if (retval < 0){
> +	if (retval < 0) {
>  		close(mem_cfg_fd);
> -		rte_panic("Cannot resize '%s' for rte_mem_config\n", pathname);
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot resize '%s' for rte_mem_config\n",
> +				__func__, pathname);
> +		return -1;
>  	}
>  
>  	retval = fcntl(mem_cfg_fd, F_SETLK, &wr_lock);
> -	if (retval < 0){
> +	if (retval < 0) {
>  		close(mem_cfg_fd);
> -		rte_exit(EXIT_FAILURE, "Cannot create lock on '%s'. Is another primary "
> -				"process running?\n", pathname);
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot create lock on '%s'."
> +				" Is another primary process running?\n",
> +				__func__, pathname);
> +		return -1;
>  	}
>  
>  	rte_mem_cfg_addr = mmap(rte_mem_cfg_addr, sizeof(*rte_config.mem_config),
>  				PROT_READ | PROT_WRITE, MAP_SHARED, mem_cfg_fd, 0);
>  
> -	if (rte_mem_cfg_addr == MAP_FAILED){
> -		rte_panic("Cannot mmap memory for rte_config\n");
> +	if (rte_mem_cfg_addr == MAP_FAILED) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config\n",
> +			__func__);
> +		return -1;
>  	}
>  	memcpy(rte_mem_cfg_addr, &early_mem_config, sizeof(early_mem_config));
>  	rte_config.mem_config = rte_mem_cfg_addr;
> @@ -211,10 +220,11 @@ enum rte_iova_mode
>  	 * processes could later map the config into this exact location */
>  	rte_config.mem_config->mem_cfg_addr = (uintptr_t) rte_mem_cfg_addr;
>  
> +	return 0;
>  }
>  
>  /* attach to an existing shared memory config */
> -static void
> +static int
>  rte_eal_config_attach(void)
>  {
>  	struct rte_mem_config *mem_config;
> @@ -222,33 +232,40 @@ enum rte_iova_mode
>  	const char *pathname = eal_runtime_config_path();
>  
>  	if (internal_config.no_shconf)
> -		return;
> +		return 0;
>  
> -	if (mem_cfg_fd < 0){
> +	if (mem_cfg_fd < 0) {
>  		mem_cfg_fd = open(pathname, O_RDWR);
> -		if (mem_cfg_fd < 0)
> -			rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
> +		if (mem_cfg_fd < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot open '%s' for rte_mem_config\n",
> +						__func__, pathname);
> +			return -1;
> +		}
>  	}
>  
>  	/* map it as read-only first */
>  	mem_config = (struct rte_mem_config *) mmap(NULL, sizeof(*mem_config),
>  			PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
> -	if (mem_config == MAP_FAILED)
> -		rte_panic("Cannot mmap memory for rte_config! error %i (%s)\n",
> -			  errno, strerror(errno));
> +	if (mem_config == MAP_FAILED) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config! error %i (%s)\n",
> +				__func__, errno, strerror(errno));
> +		return -1;
> +	}
>  
>  	rte_config.mem_config = mem_config;
> +
> +	return 0;
>  }
>  
>  /* reattach the shared config at exact memory location primary process has it */
> -static void
> +static int
>  rte_eal_config_reattach(void)
>  {
>  	struct rte_mem_config *mem_config;
>  	void *rte_mem_cfg_addr;
>  
>  	if (internal_config.no_shconf)
> -		return;
> +		return 0;
>  
>  	/* save the address primary process has mapped shared config to */
>  	rte_mem_cfg_addr = (void *) (uintptr_t) rte_config.mem_config->mem_cfg_addr;
> @@ -263,16 +280,21 @@ enum rte_iova_mode
>  	if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr) {
>  		if (mem_config != MAP_FAILED)
>  			/* errno is stale, don't use */
> -			rte_panic("Cannot mmap memory for rte_config at [%p], got [%p]"
> -				  " - please use '--base-virtaddr' option\n",
> -				  rte_mem_cfg_addr, mem_config);
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for "
> +					"rte_config at [%p], got [%p] - please use "
> +					"'--base-virtaddr' option\n",
> +					__func__, rte_mem_cfg_addr, mem_config);
>  		else
> -			rte_panic("Cannot mmap memory for rte_config! error %i (%s)\n",
> -				  errno, strerror(errno));
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for "
> +					"rte_config! error %i (%s)\n",
> +					__func__, errno, strerror(errno));
> +		return -1;
>  	}
>  	close(mem_cfg_fd);
>  
>  	rte_config.mem_config = mem_config;
> +
> +	return 0;
>  }
>  
>  /* Detect if we are a primary or a secondary process */
> @@ -296,24 +318,31 @@ enum rte_proc_type_t
>  }
>  
>  /* Sets up rte_config structure with the pointer to shared memory config.*/
> -static void
> +static int
>  rte_config_init(void)
>  {
>  	rte_config.process_type = internal_config.process_type;
>  
>  	switch (rte_config.process_type){
>  	case RTE_PROC_PRIMARY:
> -		rte_eal_config_create();
> +		if (rte_eal_config_create() != 0)
> +			return -1;
>  		break;
>  	case RTE_PROC_SECONDARY:
> -		rte_eal_config_attach();
> +		if (rte_eal_config_attach() != 0)
> +			return -1;
>  		rte_eal_mcfg_wait_complete(rte_config.mem_config);
> -		rte_eal_config_reattach();
> +		if (rte_eal_config_reattach() != 0)
> +			return -1;
>  		break;
>  	case RTE_PROC_AUTO:
>  	case RTE_PROC_INVALID:
> -		rte_panic("Invalid process type\n");
> +		RTE_LOG(CRIT, EAL, "%s(): Invalid process type %d\n",
> +				__func__, rte_config.process_type);
> +		return -1;
>  	}
> +
> +	return 0;
>  }
>  
>  /* Unlocks hugepage directories that were locked by eal_hugepage_info_init */
> @@ -820,7 +849,8 @@ static void rte_eal_init_alert(const char *msg)
>  
>  	rte_srand(rte_rdtsc());
>  
> -	rte_config_init();
> +	if (rte_config_init() != 0)
> +		return -1;
>  
>  	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
>  		rte_eal_init_alert("Cannot init logging.");
> @@ -892,6 +922,9 @@ static void rte_eal_init_alert(const char *msg)
>  
>  	eal_thread_init_master(rte_config.master_lcore);
>  
> +	if (rte_get_panic_state())
> +		return -1;
> +

Please just use run_once.  That's a better way of preventing this.

>  	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
>  
>  	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s%s])\n",
> @@ -909,18 +942,27 @@ static void rte_eal_init_alert(const char *msg)
>  		 * create communication pipes between master thread
>  		 * and children
>  		 */
> -		if (pipe(lcore_config[i].pipe_master2slave) < 0)
> -			rte_panic("Cannot create pipe\n");
> -		if (pipe(lcore_config[i].pipe_slave2master) < 0)
> -			rte_panic("Cannot create pipe\n");
> +		if (pipe(lcore_config[i].pipe_master2slave) < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot create pipe\n",
> +					__func__);
> +			return -1;
> +		}
> +		if (pipe(lcore_config[i].pipe_slave2master) < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot create pipe\n",
> +					__func__);
> +			return -1;
> +		}
>  
>  		lcore_config[i].state = WAIT;
>  
>  		/* create a thread for each lcore */
>  		ret = pthread_create(&lcore_config[i].thread_id, NULL,
>  				     eal_thread_loop, NULL);
> -		if (ret != 0)
> -			rte_panic("Cannot create thread\n");
> +		if (ret != 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot create thread\n",
> +					__func__);
> +			return -1;
> +		}
>  
>  		/* Set thread_name for aid in debugging. */
>  		snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN,
> diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
> index 08e150b..3afcee5 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_thread.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c

All of the comments from the bsd side apply here.

> @@ -51,16 +51,22 @@
>  	n = 0;
>  	while (n == 0 || (n < 0 && errno == EINTR))
>  		n = write(m2s, &c, 1);
> -	if (n < 0)
> -		rte_panic("cannot write on configuration pipe\n");
> +	if (n < 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot write on configuration pipe\n",
> +				__func__);
> +		return -1;
> +	}
>  
>  	/* wait ack */
>  	do {
>  		n = read(s2m, &c, 1);
>  	} while (n < 0 && errno == EINTR);
>  
> -	if (n <= 0)
> -		rte_panic("cannot read on configuration pipe\n");
> +	if (n <= 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot read on configuration pipe\n",
> +				__func__);
> +		return -1;
> +	}
>  
>  	return 0;
>  }
> @@ -84,8 +90,19 @@ void eal_thread_init_master(unsigned lcore_id)
>  	RTE_PER_LCORE(_lcore_id) = lcore_id;
>  
>  	/* set CPU affinity */
> -	if (eal_thread_set_affinity() < 0)
> -		rte_panic("cannot set affinity\n");
> +	if (eal_thread_set_affinity() < 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot set affinity\n", __func__);
> +		rte_move_to_panic_state();
> +	}
> +}
> +
> +/* move to panic state and do not return */
> +static __attribute__((noreturn)) void
> +defunct_and_remain_in_endless_loop(void)
> +{
> +	rte_move_to_panic_state();
> +	while (1)
> +		sleep(1);
>  }
>  
>  /* main loop of threads */
> @@ -106,8 +123,11 @@ void eal_thread_init_master(unsigned lcore_id)
>  		if (thread_id == lcore_config[lcore_id].thread_id)
>  			break;
>  	}
> -	if (lcore_id == RTE_MAX_LCORE)
> -		rte_panic("cannot retrieve lcore id\n");
> +	if (lcore_id == RTE_MAX_LCORE) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot retrieve lcore id\n",
> +				__func__);
> +		defunct_and_remain_in_endless_loop();
> +	}
>  
>  	m2s = lcore_config[lcore_id].pipe_master2slave[0];
>  	s2m = lcore_config[lcore_id].pipe_slave2master[1];
> @@ -116,8 +136,10 @@ void eal_thread_init_master(unsigned lcore_id)
>  	RTE_PER_LCORE(_lcore_id) = lcore_id;
>  
>  	/* set CPU affinity */
> -	if (eal_thread_set_affinity() < 0)
> -		rte_panic("cannot set affinity\n");
> +	if (eal_thread_set_affinity() < 0) {
> +		RTE_LOG(CRIT, EAL, "%s(): Cannot set affinity\n", __func__);
> +		defunct_and_remain_in_endless_loop();
> +	}
>  
>  	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
>  
> @@ -133,8 +155,11 @@ void eal_thread_init_master(unsigned lcore_id)
>  			n = read(m2s, &c, 1);
>  		} while (n < 0 && errno == EINTR);
>  
> -		if (n <= 0)
> -			rte_panic("cannot read on configuration pipe\n");
> +		if (n <= 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot read on configuration pipe\n",
> +					__func__);
> +			defunct_and_remain_in_endless_loop();
> +		}
>  
>  		lcore_config[lcore_id].state = RUNNING;
>  
> @@ -142,11 +167,17 @@ void eal_thread_init_master(unsigned lcore_id)
>  		n = 0;
>  		while (n == 0 || (n < 0 && errno == EINTR))
>  			n = write(s2m, &c, 1);
> -		if (n < 0)
> -			rte_panic("cannot write on configuration pipe\n");
> -
> -		if (lcore_config[lcore_id].f == NULL)
> -			rte_panic("NULL function pointer\n");
> +		if (n < 0) {
> +			RTE_LOG(CRIT, EAL, "%s(): Cannot write on configuration pipe\n",
> +					__func__);
> +			defunct_and_remain_in_endless_loop();
> +		}
> +
> +		if (lcore_config[lcore_id].f == NULL) {
> +			RTE_LOG(CRIT, EAL, "%s(): NULL function pointer\n",
> +					__func__);
> +			defunct_and_remain_in_endless_loop();
> +		}
>  
>  		/* call the function and store the return value */
>  		fct_arg = lcore_config[lcore_id].arg;

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 11/11] devtools: prevent new instances of rte_panic and rte_exit
  2018-04-19  6:01 ` [PATCH v4 11/11] devtools: prevent new instances of rte_panic and rte_exit Arnon Warshavsky
@ 2018-04-19 17:52   ` Aaron Conole
  2018-04-20 14:01     ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Aaron Conole @ 2018-04-19 17:52 UTC (permalink / raw)
  To: Arnon Warshavsky
  Cc: thomas, anatoly.burakov, wenzhuo.lu, declan.doherty, jerin.jacob,
	bruce.richardson, ferruh.yigit, dev

Arnon Warshavsky <arnon@qwilt.com> writes:

> This patch adds a new function that is called
> per every checked patch,
> and alerts for new instances of rte_panic/rte_exit.
> The check excludes comments, and alerts in the case
> of a positive balance between additions and removals.
>
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>  devtools/checkpatches.sh | 94 +++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 93 insertions(+), 1 deletion(-)
>
> diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
> index 7676a6b..fb37838 100755
> --- a/devtools/checkpatches.sh
> +++ b/devtools/checkpatches.sh
> @@ -61,6 +61,90 @@ print_usage () {
>  	END_OF_HELP
>  }
>  
> +check_forbidden_additions() { # <file>
> +    # ---------------------------------
> +    #This awk script receives a list of expressions to monitor
> +    #and a list of folders to search these expressions in
> +    # - No search is done inside comments
> +    # - Both additions and removals of the expressions are checked
> +    #   A positive balance of additions fails the check
> +	read -d '' awk_script << 'EOF'
> +	BEGIN{
> +		split(FOLDERS,deny_folders," ");
> +		split(EXPRESSIONS,deny_expr," ");
> +		in_file=0;
> +		in_comment=0;
> +		count=0;
> +		comment_start="/*"
> +		comment_end="*/"
> +    }
> +    # search for add/remove instances in current file
> +    # state machine assumes the comments structure is enforced by
> +    # checkpatches.pl
> +    (in_file) {
> +		# comment start
> +		if (index($0,comment_start) > 0){
> +			in_comment = 1
> +		}
> +		# non comment code
> +		if (in_comment == 0) {
> +			for (i in deny_expr) {
> +				forbidden_added = "^\+.*" deny_expr[i];
> +				forbidden_removed="^-.*" deny_expr[i];
> +				current = expressions[deny_expr[i]]
> +				if ($0 ~ forbidden_added) {
> +					count = count + 1;
> +					expressions[deny_expr[i]] = current + 1
> +				}
> +				if ($0 ~ forbidden_removed) {
> +					count = count - 1;
> +					expressions[deny_expr[i]] = current - 1
> +				}
> +			}
> +		}
> +
> +		# comment end
> +		if (index($0,comment_end) > 0)  {
> +			in_comment = 0
> +		}
> +    }
> +	# switch to next file , check if the balance of add/remove
> +	# of previous filehad new additions
> +	($0 ~ "^\+\+\+ b/") {
> +		in_file = 0;
> +		if (count > 0){
> +			exit;
> +		}
> +		for (i in deny_folders){
> +			re = "^\+\+\+ b/" deny_folders[i];
> +			if ($0 ~ deny_folders[i]) {
> +				in_file = 1
> +				last_file = $0
> +			}
> +		}
> +	}
> +	END{
> +		if (count > 0){
> +			print "Forbidden expressions in " substr(last_file,6) ":"
> +			for (key in expressions) {
> +				if (expressions[key] > 0) {
> +					print key
> +				}
> +			}
> +			exit 1
> +		}
> +	}
> +EOF
> +	# ---------------------------------
> +
> +	# refrain from new additions of rte_panic() and rte_exit()
> +	# under lib and net
> +	# multiple folders and expressions are separated by spaces
> +	awk -v FOLDERS="lib net" \
> +		-v EXPRESSIONS="rte_panic\\\( rte_exit\\\(" \

I don't think rte_panic should be considered forbidden.  Rather their
use should be flagged (as this patch does).  However, the 'exit 1'
(which will return a failure for the automatic checkpatch script bot)
might end up problematic as maintainers might consider it a patch that
is not ready.

I wouldn't object to this patch, but just think you might want to change
the print to something like:

  WARN: Are you sure you meant to use "$expression"?

> +		"$awk_script" -
> +}
> +
>  number=0
>  quiet=false
>  verbose=false
> @@ -89,11 +173,19 @@ check () { # <patch> <commit> <title>
>  	total=$(($total + 1))
>  	! $verbose || printf '\n### %s\n\n' "$3"
>  	if [ -n "$1" ] ; then
> +		cat "$1" | check_forbidden_additions
> +		[ $? -eq 0 ] || return 0
>  		report=$($DPDK_CHECKPATCH_PATH $options "$1" 2>/dev/null)
>  	elif [ -n "$2" ] ; then
> -		report=$(git format-patch --find-renames --no-stat --stdout -1 $commit |
> +		params=$(echo "--find-renames --no-stat --stdout -1")
> +		body=$(git format-patch $params $commit)
> +		echo "$body" | check_forbidden_additions
> +		[ $? -eq 0 ] || return 0
> +		report=$(echo "$body" |
>  			$DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
>  	else
> +		check_forbidden_additions -
> +		[ $? -eq 0 ] || return 0
>  		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
>  	fi
>  	[ $? -ne 0 ] || return 0

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info
  2018-04-19 14:50         ` Burakov, Anatoly
@ 2018-04-20 13:11           ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:11 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

Now clear.Thanks

On Thu, Apr 19, 2018 at 5:50 PM, Burakov, Anatoly <anatoly.burakov@intel.com
> wrote:

> On 19-Apr-18 3:45 PM, Burakov, Anatoly wrote:
>
>> On 19-Apr-18 3:09 PM, Arnon Warshavsky wrote:
>>
>>> Thanks Anatoly. Will fix that in v5.
>>> Is it preferred to keep all version notes in the cover letter alone?
>>>
>>>
>> Generally, cover letter should give general outline (i.e. "fixed 32 bit
>> compile"), while notes for individual patches should be more specific about
>> the changes between versions (but not too specific, i.e. don't do "change
>> variable X on line 100 to be Y").
>>
>> So, whatever you think gets your point across best. Not all changes
>> deserve to be called out in the cover letter.
>>
>> Just to be clear:
>
> With my initial reply, i did not mean "patch notes should be in the cover
> letter". What i meant was that you have put your version changes "e.g. v4 -
> changed this to that" into the commit message.
>
> What you should have done is put your patch notes after the commit
> message, like this:
>
> replace panic calls with log and retrun value.
>
> Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> ---
>
> v4
> static size calculation function changed to return success/fail code
> in addition to filling  the size result.
>
>
>  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 32
> ++++++++++++++++---------
>  1 file changed, 21 insertions(+), 11 deletions(-)
>
> Note that the v4 comments are after the "---" - this is where the commit
> message ends as far as git concerned, so you can put your notes there.
>
> --
> Thanks,
> Anatoly
>



-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info
  2018-04-19 14:36   ` Kevin Traynor
@ 2018-04-20 13:12     ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:12 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

Thanks Kevin
Will address that

On Thu, Apr 19, 2018 at 5:36 PM, Kevin Traynor <ktraynor@redhat.com> wrote:

> On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> > replace panic calls with log and retrun value.
> >
> > v4
> > static size calculation function changed to return success/fail code
> > in addition to filling  the size result.
> >
>
> fyi - this patch doesn't apply on master branch without fuzz
>
> > Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> > ---
> >  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 32
> ++++++++++++++++---------
> >  1 file changed, 21 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> > index fb4b667..debae32 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> > @@ -145,8 +145,8 @@
> >       return num_pages;
> >  }
> >
> > -static uint64_t
> > -get_default_hp_size(void)
> > +static int
> > +get_default_hp_size(uint64_t *result)
> >  {
> >       const char proc_meminfo[] = "/proc/meminfo";
> >       const char str_hugepagesz[] = "Hugepagesize:";
> > @@ -155,8 +155,11 @@
> >       unsigned long long size = 0;
> >
> >       FILE *fd = fopen(proc_meminfo, "r");
> > -     if (fd == NULL)
> > -             rte_panic("Cannot open %s\n", proc_meminfo);
> > +     if (fd == NULL) {
> > +             RTE_LOG(CRIT, EAL, "%s(): Cannot open %s\n",
> > +                             __func__, proc_meminfo);
> > +             return -1;
> > +     }
> >       while(fgets(buffer, sizeof(buffer), fd)){
> >               if (strncmp(buffer, str_hugepagesz, hugepagesz_len) == 0){
> >                       size = rte_str_to_size(&buffer[hugepagesz_len]);
> > @@ -164,9 +167,13 @@
> >               }
> >       }
> >       fclose(fd);
> > -     if (size == 0)
> > -             rte_panic("Cannot get default hugepage size from %s\n",
> proc_meminfo);
> > -     return size;
> > +     if (size == 0) {
> > +             RTE_LOG(CRIT, EAL, "%s(): Cannot get default hugepage size
> from %s\n",
> > +                                              __func__, proc_meminfo);
> > +             return -1;
> > +     }
> > +     *result = size;
> > +     return 0;
> >  }
> >
> >  static const char *
> > @@ -191,11 +198,14 @@
> >       char *retval = NULL;
> >
> >       FILE *fd = fopen(proc_mounts, "r");
> > -     if (fd == NULL)
> > -             rte_panic("Cannot open %s\n", proc_mounts);
> > +     if (fd == NULL) {
> > +             RTE_LOG(CRIT, EAL, "%s(): Cannot open %s\n",
> > +                             __func__, proc_mounts);
> > +             return NULL;
> > +     }
> >
> > -     if (default_size == 0)
> > -             default_size = get_default_hp_size();
> > +     if ((default_size == 0) && (get_default_hp_size(&default_size) !=
> 0))
> > +             return NULL;
> >
> >       while (fgets(buf, sizeof(buf), fd)){
> >               if (rte_strsplit(buf, sizeof(buf), splitstr,
> _FIELDNAME_MAX,
> >
>
>


-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 02/11] bond: replace rte_panic instances in bonding driver
  2018-04-19 17:25   ` Kevin Traynor
@ 2018-04-20 13:13     ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:13 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

Will do. Thanks

On Thu, Apr 19, 2018 at 8:25 PM, Kevin Traynor <ktraynor@redhat.com> wrote:

> On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> > replace panic calls with log and retrun value.
> > Local functions to this file,
> > changing from void to int are non-abi-breaking
> > --
> > v4 - fix split literal strings in log messages
> >
> > Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> > ---
> >  drivers/net/bonding/rte_eth_bond_8023ad.c         | 28
> +++++++++++++++--------
> >  drivers/net/bonding/rte_eth_bond_8023ad_private.h |  2 +-
> >  drivers/net/bonding/rte_eth_bond_api.c            | 20 +++++++++++-----
> >  drivers/net/bonding/rte_eth_bond_pmd.c            |  9 +++++---
> >  drivers/net/bonding/rte_eth_bond_private.h        |  2 +-
> >  5 files changed, 40 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c
> b/drivers/net/bonding/rte_eth_bond_8023ad.c
> > index c452318..7512901 100644
> > --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> > +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> > @@ -893,7 +893,7 @@
> >                       bond_mode_8023ad_periodic_cb, arg);
> >  }
> >
> > -void
> > +int
> >  bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev,
> >                               uint16_t slave_id)
> >  {
> > @@ -939,7 +939,7 @@
> >       timer_cancel(&port->warning_timer);
> >
> >       if (port->mbuf_pool != NULL)
> > -             return;
> > +             return 0;
> >
> >       RTE_ASSERT(port->rx_ring == NULL);
> >       RTE_ASSERT(port->tx_ring == NULL);
> > @@ -968,8 +968,9 @@
> >       /* Any memory allocation failure in initialization is critical
> because
> >        * resources can't be free, so reinitialization is impossible. */
> >       if (port->mbuf_pool == NULL) {
> > -             rte_panic("Slave %u: Failed to create memory pool '%s':
> %s\n",
> > -                     slave_id, mem_name, rte_strerror(rte_errno));
> > +             RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create memory
> pool '%s': %s\n",
> > +                     __func__, slave_id, mem_name,
> rte_strerror(rte_errno));
> > +             return -1;
> >       }
> >
> >       snprintf(mem_name, RTE_DIM(mem_name), "slave_%u_rx", slave_id);
> > @@ -977,8 +978,9 @@
> >                       rte_align32pow2(BOND_MODE_8023AX_SLAVE_RX_PKTS),
> socket_id, 0);
> >
> >       if (port->rx_ring == NULL) {
> > -             rte_panic("Slave %u: Failed to create rx ring '%s': %s\n",
> slave_id,
> > -                     mem_name, rte_strerror(rte_errno));
> > +             RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create rx ring
> '%s': %s\n",
> > +                     __func__, slave_id, mem_name,
> rte_strerror(rte_errno));
> > +             return -1;
> >       }
> >
> >       /* TX ring is at least one pkt longer to make room for marker
> packet. */
> > @@ -987,9 +989,12 @@
> >                       rte_align32pow2(BOND_MODE_8023AX_SLAVE_TX_PKTS +
> 1), socket_id, 0);
> >
> >       if (port->tx_ring == NULL) {
> > -             rte_panic("Slave %u: Failed to create tx ring '%s': %s\n",
> slave_id,
> > -                     mem_name, rte_strerror(rte_errno));
> > +             RTE_LOG(ERR, PMD, "%s() Slave %u: Fail to create tx ring
> '%s': %s\n",
> > +                     __func__, slave_id, mem_name,
> rte_strerror(rte_errno));
> > +             return -1;
> >       }
> > +
> > +     return 0;
> >  }
> >
> >  int
> > @@ -1143,9 +1148,12 @@
> >       struct bond_dev_private *internals = bond_dev->data->dev_private;
> >       uint8_t i;
> >
> > -     for (i = 0; i < internals->active_slave_count; i++)
> > -             bond_mode_8023ad_activate_slave(bond_dev,
> > +     for (i = 0; i < internals->active_slave_count; i++) {
> > +             int rc = bond_mode_8023ad_activate_slave(bond_dev,
> >                               internals->active_slaves[i]);
> > +             if (rc != 0)
> > +                     return rc;
> > +     }
> >
> >       return 0;
> >  }
> > diff --git a/drivers/net/bonding/rte_eth_bond_8023ad_private.h
> b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
> > index 0f490a5..96a42f2 100644
> > --- a/drivers/net/bonding/rte_eth_bond_8023ad_private.h
> > +++ b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
> > @@ -263,7 +263,7 @@ struct mode8023ad_private {
> >   * @return
> >   *  0 on success, negative value otherwise.
> >   */
> > -void
> > +int
> >  bond_mode_8023ad_activate_slave(struct rte_eth_dev *dev, uint16_t
> port_id);
> >
> >  /**
> > diff --git a/drivers/net/bonding/rte_eth_bond_api.c
> b/drivers/net/bonding/rte_eth_bond_api.c
> > index aa89425..96aa1ff 100644
> > --- a/drivers/net/bonding/rte_eth_bond_api.c
> > +++ b/drivers/net/bonding/rte_eth_bond_api.c
> > @@ -69,14 +69,15 @@
> >       return 0;
> >  }
> >
> > -void
> > +int
> >  activate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id)
> >  {
> >       struct bond_dev_private *internals = eth_dev->data->dev_private;
> >       uint8_t active_count = internals->active_slave_count;
> >
> >       if (internals->mode == BONDING_MODE_8023AD)
> > -             bond_mode_8023ad_activate_slave(eth_dev, port_id);
> > +             if (bond_mode_8023ad_activate_slave(eth_dev, port_id) !=
> 0)
> > +                     return -1;
> >
> >       if (internals->mode == BONDING_MODE_TLB
> >                       || internals->mode == BONDING_MODE_ALB) {
> > @@ -357,10 +358,17 @@
> >                               bond_ethdev_primary_set(internals,
> >                                                       slave_port_id);
> >
> > -                     if (find_slave_by_id(internals->active_slaves,
> > -                                          internals->active_slave_count,
> > -                                          slave_port_id) ==
> internals->active_slave_count)
> > -                             activate_slave(bonded_eth_dev,
> slave_port_id);
> > +                     int rc =
>
> There's no need for the rc variables, the existing check would suffice here
>
> > +                             find_slave_by_id(internals->active_slaves,
> > +                                     internals->active_slave_count,
> > +                                     slave_port_id);
> > +
> > +                     if (rc == internals->active_slave_count) {
> > +                             int rc = activate_slave(bonded_eth_dev,
> > +                                                     slave_port_id);
> > +                             if (rc != 0)
> > +                                     return -1;
> and this could be
>
> if (activate_slave(bonded_eth_dev, slave_port_id))
>         return -1;
>
> > +                     }
> >               }
> >       }
> >
> > diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> b/drivers/net/bonding/rte_eth_bond_pmd.c
> > index 2805c71..2d9052d 100644
> > --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> > +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> > @@ -1741,8 +1741,10 @@ struct bwg_slave {
> >               /* Any memory allocation failure in initialization is
> critical because
> >                * resources can't be free, so reinitialization is
> impossible. */
> >               if (port->slow_pool == NULL) {
> > -                     rte_panic("Slave %u: Failed to create memory pool
> '%s': %s\n",
> > -                             slave_id, mem_name,
> rte_strerror(rte_errno));
> > +                     RTE_LOG(ERR, PMD, "%s() Slave %u: Failed to create
> memory pool '%s': %s\n",
> > +                             __func__, slave_id,
> > +                             mem_name, rte_strerror(rte_errno));
> > +                     return -1;
> >               }
> >       }
> >
> > @@ -2673,7 +2675,8 @@ struct bwg_slave {
> >                       mac_address_slaves_update(bonded_eth_dev);
> >               }
> >
> > -             activate_slave(bonded_eth_dev, port_id);
> > +             if (activate_slave(bonded_eth_dev, port_id) != 0)
> > +                     return -1;
>
> it's more consistent with the rest of the function to do,
>
> if(activate_slave(bonded_eth_dev, port_id))
>         return rc;
>
> There's other places through the patches where "!= 0" is used but not
> really needed
>
> >
> >               /* If user has defined the primary port then default to
> using it */
> >               if (internals->user_defined_primary_port &&
> > diff --git a/drivers/net/bonding/rte_eth_bond_private.h
> b/drivers/net/bonding/rte_eth_bond_private.h
> > index 94eca88..d99d42c 100644
> > --- a/drivers/net/bonding/rte_eth_bond_private.h
> > +++ b/drivers/net/bonding/rte_eth_bond_private.h
> > @@ -187,7 +187,7 @@ struct bond_dev_private {
> >  void
> >  deactivate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id);
> >
> > -void
> > +int
> >  activate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id);
> >
> >  void
> >
>
>


-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 03/11] e1000: replace rte_panic instances in e1000 driver
  2018-04-19 17:25   ` Kevin Traynor
@ 2018-04-20 13:14     ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:14 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

Same as in the other patches. Will do. Thanks

On Thu, Apr 19, 2018 at 8:25 PM, Kevin Traynor <ktraynor@redhat.com> wrote:

> On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> > replace panic calls with log and retrun value.
> > Local function to this file,
> > changing from void to int is non-abi-breaking
> > --
> > v4 - keep error message literal string in a singhle line
> >
> > Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> > ---
> >  drivers/net/e1000/e1000_ethdev.h |  2 +-
> >  drivers/net/e1000/igb_ethdev.c   |  3 ++-
> >  drivers/net/e1000/igb_pf.c       | 15 +++++++++------
> >  3 files changed, 12 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_
> ethdev.h
> > index 6354b89..2e527de 100644
> > --- a/drivers/net/e1000/e1000_ethdev.h
> > +++ b/drivers/net/e1000/e1000_ethdev.h
> > @@ -411,7 +411,7 @@ int eth_igb_rss_hash_conf_get(struct rte_eth_dev
> *dev,
> >  /*
> >   * misc function prototypes
> >   */
> > -void igb_pf_host_init(struct rte_eth_dev *eth_dev);
> > +int igb_pf_host_init(struct rte_eth_dev *eth_dev);
> >
> >  void igb_pf_mbx_process(struct rte_eth_dev *eth_dev);
> >
> > diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_
> ethdev.c
> > index 9b808a9..4479616 100644
> > --- a/drivers/net/e1000/igb_ethdev.c
> > +++ b/drivers/net/e1000/igb_ethdev.c
> > @@ -833,7 +833,8 @@ static int igb_flex_filter_uninit(struct rte_eth_dev
> *eth_dev)
> >       }
> >
> >       /* initialize PF if max_vfs not zero */
> > -     igb_pf_host_init(eth_dev);
> > +     if (igb_pf_host_init(eth_dev) != 0)
>
> You don't need "!= 0"
>
> You need to set "error" here, or else return it from igb_pf_host_init().
> We know -ENOMEM is the only error that can be returned from
> igb_pf_host_init() but not sure if should assume that.
>
> > +             goto err_late;
> >
> >       ctrl_ext = E1000_READ_REG(hw, E1000_CTRL_EXT);
> >       /* Set PF Reset Done bit so PF/VF Mail Ops can work */
> > diff --git a/drivers/net/e1000/igb_pf.c b/drivers/net/e1000/igb_pf.c
> > index b9f2e53..ae4b0a4 100644
> > --- a/drivers/net/e1000/igb_pf.c
> > +++ b/drivers/net/e1000/igb_pf.c
> > @@ -63,7 +63,7 @@ int igb_vf_perm_addr_gen(struct rte_eth_dev *dev,
> uint16_t vf_num)
> >       return 0;
> >  }
> >
> > -void igb_pf_host_init(struct rte_eth_dev *eth_dev)
> > +int igb_pf_host_init(struct rte_eth_dev *eth_dev)
> >  {
> >       struct e1000_vf_info **vfinfo =
> >               E1000_DEV_PRIVATE_TO_P_VFDATA(eth_dev->data->dev_private);
> > @@ -74,7 +74,7 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
> >
> >       RTE_ETH_DEV_SRIOV(eth_dev).active = 0;
> >       if (0 == (vf_num = dev_num_vf(eth_dev)))
> > -             return;
> > +             return 0;
> >
> >       if (hw->mac.type == e1000_i350)
> >               nb_queue = 1;
> > @@ -82,11 +82,14 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
> >               /* per datasheet, it should be 2, but 1 seems correct */
> >               nb_queue = 1;
> >       else
> > -             return;
> > +             return 0;
> >
> >       *vfinfo = rte_zmalloc("vf_info", sizeof(struct e1000_vf_info) *
> vf_num, 0);
> > -     if (*vfinfo == NULL)
> > -             rte_panic("Cannot allocate memory for private VF data\n");
> > +     if (*vfinfo == NULL) {
> > +             RTE_LOG(CRIT, PMD, "%s(): Cannot allocate memory for
> private VF data\n",
> > +                     __func__);
> > +             return -1;
> > +     }
> >
> >       RTE_ETH_DEV_SRIOV(eth_dev).active = ETH_8_POOLS;
> >       RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = nb_queue;
> > @@ -98,7 +101,7 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
> >       /* set mb interrupt mask */
> >       igb_mb_intr_setup(eth_dev);
> >
> > -     return;
> > +     return 0;
> >  }
> >
> >  void igb_pf_host_uninit(struct rte_eth_dev *dev)
> >
>
>


-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 04/11] ixgbe: replace rte_panic instances in ixgbe driver
  2018-04-19 17:26   ` Kevin Traynor
@ 2018-04-20 13:16     ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:16 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

> On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> > replace panic calls with log and retrun value.
>
> typo return, seems to be in a few commit msgs
>

Yup. They were not even copy pasted so at least I am consistent :)


> +             return -1;
>
> similar comment as previous patch about using an appropriate return value
>

And those

Thanks

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 05/11] eal: replace rte_panic instances in eventdev
  2018-04-19 17:26   ` Kevin Traynor
@ 2018-04-20 13:17     ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:17 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

Ok. Thanks

On Thu, Apr 19, 2018 at 8:26 PM, Kevin Traynor <ktraynor@redhat.com> wrote:

> On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> > replace panic calls with log and retrun value.
> >
> > --
> > v4 - fix split literal strings in log messages
> >
> > Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> > ---
> >  lib/librte_eventdev/rte_eventdev_pmd_pci.h  | 8 +++++---
> >  lib/librte_eventdev/rte_eventdev_pmd_vdev.h | 8 +++++---
> >  2 files changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_eventdev/rte_eventdev_pmd_pci.h
> b/lib/librte_eventdev/rte_eventdev_pmd_pci.h
> > index 8fb6138..dad2182 100644
> > --- a/lib/librte_eventdev/rte_eventdev_pmd_pci.h
> > +++ b/lib/librte_eventdev/rte_eventdev_pmd_pci.h
> > @@ -66,9 +66,11 @@
> >                                               RTE_CACHE_LINE_SIZE,
> >                                               rte_socket_id());
> >
> > -             if (eventdev->data->dev_private == NULL)
> > -                     rte_panic("Cannot allocate memzone for private "
> > -                                     "device data");
> > +             if (eventdev->data->dev_private == NULL) {
> > +                     RTE_LOG(CRIT, EAL, "%s(): Cannot allocate memzone
> for private device data",
> > +                             __func__);
> > +                     return -1;
>
> return -ENOMEM
>
> > +             }
> >       }
> >
> >       eventdev->dev = &pci_dev->device;
> > diff --git a/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
> b/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
> > index 8c64a06..b7c08fa 100644
> > --- a/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
> > +++ b/lib/librte_eventdev/rte_eventdev_pmd_vdev.h
> > @@ -61,9 +61,11 @@
> >                                               RTE_CACHE_LINE_SIZE,
> >                                               socket_id);
> >
> > -             if (eventdev->data->dev_private == NULL)
> > -                     rte_panic("Cannot allocate memzone for private
> device"
> > -                                     " data");
> > +             if (eventdev->data->dev_private == NULL) {
> > +                     RTE_LOG(CRIT, EAL, "%s(): Cannot allocate memzone
> for private device data",
> > +                             __func__);
> > +                     return NULL;
> > +             }
> >       }
> >
> >       return eventdev;
> >
>
>


-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 08/11] eal: replace rte_panic instances in interrupts thread
  2018-04-19 17:27   ` Kevin Traynor
@ 2018-04-20 13:18     ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:18 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

Not deliberate.Thanks

On Thu, Apr 19, 2018 at 8:27 PM, Kevin Traynor <ktraynor@redhat.com> wrote:

> On 04/19/2018 07:01 AM, Arnon Warshavsky wrote:
> > replace panic calls with log and retrun value.
> > Thread function removes the noretrun attribute.
> >
> > Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
> > ---
> >  lib/librte_eal/linuxapp/eal/eal_interrupts.c | 27
> ++++++++++++++++++++-------
> >  1 file changed, 20 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > index 58e9328..8b8650a 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > @@ -785,7 +785,7 @@ struct rte_intr_source {
> >   * @return
> >   *  never return;
> >   */
> > -static __attribute__((noreturn)) void *
> > +static void *
> >  eal_intr_thread_main(__rte_unused void *arg)
> >  {
> >       struct epoll_event ev;
> > @@ -803,8 +803,11 @@ static __attribute__((noreturn)) void *
> >
> >               /* create epoll fd */
> >               int pfd = epoll_create(1);
> > -             if (pfd < 0)
> > -                     rte_panic("Cannot create epoll instance\n");
> > +             if (pfd < 0) {
> > +                     RTE_LOG(CRIT, EAL, "%s(): Cannot create epoll
> instance\n",
> > +                                     __func__);
> > +                     return NULL;
> > +             }
> >
> >               pipe_event.data.fd = intr_pipe.readfd;
> >               /**
> > @@ -813,8 +816,11 @@ static __attribute__((noreturn)) void *
> >                */
> >               if (epoll_ctl(pfd, EPOLL_CTL_ADD, intr_pipe.readfd,
> >                                               &pipe_event) < 0) {
> > -                     rte_panic("Error adding fd to %d epoll_ctl, %s\n",
> > +                     RTE_LOG(CRIT, EAL, "%s(): Error adding fd to %d "
> > +                                     "epoll_ctl, %s\n",
> > +                                     __func__,
> >                                       intr_pipe.readfd, strerror(errno));
> > +                     return NULL;
> >               }
> >               numfds++;
> >
> > @@ -831,9 +837,14 @@ static __attribute__((noreturn)) void *
> >                        * into wait list.
> >                        */
> >                       if (epoll_ctl(pfd, EPOLL_CTL_ADD,
> > -                                     src->intr_handle.fd, &ev) < 0){
> > -                             rte_panic("Error adding fd %d epoll_ctl,
> %s\n",
> > -                                     src->intr_handle.fd,
> strerror(errno));
> > +                             src->intr_handle.fd, &ev) < 0) {
>
> The alignment changed here, not sure if it was deliberate
>
> > +                             RTE_LOG(CRIT, EAL,
> > +                                             "%s(): Error adding fd %d "
> > +                                             "epoll_ctl, %s\n",
> > +                                             __func__,
> > +                                             src->intr_handle.fd,
> > +                                             strerror(errno));
> > +                             return NULL;
> >                       }
> >                       else
> >                               numfds++;
> > @@ -848,6 +859,8 @@ static __attribute__((noreturn)) void *
> >                */
> >               close(pfd);
> >       }
> > +
> > +     return NULL;
> >  }
> >
> >  int
> >
>
>


-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 09/11] eal: replace rte_panic instances in ethdev
  2018-04-19 17:27   ` Kevin Traynor
@ 2018-04-20 13:23     ` Arnon Warshavsky
  2018-04-20 13:56       ` Thomas Monjalon
  0 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:23 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

>
> Lots of "!= 0"'s - you might gather by now that I don't like them :-)
>

No way. Would have never guessed that :)
Sure. When in Rome..


>
> hmm, I'm wondering should void __rte_experimental
> rte_eth_dev_owner_delete change to return an int, now that there is a
> fail case and it is still experimental...?
>

Good point. Missed that
thanks
/Arnon

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19 14:57       ` Burakov, Anatoly
  2018-04-19 17:31         ` Kevin Traynor
@ 2018-04-20 13:31         ` Arnon Warshavsky
  1 sibling, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:31 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

On Thu, Apr 19, 2018 at 5:57 PM, Burakov, Anatoly <anatoly.burakov@intel.com
> wrote:

> On 19-Apr-18 3:48 PM, Arnon Warshavsky wrote:
>
>> Copy on the commit message  and volatile.
>>
>> Regarding the new function defunct_and_remain_in_endless_loop ()
>> I don't think I can put that in a separate patch without breaking the
>> current patch independence.
>>
>
> How so?
>
> Just leave some panic instances in there for thread-related stuff and fix
> them up in the next patch.
>
> Also, i'm not sure sending threads into an infinite loop on panic is such
> a good idea. You might want to look at Olivier's approach [1] to creating
> threads, using pthread_barriers and pthread_kill/cancel.
>
> This does warrant a separate patch now :)


Thanks Anatoly
Going into the infinite loop looked like the lesser evil at the time ,
but I guess I was too eager to get rid of as many panics on this path as I
could.
Given the bw I will need to properly handle it, and the fact that there are
still some more panic calls in the code
I will withdraw this specific removal inside the thread and handle it for
the next build.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19 17:31         ` Kevin Traynor
@ 2018-04-20 13:32           ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:32 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Burakov, Anatoly, Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

Agree. As I wrote below - I will put this instance back in place for this
patchset and handle it on a different one

On Thu, Apr 19, 2018 at 8:31 PM, Kevin Traynor <ktraynor@redhat.com> wrote:

> On 04/19/2018 03:57 PM, Burakov, Anatoly wrote:
> > On 19-Apr-18 3:48 PM, Arnon Warshavsky wrote:
> >> Copy on the commit message  and volatile.
> >>
> >> Regarding the new function defunct_and_remain_in_endless_loop ()
> >> I don't think I can put that in a separate patch without breaking the
> >> current patch independence.
> >
> > How so?
> >
> > Just leave some panic instances in there for thread-related stuff and
> > fix them up in the next patch.
> >
> > Also, i'm not sure sending threads into an infinite loop on panic is
> > such a good idea. You might want to look at Olivier's approach [1] to
> > creating threads, using pthread_barriers and pthread_kill/cancel.
> >
>
> I haven't reviewed this one yet, but going into an infinite loop doesn't
> seem like the right thing to do.
>
> > This does warrant a separate patch now :)
> >
>
>


-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-19 17:48   ` Aaron Conole
@ 2018-04-20 13:55     ` Arnon Warshavsky
  2018-04-20 14:53       ` Aaron Conole
  0 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 13:55 UTC (permalink / raw)
  To: Aaron Conole
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev, Kevin Traynor

 Thanks Aaron

Previously, it wasn't possible for mem_cfg_fd to be reused after a

> failure.  Now it is - please reset it to -1. in these close conditions.
>
> Will do.

>
>
> Again, previously this would have aborted on a failure.  So it needs to
> be reset to a value that allows retry.
>

Same here

>
> >       switch (rte_config.process_type){
> >       case RTE_PROC_PRIMARY:
> > -             rte_eal_config_create();
> > +             if (rte_eal_config_create())
> > +                     return -1;
> >               break;
> >       case RTE_PROC_SECONDARY:
> > -             rte_eal_config_attach();
> > +             if (rte_eal_config_attach())
> > +                     return -1;
> >               rte_eal_mcfg_wait_complete(rte_config.mem_config);
> >               break;
> >       case RTE_PROC_AUTO:
> >       case RTE_PROC_INVALID:
>
> Not for this patch, but I just noticed that this should probably use a
> 'default' case.
>
> Will add this while Im here


>
> Use rte_eal_init_alert to indicate why you are failing the init.
>
Will do

>
> >       if (rte_mp_channel_init() < 0) {
> >               rte_eal_init_alert("failed to init mp channel\n");
> > @@ -652,7 +676,8 @@ static void rte_eal_init_alert(const char *msg)
> >
> >       eal_check_mem_on_local_socket();
> >
> > -     eal_thread_init_master(rte_config.master_lcore);
> > +     if (eal_thread_init_master(rte_config.master_lcore) != 0)
> > +             return -1;
>
> Is it ever possible to recover from this?


Definitely not recoverable, but not different than the other cases where
panic propagate all the way up rather than aborting

> Still needs
> rte_eal_init_alert() call.
>
Will do

>
>
> How are you cleaning up the threads that were spawned?  Lets say this
> loop will execute 5 times, and on the 3rd entry, these errors happen.
> You now leave DPDK 'half-initialized' - you've spun up threads and
> allocated memory.
>
...

>
> I don't see how any of this is better for the user.  In fact, I think
> this is worse because it will make portions of the application stop
> working without any way to move forward.  rte_panic() will at least give
> the process a chance to recover from a potentially ephemeral condition.
>
> As I wrote in a different reply on this patch
I was probably too eager to get rid of this panic taking some wrong
assumptions on the way the library will be called.
Removing the panic from the thread is obviously more complex and also ABI
breaking.
>From my own bw, I will not make it with a proper change to this version, so
I will revert back to panicing on this patchset
and aim for the thread in the next build.


>
> This seems to only exist as a way of triggering the run_once check in
> the eal_init.  It doesn't add anything except one more state variable to
> check against.  What is the purpose?
>

Actually this is not a run-once in purpose, rather an attempt to define a
state for the device
and on the way work around breaking abi on the the void function called
before that.

> +     if (rte_get_panic_state())
> +             return -1;
> +

Please just use run_once.  That's a better way of preventing this.
>


> As stated above - no a run-once
>
> All of the comments from the bsd side apply here.
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 09/11] eal: replace rte_panic instances in ethdev
  2018-04-20 13:23     ` Arnon Warshavsky
@ 2018-04-20 13:56       ` Thomas Monjalon
  0 siblings, 0 replies; 47+ messages in thread
From: Thomas Monjalon @ 2018-04-20 13:56 UTC (permalink / raw)
  To: Arnon Warshavsky, Kevin Traynor
  Cc: dev, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh

20/04/2018 15:23, Arnon Warshavsky:
> >
> > Lots of "!= 0"'s - you might gather by now that I don't like them :-)
> >
> 
> No way. Would have never guessed that :)
> Sure. When in Rome..

It is a matter of taste. I like the explicit "!= 0".
At least, explicit NULL comparisons are recommended in the coding style:
	http://dpdk.org/doc/guides/contributing/coding_style.html#null-pointers
For boolean return, it is OK to be implicit.
But for error codes, I think it is better to be explicit.
Again, matter of taste.

By the way, looking at "git grep 'if (rte_'" suggests it is common.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 11/11] devtools: prevent new instances of rte_panic and rte_exit
  2018-04-19 17:52   ` Aaron Conole
@ 2018-04-20 14:01     ` Arnon Warshavsky
  2018-04-20 15:41       ` Burakov, Anatoly
  0 siblings, 1 reply; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-20 14:01 UTC (permalink / raw)
  To: Aaron Conole
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev

>
> I don't think rte_panic should be considered forbidden.  Rather their
> use should be flagged (as this patch does).  However, the 'exit 1'
> (which will return a failure for the automatic checkpatch script bot)
> might end up problematic as maintainers might consider it a patch that
> is not ready.
>
> I wouldn't object to this patch, but just think you might want to change
> the print to something like:
>
>   WARN: Are you sure you meant to use "$expression"?
>

I can definitely change the warning text , but I think the whole point is
actually to prevent future panic calls.
If it were a recommendation, one can never converge with the attempt to get
rid of them, as not throwing a panic is a lot harder.

Thanks
Arnon

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-20 13:55     ` Arnon Warshavsky
@ 2018-04-20 14:53       ` Aaron Conole
  2018-04-23  8:07         ` Arnon Warshavsky
  0 siblings, 1 reply; 47+ messages in thread
From: Aaron Conole @ 2018-04-20 14:53 UTC (permalink / raw)
  To: Arnon Warshavsky
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev, Kevin Traynor

Arnon Warshavsky <arnon@qwilt.com> writes:

> Thanks Aaron 
>
> Previously, it wasn't possible for mem_cfg_fd to be reused after a
>
>  failure.  Now it is - please reset it to -1. in these close conditions.
>
> Will do. 
>
>  Again, previously this would have aborted on a failure.  So it needs to
>  be reset to a value that allows retry.
>
> Same here
>
>  >       switch (rte_config.process_type){
>  >       case RTE_PROC_PRIMARY:
>  > -             rte_eal_config_create();
>  > +             if (rte_eal_config_create())
>  > +                     return -1;
>  >               break;
>  >       case RTE_PROC_SECONDARY:
>  > -             rte_eal_config_attach();
>  > +             if (rte_eal_config_attach())
>  > +                     return -1;
>  >               rte_eal_mcfg_wait_complete(rte_config.mem_config);
>  >               break;
>  >       case RTE_PROC_AUTO:
>  >       case RTE_PROC_INVALID:
>
>  Not for this patch, but I just noticed that this should probably use a
>  'default' case.
>
> Will add this while Im here
>  
>  
>  Use rte_eal_init_alert to indicate why you are failing the init.
>
> Will do 
>
>  >       if (rte_mp_channel_init() < 0) {
>  >               rte_eal_init_alert("failed to init mp channel\n");
>  > @@ -652,7 +676,8 @@ static void rte_eal_init_alert(const char *msg)
>  >  
>  >       eal_check_mem_on_local_socket();
>  >  
>  > -     eal_thread_init_master(rte_config.master_lcore);
>  > +     if (eal_thread_init_master(rte_config.master_lcore) != 0)
>  > +             return -1;
>
>  Is it ever possible to recover from this?
>
>  
> Definitely not recoverable, but not different than the other cases where panic propagate all the way
> up rather than aborting

Looking at the eal_thread_init_master, I think it's probably a
recoverable condition.  For instance, perhaps the core mask was wrong,
and could be corrected by re-attempting the initialization.  Just
suggesting that it's probably okay to allow a re-attempt here.  I would
suggest:

-	eal_thread_init_master(rte_config.master_lcore);
+	if (eal_thread_init_master(rte_config.master_lcore) != 0) {
+		rte_eal_init_alert("Cannot assign master lcore\n");
+		rte_errno = EINVAL;
+		return -1;
+	}

if you agree.

>  Still needs
>  rte_eal_init_alert() call.
>
> Will do 
>
>  How are you cleaning up the threads that were spawned?  Lets say this
>  loop will execute 5 times, and on the 3rd entry, these errors happen.
>  You now leave DPDK 'half-initialized' - you've spun up threads and
>  allocated memory.
>
> ... 
>
>  I don't see how any of this is better for the user.  In fact, I think
>  this is worse because it will make portions of the application stop
>  working without any way to move forward.  rte_panic() will at least give
>  the process a chance to recover from a potentially ephemeral condition.
>
> As I wrote in a different reply on this patch
> I was probably too eager to get rid of this panic taking some wrong assumptions on the way the
> library will be called.

Okay - guess emails got crossed in flight :)

> Removing the panic from the thread is obviously more complex and also ABI breaking.
> From my own bw, I will not make it with a proper change to this version, so I will revert back to
> panicing on this patchset 
> and aim for the thread in the next build.

I see.  Most likely you'll need a proper initialization protocol both
ways.  As a brief example, you'll need something to guarantee the thread
state (just a general outline):

--
  global_initial_state = UNINIT
  pthread_init_cond = PTHREAD_COND_INIT;
  global_spawned_ctr = atomic_ctr(0);

rte_eal_init()
  ...
  global_initial_state = INITIALIZING
  thread_ctr = 0;

  ...
  for_each_lcore()
    spawn_thread()
    if (spawn_fails)
       global_initial_state = UNINIT;
       pthread_cond_broadcast()
       return error

    thread_ctr++;

  while (thread_ctr != global_spawned_ctr)
     /* wait?  figure out when to declare extreme failure */

  global_initial_state = THREAD_INITIALIZED
  pthread_cond_broadcast(pthread_init_cond)

  while (thread_ctr)
    wait_some_grace_period()
    for_each_lcore_thread()
      thread_state = lcore_state[thr]; /* probably needs a mem barrier*/
      if (thread_state != THREAD_READY && thread_state != THREAD_STARTING)
         /* failure - message all threads to clean up */
      if (thread_state == THREAD_READY)
         thread_ctr--;


in eal_thread_loop():

  /* before even the set_affinity */
  atomic_inc(global_spawned_ctr);
  lcore_state[thr] = THREAD_STARTING;
  pthread_cond_wait(pthread_init_cond)

  if (global_initial_state != THREAD_INITIALIZED)
    lcore_state[thr] = THREAD_FAILED;
    return...;

  /* do all the normal checks... instead of the panic_state, just set
     lcore_state[thr] to THREAD_FAILED, clean up any additional
     allocated resources, and return... which will exit the thread */

  lcore_state[thr] = THREAD_READY;


--

In the above I hope it illustrates what you'll need - a way to signal to
each side that initialization phase is happening, and that
initialization was successful / failed, and to clean up in the failure
case.

Just meant for illustration so feel free to ignore / flame, but that's
how I would go about removing the rte_panic() calls.

>  This seems to only exist as a way of triggering the run_once check in
>  the eal_init.  It doesn't add anything except one more state variable to
>  check against.  What is the purpose?
>
>  
> Actually this is not a run-once in purpose, rather an attempt to define a state for the device 
> and on the way work around breaking abi on the the void function called before that.

I think it's a way to try and track state for initialization and to
prevent future inits.  Which ABI are you worried about?  rte_panic()?
I'm not sure how this is an ABI work around, but I'm probably not
thinking about it hard enough.

>> +     if (rte_get_panic_state())
>> +             return -1;
>> +
>
>  Please just use run_once.  That's a better way of preventing this.
>
>  
>  As stated above - no a run-once
>
>  All of the comments from the bsd side apply here.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 11/11] devtools: prevent new instances of rte_panic and rte_exit
  2018-04-20 14:01     ` Arnon Warshavsky
@ 2018-04-20 15:41       ` Burakov, Anatoly
  0 siblings, 0 replies; 47+ messages in thread
From: Burakov, Anatoly @ 2018-04-20 15:41 UTC (permalink / raw)
  To: Arnon Warshavsky, Aaron Conole
  Cc: Thomas Monjalon, Lu, Wenzhuo, Doherty, Declan, jerin.jacob,
	Bruce Richardson, Yigit, Ferruh, dev

On 20-Apr-18 3:01 PM, Arnon Warshavsky wrote:
> 
> 
>     I don't think rte_panic should be considered forbidden.  Rather their
>     use should be flagged (as this patch does).  However, the 'exit 1'
>     (which will return a failure for the automatic checkpatch script bot)
>     might end up problematic as maintainers might consider it a patch that
>     is not ready.
> 
>     I wouldn't object to this patch, but just think you might want to change
>     the print to something like:
> 
>        WARN: Are you sure you meant to use "$expression"?
> 
> 
> I can definitely change the warning text , but I think the whole point 
> is actually to prevent future panic calls.
> If it were a recommendation, one can never converge with the attempt to 
> get rid of them, as not throwing a panic is a lot harder.
> 
> Thanks
> Arnon
> 

Warning is enough to get it automatically flagged on review. Final 
decision is up to maintainers/committers as to whether accept such code.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4 10/11] eal: replace rte_panic instances in init sequence
  2018-04-20 14:53       ` Aaron Conole
@ 2018-04-23  8:07         ` Arnon Warshavsky
  0 siblings, 0 replies; 47+ messages in thread
From: Arnon Warshavsky @ 2018-04-23  8:07 UTC (permalink / raw)
  To: Aaron Conole
  Cc: Thomas Monjalon, Burakov, Anatoly, Lu, Wenzhuo, Doherty, Declan,
	jerin.jacob, Bruce Richardson, Yigit, Ferruh, dev, Kevin Traynor

>
> Looking at the eal_thread_init_master, I think it's probably a
> recoverable condition.  For instance, perhaps the core mask was wrong,
> and could be corrected by re-attempting the initialization.  Just
> suggesting that it's probably okay to allow a re-attempt here.  I would
> suggest:
>
> -       eal_thread_init_master(rte_config.master_lcore);
> +       if (eal_thread_init_master(rte_config.master_lcore) != 0) {
> +               rte_eal_init_alert("Cannot assign master lcore\n");
> +               rte_errno = EINVAL;
> +               return -1;
> +       }
>
> if you agree.
>
> Yes. This is indeed the way to go.

>
> --
>
> In the above I hope it illustrates what you'll need - a way to signal to
> each side that initialization phase is happening, and that
> initialization was successful / failed, and to clean up in the failure
> case.
>
> Just meant for illustration so feel free to ignore / flame, but that's
> how I would go about removing the rte_panic() calls.
>

Thanks.
Indeed I did not consider recovery and clean up from here

>
> >  This seems to only exist as a way of triggering the run_once check in
> >  the eal_init.  It doesn't add anything except one more state variable to
> >  check against.  What is the purpose?
> >
> >
> > Actually this is not a run-once in purpose, rather an attempt to define
> a state for the device
> > and on the way work around breaking abi on the the void function called
> before that.
>
> I think it's a way to try and track state for initialization and to
> prevent future inits.  Which ABI are you worried about?  rte_panic()?
> I'm not sure how this is an ABI work around, but I'm probably not
> thinking about it hard enough.
>

I now see the mess I did and why wasn't it clear.
I initially changed the api of eal_thread_init_master() from void to int.
Having had the ABI check failed, I only fixed the linuxapp back to void and
added this workaround to prevent ABI break on this patch.

I will align both linuxapp and bsd to remain at panic for this function as
well , and address it together with the thread itself




-- 

*Arnon Warshavsky*
*Qwilt | work: +972-72-2221634 | mobile: +972-50-8583058 | arnon@qwilt.com
<arnon@qwilt.com>*

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2018-04-23  8:07 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-19  6:00 [PATCH v4 00/11] eal: replace calls to rte_panic and refrain from new instances Arnon Warshavsky
2018-04-19  6:00 ` [PATCH v4 01/11] crypto: replace rte_panic instances in crypto driver Arnon Warshavsky
2018-04-19 10:53   ` Trahe, Fiona
2018-04-19 13:49     ` Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 02/11] bond: replace rte_panic instances in bonding driver Arnon Warshavsky
2018-04-19 17:25   ` Kevin Traynor
2018-04-20 13:13     ` Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 03/11] e1000: replace rte_panic instances in e1000 driver Arnon Warshavsky
2018-04-19 17:25   ` Kevin Traynor
2018-04-20 13:14     ` Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 04/11] ixgbe: replace rte_panic instances in ixgbe driver Arnon Warshavsky
2018-04-19 17:26   ` Kevin Traynor
2018-04-20 13:16     ` Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 05/11] eal: replace rte_panic instances in eventdev Arnon Warshavsky
2018-04-19 17:26   ` Kevin Traynor
2018-04-20 13:17     ` Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 06/11] kni: replace rte_panic instances in kni Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 07/11] eal: replace rte_panic instances in hugepage_info Arnon Warshavsky
2018-04-19 14:03   ` Burakov, Anatoly
2018-04-19 14:09     ` Arnon Warshavsky
2018-04-19 14:45       ` Burakov, Anatoly
2018-04-19 14:50         ` Burakov, Anatoly
2018-04-20 13:11           ` Arnon Warshavsky
2018-04-19 14:36   ` Kevin Traynor
2018-04-20 13:12     ` Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 08/11] eal: replace rte_panic instances in interrupts thread Arnon Warshavsky
2018-04-19 17:27   ` Kevin Traynor
2018-04-20 13:18     ` Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 09/11] eal: replace rte_panic instances in ethdev Arnon Warshavsky
2018-04-19 17:27   ` Kevin Traynor
2018-04-20 13:23     ` Arnon Warshavsky
2018-04-20 13:56       ` Thomas Monjalon
2018-04-19  6:01 ` [PATCH v4 10/11] eal: replace rte_panic instances in init sequence Arnon Warshavsky
2018-04-19 14:39   ` Burakov, Anatoly
2018-04-19 14:48     ` Arnon Warshavsky
2018-04-19 14:57       ` Burakov, Anatoly
2018-04-19 17:31         ` Kevin Traynor
2018-04-20 13:32           ` Arnon Warshavsky
2018-04-20 13:31         ` Arnon Warshavsky
2018-04-19 17:48   ` Aaron Conole
2018-04-20 13:55     ` Arnon Warshavsky
2018-04-20 14:53       ` Aaron Conole
2018-04-23  8:07         ` Arnon Warshavsky
2018-04-19  6:01 ` [PATCH v4 11/11] devtools: prevent new instances of rte_panic and rte_exit Arnon Warshavsky
2018-04-19 17:52   ` Aaron Conole
2018-04-20 14:01     ` Arnon Warshavsky
2018-04-20 15:41       ` Burakov, Anatoly

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.